Backup and restore multiple MongoDB databases in Kubernetes

Erwan Riou
Jan 24, 2021

If you found this article, you probably got stuck, like me, on how to do this. You have probably found plenty of documentation about Kubernetes jobs, but you want something a bit different.

1. First step, with just one database

I will show here a way to back up and restore a MongoDB database from a Kubernetes pod. First things first: launch your project (including the database) and make sure the command below displays the list of pods currently running.

kubectl get pods

Once you have verified that, let’s see how to back up a single pod, where <POD_NAME> is one of the pod names returned by the command above.

kubectl exec -it <POD_NAME> -- bash -c "cd /tmp && mongodump --archive > mongo.dump"

To explain this command a little: kubectl exec -it opens a session in the pod’s container, identified by the pod name, and runs bash inside it. In that shell we move to the /tmp folder, which is used here simply as a scratch location for the dump file (MongoDB’s own data files live elsewhere, typically in /data/db), and run mongodump, which writes a single-file archive of the database contents. At this point you have a backup, but it is still inside the pod and not accessible from your computer, so let’s see how to extract it.

kubectl cp <POD_NAME>:/tmp/mongo.dump ~/Documents/Backups/mongo.dump

The kubectl cp command copies files from inside a pod to your host, which is exactly what we want. Here I point at the pod again and copy the newly created file /tmp/mongo.dump into the Documents/Backups folder on my computer. You can choose whatever destination you like. Note that kubectl cp relies on the tar binary being available in the container.
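
As a variation on the two steps above, the archive can also be streamed straight to the host, since mongodump --archive writes to standard output when no file name is given. The sketch below is only one way to do it: dump_pod is a made-up helper name, and it assumes mongodump can reach the database inside the pod without extra connection options, as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: stream a mongodump archive from a pod straight to a local file.
# dump_pod is a hypothetical helper name, not part of kubectl or MongoDB.
dump_pod() {
  local pod="$1" dest="$2"
  # No -t flag here: a TTY could alter the binary archive sent over stdout.
  if ! kubectl exec "$pod" -- mongodump --archive > "$dest"; then
    # Do not leave an empty or partial archive behind on failure.
    rm -f "$dest"
    return 1
  fi
}

# Example (needs a running cluster):
# dump_pod <POD_NAME> ~/Documents/Backups/mongo.dump
```

This skips both the temporary file in the pod and the kubectl cp step, at the cost of keeping no copy of the dump inside the container.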

2. With multiple mongodb databases

Now that we have backed up one database, let’s see how to do it with several.

For that we need a bit of convention: a name pattern in your pods, such as mongo or mongodb, or anything else. Make sure that all your database pods in Kubernetes share a keyword in their name that no other pod has, because we are going to grep for it. The idea is basically to do the same thing we did above, but in a loop, so that every database gets backed up. The first thing we need to do is find all the pods that run a MongoDB database. There are multiple ways of doing that, but an easy one is to match a name pattern.

Let’s imagine for a second that all our database pods have the keyword mongo in their name, for example user-mongo or auth-mongo.

To grep them, we just need to run:

array=($(kubectl get pods | grep mongo | awk '{ print $1 }'))

This shell command selects every line of the kubectl get pods output that contains the keyword mongo and keeps only the pod name, which is in the first column (this is why I use awk). The names are then stored in an array, because we are going to loop over it.
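
One detail of the command above is that grep matches the keyword anywhere on the line, not only in the pod name. A slightly stricter filter can be sketched with awk alone. filter_pod_names is a hypothetical helper name, and the sample pod names are invented to show the usual kubectl get pods layout.

```shell
#!/usr/bin/env bash
# Sketch: keep only pod names (column 1) that contain a keyword.
# filter_pod_names is a hypothetical helper; it reads `kubectl get pods` output on stdin.
filter_pod_names() {
  local keyword="$1"
  # NR > 1 skips the header row; index() is a plain substring match on column 1 only.
  awk -v kw="$keyword" 'NR > 1 && index($1, kw) > 0 { print $1 }'
}

# Sample output in the usual `kubectl get pods` layout (pod names are made up).
sample='NAME                   READY   STATUS    RESTARTS   AGE
user-mongo-0           1/1     Running   0          3d
auth-mongo-0           1/1     Running   0          3d
api-7d9f8b6c5-x2x4q    1/1     Running   0          3d'

# Prints user-mongo-0 and auth-mongo-0, one per line.
printf '%s\n' "$sample" | filter_pod_names mongo

# With a cluster: array=($(kubectl get pods | filter_pod_names mongo))
```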

As you might guess, the loop is the easiest part. We just need to reuse what we learned in the first part and apply it inside the loop:

for KEY in "${!array[@]}"; do
  kubectl exec -it "${array[$KEY]}" -- bash -c "cd /tmp && mongodump --archive > mongo.dump"
  kubectl cp "${array[$KEY]}:/tmp/mongo.dump" ~/Documents/Backups/"${array[$KEY]}".dump
done

Here we replace <POD_NAME> with ${array[$KEY]}, one of the pod names matched by the grep on the kubectl get pods output.

Then we save the backups on our computer, naming each file after its pod. Putting it together, you could create a shell script like this:

#!/usr/bin/env bash
# GET LIST OF ALL DATABASES RUNNING
array=($(kubectl get pods | grep mongo | awk '{ print $1 }'))
# LOOP OVER EACH DATABASE POD
for KEY in "${!array[@]}"; do
  # DUMP INSIDE THE POD, THEN COPY THE ARCHIVE TO THE HOST
  kubectl exec -it "${array[$KEY]}" -- bash -c "cd /tmp && mongodump --archive > mongo.dump"
  kubectl cp "${array[$KEY]}:/tmp/mongo.dump" ~/Documents/Backups/"${array[$KEY]}".dump
done

and it will back up all your databases to your computer in one go. You could just as well adapt the script a little to save the archives to an S3 bucket instead, or anywhere else you like.
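
As one example of such an adaptation, here is a rough sketch that writes each run into a dated folder and reports pods whose backup failed instead of silently moving on. The function name backup_mongo_pods and the dated folder layout are my own assumptions, not something kubectl or MongoDB provides.

```shell
#!/usr/bin/env bash
# Sketch: back up every pod whose name contains a keyword into a dated folder.
# backup_mongo_pods and the folder layout are illustrative assumptions.
backup_mongo_pods() {
  local keyword="$1" root="$2"
  local dest pod failed=0
  dest="$root/$(date +%Y-%m-%d)"
  mkdir -p "$dest" || return 1

  # -o name prints one "pod/<name>" per line, without a header row.
  for pod in $(kubectl get pods -o name | grep -- "$keyword"); do
    pod="${pod#pod/}"
    echo "Backing up $pod"
    # Dump inside the pod, then copy the archive to the host.
    if ! kubectl exec "$pod" -- bash -c 'mongodump --archive > /tmp/mongo.dump' ||
       ! kubectl cp "$pod:/tmp/mongo.dump" "$dest/$pod.dump"; then
      echo "Backup failed for $pod" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example (needs a running cluster):
# backup_mongo_pods mongo "$HOME/Documents/Backups"
```

The function returns a non-zero status if any pod failed, which makes it easier to call from a cron job or a CI step that should notice a broken backup.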

3. Restore a database

Now that we know how to back up one or multiple databases, let’s see how to restore them.

Nothing could be simpler: let’s use the mongorestore command instead!

kubectl cp ~/Documents/Backups/mongo.dump <POD_NAME>:/tmp
kubectl exec -it <POD_NAME> -- bash -c "mongorestore --archive < /tmp/mongo.dump"

Here I use kubectl cp to copy mongo.dump from my computer (the host) into the pod’s container at /tmp. Once that is done, I use kubectl exec -it to run bash inside the container and call mongorestore with the archive I just copied and, ta-da, it’s working!
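
The same idea extends to several databases. The sketch below loops over a backup folder and feeds each archive to the pod named in the file name. It assumes the files follow the <pod-name>.dump pattern used for the backups and that pods with those exact names are running, which will not hold if the pods were recreated under new names. restore_mongo_pods is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: restore every <pod-name>.dump file in a folder into the pod of the same name.
# restore_mongo_pods is a hypothetical helper name.
restore_mongo_pods() {
  local dir="$1" file pod failed=0
  for file in "$dir"/*.dump; do
    [ -e "$file" ] || continue   # the folder has no .dump files
    pod="$(basename "$file" .dump)"
    echo "Restoring $pod from $file"
    # -i forwards stdin, so mongorestore reads the archive from the local file.
    if ! kubectl exec -i "$pod" -- mongorestore --archive < "$file"; then
      echo "Restore failed for $pod" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example (needs a running cluster):
# restore_mongo_pods "$HOME/Documents/Backups"
```

Streaming the archive over stdin avoids copying the file into each pod first.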
