Deploying Database

Is it worth deploying Database or Data Store on Containers?

Written by Ajit Gadge | Sep 18, 2019


For the past few months, I’ve been discussing a million-dollar question with many DBAs, DevOps engineers, and solution architecture teams: what are the real benefits of running databases/data stores in containers?
I would like to share with you some of what I learned from those discussions:

Scalability

Scaling the database/data store is one of the primary challenges we face when the workload is dynamic. Horizontal scaling is one of the best ways to handle it. However, we must first check whether the data store/database is itself horizontally scalable.
Most data stores offer read scalability, but what about write scalability? Most RDBMSs do not offer write scalability as of today. Some databases/data stores provide a sharding feature together with a coordinating role or database load balancer, through which write scalability can be achieved. Since each container runs in its own completely separate environment, we need to check whether running the database/data store in containers actually delivers scalability at the database layer (Layer 7). Even with sharding, data needs to be persistent, and each node needs to talk to its master node or to other cluster nodes, so the containers running the database need to stay in constant communication with those nodes as well. Kubernetes (K8s) can help here, but it makes the environment more complex, while the same result can easily be achieved with database tools without containers.
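
To make the sharding idea concrete, here is a minimal sketch in Python of what a coordinating role does when it routes writes. It is an illustration under assumptions, not how any particular database implements sharding: the `ShardRouter` class, the shard names, and the customer keys are all hypothetical.

```python
import hashlib


class ShardRouter:
    """Toy coordinator: maps a key to one of N write shards (hypothetical)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, key: str) -> str:
        # Use a stable hash so the same key always lands on the same shard,
        # no matter which process or container computes it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


# Hypothetical shard names; in a container setup these would be stable
# network identities of the database nodes.
router = ShardRouter(["db-shard-0", "db-shard-1", "db-shard-2"])
for customer_id in ("cust-1001", "cust-1002", "cust-1003"):
    print(customer_id, "->", router.shard_for(customer_id))
```

Note that with this simple modulo scheme, adding or removing a shard changes where most keys land, so existing data has to be moved. That is one reason the shards need persistent data and stable identities, which is harder to guarantee when containers come and go.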

Ease of Installation and configuration

“How many times do you really need to install a database/data store on a production setup in a month?” I’ve repeatedly asked DBAs and DevOps teams this, and the answers I always get are zero or, at most, a single digit. Each production environment is different because its workloads are different, so you end up building new images or Docker run files every time, even when your base image is the same. So, is a container really useful in this case? And if your database/data store is already easy to install and configure, can’t automation tools like Ansible handle the same activities?

Rolling upgrade and patching

“How many times have you patched or upgraded your database in the last year?” Again, the answers I got were either zero or once. So, a database/data store is not a component you tend to upgrade or patch frequently, even if a new release brings new features and benefits for your application; how often you do it depends entirely on the database. Containers are mostly used for workloads such as microservices, which you upgrade daily or weekly without downtime. I noticed that organizations have blue-green deployment pipelines and, unfortunately, they don’t consider the database to be a part of that deployment.

Performance

It’s debatable. There is a theory that running a database/data store in a container improves performance. Database performance mostly comes down to I/O and memory. If you assign an external volume to the container for the data store, you depend on persistent storage solutions like Ceph or Portworx. Many database vendors recommend local SSDs for better performance, yet in a container environment you end up using an external volume. So, you must check how a containerized database performs under a production workload.
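
One rough way to start that check is to compare synchronous write latency on a local SSD path and on the external volume mounted into the container. The Python sketch below is only a first-pass probe under assumptions: the function names are hypothetical, the 8 KB block size and 50 writes are arbitrary choices, and it is no substitute for benchmarking the real database under a production-like workload.

```python
import math
import os
import statistics
import tempfile
import time


def fsync_write_latencies(directory, writes=50, block_size=8192):
    """Return per-write latencies in seconds for write + fsync in `directory`."""
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            # Force the block to stable storage, roughly as a commit would.
            os.fsync(fd)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Median and nearest-rank 95th percentile, both in milliseconds."""
    ordered = sorted(latencies)
    p95_index = min(len(ordered) - 1, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "p95_ms": ordered[p95_index] * 1000,
    }


if __name__ == "__main__":
    # Assumption: replace this with the data directory you want to test,
    # e.g. the mount point of the external volume inside the container.
    target = tempfile.gettempdir()
    print(target, summarize(fsync_write_latencies(target)))
```

Running it once against a directory on local SSD and once against the container’s mounted volume gives a quick feel for the gap before you invest in a full workload test.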

Cost

I remember an intense discussion with one of my ex-colleagues about whether containers can save cost. He mentioned that it’s possible to save on database licensing and VM costs when you run databases in containers.
Now, let’s see if it really saves cost:
Many RDBMS/data store vendors still do not have a clear policy on licensing databases that run in containers. They still end up counting the underlying cores. Hence, I’m not sure how we can arrive at a concrete conclusion that it saves cost.
You might save cost if you currently use a licensed version of VMware for database deployment and replace it with containers. But I see that many vendors still recommend running containers on VMs (OpenShift, vRealize). Also, production-grade database deployments generally prefer bare metal over VMs, and DBAs still prefer this method.
Open source licenses are the same whether you use containers, VMs, or bare metal, and that is not going to change.
Again, if you are using containers at production or enterprise level, it is advisable to use the enterprise Docker version instead of the community one, along with K8s support, which will actually increase the cost.
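
The point about vendors counting the underlying cores can be shown with a small piece of arithmetic. In the Python sketch below, every figure is made up for illustration (a 32-core host, a 4-CPU container limit, a price of 1000 per core) and does not reflect any vendor’s actual pricing or policy; the function names are hypothetical as well.

```python
import math


def licensed_cores(host_cores, container_cpu_limit, vendor_counts_host):
    """Cores billed for one database container under a given counting rule."""
    if vendor_counts_host:
        # The vendor ignores the container limit and counts the whole host.
        return host_cores
    # Otherwise assume billing on the container's CPU limit, rounded up.
    return math.ceil(container_cpu_limit)


def annual_license_cost(cores, price_per_core):
    return cores * price_per_core


# Hypothetical figures, for illustration only.
HOST_CORES = 32
CONTAINER_CPU_LIMIT = 4
PRICE_PER_CORE = 1000

for counts_host in (True, False):
    cores = licensed_cores(HOST_CORES, CONTAINER_CPU_LIMIT, counts_host)
    cost = annual_license_cost(cores, PRICE_PER_CORE)
    print(f"vendor counts host cores: {counts_host} -> {cores} cores, cost {cost}")
```

Under these assumed numbers, the container limit only reduces the bill if the vendor agrees to count it, which is exactly the policy question that remains unclear.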

Portability

One of the main use cases for containers is portability. You can ship your container (basically its image) with its complete environment to another environment very easily. For example, moving from hosted bare metal to a public cloud is easy if your microservices are running in containers. But the question remains: how often do you really want to move your production database? Also, even if you are running the database in a container, it is mapped to an external volume. So, after moving your container, you need to remap (and sometimes rebuild and remap) that external volume in the new environment. Are you ready to take this amount of risk with your production data?

Challenges

There are also some key challenges in running databases in containers, such as persistent storage, maintaining your DevOps environment alongside your database, container security, and skills (you need DevOps engineers along with DBAs, or the DBAs need to re-skill), and the list goes on. The challenges could fill a lengthy write-up of their own, which I’ll cover in my next blog :)
However, I do see a few use cases where you can definitely run a database/data store in containers:
1: Test/Dev environment or UAT: Companies with large development environments need the database deployed every day or every week, and offloading this task from the DBAs or DevOps team is essential, so using containers with basic database images for UAT/Dev makes sense. This environment is typically not set up for HA, and the data does not require persistence.
2: Public cloud vendors/SaaS-based application ISVs: If your customers intend to offer cloud services to the public or to their own customers, or have applications offered in a SaaS-based model with very high-volume or dynamic deployments, then running a database container alongside the application’s Docker run file makes sense.

