You are looking for a Database, not a Swiss knife!
Sameer Kumar | Senior Solution Architect, Ashnik
We have a great many conversations with our customers, technology associates and regional partners across various geographies, and no two of them are really that different. They commonly revolve around the need for different types of databases and different data models.
For starters, an ISV that currently uses a NoSQL store for unstructured event data from sensors (in JSON format) wanted to know whether that data could be merged with relational data and stored together in a relational database. Then, while I was talking with an architect from a popular cloud provisioning company, he asked, and I quote, “Why MongoDB, when you already have PostgreSQL?” And this afternoon, I was catching up with my counterpart at one of our principal companies and he asked, “What is the difference between MongoDB and Couchbase? Not technical differences, but from the customer’s demand point of view? I’m asking you because Ashnik partners with both.”
Then, another customer whom I met in Malaysia exclaimed, “Wow! You guys carry so many open source database technologies, so what are we discussing today?”
While they all seem intrigued by the fact that Ashnik carries so many technologies, especially when it comes to databases, they all really have one question: why does Ashnik have to work with so many technologies? Well, it is not just because we want to diversify our business footprint. In fact, that is the least of the reasons. What we have realized from our past discussions with customers is that databases are not meant to be Swiss knives. Yes, you can model your data whichever way you want and fit it inside almost any database. But as time passes, the cost of maintaining and managing that setup outweighs the benefit of sticking to (and having to learn) only one technology.
One of my favourite examples, which I often use, is this: if you want to surf in the sea, you wouldn’t look for a dirt bike but rather a surfboard. Well, that was a good example until Robbie Maddison and DC Shoes spoiled it for me. If you have not already seen this 3 minute 59 second video, please watch it. You will see how someone takes a dirt bike to the waves of Tahiti and rides (or shall I say surfs) through them. I was in awe for a few minutes after watching the video. Almost a month later, DC Shoes came out with follow-up behind-the-scenes videos, spread across Part 1 (17:51), Part 2 (17:51) and Part 3 (20:31). I was in awe again. So much effort went into making a video that is just short of 4 minutes long (the making-of videos put together run a little under 1 hour). Obviously a lot of training, customization and practice went into this, and yet it could have gone wrong. In the making-of videos you can see how many risks they took and overcame. I learnt that one can indeed ride a bike on waves, but one has to be an expert and has to put a lot of effort into engineering the bike and equipment. And even with all that, it is not something one would want to do every day; it is not scalable. Did I digress? Not really, read on.
Getting back to the questions I started with, what was my reply to each of them? I explained to the ISV that they need to evaluate the growth rate of the event data for the various segments of their implementations. If they anticipate that larger customers will store events at a very high rate, or that the volume will grow very large, it is best to stick with a NoSQL database that allows the data to be distributed across a few servers. To the architect from the cloud vendor, I described exactly why we carry both MongoDB and PostgreSQL: there are customers who need a specialized NoSQL store for sharding their data, e.g. customers who want to archive data or who need high-volume writes spread across multiple servers. To my counterpart at one of our principal companies, I simply explained that it is actually the technical aspects that differentiate MongoDB and Couchbase. Both are NoSQL databases capable of storing JSON data, but they have very different schema modelling and scaling architectures. To decide which one to choose, you have to look at many aspects, including the environment, the team handling the setup, their preferences and the scalability requirements. So it is a call we need to take based on the need of the hour. Can Postgres be used to achieve the same kind of scalability as MongoDB? Probably. Can you replace your MongoDB setup with Couchbase? Possibly. But those are not the questions you should be asking. What you really should be asking is:
- What does my data look like?
- What model best describes the data that I am capturing?
- At what rate will my data grow?
- What will my data volume be after a few years?
- What is my data retention policy?
- What kind of queries will I run on this data?
- What are my hardware and deployment considerations?
- How do my hardware considerations match my scalability requirements?
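The growth, volume and retention questions in this list can be roughed out with simple arithmetic before any product is discussed. Below is a minimal sketch in Python. The function name `projected_volume_gb`, its parameters and the sample numbers are all hypothetical, chosen only for illustration, and the estimate ignores indexes, replication and compression.

```python
def projected_volume_gb(events_per_second, avg_event_bytes, retention_days,
                        yearly_growth=0.0, years=0):
    """Rough size of retained event data in GB (10**9 bytes).

    Illustrative assumptions: a constant event rate within the retention
    window, compound yearly growth of that rate, and no storage overhead.
    """
    # Event rate after `years` of compound growth.
    rate = events_per_second * (1 + yearly_growth) ** years
    # Number of seconds of data kept under the retention policy.
    seconds_retained = retention_days * 24 * 60 * 60
    return rate * avg_event_bytes * seconds_retained / 1e9


if __name__ == "__main__":
    # Hypothetical sensor workload: 1,000 events/s, 500 bytes each, 90 days kept.
    today = projected_volume_gb(1000, 500, 90)
    # Same workload if the event rate doubles every year for three years.
    later = projected_volume_gb(1000, 500, 90, yearly_growth=1.0, years=3)
    print(f"today: {today:.0f} GB, in three years: {later:.0f} GB")
```

If the projected figure fits comfortably on one server for the whole planning horizon, the sharding argument carries less weight; if it does not, distributing the data becomes a real requirement rather than a preference.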
I could see that most of the aspects DC Shoes covers in its making-of videos also apply when choosing a database. You can optimize, tune and even customize your database to store and retrieve anything and everything. You can model your data and the relationships within it to suit the database’s storage model. But what are the chances that you will get it right? What are the risks of getting it wrong? Even if you get it right, will it hold good for the next few years of your operations?
To the customer in Malaysia who said we carry so many open source technologies, I replied, “Yes, but what we will be discussing is what your database needs are.” We help our customers identify the right technology for them. End users usually think that a database serves the purpose of storing data and can store anything and everything. They treat their favourite database platform just like a Swiss knife and assume it will be the best choice for storing every data model their new project needs. Honestly, that’s an unfair assessment. Therefore, we help answer the questions that matter, so that the customer can zero in on the database platform that fits their project. We go beyond consulting on the right database platform by helping them implement the setup and decide on effective maintenance and management policies for the database. If you find yourself in a dilemma over the right database platform, or are in the middle of a debate about migrating from one platform to another, talk to us. We might just have the answer for you.
- Sameer Kumar is a Database Solution Architect working with Ashnik. He has worked on many complex setups and migration assignments for some of the key customers from the Retail, BFSI and Telecom sectors. Sameer is a certified PostgreSQL and EDB Postgres Plus Advanced Server Professional. He is also a certified Postgres Trainer and has delivered many training sessions for public and corporate batches. He is well versed in other RDBMSs, e.g. DB2, Oracle and SQL Server, and is also trained in NoSQL technologies, namely MongoDB. He has worked closely with customers and helped them build analytics platforms on NoSQL databases and migrate from RDBMS to MongoDB. And in his free time, he loves to take his cycle for a spin around Singapore.