Nov 09, 2017


Data Pipeline and Analytics

Polyglot Persistence – Data Architecture for Enterprises?

You have probably been hearing the term Polyglot Persistence quite often, so I thought I would take our readers through what exactly it is, the role it plays and more. Let’s start with the basics: Polyglot Persistence is the idea of using several right-fitting databases to store data for a single application or for multiple applications. Most modern applications need to connect to multiple databases, each holding different types of data stored in different ways. This is where Polyglot Persistence pays off, by leveraging the strengths of various database technologies. It needs to be designed for each individual enterprise, since every enterprise has its own unique data architecture.
Until recently, many enterprises used the same type of database across the enterprise to store business transactions, reporting data, user session data, BI data and log data. With multiple database technologies now available, they have a choice and are therefore opting for a polyglot environment. Enterprises are exploring ways to integrate the capabilities of both traditional and emerging technologies to manage enterprise-wide data. Each database technology offers a specific value-add in the enterprise setup. For example, a company with both physical stores and an online e-commerce site may want its offline store data and online data in the same place. Here, NoSQL can be used as a back-end data store for all the store POS data and the purchasing data from the website.
The use of NoSQL has simplified how applications interact with the database. For example, if you have MongoDB and AngularJS, the whole stack works with JSON objects, which in turn makes the interaction between the application and the database more effective. Also, because the stack supports JavaScript, the application can be written in one language for both server-side and client-side execution. This rising adoption of NoSQL has led enterprises to look at polyglot persistence. Polyglot persistence can be applied equally well to SQL, NoSQL, RDBMS or hybrid databases.
To look into more detail, let us consider an online hotel booking or e-commerce site as an example. The user session data, shopping cart data and order data may not all need the same availability, consistency or backup features. Let’s take this example further and apply the polyglot persistence approach to see how different data stores can be leveraged. For user sessions, the main consideration is rapid access for reads and writes, and not much needs to be durable, so a key-value data store can be the best fit. A key-value data store could also be the right fit for shopping carts, where the consideration is high availability across multiple locations. A key-value database makes sense here since the shopping cart is usually accessed by user ID; similarly, session data is keyed by session ID. For all financial transaction data, where transactional updates and ACID compliance are critical, an RDBMS is the right-fit data store. For the product catalogue, the considerations are lots of reads and infrequent writes, and products make natural aggregates, so a document database could be the right fit. For recommendation engines, a graph database is more relevant. For reporting, an RDBMS or a column database could be the best fit. For analytics, a column database could be leveraged. Refer to the diagram here.
[Diagram: polyglot persistence]
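The mapping described above can be written down as a small lookup table. The following is only a sketch in Python: the workload names, the `StoreType` enum and the `candidate_stores` function are hypothetical names chosen for illustration, and the table simply restates the suggestions from the example rather than prescribing a product.

```python
from enum import Enum
from typing import Dict, List


class StoreType(Enum):
    """Categories of data store mentioned in the example (names are illustrative)."""
    KEY_VALUE = "key-value"
    RDBMS = "rdbms"
    DOCUMENT = "document"
    GRAPH = "graph"
    COLUMN = "column"


# Workload -> store types suggested in the hotel booking / e-commerce example.
STORE_FOR_WORKLOAD: Dict[str, List[StoreType]] = {
    "user_sessions": [StoreType.KEY_VALUE],           # rapid reads/writes, little durability
    "shopping_cart": [StoreType.KEY_VALUE],           # high availability, accessed by user ID
    "financial_transactions": [StoreType.RDBMS],      # transactional updates, ACID
    "product_catalogue": [StoreType.DOCUMENT],        # many reads, few writes, natural aggregates
    "recommendations": [StoreType.GRAPH],
    "reporting": [StoreType.RDBMS, StoreType.COLUMN],
    "analytics": [StoreType.COLUMN],
}


def candidate_stores(workload: str) -> List[StoreType]:
    """Return the store types suggested for a workload, or raise if it is unknown."""
    if workload not in STORE_FOR_WORKLOAD:
        raise ValueError(f"no store mapping defined for workload {workload!r}")
    return STORE_FOR_WORKLOAD[workload]
```

In a real design this table would be the outcome of an analysis of each workload’s access patterns and durability needs, not a fixed rule.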
For a scalable architecture, other applications in the organization should be able to use the same data from these databases. In our example above, the graph data store can serve data to other applications for further product or customer analytics. So that multiple applications can leverage the data in the available data stores, enterprises are building service layers around them. This way, instead of every application talking to the database independently, it talks to the service layer through APIs.
While all these considerations sound good, there are challenges that need to be considered when designing and deploying polyglot persistence. Firstly, as multiple technologies are used, the DBA team needs to gear up for new technologies: how to monitor them, how to take backups, and how to move data out of and into these systems.
Security is another important area of consideration. Most NoSQL databases offer different levels of security features, because they are designed to operate differently. Many enterprises have data warehouse and analytics systems which fetch data from these databases (which are part of a polyglot deployment). For this, they will have to find the right ETL tools to move data out of these databases.
Polyglot Persistence is all about using various database technologies to handle different types of data store needs. One can deploy Polyglot Persistence across the organization or for a single application. While Polyglot Persistence is a way forward, designing it also needs expertise. You need to factor in multiple things, such as how to decide on the fitting database technology or how to keep data in sync across various databases. It is going to be an interesting time!
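As a rough illustration of the service-layer idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: `InMemoryKeyValueStore` is a stand-in for a real key-value database, and `CartService` is a hypothetical service that would normally sit behind an HTTP API.

```python
from typing import Dict, List


class InMemoryKeyValueStore:
    """Stand-in for a real key-value database (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def get(self, key: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored value directly.
        return list(self._data.get(key, []))

    def put(self, key: str, value: List[str]) -> None:
        self._data[key] = list(value)


class CartService:
    """Service layer: the only component that talks to the cart store."""

    def __init__(self, store: InMemoryKeyValueStore) -> None:
        self._store = store

    def add_item(self, user_id: str, product_id: str) -> None:
        items = self._store.get(user_id)
        items.append(product_id)
        self._store.put(user_id, items)

    def list_items(self, user_id: str) -> List[str]:
        return self._store.get(user_id)


# Several applications share one service instead of each talking to the database.
cart_service = CartService(InMemoryKeyValueStore())
cart_service.add_item("user-42", "sku-1001")  # e.g. called by the web storefront
cart_service.add_item("user-42", "sku-2002")  # e.g. called by a mobile app
print(cart_service.list_items("user-42"))     # prints ['sku-1001', 'sku-2002']
```

Because callers only see `add_item` and `list_items`, the store behind the service could be swapped for a different technology without changing the applications.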
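To make the ETL step concrete, the sketch below flattens nested order documents into rows and loads them into a relational table. It is a toy example under stated assumptions: the `order_documents` list stands in for data extracted from a document store, the in-memory SQLite database stands in for the data warehouse, and the `order_facts` table and field names are invented for illustration.

```python
import sqlite3

# Hypothetical order documents, as they might be extracted from a document store.
order_documents = [
    {"order_id": "o-1", "customer": {"id": "c-1"},
     "items": [{"sku": "a", "price": 10.0, "qty": 2}]},
    {"order_id": "o-2", "customer": {"id": "c-2"},
     "items": [{"sku": "b", "price": 5.0, "qty": 1},
               {"sku": "c", "price": 2.5, "qty": 4}]},
]


def transform(doc):
    """Flatten one nested order document into an (order_id, customer_id, total) row."""
    total = sum(item["price"] * item["qty"] for item in doc["items"])
    return (doc["order_id"], doc["customer"]["id"], total)


def load(conn, rows):
    """Create the target table if needed and insert the transformed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_facts "
        "(order_id TEXT PRIMARY KEY, customer_id TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO order_facts VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite database used as a stand-in for the warehouse.
warehouse = sqlite3.connect(":memory:")
load(warehouse, [transform(doc) for doc in order_documents])
totals = warehouse.execute(
    "SELECT order_id, total FROM order_facts ORDER BY order_id"
).fetchall()
```

A production ETL tool would add incremental extraction, error handling and schema management; the point here is only the shape of the extract, transform and load steps.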
Sandeep Khuperkar | Director and CTO, Ashnik


Sandeep is the Director and CTO at Ashnik. He brings more than 21 years of industry experience (most of it at Red Hat and IBM India), with 14+ years in open source and in building open source and Linux business models. He is on the Advisory Board of JJM College of Engineering (Electronics Dept.) and is a visiting lecturer at a few engineering colleges, where he works towards enabling them on open source technologies. He is an author, enthusiast and community moderator at Opensource.com. He is also a member of the Open Source Initiative, the Linux Foundation and the Open Source Consortium of India.


