- SG: +65 64383504
- IN: 022 25771219
- IN: 022 25792714
- IN: +91 9987536436
9 tips to build a web centric, high performance application – A DB Architect’s Perspective
Sameer Kumar I Senior Solution Architect, Ashnik
Lately, I have been little busy travelling around SEA and India, working with customers and SIs who are designing new age web-scale applications. These applications include stock trading platform, payment gateway, mobile banking solution, telecom VAS solutions, real time health/activity tracking system, e-commerce delivery platform etc. Not only these systems are mission critical or business critical, they share few more common traits. The requirements and design of these applications has a lot of resemblance and similarities. Below are some of the common characteristics that applies to any new enterprise application in today’s age, irrespective of the domain and it’s purpose-
1. They all would be public facing and available via multiple touch-points to the end users e.g. mobile app, web etc
2. They all will start with low traffic/user hits/user requests and will eventually grow to one of the busiest systems within the enterprise
3. These systems would have a lot of cross interaction with existing applications and/or with some of the applications being planned in next one year time frame
4. These system needs to be modular so that a particular functionality can replaced without impacting others
During discussions with the application and infrastructure team we came up with some of the generic points which are applicable in most of the situations while developing a web-scale and user facing system. I would mention some of the learnings I had and some of the recommendations we came up with during these various engagements-
1. It is best to keep a modular design with interfaces and contracts between different modules. These interfaces and contracts will help you minimize the changes and impact of changes done in a module. For example, if you always consolidate all your database interactions as a data access API, it will help you avoid impact of database changes. The data access API can be designed to return data to application as JSON or XML. This will abstract the application completely from the changes in database design, normalization and even database platform. Moreover all your database touch points are in one place which makes it easier to do a code refactoring while changing database design. Similar abstraction can be achieved at each layer namely data access, application, web and user interface.
2. While accepting user requests it is better to queue up the requests using some kind of messaging queues instead of keeping the user waiting while the server processes the request. Similarly for DB requests, instead of allowing application to make a lot of connections it is better to allow them to queue the requests. This also allows better resource utilization and less context switches. With PostgreSQL database you have options of tools like pgpool and pgBouncer to facilitate connection queueing.
3. It is often better to breakdown the application in a few pieces called modules (smaller sub-applications) or microservices which can function with other microservices/modules e.g. you can replace and change your customer registration module without impacting other modules which depend on customer registration to fetch customer details or enrol new customers. The interaction between these modules can be done via Web Service calls which facilitates interfacing and contracts between these sub-applications/microservices.
4. It is practically not possible to scale-up with single disk or single server, as the amount of writes increase. You should plan to deploy your application architecture in such a way that entities with high write transaction can be spread out when needed. EnterpriseDB’s Postgres Plus Advance Server facilitates Hash Partitioning which can help you achieve spreading the writes across different disks . Alternatively you can use a noSQL database like. MongoDB for sharding the entities with large write operations across multiple servers. In fact now hybrid application architecture is not rare, where people distribute some entities in a noSQL store and some are kept in relational database.
5. Many requirements which are complicated or requirements which are not clear needs a highly normalized schema design. This allows a flexibility to change the relationship hierarchy or introduce new concepts later on. A typical example of this would be application rules. You might end up storing rule metadata information across 3-4 tables and joining/co-relating those in real time can pose a major challenge, even at smaller scale. A good approach would be to use application based rule engine where rules can be specified in XML or JSON document and cached in application server. Similarly contact details is another such entity which has changed a lot in last few year and would certainly continue to change. It is better to store such information in noSQL database or flat data-types e.g. JSON or Array. PostgreSQL offers many such different data-types, extensions and indexes to enable you to store your data in best way.
6. Cache your data in application server. There is nothing new about caching, you might be already caching static data. But there is a lot of data which you are not caching. You can actually cache some of the operational data which gets updated in real time. In event of a crash, this data can be rebuilt/synchronized from transaction or activity log. I will be adding more on this in one of my next blogs.
7. Achieve higher concurrency with optimistic locking instead of using traditional way of “SELECT FOR UPDATE”.
8. Now a days, usually in most applications, data is first read and then updated. So while there is read load because of enquiries and searches, all the write operations also generate read request on the database. E.g. before registering a new user you would check if that user already exists. Creating read replicas and load balancing the read queries across several replicas can be quite useful. PostgreSQL provides steaming replication as an easy way of creating read replicas. Using the same feature one can also build a scalable and highly available cluster with PostgreSQL. Similar replication features exists in MongoDB as well with flexibility to specify read preference of from where to get the data.
9. Usually while designing system which involves financial transactions or inventory management e.g. ecommerce site, mobile banking, mobile bill payment or a simple mobile-wallet app you will have a major bottleneck in terms of liabilities or inventory management. Typically there is one row per item or a row for internal account. This usually becomes a point of contention and bottleneck. See if this can be avoided with some proxy or pseudo entities. I will explain more about this in one of my next blogs.
You can write to us at email@example.com if you are looking for consulting and development inputs in developing and deploying a system with high throughput requirements. Meanwhile stay tuned I will be getting back in 2016 with elaborated blogs on some of the above recommendations.
– Sameer Kumar I Senior Solution Architect, Ashnik
Sameer Kumar is Database Solution Architect working with Ashnik. He has worked on many complex setups and migration assignments for some of the key customers from Retail, BFSI and Telecom Sector. Sameer is a certified PostgreSQL and EDB Postgres Plus Advanced Server Professional. He is also a certified Postgres Trainer and has delivered many trainings for public and corporate batches. He is well versed with other RDBMS e.g. DB2, Oracle and SQL Server and is also trained on noSQL technologies viz MongoDB. He has worked closely with customer and helped them build analytics platform on noSQL databases and migrate from RDBMS to MongoDB. And while he’s in the free mode, he loves to take his cycle around Singapore for a spin.