
Did PostgreSQL fail to live up to Uber’s and GitLab’s expectations? Not really!

Sameer Kumar I Senior Solution Architect, Ashnik
Singapore, 17 Feb 2017
Big Data Talk


PostgreSQL is trending once again, but this time for not-so-good reasons. There was a major outage at GitLab caused by a series of events and process failures. As you will read in the details, PostgreSQL was (and continues to be) GitLab’s backend database. It was the PostgreSQL database service that was unavailable for several hours, and its recovery led to data loss. This is the second time in recent months that PostgreSQL has been in the limelight for negative publicity. Earlier, Uber published a blog post about why they decided to switch away from PostgreSQL.

One of my very good friends asked me, “Is PostgreSQL not able to live up to its recently attained popularity and expansion among users?” That made me ponder how people have perceived both pieces of news. Only a select few have really tried to dig into what Uber said and whether those reasons are relevant to their own environments. Even fewer tried to read beyond the words “outage” and “PostgreSQL” in the case of the GitLab news. Let me take you beyond the headlines for both incidents.

Why did Uber move away from PostgreSQL?

I am a big fan of using the right tool for a workload, but I feel Uber could have been more detailed and clear in their explanation. From my understanding and the available information, what Uber does with MySQL today does not really require the heavy lifting that a full-fledged RDBMS is designed to do. That is one of the reasons why a tailored RDBMS, as I usually refer to MySQL with its concept of storage engines, suits them better. If you read their post and several other Uber engineering posts carefully, you will notice that they use a custom API on top of the database (which is really being used as a data store) for all the CRUD operations. I wonder what their key reason is for using a DBMS at all. The important takeaway here is that the move was not mainly due to shortcomings in PostgreSQL but because they did not need an RDBMS that does heavy lifting.

There are many balanced blogs and reactions from PostgreSQL users, particularly the many good comments and inputs on the community thread, and this blog by Markus Winand. Nonetheless, there are some key lessons for PostgreSQL to learn from Uber’s experience, and the feedback was well received. What I liked the most is that PostgreSQL developers stepped up, accepting what is needed and pointing out what Uber might have missed in Postgres. Robert Haas, a major contributor to the PostgreSQL project, wrote a very thorough response and offered some ideas. The humility exhibited by the PostgreSQL developer community was stunning. In Uber’s criticism, they saw an opportunity to improve. As Markus Winand concluded:

“It seems that the article just explains why MySQL/InnoDB is a better backend for Schemaless than PostgreSQL. For those of you using Schemaless, take their advice! Unfortunately, the article doesn’t make this very clear because it doesn’t mention how their requirements changed with the introduction of Schemaless compared to 2013, when they migrated from MySQL to PostgreSQL. Sadly, the only thing that sticks in the reader’s mind is that PostgreSQL is lousy.”

Ashnik brings many different database and data-storage solutions to its customers: PostgreSQL, EDB Postgres Advanced Server, MongoDB, Couchbase and Elasticsearch. We also work with customers who use database as a service on AWS: RDS, PPCD or Aurora. We often face situations where customers want us to compare databases or give them a feature matrix. I typically try to avoid providing that information. Instead, I spend time understanding the problem and then recommend the right solution, or a couple of solutions with their pros and cons. There have been occasions when I have suggested a Postgres solution over MongoDB, even for a high-volume, write-heavy system.

Another very recent piece of news, about GitLab’s Postgres database outage, helps clear the air further. Let’s look at what fundamentally happened.

Why did GitLab’s PostgreSQL database have an outage?

I must say that the media tracking IT news certainly did justice to the story in the case of the recent GitLab incident. The headline pretty much summarizes what happened. What surprised me was seeing posts on Twitter, LinkedIn and Facebook from advocates and users of proprietary database products pointing fingers at PostgreSQL. As a PostgreSQL fan and an advocate of open source, I found it not just surprising but also bothersome. If you read GitLab’s own investigation, you will see that the failure happened because of obsolete practices, untested backups and human error. Nothing, and mind you NOTHING, can safeguard you against all these evils at once. Certainly, there are some lessons for the GitLab team, and they are exploring fixes. This is not the first time someone has had an outage because of their internal practices. A few years ago, our team also rescued a customer from a similar situation, one that should not have occurred if patches had been applied on time, backups had been tested and disk storage had been monitored.

Simon Riggs, who is a committer and contributor to the PostgreSQL project, took time out of his schedule to explain the internal details of what failed at GitLab. It is a very good read for understanding what you should and should not do. I must say that having a seasoned DBA and good training on your side is certainly valuable when fighting such evils at critical times.

Ashnik has been providing consultation and training to its customers in South East Asia and India, which has helped them avoid such pitfalls and given them the confidence to bank on Postgres with best practices in place.

To conclude, I would say that if you are a Postgres fan, don’t get disheartened by these stories. There is much more to them than the headlines. PostgreSQL is gaining popularity, not just according to Db-engines.com but also according to Gartner. If you are exploring PostgreSQL for a serious workload, don’t let these stories bother you. Instead, you should be comfortable knowing that you will be using a quality product developed by a very mature community and vendors who accept flaws, improve, and fix bugs rapidly.

– Sameer Kumar I Senior Solution Architect, Ashnik


Sameer Kumar is a Database Solution Architect working with Ashnik. He has worked on many complex setups and migration assignments for key customers from the Retail, BFSI and Telecom sectors. Sameer is a certified PostgreSQL and EDB Postgres Plus Advanced Server Professional. He is also a certified Postgres Trainer and has delivered many training sessions for public and corporate batches. He is well versed in other RDBMSs, e.g. DB2, Oracle and SQL Server, and is also trained in NoSQL technologies such as MongoDB. He has worked closely with customers and helped them build analytics platforms on NoSQL databases and migrate from RDBMS to MongoDB. And when he is free, he loves to take his cycle around Singapore for a spin.

