Does avoiding a UAT environment really save you cost?
Sameer Kumar | Senior Solution Architect, Ashnik
Recently our team was called in to salvage a situation in which a customer could not access two tables in its production database. As it turned out, these were the customer's most frequently accessed tables. Our team went through the diagnostic process and investigated the database logs. We realised that the issue was caused by a bug in the database product itself. We reported this to the database product support team, who confirmed it. It turned out, however, that the bug had already been fixed in a patch released a few months earlier. We also worked out a way to retrieve the tables that were not accessible.
One might think that, with a workaround for retrieving the tables and a proper fix for the software, we had the issue under control. It appeared so, but the reality was different. We learned that there was no UAT environment where the workaround and the patch could be tested first. If the customer had had a UAT environment, we could have tested the fix there and made the database available again without disrupting the production environment. Instead, the absence of a UAT environment meant that the customer lost a crucial two days while Virtual Machine snapshots were used to build one.
That was not the end of it; the challenges continued. As part of the resolution, we were to restore the database from one of the recent backups. But we found that the backups had been failing for a while (due to a shortage of disk space) and that the last good backup could not be restored. We finally had to restore some of the data from a logical dump taken a day before the table corruption, and then restore the rest using PITR (point-in-time recovery) from an older backup. Though we got through the issue and retrieved all the data that had been in the tables, it caused downtime for the customer. Had someone been monitoring the environment carefully and keeping tabs on its resource requirements, the backups would not have failed. And had there been a plan to test backups periodically, the customer could have saved the time that was spent trying to restore the corrupt backup.
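The kind of monitoring described above can be quite small. Below is a minimal Python sketch, not the tooling used in this incident, that flags the two conditions that hurt this customer: too little free space for the next backup, and no recent backup file. The function names, the `*.dump` file pattern, the 1.5 safety factor and the 24-hour limit are all illustrative assumptions that you would adapt to your own setup.

```python
import shutil
import time
from pathlib import Path


def has_room_for_backup(backup_dir, expected_backup_bytes, safety_factor=1.5):
    """Return True if the filesystem holding backup_dir has enough free space.

    safety_factor is an assumed margin on top of the expected backup size.
    """
    free_bytes = shutil.disk_usage(backup_dir).free
    return free_bytes >= expected_backup_bytes * safety_factor


def newest_backup_age_hours(backup_dir, pattern="*.dump", now=None):
    """Return the age in hours of the newest matching file, or None if none exist.

    The "*.dump" pattern is an assumption about how backup files are named.
    """
    files = list(Path(backup_dir).glob(pattern))
    if not files:
        return None
    newest_mtime = max(f.stat().st_mtime for f in files)
    now = time.time() if now is None else now
    return (now - newest_mtime) / 3600.0


def backup_alerts(backup_dir, expected_backup_bytes, max_age_hours=24):
    """Collect human-readable warnings about the backup location."""
    alerts = []
    if not has_room_for_backup(backup_dir, expected_backup_bytes):
        alerts.append("low disk space for next backup")
    age = newest_backup_age_hours(backup_dir)
    if age is None:
        alerts.append("no backup files found")
    elif age > max_age_hours:
        alerts.append(f"newest backup is {age:.1f} hours old")
    return alerts
```

Run from a scheduler and wired to whatever alerting you already have, a check like this would typically surface a failing backup job within a day instead of at restore time.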
All these issues could have been avoided if the customer had kept a close watch on the patches being released and the fixes they contain, and had a UAT environment in which to apply and test them.
Many times, in order to save on the cost of hardware and software licenses, customers choose to create UAT databases on the same server that hosts the production database. They all understand the importance of testing application patches in UAT before promoting them to production. But they miss the importance of testing database software patches in UAT before applying them in production. While you might be able to save some cost by avoiding a proper UAT environment, you may end up in a situation that causes downtime:
1. Because of the impact of a patch which was not tested prior to its application in production
2. Because of an issue whose resolution path first needs to be tested in a test environment
And this is not an isolated example; we come across such situations very often. This leads me to think about whether there is scope for offering a ‘UAT on Demand Service’ along with ‘Database Software Control’ services. Any thoughts?
Similar to the paradox of application patching versus database patching, there is another lesson in this experience. Everyone lays great emphasis on having a proper backup solution and a strategy for taking timely backups. But people rarely spend effort on testing these backups and ensuring their integrity. Through ‘pgBackup Assist’, Ashnik offers customized backup and recovery solutions, and services to implement them in line with your corporate policy. Further, we help you establish checkpoints for ensuring successful backups and help you automate restoration using scripts (which can be scheduled and monitored).
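To make the idea of a scripted restore test concrete, here is a minimal Python sketch. It is not the ‘pgBackup Assist’ implementation; it simply restores a dump into a throwaway database using the standard PostgreSQL client tools `createdb`, `pg_restore` and `dropdb`, and reports whether the restore succeeded. It assumes the backup is a custom-format dump (as produced by `pg_dump -Fc`), that connection settings come from the usual environment variables such as `PGHOST` and `PGUSER`, and that the scratch database name `restore_check` is free to use. The function names are hypothetical.

```python
import subprocess


def restore_check_commands(dump_path, scratch_db):
    """Build the commands for a throwaway restore (names are illustrative)."""
    return [
        ["createdb", scratch_db],
        # --exit-on-error makes pg_restore stop at the first error
        # and finish with a non-zero exit status.
        ["pg_restore", "--exit-on-error", "--dbname=" + scratch_db, dump_path],
        ["dropdb", scratch_db],
    ]


def verify_backup(dump_path, scratch_db="restore_check", run=subprocess.run):
    """Return True if dump_path restores cleanly into a scratch database.

    `run` is injectable so the logic can be exercised without a live server.
    """
    create, restore, drop = restore_check_commands(dump_path, scratch_db)
    if run(create).returncode != 0:
        return False
    try:
        return run(restore).returncode == 0
    finally:
        # Always clean up the scratch database, even if the restore failed.
        run(drop)
```

A clean restore only shows that the dump is readable and loadable. For stronger assurance you would typically also run a few sanity queries (row counts on key tables, for example) against the scratch database before dropping it.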
These lessons have helped us strengthen our service offerings and solutions, which in turn helps our customers solve their challenges and mitigate risk. Stay tuned as our team brings more stories and lessons from live production setups. Meanwhile, if you are looking for Oracle Migration Services, PostgreSQL Services or Training, get in touch with our team at email@example.com.
- Sameer Kumar is the Database Solution Architect working with Ashnik. Sameer and his team work towards providing enterprise-level solutions based on open source and cloud technologies. He has experience in various roles, ranging from development to performance tuning and from installation and migration to support and services.