What Does “Database High Availability” Really Mean?

Database Platform | Jul 30, 2020



For many of our customers, high availability is a key concern. Their architects spend a lot of time designing and planning for the high availability of applications and databases, because it underpins business continuity: even a short period of downtime can mean lost business. That is why I decided to write this blog.
If you search for "high availability", you will find many definitions. Wikipedia offers this one:
"High availability (HA) is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period."

Key Principles of High Availability

The following are the key principles of High Availability:

Eliminate any single point of failure: Add redundancy so that the failure of any one part of the system does not bring down the entire system.
Reliable crossover: In a redundant system, the crossover point itself becomes a single point of failure. Fault-tolerant systems must provide a reliable crossover or automatic switchover mechanism to avoid failure.
Detection of failures: Failures must be detected as they occur. If the two principles above are in place and the system is proactively monitored, a user may never see a system failure.
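
To make the first principle concrete, here is a minimal sketch of how redundancy improves availability. It assumes that nodes fail independently of each other and that the crossover is perfect; both are simplifying assumptions, and the function name is only illustrative.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Availability of `nodes` redundant nodes.

    Assumes independent failures and a perfect crossover, which real
    systems only approximate.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The system is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions, two nodes that are each available 99% of the time give roughly 99.99% combined availability. In practice, shared dependencies (network, storage, power) make failures correlated, so the real gain is smaller.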

EDB Postgres has building blocks for covering all of the above key principles.

Elimination of single points of failure – Postgres supports the following types of physical standbys:
- Cold Standby – A backup server that holds backups and all the WAL files necessary for recovery. By definition, this system is not up and running, but it can be made available if needed. We mainly use backup servers and WAL files to create a new PostgreSQL node as part of disaster recovery.
- Warm Standby – In Warm Standby mode, Postgres runs in recovery mode and receives updates using archived log files or using log shipping replication. In this mode, Postgres does not accept connections or queries.
- Hot Standby – In Hot Standby mode, Postgres runs in recovery mode and receives the updates using archived log files or using log shipping replication. In recovery mode, Postgres supports connections and read-only queries.

Any of the above can help eliminate single points of failure. Which one to choose depends on the agreed level of performance and uptime. Since Postgres 9.0, the most popular standby mode has been Hot Standby.

Reliable crossover – For a reliable crossover, that is, switching between the master and standby nodes, EDB provides a technology called EDB Postgres Failover Manager (EFM). This technology enables automatic failover from the Postgres master node to a standby node in case of a software or hardware failure on the master. EFM uses JGroups, which provides a reliable, distributed, and redundant infrastructure without a single point of failure.
Detection of failures – EDB Postgres Failover Manager continuously monitors the server and detects failures. It also executes the failover from the master to one of the replicas so that the system is again available to accept database connections and execute queries. Properly configured, EFM can detect a failure and execute a failover within a few seconds.
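
The general idea behind failure detection can be sketched with a simple consecutive-miss counter. This is not how EFM is implemented internally; the class name, the threshold, and the list of check results are all hypothetical and only illustrate why a detector usually waits for several missed health checks before starting a failover.

```python
from dataclasses import dataclass


@dataclass
class FailureDetector:
    """Declares a node failed after `threshold` consecutive missed checks."""

    threshold: int = 3
    missed: int = 0

    def record(self, check_succeeded: bool) -> bool:
        """Record one health-check result; return True if failover should start."""
        if check_succeeded:
            self.missed = 0  # any success resets the streak
        else:
            self.missed += 1
        return self.missed >= self.threshold


detector = FailureDetector(threshold=3)
results = [True, False, True, False, False, False]  # simulated health checks
decisions = [detector.record(ok) for ok in results]
print(decisions)
```

Only the last check, the third miss in a row, triggers a failover decision. A lower threshold detects failures faster but is more likely to fail over on a transient network glitch, so the value is a trade-off to tune for each environment.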

Combining all of the above helps achieve High Availability of EDB Postgres within a data center or across data centers. If you are a cloud user, you can have High Availability within a region (across multiple zones) or across regions (using a backplane network supported by the cloud vendor). For a detailed walkthrough of the questions to ask when designing highly available databases, watch our on-demand webinar.

PostgreSQL Database Uptime and Availability

Uptime and availability are generally used synonymously. To achieve High Availability and maintain the agreed uptime, architects work to reduce outages and downtime.
Service outages come in two main flavors:

1. Planned outages
2. Unplanned outages

Some people refer to them as Scheduled and Unscheduled downtime.

Planned outage/Scheduled downtime – A planned outage is the result of maintenance activities that disrupt system operation and usually cannot be avoided, such as patches to system software that require a reboot or a database restart. In general, a planned outage is the result of some logical, management-initiated event.
Unplanned outage/Unscheduled downtime – An unplanned outage is the result of physical failures or events, such as a hardware or software failure or an environmental anomaly. Examples include power outages, failed CPU or RAM components (or other hardware failures), network failures, security breaches, and failures in applications, middleware, or the operating system.

For both kinds of outage, EDB Postgres Failover Manager can help minimize the downtime. For a planned outage/scheduled downtime, a user/DBA can first patch all the standbys and then use EDB Postgres Failover Manager to perform a switchover before patching the old master (primary) node.
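
The order of operations for such a planned patch can be written down as a small sketch. The function and node names are hypothetical, and promoting the first standby is an assumption made only for illustration; the switchover itself would be carried out with your failover tooling, not by this code.

```python
from typing import List


def rolling_patch_plan(primary: str, standbys: List[str]) -> List[str]:
    """Return the ordered steps for patching a cluster with minimal downtime."""
    if not standbys:
        raise ValueError("a switchover needs at least one standby")
    # 1. Patch every standby while the primary keeps serving traffic.
    steps = [f"patch standby {name}" for name in standbys]
    # 2. Switch roles (assumption: the first patched standby is promoted).
    new_primary = standbys[0]
    steps.append(f"switchover: promote {new_primary}, demote {primary}")
    # 3. Patch the former primary, which is now a standby.
    steps.append(f"patch former primary {primary}")
    return steps


for step in rolling_patch_plan("node1", ["node2", "node3"]):
    print(step)
```

With this ordering, the only interruption the application sees is the switchover itself.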
For unplanned outage/unscheduled downtime, EDB Postgres Failover Manager can detect failures and perform the failover to the appropriate standby, and make it the new master, which can then accept read/write connections and provide database services to the application. EDB Postgres Failover Manager also makes sure that the old master/primary doesn’t come back (after failover) to avoid a split-brain situation.
With EDB Postgres Failover Manager in place, an architect who wants to further reduce application unavailability can also use the multi-host connection support of the JDBC driver or libpq, as in the following libpq connection URI:

postgresql://host1:123,host2:456/somedb?target_session_attrs=read-write&application_name=myapp

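With this connection string, the client tries the listed hosts in order and connects only to one that accepts read-write sessions. New connections therefore reach the new master/primary after a failover without any change to the application configuration, which makes the failover largely transparent to the application.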
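
If the host list comes from configuration, such a URI can be assembled programmatically. The helper below is a minimal sketch using only the Python standard library; `multi_host_uri` is a hypothetical name, and the hosts, ports, and parameters are the placeholder values from the example above.

```python
from typing import Iterable, Tuple
from urllib.parse import urlencode


def multi_host_uri(hosts: Iterable[Tuple[str, int]], dbname: str, **params: str) -> str:
    """Build a libpq-style multi-host connection URI."""
    # Hosts are comma-separated, each with its own port.
    host_list = ",".join(f"{host}:{port}" for host, port in hosts)
    # Connection parameters go into the query string.
    query = urlencode(params)
    return f"postgresql://{host_list}/{dbname}" + (f"?{query}" if query else "")


uri = multi_host_uri(
    [("host1", 123), ("host2", 456)],
    "somedb",
    target_session_attrs="read-write",
    application_name="myapp",
)
print(uri)
```

The resulting string can be passed to a libpq-based driver as its connection string. Note that connections that were already open when the failover happened are still lost, so the application needs retry or reconnect logic as well.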

Availability Calculation

Availability is usually calculated/expressed as a percentage of uptime in a given year based on the service level agreements. Some companies exclude the planned outage/scheduled downtime based on their agreements with customers on the availability of their services.
The table below translates availability percentages, from four nines to five nines, into the corresponding amount of time a system would be unavailable.

Availability %	Downtime per year	Downtime per month	Downtime per week	Downtime per day
99.99% (“four nines”)	52.60 minutes	4.38 minutes	1.01 minutes	8.64 seconds
99.995% (“four and a half nines”)	26.30 minutes	2.19 minutes	30.24 seconds	4.32 seconds
99.999% (“five nines”)	5.26 minutes	26.30 seconds	6.05 seconds	864.00 milliseconds
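
The yearly figures in the table can be reproduced with a short calculation. The sketch below assumes an average year of 365.25 days, which is what the table's numbers are consistent with; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Allowed downtime per year, in minutes, for a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    # Downtime is the fraction of the year the system is NOT available.
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


for pct in (99.99, 99.995, 99.999):
    print(f"{pct}%: {downtime_minutes_per_year(pct):.2f} minutes per year")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why moving from four nines to five nines usually requires automated failure detection and failover rather than manual intervention.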

Based on the use cases and service level agreements, EDB has been able to help our customers to achieve five 9s with EDB Postgres.
Want to learn more about how to operate Postgres at scale, with flexible deployment options? Check out the EDB Postgres Platform.
