Cross Cluster In Elastic Stack – Use Case & Improvement Recommendations

Data Pipeline and Analytics | Oct 16, 2018

I wanted to share some recommendations for setting up an Elastic Stack cluster to achieve high availability / DR.

Recently, I came across a situation where a large organization was building the ELK stack (Elastic Stack) across data centers, with the nodes of a single Elasticsearch cluster spread over two DCs.

If you are a system admin or a DBA, nothing looks wrong with this architecture as a DC/DR deployment. But there are some limitations when you stretch an Elastic Stack cluster across DCs. Let’s look at this in detail.

Typically, an Elastic Stack architecture is designed for a single data center, with all the Elasticsearch nodes in one DC. In the solution described here, the ES nodes are spread across two DCs. This architecture creates the challenges described below:

1. Network Latency:

Network disruption is very common over a WAN, especially when the DCs/servers are physically far apart. Most large organizations have dedicated WAN links with good bandwidth and latency. Elasticsearch is built to be resilient to network disconnects, but that resiliency is intended to handle the exception, not the norm. Latency is also a very common problem on WAN links, and high network latency can slow down indexing in Elasticsearch. Elasticsearch indexes the data into the primary shard first and then replicates it to the replica shards. If there is latency between the two, indexing becomes slow, and missing replica shards also become a very common problem.
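
To make the effect of latency concrete, here is a rough back-of-the-envelope model, not a benchmark. It assumes a write is acknowledged only after the primary has indexed it and the slowest replica has responded. The function names and all the millisecond figures are illustrative assumptions, not measurements.

```python
def write_latency_ms(primary_ms, replica_ms, replica_rtts_ms):
    """Estimate one write: primary indexing time plus the slowest replica.

    replica_rtts_ms holds the network round-trip time from the primary
    to each replica; replicas are assumed to be contacted in parallel.
    """
    if not replica_rtts_ms:
        return primary_ms
    slowest_replica = max(rtt + replica_ms for rtt in replica_rtts_ms)
    return primary_ms + slowest_replica


def max_sequential_requests_per_sec(latency_ms):
    """Upper bound for one client that sends a single request at a time."""
    return 1000.0 / latency_ms


# Hypothetical numbers: 5 ms of indexing work on each shard copy,
# 1 ms round trip inside a DC versus 40 ms round trip across the WAN.
same_dc = write_latency_ms(primary_ms=5, replica_ms=5, replica_rtts_ms=[1])
cross_dc = write_latency_ms(primary_ms=5, replica_ms=5, replica_rtts_ms=[40])
print(same_dc, cross_dc)  # 11 50
```

With these made-up figures, moving the replica to the other DC raises the per-request latency from 11 ms to 50 ms, which is why bulk indexing over a WAN feels so much slower.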

2. Unreliable connectivity:

If the DCs lose network connectivity or are isolated from each other, even briefly, the remote shards are likely to go missing and come back in a disconnected state.
To recover from this, the replicas have to be synced with the primaries again before the cluster can provide consistent results.

3. Master Availability:

In the scenario above, assume that the DC1-to-DC2 link goes down briefly while master node 1 in DC1 is the elected master. The master-eligible nodes in DC2 may then start electing a new master within DC2, because they can no longer ping the existing master in DC1. Elastic Stack uses Zen Discovery to track node availability, but even so a network partition like this can cause problems. As a result, DC2 may start indexing new data that is inconsistent with DC1. When the link is restored, the nodes will also be pushing data and documents across the network while still handling the full indexing and request load. This calls for larger or more powerful clusters, with enough CPU and IOPS to maintain acceptable performance during such events.
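
The election problem can be sketched with a little arithmetic. With Zen Discovery, the `discovery.zen.minimum_master_nodes` setting decides how many master-eligible nodes must be visible before a master can be elected, and the usual recommendation is a quorum of (N / 2) + 1. The snippet below is only a simplified model of that rule; the helper names and node counts are assumptions for illustration.

```python
def quorum(master_eligible_total):
    """Recommended minimum_master_nodes: a strict majority of eligible nodes."""
    return master_eligible_total // 2 + 1


def can_elect_master(visible_master_eligible, minimum_master_nodes):
    """A side of the partition can elect a master only if it sees enough nodes."""
    return visible_master_eligible >= minimum_master_nodes


def partition_outcome(dc1_nodes, dc2_nodes, minimum_master_nodes):
    """Which DC could elect a master while the inter-DC link is down."""
    return {
        "dc1": can_elect_master(dc1_nodes, minimum_master_nodes),
        "dc2": can_elect_master(dc2_nodes, minimum_master_nodes),
    }


# Two master-eligible nodes per DC, setting too low: both sides elect a master.
print(partition_outcome(2, 2, minimum_master_nodes=2))
# Same layout with a proper quorum of 3: neither side can elect a master.
print(partition_outcome(2, 2, minimum_master_nodes=quorum(4)))
# Uneven layout (2 + 1) with quorum 2: only DC1 keeps a master.
print(partition_outcome(2, 1, minimum_master_nodes=quorum(3)))
```

In this model an even split across two DCs either risks two masters or leaves no master at all during the outage, and an uneven split only protects the larger DC. Neither result is what most people expect from a DR design.
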
Now, let’s look at the possible solutions for running Elastic Stack across DCs.

1. Data in both DCs with 2 clusters.

You can put a message queue such as Kafka or Redis in front of Elastic Stack. Beats send the data to the message queue, which is replicated to both DCs. In each DC, a local Logstash reads from the local queue, processes the data and sends it to the local Elasticsearch cluster. Each DC therefore has its own ES cluster, and documents from the queue are indexed into the local ES cluster only.
If the network between the DCs goes down, each side resumes from where it left off once the link is restored and continues indexing data into its local ES cluster.
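
The "continue where it left off" behaviour comes from the consumer keeping track of its position in the queue. The toy simulation below uses plain Python lists instead of a real Kafka or Logstash setup, and every class and function name in it is hypothetical. It only illustrates the idea of an offset-based resume after a link outage.

```python
class LocalQueue:
    """Append-only log standing in for the message queue copy in one DC."""

    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)


class LocalIndexer:
    """Stands in for the local Logstash plus the local ES cluster of one DC."""

    def __init__(self, queue):
        self.queue = queue
        self.offset = 0    # position of the next unread event
        self.indexed = []  # documents "indexed" into the local cluster

    def poll(self):
        """Index everything that arrived since the last poll."""
        new_events = self.queue.events[self.offset:]
        self.indexed.extend(new_events)
        self.offset += len(new_events)
        return len(new_events)


def publish(event, queues):
    """Deliver an event to every queue that is currently reachable."""
    for queue in queues:
        queue.append(event)


dc1_queue, dc2_queue = LocalQueue(), LocalQueue()
dc1, dc2 = LocalIndexer(dc1_queue), LocalIndexer(dc2_queue)

publish("e1", [dc1_queue, dc2_queue])  # link up: both DCs receive the event
dc1.poll()
dc2.poll()

publish("e2", [dc1_queue])             # link down: only DC1 receives events
publish("e3", [dc1_queue])
dc1.poll()
dc2.poll()                             # nothing new for DC2 yet

# Link restored: queue replication catches up, then DC2 resumes from its offset.
for event in dc1_queue.events[len(dc2_queue.events):]:
    dc2_queue.append(event)
dc2.poll()

print(dc1.indexed, dc2.indexed)
```

Neither indexer ever talks to the other DC, so an outage only delays the remote copy instead of disturbing the local cluster.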

2. Snapshot and Restore using Curator:

If you do not need an “Active-Active” cluster in both DCs, you can use the Curator tool to take snapshots continuously or on a schedule and restore them at the other end. Curator can be configured to take a snapshot of DC1 at regular intervals and to restore it at DC2 on a matching schedule. Each time Curator restores the data, only the incremental data is restored at the other end, where it becomes available for search or DR.
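
Under the hood, Curator drives the Elasticsearch snapshot and restore REST API. The sketch below is not a Curator configuration. It only builds the two underlying requests so you can see what a snapshot/restore cycle consists of, and it does not send anything. The repository name `dr_repo`, the `logs-*` pattern and the snapshot naming scheme are assumptions, and the repository is assumed to be registered already on both clusters.

```python
import json
from datetime import datetime, timezone


def snapshot_request(repository, indices, now=None):
    """Build the request that creates a timestamped snapshot on the source DC."""
    now = now or datetime.now(timezone.utc)
    name = "snapshot-" + now.strftime("%Y%m%d-%H%M%S")  # must be lowercase
    path = f"/_snapshot/{repository}/{name}?wait_for_completion=true"
    body = {
        "indices": indices,
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    return "PUT", path, body


def restore_request(repository, snapshot_name, indices):
    """Build the request that restores a snapshot on the DR cluster."""
    path = f"/_snapshot/{repository}/{snapshot_name}/_restore"
    body = {"indices": indices, "include_global_state": False}
    return "POST", path, body


method, path, body = snapshot_request("dr_repo", "logs-*")
print(method, path)
print(json.dumps(body, indent=2))

method, path, body = restore_request("dr_repo", "snapshot-20181016-120000", "logs-*")
print(method, path)
```

One thing to plan for: an index that already exists and is open on the DR side generally cannot be restored over, so the restore job has to close or remove the target indices first. Check this behaviour against the Elasticsearch version you run.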

3. Cross-Cluster Search:

Cross-Cluster Search has recently become a supported feature. More details are here. It lets you search both DCs as if they were a single big cluster, while indexing of data still happens locally in each DC.
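
With Cross-Cluster Search, a remote index is addressed as `<cluster alias>:<index>` in the search URL. The small helper below is hypothetical and only builds such a URL path. It assumes the other DC has already been registered as a remote cluster under the alias `dc2`, and that both clusters use a `logs-*` index naming scheme.

```python
def cross_cluster_search_path(index_pattern, remote_aliases, include_local=True):
    """Build a _search path that covers the local and the remote clusters."""
    targets = [f"{alias}:{index_pattern}" for alias in remote_aliases]
    if include_local:
        targets.insert(0, index_pattern)  # unprefixed pattern = local cluster
    return "/" + ",".join(targets) + "/_search"


print(cross_cluster_search_path("logs-*", ["dc2"]))
# /logs-*,dc2:logs-*/_search
```

A search sent to that path on the local cluster returns hits from both DCs, while each cluster keeps indexing only its own data.
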
These are some solutions that can help you achieve high availability / DR in Elastic Stack. You can get in touch with us at success@ashnik.com for further queries.
