
Using ELK Stack for Monitoring JVM at scale

Nikhil Shetty | Database Consultant, Ashnik
Singapore, 21 Nov 2019



When you are tasked with monitoring the performance of hundreds of Java applications, life can get pretty stressful. Well, the ELK stack comes to your rescue, providing a scalable, comprehensive way of effectively monitoring and managing the performance of Java applications. In this article, I am sharing my experience of how to leverage the versatility of ELK.

Today, countless server-side applications run on Java. As you might already be aware, Java applications run on JVMs. The JVM manages system memory and provides a portable platform for running Java applications. The term ‘heap memory’ describes the memory used within the JVM, and ‘garbage collection’ is the process that manages this heap memory.

During peak load scenarios, Java applications often end up using very high heap memory. This causes garbage collection to run very frequently to remove unwanted objects from memory, leading to a performance slowdown. So, monitoring these applications becomes a priority, to keep a check on heap utilization, garbage collection count, how often the garbage collector is running, and so on.
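These numbers come from the JVM's own platform MXBeans, the same source that JMX exposes. As a minimal, self-contained sketch (the class name is my own), heap usage and per-collector GC counts can be read from inside a JVM like this:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class JvmStats {
    // Heap usage, as reported by the java.lang:type=Memory MBean
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.println("Heap used (bytes): " + heapUsedBytes());
        // Collection counts, as reported by the java.lang:type=GarbageCollector MBeans
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " collections: " + gc.getCollectionCount());
        }
    }
}
```

Watching these values by hand does not scale, of course, which is exactly the problem the rest of this article addresses.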

How to monitor Java applications?

One way to monitor Java applications is to use JConsole, which gives information as shown below:

[Screenshot: JConsole overview of a running JVM]

Java, by default, does not expose its metrics. We need to use Java Management Extensions (JMX). JConsole takes the information provided by JMX and presents it visually, as graphs, for the various metrics.
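Under the hood, tools like JConsole read attributes of MBeans such as java.lang:type=Memory over JMX. A small illustrative sketch (the class name is my own) of the same lookup done programmatically against the platform MBean server:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JmxHeapQuery {
    // Read the HeapMemoryUsage attribute of the java.lang:type=Memory MBean,
    // the same attribute JConsole (and, later, Jolokia) reads over JMX.
    static long usedHeap() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData heap = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return (Long) heap.get("used");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Heap used via JMX: " + usedHeap() + " bytes");
    }
}
```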

While this seems like a pretty good approach, it can be tedious for monitoring thousands of Java applications. Also, these tools do not provide the flexibility of customization to suit your specific requirements.

Enter ELK stack!

Elasticsearch, Logstash and Kibana (ELK) can be used to monitor Java applications for heap usage, GC, thread count, etc. Organizations use the ELK stack as a solution for different use cases such as log monitoring, search applications and threat detection. Most of these solutions involve the use of Beats to capture data from thousands of sources, publish it to Elasticsearch and visualize it through Kibana. Logstash is also used if filtering and transformation of the source data is required.

One of these Beats, Metricbeat, is used by default to monitor various server metrics such as CPU, memory, process IDs, etc. However, this is at the server level, not for the Java applications running on the JVM.

Apart from its default functionality, Metricbeat also provides a host of modules for monitoring Kafka, Kubernetes, Docker and more. One such module is the Jolokia module, which can be used to monitor the JVM.

Let us see how we can set up ELK for this

We have a sample Spring Boot application. First, let's do some configuration at the source. Remember, we are already running our Java application with JMX enabled. On top of this, we need to install a Jolokia agent, through which the Metricbeat Jolokia module can capture JVM metrics and send them to Elasticsearch.

There are various ways to run Jolokia agents; you can go through them here.

We will be using the Java agent method of running the Jolokia agent. Download the Jolokia agent (a small JAR file) and place it on the server running your Java application. In our case, we have kept it at /opt/jolokia-jvm-1.6.2-agent.jar. Now run your Java application with the below parameters:

java -javaagent:/opt/jolokia-jvm-1.6.2-agent.jar=port=7777,host=192.168.183.139 \
     -Dcom.sun.management.jmxremote.port=5000 \
     -Djava.rmi.server.hostname=192.168.183.139 \
     -jar spring-boot-postgres-0.0.1-SNAPSHOT.jar

Note – the Jolokia agent is listening on port 7777 and JMX is enabled on port 5000.

Next, install Metricbeat; the documentation for this can be found here. Go to the default configuration directory, /etc/metricbeat/. Here you will see a directory called modules.d, where all the different modules are present but disabled by default. To enable the Jolokia module, copy jolokia.yml.disabled to jolokia.yml. We must then make some configuration changes in jolokia.yml, as shown below:

[Screenshot: jolokia.yml configuration]

Note – 192.168.183.139 is the server running the Java application and the Jolokia agent. Here we collect metrics for memory usage, garbage collection and thread count.
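As a rough sketch (not verbatim from the screenshot above), a jolokia.yml for this setup could look something like the following. The namespace and field names here are my own illustrative choices, and the exact MBean mappings depend on your JVM and its garbage collector:

```yaml
- module: jolokia
  metricsets: ["jmx"]
  period: 10s
  # Server running the Java application and the Jolokia agent
  hosts: ["192.168.183.139:7777"]
  namespace: "jvm_metrics"   # illustrative namespace
  jmx.mappings:
    # Heap usage (java.lang:type=Memory MBean)
    - mbean: "java.lang:type=Memory"
      attributes:
        - attr: HeapMemoryUsage
          field: memory.heap_usage
    # Thread count (java.lang:type=Threading MBean)
    - mbean: "java.lang:type=Threading"
      attributes:
        - attr: ThreadCount
          field: thread.count
    # GC MBeans (java.lang:type=GarbageCollector,name=...) can be mapped
    # the same way; the MBean names depend on the collector in use.
```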

We must also configure Metricbeat to send data to Elasticsearch, as shown below. In the metricbeat.yml file, enable the output.elasticsearch section and provide the Elasticsearch IP and port.

[Screenshot: metricbeat.yml output configuration]

Note – 192.168.183.136 is the server running Elasticsearch on port 9201.
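The corresponding metricbeat.yml change is small; a minimal sketch, using the IP and port from the note above:

```yaml
output.elasticsearch:
  hosts: ["192.168.183.136:9201"]
```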

Once this is done, start Metricbeat, log in to Elasticsearch and check whether you are receiving index data from Metricbeat.

You can now start creating visualizations and dashboards in Kibana for monitoring the JVM. Follow the same process for every Java application and you will have a centralized monitoring server for all your JVM monitoring.

[Screenshot: Sample custom dashboard for monitoring GC, thread count, memory and CPU usage]

Once metrics monitoring is enabled, the next question arises – how do I proactively track down the issues that are causing my heap memory to be high? This can be achieved by shipping the logs of these JVM applications to Elasticsearch using Filebeat and monitoring for error statements. Now you have everything in one centralized system, leading to faster response times for resolving critical application issues.
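As a rough sketch of such a Filebeat setup (the log path and multiline pattern below are hypothetical and depend on your application's log format):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/application.log   # hypothetical path to the JVM application log
    # Group stack-trace lines with the log statement that produced them
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after

# Ship to the same Elasticsearch instance used for the metrics
output.elasticsearch:
  hosts: ["192.168.183.136:9201"]
```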

As you can see, ELK offers the versatility to be used in many such scenarios. You can check out my colleagues' experiences of using Elastic in other interesting ways here:

https://www.ashnik.com/tips-on-performance-improvement-in-elasticsearch-part-i/

https://www.ashnik.com/how-to-improve-search-capabilities-using-elasticsearch/

https://www.ashnik.com/elastic-stack-sizing-consideration-and-architecture/

I would be happy to share more details if you want to discuss this use case. You can send an email to success@ashnik.com.


  • Nikhil joined Ashnik as a database consultant for Postgres. He has more than three years of experience working as a database administrator at TCS and as a senior analyst at Allianz Technology, across technologies such as Postgres, Oracle and SAP HANA.
