How To Explore Data Using Elastic Search And Kibana?

Data Pipeline and Analytics | Jun 18, 2018


How to explore data using Elastic search and Kibana? Part – 1

Introduction:

Exploratory Data Analysis (EDA) helps uncover the underlying structure of data and its dynamics, which lets us get the most insight out of it. EDA is also critical for extracting important variables and detecting outliers and anomalies. Even though Machine Learning offers many algorithms, EDA remains one of the most critical steps for understanding the data and driving the business.
In this part of the article, I cover the installation and configuration of Elasticsearch and Kibana with an x-pack basic license, indexing data using Python, and some use cases of the Elastic Stack's Graph feature with sample dashboards.

Elastic Search:

Elasticsearch is a highly scalable, open-source full-text search and analytics engine based on Lucene. It provides a distributed, multitenant-capable search engine with an HTTP web interface and schema-free JSON documents, and its RESTful API covers a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data, so you can discover the expected and uncover the unexpected.
Some use cases:
1. Movie Recommendation system
2. Loan prediction system
3. Online web store for various products, with a recommendation system based on purchase history.
4. Price alerting for products: for example, I am interested in buying a mobile phone and want to be notified if its price falls below $X at any provider within the next month.
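
As a rough illustration of the price-alert use case, the sketch below builds a search body that matches a product by name and keeps only offers priced below a threshold. The field names "name" and "price" and the function name are hypothetical, chosen only for this example; running the query on a schedule and sending the notification are left out.

```python
import json


def build_price_alert_query(product_name, max_price):
    """Return a search body for offers of a product priced below max_price.

    The field names "name" and "price" are assumptions for illustration;
    adjust them to whatever the index mapping actually uses.
    """
    return {
        "query": {
            "bool": {
                # Full-text match on the product name.
                "must": [{"match": {"name": product_name}}],
                # Keep only documents whose price is strictly below the threshold.
                "filter": [{"range": {"price": {"lt": max_price}}}],
            }
        }
    }


body = build_price_alert_query("mobile phone", 300)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as the body of a search request once data with those fields has been indexed.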

Installation of Elastic Search

1. Install java, elastic search requires at least java
2. Download the latest Elastic search 6.3.0 tar as follows:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.0.tar.gz

3. Then extract it as follows:
tar -xvf elasticsearch-6.3.0.tar.gz
4. This creates a number of files and folders in your current directory. Go into the bin directory as follows:
cd elasticsearch-6.3.0/bin
Now we are ready to start our single-node cluster using the command:
./elasticsearch
With the default configuration, the Elasticsearch instance should be running at http://localhost:9200, which you can open in your browser.
Keep the terminal where the elasticsearch command is running open to keep the instance running. You could also use nohup to run the instance in the background:
nohup ./elasticsearch &
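
To confirm from a script that the node started, a minimal sketch using only the Python standard library is shown below. It assumes the default address http://localhost:9200 with no security enabled; the helper name fetch_node_info is made up for this example.

```python
import json
import urllib.error
import urllib.request


def fetch_node_info(url="http://localhost:9200", timeout=3):
    """Return the JSON document served at the node's root URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, bad URL, or a non-JSON reply.
        return None


info = fetch_node_info()
if isinstance(info, dict):
    print("Connected to Elasticsearch", info.get("version", {}).get("number"))
else:
    print("Elasticsearch is not reachable on http://localhost:9200")
```

If the node is still starting up, the check simply reports it as unreachable; running it again after a few seconds is usually enough.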

Kibana:

Kibana is an open source data exploration and visualization tool built on Elastic Search to help you understand data better. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster. Users can create bar, line and scatter plots, or pie charts and maps on top of large volumes of data.

Installation of Kibana:

1. Download the latest Kibana package for your platform (Windows or Linux).
2. Unzip or untar the file and open config/kibana.yml in an editor. Set elasticsearch.url to point at your Elasticsearch instance; in our case it should be http://localhost:9200.
3. Run bin/kibana on Linux, or bin\kibana.bat on Windows.
4. Open http://localhost:5601, which will show you the Kibana UI.
Note: If you are using x-pack in Kibana, the default username is elastic and the password is changeme.
If you are not using x-pack, the Kibana URL (http://localhost:5601) takes you straight to the main page; if you have installed x-pack, the first page will look like:
(In this article, I am using x-pack.)

Creating Dashboards:

A Kibana dashboard displays a collection of saved visualizations.

The first page of Kibana UI

Sample Dashboard:

Indexing data

Elasticsearch indexes data into its internal format and stores each record as a JSON document. Below is the Python code to insert data into ES. Install the elasticsearch library as shown to index through Python.
pip install elasticsearch
Note: The code assumes that Elasticsearch is running on localhost with the default configuration.
Creating a Simple Index using Python:
1. Create a .py file (index_test.py in this example) and copy the following code.
```python
from datetime import datetime
from elasticsearch import Elasticsearch

# Connects to localhost:9200 by default.
# This line will change for x-pack: need to add user name and password.
es = Elasticsearch()

doc = {
    'author': 'tushar raut',
    'text': 'Elasticsearch: ELK stack is cool.',
    'timestamp': datetime.now(),
}

# Index the document under an explicit id.
res = es.index(index="test-index", doc_type='tweet', id=1, body=doc)
# Elasticsearch 6.x reports the outcome in the 'result' field.
print(res['result'])

# Fetch the same document back by id.
res = es.get(index="test-index", doc_type='tweet', id=1)
print(res['_source'])

# Refresh the index so the document is visible to search.
es.indices.refresh(index="test-index")

res = es.search(index="test-index", body={"query": {"match_all": {}}})
print("Got %d Hits:" % res['hits']['total'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(author)s: %(text)s" % hit["_source"])
```

2. Execute the above python script: python index_test.py
3. Go to Kibana → Management → Index Patterns; there you will see the index that was created in Elasticsearch.
4. Create an index pattern and click on the Discover menu to see the data within that index.
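
The script's comment notes that the connection changes when x-pack security is in use, because a user name and password are then required. As a hedged sketch of what that means at the HTTP level, the example below builds (without sending) a GET request for the indexed document with an HTTP Basic authentication header, using only the standard library. The helper name is made up, and the credentials are the defaults mentioned in the note above, which your installation may not use.

```python
import base64
import urllib.request


def build_authenticated_get(index, doc_type, doc_id, user, password,
                            base_url="http://localhost:9200"):
    """Build a GET request for one document with an HTTP Basic auth header."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(
        "{}/{}/{}/{}".format(base_url, index, doc_type, doc_id)
    )
    request.add_header("Authorization", "Basic " + token)
    return request


# Assumed credentials; replace them with the ones configured on your node.
request = build_authenticated_get("test-index", "tweet", 1, "elastic", "changeme")
print(request.full_url)

# Sending the request needs a running, secured node:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

With the elasticsearch Python client, the equivalent is to pass the credentials when creating the client instead of building headers by hand.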

Creating Graph:

There are potential relationships living among the documents in your Elastic Stack; linkages between people, places, preferences, products, you name it. Graph offers a relationship-oriented approach that lets you explore the connections in your data using the relevant capabilities of Elasticsearch.
Graph is an API- and UI-driven tool that helps you surface relevant relationships in your data while leveraging Elasticsearch features like distributed query execution, real-time data availability, and indexing at any scale.
Use cases:
1. Fraud: Discover which vendor is responsible for a group of compromised credit cards by exploring the shops where purchases were made.
2. Recommendations: Suggest the next best song for a listener who digs Mozart based on their preferences and keep them engaged and happy.
3. Security: Identify potential bad actors and other unexpected associates by looking at external IPs that machines on your network are talking to.
Example:
We have a very good example of a movie recommendation system. The source and data are available here:
The simple graph just recommends movies based on parameters such as the number of likes a movie received in the respective year.
1. Click on Graph, select the index, then click on the + icon to select fields.
2. Enter a movie name in the search bar and click on the search icon.
3. The graph will be shown as follows:

Another example of graph based on security analysis:

So, from the above graph, it becomes very easy to see which movies people like together: the movie "Rocky" was liked by people who also liked Rocky-II, Jaws, and some others. In this way, the Graph feature of the Elastic Stack makes it easy to understand the insights in the data by plotting and visualizing it.
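
Since Graph is API-driven as well as UI-driven, the same kind of exploration can be requested programmatically. The sketch below only builds the request body for a Graph explore call; the index name "movies" and the field name "liked" are hypothetical, and the endpoint path printed at the end is an assumption based on the 6.x x-pack API layout.

```python
import json


def build_graph_explore_body(search_text, field, size=5):
    """Build a Graph explore body: seed vertices from a query, then follow connections.

    The field is assumed to hold the movie titles each user liked.
    """
    return {
        # Seed query: documents mentioning the movie we start from.
        "query": {"match": {field: search_text}},
        # Terms to extract as the first set of vertices.
        "vertices": [{"field": field, "size": size}],
        # Terms connected to those vertices (movies liked by the same people).
        "connections": {"vertices": [{"field": field, "size": size}]},
    }


index_name = "movies"  # hypothetical index name
graph_body = build_graph_explore_body("Rocky", "liked")
print("POST /{}/_xpack/graph/_explore".format(index_name))
print(json.dumps(graph_body, indent=2))
```

The response to such a request lists vertices and the connections between them, which is the data the Graph UI draws.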
So, in this article, I covered the first step of data exploration, i.e. installation and configuration of Elasticsearch and Kibana, and indexing data using the Python module of Elasticsearch. In the next part I'll go through how to deal with a large dataset using Python, load that data into Elasticsearch for real-time search and analytics, and explore it with the Elastic Stack's Machine Learning feature. Watch this space!
Tushar Raut | Full Stack Developer, Ashnik

Tushar is a Full Stack Developer at Ashnik. One of his key responsibilities is to help design, develop and test integration services for the Elastic Stack. He works closely with Solution Architect teams on the integration and implementation aspects of customer solutions.
He is experienced in working with various technologies like Python, Java, C, Bash Scripting and has also developed tools using Django and Flask Web frameworks.



