How to Improve Search Capabilities Using Elasticsearch
Monika Agrawal | Solution Consultant, Ashnik
Today, I'd like to share a solution that Ashnik recently proposed for an insurance customer based in Asia. Our team has been putting this solution framework together with the help of Elasticsearch. As quick background, the customer is one of the largest pan-Asian life insurance groups, present in 18 markets across APAC. The conglomerate offers full-scale insurance and financial services, wealth planning, and investment management services.
In the customer's existing setup, data was ingested from an ODS (a SQL database) into a NoSQL database. All web and mobile applications connected to this NoSQL database to search for relevant data according to their business requirements. The data pipeline used an ETL tool to aggregate data from SQL CDC (change data capture); the data was then loaded into a message queue in JSON format and published to the various assigned queues.
The NoSQL database had a connector that subscribed to these published queues and loaded documents into the NoSQL database for subsequent search operations. The diagram below illustrates this setup.
Here are the challenges the customer faced. In the existing setup, data ingestion from the MS-SQL (RDBMS) database to the NoSQL database took approximately 2-3 days, because data was first extracted in JSON format and then loaded into the NoSQL database through a manual process.
- Any change to the data (for example, an update, delete, or insert) took approximately 24 hours to be captured in the NoSQL database.
- One of the key reasons for this delay was the time consumed by data ingestion and by the queue mechanism, which computed and sent records to the NoSQL database one by one.
- The client was unable to get relevant search results on their business data, for example through a search-as-you-type capability.
So, this is what Ashnik proposed: a technology framework comprising the ELK stack (Elasticsearch, Logstash, and Kibana) for data ingestion, transformation, and publishing the latest data into Elasticsearch. Logstash was used to capture data from the source systems, transform it, and load it into Elasticsearch. Elasticsearch was used to gather all policy, agent, and customer data in one place and to act as the single point of exposure for all kinds of searches through the Kibana portal.
For all users with access to the portal, it was critical that the system could quickly look up 300-400K agents, 5 million policies, and 10 million customer records, while ensuring that information is propagated to the index in less than 10 seconds after any record is updated in the source systems and that a search completes in less than 1 minute.
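To make the propagation requirement concrete, here is a minimal Python sketch of how source-side changes (insert, update, delete) could be grouped into batches and expressed in the newline-delimited format that the Elasticsearch bulk API accepts, instead of being sent one record at a time. In the proposed setup Logstash performs this work, so this is only an illustration and not the actual pipeline. The change-event shape (`op`, `id`, `doc`), the index name `policies`, and the helper names are all assumptions made for the example.

```python
import json
from typing import Dict, Iterable, Iterator, List


def batches(events: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Group change events into fixed-size batches (the last one may be smaller)."""
    batch: List[Dict] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(events: Iterable[Dict], index: str) -> str:
    """Render change events as a newline-delimited JSON bulk request body.

    Hypothetical event shape: {"op": "insert" | "update" | "delete",
                               "id": <key>, "doc": {...}}
    """
    lines: List[str] = []
    for event in events:
        meta = {"_index": index, "_id": str(event["id"])}
        if event["op"] == "delete":
            # A delete action has no document line after it.
            lines.append(json.dumps({"delete": meta}))
        elif event["op"] in ("insert", "update"):
            # "index" creates the document or replaces it entirely,
            # so the same action covers both inserts and updates.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(event["doc"]))
        else:
            raise ValueError(f"unknown operation: {event['op']!r}")
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    changes = [
        {"op": "insert", "id": 101, "doc": {"policy_no": "P-101", "status": "active"}},
        {"op": "update", "id": 102, "doc": {"policy_no": "P-102", "status": "lapsed"}},
        {"op": "delete", "id": 103},
    ]
    for batch in batches(changes, size=2):
        print(bulk_body(batch, index="policies"), end="")
```

Each batch body would be sent in a single bulk request, so the per-record round trip that slowed the old queue mechanism is paid once per batch.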
The diagram below shows the proposed setup.
The proposed solution offers many benefits:
- With the new Elastic Stack solution, a full load of data from the ODS (MS-SQL database) into Elasticsearch takes less than an hour.
- Changes to data at the source (for example, an update, delete, or insert) are captured in Elasticsearch in less than 5 minutes, which previously took a couple of hours.
- The client gained enhanced search capabilities, as Elasticsearch is one of the best text search engines, and can now retrieve the most relevant results. Examples include search-as-you-type and score-based ranking, along with features such as fuzziness, synonyms, and faceted search.
- Ashnik's team also leveraged the capabilities of the Elastic Stack by building a customized N-gram analyzer and suggestions on search results.
- The Elastic Stack made it possible to build real-time dashboards and visualizations for business analytics.
- Moving forward, the client also plans to explore Graph (a feature of X-Pack) as a recommendation engine, as well as the machine learning capabilities of the Elastic Stack for anomaly detection.
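The search-as-you-type and N-gram points above can be sketched as follows. The article does not describe the actual analyzer, so this example assumes an edge n-gram token filter, a common choice for prefix matching, combined with a fuzzy `match` query. The field name `agent_name`, the analyzer and filter names, and the gram sizes are hypothetical, and the mapping uses the typeless form found in Elasticsearch 7 and later. The small `edge_ngrams` helper only imitates, in plain Python, the tokens such a filter would emit for a single term.

```python
from typing import Dict, List


def edge_ngrams(term: str, min_gram: int = 2, max_gram: int = 10) -> List[str]:
    """Imitate an edge n-gram filter: lowercase prefixes of one term."""
    term = term.lower()
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# Index settings and mapping (assumed names and sizes, not the customer's).
INDEX_SETTINGS: Dict = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 10,
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "agent_name": {
                "type": "text",
                # Prefixes are generated at index time only; the query text
                # itself is analyzed with the standard analyzer.
                "analyzer": "autocomplete",
                "search_analyzer": "standard",
            }
        }
    },
}


def search_as_you_type_query(text: str) -> Dict:
    """Build a match query that tolerates small typos via fuzziness."""
    return {
        "query": {
            "match": {
                "agent_name": {"query": text, "fuzziness": "AUTO"}
            }
        }
    }


if __name__ == "__main__":
    print(edge_ngrams("Monika", min_gram=2, max_gram=5))
    print(search_as_you_type_query("moni"))
```

Because prefixes are stored at index time, a partially typed name can match a stored token directly, while the fuzziness setting helps absorb small typing mistakes.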
Most financial institutions have a core RDBMS system surrounded by peripheral applications such as search, analytics, data warehousing (DW), and reporting. Now, the customer can use the Elastic Stack as a single platform to sync this RDBMS data in real time and power modern application search, business analytics, real-time reporting, machine learning, and more. There are several potential areas to scale up using the Elastic platform.
Monika is a Solution Consultant with Ashnik, Singapore. She is an experienced professional in data mining methodology and integration tools, with more than 8 years of experience in Business Intelligence, Integration Services, and Database Engineering. Having worked for several global companies, she has been instrumental in leading and executing multiple projects and POCs in product consultation, analytics, and solution design.