Elasticsearch as a backend. Why or why not?


If you search Google for what Elasticsearch is, you will find a variety of answers: “Analytics Engine”, “Data Store”, “Search Engine”, “Big Data Solution”. Most of these answers leave you with more questions than they resolve. The reality is that it is all of those things. It is highly scalable, can ingest unstructured data from multiple sources, provides efficient search, and much more.

At its core, you can think of Elasticsearch as a server that can process JSON requests and give you back JSON data.

It is becoming more and more popular in data engineering and data analytics. But what about its role in the good old field of web development?


Elasticsearch’s versatility is fascinating. Among all the things it is capable of, we want to look at it from a web development perspective. Elasticsearch is a NoSQL database. That means it stores data as schema-flexible JSON documents rather than as rows in relational tables, and that its primary query interface is a JSON-based query DSL rather than SQL. However, unlike most NoSQL databases, Elasticsearch has a strong focus on search capabilities and features, so much so that the easiest way to get data from ES is to search for it using the extensive Elasticsearch API.


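 In Elasticsearch terms, a rough analogy with relational databases is:

 Index = Database, Type = Table, Document = Row.

 Note that mapping types were deprecated in Elasticsearch 6 and removed in later versions, so in current releases an index behaves more like a single table.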

An index is a collection of documents that have similar characteristics. For example, we can have one index for customer data and another for product information. A document is similar to a row in a table and is stored as JSON. Imagine an e-commerce site with a lot of products. An index named products would hold that data inside Elasticsearch, and each product would be a document stored in that index.

Indexing a document
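
As a minimal sketch, indexing a document amounts to sending its JSON to the index over HTTP. The example below only builds the request with Python's standard library and does not send it. The node address http://localhost:9200, the products index, the document ID and the product fields are all assumptions made for illustration.

```python
import json
import urllib.request

# Assumed values for illustration: a local node and a made-up product.
ES_URL = "http://localhost:9200"
product = {"name": "Trail running shoes", "category": "footwear", "price": 89.99}


def build_index_request(index, doc_id, document):
    """Build (but do not send) a PUT /<index>/_doc/<id> request."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request("products", "1", product)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running node, which replies with JSON describing the result. A secured cluster would additionally need credentials, which this sketch leaves out.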


Despite all its features, it is not advisable to use Elasticsearch as the primary backend of your application without a backing database. Here are some of the reasons:

  • There is a risk of data loss when dealing with huge amounts of data.
  • The number of primary shards is fixed when an index is created, so index sizing has to be planned up front. Schema/mapping changes to existing fields require re-indexing. If the data grows or evolves beyond what the original sharding or mapping strategy can handle, you have to migrate it into new indexes.
  • Performance becomes a problem if every data query has to be served from Elasticsearch, especially when the volume of data is huge and everything is indexed without attention to the query patterns actually used.
  • General write operations such as indexing (inserting documents) are also more expensive than in many other databases.

So then, why all the fuss?

Elasticsearch is not meant to serve as a primary database; it has a different objective. It is best used as a derived storage system to take advantage of its speed. Data in a derived system is the result of taking existing data from another system and transforming or processing it in some way. If you lose derived data, you can recreate it from the original source. Elasticsearch is also extremely useful when you need near-real-time search or record display.
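
To make the idea of derived data concrete, here is a rough sketch of rebuilding an index from the original source. It reads rows from a stand-in primary database and turns them into a payload in the newline-delimited JSON format used by the bulk API. The in-memory SQLite table, its columns, and the products index name are assumptions for illustration, and nothing is actually sent to a cluster.

```python
import json
import sqlite3

# Stand-in primary database (assumption: an in-memory SQLite table of products).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
db.executemany(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
    [(1, "Trail running shoes", 89.99), (2, "Rain jacket", 120.0)],
)


def build_bulk_body(connection, index):
    """Turn every row of the source table into a bulk-API NDJSON payload."""
    lines = []
    rows = connection.execute("SELECT id, name, price FROM products ORDER BY id")
    for row_id, name, price in rows:
        # Action line: which index and document ID to write.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(row_id)}}))
        # Source line: the document itself.
        lines.append(json.dumps({"name": name, "price": price}))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(db, "products")
print(bulk_body)
```

The resulting text would be posted to the _bulk endpoint with the Content-Type application/x-ndjson. Because the primary database remains the source of truth, the same function can be run again whenever the index has to be recreated.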

A search engine and a database do fundamentally different things. A good search engine such as Elasticsearch supports far more elaborate indexing, faceting, highlighting and so on, and it returns results in near real time. By default, a search engine does not return every single document that matches your query. Instead, it scores documents according to how well they match and returns the top-scoring ones. When you query a database such as MongoDB, you expect it to return everything that matches your query.
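
The difference can be illustrated with a toy example. The scoring below is a plain count of query-term occurrences, chosen only to show the idea; it is not Elasticsearch's actual relevance algorithm, which is considerably more sophisticated. The documents and function names are made up for illustration.

```python
# Made-up documents keyed by ID.
docs = {
    1: "red running shoes for trail running",
    2: "blue running jacket",
    3: "red rain jacket",
    4: "leather office belt",
}


def database_style(query_terms):
    """Return the ID of every document containing at least one query term."""
    return [
        doc_id
        for doc_id, text in docs.items()
        if any(term in text.split() for term in query_terms)
    ]


def search_style(query_terms, size):
    """Score by how often the query terms occur, then keep the top `size`."""
    scored = []
    for doc_id, text in docs.items():
        words = text.split()
        score = sum(words.count(term) for term in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by the lower document ID.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:size]]


query = ["red", "running", "shoes"]
print(database_style(query))        # every match: [1, 2, 3]
print(search_style(query, size=2))  # best two matches: [1, 2]
```

The database-style lookup hands back all three matching documents in storage order, while the search-style lookup ranks them and cuts the list off at the requested size.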

You can store the entire document in Elasticsearch, but that is usually not the optimal solution. Normally you configure it to return only the document IDs, which you then use to fetch the documents from a database. There are various plug-ins and clients for integrating Elasticsearch with Node.js, MongoDB or MySQL.
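
A minimal sketch of that pattern follows, under stated assumptions: the primary database is an in-memory SQLite table, and the search response is a hand-written sample shaped like a search result rather than output from a real cluster (the scores are invented). Setting "_source" to false in the search body asks for hits without the stored document, so only IDs and scores come back.

```python
import sqlite3

# Stand-in primary database (assumption: in-memory SQLite, made-up rows).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT, price REAL)")
db.executemany(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
    [
        ("1", "Trail running shoes", 89.99),
        ("2", "Rain jacket", 120.0),
        ("3", "Road running shoes", 75.0),
    ],
)

# Search body that would be sent to the products index; not sent in this sketch.
search_body = {"query": {"match": {"name": "running shoes"}}, "_source": False, "size": 10}

# Hand-written sample response; a real one carries more fields.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "products", "_id": "3", "_score": 1.4},
            {"_index": "products", "_id": "1", "_score": 1.1},
        ]
    }
}


def fetch_ranked_products(connection, response):
    """Load the full rows for the hit IDs, keeping the search ranking."""
    ids = [hit["_id"] for hit in response["hits"]["hits"]]
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = connection.execute(
        f"SELECT id, name, price FROM products WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # Preserve relevance order; skip IDs that no longer exist in the database.
    return [by_id[doc_id] for doc_id in ids if doc_id in by_id]


products = fetch_ranked_products(db, sample_response)
print(products)
```

The database stays the source of truth for the full records, and the search index only decides which records to show and in what order.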

With this, we have the answer to our question:

Choose the right tool for the job.
