Elasticsearch with Rails – How to Create a Simple Search Engine in 1 Hour

Have you ever had a problem in your application with a slow search engine? It acts like a simple SQL LIKE query and it’s really slow? Or maybe you want to create an efficient and simple search form on your webpage? If you answered yes to either – this is the tutorial for you!

In this tutorial, I’ll cover how to integrate your Rails application with Elasticsearch and create a simple search engine in 1 hour!

First, let’s cover what Elasticsearch is and how it works. It’s a search engine built on Apache Lucene, an information retrieval software library. It provides distributed full-text search exposed through an HTTP interface. It’s written in Java and developed by Elastic. It’s similar to Apache Solr or Sphinx (maybe you have used one of them before).

It’s also used by a lot of big companies like Facebook, Netflix or Mozilla. You can find out more on the official Elasticsearch website.

What will we build?

We will build a simple Rails 5 application integrated with Elasticsearch. I’ll cover basic search methods, how to keep Elasticsearch in sync with the database, how to index records, and how to customize the indexing process and the structure of the indexed documents. We will also build a simple endpoint which returns serialized search results for a search query. Now let’s get started and write some code!

Basic setup

We’ll start by creating a new Rails 5 app from scratch. We’ll use PostgreSQL as our database system:

$ rails new elasticsearch_with_rails -d=postgresql

Now, we need to install Elasticsearch. If you have already installed it, just skip this part.

I’ll cover how to install it on macOS using Homebrew. If you use Ubuntu, this is a great tutorial which covers the installation process.

To install Elasticsearch, just run the following command:

$ brew install elasticsearch

I’m using Elasticsearch 6.1.1, and this is the version used throughout this tutorial. To run it, simply type:

$ elasticsearch

Just to make sure that everything is ok, go to http://localhost:9200/ and check what you get. Your response should be similar to mine:

{
  "name": "XlePXbs",
  "cluster_name": "elasticsearch_piotrjaworski",
  "cluster_uuid": "omH3-u0OQ3qubIhCgj6rKg",
  "version": {
    "number": "6.1.1",
    "build_hash": "bd92e7f",
    "build_date": "2017-12-17T20:23:25.338Z",
    "build_snapshot": false,
    "lucene_version": "7.1.0",
    "minimum_wire_compatibility_version": "5.6.0",
    "minimum_index_compatibility_version": "5.0.0"
  },
  "tagline": "You Know, for Search"
}
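If you would rather check the version from Ruby than read the JSON by eye, here is a minimal sketch. It assumes the response has the shape shown above; elasticsearch_version is a hypothetical helper name, not something provided by a gem.

```ruby
require 'json'

# Hypothetical helper: pull the version number out of the JSON that
# Elasticsearch returns from its root endpoint (http://localhost:9200/).
# Returns nil when the expected keys are missing.
def elasticsearch_version(body)
  JSON.parse(body).dig('version', 'number')
end

# Trimmed sample of the response shown above.
sample = '{"name":"XlePXbs","version":{"number":"6.1.1","lucene_version":"7.1.0"},"tagline":"You Know, for Search"}'

puts elasticsearch_version(sample)
```

Against a running node, the body could be fetched with Net::HTTP.get(URI('http://localhost:9200/')) from the standard library.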

Elasticsearch setup

Once Elasticsearch is running on your machine, let’s clean up the Gemfile and add the Elasticsearch gem (elasticsearch-model):

source 'https://rubygems.org'

git_source(:github) do |repo_name|
  repo_name = "#{repo_name}/#{repo_name}" unless repo_name.include?("/")
  "https://github.com/#{repo_name}.git"
end

gem 'rails', '~> 5.1.4'
gem 'pg', '~> 0.18'
gem 'puma', '~> 3.7'
gem 'sass-rails', '~> 5.0'
gem 'uglifier', '>= 1.3.0'
gem 'turbolinks', '~> 5'
gem 'elasticsearch-model'

group :development do
  gem 'listen', '>= 3.0.5', '< 3.2'
  gem 'spring'
  gem 'spring-watcher-listen', '~> 2.0.0'
end

There is also an elasticsearch-rails gem, but we won’t need it in this tutorial. It contains some useful commands for Rails; you can find out more about it here.

Let’s install everything by running bundler:

$ bundle install

By default, Elasticsearch uses http://localhost:9200 as an instance endpoint. If you want to change it, you need to change the configuration in the client setup. Also, you can use AWS Elasticsearch but for this purpose, you need to install a gem:

gem 'faraday_middleware-aws-signers-v4'

Here is an example AWS config. It should be added in an initializer, for example config/initializers/elasticsearch.rb:

require 'faraday_middleware/aws_signers_v4'

Elasticsearch::Model.client = Elasticsearch::Client.new({
  log: true
}) do |f|
  f.request(
    :aws_signers_v4,
    credentials: Aws::Credentials.new(ENV.fetch('AWS_ACCESS_KEY'), ENV.fetch('AWS_SECRET_ACCESS_KEY')),
    service_name: 'es',
    region: ENV.fetch('AWS_REGION')
  )
end

If you want to customize it more, you can find a lot of info and more examples here. There are a lot of options and customizations, so I won’t list them all.

We will use the default localhost configuration with one addition: all requests will be logged. Add config/initializers/elasticsearch.rb:

Elasticsearch::Model.client = Elasticsearch::Client.new({
  log: true
})
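If the endpoint differs between environments, one option is to build the client options from environment variables. The sketch below is only an illustration: elasticsearch_client_options is a hypothetical helper, and the variable names ELASTICSEARCH_URL and ELASTICSEARCH_LOG are assumptions made for this example. The url and log keys are the options passed to Elasticsearch::Client.new.

```ruby
# Hypothetical helper: build the options hash for Elasticsearch::Client.new.
# The environment variable names are assumptions chosen for this sketch.
def elasticsearch_client_options(env = ENV)
  {
    url: env.fetch('ELASTICSEARCH_URL', 'http://localhost:9200'),
    log: env.fetch('ELASTICSEARCH_LOG', 'true') == 'true'
  }
end

# With nothing set, the defaults match the configuration used in this tutorial.
p elasticsearch_client_options({})

# In the initializer this would be used as:
#   Elasticsearch::Model.client = Elasticsearch::Client.new(elasticsearch_client_options)
```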

Models setup

Elasticsearch is all about searching, so we need to create a few models and populate the database with some data. I want to use a simple schema: a book and an author. An author has many books, and we will be able to search by ISBN, book name, author name, etc.

Author model, first name and last name:

$ rails g model author first_name last_name

Book model: name, ISBN, author_id, published_at and number of pages:

$ rails g model book name isbn author:references published_at:date pages:integer

Let’s create a database and run all migrations:

$ rake db:create && rake db:migrate

OK, let’s add a basic Elasticsearch setup to our Book class. For now, include only two modules: Elasticsearch::Model and Elasticsearch::Model::Callbacks. The first module adds a lot of methods, for example a search method (Book.search, also available through the __elasticsearch__ proxy as Book.__elasticsearch__.search), which searches the model’s index for matching documents. You can find out more here.

The second module adds callbacks to our model: whenever you create, update or delete a record in the database, the matching Elasticsearch document is updated as well.

require 'elasticsearch/model'

class Book < ApplicationRecord
  belongs_to :author

  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end

It looks more or less like this (logs captured after updating or deleting a record):

[Screenshots 1 and 2: Elasticsearch request logs after updating or deleting a record]

Index definition

Maybe you’re wondering what an indexed document looks like? By default, elasticsearch-model serializes a record by calling as_json on it, so every column is indexed and associations are left out. But what about a model that has associations? We need to handle that ourselves by redefining the as_indexed_json method. We want to index only some fields: id, name, isbn, published_at, pages and the author’s first and last name. We can do it like this:

...

  def as_indexed_json(options = {})
    self.as_json(
      only: [:id, :name, :isbn, :published_at, :pages],
      include: {
        author: {
          only: [:first_name, :last_name]
        }
      }
    )
  end
end
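To make the resulting document shape concrete, here is a plain-Ruby sketch that runs without Rails. BookRow and AuthorRow are stand-in structs (an assumption for this example, not the real models), and the sample date is made up; the hash mirrors what the as_indexed_json override above should produce, with string keys and a nested author object.

```ruby
require 'json'
require 'date'

# Stand-ins for the ActiveRecord models (assumption for this sketch).
AuthorRow = Struct.new(:first_name, :last_name)
BookRow = Struct.new(:id, :name, :isbn, :published_at, :pages, :author)

# Mirrors the document shape of the as_indexed_json override:
# selected book fields plus a nested "author" object.
def indexed_document(book)
  {
    'id' => book.id,
    'name' => book.name,
    'isbn' => book.isbn,
    'published_at' => book.published_at.iso8601,
    'pages' => book.pages,
    'author' => {
      'first_name' => book.author.first_name,
      'last_name' => book.author.last_name
    }
  }
end

book = BookRow.new(1, 'Ruby on Rails', '1234', Date.new(2018, 1, 15), 100, AuthorRow.new('John', 'Black'))
puts JSON.pretty_generate(indexed_document(book))
```

The nested author keys are what make field names such as author.first_name searchable later on.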

Indexing

Before you index any records, you need to create an Elasticsearch index. Note that force: true deletes an existing index with the same name before creating it again. You can do it by running:

Book.__elasticsearch__.create_index!(force: true)

OK, let’s add some dummy records and index them. Add this code to the seeds file:

Book.__elasticsearch__.create_index!(force: true)

author1 = Author.create!(first_name: 'John', last_name: 'Black')
author2 = Author.create!(first_name: 'Joe', last_name: 'Doe')

Book.create!(name: 'Ruby on Rails', isbn: '1234', published_at: Date.today, author: author1, pages: 100)
Book.create!(name: 'Ruby', isbn: '2234', published_at: Date.today, author: author1, pages: 200)
Book.create!(name: 'JavaScript', isbn: '3234', published_at: Date.today, author: author2, pages: 300)
Book.create!(name: 'HTML & CSS', isbn: '4234', published_at: Date.today, author: author2, pages: 400)

Let’s populate our database now!

$ rake db:seed

You should see logs from Elasticsearch in your console:

[Screenshot 3: Elasticsearch logs printed while seeding the database]

You can check newly created indices here:

http://localhost:9200/books

If you want to return all records which are added to Elasticsearch, just open this url:

http://localhost:9200/books/_search?pretty=true&q=*:*&size=50
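If you build such URLs in scripts, it is safer to let Ruby encode the query string. A small sketch using only the standard library; search_url is a hypothetical helper and its defaults simply mirror the URL above.

```ruby
require 'uri'

# Hypothetical helper: build the _search URL for an index on a local node.
def search_url(index, q: '*:*', size: 50, host: 'localhost', port: 9200)
  query = URI.encode_www_form(pretty: true, q: q, size: size)
  URI::HTTP.build(host: host, port: port, path: "/#{index}/_search", query: query).to_s
end

puts search_url('books')
puts search_url('books', q: 'name:ruby', size: 10)
```

The colon in the query is percent-encoded as %3A, which Elasticsearch decodes back to the same q value.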

Controller and basic API

Our models and Elasticsearch are ready, so let’s add an API endpoint which returns serialized records based on a query.

Let’s start by adding a new route. Add a new resource to routes.rb:

resources :books, only: [:index]

How can we return records now? Well, we should use __elasticsearch__.search. There are a lot of ways to perform a query in Elasticsearch; it offers tons of options for finding the result you need.

Its API is really, really rich, and it could take months to really feel that you know something about it 🙂

But for now, let’s focus on some basic definitions.

We can search across all fields, restrict the search to selected fields, or use a different query type altogether.

Book.__elasticsearch__.search(params[:query]).results

The first definition is the most basic search: it scans all fields for the given query.

Book.__elasticsearch__.search(
  query: {
    multi_match: {
      query: params[:query],
      fields: ['name', 'author.first_name', 'author.last_name', 'isbn']
    }
  }
).results

The second implementation is a bit more involved. It uses a multi_match query, which runs the same query text against several selected, indexed fields.
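Since the query text comes straight from params, it can be nil or blank. Here is a minimal sketch of a helper that builds the search body and falls back to match_all in that case; book_search_body and SEARCH_FIELDS are hypothetical names introduced for this example.

```ruby
# Fields searched by the multi_match query (same list as in the example above).
SEARCH_FIELDS = ['name', 'author.first_name', 'author.last_name', 'isbn'].freeze

# Hypothetical helper: build the search body for a user-supplied term.
# A blank term falls back to match_all instead of sending an empty query.
def book_search_body(term, fields: SEARCH_FIELDS)
  text = term.to_s.strip
  return { query: { match_all: {} } } if text.empty?

  { query: { multi_match: { query: text, fields: fields } } }
end

p book_search_body('Joe')
p book_search_body(nil)
```

The returned hash can be passed directly to Book.__elasticsearch__.search.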

Book.__elasticsearch__.search(
  query: {
    query_string: {
      query: "java*"
    }
  }
).results

The last one uses a wildcard: it matches every term that starts with “java”, such as javascript, javatest, javaone and java itself.

We can also call the search method as Book.search, but I prefer the __elasticsearch__ prefix just to be sure that I’m using the method provided by the elasticsearch-model gem.
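The query_string syntax treats characters such as +, -, : and * as operators, so raw user input should be escaped before a wildcard is appended. Below is a sketch of one way to do that; prefix_query and QUERY_STRING_RESERVED are hypothetical names, and the character list follows the reserved characters documented for query_string. The characters < and > are removed because they cannot be escaped.

```ruby
# Characters with special meaning in query_string syntax.
QUERY_STRING_RESERVED = /[+\-=!(){}\[\]^"~*?:\\\/&|]/

# Hypothetical helper: turn raw user input into a prefix search.
# Reserved characters get a leading backslash; < and > are dropped.
def prefix_query(term)
  cleaned = term.to_s.delete('<>').gsub(QUERY_STRING_RESERVED) { |char| "\\#{char}" }
  { query: { query_string: { query: "#{cleaned}*" } } }
end

p prefix_query('java')
p prefix_query('c++')
```

Note that an empty term produces a bare "*", which matches everything, so you may want to guard against blank input first.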

Now that you know how to define a search method, let’s add a BooksController. We will render JSON with the serialized results and the total number of hits:

class BooksController < ApplicationController
  def index
    # .results wraps the hits; it is enumerable and also exposes the total count
    results = Book.__elasticsearch__.search(
      query: {
        multi_match: {
          query: params[:query],
          fields: ['name', 'author.first_name', 'author.last_name', 'isbn']
        }
      }
    ).results

    render json: {
      results: results.to_a,
      total: results.total
    }
  end
end
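If you want the endpoint to return a slimmer payload than the raw hits, you can reshape the response yourself. The sketch below works on a raw search response already parsed into a Hash; search_payload is a hypothetical helper, and the sample response is trimmed, made-up data in the shape Elasticsearch returns.

```ruby
# Hypothetical helper: reduce a raw search response to the payload we render.
def search_payload(raw)
  hits = raw.fetch('hits')
  total = hits['total']
  # Elasticsearch 6.x reports the total as an integer; 7.x and later wrap it
  # in an object such as { "value" => 1, "relation" => "eq" }.
  total = total['value'] if total.is_a?(Hash)

  {
    results: hits.fetch('hits').map do |hit|
      { id: hit['_id'], score: hit['_score'], book: hit['_source'] }
    end,
    total: total
  }
end

raw = {
  'hits' => {
    'total' => 1,
    'hits' => [
      { '_index' => 'books', '_id' => '3', '_score' => 0.98,
        '_source' => { 'name' => 'JavaScript', 'isbn' => '3234' } }
    ]
  }
}

p search_payload(raw)
```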

Now you can search for your result using this url:

http://localhost:3000/books?query=Joe

or via terminal and curl:

$ curl 'http://localhost:3000/books?query=javascript'

We forgot about one thing: what happens when we update an author’s name? We should update the books’ documents in the index too! How can we do it? Well, the easiest way is to call the index_document method to force a record to be reindexed. We can do it in an after_save callback, like this:

class Author < ApplicationRecord
  has_many :books

  after_save :index_books_in_elasticsearch

  private

  def index_books_in_elasticsearch
    books.find_each { |book| book.__elasticsearch__.index_document }
  end
end

Honestly, this doesn’t scale well. Imagine reindexing a huge number of records synchronously inside the Rails process every time an author is saved. It’s not a good idea. A better option is to move the work to a background worker, for example with Sidekiq, and enqueue a job after an author is created or updated.
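One way to keep each background job small is to split the book IDs into fixed-size batches and enqueue one job per batch. The sketch below shows only the batching step in plain Ruby; reindex_batches is a hypothetical helper, and the worker named in the comment (ReindexBooksWorker) is an assumed class you would have to write yourself.

```ruby
# Hypothetical helper: split book IDs into batches so that each background
# job reindexes a bounded number of records.
def reindex_batches(book_ids, batch_size: 100)
  raise ArgumentError, 'batch_size must be positive' unless batch_size.positive?

  book_ids.each_slice(batch_size).to_a
end

p reindex_batches([1, 2, 3, 4, 5], batch_size: 2)

# In the Author callback this could be used as (assumed worker class):
#   reindex_batches(book_ids).each { |ids| ReindexBooksWorker.perform_async(ids) }
```

Passing plain IDs rather than records keeps the job arguments small and lets the worker load fresh data when it runs.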

Summary

Now you know how to integrate Elasticsearch with your Rails application and add some basic features, like custom indexing, searching, and reindexing records from an associated class. I wanted to cover the basics and describe them properly instead of rushing through a lot of topics.

I hope this will be useful for your apps and that you will discover a new passion for Elasticsearch 🙂

If you have any questions, feel free to post them below!

You can find the source code here.
