elasticsearch snippets

NoSQL, Pagination and Search with Mongoid, Kaminari and Tire

Tagged mongoid, kaminari, tire, elasticsearch, pagination  Languages ruby

This example uses Mongoid, Kaminari and Tire (ElasticSearch):

require 'kaminari/models/mongoid_extension'

class Product
  include Mongoid::Document
  include Mongoid::Paranoia
  include Mongoid::Timestamps

  index :application_id, :unique => true

  # NOTE Best to include after Mongoid
  include Tire::Model::Search
  include Tire::Model::Callbacks

  include Kaminari::MongoidExtension::Criteria
  include Kaminari::MongoidExtension::Document
end

Now you can paginate and search all you want:

# Search and paginate
Product.tire.search :page => 1, :per_page => 100, :load => true do
  query { string "Ho ho" }
  # sort { by :rating, 'desc' }
end

# Paginate
Product.page(1).per(100)

Gotchas

* Use Model.tire.search and Model.tire.index instead of Model.search and Model.index, which conflict with Mongoid's own index/search methods.
* Use :load => true to get instances of the model you are searching for instead of Tire::Results::Item objects.
* Pass :page and :per_page to Model.tire.search; do not use the page/per methods on a Tire search.
* Mongoid queries are not the same as ActiveRecord queries; see http://mongoid.org/docs/querying/criteria.html#where
* MongoDB URL (HTTP status interface): http://localhost:28017/
* ElasticSearch mapping URL: http://localhost:9200/products/_mapping
* Boolean queries are difficult; see https://gist.github.com/1263816
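
The :page and :per_page options exist because Elasticsearch itself paginates with from/size offsets rather than page numbers. A minimal Python sketch of that translation follows; the helper name page_to_from_size is hypothetical and is not part of Tire or Kaminari:

```python
def page_to_from_size(page, per_page):
    """Translate a 1-based page number into Elasticsearch's from/size pair."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    # "from" is the zero-based offset of the first hit on the requested page.
    return {"from": (page - 1) * per_page, "size": per_page}


print(page_to_from_size(1, 100))  # first page starts at offset 0
print(page_to_from_size(3, 100))  # third page starts at offset 200
```

Keep in mind that deep pages become expensive, because every shard has to collect from + size hits before the results are merged.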

How to use ElasticSearch with Python

Tagged elasticsearch, python, pyes  Languages python

This is a short example of how to use ElasticSearch with Python.

First install pyes (see the pyes documentation).

Then run this code:

# https://pyes.readthedocs.org/en/latest/references/pyes.es.html
# http://davedash.com/2011/02/25/bulk-load-elasticsearch-using-pyes/
from pyes import *

index_name = 'xxx'
type_name = 'car'

conn = ES('127.0.0.1:9200', timeout=3.5)

docs = [
    {"name":"good",  "id":'1'},
    {"name":"bad", "id":'2'},
    {"name":"ugly", "id":'3'}
]

# Bulk index
for doc in docs:
    # index(doc, index, doc_type, id=None, parent=None, force_insert=False, op_type=None, bulk=False, version=None, querystring_args=None)
    conn.index(doc, index_name, type_name, id=doc['id'], bulk=True)

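# Refresh so the newly indexed documents become searchable
print(conn.refresh())

# Search
def search(query):
    # default_operator="AND" requires every term in the query to match
    q = StringQuery(query, default_operator="AND")
    result = conn.search(query=q, indices=[index_name])
    for r in result:
        print(r)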


search("good")
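
With bulk=True, pyes queues the documents and sends them through the _bulk endpoint for you. To illustrate what goes over the wire, here is a rough sketch of that newline-delimited body built with the standard library only. The helper name bulk_body is hypothetical, and the _type key assumes an Elasticsearch version that still uses mapping types, as this example does:

```python
import json


def bulk_body(docs, index_name, type_name):
    """Build a newline-delimited _bulk body: one action line, then one source line."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_type": type_name, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"name": "good", "id": "1"},
    {"name": "bad", "id": "2"},
    {"name": "ugly", "id": "3"},
]
print(bulk_body(docs, "xxx", "car"))
```

This only builds the body as a string; nothing is sent to a server.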

You can also use curl to verify that it works:

# Show index mapping
curl -vvv "http://127.0.0.1:9200/xxx/_mapping?pretty=1"

# Delete index
curl -XDELETE -vvv "http://127.0.0.1:9200/xxx"

# Search
curl -vvv "http://127.0.0.1:9200/xxx/_search?pretty=1"
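
The same search can be expressed without pyes. The sketch below uses only the Python 3 standard library and builds (but does not send) a request roughly equivalent to the StringQuery search, assuming a node at 127.0.0.1:9200 and the index name xxx from the example. The helper name build_search_request is made up for illustration:

```python
import json
import urllib.request


def build_search_request(query, index_name, host="http://127.0.0.1:9200"):
    """Build a _search request with a query_string query that ANDs all terms."""
    body = {"query": {"query_string": {"query": query, "default_operator": "AND"}}}
    return urllib.request.Request(
        url="%s/%s/_search?pretty=1" % (host, index_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("good", "xxx")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To run it against a live node: print(urllib.request.urlopen(req).read())
```

Building the request separately from sending it makes it easy to inspect the JSON body before touching the cluster.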

ElasticSearch Wildcard and NGram Search With Tire

Tagged ngram, wildcard, elasticsearch, tire  Languages ruby

How to implement wildcard search with Tire and Elasticsearch:

settings analysis: {
  filter: {
    ngram_filter: {
      type: "nGram",
      min_gram: 1,
      max_gram: 15
    }
  },
  analyzer: {
    index_ngram_analyzer: {
      tokenizer: "standard",
      filter: ["standard", "lowercase", "stop", "ngram_filter"],
      type: "custom"
    },
    search_ngram_analyzer: {
      tokenizer: "standard",
      filter: ["standard", "lowercase", "stop"],
      type: "custom"
    }
  }
}

mapping do
  indexes :name,
    search_analyzer: 'search_ngram_analyzer',
    index_analyzer: 'index_ngram_analyzer',
    # analyzer: 'index_ngram_analyzer',
    boost: 100.0
  # …
end
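
Why this gives wildcard-like behaviour: at index time the n-gram filter stores every substring of each token between min_gram and max_gram characters long, while the search analyzer leaves the query term whole, so a partial term can match one of the stored grams. The Python sketch below is a simplified model of that idea, not Elasticsearch's implementation; it ignores tokenization, lowercasing and stop words, and the function name ngrams is hypothetical:

```python
def ngrams(token, min_gram=1, max_gram=15):
    """Return every substring of token whose length is within [min_gram, max_gram]."""
    grams = set()
    for start in range(len(token)):
        for length in range(min_gram, max_gram + 1):
            end = start + length
            if end > len(token):
                break
            grams.add(token[start:end])
    return grams


indexed = ngrams("simpsons")
print("simp" in indexed)      # a prefix of the indexed token
print("psons" in indexed)     # a suffix of the indexed token
print("simpsonz" in indexed)  # not a substring of the indexed token
```

The trade-off is index size: with min_gram set to 1, every single character becomes a term, so the index grows quickly and one-letter queries match almost everything.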

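With curl, make sure the mapping is set up properly (the sample output below was captured from an index named skulls, so substitute your own index name):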

curl 'http://localhost:9200/activities/_mapping?pretty=true'
{
  "skulls" : {
    "skull" : {
      "_all" : {
        "auto_boost" : true
      },
      "properties" : {
        "name" : {
          "type" : "string",
          "boost" : 100.0,
          "analyzer" : "index_ngram_analyzer"
        }
      }
    }
  }
}

You now have wildcard-style (partial match) search, provided you specify the fields you want to search. By default a search runs against the _all field rather than the n-gram-analyzed name field:

# This searches the _all field
curl 'http://localhost:9200/activities/_search?q=simpsons&pretty=true'

# This searches only the name field, so the n-gram analyzers apply
curl -XGET 'http://localhost:9200/activities/_search?pretty' -d ' 
{ 
   "query" : { 
      "query_string" : { 
         "query" : "simpsons", 
         "fields" : ["name"] 
      } 
   } 
}'

Elasticsearch: How to delete log entries older than X days with Curator

Tagged curator, elasticsearch  Languages bash, cron

# Install curator (the package on PyPI is named elasticsearch-curator)
pip install elasticsearch-curator
# Download curator config file
curl -o curator.yml https://raw.githubusercontent.com/elastic/curator/master/examples/curator.yml

Next, download, read, and edit the action file: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/actionfile.html
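
To make the "older than X days" selection concrete, here is a small Python sketch of the kind of age check a delete action performs when index names carry a date. It is not Curator's code: the function name indices_older_than, the logstash- prefix and the %Y.%m.%d date pattern are all assumptions for illustration, and the real behaviour is whatever you configure in the action file:

```python
from datetime import datetime, timedelta


def indices_older_than(index_names, days, prefix="logstash-",
                       timestring="%Y.%m.%d", now=None):
    """Return the names whose embedded date is more than `days` days before `now`."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the dated log indices
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # suffix is not a date in the expected format
        if stamp < cutoff:
            selected.append(name)
    return selected


names = ["logstash-2017.11.01", "logstash-2017.12.20", "kibana"]
print(indices_older_than(names, days=30, now=datetime(2017, 12, 25)))
```

Before letting cron delete anything, it is worth running curator once with its --dry-run flag to see which indices the action file would select.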

# Run curator
curator --config curator.yml action_file.yml

Add this to crontab:

# Run curator at 00:01
01 00 * * * /usr/local/bin/curator --config /etc/elasticsearch/curator/curator.yml /etc/elasticsearch/curator/remove-old-data.yml >> /var/log/elasticsearch-curator.log

Tested with Elasticsearch 6.0 and curator version 5.4.1.