sphinx snippets

How to install and use the Sphinx search engine and acts_as_sphinx plugin on Debian Etch

Tagged sphinx, search, acts_as_sphinx, debian, etch, rails, install, libstemmer  Languages bash

Inspiration for this snippet was taken from this post on the Sphinx forum, plus this blog post.

Compiling Sphinx

First install the prerequisites:

sudo aptitude install libmysql++-dev libmysqlclient15-dev checkinstall

Next, download Sphinx and libstemmer, then compile and install everything:

cd /usr/local/src

wget http://sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz
tar zxvf sphinx-0.9.9.tar.gz 

cd sphinx-0.9.9/

# Add stemming support for Swedish, Finnish and other fun languages.
wget http://snowball.tartarus.org/dist/libstemmer_c.tgz
tar zxvf libstemmer_c.tgz

./configure --with-libstemmer
make

make install

Configure Sphinx

Create a sphinx.conf file in your Rails config directory, as described here, or use this template.

Install acts_as_sphinx plugin

./script/plugin install http://svn.datanoise.com/acts_as_sphinx

Add acts_as_sphinx to your model:

class Documents < ActiveRecord::Base
  acts_as_sphinx
end

Indexing content

rake sphinx:index

(in /var/www/xxx.com/releases/20080429144230)
Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file './sphinx.conf'...
indexing index 'xxx.com'...
collected 5077 docs, 0.6 MB
sorted 0.1 Mhits, 100.0% done
total 5077 docs, 632096 bytes
total 0.160 sec, 3950427.25 bytes/sec, 31729.86 docs/sec

Reindexing content

sphinx:index shouldn't be run while the searchd process is running, so use rake sphinx:rotate instead, which restarts the searchd process after indexing.
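
If you script reindexing, the choice between the two tasks can be automated. Here is a minimal sketch, assuming searchd writes its pid to a known file (pid_path stands for whatever pid_file is set to in your sphinx.conf). The helper names are made up for illustration and are not part of acts_as_sphinx.

```ruby
# Returns true when the pid stored in pid_path belongs to a live process.
# pid_path is an assumption: use the pid_file value from your sphinx.conf.
def searchd_running?(pid_path)
  return false unless File.exist?(pid_path)
  pid = File.read(pid_path).to_i
  return false if pid <= 0   # empty or garbage pid file
  Process.kill(0, pid)       # signal 0 only checks that the process exists
  true
rescue Errno::ESRCH
  false                      # stale pid file: no such process
rescue Errno::EPERM
  true                       # process exists but belongs to another user
end

# Pick the rake task that is safe to run right now.
def index_task(pid_path)
  searchd_running?(pid_path) ? "sphinx:rotate" : "sphinx:index"
end
```

A deploy script could then run something like system("rake", index_task("/var/log/searchd.pid")).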

Starting the daemon

mkdir -m 775 /var/log/sphinx
rake sphinx:start

(in /var/www/xxx.com/releases/20080429144230)
Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file './sphinx.conf'...
Sphinx searchd server started.

Searching

Documents.find_with_sphinx 'why did I write this'

Sphinx configuration file template

Tagged sphinx, template, configuration  Languages 
source feed_items
{
        type                    = mysql

        sql_host                = 127.0.0.1
        sql_user                = root
        sql_pass                =
        sql_db                  = xxx_production
        sql_port                = 3306  # optional, default is 3306
        sql_sock                = /var/run/mysqld/mysqld.sock

        sql_query_pre           = SET NAMES utf8
        #sql_query_pre          = SET SESSION query_cache_type=OFF

        # The unique document ID must be the first column
        sql_query               = \
                SELECT i.id, i.title, i.link, f.link, f.title FROM feed_items i LEFT JOIN feeds f ON f.id = i.feed_id
}


index feed_items
{
        source                  = feed_items
        path                    = /var/sphinx/xxx
        morphology              = libstemmer_sv
        charset_type            = utf-8
}


indexer
{
        mem_limit               = 32M
}

searchd
{
        address                 = 127.0.0.1
        port                    = 3312
        log                     = /var/log/sphinx/searchd.log
        query_log               = /var/log/sphinx/query.log
        pid_file                = /var/log/searchd.pid
        max_matches             = 1000
}
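
One thing worth checking after editing this template is that every index block names a source block that actually exists. Below is a rough Ruby sketch of such a check. The parser is deliberately naive: it does not join backslash continuation lines and is not how Sphinx itself reads the file. Both function names are hypothetical.

```ruby
# Collect the blocks of a sphinx.conf and their "key = value" settings.
# Naive on purpose: continuation lines after a trailing "\" are ignored.
def parse_sphinx_conf(text)
  blocks = []
  current = nil
  text.each_line do |raw|
    line = raw.sub(/#.*/, "").strip   # drop comments and padding
    next if line.empty? || line == "{"
    if line == "}"
      current = nil
    elsif current && line =~ /\A(\w+)\s*=\s*(.*)\z/
      current[:settings][$1] = $2
    elsif current.nil? && line =~ /\A(source|index|indexer|searchd)\b\s*([\w.:-]*)/
      current = { :type => $1, :name => $2, :settings => {} }
      blocks << current
    end
  end
  blocks
end

# Names used in "source = ..." inside index blocks that no source block defines.
def undefined_sources(blocks)
  defined = blocks.select { |b| b[:type] == "source" }.map { |b| b[:name] }
  used = blocks.select { |b| b[:type] == "index" }.map { |b| b[:settings]["source"] }
  used.compact.reject { |name| defined.include?(name) }
end
```

Calling undefined_sources(parse_sphinx_conf(File.read("config/sphinx.conf"))) should return an empty array for the template above, since the feed_items index refers to the feed_items source.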

How to configure wildcard and fuzzy search for Sphinx and Thinking Sphinx

Tagged sphinx, search, thinking-sphinx, wildcard, fuzzy  Languages ruby

This how-to explains how to configure wildcard and fuzzy search for Sphinx and the Thinking Sphinx Rails plugin.

Configure wildcard and fuzzy search in your model

First set the enable_star and min_infix_len properties inside the define_index block:

class Post < ActiveRecord::Base
  define_index do
    # ...

    set_property :enable_star => true
    set_property :min_infix_len => 1
  end
end

Optionally you can make the settings global by adding them to config/sphinx.yml:

production:
    enable_star: true
    min_infix_len: 1
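
To get a feel for what these two settings do, here is a small plain-Ruby illustration. It is only a toy model and not Sphinx's implementation: min_infix_len controls how short an indexed substring of a word may be, and enable_star lets a * in the query mark where a keyword may be extended. The function names are invented for this sketch.

```ruby
# All substrings of word with at least min_len characters: roughly the
# infixes that an index with min_infix_len = min_len is able to match.
def infixes(word, min_len)
  result = []
  (0...word.length).each do |start|
    (min_len..(word.length - start)).each do |len|
      result << word[start, len]
    end
  end
  result.uniq
end

# Toy version of star matching for a single keyword.
def star_match?(pattern, word, min_len)
  return word == pattern unless pattern.include?("*")
  core = pattern.delete("*")
  return false if core.length < min_len   # too short to have been indexed
  if pattern.start_with?("*") && pattern.end_with?("*")
    infixes(word, min_len).include?(core)
  elsif pattern.end_with?("*")
    word.start_with?(core)
  else
    word.end_with?(core)
  end
end
```

In this model star_match?("*hin*", "sphinx", 1) is true, while the same call with a minimum length of 4 is false because "hin" is too short. A low min_infix_len matches more but also makes the index larger, since many more substrings have to be stored.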

Stop, configure, reindex and start Sphinx

For Sphinx to pick up the changes, we need to stop it, regenerate its configuration, reindex, and start it again. Thinking Sphinx provides rake tasks for each step:

export RAILS_ENV=xxx
rake ts:stop
rake ts:conf
rake ts:in
rake ts:start
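
If you run this sequence often, it can be wrapped in a few lines of Ruby. This is only a sketch under a few assumptions: rake is on the PATH, the script is started from the Rails root, and the restart_sphinx helper and its injectable runner are names invented here, not part of Thinking Sphinx.

```ruby
TS_TASKS = %w[ts:stop ts:conf ts:in ts:start]

# Run each task in order with RAILS_ENV set, stopping at the first failure.
# The runner is injectable so the sequence can be dry-run; by default it
# shells out with Kernel#system.
def restart_sphinx(rails_env, runner = nil)
  runner ||= lambda { |env, *cmd| system(env, *cmd) }
  TS_TASKS.each do |task|
    ok = runner.call({ "RAILS_ENV" => rails_env }, "rake", task)
    return task unless ok   # name of the task that failed
  end
  nil                       # nil means every task succeeded
end
```

Passing the environment to every command keeps all four tasks working against the same environment, and stopping at the first failure avoids starting searchd on top of a half-built index.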

Verify Sphinx configuration

Now open the Sphinx configuration file in an editor:

$ vim config/production.sphinx.conf

Verify that you can see the correct settings:

...
index post_core
{
...
   min_infix_len = 1
   enable_star = true
}
...

Test

Fire up the console and run some queries:

Post.search('xxx', :star => true)
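
The :star => true option saves you from typing the wildcards yourself: Thinking Sphinx wraps the words of the query in asterisks before handing it to Sphinx. The following plain-Ruby function is only an approximation of that behaviour (the real code also deals with quoted phrases and search operators), and star_query is a name made up for this example.

```ruby
# Wrap every word in asterisks, roughly what :star => true does to a query.
# \p{Word} is used instead of \w so that non-ASCII letters count as word
# characters too.
def star_query(query)
  query.gsub(/\p{Word}+/) { |word| "*#{word}*" }
end
```

For example, star_query("sphinx search") returns "*sphinx* *search*".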

Create a search controller

Now all that's left is to create the search controller and view:

class SearchController < ApplicationController
  def index
    @query = params[:query]
    options = {
      :page => params[:page], :per_page => params[:per_page], :star => true,
      :field_weights => { :title => 20, :tags => 10, :body => 5 }
    }
    @posts = Post.search(@query, options)
  end
end

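Note that to get relevant search results you need to assign different weights to fields. Here matches in the title are weighted above matches in the tags, which in turn are weighted above matches in the body.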
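
To see why the weights matter, here is a toy scoring function in plain Ruby. It is a deliberate simplification and not Sphinx's ranking formula, which takes more into account than raw hit counts. The weighted_score helper and the sample documents are invented for this sketch.

```ruby
FIELD_WEIGHTS = { :title => 20, :tags => 10, :body => 5 }

# Toy relevance score: occurrences of term in each field times that
# field's weight, summed over all fields.
def weighted_score(doc, term, weights = FIELD_WEIGHTS)
  weights.inject(0) do |sum, (field, weight)|
    hits = doc.fetch(field, "").downcase.scan(term.downcase).length
    sum + hits * weight
  end
end

in_title = { :title => "Sphinx tips", :body => "search notes" }
in_body  = { :title => "Misc", :body => "sphinx and more sphinx" }
```

With these weights, weighted_score(in_title, "sphinx") is 20 and weighted_score(in_body, "sphinx") is 10, so a single title match outranks two body matches.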

And finally, here's the view code:

<% @posts.each do |post| %>
  <%# Render each post here, for example its title and a link. %>
<% end %>

References

Thinking Sphinx advanced documentation
Sphinx Documentation: min_infix_len
Sphinx Documentation: min_prefix_len
Sphinx Documentation: enable_star