Full text search on Rails using the acts_as_solr plugin
Now let’s learn with a simple example how you can use this plugin to add full text search functionality to your models. But before that, let’s check out acts_as_solr and Solr features:
Features
Solr features
- Based on the proven and widely known Lucene search library
- Using the fast and lightweight Jetty HTTP server
- Many filters, plugins and complements available from the community (stemmers, charset converters, stop-words filters with lists in many languages)
acts_as_solr features
Warming up
To start using this plugin you’ll have to set up a Rails application (rails -d mysql acts_as_solr_sample) and then install the plugin like this:
ruby script/plugin install git://github.com/mauricio/acts_as_solr.git
To be sure that the plugin was correctly installed, check if there’s a file at “RAILS_ROOT/config/solr.yml” and a folder at “RAILS_ROOT/config/solr”. If both the config file and folder exist, you’re ready to start using acts_as_solr in your application.
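If you’d rather script that check than eyeball it, here’s a minimal sketch. The solr_config_present? helper is hypothetical (the plugin doesn’t provide it), and the temporary directory only stands in for your RAILS_ROOT:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not provided by the plugin): true when both the
# config file and the config folder mentioned above are present.
def solr_config_present?(rails_root)
  File.file?(File.join(rails_root, 'config', 'solr.yml')) &&
    File.directory?(File.join(rails_root, 'config', 'solr'))
end

# Demonstration against a throwaway directory standing in for RAILS_ROOT.
Dir.mktmpdir do |root|
  puts solr_config_present?(root)   # nothing installed yet
  FileUtils.mkdir_p(File.join(root, 'config', 'solr'))
  FileUtils.touch(File.join(root, 'config', 'solr.yml'))
  puts solr_config_present?(root)   # both file and folder now exist
end
```

In a real application you would pass your own application root instead of the temporary directory.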
Solr is built on Lucene, a Java full text search engine, so you’ll also need a Java Runtime Environment (JRE) installed on your machine. If you don’t already have one, you can find the latest version at http://java.com/
Configuration
The first configuration file we have to check out is the solr.yml:
# Config file for the acts_as_solr plugin.
#
# If you change the host or port number here, make sure you update
# them in your Solr config file
development:
  url: http://localhost:8982/solr
  # uncomment this line if you want to have Solr errors raised at your application
  # if this property is undefined or set to false the errors will be logged
  # using the rails logger but they will not be raised to the application
  # raise_error: true
production:
  url: http://localhost:8983/solr
  # raise_error: true
This file is used both to start the Solr server and by the plugin to make requests to it, so make sure the host is the real name of the machine that is going to run Solr, especially in production. Every environment can use its own configuration. The raise_error setting tells acts_as_solr whether errors received while talking to the Solr server should be raised to your application. The default value is “false”, which means the errors are logged using the Rails logger but are not raised. We’ll get back to error handling later.
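To make the per-environment lookup concrete, here’s a rough sketch of how such a file could be read. It is not the plugin’s own loading code: solr_settings is a hypothetical helper, the YAML is inlined so the example runs on its own, and raise_error is switched on for production only to show both cases.

```ruby
require 'yaml'

# Sample config mirroring solr.yml, inlined so the sketch is self-contained.
# raise_error is enabled for production here purely for illustration.
SOLR_YAML = <<-YAML
development:
  url: http://localhost:8982/solr
production:
  url: http://localhost:8983/solr
  raise_error: true
YAML

# Hypothetical helper, not part of acts_as_solr: returns the settings
# for a single environment, applying the documented default.
def solr_settings(yaml_text, environment)
  config = YAML.load(yaml_text).fetch(environment)
  {
    :url         => config.fetch('url'),
    :raise_error => config.fetch('raise_error', false) # undefined means "log only"
  }
end

p solr_settings(SOLR_YAML, 'development')
p solr_settings(SOLR_YAML, 'production')
```

The development entry falls back to raise_error being false, matching the default behaviour described above.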
The files under “RAILS_ROOT/config/solr” are the heart of your Solr server configuration. They tell Solr which filters and field configurations should be used, and that’s also where your index files are going to be stored. The Solr index files live at “RAILS_ROOT/config/solr/data/RAILS_ENV”. Be sure to ignore the index folders when pushing data to your source control system.
Using the plugin
Now it’s time to get your hands dirty using the plugin for real. Let’s start by building the model that’s going to be searched, the NewsStory:
class NewsStory < ActiveRecord::Base
  acts_as_solr
  validates_presence_of :title, :description

  def to_s
    title
  end
end
And here’s the migration that’s going to create it:
class CreateNewsStories < ActiveRecord::Migration
  def self.up
    create_table :news_stories do |t|
      t.string :title, :null => false
      t.text :description
      t.timestamps
    end
  end

  def self.down
    drop_table :news_stories
  end
end
With our model created and the migration run (rake db:migrate) we’ll have to start the Solr server:
rake solr:start
After this you should get some output like this:
Solr started successfully on 8982, pid: 12770.
2009-05-29 15:00:09.853::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2009-05-29 15:00:09.966::INFO: jetty-6.1.18
2009-05-29 15:00:10.027::INFO: Extract file:/home/mauricio/NetBeansProjects/acts_as_solr_sample/vendor/plugins/acts_as_solr/jetty/webapps/solr.war to /tmp/Jetty_localhost_8982_solr.war__solr__6dieve/webapp
2009-05-29 15:00:10.972::INFO: Opened /home/mauricio/NetBeansProjects/acts_as_solr_sample/log/development_2009_05_29.request.log
2009-05-29 15:00:10.992::INFO: Started SelectChannelConnector@localhost:8982
The Jetty server that loads the Solr webapp is now ready to begin indexing your data and answering search calls. Let’s add some news stories to the database (fire up your “ruby script/console”):
NewsStory.create(
  :title => 'acts_as_solr rocks',
  :description => 'a simple and easy way to do full text searching in your rails app' )
NewsStory.create(
  :title => 'couchdb is the next big thing',
  :description => 'you should start paying attention to it, nice and easy way to store and search data' )
Now it’s time for us to search for them using Solr:
news_stories = NewsStory.find_by_solr( 'easy' )
news_stories.each { |news_story| puts news_story.title }
You should receive an object (an ActsAsSolr::SearchResults) with the two news stories you just persisted to your database. This object behaves just like a common Array, so you can use it anywhere you’d expect to use an Array. It also implements the same methods found in the will_paginate collection, so you can pass it to your will_paginate view helpers as well.
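To picture what “behaves like an Array but also carries pagination data” means, here’s a small standalone sketch. PagedResults is a hypothetical stand-in written for this post, not the plugin’s ActsAsSolr::SearchResults implementation. It only mirrors the pagination method names (current_page, per_page, total_entries, total_pages) that a will_paginate collection exposes.

```ruby
require 'delegate'

# Hypothetical stand-in for an Array-like, paginated result object.
# This is NOT the plugin's ActsAsSolr::SearchResults implementation.
class PagedResults < DelegateClass(Array)
  attr_reader :current_page, :per_page, :total_entries

  def initialize(records, current_page, per_page, total_entries)
    super(records)                 # Array methods are forwarded to records
    @current_page  = current_page
    @per_page      = per_page
    @total_entries = total_entries
  end

  # Number of pages needed to show every entry.
  def total_pages
    (total_entries.to_f / per_page).ceil
  end
end

titles  = ['acts_as_solr rocks', 'couchdb is the next big thing']
results = PagedResults.new(titles, 1, 10, titles.size)

results.each { |title| puts title }  # iterates like a plain Array
puts results.total_pages             # pagination data a view helper would read
```

Delegating to the wrapped Array keeps every Array method available while the extra readers supply what pagination helpers need.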
And guess what? That’s it!
Now you have full text search working in your application in an almost effortless way. Read the plugin docs to get a better feel for the options you can pass to the acts_as_solr and find_by_solr methods and you’re ready to go live using one of the most advanced open source search tools available today.
You can find the plugin here at GitHub - http://github.com/mauricio/acts_as_solr/tree
with_scope and named_scopes ignoring stacked :order clauses
If you’ve been using with_scope and named_scopes a lot with ActiveRecord you have probably noticed that the :order clauses defined in the scopes are lost and only the first :order clause is used. When you define an :order clause, you’d expect it to be merged with the ones already provided. Here’s a simple example:
class User < ActiveRecord::Base
  named_scope :by_first_name, :order => "#{quoted_table_name}.first_name ASC"
  named_scope :by_last_name, :order => "#{quoted_table_name}.last_name ASC"
end
Our user has two named scopes defined and both of them define an :order clause. If we try to run a finder like this:
User.by_first_name.by_last_name.all
This is the generated query:
SELECT * FROM `users` ORDER BY `users`.first_name ASC
As you’ve noticed, only the first :order clause was used and the last one was lost. Our ideal SQL query would look like this, with both :order clauses being used:
SELECT * FROM `users` ORDER BY `users`.last_name ASC , `users`.first_name ASC
That’s why we’re going to hack the with_scope method a little bit to reach our goal. This issue was already reported to the Rails issue tracker but there’s no fix yet, so our only hope is to monkeypatch Rails to behave as we expect it to. Here’s a really simple fix for the problem:
ActiveRecord::Base.class_eval do
  class << self
    def merge_orders( *orders )
      orders.join( ' , ' )
    end

    def with_scope_with_hack(method_scoping = {}, action = :merge, &block)
      method_scoping = method_scoping.method_scoping if method_scoping.respond_to?(:method_scoping)

      # Dup first and second level of hash (method and params).
      method_scoping = method_scoping.inject({}) do |hash, (method, params)|
        hash[method] = (params == true) ? params : params.dup
        hash
      end

      method_scoping.assert_valid_keys([ :find, :create ])

      if f = method_scoping[:find]
        f.assert_valid_keys(VALID_FIND_OPTIONS)
        set_readonly_option! f
      end

      # Merge scopings
      if [:merge, :reverse_merge].include?(action) && current_scoped_methods
        method_scoping = current_scoped_methods.inject(method_scoping) do |hash, (method, params)|
          case hash[method]
          when Hash
            if method == :find
              (hash[method].keys + params.keys).uniq.each do |key|
                merge = hash[method][key] && params[key] # merge if both scopes have the same key
                if key == :conditions && merge
                  if params[key].is_a?(Hash) && hash[method][key].is_a?(Hash)
                    hash[method][key] = merge_conditions(hash[method][key].deep_merge(params[key]))
                  else
                    hash[method][key] = merge_conditions(params[key], hash[method][key])
                  end
                elsif key == :include && merge
                  hash[method][key] = merge_includes(hash[method][key], params[key]).uniq
                elsif key == :joins && merge
                  hash[method][key] = merge_joins(params[key], hash[method][key])
                elsif key == :order && merge
                  hash[method][key] = merge_orders(params[key], hash[method][key])
                else
                  hash[method][key] = hash[method][key] || params[key]
                end
              end
            else
              if action == :reverse_merge
                hash[method] = hash[method].merge(params)
              else
                hash[method] = params.merge(hash[method])
              end
            end
          else
            hash[method] = params
          end
          hash
        end
      end

      self.scoped_methods << method_scoping
      begin
        yield
      ensure
        self.scoped_methods.pop
      end
    end

    alias_method_chain :with_scope, :hack
  end
end
You can place this code in an initializer (maybe called with_scope_fix.rb) or in your lib folder and require it from your initializers. Now all the :order clauses defined by named_scope or with_scope calls will be correctly merged instead of being lost.
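The only genuinely new piece in the patch is merge_orders, which simply joins the clauses with a comma. Here’s that logic pulled out into a plain method so you can try it without Rails. The two order strings are the ones from the User example, hard-coded for illustration:

```ruby
# Same joining logic as merge_orders in the patch, extracted as a plain method.
def merge_orders(*orders)
  orders.join(' , ')
end

# Order clauses as the two named scopes in the User example would produce them.
by_last_name  = '`users`.last_name ASC'
by_first_name = '`users`.first_name ASC'

puts "SELECT * FROM `users` ORDER BY #{merge_orders(by_last_name, by_first_name)}"
```

Because the clauses are concatenated as plain strings, the resulting sort priority follows the order in which the scopes are merged.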