Additional Ruby Solr Sunspot setup tweaks
Building search for your Rails app can get complicated when you need to order results by relevancy. Solr reduces this complexity by moving the heavy lifting off to its own service. It’s fairly quick and the Rails wrapper keeps things very tidy; however, with a handful of small tweaks you can make it a touch more reliable and a little quicker.
For this article, I’m assuming you’ve already set up sunspot_rails on your Rails app and you just want to know a few tweaks I found worthwhile in my Heroku environment.
## Reindex with a worker
I like to use WebSolr (it’s a Heroku add-on). It’s pretty awesome, but from time to time it does throw 5xx errors (especially when I’ve been hitting the service a little hard). To stop the end user seeing these 503 errors, set up the `solr_index` method to run asynchronously with Delayed Job’s `handle_asynchronously`, like so:

    class Post < ActiveRecord::Base
      searchable do
        string :title
        text :body
        time :published_at
      end

      handle_asynchronously :solr_index
    end

This means that if the Solr service returns a 5xx error, the job will be retried a few seconds later and the end user shouldn’t notice the issue. I don’t do the same for `solr_index!`, because when I had both methods running asynchronously I noticed my memory usage growing very rapidly.
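To make the retry behaviour concrete, here is a minimal plain-Ruby sketch of what a background worker does with a failing index call. It is not Delayed Job’s actual implementation: `FlakySolr`, `SolrUnavailable` and `run_with_retries` are hypothetical names made up for this illustration, and the wait between attempts is left out.

```ruby
# Hypothetical error raised when the (fake) Solr service answers with a 5xx.
class SolrUnavailable < StandardError; end

# Hypothetical stand-in for a Solr service that fails a fixed number of
# times before accepting a document. Not part of Sunspot or WebSolr.
class FlakySolr
  attr_reader :indexed

  def initialize(failures_before_success)
    @failures_left = failures_before_success
    @indexed = []
  end

  def index(doc)
    if @failures_left > 0
      @failures_left -= 1
      raise SolrUnavailable, "503 Service Unavailable"
    end
    @indexed << doc
  end
end

# Runs the block until it succeeds or max_attempts is reached.
# Returns the number of attempts made; re-raises the error on giving up.
def run_with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue SolrUnavailable
    retry if attempts < max_attempts
    raise
  end
  attempts
end

solr = FlakySolr.new(2)
attempts = run_with_retries(5) { solr.index("post-1") }
puts "indexed #{solr.indexed.inspect} after #{attempts} attempts"
```

Because the retries happen inside the worker, the web request that saved the post has already returned by the time the first 503 comes back.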
## Only reindex when a limited number of attributes change
Odds are you’ll only be searching against a few attributes within your model. To tell Sunspot to reindex only when one of these attributes has changed, pass the `only_reindex_attribute_changes_of` option an array of symbols.

    class Post < ActiveRecord::Base
      searchable only_reindex_attribute_changes_of: [:title, :body, :published_at] do
        string :title
        text :body
        time :published_at
      end
    end

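The decision this option makes can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not Sunspot’s internal code; `WATCHED` and `reindex_needed?` are hypothetical names, and the list of changed attributes is assumed to come from something like ActiveRecord’s dirty tracking.

```ruby
require "set"

# Attributes the search index cares about (mirrors the list passed to
# only_reindex_attribute_changes_of above).
WATCHED = Set[:title, :body, :published_at].freeze

# Hypothetical helper: true when at least one changed attribute is indexed.
# Accepts attribute names as strings or symbols.
def reindex_needed?(changed_attributes, watched = WATCHED)
  changed_attributes.any? { |attr| watched.include?(attr.to_sym) }
end

puts reindex_needed?([:view_count])          # a counter bump: no reindex
puts reindex_needed?(["title", :view_count]) # the title changed: reindex
```

A save that only touches unindexed columns then costs no round trip to Solr at all.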
## Only index what you need
Use the `if` option with a proc to index only the models that should appear in your results.

    class Post < ActiveRecord::Base
      searchable if: proc { |post| post.published? } do
        string :title
        text :body
        time :published_at
      end
    end

In this case, if the post isn’t published it will be omitted from the search index completely.
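As a rough picture of what the proc does, the sketch below runs the same kind of condition over a couple of in-memory records. `FakePost` and `indexable_posts` are hypothetical stand-ins for this example only; Sunspot evaluates the condition per record itself.

```ruby
# Hypothetical stand-in for the Post model; only what the sketch needs.
FakePost = Struct.new(:title, :published) do
  def published?
    published
  end
end

# Returns the records for which the condition proc is truthy,
# i.e. the ones that would be sent to the index.
def indexable_posts(posts, condition)
  posts.select(&condition)
end

# Same shape as the proc passed to `if:` above.
published_only = proc { |post| post.published? }

posts = [FakePost.new("Live", true), FakePost.new("Draft", false)]
puts indexable_posts(posts, published_only).map(&:title).inspect
```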
## Improving indexing speed with includes
If you’re calling an association within your searchable block, use the `include` option. It eager loads the associated models, so indexing doesn’t fire a separate query per record.

    class Post < ActiveRecord::Base
      searchable include: [:categories, :author] do
        string :title
        string :author_name do
          author.name
        end
        string :categories_list do
          categories.map(&:name).join(", ")
        end
        text :body
        time :published_at
      end
    end

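Why eager loading helps can be shown with a toy query counter. This is a plain-Ruby sketch under made-up assumptions, not ActiveRecord: `FakeDb`, `author_names_lazy` and `author_names_eager` are hypothetical, and each method call on the fake database stands in for one SQL query.

```ruby
# Hypothetical in-memory "database" that counts how many lookups are made.
class FakeDb
  attr_reader :queries

  def initialize(authors)
    @authors = authors # { id => name }
    @queries = 0
  end

  # One query per call, like loading an association lazily.
  def find_author(id)
    @queries += 1
    @authors[id]
  end

  # One query for the whole batch, like an eager load.
  def find_authors(ids)
    @queries += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

# Looks up each post's author individually (the N+1 pattern).
def author_names_lazy(db, author_ids)
  author_ids.map { |id| db.find_author(id) }
end

# Loads all needed authors once, then reads from the in-memory result.
def author_names_eager(db, author_ids)
  authors = db.find_authors(author_ids.uniq)
  author_ids.map { |id| authors[id] }
end

authors = { 1 => "Ann", 2 => "Bob" }
post_author_ids = [1, 2, 1, 2] # four posts, two distinct authors

lazy_db = FakeDb.new(authors)
author_names_lazy(lazy_db, post_author_ids)
puts "lazy:  #{lazy_db.queries} queries"

eager_db = FakeDb.new(authors)
author_names_eager(eager_db, post_author_ids)
puts "eager: #{eager_db.queries} queries"
```

The lazy version grows with the number of posts being indexed, while the eager version stays at one lookup per association, which is where the speed-up during a full reindex comes from.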