All Posts in Sunspot Solr

May 30, 2014

Configuring Sunspot Solr Search Controller

Search is the compass of the internet. It guides us to the content that we are really looking for and helps us avoid the stuff we don’t care about. Or at least that’s how it is supposed to work. Beyond the complexity of installing and configuring a search server, it can also be difficult to account for the various use cases of your search tool. Let’s take a quick look at how The Mechanism engineers tackled this challenge when building a restaurant search application for SafeFARE.

The good folks at foodallergy.org enlisted our services to build a restaurant search application that allows users to find allergy-aware restaurants based on any combination of 9 criteria. Using the Ruby on Rails framework and Sunspot Solr (a Ruby DSL for the Lucene-based Apache Solr search server), we built this search app and learned a few things along the way.

If a user searches for restaurants in a ZIP code, should we only return restaurants within that ZIP code, or should we include restaurants from other nearby ZIP codes in our search results? And if we include other ZIP codes, how many? How should we order the results? These and other similar questions helped us come up with the structure of our search controller.
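One way to think about the "nearby ZIP codes" question is as a radius around a point rather than a list of ZIP codes. The following is only a minimal sketch in plain Ruby, assuming an in-memory list in place of the Solr index; the Place struct, the sample coordinates, and the helper names are hypothetical and are not part of the SafeFARE code. It keeps the places within a given radius (in kilometers) and orders them nearest first.

```ruby
# Hypothetical in-memory stand-in for the search index; names and
# coordinates are illustrative only.
Place = Struct.new(:name, :lat, :lng)

EARTH_RADIUS_KM = 6371.0

# Great-circle distance in kilometers between two points (haversine formula).
def distance_km(lat1, lng1, lat2, lng2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlng = (lng2 - lng1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlng / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Keep the places within radius_km of the center point, nearest first.
def nearby(places, center_lat, center_lng, radius_km)
  places
    .map { |place| [place, distance_km(center_lat, center_lng, place.lat, place.lng)] }
    .select { |_, dist| dist <= radius_km }
    .sort_by { |_, dist| dist }
    .map(&:first)
end

PLACES = [
  Place.new("Brooklyn Bistro", 40.6782, -73.9442),
  Place.new("Philly Cafe", 39.9526, -75.1652),
  Place.new("Midtown Diner", 40.7549, -73.9840)
]

puts nearby(PLACES, 40.7580, -73.9855, 15).map(&:name).inspect
# => ["Midtown Diner", "Brooklyn Bistro"]
```

Widening or shrinking the radius argument answers the "how many other ZIP codes" question with a single number instead of a lookup table.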

Figure 1.1

    if params[:search].present?
      @search = Restaurant.solr_search do
        # Full-text search on whatever the user entered as the restaurant name
        fulltext params[:restaurant_name]

        # Only return approved restaurants
        with(:approved, true)

        # If the user also entered cuisine preferences, match any of them
        if params[:cuisine_search].present?
          any_of do
            params[:cuisine_search].each do |tag|
              with(:cuisines_name, tag)
            end
          end
        end

        # If any location field is present, geocode that location and
        # restrict results to a radius around it. whereat (the location
        # string) and howfar (the radius) are set elsewhere in the controller.
        if params[:address].present? || params[:city_search].present? || params[:state_search].present? || params[:zip_search].present?
          with(:location).in_radius(*Geocoder.coordinates(whereat), howfar)
        end

        # Order results by distance from the user's IP-based location
        order_by_geodist(:location, request.location.latitude, request.location.longitude)
      end

      @restaurants = @search.results
    end

It took us about a week, but we were finally able to come up with enough if statements to cover every one of the 512 (2^9) ways the nine search criteria can be present or absent. Figure 1.1 is a small sample of how we implement search when a user enters a restaurant name, cuisine preference, and restaurant location. First we search the Solr index for whatever the user enters in the restaurant_name field, then cut that list down to only the approved restaurants. Next we check whether the user also entered a cuisine preference; if so, we narrow the list to restaurants that match that cuisine, and if not, we skip that step. Then we check whether the user entered a location to search, such as a city or state, and narrow the list to only the restaurants in that area.

Using this strategy we create a sort of Venn diagram that lets us drill down to only the information we want, and we assign that result to the @restaurants variable. To increase the functionality of the site, The Mechanism engineers also implemented an IP lookup that automatically detects the user’s IP address and location, and orders search results by how close each restaurant is to the user.
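The narrowing steps described above can be illustrated without a search server. This is a rough sketch in plain Ruby, not the controller from Figure 1.1: the Listing struct, the sample records, and the narrow method are hypothetical, and a simple substring match stands in for Solr's full-text search. It builds one predicate per criterion the user actually supplied, so an absent criterion is simply skipped.

```ruby
# Hypothetical in-memory records standing in for indexed restaurants.
Listing = Struct.new(:name, :approved, :cuisines, keyword_init: true)

LISTINGS = [
  Listing.new(name: "Green Leaf Cafe", approved: true,  cuisines: ["vegan", "american"]),
  Listing.new(name: "Leaf & Noodle",   approved: true,  cuisines: ["thai"]),
  Listing.new(name: "Golden Leaf",     approved: false, cuisines: ["thai"]),
  Listing.new(name: "Pasta Palace",    approved: true,  cuisines: ["italian"])
]

# Collect one predicate per supplied criterion, then keep only the
# records that satisfy every predicate.
def narrow(records, params)
  filters = [->(r) { r.approved }] # the approved check always applies

  # Substring match as a simplified stand-in for full-text search
  name = params[:restaurant_name].to_s.strip.downcase
  filters << ->(r) { r.name.downcase.include?(name) } unless name.empty?

  # any_of semantics: a record matches if it has at least one requested cuisine
  cuisines = Array(params[:cuisine_search])
  filters << ->(r) { (r.cuisines & cuisines).any? } unless cuisines.empty?

  records.select { |r| filters.all? { |f| f.call(r) } }
end

puts narrow(LISTINGS, restaurant_name: "leaf").map(&:name).inspect
# => ["Green Leaf Cafe", "Leaf & Noodle"]
puts narrow(LISTINGS, restaurant_name: "leaf", cuisine_search: ["thai"]).map(&:name).inspect
# => ["Leaf & Noodle"]
```

Each additional criterion adds one more circle to the Venn diagram; the result is always the intersection of the circles that are present.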

A second major challenge that many developers face when using a search server is deployment. To use Solr in a production environment, you will need a Java servlet container such as Tomcat or Jetty, and an instance of Apache Solr. Developers may consider installing standalone versions of Tomcat and Solr, depending on their hardware capabilities, but Sunspot comes bundled with a Jetty server, which can be started in production by running the command RAILS_ENV=production rake sunspot:solr:start

And voilà! We have implemented an advanced search tool that will help users find allergy-aware restaurants all across the nation and may even save somebody’s life one day.

Published by: Sharon Terry in The Programming Mechanism

July 12, 2012

Adding CCK fields to Apachesolr documents in Drupal 7

Drupal's core search functionality is awesome in that it can be replaced by other search modules and even other search engines. We're using Apache Solr for its lightning-fast response time and powerful indexing and faceting. In a nutshell, Solr is a standalone index/search engine to which Drupal sends its search queries. The real beauty is that the indexed results are all cached and served back to the Drupal site as XML documents ridiculously fast, much faster than Drupal's core search, which has to hit the database and fully load each node/entity for every matching result.

By default Apache Solr will index most of your content type's fields, but many CCK fields that you add will not be included in the default indexing. That is to say, if you add a checkbox or text field to your content type, you will have to explicitly direct Solr to add it to the index. Each piece of content that is indexed by Solr is processed and stored as a Solr document, which holds all of the indexed fields as well as some Solr metadata. That is how Solr can return results so quickly: it only sends the fields you require instead of loading the entire object.

Here is some sample code showing how to add some custom fields to the Solr document. These two hooks are all you need to get started. Just add them to a custom module, and be sure to re-index afterwards to update the Solr index.


    /**
     * Implements hook_apachesolr_index_document_build().
     *
     * Adds custom fields to the Solr document.
     */
    function themech_solr_apachesolr_index_document_build(ApacheSolrDocument $document, $entity, $entity_type, $env_id) {
      if ($entity->type == 'publication') {
        // Author names: a multi-valued string field, hence the sm_ prefix.
        if (isset($entity->field_publication_author[$entity->language])) {
          foreach ($entity->field_publication_author[$entity->language] as $item) {
            // Skip references whose entity has not been loaded.
            if (isset($item['entity'])) {
              $document->setMultiValue('sm_field_publication_author', $item['entity']->name);
            }
          }
        }
        // File attachment URIs: also a multi-valued string field.
        if (isset($entity->field_publication_attachment[$entity->language])) {
          foreach ($entity->field_publication_attachment[$entity->language] as $item) {
            $document->setMultiValue('sm_field_publication_attachment', $item['uri']);
          }
        }
        // Checkbox state: a single-valued integer field, hence the is_ prefix.
        if (isset($entity->field_publication_recommended[$entity->language])) {
          $document->setMultiValue('is_field_publication_recommended', $entity->field_publication_recommended[$entity->language][0]['value']);
        }
      }
    }

    /**
     * Implements hook_apachesolr_query_alter().
     *
     * Adds the newly indexed fields from above to the query result.
     */
    function themech_solr_apachesolr_query_alter($query) {
      // The field names must match the names used when indexing.
      $query->addParams(array('fl' => array('sm_field_publication_author')));
      $query->addParams(array('fl' => array('sm_field_publication_attachment')));
      $query->addParams(array('fl' => array('is_field_publication_recommended')));
    }

In this example, we're adding any names that were selected in a select list of authors, a checkbox state, and a file attachment URI. The first function checks whether the Publication node has data in certain fields, then adds them to the $document object via setMultiValue(). The fields are now stored in Solr, but because they were custom additions to the document, you have to name them in the query's fl (field list) parameter to tell Solr to return them with the rest of the document.
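To make the role of the field list concrete, here is a small language-neutral sketch, written in Ruby rather than PHP so it can run on its own. It is not Drupal or Solr code: the STORED_DOC hash, its values, the DEFAULT_FL list, and the project helper are all hypothetical. It only mimics the idea that a stored document can hold many fields while a query returns just the ones named in fl.

```ruby
# Hypothetical stored document for one publication node. The field
# prefixes follow the naming used in the hooks above:
# sm_ = multi-valued string, is_ = single-valued integer.
STORED_DOC = {
  "entity_id" => 42,
  "label" => "Annual Report",
  "sm_field_publication_author" => ["A. Author", "B. Writer"],
  "sm_field_publication_attachment" => ["public://reports/annual.pdf"],
  "is_field_publication_recommended" => 1
}

# Assumed default field list; the real default is set by the search module.
DEFAULT_FL = %w[entity_id label]

# Return only the fields named in the field list, mimicking what the
# fl parameter asks the search server to do.
def project(doc, field_list)
  doc.select { |key, _value| field_list.include?(key) }
end

default_result  = project(STORED_DOC, DEFAULT_FL)
extended_result = project(STORED_DOC, DEFAULT_FL + ["sm_field_publication_author"])

puts default_result.keys.inspect
# => ["entity_id", "label"]
puts extended_result["sm_field_publication_author"].inspect
# => ["A. Author", "B. Writer"]
```

The custom fields are present in the stored document either way; they only show up in a result once the query asks for them by name.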

You can index anything you can put into a content type, and each content type can have specific fields indexed. With Solr you can create thumbnail gallery search results, or integrate with your commerce site to generate product category and price range searches, as well as tune the results based on custom weights and ratings. The possibilities are almost limitless. Maybe as many as a googol (1x10^100), or in Drupal terms... a Droogol. :+)

Links:
Apache Solr
Apachesolr Search Integration

Published by: chazcheadle in The Programming Mechanism