Apache Solr

Overriding the Facet API breadcrumbs

Facet API for Drupal is an amazing module that enables you to easily create and manage faceted search interfaces. The UI is fantastic and easy to use, and it works perfectly together with either Search API or Apache Solr. It also changes the breadcrumb as soon as you start filtering further, which on many (probably most) occasions is a nice default behavior. But not every project wants this kind of breadcrumb. On one project we first tried to override the breadcrumb with hook_breadcrumb_alter(), but as we have a lot of search pages, the code was getting ridiculously ugly.

So we looked at where exactly Facet API is changing the breadcrumbs. This happens in a method called setBreadcrumb() in a URL processor class. So we need to create our own processor and override the default behavior. First of all, we need to let Facet API know we have our own URL processor:

<?php
/**
 * Implements hook_facetapi_url_processors().
 */
function yourmodule_facetapi_url_processors() {
  return array(
    'standard' => array(
      'handler' => array(
        'label' => t('Your module URL processor'),
        'class' => 'FacetApiYourClass',
      ),
    ),
  );
}
?>

We are doing something wrong here: the 'standard' key is also used in facetapi_facetapi_url_processors(), so we should use another name, because we are now overriding the default class. The trick is to extend that class so the other methods will still do the heavy lifting. Oh module_invoke_all, we sometimes love your merge behavior (although technically here it's a CTools plugin, but the result is the same for both).

<?php
class FacetApiYourClass extends FacetapiUrlProcessorStandard {
  public function setBreadcrumb() {
    // Intentionally empty: keep the default breadcrumb instead of letting
    // Facet API rewrite it.
  }
}
?>
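The override pattern itself can be tried outside Drupal. The following is a minimal sketch in plain PHP; ExampleUrlProcessor and ExampleQuietUrlProcessor are hypothetical stand-ins and not Facet API classes, and their method signatures are assumptions made for the illustration. It only shows that an empty override silences one method while everything else stays inherited.

```php
<?php
// Hypothetical stand-ins for illustration only; these are not Facet API classes.
class ExampleUrlProcessor {
  public $breadcrumb = array('Home', 'Search');

  // The parent appends one crumb per active filter.
  public function setBreadcrumb(array $active_filters) {
    foreach ($active_filters as $filter) {
      $this->breadcrumb[] = $filter;
    }
  }

  // Stands in for the "heavy lifting" methods that remain inherited.
  public function getQueryString(array $active_filters) {
    return http_build_query(array('f' => $active_filters));
  }
}

class ExampleQuietUrlProcessor extends ExampleUrlProcessor {
  // Overridden with an empty body, so the breadcrumb is never touched.
  public function setBreadcrumb(array $active_filters) {
  }
}

$default = new ExampleUrlProcessor();
$default->setBreadcrumb(array('author:swentel'));

$quiet = new ExampleQuietUrlProcessor();
$quiet->setBreadcrumb(array('author:swentel'));
```

After running this, $default->breadcrumb has gained a crumb while $quiet->breadcrumb is unchanged, and both objects still build the same query string.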

If this class is not defined in your .module file, make sure you add the file that contains it to the files[] entries in your .info file so the registry picks it up.

That's it. Again, this post should probably also be tagged with 'You're doing it wrong', but it works perfectly for our current use case.

Launch of the new Boek.be site


Today we relaunched the new Boek.be website. With a tight deadline to bring this project online in only 3 weeks, because tomorrow is the start of 'Boekenbeurs', Belgium's famous annual book fair, we're more than happy with what we've achieved in that short period. Users can register and create their own collection of favorite books, add comments, rate them and share them on Facebook and/or Twitter. There's more to come in the next few weeks, but it's a nice start.

I briefly want to highlight 2 main Drupal modules we use extensively in the site. The first one is Apache Solr. With over 250k nodes (and growing), this was an obvious decision. We all know Drupal core search isn't the best when it comes to performance and decent results. We used all hooks available to manipulate searches and, as good Drupal citizens, contributed some patches back to the community. There is still one left which we'd like to see go in, so please review!

I'm biased for the second one, since I'm one of the maintainers, but the site (and also Stubru.be) uses modules grouped in the Display suite concept. Our themers love it, no more fiddling with node, views, user, whatever templates, just CSS. Our account managers love it, they use the interface to manipulate each context if the client asks for a little change without asking us themers or developers. And we developers love it, we have hooks :)

We're working on a document which we'll post somewhere next week on d.o with a more detailed description of how the site was built, so stay tuned!

Random results with Apache Solr and Drupal

The schema.xml that comes with the Drupal Apache Solr module doesn't define the random_* field, unlike the default XML included in the apachesolr package. We needed that functionality for a project where we wanted to display 3 blocks showing random results based on a couple of fields available in the node, in our case the author, title and a CCK field. With 300k nodes, random results gave a nicer experience than seeing the same results coming back over and over. Adding random order is pretty easy in a few simple steps: http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField...

Implementing the tags from that manual did not have a lot of success. However, after some fiddling, the following changes in the XML seem to do the trick. Feel free to add comments and suggestions.

<!-- goes in types -->
<fieldType name="rand" class="solr.RandomSortField" indexed="true" />

<!-- goes in fields -->
<dynamicField name="random*" type="rand" indexed="true" stored="true"/>

After indexing your nodes, try running the following query on your Solr admin page:

http://localhost:port/solr/select?q=whatever&morekeyshere&sort=random_127789 desc

Our blocks are defined via hook_block(), which uses the apachesolr_search_execute() function to send our query to the Solr engine. With hook_apachesolr_modify_query() you can add a sort parameter and you'll get your random results.

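<?php
/**
 * Implements hook_apachesolr_modify_query().
 */
function yourmodule_apachesolr_modify_query(&$query, &$params, $caller) {
  if ($caller == 'whatever') {
    // A different seed in the field name gives a different random order.
    $seed = rand(1, 200);
    $params['qt'] = 'standard';
    $params['sort'] = 'random_' . $seed . ' asc';
  }
}
?>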
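Building the sort parameter can be isolated in a small helper. This is only a sketch: example_random_sort() and its $max_seed argument are hypothetical names, not part of the apachesolr module, and it assumes the schema defines the "random*" dynamic field shown earlier.

```php
<?php
/**
 * Hypothetical helper: builds a sort parameter for a random_* dynamic field.
 *
 * Assumes the schema defines the "random*" dynamic field shown above.
 */
function example_random_sort($max_seed = 200) {
  // Each distinct field name acts as a distinct seed for solr.RandomSortField.
  $seed = mt_rand(1, $max_seed);
  return 'random_' . $seed . ' asc';
}

$sort = example_random_sort();
```

Because the field name is the seed, reusing the same name should return the same order as long as the index has not changed. That matters if the results are paged, in which case you would probably want to keep one seed for the whole result set instead of drawing a new one per request.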

Apache Solr Spielerei

If you haven't heard of Apache Solr and the integration with Drupal, then you're probably still struggling with the default search shipped with Drupal core. Pity you. Now, this won't be an introduction to the excellent search engine. No, this is a tale about the combination of Apache Solr, node caching and Node displays. Take a look at the following snippet:

<?php
/**
 * Creme de la creme:
 * Put the full node object in the index, so no node_loads are needed for results.
 */
function nd_search_apachesolr_update_index(&$document, $node) {
  $node->body = $node->content['body']['#value'];
  unset($node->content);
  $document->tm_node = serialize($node);
}
?>

This code lives in nd_search, a small contrib module which you can download from the Node Displays Contributions project. It indexes the complete node object into the Apache Solr engine, which you can use later on either in custom code or on the search results page. Drupal core gives you the freedom to define a custom search function to render the results instead of the default page, which is - IMHO - pretty hard to customize. ND search implements hook_search_page and, in combination with the power of ND, we have full control over how to render a node per content type, and this without getting any extra data from the database. The code underneath explains it all.

<?php
/**
 * Get the serialized version from the node, and unserialize it.
 *
 * @param $doc The apache solr document to be converted.
 *
 * @return Node version from the document.
 */
function _solr_document_to_node($doc) {
  $node_serialized = $doc['node']->getField('tm_node');
  $node = unserialize($node_serialized['value']);
  return $node;
}

/**
 * Implementation of hook_search_page().
 */
function apachesolr_search_search_page($results) {
  $output = '';

  foreach ($results as $key => $result) {
    $node = _solr_document_to_node($result);
    $node->build_mode = NODE_BUILD_SEARCH_RESULT;
    $output .= node_view($node);
  }

  $output .= theme('pager', NULL, 10, 0);

  return $output;
}
?>
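The round trip those two snippets rely on can be tried in plain PHP, without Drupal or Solr. This is only a sketch: a stdClass object stands in for the node, and the property values are made up for the illustration.

```php
<?php
// A plain stdClass stands in for a Drupal node; the property values are assumptions.
$node = new stdClass();
$node->nid = 42;
$node->title = 'Example title';
$node->content = array('body' => array('#value' => '<p>Rendered body</p>'));

// Index time: keep the rendered body, drop the bulky render array, serialize.
$node->body = $node->content['body']['#value'];
unset($node->content);
$stored = serialize($node);

// Search time: rebuild the object from the stored string, no database involved.
$restored = unserialize($stored);
```

Here $stored plays the role of the tm_node field in the index, and $restored is what gets handed to node_view() on the results page.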

Pretty cool, right? The module also indexes all CCK fields for you, which you can use in custom code if you want to fire custom queries on one of those fields. The following snippet comes from a block where we want to search on a CCK field called 'name'. The result we get back uses the same function to unserialize the node object, and after that we call node_view, which is altered through the ND module with a custom build mode. Score again!

<?php
  $filter = 'ss_cck_field_name:swentel';
  $search_results = apachesolr_search_execute($filter, '', '');
  $output = '';
  foreach ($search_results as $key => $result) {
    $nid = $result['node']->getField('nid');
    // Don't list the same node we're looking at right now.
    if ($nid['value'] == arg(1)) {
      continue;
    }
    $node = _solr_document_to_node($result);
    $node->build_mode = 'nd_blocks';
    $output .= node_view($node, FALSE, FALSE);
  }
  return $output;
?>

With this power, imagine a search results page with 2 or 3 blocks which don't fire any extra queries at the database for extra data. Our ultimate - and probably improbable - dream is to cache all data in Apache Solr so we don't need to access MySQL anymore. Of course, that's bollocks, but with the project we're currently building (more than 300K nodes to start with) we're pretty sure we'll be able to deliver a nice search experience for our end users.

Note: I'm pretty biased when it comes to the ND project since I'm one of the co-developers, but hey, we're so excited about it and we're planning a lot of new features pretty soon. More news on that later!
