Chapter 1. Getting started

Welcome to Hibernate Search! This chapter guides you through the initial steps required to integrate Hibernate Search into an existing Hibernate-enabled application. If you are new to Hibernate, we recommend that you first work through an introductory Hibernate tutorial.

1.1. System Requirements

Table 1.1. System requirements

Java Runtime: A JDK or JRE version 5 or greater. Java Runtimes are available for download for Windows, Linux and Solaris.
Hibernate Search: hibernate-search.jar and all the dependencies from the lib directory of the Hibernate Search distribution, in particular Lucene.
Hibernate Core: These instructions have been tested against Hibernate 3.2.x. In addition to the main hibernate3.jar you will need all required libraries from the lib directory of the distribution. Refer to README.txt in the lib directory of the distribution to determine the minimum runtime requirements.
Hibernate Annotations: Even though Hibernate Search can be used without Hibernate Annotations, the following instructions use them for ease of use. The tutorial has been tested against version 3.3.x of Hibernate Annotations.

You can download all dependencies from the Hibernate download site. You can also verify the dependency versions against the Hibernate Compatibility Matrix.

1.2. Maven

Instead of managing all dependencies yourself, Maven users can use the JBoss Maven repository. Add the JBoss repository URL to the repositories section of your pom.xml or settings.xml:

<repository>
  <id>repository.jboss.org</id>
  <name>JBoss Maven Repository</name>
  <url>http://repository.jboss.org/maven2</url>
  <layout>default</layout>
</repository>
      

Then add the following dependencies to your pom.xml:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search</artifactId>
   <version>3.0.0.ga</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-annotations</artifactId>
   <version>3.3.0.ga</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-entitymanager</artifactId>
   <version>3.3.1.ga</version>
</dependency>
      

Not all three dependencies are required. hibernate-search alone contains everything needed to use Hibernate Search. hibernate-annotations is only needed if you use annotations other than those of Hibernate Search, as we do in the examples of this tutorial. Finally, hibernate-entitymanager is only required if you use Hibernate Search in conjunction with JPA.

1.3. Configuration

Once you have downloaded and added all required dependencies to your application, you have to add a few properties to your Hibernate configuration file. If you are using Hibernate directly, this can be done in hibernate.properties or hibernate.cfg.xml. If you are using Hibernate via JPA, you can also add the properties to persistence.xml. The good news is that for standard use most properties offer a sensible default.

Apache Lucene has the notion of a Directory to store the index files. Hibernate Search handles the initialization and configuration of a Lucene Directory instance via a DirectoryProvider. In this tutorial we will use an implementation of DirectoryProvider called FSDirectoryProvider, which stores the index on the file system. This gives us the ability to physically inspect the Lucene indexes created by Hibernate Search (e.g. via Luke). Once you have a working configuration you can start experimenting with other directory providers (see Section 3.1, “Directory configuration”).

Let's assume that your application contains the Hibernate managed class example.Book and that you now want to add free text search capabilities in order to search the body and summary of the books contained in your database.

package example;
...
@Entity
public class Book {

  @Id
  private Integer id; 
  private String body;  
  private String summary; 
  @ManyToMany private Set<Author> authors = new HashSet<Author>();
  @ManyToOne private Author mainAuthor;
  private Date publicationDate;
  
  public Book() {
  } 
  
  // standard getters/setters follow here
  ...
}
    

First you have to tell Hibernate Search which DirectoryProvider to use. This can be achieved by setting the hibernate.search.default.directory_provider property. You also have to specify the default root directory for all indexes via hibernate.search.default.indexBase.

...
# the default directory provider
hibernate.search.default.directory_provider = org.hibernate.search.store.FSDirectoryProvider

# the default base directory for the indexes
hibernate.search.default.indexBase = /var/lucene/indexes    
...
    

Next you have to add three annotations to the Book class. The first annotation, @Indexed, marks Book as indexable. By design, Hibernate Search needs to store an untokenized id in the index to ensure index uniqueness for a given entity. @DocumentId marks the property to use for this purpose; in most if not all cases this property is the database primary key. Finally, you have to index the fields you want to make searchable. In our example these fields are body and summary, and both properties are annotated with @Field. The parameter index=Index.TOKENIZED ensures that the text is tokenized using the default Lucene analyzer, whereas store=Store.NO ensures that the actual data is not stored in the index. Tokenizing usually means splitting a sentence into individual words (and potentially excluding common words such as "a" or "the").

These settings are sufficient for an initial test. For more details on entity mapping refer to Section 4.1, “Mapping an entity”. If you want to store and retrieve the indexed data in order to avoid database roundtrips, refer to projections in Section 5.1.2.5, “Projection”.

package example;
...
@Entity
@Indexed
public class Book {

  @Id
  @DocumentId
  private Integer id;
  
  @Field(index=Index.TOKENIZED, store=Store.NO)
  private String body;
  
  @Field(index=Index.TOKENIZED, store=Store.NO)
  private String summary; 
  @ManyToMany private Set<Author> authors = new HashSet<Author>();
  @ManyToOne private Author mainAuthor;
  private Date publicationDate;
  
  public Book() {
  } 
  
  // standard getters/setters follow here
  ...
}
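To make the notion of tokenizing more concrete, the following is a minimal, JDK-only sketch of what "chunking a sentence into individual words and excluding common words" can look like. It is an illustration under stated assumptions, not the behavior of the default Lucene analyzer: the class SimpleTokenizer, its stop word list and its splitting rule are all hypothetical and chosen for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; real analysis is performed by a Lucene Analyzer.
class SimpleTokenizer {

    // Assumed stop word list, far shorter than what a real analyzer would use.
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the", "and", "of"));

    // Lower-cases the text, splits it on anything that is not a letter or digit,
    // and drops empty chunks and stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        String[] chunks = text.toLowerCase(Locale.ENGLISH).split("[^\\p{L}\\p{Nd}]+");
        for (String chunk : chunks) {
            if (chunk.length() > 0 && !STOP_WORDS.contains(chunk)) {
                tokens.add(chunk);
            }
        }
        return tokens;
    }
}
```

With this sketch, SimpleTokenizer.tokenize("The art of Java programming") yields the tokens art, java and programming. Each such token becomes a term that a query can match, which is why a tokenized field supports searching for individual words.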
  

1.4. Indexing

Hibernate Search will index every entity persisted, updated or removed through Hibernate Core, transparently for the application. However, the data already present in your database needs to be indexed once to populate the Lucene index. Once you have added the above properties and annotations, it is time to trigger an initial batch index of your books. You can achieve this by adding one of the following code examples to your code (see also Chapter 6, Manual indexing):

Example using Hibernate Session:

FullTextSession fullTextSession = Search.createFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
// load all existing books (list() returns a raw List, hence an unchecked assignment)
List<Book> books = session.createQuery("from Book as book").list();
for (Book book : books) {
    fullTextSession.index(book); // register the entity for indexing
}
tx.commit(); // the index is written at commit time
    

Example using JPA:

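EntityManager em = entityManagerFactory.createEntityManager();
// Search is org.hibernate.search.jpa.Search in the JPA case
FullTextEntityManager fullTextEntityManager = Search.createFullTextEntityManager(em);
// load all existing books (getResultList() returns a raw List, hence an unchecked assignment)
List<Book> books = em.createQuery("select book from Book as book").getResultList();
for (Book book : books) {
    fullTextEntityManager.index(book); // register the entity for indexing
}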
    

After executing the above code, you should be able to see a Lucene index under /var/lucene/indexes/example.Book. Go ahead and inspect this index. It will help you to understand how Hibernate Search works.
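The location quoted above combines the configured hibernate.search.default.indexBase with the fully qualified name of the indexed entity. The following JDK-only sketch reproduces that combination so you can check where to look for the index files. It assumes the default index naming used in this tutorial; the class IndexLocation and its methods are hypothetical helpers and not part of Hibernate Search.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical helper, not part of Hibernate Search.
class IndexLocation {

    static final String INDEX_BASE_KEY = "hibernate.search.default.indexBase";

    // The two properties used in this tutorial, expressed as java.util.Properties.
    static Properties tutorialConfig() {
        Properties config = new Properties();
        config.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSDirectoryProvider");
        config.setProperty(INDEX_BASE_KEY, "/var/lucene/indexes");
        return config;
    }

    // Assumption: the index of an entity lives in a sub-directory of indexBase
    // named after the fully qualified entity class name.
    static File indexDirectory(Properties config, String entityClassName) {
        String indexBase = config.getProperty(INDEX_BASE_KEY);
        if (indexBase == null) {
            throw new IllegalArgumentException(INDEX_BASE_KEY + " is not set");
        }
        return new File(indexBase, entityClassName);
    }
}
```

For example, IndexLocation.indexDirectory(IndexLocation.tutorialConfig(), "example.Book") points at the example.Book directory below /var/lucene/indexes. Calling exists() on the returned File is a quick way to confirm that the initial indexing run actually wrote something.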

1.5. Searching

Now it is time to execute a first search. The following code will prepare a query against the fields summary and body, execute it and return a list of Books:

Example using Hibernate Session:

FullTextSession fullTextSession = Search.createFullTextSession(session);

Transaction tx = fullTextSession.beginTransaction();

MultiFieldQueryParser parser = new MultiFieldQueryParser( new String[]{"summary", "body"}, 
  new StandardAnalyzer());
Query query = parser.parse( "Java rocks!" );
org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery( query, Book.class );
List result = hibQuery.list();
  
tx.commit();
session.close();  
    

Example using JPA:

EntityManager em = entityManagerFactory.createEntityManager();

FullTextEntityManager fullTextEntityManager = 
    org.hibernate.search.jpa.Search.createFullTextEntityManager(em);
MultiFieldQueryParser parser = new MultiFieldQueryParser( new String[]{"summary", "body"}, 
  new StandardAnalyzer());
Query query = parser.parse( "Java rocks!" );
// in the JPA case the full text query is a javax.persistence.Query
javax.persistence.Query persistenceQuery = fullTextEntityManager.createFullTextQuery( query, Book.class );
List result = persistenceQuery.getResultList();
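It can help to picture what searching "against the fields summary and body" means. According to the Lucene documentation, a multi-field parser effectively turns each term of the user query into a group that matches the term in any of the given fields. The following JDK-only sketch builds a textual form of that expansion. It is a simplified illustration: the class MultiFieldExpansion is hypothetical, and it ignores analysis, operators and escaping that the real parser handles.

```java
// Hypothetical illustration of how a multi-field query is expanded;
// it does not use or replace Lucene's MultiFieldQueryParser.
class MultiFieldExpansion {

    // Builds one "(field1:term field2:term)" group per whitespace-separated term.
    static String expand(String[] fields, String userQuery) {
        StringBuilder out = new StringBuilder();
        for (String term : userQuery.trim().split("\\s+")) {
            if (term.length() == 0) {
                continue; // empty input produces no groups
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append('(');
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    out.append(' ');
                }
                out.append(fields[i]).append(':').append(term);
            }
            out.append(')');
        }
        return out.toString();
    }
}
```

Calling MultiFieldExpansion.expand(new String[]{"summary", "body"}, "java rocks") produces (summary:java body:java) (summary:rocks body:rocks), so a book matches a term when it occurs in either its summary or its body.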
    

1.6. Analyzer

Assume that one of your indexed book entities contains the text "Java rocks" and you want to get hits for all of the following queries: "rock", "rocks", "rocked" and "rocking". In Lucene this can be achieved by choosing an analyzer class which applies word stemming during the indexing process. Hibernate Search offers several ways to configure the analyzer to use (see Section 4.1.5, “Analyzer”):

  • Setting the hibernate.search.analyzer property in the configuration file. The specified class will then be the default analyzer.

  • Setting the @Analyzer annotation at the entity level.

  • Setting the @Analyzer annotation at the field level.

The following example uses the entity level annotation to apply an English language analyzer, which helps you to achieve this goal. The class EnglishAnalyzer is a custom class using the Snowball English Stemmer from the Lucene Sandbox.

package example;
...
@Entity
@Indexed
@Analyzer(impl = example.EnglishAnalyzer.class)
public class Book {

  @Id
  @DocumentId
  private Integer id;
  
  @Field(index=Index.TOKENIZED, store=Store.NO)
  private String body;
  
  @Field(index=Index.TOKENIZED, store=Store.NO)
  private String summary; 
  @ManyToMany private Set<Author> authors = new HashSet<Author>();
  @ManyToOne private Author mainAuthor;
  private Date publicationDate;
  
  public Book() {
  } 
  
  // standard getters/setters follow here
... 
}

public class EnglishAnalyzer extends Analyzer {
    /**
     * {@inheritDoc}
     */
    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new SnowballFilter(result, "English"); // name of the Snowball stemmer to apply
        return result;
    }
}
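To see why stemming makes the queries "rock", "rocks", "rocked" and "rocking" all hit the same indexed text, consider the following deliberately naive, JDK-only sketch. It is not the Snowball algorithm and should not be used for real indexing: the class NaiveStemmer and its three suffix rules are hypothetical and only cover the words of this example.

```java
import java.util.Locale;

// Hypothetical, deliberately simplistic stemmer for illustration only.
// The Snowball English stemmer applies far more elaborate rules.
class NaiveStemmer {

    // Suffixes are tried in this order; the first match is stripped.
    private static final String[] SUFFIXES = {"ing", "ed", "s"};

    // Strips one known suffix as long as at least three characters remain.
    static String stem(String word) {
        String lower = word.toLowerCase(Locale.ENGLISH);
        for (String suffix : SUFFIXES) {
            int stemLength = lower.length() - suffix.length();
            if (lower.endsWith(suffix) && stemLength >= 3) {
                return lower.substring(0, stemLength);
            }
        }
        return lower;
    }
}
```

With these rules, "rock", "rocks", "rocked" and "rocking" all reduce to rock. Because the same analyzer is applied at indexing time and at query time, the indexed text and the query end up with the same term and therefore match.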
  

1.7. What's next

The above paragraphs hopefully helped you get started with Hibernate Search. You should by now have a file system based index and be able to search and retrieve a list of managed objects via Hibernate Search. The next step is to get more familiar with the overall architecture (Chapter 2, Architecture) and explore the basic features in more detail.

Two topics which were only briefly touched on in this tutorial are analyzer configuration (Section 4.1.5, “Analyzer”) and field bridges (Section 4.2, “Property/Field Bridge”), both important features required for more fine-grained indexing.

More advanced topics cover clustering (Section 3.4, “JMS Master/Slave configuration”) and the handling of large indexes (Section 3.2, “Index sharding”).