Once you have downloaded and added all required dependencies to your
application, you have to add a couple of properties to your Hibernate
configuration file. If you are using Hibernate directly, this can be done
in hibernate.properties or hibernate.cfg.xml. If you are using Hibernate
via JPA, you can also add the properties to persistence.xml. The good news
is that for standard use most properties offer a sensible default. An
example persistence.xml configuration could look like this:
Example 1.3. Basic configuration options to be added to hibernate.properties, hibernate.cfg.xml or persistence.xml
...
<property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSDirectoryProvider"/>
<property name="hibernate.search.default.indexBase" value="/var/lucene/indexes"/>
...
First you have to tell Hibernate Search which DirectoryProvider to use.
This can be achieved by setting the
hibernate.search.default.directory_provider property. Apache Lucene has
the notion of a Directory to store the index files. Hibernate Search
handles the initialization and configuration of a Lucene Directory
instance via a DirectoryProvider. In this tutorial we will use an
implementation of DirectoryProvider called FSDirectoryProvider, which
keeps the index on the file system. This will give us the ability to
physically inspect the Lucene indexes created by Hibernate Search (e.g.
via Luke). Once you have a working configuration you can start
experimenting with other directory providers (see Section 3.1, “Directory configuration”).
In addition to the directory provider you also have to specify the default
root directory for all indexes via hibernate.search.default.indexBase.
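The same two settings can also be assembled in code, which is sometimes handy in tests. The following is a minimal sketch using only java.util.Properties; the class and method names are hypothetical and chosen for illustration, while the property keys and the provider class name are the ones from Example 1.3.

```java
import java.util.Properties;

class SearchConfigSketch {

    // Builds the two Hibernate Search settings from Example 1.3.
    // The indexBase argument is the root directory for all index files.
    static Properties searchProperties(String indexBase) {
        Properties props = new Properties();
        props.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSDirectoryProvider");
        props.setProperty("hibernate.search.default.indexBase", indexBase);
        return props;
    }

    public static void main(String[] args) {
        // Print the settings in hibernate.properties style (key=value).
        searchProperties("/var/lucene/indexes").list(System.out);
    }
}
```

How the resulting object is handed to Hibernate depends on your setup. Assuming you bootstrap Hibernate through org.hibernate.cfg.Configuration, its addProperties method should accept it; with JPA, the map variant of Persistence.createEntityManagerFactory plays a similar role.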
Let's assume that your application contains the Hibernate managed
classes example.Book and example.Author and you want to add free text
search capabilities to your application in order to search the books
contained in your database.
Example 1.4. Example entities Book and Author before adding Hibernate Search specific annotations
package example;
...
@Entity
public class Book {

    @Id
    @GeneratedValue
    private Integer id;

    private String title;

    private String subtitle;

    @ManyToMany
    private Set<Author> authors = new HashSet<Author>();

    private Date publicationDate;

    public Book() {
    }

    // standard getters/setters follow here
    ...
}
package example;
...
@Entity
public class Author {

    @Id
    @GeneratedValue
    private Integer id;

    private String name;

    public Author() {
    }

    // standard getters/setters follow here
    ...
}
To achieve this you have to add a few annotations to the Book and Author
classes. The first annotation, @Indexed, marks Book as indexable. By
design Hibernate Search needs to store an untokenized id in the index to
ensure index uniqueness for a given entity. @DocumentId marks the property
to use for this purpose and is in most cases the same as the database
primary key. In fact, since the 3.1.0 release of Hibernate Search,
@DocumentId is optional when an @Id annotation exists.
Next you have to mark the fields you want to make searchable. Let's
start with title and subtitle and annotate both with @Field. The parameter
index=Index.TOKENIZED will ensure that the text will be tokenized using
the default Lucene analyzer. Usually, tokenizing means chunking a sentence
into individual words and potentially excluding common words like 'a' or
'the'. We will talk more about analyzers a little later on. The second
parameter we specify within @Field, store=Store.NO, ensures that the
actual data will not be stored in the index. Whether this data is stored
in the index or not has nothing to do with the ability to search for it.
From Lucene's perspective it is not necessary to keep the data once the
index is created. The benefit of storing it is the ability to retrieve it
via projections (Section 5.1.2.5, “Projection”).
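To make the idea of tokenizing concrete, here is a toy sketch in plain Java. It is not the Lucene analyzer and does not reproduce its rules; the class name, the splitting rule and the stop word list are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class ToyTokenizer {

    // Hypothetical stop word list; a real analyzer ships its own.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Lower-cases the text, splits it on anything that is not a letter
    // or digit, and drops empty chunks and stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }
}
```

With this sketch, a title such as "The Art of a Good Index" is reduced to the terms art, of, good and index, which are the units a search would match against.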
Without projections, Hibernate Search will by default execute a
Lucene query in order to find the database identifiers of the entities
matching the query criteria and use these identifiers to retrieve managed
objects from the database. The decision for or against projection has to
be made on a case-by-case basis. The default behaviour - Store.NO - is
recommended since it returns managed objects whereas projections only
return object arrays.
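The two-step lookup described above can be pictured with a small stand-alone sketch. It uses in-memory maps as stand-ins for the Lucene index and the database; every name in it is hypothetical and none of it is Hibernate Search API. The point it tries to show is that the index only needs terms and identifiers, not the original text, for a search to succeed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TwoPhaseLookupSketch {

    // Stand-in for the index: term -> ids of the entities containing it.
    // The title itself is not kept here (comparable to Store.NO).
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Stand-in for the database: id -> title.
    private final Map<Integer, String> database = new HashMap<>();

    void save(Integer id, String title) {
        database.put(id, title);
        for (String term : title.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> ids = index.get(term);
            if (ids == null) {
                ids = new TreeSet<>();
                index.put(term, ids);
            }
            ids.add(id);
        }
    }

    // Step 1: ask the index for matching ids.
    // Step 2: load the full records from the database by id.
    List<String> search(String term) {
        List<String> results = new ArrayList<>();
        Set<Integer> ids = index.get(term.toLowerCase(Locale.ROOT));
        if (ids != null) {
            for (Integer id : ids) {
                results.add(database.get(id));
            }
        }
        return results;
    }
}
```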
After this short look under the hood let's go back to annotating the
Book class. Another annotation we have not yet discussed is @DateBridge.
This annotation is one of the built-in field bridges in Hibernate Search.
The Lucene index is purely string based. For this reason Hibernate Search
must convert the data types of the indexed fields to strings and vice
versa. A range of predefined bridges is provided, including the
DateBridge, which will convert a java.util.Date into a String with the
specified resolution. For more details see Section 4.2, “Property/Field Bridge”.
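As a rough picture of what converting a date to a string at a given resolution means, the sketch below truncates a java.util.Date to day precision. The class name is hypothetical, and the yyyyMMdd pattern and the GMT time zone are assumptions made for this illustration; the string layout actually written by the built-in bridge may differ.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DayResolutionSketch {

    // Formats the date down to day precision; hours, minutes and
    // seconds are dropped. Pattern and time zone are assumptions.
    static String toIndexString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }
}
```

Because the string sorts the same way as the date it came from, two dates on the same day map to the same term, which is what makes searching by day possible on a string-only index.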
This leaves us with @IndexedEmbedded. This annotation is used to index
associated entities (@ManyToMany, @*ToOne and @Embedded) as part of the
owning entity. This is needed since a Lucene index document is a flat data
structure which does not know anything about object relations. To ensure
that the authors' names will be searchable you have to make sure that the
names are indexed as part of the book itself. On top of @IndexedEmbedded
you will also have to mark all fields of the associated entity you want to
have included in the index with @Field. For more details see Section 4.1.3, “Embedded and associated objects”.
These settings should be sufficient for now. For more details on entity mapping refer to Section 4.1, “Mapping an entity”.
Example 1.5. Example entities after adding Hibernate Search annotations
package example;
...
@Entity
@Indexed
public class Book {

    @Id
    @GeneratedValue
    @DocumentId
    private Integer id;

    @Field(index=Index.TOKENIZED, store=Store.NO)
    private String title;

    @Field(index=Index.TOKENIZED, store=Store.NO)
    private String subtitle;

    @IndexedEmbedded
    @ManyToMany
    private Set<Author> authors = new HashSet<Author>();

    @Field(index = Index.UN_TOKENIZED, store = Store.YES)
    @DateBridge(resolution = Resolution.DAY)
    private Date publicationDate;

    public Book() {
    }

    // standard getters/setters follow here
    ...
}
package example;
...
@Entity
public class Author {

    @Id
    @GeneratedValue
    private Integer id;

    @Field(index=Index.TOKENIZED, store=Store.NO)
    private String name;

    public Author() {
    }

    // standard getters/setters follow here
    ...
}