Chapter 7. Index Optimization

From time to time, the Lucene index needs to be optimized. The process is essentially a defragmentation: until an optimization occurs, deleted documents are only marked as deleted and are not physically removed. Optimization can also adjust the number of files in the Lucene Directory.

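Optimization speeds up searches but does nothing to speed up indexing (updates). While an optimization is running, searches can still be performed (though they will most likely be slower), and all index updates are stopped. It is therefore best to optimize when the system is idle or search traffic is low, and after a large number of index modifications.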

7.1. Automatic optimization

Hibernate Search can automatically optimize an index after:

  • a certain number of operations (insertions, deletions) have been applied

  • or a certain number of transactions have been applied

The configuration can be global or defined at the index level:

hibernate.search.default.optimizer.operation_limit.max = 1000
hibernate.search.default.optimizer.transaction_limit.max = 100

hibernate.search.Animal.optimizer.transaction_limit.max = 50

With this configuration, an optimization is triggered on the Animal index as soon as either:

  • the number of additions and deletions reaches 1000

  • the number of transactions reaches 50 (hibernate.search.Animal.optimizer.transaction_limit.max having priority over hibernate.search.default.optimizer.transaction_limit.max)

If none of these parameters is defined, no optimization is performed automatically.
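The override and trigger rules described above can be illustrated with a small, self-contained sketch. It is not Hibernate Search's actual optimizer strategy: the class OptimizerLimitSketch and its methods resolveLimit and recordTransaction are hypothetical names introduced for illustration only. The counter reset after a trigger and the use of -1 for "no limit" are likewise assumptions. Only the property keys come from the configuration shown above.

```java
import java.util.Properties;

// Hypothetical illustration only; this is NOT a Hibernate Search class.
class OptimizerLimitSketch {

    // Looks up the index-specific key first, then the "default" key.
    // Returns -1 (an assumption of this sketch) when neither is defined.
    static int resolveLimit(Properties config, String indexName, String limitName) {
        String suffix = ".optimizer." + limitName + ".max";
        String value = config.getProperty("hibernate.search." + indexName + suffix);
        if (value == null) {
            value = config.getProperty("hibernate.search.default" + suffix);
        }
        return value == null ? -1 : Integer.parseInt(value.trim());
    }

    private final int operationLimit;
    private final int transactionLimit;
    private int operations;
    private int transactions;

    OptimizerLimitSketch(Properties config, String indexName) {
        this.operationLimit = resolveLimit(config, indexName, "operation_limit");
        this.transactionLimit = resolveLimit(config, indexName, "transaction_limit");
    }

    // Records one transaction containing the given number of additions and
    // deletions. Returns true when either limit is reached, and then resets
    // both counters (the reset is an assumption of this sketch).
    boolean recordTransaction(int operationsInTransaction) {
        operations += operationsInTransaction;
        transactions++;
        boolean due = (operationLimit > 0 && operations >= operationLimit)
                || (transactionLimit > 0 && transactions >= transactionLimit);
        if (due) {
            operations = 0;
            transactions = 0;
        }
        return due;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("hibernate.search.default.optimizer.operation_limit.max", "1000");
        config.setProperty("hibernate.search.default.optimizer.transaction_limit.max", "100");
        config.setProperty("hibernate.search.Animal.optimizer.transaction_limit.max", "50");

        OptimizerLimitSketch animal = new OptimizerLimitSketch(config, "Animal");
        int count = 0;
        boolean due = false;
        // Simulate small transactions (one operation each) until a limit is hit.
        while (!due) {
            due = animal.recordTransaction(1);
            count++;
        }
        System.out.println("Optimization due after " + count + " transactions");
    }
}
```

In this sketch the Animal index reaches its transaction limit well before the shared operation limit, because each simulated transaction carries a single operation. An index with no specific entry would fall back to the default limits, and an empty configuration never reports an optimization as due.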

7.2. Manual optimization

You can programmatically optimize (defragment) a Lucene index from Hibernate Search through the SearchFactory:

searchFactory.optimize(Order.class);

searchFactory.optimize();

The first example optimizes the Lucene index holding Orders; the second optimizes all indexes.

The SearchFactory can be accessed from a FullTextSession:

FullTextSession fullTextSession = Search.createFullTextSession(regularSession);
SearchFactory searchFactory = fullTextSession.getSearchFactory();

Note that searchFactory.optimize() has no effect on a JMS backend. You must apply the optimize operation on the master node.

7.3. Adjusting optimization

Apache Lucene has a few parameters that influence how optimization is performed. Hibernate Search exposes those parameters.

Further index optimization parameters include hibernate.search.[default|<indexname>].merge_factor, hibernate.search.[default|<indexname>].max_merge_docs and hibernate.search.[default|<indexname>].max_buffered_docs. See Section 3.7, “Tuning Lucene indexing performance” for more details.