
Chapter 24. QueryHandler configuration

24.1. How does it work?
24.2. Configuration
24.2.1. Common requirements
24.2.2. Query-handler configuration
24.2.3. JBoss-Cache template configuration

Let's talk about indexing content in a cluster.

For a couple of reasons, the index cannot be replicated. This means that data added and indexed on one cluster node is replicated to the other cluster nodes, but is not indexed on those nodes.

So, how does indexing work in a cluster environment?

Since the same data cannot be indexed on every node of the cluster, it must be indexed on a single node. The node that can index data and make changes to the Lucene index is called the "coordinator". The coordinator node is chosen automatically, so no special configuration is needed for it.

But how can the other nodes save their changes to the Lucene index?

First of all, the data is already saved and replicated to the other cluster nodes, so we only need to deliver a message such as "we need to index this data" to the coordinator. That is why JBoss-Cache is used.

All nodes of the cluster write messages into JBoss-Cache, but only the coordinator takes those messages and makes changes to the Lucene index.
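The message flow described above can be pictured with a minimal Java sketch. It is not the actual implementation: `SharedChangeLog` and `ClusterNode` are hypothetical names, an in-memory queue stands in for JBoss-Cache, and a plain list stands in for the Lucene index. It only shows that every node may publish an "index this" message while the coordinator alone applies it.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for the cache that carries "index this" messages. */
class SharedChangeLog {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    /** Records that the item with the given id needs to be indexed. */
    void publish(String itemId) {
        pending.add(itemId);
    }

    /** Returns the next pending item id, or null if there is none. */
    String poll() {
        return pending.poll();
    }
}

/** Hypothetical cluster node; only the coordinator changes the index. */
class ClusterNode {
    private final boolean coordinator;
    private final SharedChangeLog log;
    private final List<String> index; // stands in for the shared Lucene index

    ClusterNode(boolean coordinator, SharedChangeLog log, List<String> index) {
        this.coordinator = coordinator;
        this.log = log;
        this.index = index;
    }

    /** Any node may report that an item needs indexing. */
    void save(String itemId) {
        log.publish(itemId);
    }

    /** Drains pending messages into the index; does nothing on non-coordinator nodes. */
    int processPending() {
        if (!coordinator) {
            return 0;
        }
        int applied = 0;
        String itemId;
        while ((itemId = log.poll()) != null) {
            index.add(itemId);
            applied++;
        }
        return applied;
    }
}
```

In this sketch the coordinator flag is passed in by hand; in the real cluster the coordinator is chosen automatically, as stated above.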

How does search work in a cluster environment?

The search engine does not work with the indexer, the coordinator, and so on. Search needs only the Lucene index. "But only one cluster node can change the Lucene index," you may ask. Yes, and that is why the Lucene index is shared. So all cluster nodes must be configured to use the Lucene index from a shared directory.

A little bit about the indexing process (whether in a cluster or not): the indexer does not write changes to the file system (FS) Lucene index immediately. First, the indexer writes the changes to the volatile index. When the volatile index size reaches 1 MB or more, it is flushed to the FS. There is also a timer that flushes the volatile index on a timeout. The volatile index timeout is configured by the "max-volatile-time" parameter.
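The two flush triggers described above (size and timeout) can be illustrated with a short Java sketch. It is only a model under stated assumptions: `VolatileIndexSketch` is a hypothetical class, plain lists stand in for the volatile and FS indexes, document size is approximated by UTF-8 byte length, and the timeout is expressed in milliseconds purely for convenience (the unit of "max-volatile-time" is not stated here).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a volatile index that is flushed by size or by timeout. */
class VolatileIndexSketch {
    /** Size threshold from the text: flush at 1 MB or more. */
    static final long FLUSH_SIZE_BYTES = 1024L * 1024L;

    private final long maxVolatileTimeMillis; // models "max-volatile-time" (unit assumed)
    private final List<String> volatileDocs = new ArrayList<>();
    private final List<String> persistedDocs = new ArrayList<>(); // stands in for the FS index
    private long volatileBytes = 0;
    private long lastFlushMillis;

    VolatileIndexSketch(long maxVolatileTimeMillis, long nowMillis) {
        this.maxVolatileTimeMillis = maxVolatileTimeMillis;
        this.lastFlushMillis = nowMillis;
    }

    /** Adds a document to the volatile index and flushes if the size threshold is reached. */
    void add(String doc, long nowMillis) {
        volatileDocs.add(doc);
        volatileBytes += doc.getBytes(StandardCharsets.UTF_8).length;
        if (volatileBytes >= FLUSH_SIZE_BYTES) {
            flush(nowMillis);
        }
    }

    /** Called periodically by a timer; flushes if the timeout has elapsed. */
    void tick(long nowMillis) {
        boolean timedOut = nowMillis - lastFlushMillis >= maxVolatileTimeMillis;
        if (timedOut && !volatileDocs.isEmpty()) {
            flush(nowMillis);
        }
    }

    /** Moves everything from the volatile index to the persisted one. */
    private void flush(long nowMillis) {
        persistedDocs.addAll(volatileDocs);
        volatileDocs.clear();
        volatileBytes = 0;
        lastFlushMillis = nowMillis;
    }

    int volatileCount() {
        return volatileDocs.size();
    }

    int persistedCount() {
        return persistedDocs.size();
    }
}
```

Time is passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to step through by hand.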

See more about Search Configuration.

Common scheme of Shared Index