Chapter 24. QueryHandler configuration

JBoss.orgCommunity Documentation

Chapter 24. QueryHandler configuration

24.1. How does it work?

24.2. Configuration

24.2.1. Common requirements
24.2.2. Query-handler configuration
24.2.3. JBoss-Cache template configuration

24.1. How does it work?

Let's talk about indexing content in cluster.

For couple of reasons, we can't replicate index. It means that some data added and indexed on one cluster node will be replicated to another cluster node, but will not be indexed on that node.

So, how do the indexing works in cluster environment?

As, we can not index the same data on all nodes of cluster, we must index it on one node. Node, that can index data and do changes on lucene index, is called "coordinator". Coordinator-node is choosen automaticaly, so we do not need special configuration for coordinator.

But, how can another nodes save their changes to lucene index?

First of all, data is already saved and replicated to another cluster-nodes, so we need only deliver message like "we need to index this data" to coordinator. Thats why Jboss-cache is used.

All nodes of cluster writes messages into JBoss-cache but only coordinator takes those messages and makes changes Lucene index.

How do the search works in cluster environment?

Search engine do not works with indexer, coordinator, etc. Search needs only lucene index. But only one cluster node can change lucene index - asking you. Yes - lucene index is shared. So, all cluster nodes must be configured to use lucene index from shared directory.

A little bit about indexing process (no matter, cluster or not): Indexer does not write changes to FS lucene index immediately. At first, Indexer write the changes to Volatile index. If Volatile index size become 1Mb or more, it is flushed to FS. Also, there is timer, that flushes volatile index by timeout. Volatile index timeout is configured by "max-volatile-time" paremeter.

See more about Search Configuration.

Common scheme of Shared Index

24.2. Configuration

24.2.1. Common requirements

Now, let's see what we need to run Search engine in cluster environment.

Shared directory for storing Lucene index (i.e. NFS);
Changes filter configured as org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter;
Note
This filter ignore changes on non-coordinator nodes, and index changes on coordinator node.
configure JBoss-cache, course;

24.2.2. Query-handler configuration

Configuration example:

<workspace name="ws">
   <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
      <properties>
         <property name="index-dir" value="shareddir/index/db1/ws" />
         <property name="changesfilter-class"
            value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
         <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
         <property name="jgroups-configuration" value="udp-mux.xml" />
         <property name="jgroups-multiplexer-stack" value="true" />
         <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
         <property name="max-volatile-time" value="60" />
      </properties>
   </query-handler>
</workspace>

Table 24.1. Config properties description

Property name	Description
index-dir	path to index
jbosscache-configuration	template of JBoss-cache configuration for all query-handlers in repository
jgroups-configuration	jgroups-configuration is template configuration for all components (search, cache, locks) [Add link to document describing template configurations]
jgroups-multiplexer-stack	[TODO about jgroups-multiplexer-stack - add link to JBoss doc]
jbosscache-cluster-name	cluster name (must be unique)
max-volatile-time	max time to live for Volatile Index

24.2.3. JBoss-Cache template configuration

JBoss-Cache template configuration for query handler.

jbosscache-indexer.xml

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="50000" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />
   <!-- Configure the TransactionManager -->
   <transaction transactionManagerLookupClass="org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup" />

   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <jgroupsConfig multiplexerStack="jcr.stack" />
      <sync />
   </clustering>
   <!-- Eviction configuration -->
   <eviction wakeUpInterval="5000">
      <default algorithmClass="org.jboss.cache.eviction.FIFOAlgorithm" eventQueueSize="1000000">
         <property name="maxNodes" value="10000" />
         <property name="minTimeToLive" value="60000" />
      </default>
   </eviction>

</jbosscache>

See more about template configurations here.