The Lucene index provider stores indexes and executes queries using Lucene 5. It supports all of the index kinds listed below, but each kind may handle only a subset of JCR Query constraints and operands:
Index Kind | Supported Features | Unsupported Features | Multi-column support |
Value | most constraints and operands, including LIKE; both single and multi-valued properties | JOIN and FULL TEXT SEARCH constraints | yes |
Enumerated | same as Value indexes | same as Value indexes | yes |
Unique | same as Value indexes | same as Value indexes | yes |
Nodetype | same as Value indexes | same as Value indexes | no |
Text | FULL TEXT SEARCH constraint | any other JCR query operands and constraints | no |
Even though most index kinds support multiple columns in their index definitions, defining more than one column per index is not recommended.
Lucene does not support merging Documents, so every index update has to perform the merge in memory and then overwrite the existing Document, which incurs a significant performance penalty.
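The cost of this read, merge, overwrite cycle can be pictured with a toy model. The following is a minimal sketch in plain Java, not Lucene or ModeShape code: the class `WholeDocumentIndex` and its methods are hypothetical, and a `Map` stands in for the index. It only illustrates that when a stored document cannot be patched in place, changing a single column means rewriting every column of that document.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model (hypothetical, not the Lucene API): documents can only be
 * replaced as a whole, never patched field by field.
 */
final class WholeDocumentIndex {
    private final Map<String, Map<String, String>> documents = new HashMap<>();
    private int fieldsRewritten = 0;

    /** Changes one column of a node's document via read, merge, overwrite. */
    void updateColumn(String nodeKey, String column, String value) {
        // 1. Read the existing document (empty if the node is not indexed yet).
        Map<String, String> merged = new HashMap<>(
                documents.getOrDefault(nodeKey, new HashMap<>()));
        // 2. Merge the changed column in memory.
        merged.put(column, value);
        // 3. Overwrite the stored document with the merged copy.
        documents.put(nodeKey, merged);
        // Every field of the merged document is written again, not just one.
        fieldsRewritten += merged.size();
    }

    /** Returns the indexed value of a column, or null if absent. */
    String valueOf(String nodeKey, String column) {
        Map<String, String> doc = documents.get(nodeKey);
        return doc == null ? null : doc.get(column);
    }

    /** Total number of field writes performed so far. */
    int fieldsRewritten() {
        return fieldsRewritten;
    }

    public static void main(String[] args) {
        WholeDocumentIndex index = new WholeDocumentIndex();
        index.updateColumn("node-1", "title", "A");
        index.updateColumn("node-1", "author", "B");
        index.updateColumn("node-1", "title", "C");
        System.out.println("fields rewritten: " + index.fieldsRewritten());
    }
}
```

In this model, an index with a single column rewrites exactly one field per update, while every extra column adds to the work done on each change. In Lucene itself the overwrite step corresponds to replacing the stored Document as a whole.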
The following Lucene-related attributes can be configured:
Attribute | Description | Optional | Default value |
directory | the path on disk where indexes should be stored | yes if path and relativeTo are present | |
path | a path relative to the folder given by the relativeTo attribute | yes if directory is present | |
relativeTo | the folder relative to which path is resolved | yes if directory is present | |
directoryClass | the Lucene directory class type | yes | FSDirectory.open |
analyzerClass | the Lucene analyzer instance | yes | StandardAnalyzer |
lockFactoryClass | the Lucene lock factory instance | yes | FSLockFactory.getDefault() |
codec | the Lucene codec instance | yes | Codec.getDefault() |
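The relationship between directory, path and relativeTo can be sketched as follows. This is an illustrative Java sketch, not ModeShape's actual code: the class `IndexLocation` and its `resolve` method are hypothetical. Giving directory precedence when all three are set is an assumption, as is treating relativeTo as a plain folder path.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper showing how an index location could be derived. */
final class IndexLocation {

    /**
     * Returns the on-disk index location, or null when none is configured
     * (for example when an in-memory directory class is used).
     */
    static Path resolve(String directory, String path, String relativeTo) {
        // Assumption: an explicit directory wins over path + relativeTo.
        if (directory != null && !directory.isEmpty()) {
            return Paths.get(directory);
        }
        // Otherwise path is interpreted relative to the relativeTo folder.
        if (path != null && relativeTo != null) {
            return Paths.get(relativeTo).resolve(path).normalize();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("target/indexes", null, null));
        // "/opt/server/data" is a made-up folder used only for illustration.
        System.out.println(resolve(null, "modeshape/artifacts/indexes/", "/opt/server/data"));
    }
}
```

Either directory alone or the path and relativeTo pair is enough to locate the indexes, which is why each is optional when the other is present.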
The standard JSON configuration looks like this:
"indexProviders" : {
    "lucene" : {
        "classname" : "lucene",
        "directory" : "target/indexes"
    }
},
while the advanced configuration looks like this:
"indexProviders" : {
    "lucene" : {
        "classname" : "lucene",
        "lockFactoryClass" : "org.apache.lucene.store.NoLockFactory",
        "directoryClass" : "org.apache.lucene.store.RAMDirectory",
        "analyzerClass" : "org.apache.lucene.analysis.ro.RomanianAnalyzer",
        "codec" : "Lucene53"
    }
},
In either case, make sure the index provider artifact is present on your classpath:
<dependency>
    <groupId>org.modeshape</groupId>
    <artifactId>modeshape-lucene-index-provider</artifactId>
</dependency>
When ModeShape runs inside a JBoss application server, the provider is declared in the server's XML configuration, either as:
<index-provider name="lucene" classname="lucene" path="modeshape/artifacts/indexes/" relative-to="jboss.server.data.dir" module="org.modeshape.index-provider.lucene"/>
or
<index-provider name="lucene" classname="lucene"
    module="org.modeshape.index-provider.lucene"
    lockFactoryClass="org.apache.lucene.store.NoLockFactory"
    directoryClass="org.apache.lucene.store.RAMDirectory"
    analyzerClass="org.apache.lucene.analysis.ro.RomanianAnalyzer"
    codec="Lucene53"/>
Make sure you set the module="org.modeshape.index-provider.lucene" attribute. Without it, the server cannot load the Lucene index provider.
When running full text search queries via the CONTAINS keyword, JCR allows the property name to be optional. For example:
"select [jcr:path] from [nt:testType] as n where contains(n.*,'the quick Dog')"
as opposed to
"select [jcr:path] from [nt:testType] as n where contains(FTSProp,'the quick Dog')"
If you're using the first style of querying, make sure that only one text index applies to [nt:testType]; otherwise your query will not return any results. If you want multiple text indexes for the same node type, make sure that full text search queries name the property explicitly (as in the second example).
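The two query styles can be generated from a small helper. The following is a sketch only: the class `FullTextQueries` and its `contains` method are hypothetical and not part of ModeShape or the JCR API. Unlike the second example above, the helper qualifies the property with the selector and wraps it in brackets so that prefixed names such as jcr:title also work. Doubling single quotes inside the search expression is assumed to be the correct escaping for JCR-SQL2 string literals.

```java
/** Hypothetical helper that builds JCR-SQL2 full text search queries. */
final class FullTextQueries {

    /**
     * Builds a CONTAINS query for the given node type.
     * A null property means "search all properties" (the n.* form).
     */
    static String contains(String nodeType, String property, String searchExpression) {
        // Use the wildcard form only when no property name is supplied.
        String target = (property == null) ? "n.*" : "n.[" + property + "]";
        // Assumption: single quotes in the literal are escaped by doubling them.
        String escaped = searchExpression.replace("'", "''");
        return "select [jcr:path] from [" + nodeType + "] as n where contains("
                + target + ",'" + escaped + "')";
    }

    public static void main(String[] args) {
        // Safe only when a single text index applies to the node type.
        System.out.println(contains("nt:testType", null, "the quick Dog"));
        // Names the property explicitly, so several text indexes may coexist.
        System.out.println(contains("nt:testType", "FTSProp", "the quick Dog"));
    }
}
```

The resulting string would then be passed to the JCR query manager with the JCR-SQL2 language. Always supplying the property name is the safer default when more than one text index may exist for a node type.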
If you're using text extractors and want to run full text searches over the content of [nt:file] nodes, define the text index on [nt:resource], the node type that holds the binary content in its jcr:data property:
"textFromFiles" : {
    "kind" : "text",
    "provider" : "lucene",
    "nodeType" : "nt:resource",
    "columns" : "jcr:data(BINARY)"
}