JBoss Community Archive (Read Only)

ModeShape 5

Lucene

Overview

The Lucene index provider stores indexes and executes queries using Lucene 5. It supports every index kind listed below, although each kind handles only a subset of JCR query constraints and operands:

| Index Kind | Supported Features | Unsupported Features | Multi-column support |
|---|---|---|---|
| Value | most constraints and operands, including LIKE; both single and multi-valued properties | JOIN and FULL TEXT SEARCH constraints | yes |
| Enumerated | same as Value indexes | same as Value indexes | yes |
| Unique | same as Value indexes | same as Value indexes | yes |
| Nodetype | same as Value indexes | same as Value indexes | no |
| Text | FULL TEXT SEARCH constraint | any other JCR query operands and constraints | no |

Even though most index kinds support multiple columns in an index definition, defining multiple columns per index is not recommended.
Lucene does not support merging Documents, so every index update has to perform the merge in memory and then overwrite the existing Document, which incurs a significant performance penalty.
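
A minimal sketch of that recommendation follows. It assumes a hypothetical node type `acme:article` with two properties, `acme:title` and `acme:rating`, and gives each property its own single-column value index instead of combining them. The node type, property names, and index names are invented for illustration. The enclosing `indexes` key is an assumption about where index definitions sit in the repository configuration, so check it against your ModeShape version. The per-index attributes mirror the text index definition shown at the end of this page.

```json
{
    "indexes" : {
        "articlesByTitle" : {
            "kind" : "value",
            "provider" : "lucene",
            "nodeType" : "acme:article",
            "columns" : "acme:title(STRING)"
        },
        "articlesByRating" : {
            "kind" : "value",
            "provider" : "lucene",
            "nodeType" : "acme:article",
            "columns" : "acme:rating(LONG)"
        }
    }
}
```

With one column per index, an update to `acme:title` should only need to rewrite the Document held by `articlesByTitle`, rather than a larger Document shared with `acme:rating`.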

Configuration

There are several Lucene-related attributes that can be configured as follows:

| Attribute | Description | Optional | Default value |
|---|---|---|---|
| directory | the path on disk where indexes should be stored | yes, if path and relativeTo are present | |
| path | a path relative to the folder given by the relativeTo attribute | yes, if directory is present | |
| relativeTo | the folder relative to which path is resolved | yes, if directory is present | |
| directoryClass | the Lucene directory class type | yes | FSDirectory.open |
| analyzerClass | the Lucene analyzer instance | yes | StandardAnalyzer |
| lockFactoryClass | the Lucene lock factory instance | yes | FSLockFactory.getDefault() |
| codec | the Lucene codec instance | yes | Codec.getDefault() |

JSON

The standard JSON configuration looks like this:

    "indexProviders" : {
        "lucene" : {
            "classname" : "lucene",
            "directory" : "target/indexes"
        }
    },

while the advanced configuration looks like this:

    "indexProviders" : {
        "lucene" : {
            "classname" : "lucene",
            "lockFactoryClass" : "org.apache.lucene.store.NoLockFactory",
            "directoryClass" : "org.apache.lucene.store.RAMDirectory",
            "analyzerClass" : "org.apache.lucene.analysis.ro.RomanianAnalyzer",
            "codec" : "Lucene53"
        }
    },

In either case, you need to make sure the index provider artifact is present in your classpath:

  <dependency>
    <groupId>org.modeshape</groupId>
    <artifactId>modeshape-lucene-index-provider</artifactId>
  </dependency>

JBoss AS

<index-provider name="lucene" classname="lucene" path="modeshape/artifacts/indexes/" relative-to="jboss.server.data.dir" module="org.modeshape.index-provider.lucene"/>

or

<index-provider name="lucene" classname="lucene" module="org.modeshape.index-provider.lucene" 
                lockFactoryClass="org.apache.lucene.store.NoLockFactory" directoryClass="org.apache.lucene.store.RAMDirectory" analyzerClass="org.apache.lucene.analysis.ro.RomanianAnalyzer"
                codec="Lucene53"/>

Make sure you set the module="org.modeshape.index-provider.lucene" attribute; without it, the server cannot load the Lucene index provider.

Additional considerations

When running full text search queries via the CONTAINS keyword, JCR allows the property name to be optional. For example:

"select [jcr:path] from [nt:testType] as n where contains(n.*,'the quick Dog')"

as opposed to

"select [jcr:path] from [nt:testType] as n where contains(FTSProp,'the quick Dog')"

If you're using the first style of querying, make sure that only one text index applies to [nt:testType]; otherwise your query will not return any results. If you want multiple text indexes for the same node type, make sure that every full text search query names the property explicitly (as in the second example).
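
As an illustration of the second approach, here is a sketch of two text indexes defined on the same node type, each bound to a single property. `FTSProp` comes from the query above. `FTSOtherProp`, the index names, and the enclosing `indexes` key are assumptions made for this example and should be checked against your own repository configuration.

```json
{
    "indexes" : {
        "textOnFTSProp" : {
            "kind" : "text",
            "provider" : "lucene",
            "nodeType" : "nt:testType",
            "columns" : "FTSProp(STRING)"
        },
        "textOnFTSOtherProp" : {
            "kind" : "text",
            "provider" : "lucene",
            "nodeType" : "nt:testType",
            "columns" : "FTSOtherProp(STRING)"
        }
    }
}
```

With definitions like these, a query that names the property, such as contains(FTSProp,'the quick Dog'), identifies which text index applies, whereas contains(n.*,'the quick Dog') would match more than one text index and, as described above, return no results.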

If you're using text extractors and want to run full text searches over the content of [nt:file] nodes, define your text index on [nt:resource], which is typically the type of the jcr:content child node that holds the file's binary in jcr:data:

    "textFromFiles" : {
        "kind" : "text",
        "provider" : "lucene",
        "nodeType" : "nt:resource",
        "columns" : "jcr:data(BINARY)"
    }
JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-11 12:12:58 UTC, last content change 2016-04-07 07:26:52 UTC.