org.modeshape.search.lucene
Class LuceneSearchEngine

java.lang.Object
  extended by org.modeshape.graph.search.AbstractSearchEngine<WorkspaceType,ProcessorType>
      extended by org.modeshape.search.lucene.AbstractLuceneSearchEngine<LuceneSearchWorkspace,LuceneSearchProcessor>
          extended by org.modeshape.search.lucene.LuceneSearchEngine
All Implemented Interfaces:
SearchEngine

public class LuceneSearchEngine
extends AbstractLuceneSearchEngine<LuceneSearchWorkspace,LuceneSearchProcessor>

A SearchEngine implementation that relies upon two separate indexes to manage the node properties and the node structure (path and children). Using two indexes is more efficient when the node content and structure are updated independently. For example, the structure of the nodes changes whenever same-name-sibling indexes are changed, when sibling nodes are deleted, or when nodes are moved around; in all of these cases, the properties of the nodes do not change.
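
The benefit of keeping structure and properties apart can be illustrated with a small, library-free sketch. Everything in it is hypothetical: two plain maps stand in for the two Lucene indexes, and the names TwoIndexSketch, addNode, and moveNode are not part of ModeShape. It shows only that a move rewrites the structure entry and leaves the property entry untouched; descendants of the moved node are ignored for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: two maps stand in for the two Lucene indexes.
class TwoIndexSketch {
    // "Structure index": node id -> path
    final Map<String, String> pathsById = new HashMap<String, String>();
    // "Content index": node id -> properties
    final Map<String, Map<String, String>> propertiesById =
            new HashMap<String, Map<String, String>>();
    // Counters recording how many writes each index has received.
    int structureWrites = 0;
    int contentWrites = 0;

    // Adding a node writes to both indexes.
    void addNode(String id, String path, Map<String, String> properties) {
        pathsById.put(id, path);
        structureWrites++;
        propertiesById.put(id, new HashMap<String, String>(properties));
        contentWrites++;
    }

    // Moving a node changes only its path, so only the structure index is written.
    void moveNode(String id, String newPath) {
        if (!pathsById.containsKey(id)) {
            throw new IllegalArgumentException("Unknown node: " + id);
        }
        pathsById.put(id, newPath);
        structureWrites++;
    }
}
```

After one addNode call followed by one moveNode call, structureWrites is 2 while contentWrites remains 1.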


Nested Class Summary
protected static class LuceneSearchEngine.CrawlSubgraph
           
protected static class LuceneSearchEngine.ForwardRequest
           
protected static class LuceneSearchEngine.WorkForWorkspaces
           
protected static class LuceneSearchEngine.WorkRequest
           
protected static class LuceneSearchEngine.WorkspaceWork
           
 
Nested classes/interfaces inherited from class org.modeshape.search.lucene.AbstractLuceneSearchEngine
AbstractLuceneSearchEngine.AbstractLuceneProcessor<WorkspaceType extends SearchEngineWorkspace,SessionType extends AbstractLuceneSearchEngine.WorkspaceSession>, AbstractLuceneSearchEngine.TupleCollector, AbstractLuceneSearchEngine.WorkspaceSession
 
Nested classes/interfaces inherited from class org.modeshape.graph.search.AbstractSearchEngine
AbstractSearchEngine.SearchWorkspaces, AbstractSearchEngine.Workspaces<WorkspaceType extends SearchEngineWorkspace>
 
Field Summary
protected  ThreadLocal<DateFormat> dateFormatter
          A thread-local DateFormat instance that is thread-safe, since a new instance is created for each thread.
protected static TextEncoder DEFAULT_ENCODER
          The default encoder is the FileNameEncoder, which is based upon the UrlEncoder except that it also encodes the '*' character, which is required for Windows.
static IndexRules DEFAULT_RULES
          The default set of IndexRules used by LuceneSearchEngine instances when no rules are provided.
 
Fields inherited from class org.modeshape.graph.search.AbstractSearchEngine
DEFAULT_VERIFY_WORKSPACE_IN_SOURCE
 
Constructor Summary
LuceneSearchEngine(String sourceName, RepositoryConnectionFactory connectionFactory, boolean verifyWorkspaceInSource, int maxDepthPerIndexRead, File indexStorageDirectory, IndexRules rules, org.apache.lucene.analysis.Analyzer analyzer)
          Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the indexes in the supplied directory.
LuceneSearchEngine(String sourceName, RepositoryConnectionFactory connectionFactory, boolean verifyWorkspaceInSource, int maxDepthPerIndexRead, IndexRules rules, org.apache.lucene.analysis.Analyzer analyzer)
          Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the Lucene indexes in memory.
LuceneSearchEngine(String sourceName, RepositoryConnectionFactory connectionFactory, boolean verifyWorkspaceInSource, int maxDepthPerIndexRead, LuceneConfiguration configuration, IndexRules rules, org.apache.lucene.analysis.Analyzer analyzer)
          Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the indexes using the supplied LuceneConfiguration.
 
Method Summary
protected  LuceneSearchProcessor createProcessor(ExecutionContext context, AbstractSearchEngine.Workspaces<LuceneSearchWorkspace> workspaces, Observer observer, boolean readOnly)
          Create the SearchEngineProcessor implementation that can be used to operate against the SearchEngineWorkspace instances.
protected  LuceneSearchWorkspace createWorkspace(ExecutionContext context, String workspaceName)
          Create the index(es) required for the named workspace.
 void index(ExecutionContext context, Iterable<ChangeRequest> changes)
          Update the indexes with the supplied set of changes to the content.
protected static ChangeRequest merge(ExecutionContext context, ChangeRequest original, ChangeRequest change)
           
 
Methods inherited from class org.modeshape.graph.search.AbstractSearchEngine
createProcessor, getConnectionFactory, getSourceName, graph, isVerifyWorkspaceInSource
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_RULES

public static final IndexRules DEFAULT_RULES
The default set of IndexRules used by LuceneSearchEngine instances when no rules are provided. By default, these rules index and analyze all properties, except that the dna:uuid and jcr:uuid properties are indexed and stored only (not analyzed and not included in full-text search). The rules also treat the jcr:created and jcr:lastModified properties as dates.
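
The defaults described above amount to a lookup with a fallback, which the following sketch models in plain Java. It is an assumption-laden illustration and does not use the IndexRules API: the class DefaultRulesSketch, the Treatment enum, and the treatmentFor method are hypothetical names introduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the defaults described above; this is not the IndexRules API.
class DefaultRulesSketch {
    enum Treatment { ANALYZED_FULL_TEXT, STORED_NOT_ANALYZED, DATE }

    // Properties that deviate from the general "index and analyze" default.
    private static final Map<String, Treatment> OVERRIDES = new HashMap<String, Treatment>();
    static {
        OVERRIDES.put("dna:uuid", Treatment.STORED_NOT_ANALYZED);
        OVERRIDES.put("jcr:uuid", Treatment.STORED_NOT_ANALYZED);
        OVERRIDES.put("jcr:created", Treatment.DATE);
        OVERRIDES.put("jcr:lastModified", Treatment.DATE);
    }

    // Any property without an override falls back to "index and analyze".
    static Treatment treatmentFor(String propertyName) {
        Treatment treatment = OVERRIDES.get(propertyName);
        return treatment != null ? treatment : Treatment.ANALYZED_FULL_TEXT;
    }
}
```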


DEFAULT_ENCODER

protected static final TextEncoder DEFAULT_ENCODER
The default encoder is the FileNameEncoder, which is based upon the UrlEncoder except that it also encodes the '*' character, which is required for Windows.
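
Windows does not permit '*' in file or directory names, which is why a URL-style encoding alone is not enough for names that become directories. The sketch below shows the idea using only the JDK. It is a stand-in under stated assumptions, not ModeShape's FileNameEncoder or UrlEncoder: the class FileNameEncodingSketch and its encode method are hypothetical, and java.net.URLEncoder may escape characters differently from ModeShape's encoders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical stand-in; ModeShape's FileNameEncoder and UrlEncoder are separate classes.
class FileNameEncodingSketch {
    static String encode(String name) {
        try {
            // java.net.URLEncoder leaves '*' untouched, so it is escaped explicitly afterwards.
            return URLEncoder.encode(name, "UTF-8").replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, a workspace name such as work*space would be written as work%2Aspace.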


dateFormatter

protected ThreadLocal<DateFormat> dateFormatter
A thread-local DateFormat instance that is thread-safe, since a new instance is created for each thread.
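
DateFormat implementations such as SimpleDateFormat are not safe for concurrent use, so giving each thread its own instance avoids synchronization. A minimal sketch of that pattern follows. The class name DateFormatterSketch, the date pattern, the locale, and the UTC time zone are assumptions chosen for the example and are not taken from LuceneSearchEngine.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example of the thread-local pattern; pattern and time zone are assumptions.
class DateFormatterSketch {
    static final ThreadLocal<DateFormat> FORMATTER = new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
            // Called once per thread, on that thread's first call to get().
            SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd'T'HH:mm:ss", Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        }
    };

    // Formats the date with the calling thread's own DateFormat instance.
    static String format(Date date) {
        return FORMATTER.get().format(date);
    }
}
```

Repeated calls to FORMATTER.get() on one thread return the same instance, while another thread receives a separate one.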

Constructor Detail

LuceneSearchEngine

public LuceneSearchEngine(String sourceName,
                          RepositoryConnectionFactory connectionFactory,
                          boolean verifyWorkspaceInSource,
                          int maxDepthPerIndexRead,
                          LuceneConfiguration configuration,
                          IndexRules rules,
                          org.apache.lucene.analysis.Analyzer analyzer)
Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the indexes using the supplied LuceneConfiguration.

Parameters:
sourceName - the name of the source that this engine will search over
connectionFactory - the factory for making connections to the source
verifyWorkspaceInSource - true if the workspaces are to be verified using the source, or false if this engine is used in a way such that all workspaces are known to exist
maxDepthPerIndexRead - the maximum depth that any single request is to read when indexing
configuration - the configuration of the Lucene indexes
rules - the index rules, or null if the default index rules should be used
analyzer - the analyzer, or null if the default analyzer should be used
Throws:
IllegalArgumentException - if the source name, connection factory, or configuration is null

LuceneSearchEngine

public LuceneSearchEngine(String sourceName,
                          RepositoryConnectionFactory connectionFactory,
                          boolean verifyWorkspaceInSource,
                          int maxDepthPerIndexRead,
                          File indexStorageDirectory,
                          IndexRules rules,
                          org.apache.lucene.analysis.Analyzer analyzer)
Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the indexes in the supplied directory.

This is identical to the following:

 TextEncoder encoder = DEFAULT_ENCODER;
 LuceneConfiguration config = LuceneConfigurations.using(indexStorageDirectory, null, encoder, encoder);
 new LuceneSearchEngine(sourceName, connectionFactory, verifyWorkspaceInSource, maxDepthPerIndexRead, config, rules, analyzer);
 
where the default encoder is used to ensure that workspace names and index names can be turned into file system directory names.

Parameters:
sourceName - the name of the source that this engine will search over
connectionFactory - the factory for making connections to the source
verifyWorkspaceInSource - true if the workspaces are to be verified using the source, or false if this engine is used in a way such that all workspaces are known to exist
maxDepthPerIndexRead - the maximum depth that any single request is to read when indexing
indexStorageDirectory - the file system directory in which the indexes are to be kept
rules - the index rules, or null if the default index rules should be used
analyzer - the analyzer, or null if the default analyzer should be used
Throws:
IllegalArgumentException - if the source name, connection factory, or directory is null

LuceneSearchEngine

public LuceneSearchEngine(String sourceName,
                          RepositoryConnectionFactory connectionFactory,
                          boolean verifyWorkspaceInSource,
                          int maxDepthPerIndexRead,
                          IndexRules rules,
                          org.apache.lucene.analysis.Analyzer analyzer)
Create a new instance of a SearchEngine that uses Lucene and a two-index design, and that stores the Lucene indexes in memory.

This is identical to the following:

 new LuceneSearchEngine(sourceName, connectionFactory, verifyWorkspaceInSource, maxDepthPerIndexRead, LuceneConfigurations.inMemory(), rules, analyzer);
 

Parameters:
sourceName - the name of the source that this engine will search over
connectionFactory - the factory for making connections to the source
verifyWorkspaceInSource - true if the workspaces are to be verified using the source, or false if this engine is used in a way such that all workspaces are known to exist
maxDepthPerIndexRead - the maximum depth that any single request is to read when indexing
rules - the index rules, or null if the default index rules should be used
analyzer - the analyzer, or null if the default analyzer should be used
Throws:
IllegalArgumentException - if either the source name or the connection factory is null

Method Detail

createProcessor

protected LuceneSearchProcessor createProcessor(ExecutionContext context,
                                                AbstractSearchEngine.Workspaces<LuceneSearchWorkspace> workspaces,
                                                Observer observer,
                                                boolean readOnly)
Create the SearchEngineProcessor implementation that can be used to operate against the SearchEngineWorkspace instances.

Note that the resulting processor must be closed by the caller when completed.

Specified by:
createProcessor in class AbstractSearchEngine<LuceneSearchWorkspace,LuceneSearchProcessor>
Parameters:
context - the context in which the processor is to be used; never null
workspaces - the set of existing search workspaces; never null
observer - the observer of any events created by the processor; may be null
readOnly - true if the processor will only be reading or searching, or false if the processor will be used to update the workspaces
Returns:
the processor; never null
See Also:
AbstractSearchEngine.createProcessor(ExecutionContext, org.modeshape.graph.search.AbstractSearchEngine.Workspaces, Observer, boolean)

createWorkspace

protected LuceneSearchWorkspace createWorkspace(ExecutionContext context,
                                                String workspaceName)
                                         throws SearchEngineException
Create the index(es) required for the named workspace.

Specified by:
createWorkspace in class AbstractSearchEngine<LuceneSearchWorkspace,LuceneSearchProcessor>
Parameters:
context - the context in which the operation is to be performed; may not be null
workspaceName - the name of the workspace; may not be null
Returns:
the workspace; never null
Throws:
SearchEngineException - if there is a problem creating the workspace.
See Also:
AbstractSearchEngine.createWorkspace(org.modeshape.graph.ExecutionContext, java.lang.String)

index

public void index(ExecutionContext context,
                  Iterable<ChangeRequest> changes)
           throws SearchEngineException
Update the indexes with the supplied set of changes to the content.

Parameters:
context - the execution context for which this session is to be established; may not be null
changes - the set of changes to the content
Throws:
SearchEngineException - if there is a problem updating the indexes
See Also:
SearchEngine.index(org.modeshape.graph.ExecutionContext, java.lang.Iterable)

merge

protected static ChangeRequest merge(ExecutionContext context,
                                     ChangeRequest original,
                                     ChangeRequest change)


Copyright © 2008-2010 JBoss, a division of Red Hat. All Rights Reserved.