Chapter 10. Querying and Searching using JCR

The JCR API defines a way to query a repository for content that meets user-defined criteria. The JCR API actually makes it possible for implementations to support multiple query languages, but the only language required by JCR 1.0 is a subset of XPath. The 1.0 specification also defines a SQL-like query language, but supporting it is optional.

ModeShape now supports this query feature, including the required XPath language and two other languages not defined in the JCR 1.0 specification. This chapter describes how your applications can use queries to search your repositories, and defines the three query languages that are available with ModeShape.

10.1. JCR Query API

Like most operations in the JCR API, querying is done through a Session instance. The session's workspace provides the QueryManager, which defines methods for creating Query objects, storing queries as nodes in the repository, and reconstituting queries that were stored on nodes. Thus, querying a repository generally follows this pattern:



// Obtain the query manager for the session ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();

// Create a query object ...
String language = ...
String expression = ...
javax.jcr.query.Query query = queryManager.createQuery(expression,language);

// Execute the query and get the results ...
javax.jcr.query.QueryResult result = query.execute();

// Iterate over the nodes in the results ...
javax.jcr.NodeIterator nodeIter = result.getNodes();
while ( nodeIter.hasNext() ) {
    javax.jcr.Node node = nodeIter.nextNode();
    ...
}

// Or iterate over the rows in the results ...
String[] columnNames = result.getColumnNames();
javax.jcr.query.RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    javax.jcr.query.Row row = rowIter.nextRow();
    // Iterate over the column values in each row ...
    javax.jcr.Value[] values = row.getValues();
    for ( javax.jcr.Value value : values ) {
        ...
    }
    // Or access the column values by name ...
    for ( String columnName : columnNames ) {
        javax.jcr.Value value = row.getValue(columnName);
        ...
    }
}

// When finished, close the session ...
session.logout();

For more detail about these methods or about how to use other facets of the JCR query API, please consult Section 6.6 of the JCR 1.0 specification.

10.2. JCR XPath Query Language

The JCR 1.0 specification uses the XPath query language because node structures in JCR closely resemble the structure of an XML document. Thus, XPath provides a useful language for selecting and searching workspace content. And since JCR 1.0 defines a mapping between XML and a workspace view called the "document view", adapting XPath to workspace content is quite natural.

A JCR XPath query specifies the subset of nodes in a workspace that satisfy the constraints defined in the query. Constraints can limit the nodes in the results to be those nodes with a specific (primary or mixin) node type, with properties having particular values, or to be within a specific subtree of the workspace. The query also defines how the nodes are to be returned in the result sets using column specifiers and ordering specifiers.

Note

ModeShape actually implements XPath queries by transforming them into the equivalent JCR-SQL2 representation. The JCR-SQL2 language, although often more verbose, is much more capable of representing complex queries with multiple combinations of type, property, and path constraints.

10.2.1. Column Specifiers

JCR 1.0 specifies that support is required only for returning column values based upon single-valued, non-residual properties that are declared on or inherited by the node types specified in the type constraint. ModeShape follows this requirement, and does not allow residual properties to be specified as result columns. However, ModeShape does allow multi-valued properties to be specified as result columns. And as per the specification, ModeShape always returns the "jcr:path" and "jcr:score" pseudo-columns.

ModeShape uses the last location step with an attribute axis to specify the properties that are to be returned as result columns. Multiple properties are specified with a union. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.

Table 10.1. Specifying result set columns

XPath	JCR-SQL2
`//*`	SELECT * FROM [nt:base]
`//element(*,my:type)`	SELECT * FROM [my:type]
`//element(*,my:type)/@my:title`	SELECT [my:title] FROM [my:type]
`//element(*,my:type)/(@my:title | @my:text)`	SELECT [my:title], [my:text] FROM [my:type]
`//element(*,my:type)/(@my:title union @my:text)`	SELECT [my:title], [my:text] FROM [my:type]
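The mapping in Table 10.1 can be sketched in a few lines of Java. This is only an illustration of the pattern shown in the table, not ModeShape's actual translator; the class `ColumnSpecifierSketch` and its method `toSql2` are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of ModeShape) that mirrors the rows of Table 10.1. */
final class ColumnSpecifierSketch {

    /**
     * Builds the JCR-SQL2 text for an XPath query of the form
     * //element(*,typeName)/(@p1 | @p2 | ...).
     * An empty property list stands for "no attribute step", which maps to SELECT *.
     */
    static String toSql2(String typeName, List<String> propertyNames) {
        String columns = propertyNames.isEmpty()
                ? "*"
                : propertyNames.stream()
                        .map(name -> "[" + name + "]")
                        .collect(Collectors.joining(", "));
        return "SELECT " + columns + " FROM [" + typeName + "]";
    }

    public static void main(String[] args) {
        System.out.println(toSql2("nt:base", Collections.<String>emptyList()));
        System.out.println(toSql2("my:type", Arrays.asList("my:title", "my:text")));
    }
}
```

The two calls in `main` correspond to the first and the last rows of the table.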

10.2.2. Type Constraints

JCR 1.0 specifies that support is required only for specifying constraints of one primary type, and it is optional to support specifying constraints on one (or more) mixin types. The specification also defines that the XPath element test be used to test against node types, and that it is optional to support element tests on location steps other than the last one. Type constraints are inherently inheritance-sensitive, in that a constraint against a particular node type 'X' will be satisfied by nodes explicitly declared to be of type 'X' or of subtypes of 'X'.

ModeShape supports using the element test to test against a primary or mixin type, but only on the last location step. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.

Table 10.2. Specifying type constraints

XPath	JCR-SQL2
`//*`	SELECT * FROM [nt:base]
`//element(*,my:type)`	SELECT * FROM [my:type]
`/jcr:root/nodes/element(*,my:type)`	SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%' AND DEPTH([my:type]) = CAST(2 AS LONG)
`/jcr:root/nodes//element(*,my:type)`	SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%'
`/jcr:root/nodes//element(ex:nodeName,my:type)`	SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%' AND NAME([my:type]) = 'ex:nodeName'

Note that the JCR-SQL2 language supported by ModeShape is far more capable of joining multiple sets of nodes with different type, property and path constraints.
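To see why the child-node query in Table 10.2 needs both a PATH and a DEPTH criterion while the descendant query needs only the PATH criterion, the two criteria can be evaluated by hand against a few paths. The sketch below is not ModeShape code. It assumes the usual LIKE wildcards ('%' for any run of characters, '_' for exactly one character, escape sequences ignored) and assumes that the root node has depth 0. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of the PATH/DEPTH criteria in Table 10.2. */
final class PathConstraintSketch {

    /** Evaluates a LIKE pattern: '%' matches any run of characters, '_' exactly one. Escapes are not handled. */
    static boolean like(String value, String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    /** Number of segments in an absolute path; the root path "/" is assumed to have depth 0. */
    static int depth(String path) {
        if (path.equals("/")) {
            return 0;
        }
        return path.substring(1).split("/").length;
    }

    /** Mirrors: PATH(...) LIKE '/nodes/%' AND DEPTH(...) = 2 */
    static boolean isChildOfNodes(String path) {
        return like(path, "/nodes/%") && depth(path) == 2;
    }

    /** Mirrors: PATH(...) LIKE '/nodes/%' */
    static boolean isDescendantOfNodes(String path) {
        return like(path, "/nodes/%");
    }
}
```

With these definitions, "/nodes/a" satisfies both checks, while "/nodes/a/b" satisfies only the descendant check because its depth is 3.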

10.2.3. Property Constraints

JCR 1.0 specifies that attribute tests on the last location step are required, but that predicate tests on any other location steps are optional.

ModeShape does support using attribute tests on the last location step to specify property constraints, as well as supporting axis and filter predicates on other location steps. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.

Table 10.3. Specifying property constraints

XPath	JCR-SQL2
`//*[@prop1]`	SELECT * FROM [nt:base] WHERE [nt:base].prop1 IS NOT NULL
`//element(*,my:type)[@prop1]`	SELECT * FROM [my:type] WHERE [my:type].prop1 IS NOT NULL
`//element(*,my:type)[@prop1=xs:boolean('true')]`	SELECT * FROM [my:type] WHERE [my:type].prop1 = CAST('true' AS BOOLEAN)
`//element(*,my:type)[@id<1 and @name='john']`	SELECT * FROM [my:type] WHERE id < 1 AND name = 'john'
`//element(*,my:type)[a/b/@id]`	SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISCHILDNODE(nodeSet1,[my:type]) JOIN [nt:base] as nodeSet2 ON ISCHILDNODE(nodeSet2,nodeSet1) WHERE (NAME(nodeSet1) = 'a' AND NAME(nodeSet2) = 'b') AND nodeSet2.id IS NOT NULL
`//element(*,my:type)[.//*/@id]`	SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISCHILDNODE(nodeSet1,[my:type]) JOIN [nt:base] as nodeSet2 ON ISCHILDNODE(nodeSet2,nodeSet1) WHERE nodeSet2.id IS NOT NULL
`//element(*,my:type)[.//@id]`	SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISDESCENDANTNODE(nodeSet1,[my:type]) WHERE nodeSet1.id IS NOT NULL

Section 6.6.3.3 of the JCR 1.0 specification contains an in-depth description of property value constraints using various comparison operators.

10.2.4. Path Constraints

JCR 1.0 specifies that exact, child node, and descendants-or-self path constraints be supported on the location steps in an XPath query.

ModeShape supports these kinds of path constraints. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.

Table 10.4. Specifying path constraints

XPath	JCR-SQL2
`/jcr:root/a/b[*]`	SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b'
`/jcr:root/a[1]/b[*]`	SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b'
`/jcr:root/a[2]/b`	SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a[2]/b'
`/jcr:root/a/b[2]//c[4]`	SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b[2]/c[4]' OR PATH([nt:base]) LIKE '/a/b[2]/%/c[4]'
`/jcr:root/a/b//c//d`	SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b/c/d' OR PATH([nt:base]) LIKE '/a/b/%/c/d' OR PATH([nt:base]) LIKE '/a/b/c/%/d' OR PATH([nt:base]) LIKE '/a/b/%/c/%/d'
`//element(*,my:type)[@id<1 and @name='john']`	SELECT * FROM [my:type] WHERE id < 1 AND name = 'john'
`/jcr:root/a/b//element(*,my:type)`	SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/a/b/%'

Note that the JCR-SQL2 language supported by ModeShape is capable of representing a wider combination of path constraints.
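The rows of Table 10.4 that contain '//' in the middle of a path follow a regular pattern: each '//' either contributes nothing (the step is a direct child) or contributes a '/%' wildcard segment (the step is a deeper descendant), and every combination becomes one PATH criterion. The following sketch reproduces that pattern. It is an illustration only, not ModeShape's translator, and the class `DescendantStepSketch` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of how '//' steps expand into PATH criteria (see Table 10.4). */
final class DescendantStepSketch {

    /**
     * @param selector          the selector text, e.g. "[nt:base]"
     * @param steps             location step names after /jcr:root, e.g. {"a", "b", "c", "d"}
     * @param viaDescendantAxis viaDescendantAxis[i] is true when steps[i] was preceded by '//'
     * @return one criterion per combination of "direct" and "deeper" choices for each '//'
     */
    static List<String> pathCriteria(String selector, String[] steps, boolean[] viaDescendantAxis) {
        // Remember which step indexes were written with '//'.
        List<Integer> slots = new ArrayList<>();
        for (int i = 0; i < steps.length; i++) {
            if (viaDescendantAxis[i]) {
                slots.add(i);
            }
        }
        List<String> criteria = new ArrayList<>();
        // Each bit of the mask decides whether one '//' expands to a '/%' segment.
        for (int mask = 0; mask < (1 << slots.size()); mask++) {
            StringBuilder path = new StringBuilder();
            for (int i = 0; i < steps.length; i++) {
                int slot = slots.indexOf(i);
                if (slot >= 0 && (mask & (1 << slot)) != 0) {
                    path.append("/%");
                }
                path.append('/').append(steps[i]);
            }
            // Without any wildcard the path is exact, so '=' is used instead of LIKE.
            String operator = (mask == 0) ? " = " : " LIKE ";
            criteria.add("PATH(" + selector + ")" + operator + "'" + path + "'");
        }
        return criteria;
    }
}
```

Joining the result for the steps a, b, //c, //d with " OR " gives the four criteria listed for `/jcr:root/a/b//c//d` in the table. The sketch does not attempt to handle a leading '//' or type tests.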

10.2.5. Ordering Specifiers

JCR 1.0 extends the XPath grammar to add support for ordering the results according to the natural ordering of the values of one or more properties on the nodes.

ModeShape does support zero or more ordering specifiers, including whether each specifier is ascending or descending. If no ordering specifiers are defined, the ordering of the results is not predefined and may vary (though ordering by score is often the approach). For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.

Table 10.5. Specifying result ordering

XPath	JCR-SQL2
`//element(*,*) order by @title`	SELECT nodeSet1.title FROM [nt:base] AS nodeSet1 ORDER BY nodeSet1.title
`//element(*,*) order by @title, @jcr:score`	SELECT nodeSet1.title FROM [nt:base] AS nodeSet1 ORDER BY nodeSet1.title, SCORE(nodeSet1)

Note that the JCR-SQL2 language supported by ModeShape has a far richer ORDER BY clause, allowing the use of any kind of dynamic operand, including ordering upon arithmetic operations of multiple dynamic operands.

10.2.6. Miscellaneous

JCR 1.0 defines a number of other optional and required features, and these are summarized in this section.

Only abbreviated XPath syntax is supported.
Only the child axis (the default axis, represented by '/' in abbreviated syntax), descendant-or-self axis (represented by '//' in abbreviated syntax), self axis (represented by '.' in abbreviated syntax), and attribute axis (represented by '@' in abbreviated syntax) are supported.
The text() node test is not supported.
The element() node test is supported.
The jcr:like() function is supported.
The jcr:contains() function is supported.
The jcr:score() function is supported.
The jcr:deref() function is not supported.

10.3. JCR-SQL Query Language

The JCR-SQL query language is defined by the JCR 1.0 specification as a way to express queries using strings that are similar to SQL. Support for the language is optional, and in fact this language was deprecated in the JCR 2.0 specification in favor of the improved and more powerful (and more SQL-like) JCR-SQL2 language, which is covered in the next section. As such, ModeShape does not support the original JCR-SQL language.

10.4. JCR-SQL2 Query Language

The JCR-SQL2 query language is defined by the JCR 2.0 specification as a way to express queries using strings that are similar to SQL. This query language is an improvement over the earlier JCR-SQL language, which has been deprecated in JCR 2.0 (see previous section).

ModeShape includes full support for the complete JCR-SQL2 query language. However, ModeShape adds several extensions to make it even more powerful:

Support for the "FULL OUTER JOIN" and "CROSS JOIN" join types, in addition to the "LEFT OUTER JOIN", "RIGHT OUTER JOIN" and "INNER JOIN" types defined by JCR-SQL2. Note that "JOIN" is a shorthand for "INNER JOIN". For detail, see the grammar for joins.
Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a single result set. As with standard SQL, the result sets being combined must have the same columns. The UNION operator combines the rows from two result sets, the INTERSECT operator returns the rows that are common to two result sets, and the EXCEPT operator returns the rows of the first result set that do not appear in the second. Duplicate rows are removed unless the operator is followed by the ALL keyword. For detail, see the grammar for set queries.
Removal of duplicate rows in the results, using "SELECT DISTINCT ...". For detail, see the grammar for queries.
Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the maximum number of rows that should be returned. This clause may optionally be followed by the "OFFSET number" clause to specify the number of initial rows that should be skipped. For detail, see the grammar for limits and offsets.
Additional dynamic operands "DEPTH([<selectorName>])" and "PATH([<selectorName>])" that enable placing constraints on the node depth and path, respectively. These dynamic operands can be used in a manner similar to "NAME([<selectorName>])" and "LOCALNAME([<selectorName>])" that are defined by JCR-SQL2. Note that in each of these cases, the selector name is optional if there is only one selector in the query. For detail, see the grammar for dynamic operands.
Additional dynamic operands "REFERENCE([<selectorName>.]<propertyName>)" and "REFERENCE([<selectorName>])" that enable placing constraints on a single reference property or on any reference property, respectively, and which can be used in a manner similar to "PropertyValue([<selectorName>.]<propertyName>)". Note that in each of these cases, the selector name is optional if there is only one selector in the query, and the property name can be omitted if the constraint should apply to all reference properties. For detail, see the grammar for dynamic operands.
Support for the IN and NOT IN clauses to more easily and concisely supply multiple discrete static operands. For example, "WHERE ... [my:type].[prop1] IN (3,5,7,10,11,50) ...". For detail, see the grammar for set constraints.
Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands. For example, "WHERE ... [my:type].[prop1] BETWEEN 3 EXCLUSIVE AND 10 ...". For detail, see the grammar for between constraints.
Support for simple arithmetic in numeric-based criteria and order-by clauses. For example, "... WHERE SCORE(type1) + SCORE(type2) > 1.0" or "... ORDER BY (SCORE(type1) * SCORE(type2)) ASC, LENGTH(type2.property1) DESC". For detail, see the grammar for order-by clauses.
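The set operations in the list above follow the standard SQL meaning: UNION keeps the rows found in either result set, INTERSECT keeps the rows common to both, and EXCEPT keeps the rows of the first result set that are absent from the second. The sketch below models rows as plain Java values to show that behavior, including the effect of the ALL keyword on UNION. It is only an illustration with hypothetical names (`SetQuerySketch` and its methods) and says nothing about how ModeShape evaluates set queries internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of UNION, INTERSECT and EXCEPT on two result sets. */
final class SetQuerySketch {

    /** UNION: rows from either side, duplicates removed. */
    static <T> List<T> union(List<T> left, List<T> right) {
        Set<T> rows = new LinkedHashSet<>(left);
        rows.addAll(right);
        return new ArrayList<>(rows);
    }

    /** UNION ALL: rows from both sides, duplicates kept. */
    static <T> List<T> unionAll(List<T> left, List<T> right) {
        List<T> rows = new ArrayList<>(left);
        rows.addAll(right);
        return rows;
    }

    /** INTERSECT: rows present on both sides, duplicates removed. */
    static <T> List<T> intersect(List<T> left, List<T> right) {
        Set<T> rows = new LinkedHashSet<>(left);
        rows.retainAll(right);
        return new ArrayList<>(rows);
    }

    /** EXCEPT: rows of the left side that do not appear on the right, duplicates removed. */
    static <T> List<T> except(List<T> left, List<T> right) {
        Set<T> rows = new LinkedHashSet<>(left);
        rows.removeAll(right);
        return new ArrayList<>(rows);
    }
}
```

For the inputs [a, b, b] and [b, c], `union` yields [a, b, c], `intersect` yields [b], `except` yields [a], and `unionAll` yields [a, b, b, b, c].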

The grammar for the JCR-SQL2 query language is actually a superset of that defined by the JCR 2.0 specification, and as such the complete grammar is included here.

Note

The grammar is presented using the same EBNF nomenclature as used in the JCR 2.0 specification. Terms surrounded by '[' and ']' denote optional terms that appear zero or one times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses are used to identify groups, and are often used to surround possible values. Literals (or keywords) are denoted by single-quotes.

10.4.1. Queries

QueryCommand ::= Query | SetQuery

SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
                 { ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }

Query ::= 'SELECT' ['DISTINCT'] columns
          'FROM' Source
          ['WHERE' Constraint]
          ['ORDER BY' orderings]
          [Limit]

10.4.2. Sources

Source ::= Selector | Join

Selector ::= nodeTypeName ['AS' selectorName]

nodeTypeName ::= Name

10.4.3. Joins

	
Join ::= left [JoinType] 'JOIN' right 'ON' JoinCondition
         // If JoinType is omitted INNER is assumed.

left ::= Source
right ::= Source

JoinType ::= Inner | LeftOuter | RightOuter | FullOuter | Cross

Inner ::= 'INNER' ['JOIN']

LeftOuter ::= 'LEFT JOIN' | 'OUTER JOIN' | 'LEFT OUTER JOIN'

RightOuter ::= 'RIGHT OUTER' ['JOIN']

FullOuter ::= 'FULL OUTER' ['JOIN']

Cross ::= 'CROSS' ['JOIN']

JoinCondition ::= EquiJoinCondition | SameNodeJoinCondition | 
                  ChildNodeJoinCondition | DescendantNodeJoinCondition

10.4.4. Equi-Join Conditions

	
EquiJoinCondition ::= selector1Name'.'property1Name '=' selector2Name'.'property2Name

selector1Name ::= selectorName
selector2Name ::= selectorName
property1Name ::= propertyName
property2Name ::= propertyName

10.4.5. Same-Node Join Conditions

	
SameNodeJoinCondition ::= 'ISSAMENODE(' selector1Name ',' selector2Name [',' selector2Path] ')'

selector2Path ::= Path

10.4.6. Child-Node Join Conditions

	
ChildNodeJoinCondition ::= 'ISCHILDNODE(' childSelectorName ',' parentSelectorName ')'

childSelectorName ::= selectorName
parentSelectorName ::= selectorName

10.4.7. Descendant-Node Join Conditions

	
DescendantNodeJoinCondition ::= 'ISDESCENDANTNODE(' descendantSelectorName 
                                                ',' ancestorSelectorName ')'
descendantSelectorName ::= selectorName
ancestorSelectorName ::= selectorName

10.4.8. Constraints

	
Constraint ::= ConstraintItem | '(' ConstraintItem ')'

ConstraintItem ::= And | Or | Not | Comparison | Between | PropertyExistence | 
                   SetConstraint | FullTextSearch | SameNode | ChildNode | DescendantNode

10.4.9. And Constraints


And ::= constraint1 'AND' constraint2

constraint1 ::= Constraint
constraint2 ::= Constraint

10.4.10. Or Constraints

	
Or ::= constraint1 'OR' constraint2

10.4.11. Not Constraints

	
Not ::= 'NOT' Constraint

10.4.12. Comparison Constraints

	
Comparison ::= DynamicOperand Operator StaticOperand

Operator ::= '=' | '!=' | '<' | '<=' | '>' | '>=' | 'LIKE'

10.4.13. Between Constraints

	
Between ::= DynamicOperand ['NOT'] 'BETWEEN' lowerBound ['EXCLUSIVE'] 
                                   'AND' upperBound ['EXCLUSIVE']

lowerBound ::= StaticOperand
upperBound ::= StaticOperand
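The effect of the optional EXCLUSIVE keywords can be sketched as a simple predicate. The sketch assumes that a bound is inclusive unless it is followed by EXCLUSIVE, which is how the keyword reads in the grammar above. The class `BetweenSketch` is hypothetical and works on long values only.

```java
/** Hypothetical illustration of BETWEEN with optional EXCLUSIVE bounds. */
final class BetweenSketch {

    /**
     * Mirrors: value BETWEEN lower [EXCLUSIVE] AND upper [EXCLUSIVE].
     * A bound is treated as inclusive unless its exclusive flag is set (an assumption of this sketch).
     */
    static boolean between(long value, long lower, boolean lowerExclusive,
                           long upper, boolean upperExclusive) {
        boolean aboveLower = lowerExclusive ? value > lower : value >= lower;
        boolean belowUpper = upperExclusive ? value < upper : value <= upper;
        return aboveLower && belowUpper;
    }

    /** Mirrors the NOT BETWEEN form, which is simply the negation. */
    static boolean notBetween(long value, long lower, boolean lowerExclusive,
                              long upper, boolean upperExclusive) {
        return !between(value, lower, lowerExclusive, upper, upperExclusive);
    }
}
```

Under that assumption, a constraint such as "BETWEEN 3 EXCLUSIVE AND 10" rejects 3, and accepts 4 and 10.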

10.4.14. Property Existence Constraints

	
PropertyExistence ::= selectorName'.'propertyName 'IS' ['NOT'] 'NULL' | 
                      propertyName 'IS' ['NOT'] 'NULL' /* If only one selector exists in this query */

10.4.15. Set Constraints


SetConstraint ::= [selectorName'.']propertyName ['NOT'] 'IN'
                      '(' firstStaticOperand {',' additionalStaticOperand } ')'
                  /* If only one selector exists in this query, explicit specification of the selectorName
                     preceding the propertyName is optional */
firstStaticOperand ::= StaticOperand
additionalStaticOperand ::= StaticOperand

10.4.16. Full-text Search Constraints

	
FullTextSearch ::= 'CONTAINS(' ([selectorName'.']propertyName | selectorName'.*') 
                           ',' ''' fullTextSearchExpression''' ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the propertyName is optional */

fullTextSearchExpression ::= FulltextSearch

where FulltextSearch is defined by the following, and is the same as the full-text search language supported by ModeShape:


FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}

Disjunct ::= Term {Space Term}

Term ::= ['-'] SimpleTerm

SimpleTerm ::= Word | '"' Word {Space Word} '"'

Word ::= NonSpaceChar {NonSpaceChar}

Space ::= SpaceChar {SpaceChar}

NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */

SpaceChar ::= ' '

Char ::= /* Any character */

10.4.17. Same-Node Constraint

	
SameNode ::= 'ISSAMENODE(' [selectorName ','] Path ')' 
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the path is optional */

10.4.18. Child-Node Constraints


ChildNode ::= 'ISCHILDNODE(' [selectorName ','] Path ')' 
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the path is optional */

10.4.19. Descendant-Node Constraints


DescendantNode ::= 'ISDESCENDANTNODE(' [selectorName ','] Path ')' 
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the path is optional */

10.4.20. Paths and Names


Name ::= '[' quotedName ']' | '[' simpleName ']' | simpleName

quotedName ::= /* A JCR Name (see the JCR specification) */
simpleName ::= /* A JCR Name that contains only SQL-legal 
                  characters (namely letters, digits, and underscore) */

Path ::= '[' quotedPath ']' | '[' simplePath ']' | simplePath

quotedPath ::= /* A JCR Path that contains non-SQL-legal characters */
simplePath ::= /* A JCR Path (rather Name) that contains only SQL-legal 
                    characters (namely letters, digits, and underscore) */

10.4.21. Static Operands

	
StaticOperand ::= Literal | BindVariableValue

Literal ::= CastLiteral | UncastLiteral

CastLiteral ::= 'CAST(' UncastLiteral ' AS ' PropertyType ')'

PropertyType ::= 'STRING' | 'BINARY' | 'DATE' | 'LONG' | 'DOUBLE' | 'DECIMAL' | 
                 'BOOLEAN' | 'NAME' | 'PATH' | 'REFERENCE' | 'WEAKREFERENCE' | 'URI'
                 /* 'WEAKREFERENCE' is not currently supported in JCR 1.0 */

UncastLiteral ::= UnquotedLiteral | ''' UnquotedLiteral ''' | '"' UnquotedLiteral '"'

UnquotedLiteral ::= /* String form of a JCR Value, as defined in the JCR specification */

10.4.22. Bind Variables

	
BindVariableValue ::= '$'bindVariableName

bindVariableName ::= /* A string that conforms to the JCR Name syntax, though the prefix
                        does not need to be a registered namespace prefix. */

10.4.23. Dynamic Operands

	
DynamicOperand ::= PropertyValue | ReferenceValue | Length | NodeName | NodeLocalName | NodePath | NodeDepth | 
                   FullTextSearchScore | LowerCase | UpperCase | Arithmetic | 
                   '(' DynamicOperand ')'

PropertyValue ::= [selectorName'.'] propertyName
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the propertyName is optional */

ReferenceValue ::= 'REFERENCE(' selectorName '.' propertyName ')' |
                   'REFERENCE(' selectorName ')' |
                   'REFERENCE()'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the propertyName is optional. Also, the property name may be excluded 
                      if the constraint should apply to any reference property. */

Length ::= 'LENGTH(' PropertyValue ')'

NodeName ::= 'NAME(' [selectorName] ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      is optional */

NodeLocalName ::= 'LOCALNAME(' [selectorName] ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      is optional */

NodePath ::= 'PATH(' [selectorName] ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      is optional */

NodeDepth ::= 'DEPTH(' [selectorName] ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      is optional */

FullTextSearchScore ::= 'SCORE(' [selectorName] ')'
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      is optional */

LowerCase ::= 'LOWER(' DynamicOperand ')'

UpperCase ::= 'UPPER(' DynamicOperand ')'

Arithmetic ::= DynamicOperand ('+'|'-'|'*'|'/') DynamicOperand

10.4.24. Ordering

	
orderings ::= Ordering {',' Ordering}

Ordering ::= DynamicOperand [Order]

Order ::= 'ASC' | 'DESC'

10.4.25. Columns

	
columns ::= (Column {',' Column}) | '*'

Column ::= ([selectorName'.']propertyName ['AS' columnName]) | (selectorName'.*')
                   /* If only one selector exists in this query, explicit specification of the selectorName
                      preceding the propertyName is optional */
selectorName ::= Name
propertyName ::= Name
columnName ::= Name

10.4.26. Limit and Offset

	
Limit ::= 'LIMIT' count [ 'OFFSET' offset ]
count ::= /* Positive integer value */
offset ::= /* Non-negative integer value */
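The combined effect of LIMIT and OFFSET on an ordered list of rows can be sketched as follows: the first `offset` rows are skipped, and at most `count` of the remaining rows are returned. The class `LimitOffsetSketch` is a hypothetical illustration, not ModeShape code.

```java
import java.util.List;

/** Hypothetical illustration of "LIMIT count OFFSET offset" applied to a list of rows. */
final class LimitOffsetSketch {

    /**
     * @param rows   the full, ordered result
     * @param count  maximum number of rows to return (positive)
     * @param offset number of initial rows to skip (non-negative)
     */
    static <T> List<T> page(List<T> rows, int count, int offset) {
        int from = Math.min(offset, rows.size());
        // Use long arithmetic so that a very large count cannot overflow.
        int to = (int) Math.min((long) from + count, rows.size());
        return rows.subList(from, to);
    }
}
```

For five rows r1 to r5, "LIMIT 2 OFFSET 1" corresponds to `page(rows, 2, 1)` and returns r2 and r3. An offset beyond the end of the result yields an empty list.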

10.5. Full-Text Search Language

There are times when a formal structured query language is overkill, and the easiest way to find the right content is to perform a search, like you would with a search engine such as Google or Yahoo! This is where ModeShape's full-text search language comes in, because it allows you to use the JCR query API but with a far simpler, Google-style search grammar.

This query language is actually defined by the JCR 2.0 specification as the full-text search expression grammar used in the second parameter of the CONTAINS(...) function of the JCR-SQL2 language. We just pulled it out and made it available as a first-class query language.

This language allows a JCR client to construct a query to find nodes with property values that match the supplied terms. Nodes that "best" match the terms are returned before nodes that have a lesser match. Of course, ModeShape uses a complex system to analyze the node content and the query terms, and may perform a number of optimizations, such as (but not limited to) eliminating stop words (e.g., "the", "a", "and", etc.), treating terms independent of case, and converting words to base forms using a process called stemming (e.g., "running" into "run", "customers" into "customer").

Search terms can also include phrases by simply wrapping the phrase with double-quotes. For example, the search term 'table "customer invoice"' would rank higher those nodes with properties containing the phrase "customer invoice" than nodes with properties containing just "customer" or "invoice".

Terms in the query are implicitly AND-ed together, meaning that the matches occur when a node has property values that match all of the terms. However, it is also possible to put an "OR" in between two terms where either of those terms may occur.

It is also possible to specify that terms should not appear in the results. This is called a negative term, and it reduces the rank of any node whose property values contain the term. To specify a negative term, simply prefix the term with a hyphen ('-').

The grammar for this full-text search language is specified in Section 6.7.19 of the JCR 2.0 specification, but it is also included here as a convenience.

10.5.1. Full-text Search Expressions


FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}

Disjunct ::= Term {Space Term}

Term ::= ['-'] SimpleTerm

SimpleTerm ::= Word | '"' Word {Space Word} '"'

Word ::= NonSpaceChar {NonSpaceChar}

Space ::= SpaceChar {SpaceChar}

NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */

SpaceChar ::= ' '

Char ::= /* Any character */

As you can see, this is a pretty simple and straightforward query language. But this language makes it extremely easy to find all the nodes in the repository that match a set of terms.
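To make the grammar concrete, here is a minimal hand-written parser for it. It splits an expression into disjuncts (separated by "OR"), each holding the terms that are implicitly AND-ed together, and records whether each term is a negative term. This is a sketch under stated assumptions, not ModeShape's parser: the class `FullTextSearchSketch` and its nested `Term` type are hypothetical, and the sketch treats an unquoted, un-negated "OR" as the operator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for the full-text search grammar shown above. */
final class FullTextSearchSketch {

    /** A single word or quoted phrase, optionally marked as a negative term. */
    static final class Term {
        final boolean excluded;
        final String text;

        Term(boolean excluded, String text) {
            this.excluded = excluded;
            this.text = text;
        }
    }

    /** Returns the disjuncts of the expression; the terms inside each disjunct are AND-ed. */
    static List<List<Term>> parse(String expression) {
        List<List<Term>> disjuncts = new ArrayList<>();
        List<Term> current = new ArrayList<>();
        int i = 0;
        int n = expression.length();
        while (i < n) {
            char c = expression.charAt(i);
            if (c == ' ') {          // Space ::= SpaceChar {SpaceChar}
                i++;
                continue;
            }
            boolean excluded = false;
            if (c == '-' && i + 1 < n && expression.charAt(i + 1) != ' ') {
                excluded = true;     // Term ::= ['-'] SimpleTerm
                i++;
                c = expression.charAt(i);
            }
            boolean quoted = (c == '"');
            String text;
            if (quoted) {            // SimpleTerm ::= '"' Word {Space Word} '"'
                int end = expression.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated phrase at position " + i);
                }
                text = expression.substring(i + 1, end);
                i = end + 1;
            } else {                 // SimpleTerm ::= Word
                int end = expression.indexOf(' ', i);
                if (end < 0) {
                    end = n;
                }
                text = expression.substring(i, end);
                i = end;
            }
            if (!quoted && !excluded && text.equals("OR")) {
                if (current.isEmpty()) {
                    throw new IllegalArgumentException("'OR' must follow a term");
                }
                disjuncts.add(current);
                current = new ArrayList<>();
            } else {
                current.add(new Term(excluded, text));
            }
        }
        if (current.isEmpty()) {
            throw new IllegalArgumentException("Expression must end with a term");
        }
        disjuncts.add(current);
        return disjuncts;
    }
}
```

Parsing `table "customer invoice"` gives a single disjunct with the word "table" and the phrase "customer invoice", while `red -blue OR green` gives two disjuncts, the first of which contains the negative term "blue".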

When using this query language, the QueryResult always contains the "jcr:path" and "jcr:score" columns.