The JCR API defines a way to query a repository for content that meets user-defined criteria. The JCR 2.0 API actually makes it possible for implementations to support multiple query languages, and the specification requires support for two languages: JCR-SQL2 and JCR-QOM. JCR 1.0 defined two other languages (XPath and JCR-SQL), though these languages were deprecated in JCR 2.0.
At this time, ModeShape supports all of these query languages, plus one search-engine-like language called "search" that is actually just the full-text search expression grammar used in the second parameter of the CONTAINS(...) function of the JCR-SQL2 language.
ModeShape handles all of these languages in nearly the same manner, the only difference being whether the query is represented as a string or built programmatically using the javax.jcr.query.qom part of the JCR API.
A language-independent representation, called the query model, is constructed by parsing the string representation of the query (using a language-specific parser) or the JCR-QOM objects created by the client.
The language-independent query model is used to create a canonical (relational) query plan.
The canonical query plan is then validated to ensure that all identifiers in the query are resolvable.
The canonical query plan is then optimized using a flexible rule-based optimizer. Optimizations include (but are not limited to): replace view references; unify handling of aliases; convert right outer joins into left outer joins; choose algorithms for each join; raise and lower criteria; push projection of columns as low in the plan as possible; duplicate criteria across identity joins; rewrite identity joins involving only columns that form keys; remove parts of the plan that (based upon the criteria) will return no rows; determine the low-level "access" queries that will be submitted to the connector layer.
The optimized query plan is then executed, whereby each access query is pushed down to the connector and the results are then processed and combined to produce the desired result set.
Note that only the parsing step is dependent upon the query language. This means that all of the query languages are processed using the same, unified engine.
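To make the idea of a rule-based optimizer more concrete, the following is a minimal sketch of one of the rules listed above: converting right outer joins into left outer joins by swapping the two sides. The PlanNode, JoinType, and Rule types are hypothetical and exist only for this illustration; they are not ModeShape classes, and ModeShape's actual plan representation may differ.

```java
import java.util.Arrays;
import java.util.List;

final class JoinRewriteSketch {

    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER }

    /** A hypothetical plan node: either a leaf source or a join of two children. */
    static final class PlanNode {
        final String source;      // non-null only for leaf nodes
        final JoinType joinType;  // non-null only for join nodes
        final PlanNode left;
        final PlanNode right;

        PlanNode(String source) {
            this(source, null, null, null);
        }

        PlanNode(JoinType joinType, PlanNode left, PlanNode right) {
            this(null, joinType, left, right);
        }

        private PlanNode(String source, JoinType joinType, PlanNode left, PlanNode right) {
            this.source = source;
            this.joinType = joinType;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return source != null ? source : "(" + left + " " + joinType + " " + right + ")";
        }
    }

    /** A hypothetical optimizer rule: takes a plan and returns a (possibly rewritten) plan. */
    interface Rule {
        PlanNode apply(PlanNode plan);
    }

    /** Rewrites "A RIGHT OUTER JOIN B" as the equivalent "B LEFT OUTER JOIN A", recursively. */
    static final Rule RIGHT_TO_LEFT_OUTER = new Rule() {
        @Override
        public PlanNode apply(PlanNode plan) {
            if (plan.source != null) {
                return plan; // leaf nodes are left unchanged
            }
            PlanNode left = apply(plan.left);
            PlanNode right = apply(plan.right);
            if (plan.joinType == JoinType.RIGHT_OUTER) {
                return new PlanNode(JoinType.LEFT_OUTER, right, left);
            }
            return new PlanNode(plan.joinType, left, right);
        }
    };

    /** Applies each rule in order, feeding the output of one rule into the next. */
    static PlanNode optimize(PlanNode plan, List<Rule> rules) {
        PlanNode current = plan;
        for (Rule rule : rules) {
            current = rule.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        PlanNode plan = new PlanNode(JoinType.RIGHT_OUTER, new PlanNode("A"), new PlanNode("B"));
        System.out.println(optimize(plan, Arrays.asList(RIGHT_TO_LEFT_OUTER))); // prints "(B LEFT_OUTER A)"
    }
}
```

Because every rule has the same shape (plan in, plan out), rules can be added, removed, or reordered without touching the rest of the engine, which is one plausible reading of "flexible" in the description above.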
The rest of this chapter describes how your applications can use queries to search your repositories, and outlines the specifics of each of the query languages available in ModeShape.
With ModeShape, all query operations can be performed using only the JCR API interfaces. The first step is to obtain the QueryManager from your Session instance. The QueryManager interface defines methods for creating Query objects, executing queries, storing queries (not results) as Nodes in the repository, and reconstituting queries that were stored on Nodes. Thus, querying a repository generally follows this pattern:
// Obtain the query manager for the session ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
// Create a query object ...
String language = ...
String expression = ...
javax.jcr.query.Query query = queryManager.createQuery(expression,language);
// Execute the query and get the results ...
javax.jcr.query.QueryResult result = query.execute();
// Iterate over the nodes in the results ...
javax.jcr.NodeIterator nodeIter = result.getNodes();
while ( nodeIter.hasNext() ) {
    javax.jcr.Node node = nodeIter.nextNode();
    ...
}
// Or iterate over the rows in the results ...
String[] columnNames = result.getColumnNames();
javax.jcr.query.RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    javax.jcr.query.Row row = rowIter.nextRow();
    // Iterate over the column values in each row ...
    javax.jcr.Value[] values = row.getValues();
    for ( javax.jcr.Value value : values ) {
        ...
    }
    // Or access the column values by name ...
    for ( String columnName : columnNames ) {
        javax.jcr.Value value = row.getValue(columnName);
        ...
    }
}
// When finished, close the session ...
session.logout();
For more detail about these methods or about how to use other facets of the JCR query API, please consult Section 6.7 of the JCR 1.0 specification.
The JCR 1.0 specification uses the XPath query language because node structures in JCR are very analogous to the structure of an XML document. Thus, XPath provides a useful language for selecting and searching workspace content. And since JCR 1.0 defines a mapping between XML and a workspace view called the "document view", adapting XPath to workspace content is quite natural.
A JCR XPath query specifies the subset of nodes in a workspace that satisfy the constraints defined in the query. Constraints can limit the nodes in the results to be those nodes with a specific (primary or mixin) node type, with properties having particular values, or to be within a specific subtree of the workspace. The query also defines how the nodes are to be returned in the result sets using column specifiers and ordering specifiers.
As an aside, ModeShape actually implements XPath queries by transforming them into the equivalent JCR-SQL2 representation. And the JCR-SQL2 language, although often more verbose, is much more capable of representing complex queries with multiple combinations of type, property, and path constraints.
JCR 1.0 specifies that support is required only for returning column values based upon single-valued, non-residual properties that are declared on or inherited by the node types specified in the type constraint. ModeShape follows this requirement, and does not support specifying residual properties. However, ModeShape does allow multi-valued properties to be specified as result columns. And as per the specification, ModeShape always returns the "jcr:path" and "jcr:score" pseudo-columns.
ModeShape uses the last location step with an attribute axis to specify the properties that are to be returned as result columns. Multiple properties are specified with a union. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.
Table 10.1. Specifying result set columns
XPath | JCR-SQL2 |
---|---|
//* | SELECT * FROM [nt:base] |
//element(*,my:type) | SELECT * FROM [my:type] |
//element(*,my:type)/@my:title | SELECT [my:title] FROM [my:type] |
//element(*,my:type)/(@my:title | @my:text) | SELECT [my:title], [my:text] FROM [my:type] |
//element(*,my:type)/(@my:title union @my:text) | SELECT [my:title], [my:text] FROM [my:type] |
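The mapping in the table can be illustrated with a small, self-contained sketch. The XPathColumnsSketch class below is hypothetical: it is not ModeShape's translator, and it handles only the //element(*,type) forms shown in the table (an optional final attribute step, with multiple attributes combined by '|' or 'union'). It also assumes property names consist only of letters, digits, underscores, and colons.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XPathColumnsSketch {

    // Matches only: //element(*,TYPE) optionally followed by a final step such as
    // /@prop or /(@a | @b) or /(@a union @b)
    private static final Pattern QUERY =
        Pattern.compile("//element\\(\\*,([^)]+)\\)(?:/(.+))?");

    // Matches each attribute reference in the final location step
    private static final Pattern ATTRIBUTE = Pattern.compile("@([\\w:]+)");

    static String toSql2(String xpath) {
        Matcher query = QUERY.matcher(xpath.trim());
        if (!query.matches()) {
            throw new IllegalArgumentException("Unsupported XPath in this sketch: " + xpath);
        }
        String type = query.group(1).trim();
        List<String> columns = new ArrayList<String>();
        if (query.group(2) != null) {
            // Each attribute on the last step becomes one result column
            Matcher attribute = ATTRIBUTE.matcher(query.group(2));
            while (attribute.find()) {
                columns.add("[" + attribute.group(1) + "]");
            }
        }
        String select = columns.isEmpty() ? "*" : String.join(", ", columns);
        return "SELECT " + select + " FROM [" + type + "]";
    }

    public static void main(String[] args) {
        System.out.println(toSql2("//element(*,my:type)"));
        System.out.println(toSql2("//element(*,my:type)/@my:title"));
        System.out.println(toSql2("//element(*,my:type)/(@my:title | @my:text)"));
    }
}
```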
JCR 1.0 specifies that support is required only for specifying constraints of one primary type, and it is optional to support specifying constraints on one (or more) mixin types. The specification also defines that the XPath element test be used to test against node types, and that it is optional to support element tests on location steps other than the last one. Type constraints are inherently inheritance-sensitive, in that a constraint against a particular node type 'X' will be satisfied by nodes explicitly declared to be of type 'X' or of subtypes of 'X'.
ModeShape supports using the element test to test against a primary or mixin type. ModeShape supports the element test only on the last location step.
For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.
Table 10.2. Specifying type constraints
XPath | JCR-SQL2 |
---|---|
//* | SELECT * FROM [nt:base] |
//element(*,my:type) | SELECT * FROM [my:type] |
/jcr:root/nodes/element(*,my:type) | SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%' AND DEPTH([my:type]) = CAST(2 AS LONG) |
/jcr:root/nodes//element(*,my:type) | SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%' |
/jcr:root/nodes//element(ex:nodeName,my:type) | SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/nodes/%' AND NAME([my:type]) = 'ex:nodeName' |
Note that the JCR-SQL2 language supported by ModeShape is far more capable of joining multiple sets of nodes with different type, property and path constraints.
JCR 1.0 specifies that attribute tests on the last location step are required, but that predicate tests on any other location steps are optional.
ModeShape does support using attribute tests on the last location step to specify property constraints, as well as supporting axis and filter predicates on other location steps. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.
Table 10.3. Specifying property constraints
XPath | JCR-SQL2 |
---|---|
//*[@prop1] | SELECT * FROM [nt:base] WHERE [nt:base].prop1 IS NOT NULL |
//element(*,my:type)[@prop1] | SELECT * FROM [my:type] WHERE [my:type].prop1 IS NOT NULL |
//element(*,my:type)[@prop1=xs:boolean('true')] | SELECT * FROM [my:type] WHERE [my:type].prop1 = CAST('true' AS BOOLEAN) |
//element(*,my:type)[@id<1 and @name='john'] | SELECT * FROM [my:type] WHERE id < 1 AND name = 'john' |
//element(*,my:type)[a/b/@id] | SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISCHILDNODE(nodeSet1,[my:type]) JOIN [nt:base] as nodeSet2 ON ISCHILDNODE(nodeSet2,nodeSet1) WHERE (NAME(nodeSet1) = 'a' AND NAME(nodeSet2) = 'b') AND nodeSet2.id IS NOT NULL |
//element(*,my:type)[./*/*/@id] | SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISCHILDNODE(nodeSet1,[my:type]) JOIN [nt:base] as nodeSet2 ON ISCHILDNODE(nodeSet2,nodeSet1) WHERE nodeSet2.id IS NOT NULL |
//element(*,my:type)[.//@id] | SELECT * FROM [my:type] JOIN [nt:base] as nodeSet1 ON ISDESCENDANTNODE(nodeSet1,[my:type]) WHERE nodeSet1.id IS NOT NULL |
Section 6.6.3.3 of the JCR 1.0 specification contains an in-depth description of property value constraints using various comparison operators.
JCR 1.0 specifies that exact, child node, and descendants-or-self path constraints be supported on the location steps in an XPath query.
ModeShape supports these kinds of path constraints. For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.
Table 10.4. Specifying path constraints
XPath | JCR-SQL2 |
---|---|
/jcr:root/a/b[*] | SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b' |
/jcr:root/a[1]/b[*] | SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b' |
/jcr:root/a[2]/b | SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a[2]/b' |
/jcr:root/a/b[2]//c[4] | SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b[2]/c[4]' OR PATH([nt:base]) LIKE '/a/b[2]/%/c[4]' |
/jcr:root/a/b//c//d | SELECT * FROM [nt:base] WHERE PATH([nt:base]) = '/a/b/c/d' OR PATH([nt:base]) LIKE '/a/b/%/c/d' OR PATH([nt:base]) LIKE '/a/b/c/%/d' OR PATH([nt:base]) LIKE '/a/b/%/c/%/d' |
//element(*,my:type)[@id<1 and @name='john'] | SELECT * FROM [my:type] WHERE id < 1 AND name = 'john' |
/jcr:root/a/b//element(*,my:type) | SELECT * FROM [my:type] WHERE PATH([my:type]) LIKE '/a/b/%' |
Note that the JCR-SQL2 language supported by ModeShape is capable of representing a wider combination of path constraints.
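The way a descendant-or-self step ('//') expands into several PATH criteria can be sketched in a few lines of Java. The PathPatternSketch class below is a hypothetical illustration, not ModeShape code: each '//' in the middle of a path is treated as either a direct child step ('/') or one or more intermediate levels ('/%/'), and every combination is emitted. It assumes a path that begins with /jcr:root and contains no predicates other than positional indexes.

```java
import java.util.ArrayList;
import java.util.List;

final class PathPatternSketch {

    /** Expands every '//' into both the direct form ('/') and the wildcard form ('/%/'). */
    static List<String> patterns(String xpath) {
        String prefix = "/jcr:root";
        String path = xpath.startsWith(prefix) ? xpath.substring(prefix.length()) : xpath;
        String[] parts = path.split("//");
        int gaps = parts.length - 1;
        List<String> result = new ArrayList<String>();
        // Each bit of the mask decides whether the corresponding '//' becomes a wildcard
        for (int mask = 0; mask < (1 << gaps); mask++) {
            StringBuilder pattern = new StringBuilder(parts[0]);
            for (int gap = 0; gap < gaps; gap++) {
                boolean wildcard = (mask & (1 << gap)) != 0;
                pattern.append(wildcard ? "/%/" : "/").append(parts[gap + 1]);
            }
            result.add(pattern.toString());
        }
        return result;
    }

    /** Builds the WHERE criteria: '=' for the exact path, LIKE for the wildcard patterns. */
    static String criteria(String selector, String xpath) {
        List<String> clauses = new ArrayList<String>();
        for (String pattern : patterns(xpath)) {
            String operator = pattern.contains("%") ? "LIKE" : "=";
            clauses.add("PATH([" + selector + "]) " + operator + " '" + pattern + "'");
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(criteria("nt:base", "/jcr:root/a/b//c//d"));
    }
}
```

For /jcr:root/a/b//c//d this yields the four patterns '/a/b/c/d', '/a/b/%/c/d', '/a/b/c/%/d', and '/a/b/%/c/%/d', so the number of criteria doubles with each additional '//' step.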
JCR 1.0 extends the XPath grammar to add support for ordering the results according to the natural ordering of the values of one or more properties on the nodes.
ModeShape does support zero or more ordering specifiers, including whether each specifier is ascending or descending. If no ordering specifiers are defined, the ordering of the results is not predefined and may vary (though ordering by score is often the approach). For example, the following table shows several XPath queries and how they map to JCR-SQL2 queries.
Table 10.5. Specifying result ordering
XPath | JCR-SQL2 |
---|---|
//element(*,*) order by @title | SELECT nodeSet1.title FROM [nt:base] AS nodeSet1 ORDER BY nodeSet1.title |
//element(*,*) order by @title, @jcr:score | SELECT nodeSet1.title FROM [nt:base] AS nodeSet1 ORDER BY nodeSet1.title, SCORE(nodeSet1) |
Note that the JCR-SQL2 language supported by ModeShape has a far richer ORDER BY clause, allowing the use of any kind of dynamic operand, including ordering upon arithmetic operations of multiple dynamic operands.
JCR 1.0 defines a number of other optional and required features, and these are summarized in this section.
Only abbreviated XPath syntax is supported.
Only the child axis (the default axis, represented by '/' in abbreviated syntax), descendant-or-self axis (represented by '//' in abbreviated syntax), self axis (represented by '.' in abbreviated syntax), and attribute axis (represented by '@' in abbreviated syntax) are supported.
The text() node test is not supported.
The element() node test is supported.
The jcr:like() function is supported.
The jcr:contains() function is supported.
The jcr:score() function is supported.
The jcr:deref() function is not supported.
The JCR-SQL query language is defined by the JCR 1.0 specification as a way to express queries using strings that are similar to SQL. Support for the language is optional, and in fact this language was deprecated in the JCR 2.0 specification in favor of the improved and more powerful (and more SQL-like) JCR-SQL2 language, which is covered in the next section.
ModeShape includes support for the JCR-SQL language, and adds several extensions to make it even more powerful and useful:
Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a single result set. As with standard SQL, the result sets being combined must have the same columns. The UNION operator combines the rows from two result sets, the INTERSECT operator returns the rows that are common to two result sets, and the EXCEPT operator returns the difference between two result sets (the rows of the first that do not appear in the second). Duplicate rows are removed unless the operator is followed by the ALL keyword. For detail, see the grammar for set queries.
Removal of duplicate rows in the results, using "SELECT DISTINCT ...".
Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the maximum number of rows that should be returned. This clause may optionally be followed by the "OFFSET number" clause to specify the number of initial rows that should be skipped.
Support for the IN and NOT IN clauses to more easily and concisely supply multiple discrete static operands. For example, "WHERE ... [my:type].[prop1] IN (3,5,7,10,11,50) ...".
Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands. For example, "WHERE ... [my:type].[prop1] BETWEEN 3 EXCLUSIVE AND 10 ...".
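The semantics of the set operations can be illustrated without a repository. The following sketch models each result row as a plain string (for example, a node path) and uses standard Java collections; the SetOperationSketch class is hypothetical and says nothing about how ModeShape implements these operators internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SetOperationSketch {

    /** UNION: all rows from both result sets, with duplicates removed. */
    static List<String> union(List<String> first, List<String> second) {
        Set<String> rows = new LinkedHashSet<String>(first);
        rows.addAll(second);
        return new ArrayList<String>(rows);
    }

    /** UNION ALL: all rows from both result sets, duplicates kept. */
    static List<String> unionAll(List<String> first, List<String> second) {
        List<String> rows = new ArrayList<String>(first);
        rows.addAll(second);
        return rows;
    }

    /** INTERSECT: only the rows that appear in both result sets. */
    static List<String> intersect(List<String> first, List<String> second) {
        Set<String> rows = new LinkedHashSet<String>(first);
        rows.retainAll(second);
        return new ArrayList<String>(rows);
    }

    /** EXCEPT: the rows of the first result set that do not appear in the second. */
    static List<String> except(List<String> first, List<String> second) {
        Set<String> rows = new LinkedHashSet<String>(first);
        rows.removeAll(second);
        return new ArrayList<String>(rows);
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("/a", "/b", "/b", "/c");
        List<String> second = Arrays.asList("/b", "/c", "/d");
        System.out.println(union(first, second));     // prints "[/a, /b, /c, /d]"
        System.out.println(intersect(first, second)); // prints "[/b, /c]"
        System.out.println(except(first, second));    // prints "[/a]"
    }
}
```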
The grammar for the JCR-SQL query language is actually a superset of that defined by the JCR 1.0 specification, and as such the complete grammar is included here.
The grammar is presented using the same EBNF nomenclature as used in the JCR 1.0 specification. Terms surrounded by '[' and ']' denote optional terms that appear zero or one times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses are used to identify groups, and are often used to surround possible values. Literals (or keywords) are denoted by single-quotes.
QueryCommand ::= Query | SetQuery
SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
             { ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }
Query ::= Select From [Where] [OrderBy] [Limit]
Select ::= 'SELECT' ('*' | Proplist )
From ::= 'FROM' NtList
Where ::= 'WHERE' WhereExp
OrderBy ::= 'ORDER BY' propname [Order] {',' propname [Order]}
Order ::= 'DESC' | 'ASC'
Proplist ::= propname {',' propname}
NtList ::= ntname {',' ntname}
WhereExp ::= propname Op value |
             propname 'IS' ['NOT'] 'NULL' |
             like |
             contains |
             whereexp ('AND'|'OR') whereexp |
             'NOT' whereexp |
             '(' whereexp ')' |
             joinpropname '=' joinpropname |
             between |
             propname ['NOT'] 'IN' '(' value {',' value } ')'
Op ::= '='|'>'|'<'|'>='|'<='|'<>'
joinpropname ::= quotedjoinpropname | unquotedjoinpropname
quotedjoinpropname ::= ''' unquotedjoinpropname '''
unquotedjoinpropname ::= ntname '.jcr:path'
propname ::= quotedpropname | unquotedpropname
quotedpropname ::= ''' unquotedpropname '''
unquotedpropname ::= /* A property name, possibly a pseudo-property: jcr:score or jcr:path */
ntname ::= quotedntname | unquotedntname
quotedntname ::= ''' unquotedntname '''
unquotedntname ::= /* A node type name */
value ::= ''' literalvalue ''' | literalvalue
literalvalue ::= /* A property value (in standard string form) */
like ::= propname 'LIKE' likepattern [ escape ]
likepattern ::= ''' likechar { likepattern } '''
likechar ::= char | '%' | '_'
escape ::= 'ESCAPE' ''' likechar '''
char ::= /* Any character valid within the string representation of a value
            except for the characters % and _ themselves. These must be escaped */
contains ::= 'CONTAINS(' scope ',' searchexp ')'
scope ::= unquotedpropname | '.'
searchexp ::= ''' exp '''
exp ::= ['-']term {whitespace ['OR'] whitespace ['-']term}
term ::= word | '"' word {whitespace word} '"'
word ::= /* A string containing no whitespace */
whitespace ::= /* A string of only whitespace */
between ::= propname ['NOT'] 'BETWEEN' lowerBound ['EXCLUSIVE'] 'AND' upperBound ['EXCLUSIVE']
lowerBound ::= value
upperBound ::= value
Limit ::= 'LIMIT' count [ 'OFFSET' offset ]
count ::= /* Positive integer value */
offset ::= /* Non-negative integer value */
The JCR-SQL2 query language is defined by the JCR 2.0 specification as a way to express queries using strings that are similar to SQL. This query language is an improvement over the JCR-SQL language, providing among other things far richer specifications of joins and criteria.
ModeShape includes full support for the complete JCR-SQL2 query language. However, ModeShape adds several extensions to make it even more powerful:
Support for the "FULL OUTER JOIN" and "CROSS JOIN" join types, in addition to the "LEFT OUTER JOIN", "RIGHT OUTER JOIN" and "INNER JOIN" types defined by JCR-SQL2. Note that "JOIN" is a shorthand for "INNER JOIN". For detail, see the grammar for joins.
Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a single result set. As with standard SQL, the result sets being combined must have the same columns. The UNION operator combines the rows from two result sets, the INTERSECT operator returns the rows that are common to two result sets, and the EXCEPT operator returns the difference between two result sets (the rows of the first that do not appear in the second). Duplicate rows are removed unless the operator is followed by the ALL keyword. For detail, see the grammar for set queries.
Removal of duplicate rows in the results, using "SELECT DISTINCT ...". For detail, see the grammar for queries.
Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the maximum number of rows that should be returned. This clause may optionally be followed by the "OFFSET number" clause to specify the number of initial rows that should be skipped. For detail, see the grammar for limits and offsets.
Additional dynamic operands "DEPTH([<selectorName>])" and "PATH([<selectorName>])" that enable placing constraints on the node depth and path, respectively. These dynamic operands can be used in a manner similar to "NAME([<selectorName>])" and "LOCALNAME([<selectorName>])" that are defined by JCR-SQL2. Note that in each of these cases, the selector name is optional if there is only one selector in the query. For detail, see the grammar for dynamic operands.
Additional dynamic operands "REFERENCE([<selectorName>.]<propertyName>)" and "REFERENCE([<selectorName>])" that enable placing constraints on one or any of the reference properties, respectively, and which can be used in a manner similar to "PropertyValue([<selectorName>.]<propertyName>)". Note that in each of these cases, the selector name is optional if there is only one selector in the query, and that the property name can be excluded if the constraint should apply to all reference properties. For detail, see the grammar for dynamic operands.
Support for the IN and NOT IN clauses to more easily and concisely supply multiple discrete static operands. For example, "WHERE ... [my:type].[prop1] IN (3,5,7,10,11,50) ...". For detail, see the grammar for set constraints.
Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands. For example, "WHERE ... [my:type].[prop1] BETWEEN 3 EXCLUSIVE AND 10 ...". For detail, see the grammar for between constraints.
Support for simple arithmetic in numeric-based criteria and order-by clauses. For example, "... WHERE SCORE(type1) + SCORE(type2) > 1.0" or "... ORDER BY (SCORE(type1) * SCORE(type2)) ASC, LENGTH(type2.property1) DESC". For detail, see the grammar for order-by clauses.
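Two of the extensions above have semantics that are easy to misread: which bound the EXCLUSIVE keyword applies to in a BETWEEN clause, and how LIMIT and OFFSET interact. The sketch below models both over plain Java values. The RangeAndLimitSketch class is hypothetical, and it follows the grammar in this chapter, where EXCLUSIVE qualifies the bound it immediately follows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RangeAndLimitSketch {

    /**
     * Evaluates "value BETWEEN lower [EXCLUSIVE] AND upper [EXCLUSIVE]".
     * Each bound is inclusive unless its EXCLUSIVE flag is set.
     */
    static boolean between(long value,
                           long lower, boolean lowerExclusive,
                           long upper, boolean upperExclusive) {
        boolean aboveLower = lowerExclusive ? value > lower : value >= lower;
        boolean belowUpper = upperExclusive ? value < upper : value <= upper;
        return aboveLower && belowUpper;
    }

    /** Applies "LIMIT count OFFSET offset": skip 'offset' rows, then keep at most 'count' rows. */
    static <T> List<T> limit(List<T> rows, int count, int offset) {
        int from = Math.min(offset, rows.size());
        int to = (int) Math.min((long) from + count, rows.size());
        return new ArrayList<T>(rows.subList(from, to));
    }

    public static void main(String[] args) {
        // "prop1 BETWEEN 3 EXCLUSIVE AND 10" excludes 3 but includes 10
        System.out.println(between(3, 3, true, 10, false));  // prints "false"
        System.out.println(between(10, 3, true, 10, false)); // prints "true"

        // "LIMIT 2 OFFSET 3" over six rows returns the fourth and fifth rows
        System.out.println(limit(Arrays.asList(1, 2, 3, 4, 5, 6), 2, 3)); // prints "[4, 5]"
    }
}
```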
The grammar for the JCR-SQL2 query language is actually a superset of that defined by the JCR 2.0 specification, and as such the complete grammar is included here.
The grammar is presented using the same EBNF nomenclature as used in the JCR 2.0 specification. Terms surrounded by '[' and ']' denote optional terms that appear zero or one times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses are used to identify groups, and are often used to surround possible values. Literals (or keywords) are denoted by single-quotes.
QueryCommand ::= Query | SetQuery
SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
             { ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }
Query ::= 'SELECT' ['DISTINCT'] columns
          'FROM' Source
          ['WHERE' Constraint]
          ['ORDER BY' orderings]
          [Limit]

Source ::= Selector | Join
Selector ::= nodeTypeName ['AS' selectorName]
nodeTypeName ::= Name

Join ::= left [JoinType] 'JOIN' right 'ON' JoinCondition
         /* If JoinType is omitted INNER is assumed. */
left ::= Source
right ::= Source
JoinType ::= Inner | LeftOuter | RightOuter | FullOuter | Cross
Inner ::= 'INNER' ['JOIN']
LeftOuter ::= 'LEFT JOIN' | 'OUTER JOIN' | 'LEFT OUTER JOIN'
RightOuter ::= 'RIGHT OUTER' ['JOIN']
FullOuter ::= 'FULL OUTER' ['JOIN']
Cross ::= 'CROSS' ['JOIN']
JoinCondition ::= EquiJoinCondition | SameNodeJoinCondition | ChildNodeJoinCondition | DescendantNodeJoinCondition

EquiJoinCondition ::= selector1Name'.'property1Name '=' selector2Name'.'property2Name
selector1Name ::= selectorName
selector2Name ::= selectorName
property1Name ::= propertyName
property2Name ::= propertyName

SameNodeJoinCondition ::= 'ISSAMENODE(' selector1Name ',' selector2Name [',' selector2Path] ')'
selector2Path ::= Path

ChildNodeJoinCondition ::= 'ISCHILDNODE(' childSelectorName ',' parentSelectorName ')'
childSelectorName ::= selectorName
parentSelectorName ::= selectorName

DescendantNodeJoinCondition ::= 'ISDESCENDANTNODE(' descendantSelectorName ',' ancestorSelectorName ')'
descendantSelectorName ::= selectorName
ancestorSelectorName ::= selectorName

Constraint ::= ConstraintItem | '(' ConstraintItem ')'
ConstraintItem ::= And | Or | Not | Comparison | Between | PropertyExistence | SetConstraint | FullTextSearch | SameNode | ChildNode | DescendantNode

And ::= constraint1 'AND' constraint2
constraint1 ::= Constraint
constraint2 ::= Constraint

Comparison ::= DynamicOperand Operator StaticOperand
Operator ::= '=' | '!=' | '<' | '<=' | '>' | '>=' | 'LIKE'

Between ::= DynamicOperand ['NOT'] 'BETWEEN' lowerBound ['EXCLUSIVE'] 'AND' upperBound ['EXCLUSIVE']
lowerBound ::= StaticOperand
upperBound ::= StaticOperand

PropertyExistence ::= selectorName'.'propertyName 'IS' ['NOT'] 'NULL' |
                      propertyName 'IS' ['NOT'] 'NULL'
                      /* If only one selector exists in this query */

SetConstraint ::= selectorName'.'propertyName ['NOT'] 'IN' |
                  propertyName ['NOT'] 'IN'
                  /* If only one selector exists in this query */
                  '(' firstStaticOperand {',' additionalStaticOperand } ')'
firstStaticOperand ::= StaticOperand
additionalStaticOperand ::= StaticOperand

FullTextSearch ::= 'CONTAINS(' ([selectorName'.']propertyName | selectorName'.*') ',' ''' fullTextSearchExpression''' ')'
                   /* If only one selector exists in this query, explicit specification of the
                      selectorName preceding the propertyName is optional */
fullTextSearchExpression ::= FulltextSearch
where FulltextSearch is defined by the following, and is the same as the full-text search language supported by ModeShape:
FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}
Disjunct ::= Term {Space Term}
Term ::= ['-'] SimpleTerm
SimpleTerm ::= Word | '"' Word {Space Word} '"'
Word ::= NonSpaceChar {NonSpaceChar}
Space ::= SpaceChar {SpaceChar}
NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */
SpaceChar ::= ' '
Char ::= /* Any character */
SameNode ::= 'ISSAMENODE(' [selectorName ','] Path ')'
             /* If only one selector exists in this query, explicit specification of the
                selectorName preceding the path is optional */

ChildNode ::= 'ISCHILDNODE(' [selectorName ','] Path ')'
              /* If only one selector exists in this query, explicit specification of the
                 selectorName preceding the path is optional */

DescendantNode ::= 'ISDESCENDANTNODE(' [selectorName ','] Path ')'
                   /* If only one selector exists in this query, explicit specification of the
                      selectorName preceding the path is optional */

Name ::= '[' quotedName ']' | '[' simpleName ']' | simpleName
quotedName ::= /* A JCR Name (see the JCR specification) */
simpleName ::= /* A JCR Name that contains only SQL-legal characters (namely letters, digits, and underscore) */
Path ::= '[' quotedPath ']' | '[' simplePath ']' | simplePath
quotedPath ::= /* A JCR Path that contains non-SQL-legal characters */
simplePath ::= /* A JCR Path (rather Name) that contains only SQL-legal characters (namely letters, digits, and underscore) */

StaticOperand ::= Literal | BindVariableValue
Literal ::= CastLiteral | UncastLiteral
CastLiteral ::= 'CAST(' UncastLiteral ' AS ' PropertyType ')'
PropertyType ::= 'STRING' | 'BINARY' | 'DATE' | 'LONG' | 'DOUBLE' | 'DECIMAL' | 'BOOLEAN' | 'NAME' | 'PATH' | 'REFERENCE' | 'WEAKREFERENCE' | 'URI'
                 /* 'WEAKREFERENCE' is not currently supported in JCR 1.0 */
UncastLiteral ::= UnquotedLiteral | ''' UnquotedLiteral ''' | '"' UnquotedLiteral '"'
UnquotedLiteral ::= /* String form of a JCR Value, as defined in the JCR specification */

BindVariableValue ::= '$'bindVariableName
bindVariableName ::= /* A string that conforms to the JCR Name syntax, though the prefix does not need to be a registered namespace prefix. */

DynamicOperand ::= PropertyValue | ReferenceValue | Length | NodeName | NodeLocalName | NodePath | NodeDepth | FullTextSearchScore | LowerCase | UpperCase | Arithmetic | '(' DynamicOperand ')'
PropertyValue ::= [selectorName'.'] propertyName
                  /* If only one selector exists in this query, explicit specification of the
                     selectorName preceding the propertyName is optional */
ReferenceValue ::= 'REFERENCE(' selectorName '.' propertyName ')' |
                   'REFERENCE(' selectorName ')' |
                   'REFERENCE()'
                   /* If only one selector exists in this query, explicit specification of the
                      selectorName preceding the propertyName is optional. Also, the property name
                      may be excluded if the constraint should apply to any reference property. */
Length ::= 'LENGTH(' PropertyValue ')'
NodeName ::= 'NAME(' [selectorName] ')'
             /* If only one selector exists in this query, explicit specification of the selectorName is optional */
NodeLocalName ::= 'LOCALNAME(' [selectorName] ')'
                  /* If only one selector exists in this query, explicit specification of the selectorName is optional */
NodePath ::= 'PATH(' [selectorName] ')'
             /* If only one selector exists in this query, explicit specification of the selectorName is optional */
NodeDepth ::= 'DEPTH(' [selectorName] ')'
              /* If only one selector exists in this query, explicit specification of the selectorName is optional */
FullTextSearchScore ::= 'SCORE(' [selectorName] ')'
                        /* If only one selector exists in this query, explicit specification of the selectorName is optional */
LowerCase ::= 'LOWER(' DynamicOperand ')'
UpperCase ::= 'UPPER(' DynamicOperand ')'
Arithmetic ::= DynamicOperand ('+'|'-'|'*'|'/') DynamicOperand

orderings ::= Ordering {',' Ordering}
Ordering ::= DynamicOperand [Order]
Order ::= 'ASC' | 'DESC'

columns ::= (Column {',' Column}) | '*'
Column ::= ([selectorName'.']propertyName ['AS' columnName]) | (selectorName'.*')
           /* If only one selector exists in this query, explicit specification of the
              selectorName preceding the propertyName is optional */
selectorName ::= Name
propertyName ::= Name
columnName ::= Name
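The Name production above means that a name may appear without brackets only when it consists of SQL-legal characters; anything else (such as a name with a namespace prefix) must be wrapped in square brackets. The following sketch shows one way an application that assembles JCR-SQL2 strings might apply that rule. The Sql2NameSketch class and its methods are hypothetical helpers, and the sketch does not check for names that collide with reserved words.

```java
import java.util.regex.Pattern;

final class Sql2NameSketch {

    // A "simpleName" per the grammar: only letters, digits, and underscore
    private static final Pattern SIMPLE_NAME = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns the name as-is when it is a simpleName, otherwise wraps it in brackets. */
    static String name(String jcrName) {
        return SIMPLE_NAME.matcher(jcrName).matches() ? jcrName : "[" + jcrName + "]";
    }

    /** Builds "SELECT * FROM <nodeTypeName> AS <selectorName>" with names quoted as needed. */
    static String selectAll(String nodeTypeName, String selectorName) {
        return "SELECT * FROM " + name(nodeTypeName) + " AS " + name(selectorName);
    }

    public static void main(String[] args) {
        System.out.println(name("nt:unstructured"));            // prints "[nt:unstructured]"
        System.out.println(name("nodeSet1"));                   // prints "nodeSet1"
        System.out.println(selectAll("nt:base", "nodes"));      // prints "SELECT * FROM [nt:base] AS nodes"
    }
}
```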
There are times when a formal structured query language is overkill, and the easiest way to find the right content is to perform a search, like you would with a search engine such as Google or Yahoo! This is where ModeShape's full-text search language comes in, because it allows you to use the JCR query API but with a far simpler, Google-style search grammar.
This query language is actually defined by the JCR 2.0 specification as the full-text search expression grammar used in the second parameter of the CONTAINS(...) function of the JCR-SQL2 language. We just pulled it out and made it available as a first-class query language, such that a full-text search query supplied by the user, full-text-query, is equivalent to executing this JCR-SQL2:
SELECT * FROM [nt:base] WHERE CONTAINS([nt:base],'full-text-query')
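The equivalence above can be expressed as a small string transformation. The sketch below is a hypothetical helper, not part of ModeShape. It assumes that a single quote inside a JCR-SQL2 string literal is escaped by doubling it; that escaping rule is an assumption of this example and should be checked against the JCR 2.0 specification before relying on it.

```java
final class SearchToSql2Sketch {

    /** Wraps a full-text search expression in the equivalent JCR-SQL2 query. */
    static String toSql2(String fullTextQuery) {
        // Assumption: single quotes inside the literal are escaped by doubling them
        String escaped = fullTextQuery.replace("'", "''");
        return "SELECT * FROM [nt:base] WHERE CONTAINS([nt:base],'" + escaped + "')";
    }

    public static void main(String[] args) {
        System.out.println(toSql2("table \"customer invoice\""));
    }
}
```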
This language allows a JCR client to construct a query to find nodes with property values that match the supplied terms. Nodes that "best" match the terms are returned before nodes that have a lesser match. Of course, ModeShape uses a complex system to analyze the node content and the query terms, and may perform a number of optimizations, such as (but not limited to) eliminating stop words (e.g., "the", "a", "and", etc.), treating terms independent of case, and converting words to base forms using a process called stemming (e.g., "running" into "run", "customers" into "customer").
Search terms can also include phrases by simply wrapping the phrase with double-quotes. For example, the search term 'table "customer invoice"' would rank higher those nodes with properties containing the phrase "customer invoice" than nodes with properties containing just "customer" or "invoice".
Terms in the query are implicitly AND-ed together, meaning that a match occurs when a node has property values that match all of the terms. However, it is also possible to put an "OR" between two terms where either of those terms may occur.
It is also possible to specify that terms should not appear in the results. This is called a negative term, and it reduces the rank of any node whose property values contain the value. To specify a negative term, simply prefix the term with a hyphen ('-').
The grammar for this full-text search language is specified in Section 6.7.19 of the JCR 2.0 specification, but it is also included here as a convenience.
The grammar is presented using the same EBNF nomenclature as used in the JCR 2.0 specification. Terms surrounded by '[' and ']' denote optional terms that appear zero or one times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses are used to identify groups, and are often used to surround possible values.
FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}
Disjunct ::= Term {Space Term}
Term ::= ['-'] SimpleTerm
SimpleTerm ::= Word | '"' Word {Space Word} '"'
Word ::= NonSpaceChar {NonSpaceChar}
Space ::= SpaceChar {SpaceChar}
NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */
SpaceChar ::= ' '
Char ::= /* Any character */
As you can see, this is a pretty simple and straightforward query language. But this language makes it extremely easy to find all the nodes in the repository that match a set of terms.
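To show how little machinery the grammar requires, here is a minimal hand-written parser for it. The FullTextSearchSketch class is a hypothetical illustration and is not ModeShape's parser: it splits an expression into disjuncts (separated by OR), where each disjunct is a list of terms that are AND-ed together, and each term is a word or a double-quoted phrase that may be negated with a leading hyphen.

```java
import java.util.ArrayList;
import java.util.List;

final class FullTextSearchSketch {

    /** A single term: a word or quoted phrase, optionally negated with a leading '-'. */
    static final class Term {
        final boolean negated;
        final String text;

        Term(boolean negated, String text) {
            this.negated = negated;
            this.text = text;
        }

        @Override
        public String toString() {
            return (negated ? "-" : "") + "\"" + text + "\"";
        }
    }

    /** Returns the disjuncts; the terms inside each disjunct are implicitly AND-ed together. */
    static List<List<Term>> parse(String expression) {
        List<List<Term>> disjuncts = new ArrayList<List<Term>>();
        List<Term> current = new ArrayList<Term>();
        disjuncts.add(current);
        int i = 0;
        int n = expression.length();
        while (i < n) {
            char c = expression.charAt(i);
            if (c == ' ') {          // skip the Space between terms
                i++;
                continue;
            }
            boolean negated = false;
            if (c == '-') {          // optional negation prefix
                negated = true;
                i++;
                if (i >= n) {
                    throw new IllegalArgumentException("Missing term after '-'");
                }
                c = expression.charAt(i);
            }
            boolean quoted = (c == '"');
            String text;
            if (quoted) {            // quoted phrase: read up to the closing quote
                int end = expression.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated phrase");
                }
                text = expression.substring(i + 1, end);
                i = end + 1;
            } else {                 // plain word: read up to the next space
                int end = expression.indexOf(' ', i);
                if (end < 0) {
                    end = n;
                }
                text = expression.substring(i, end);
                i = end;
            }
            if (text.isEmpty()) {
                throw new IllegalArgumentException("Empty term");
            }
            if (!quoted && !negated && text.equals("OR")) {
                current = new ArrayList<Term>();   // OR starts a new disjunct
                disjuncts.add(current);
            } else {
                current.add(new Term(negated, text));
            }
        }
        return disjuncts;
    }

    public static void main(String[] args) {
        System.out.println(parse("table \"customer invoice\" OR -draft"));
    }
}
```

The sketch only produces the parsed structure; ranking, stemming, and stop-word handling are separate concerns that it does not attempt.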
When using this query language, the QueryResult always contains the "jcr:path" and "jcr:score" columns.
JCR 2.0 introduces a new API for programmatically constructing a query. This API allows the client to construct the lower-level objects for each part of the query, and is a great fit for applications that would otherwise generate fairly complicated query expressions. Using this API is a matter of getting the QueryObjectModelFactory from the session's QueryManager, and using the factory to create the various components, starting with the lowest-level components. Then, these lower-level components can be passed to other factory methods to create the higher-level components, and so on, until finally the createQuery(...) method is called to return the QueryObjectModel.
Here is a simple example that shows how this is done for the simple query "SELECT * FROM [nt:unstructured] AS unstructNodes":
// Obtain the query manager for the session ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
// Create a query object model factory ...
// (The QOM types used below are in the javax.jcr.query.qom package.)
QueryObjectModelFactory factory = queryManager.getQOMFactory();
// Create the FROM clause: a selector for the [nt:unstructured] nodes ...
Selector source = factory.selector("nt:unstructured","unstructNodes");
// Create the SELECT clause (we want all columns defined on the node type) ...
Column[] columns = null;
// Create the WHERE clause (we have none for this query) ...
Constraint constraint = null;
// Define the orderings (we have none for this query)...
Ordering[] orderings = null;
// Create the query ...
QueryObjectModel query = factory.createQuery(source,constraint,orderings,columns);
// Execute the query and get the results ...
// (This is the same as before.)
javax.jcr.query.QueryResult result = query.execute();
From this point on, processing the results is the same as when using the JCR Query API:
// Iterate over the nodes in the results ...
javax.jcr.NodeIterator nodeIter = result.getNodes();
while ( nodeIter.hasNext() ) {
    javax.jcr.Node node = nodeIter.nextNode();
    ...
}
// Or iterate over the rows in the results ...
String[] columnNames = result.getColumnNames();
javax.jcr.query.RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    javax.jcr.query.Row row = rowIter.nextRow();
    // Iterate over the column values in each row ...
    javax.jcr.Value[] values = row.getValues();
    for ( javax.jcr.Value value : values ) {
        ...
    }
    // Or access the column values by name ...
    for ( String columnName : columnNames ) {
        javax.jcr.Value value = row.getValue(columnName);
        ...
    }
}
// When finished, close the session ...
session.logout();
Of course, most queries will create the columns, orderings, and constraints using the QueryObjectModelFactory, whereas the example above just assumes all of the columns, no orderings, and no constraints.