Skip to end of metadata
Go to start of metadata

The JCR-SQL query language is defined by the JCR 1.0 specification as a way to express queries using strings that are similar to SQL. Support for the language is optional, and in fact this language was deprecated in the JCR 2.0 specification in favor of the improved and more powerful (and more SQL-like) JCR-SQL2 language.

As an aside, ModeShape's parser for JCR-SQL queries is actually just a simplified and more limited version of the parser for JCR-SQL2 queries. All other processing, however, is done in exactly the same way.

The JCR 2.0 specification defines how nodes in a repository are mapped onto relational tables queryable through a SQL-like language, including JCR-SQL and JCR-SQL2. Basically, each node type is mapped as a relational view with a single column for each of the node type's (residual and non-residual) property definitions. Conceptually, each node in the repository then appears as a record inside the view corresponding to the node type for which "Node.isNodeType(nodeTypeName)" would return true.

Since each node likely returns true from this method for multiple node type (e.g., the primary node type, the mixin types, and all supertypes of the primary and mixin node types), all nodes will likely appear as records in multiple views. And since each view only exposes those properties defined by (or inherited by) the corresponding node type, a full picture of a node will likely require joining the views for multiple node types. This special kind of join, where the nodes have the same identity on each side of the join, is referred to as an identity join, and is handled very efficiently by ModeShape.

ModeShape includes support for the JCR-SQL language, and adds several extensions to make it even more powerful and useful:

  • Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a single result set. As with standard SQL, the result sets being combined must have the same columns. The UNION operator combines the rows from two result sets, the INTERSECT operator returns the difference between two result sets, and the EXCEPT operator returns the rows that are common to two result sets. Duplicate rows are removed unless the operator is followed by the ALL keyword. For detail, see the grammar for set queries.
  • Removal of duplicate rows in the results, using "SELECT DISTINCT ...".
  • Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the maximum number of rows that should be returned. This clause may optionally be followed by the "OFFSET number" clause to specify the number of initial rows that should be skipped.
  • Support for the IN and NOT IN clauses to more easily and concisely supply multiple of discrete static operands. For example, "WHERE ... prop1 IN (3,5,7,10,11,50) ...".
  • Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands. For example, "WHERE ... prop1 BETWEEN 3 EXCLUSIVE AND 10 ...".
  • Support for (non-correlated) subqueries in the WHERE clause, wherever a static operand can be used. Subqueries can even be used within another subquery. All subqueries must return a single column, and each row's single value will be treated as a literal value. If the subquery is used in a clause that expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the subquery is used in a clause that allows multiple values (e.g., IN (...)), then all of the subquery's rows will be used. For example, this query "WHERE ... prop1 IN ( SELECT my:prop2 FROM my:type2 WHERE my:prop3 < '1000' ) AND ..." will use the results of the subquery as the literal values in the IN clause.

The grammar for the JCR-SQL query language is actually a superset of that defined by the JCR 1.0 specification, and as such the complete grammar is included here.

The grammar is presented using the same EBNF nomenclature as used in the JCR 1.0 specification. Terms are surrounded by '[' and ']' denote optional terms that appear zero or one times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses are used to identify groups, and are often used to surround possible values. Literals (or keywords) are denoted by single-quotes.


Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.