ModeShape Distribution 5.0.0.Final

Package org.modeshape.jcr.query

Package org.modeshape.jcr.query Description

The Query API provides a mechanism for building and executing queries. The framework provides a reusable and extensible query engine that is capable of planning, validating, optimizing, and executing queries against a generic back-end system.

Abstract query model

At the heart of the entire query system is a single representation of what constitutes a query. The abstract query model defines a language-independent vocabulary for queries, and consists of a family of Java classes that each represent one of the important semantic elements needed to fully define a query.

There are two ways to construct abstract query models. The first is to programmatically construct a query model using the QueryBuilder, which provides a fluent API that makes it easy to create a query with Java code. The second (and more common) approach is to use a QueryParser that parses a query represented in a specific language (like SQL or XPath) and then creates the query's equivalent abstract query model. There's even a QueryParsers class that can manage the parsers for multiple languages.

The abstract query model classes are immutable, so instances can be safely shared or reused where that is advantageous.
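
The combination of a fluent builder and an immutable model can be illustrated with a minimal sketch. The class SketchQuery and its builder methods below are hypothetical stand-ins written for this example; they are not the ModeShape QueryBuilder or query model classes, whose actual methods should be taken from their own Javadoc.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for an abstract query model (not a ModeShape class).
final class SketchQuery {
    private final String source;
    private final List<String> columns;
    private final String criteria; // null means "no WHERE clause"

    private SketchQuery(String source, List<String> columns, String criteria) {
        this.source = source;
        // Defensive copy plus unmodifiable wrapper keeps the model immutable.
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        this.criteria = criteria;
    }

    String source() { return source; }
    List<String> columns() { return columns; }
    String criteria() { return criteria; }

    @Override
    public String toString() {
        String select = columns.isEmpty() ? "*" : String.join(", ", columns);
        String where = criteria == null ? "" : " WHERE " + criteria;
        return "SELECT " + select + " FROM " + source + where;
    }

    static Builder builder() { return new Builder(); }

    // Fluent builder: every method returns the builder so that calls can be chained.
    static final class Builder {
        private final List<String> columns = new ArrayList<>();
        private String source;
        private String criteria;

        Builder select(String... names) {
            Collections.addAll(columns, names);
            return this;
        }

        Builder from(String tableName) {
            this.source = tableName;
            return this;
        }

        Builder where(String expression) {
            this.criteria = expression;
            return this;
        }

        SketchQuery build() {
            if (source == null) throw new IllegalStateException("from(...) is required");
            return new SketchQuery(source, columns, criteria);
        }
    }

    public static void main(String[] args) {
        SketchQuery query = SketchQuery.builder()
                .select("title", "author")
                .from("books")
                .where("year > 2000")
                .build();
        System.out.println(query); // prints "SELECT title, author FROM books WHERE year > 2000"
    }
}
```

Because the built object exposes no setters and wraps its column list as unmodifiable, the same instance can be handed to several threads or cached without copying.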

SQL language

One of the QueryParser implementations provided out of the box is the BasicSqlQueryParser, which understands a subset of SQL.

QueryEngine

The QueryEngine is the component that accepts and executes queries expressed as abstract query models. Each submitted query is planned, validated, optimized, and then processed to compute and return the final query results.

Note that the QueryEngine is thread-safe.

Planning

In the planning stage, a canonical plan is generated for each query. This plan is a tree of PlanNode objects, each representing a different aspect of the query, in a form that is easily manipulated by subsequent stages. Any implementation of Planner can be used, though a CanonicalPlanner implementation is provided and will be sufficient for most cases. In fact, the subsequent execution steps often require the plan to be in its canonical form, so in most situations it is best to simply reuse the CanonicalPlanner, and in the others to extend it.

Note that query plans are mutable and not thread-safe, meaning that such plans are not intended to be shared or reused.
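
As a rough illustration of a plan expressed as a mutable tree, the sketch below builds a three-level plan for a single-table query. SketchPlanNode, its Type enum, and the PROJECT over SELECT over SOURCE shape are assumptions made for this example; the real PlanNode and CanonicalPlanner classes are richer and may structure plans differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified plan node (not the ModeShape PlanNode class).
final class SketchPlanNode {
    enum Type { PROJECT, SELECT, SOURCE }

    private final Type type;
    private final String detail;
    // Children are kept in a mutable list so later stages can rearrange the tree.
    private final List<SketchPlanNode> children = new ArrayList<>();

    SketchPlanNode(Type type, String detail) {
        this.type = type;
        this.detail = detail;
    }

    Type type() { return type; }
    List<SketchPlanNode> children() { return children; }

    // Adds the child and returns it, which makes it easy to build a chain top-down.
    SketchPlanNode addChild(SketchPlanNode child) {
        children.add(child);
        return child;
    }

    // Assumed canonical shape for a single-table query: PROJECT, then an optional SELECT, then SOURCE.
    static SketchPlanNode canonicalPlan(String columns, String criteria, String table) {
        SketchPlanNode root = new SketchPlanNode(Type.PROJECT, columns);
        SketchPlanNode current = root;
        if (criteria != null) {
            current = current.addChild(new SketchPlanNode(Type.SELECT, criteria));
        }
        current.addChild(new SketchPlanNode(Type.SOURCE, table));
        return root;
    }

    // Renders the tree with two spaces of indentation per level.
    String describe() {
        StringBuilder out = new StringBuilder();
        describe(out, 0);
        return out.toString();
    }

    private void describe(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(type).append('[').append(detail).append("]\n");
        for (SketchPlanNode child : children) child.describe(out, depth + 1);
    }

    public static void main(String[] args) {
        System.out.print(canonicalPlan("title", "year > 2000", "books").describe());
    }
}
```

Since the nodes in this sketch are freely mutable, a plan built this way should be confined to the thread that is processing the query, in line with the note above.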

Optimization

In the optimization stage, the canonical query plan is evaluated, validated, and manipulated to produce a single optimized query processing plan. The query plan is often changed in situ, although this is not required of the Optimizer implementations. A library of existing OptimizerRule classes is provided, and it is easy to add more optimizer rules.

The RuleBasedOptimizer is an implementation that optimizes a query using a stack of rules. A new stack is created for each query, though the rules themselves are required to be immutable and so are often shared and reused. The RuleBasedOptimizer is also easily subclassed to define a custom stack of rules.
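
The idea of running shared, stateless rules from a freshly built stack can be sketched as follows. Everything here is hypothetical: the Rule interface, the two example rules, and the use of a list of strings as the plan are simplifications for illustration and do not reproduce the OptimizerRule or RuleBasedOptimizer signatures. The sketch assumes that a new stack is populated on each call to optimize.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical rule-stack optimizer; the plan is simplified to a list of step labels.
final class SketchOptimizer {

    // Hypothetical rule contract: return the rewritten plan; a rule may also push follow-up rules.
    interface Rule {
        List<String> apply(List<String> plan, Deque<Rule> ruleStack);
    }

    // Stateless rule: drops criteria that are always true. Safe to share between queries.
    static final Rule REMOVE_ALWAYS_TRUE = (plan, ruleStack) -> {
        List<String> result = new ArrayList<>(plan);
        result.removeIf(step -> step.equals("SELECT[true]"));
        return result;
    };

    // Stateless rule: collapses identical steps that sit next to each other.
    static final Rule DEDUPLICATE_ADJACENT = (plan, ruleStack) -> {
        List<String> result = new ArrayList<>();
        for (String step : plan) {
            if (result.isEmpty() || !result.get(result.size() - 1).equals(step)) {
                result.add(step);
            }
        }
        return result;
    };

    private final List<Rule> rules;

    SketchOptimizer(List<Rule> rules) {
        this.rules = new ArrayList<>(rules);
    }

    List<String> optimize(List<String> plan) {
        // A fresh stack per call, so concurrent optimizations never share mutable state.
        Deque<Rule> stack = new ArrayDeque<>();
        // Push in reverse so that the first configured rule ends up on top of the stack.
        for (int i = rules.size() - 1; i >= 0; i--) {
            stack.push(rules.get(i));
        }
        List<String> current = plan;
        while (!stack.isEmpty()) {
            current = stack.pop().apply(current, stack);
        }
        return current;
    }

    public static void main(String[] args) {
        SketchOptimizer optimizer =
                new SketchOptimizer(Arrays.asList(REMOVE_ALWAYS_TRUE, DEDUPLICATE_ADJACENT));
        List<String> plan = Arrays.asList(
                "PROJECT[title]", "SELECT[true]", "SELECT[year > 2000]",
                "SELECT[year > 2000]", "SOURCE[books]");
        System.out.println(optimizer.optimize(plan));
        // prints "[PROJECT[title], SELECT[year > 2000], SOURCE[books]]"
    }
}
```

Passing the stack to each rule is a design choice of this sketch: it lets a rule schedule further rules when its rewrite makes them worthwhile, without the optimizer needing to know about rule dependencies.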

Validation

The canonical planner and the optimization rules have access to the table and column definitions that may be queried. The query framework does not prescribe the semantics of a table or column, but instead defines a Schemata interface that gives access to the immutable Schemata.Table definitions (which in turn contain the Schemata.Column and Schemata.Key definitions).

The canonical planner and a number of the provided optimizer rules use the Schemata to verify that the query references existing tables and columns, whatever they are defined to be. Although any Schemata implementation can be used, the query framework provides an ImmutableSchemata class whose builder offers a fluent API for creating the corresponding immutable table, column, and key definitions.
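
A minimal sketch of this kind of validation is shown below. SketchSchemata, its builder, and the validate method are hypothetical names invented for this example; they only mirror the idea of an immutable set of table and column definitions and are not the ImmutableSchemata API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical immutable table/column catalog (not the ModeShape Schemata classes).
final class SketchSchemata {
    private final Map<String, Set<String>> columnsByTable;

    private SketchSchemata(Map<String, Set<String>> tables) {
        // Deep copy into unmodifiable collections so the definitions cannot change later.
        Map<String, Set<String>> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : tables.entrySet()) {
            copy.put(entry.getKey(),
                     Collections.unmodifiableSet(new LinkedHashSet<>(entry.getValue())));
        }
        this.columnsByTable = Collections.unmodifiableMap(copy);
    }

    // Returns one message per problem found; an empty list means the references are valid.
    List<String> validate(String table, String... columns) {
        List<String> problems = new ArrayList<>();
        Set<String> known = columnsByTable.get(table);
        if (known == null) {
            problems.add("Unknown table: " + table);
            return problems;
        }
        for (String column : columns) {
            if (!known.contains(column)) {
                problems.add("Unknown column: " + table + "." + column);
            }
        }
        return problems;
    }

    static Builder builder() { return new Builder(); }

    // Fluent builder that collects definitions and then produces the immutable catalog.
    static final class Builder {
        private final Map<String, Set<String>> tables = new LinkedHashMap<>();

        Builder addTable(String name, String... columns) {
            tables.put(name, new LinkedHashSet<>(Arrays.asList(columns)));
            return this;
        }

        SketchSchemata build() { return new SketchSchemata(tables); }
    }

    public static void main(String[] args) {
        SketchSchemata schemata = SketchSchemata.builder()
                .addTable("books", "title", "author")
                .build();
        System.out.println(schemata.validate("books", "title"));  // prints "[]"
        System.out.println(schemata.validate("books", "isbn"));   // prints "[Unknown column: books.isbn]"
        System.out.println(schemata.validate("films", "title"));  // prints "[Unknown table: films]"
    }
}
```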

Processing

In the processing stage, the optimized query plan is used to construct and assemble a NodeSequence that abstractly represents the nodes that satisfy the query. When the sequence is accessed, the processing logic of the query engine dynamically performs the necessary operations to return the correct nodes.
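
The on-demand behavior described above can be sketched with a lazily filtering iterator. SketchNodeSequence is a hypothetical class written for this example, with nodes reduced to path strings; the real NodeSequence interface works in batches of rows and has a different API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical lazy sequence of matching nodes; nodes are simplified to non-null path strings.
final class SketchNodeSequence implements Iterator<String> {
    private final Iterator<String> candidates;
    private final Predicate<String> criteria;
    private String next;   // the next matching node, or null if not yet located
    private int examined;  // how many candidates have been tested so far

    SketchNodeSequence(Iterator<String> candidates, Predicate<String> criteria) {
        this.candidates = candidates;
        this.criteria = criteria;
    }

    @Override
    public boolean hasNext() {
        // Work is done only here, when a caller actually asks for the next node.
        while (next == null && candidates.hasNext()) {
            String candidate = candidates.next();
            examined++;
            if (criteria.test(candidate)) {
                next = candidate;
            }
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String result = next;
        next = null;
        return result;
    }

    int examined() { return examined; }

    public static void main(String[] args) {
        SketchNodeSequence sequence = new SketchNodeSequence(
                Arrays.asList("/a", "/b/x", "/c", "/b/y").iterator(),
                path -> path.startsWith("/b"));
        System.out.println(sequence.examined()); // prints "0": nothing is evaluated up front
        System.out.println(sequence.next());     // prints "/b/x"
        System.out.println(sequence.examined()); // prints "2": only as many candidates as needed
    }
}
```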

QueryResults

A query over a graph of content will result in a set of nodes that matched the criteria specified in the query. Each node in the results is identified by its NodeKey and is accompanied by the values of any selected properties. Typically, queries will result in a single node per row, although joins may result in multiple nodes per row.

Copyright © 2008–2016 JBoss, a division of Red Hat. All rights reserved.