Package org.modeshape.graph.query

The Query API provides a mechanism for building and executing queries.

See:
          Description


Interface Summary
Queryable An interface defining the ability to submit a query and obtain results.
QueryBuilder.DynamicOperandBuilder Interface that defines a dynamic operand portion of a criteria.
QueryBuilder.OrderByOperandBuilder  
QueryResults The resulting output of a query.
QueryResults.Columns Definition of the columns that are available in the results, which outline the structure of the tuples in the results, and which can be used to access the individual values in each of the tuples.
QueryResults.Cursor An interface used to walk through the results.
 

Class Summary
QueryBuilder A component that can be used to programmatically create QueryCommand objects.
QueryContext An immutable context in which queries are to be executed.
QueryEngine A query engine that is able to execute formal queries expressed in the Graph API's Abstract Query Model.
QueryResults.Statistics  
 

Package org.modeshape.graph.query Description

The Query API provides a mechanism for building and executing queries. The framework provides a reusable and extensible query engine that is capable of planning, validating, optimizing, and executing queries against a generic back-end system. Simply subclass the QueryProcessor that creates a ProcessingComponent to operates against the back-end system, and assemble a QueryEngine that can be used to provide a rich query capability.

Abstract query model

At the heart of the entire query system is a single representation of what constitutes a query. The abstract query model defines a language-independent vocabulary for queries, and consists of a family of Java classes each represent the important semantic elements needed to fully define a query.

There are two ways to construct abstract query models. The first is to programmatically construct a query model using the QueryBuilder, which provides a fluent API that makes it easy to create a query with Java code. The second (and more common) approach is to use a QueryParser that parses a query represented in a specific language (like SQL or XPath) and then creates the query's equivalent abstract query model. There's even a QueryParsers class that can manage the parsers for multiple languages.

The abstract query model classes are immutable, making them very easily shared or reused if that is advantageous.

SQL language

One of the QueryParser implementation provided out of the box is the SqlQueryParser, which understands a subset of SQL.

QueryEngine

The QueryEngine is the component that accepts and executes queries expressed as abstract query models. Each submitted query is planned, validated, optimized, and then processed to compute and return the final query results.

Note that the QueryEngine is thread-safe.

Planning

In the planning stage, a canonical plan is generated for each query. This plan is a tree of PlanNode objects that each represent a different aspect of the query, and is a form that is easily manipulated by subsequent stages. Any implementation of Planner can be used, though a CanonicalPlanner implementation is provided and will be sufficient for most cases. In fact, the subsequent execution steps often require the plan to be in its canonical form, so for most situations it may be best to simply reuse the CanonicalPlanner and in other simply extend it.

Note that query plans are mutable and not thread-safe, meaning that such plans are not intended to be shared or reused.

Optimization

In the optimization stage, the canonical query plan is evaluated, validated, and manipulated to produce a more a single optimized query processing plan. The query plan is often changed in situ, although this is not required of the Optimizer implementations. A library of existing OptimizerRule classes is provided, though it's very easy to add more optimizer rules.

The RuleBasedOptimizer is an implementation that optimizes a query using a stack of rules. A new stack is created for each rule, though the rules are required to be immutable and thus often shared and reused. And, the RuleBasedOptimizer is easily subclassed to define a custom stack of rules.

Validation

The canonical planner or the optimization rules have access to the table and column definitions that may be queried. The query framework does not prescribe the semantics of a table or column, but instead provides a Schemata interface that provides access to the immutable Schemata.Table definitions (that then contain the Schemata.Column and Schemata.Key definitions).

The canonical planner and a number of the provided optimizer rules use the Schemata to verify that the query is referencing an existing table and columns, whatever they are defined to be. Although any Schemata implementaiton can be used, the query framework provides an ImmutableSchemata class with a builder with a fluent API that can create the corresponding immutable table, column and key definitions.

Processing

In the processing stage, the optimized query plan is used to construct and assemble the ProcessingComponent that correspond to the various parts of the quer plan. The resulting components form the basic processing engine for that query. At the bottom are the "access" components that perform the low-level access of the tuples from the graph container. Above these are the other components that implement various operations, such as limits, joins (using merge and nested loop algorithms), unions, intersects, distinct, sorts, and even column projections. At the top is a single component that produces tuples that represent the results of the query.

Once the QueryProcessor creates the ProcessingComponent assembly, the top-level component is executed. Execution involves requesting from the child processing component(s) the next batch of results, processing each of the tuples according to the specific ProcessingComponent algorithm, and finally returning the processed tuples.

QueryResults

A query over a graph of content will result in a set of nodes that matched the criteria specified in the query. Each node contained in the results will be identified by its Location as well as any values for the selected properties. Typically, queries will result in a single node per row, although joins may result in multiple rows per row.



Copyright © 2008-2010 JBoss, a division of Red Hat. All Rights Reserved.