Chapter 1. SQL Support

Teiid supports SQL for issuing queries and for defining view transformations; see also Procedure Language for how SQL is used in virtual procedures and update procedures.

Teiid provides nearly all of the functionality of SQL-92 DML. SQL-99 and later features are constantly being added based upon community need. The following does not attempt to cover SQL exhaustively, but rather highlights SQL's usage within Teiid. See the grammar for the exact form of SQL accepted by Teiid.

1.1. Identifiers

SQL commands contain references to tables and columns. These references are in the form of identifiers, which uniquely identify the tables and columns in the context of the command. All queries are processed in the context of a virtual database, or VDB. Because information can be federated across multiple sources, tables and columns must be scoped in some manner to avoid conflicts. This scoping is provided by models, which contain the information for each data source or set of views.

Fully-qualified table and column names are of the following form, where the separate 'parts' of the identifier are delimited by periods.

TABLE: <model_name>.<table_spec>
COLUMN: <model_name>.<table_spec>.<column_name>

Syntax Rules:

Identifiers can consist of alphanumeric characters, or the underscore (_) character, and must begin with an alphabetic character. Any Unicode character may be used in an identifier.
Because different data sources organize tables in different ways, some prepending catalog or schema or user information, Teiid allows the 'table_spec' to be a dot-delimited construct.
Identifiers are not case-sensitive in Teiid.
The separate parts of an identifier can be quoted, with double quotes. This is not required, but some tools do this automatically. Quotes establish another level of grouping, in addition to the dot delimiters. Quotes should not be used in such a way that the table specification, which may itself have multiple parts, is split between two quoted sections.

Some examples of valid fully-qualified table identifiers are:

MyModel.MySchema.Portfolios
"MyModel"."MySchema.Portfolios"
MyModel.Inventory
MyModel.MyCatalog.dbo.Authors

Some examples of valid fully-qualified column identifiers are:

MyModel.MySchema.Portfolios.portfolioID
"MyModel"."MySchema.Portfolios"."portfolioID"
MyModel.Inventory.totalPallets
MyModel.MyCatalog.dbo.Authors.lastName

Fully-qualified identifiers can always be used in SQL commands. Partially- or unqualified forms can also be used, as long as the resulting names are unambiguous in the context of the command. Different forms of qualification can be mixed in the same query.

1.2. Expressions

Identifiers, literals, and functions can be combined into expressions. Expressions can be used almost anywhere in a query -- SELECT, FROM (if specifying join criteria, WHERE, GROUP BY, HAVING. However you currently cannot use expressions in an ORDER BY clause.

Teiid supports the following types of expressions:

1.2.1. Column Identifiers

Column identifiers are used to specify the output columns in SELECT statements, the columns and their values for INSERT and UPDATE statements, and criteria used in WHERE and FROM clauses. They are also used in GROUP BY, HAVING, and ORDER BY clauses. The syntax for column identifiers was defined in the Identifiers section above.

1.2.2. Literals

Literal values represent fixed values. These can any of the 'standard' data types.

Syntax Rules:

Integer values will be assigned an integral data type big enough to hold the value (integer, long, or biginteger).
Floating point values will always be parsed as a double.
The keyword 'null' is used to represent an absent or unknown value and is inherently untyped. In many cases, a null literal value will be assigned an implied type based on context. For example, in the function '5 + null', the null value will be assigned the type 'integer' to match the type of the value '5'. A null literal used in the SELECT clause of a query with no implied context will be assigned to type 'string'.

Some examples of simple literal values are:

‘abc’
‘isn’’t true’ - use an extra single tick to escape a tick in a string with single ticks
5
-37.75e01 - scientific notation
100.0 - parsed as double
true
false
'\u0027' - unicode character

1.3. Criteria

Criteria are of two basic forms:

Predicates that evaluate to true or false
Logical criteria that combine predicates (AND, OR, NOT)

Syntax Rules:

expression (=|<>|!=|<|>|<=|>=) (expression|((ANY|ALL|SOME) subquery))
expression [NOT] IS NULL
expression [NOT] IN (expression[,expression]*)|subquery
expression [NOT] LIKE expression [ESCAPE char]
EXISTS(subquery)
expression BETWEEN minExpression AND maxExpression
criteria AND|OR criteria
NOT criteria
Criteria may be nested using parenthesis.

Some examples of valid criteria are:

(balance > 2500.0)
100*(50 - x)/(25 - y) > z
concat(areaCode,concat('-',phone)) LIKE '314%1'

Comparing null Values

Null values represent an unknown value. Comparison with a null value will evaluate to 'unknown', which can never be true even if 'not' is used.

1.4. SQL Commands

There are 4 basic commands for manipulating data in SQL, corresponding to the CRUD create, read, update, and delete operations: INSERT, SELECT, UPDATE, and DELETE. In addition, procedures can be executed using the EXECUTE command or through a procedural relational command.

1.4.1. SELECT Command

The SELECT command is used to retrieve records any number of relations.

A SELECT command has a number of clauses:

All of these clauses other than OPTION are defined by the SQL specification. The specification also specifies the order that these clauses will be logically processed. Below is the processing order where each stage passes a set of rows to the following stage. Note that this processing model is logical and does not represent the way any actual database engine performs the processing, although it is a useful model for understanding questions about SQL.

FROM stage - gathers all rows from all tables involved in the query and logically joins them with a Cartesian product, producing a single large table with all columns from all tables. Joins and join criteria are then applied to filter rows that do not match the join structure.
WHERE stage - applies a criteria to every output row from the FROM stage, further reducing the number of rows.
GROUP BY stage - groups sets of rows with matching values in the group by columns.
HAVING stage - applies criteria to each group of rows. Criteria can only be applied to columns that will have constant values within a group (those in the grouping columns or aggregate functions applied across the group).
SELECT stage - specifies the column expressions that should be returned from the query. Expressions are evaluated, including aggregate functions based on the groups of rows, which will no longer exist after this point. The output columns are named using either column aliases or an implicit name determined by the engine. If SELECT DISTINCT is specified, duplicate removal will be performed on the rows being returned from the SELECT stage.
ORDER BY stage - sorts the rows returned from the SELECT stage as desired. Supports sorting on multiple columns in specified order, ascending or descending. The output columns will be identical to those columns returned from the SELECT stage and will have the same name.
LIMIT stage - returns only the specified rows (with skip and limit values).

This model can be used to understand many questions about SQL. For example, columns aliased in the SELECT clause can only be referenced by alias in the ORDER BY clause. Without knowledge of the processing model, this can be somewhat confusing. Seen in light of the model, it is clear that the ORDER BY stage is the only stage occurring after the SELECT stage, which is where the columns are named. Because the WHERE clause is processed before the SELECT, the columns have not yet been named and the aliases are not yet known.

1.4.2. INSERT Command

The INSERT command is used to add a record to a table.

Example Syntax

INSERT INTO table (column,...) VALUES (value,...)
INSERT INTO table (column,...) query

1.4.3. UPDATE Command

The UPDATE command is used to modify records in a table. The operation may result in 1 or more records being updated, or in no records being updated if none match the criteria.

Example Syntax

UPDATE table SET (column=value,...) [WHERE criteria]

1.4.4. DELETE Command

The DELETE command is used to remove records from a table. The operation may result in 1 or more records being deleted, or in no records being deleted if none match the criteria.

Example Syntax

DELETE FROM table [WHERE criteria]

1.4.5. EXECUTE Command

The EXECUTE command is used to execute a procedure, such as a virtual procedure or a stored procedure. Procedures may have zero or more scalar input parameters. The return value from a procedure is a result set, the same as is returned from a SELECT. Note that EXEC can be used as a short form of this command.

Example Syntax

EXECUTE proc()
EXECUTE proc(value, ...)
EXECUTE proc(name1=value1,name4=param4, ...) - named parameter syntax

Syntax Rules:

The default order of parameter specification is the same as how they are defined in the procedure definition.
You can specify the parameters in any order by name. Parameters that are have default values and/or are nullable in the metadata, can be omitted from the named parameter call and will have the appropriate value passed at runtime.
If the procedure does not return a result set, the values from the RETURN, OUT, and IN_OUT parameters will be returned as a single row.

1.4.6. Procedural Relational Command

Procedural relational commands use the syntax of a SELECT to emulate an EXEC. In a procedural relational command a procedure group names is used in a FROM clause in place of a table. That procedure will be executed in place of a normal table access if all of the necessary input values can be found in criteria against the procedure. Each combination of input values found in the criteria results in an execution of the procedure.

Example Syntax

select * from proc
select output_param1, output_param2 from proc where input_param1 = 'x'
select output_param1, output_param2 from proc, table where input_param1 = table.col1 and input_param2 = table.col2

Syntax Rules:

The procedure as a table projects the same columns as an exec with the addition of the input parameters. For procedures that do not return a result set, IN_OUT columns will be projected as two columns, one that represents the output value and one named {column name}_IN that represents the input of the parameter.
Input values are passed via criteria. Values can be passed by '=','is null', or 'in' predicates.
The procedure virtual group automatically has an access pattern on its IN and IN_OUT parameters which allows it to be planned correctly as a dependent join when necessary or fail when sufficient criteria cannot be found.
Procedures containing duplicate names between the parameters (IN, IN_OUT, OUT, RETURN) and result set columns cannot be used in a procedural relational command.
Default values for IN, IN_OUT parameters are not used if there is no criteria present for a given input. Default values are only valid for named procedure syntax.

Multiple Execution

The usage of 'in' or join criteria can result in the procedure being executed multiple times.

1.5. Temp Tables

Teiid supports creating temporary,or "temp", tables. Temp tables are dynamically created, but are treated as any other physical table.

Temp tables can be defined implicitly by referencing them in a SELECT INTO or in an INSERT statement or explicitly with a CREATE TABLE statement. Implicitly created temp tables must have a name that starts with '#'.

Creation syntax:

CREATE LOCAL TEMPORARY TABLE<temporary table name> (<column name> <data type>,...)
SELECT <element name>,...INTO <temporary table name> FROM <table name>
INSERT INTO <temporary table name> ((<column name>,...)VALUES (<value>,...)

Drop syntax:

DROP TABLE <temporary table name>

Limitations:

With the CREATE TABLE syntax only basic table definition (column name and type information) is supported.
The "ON COMMIT" clause is not supported in the CREATE TABLE statement.
"drop behavior" option is not supported in the drop statement.
Only local temporary tables are supported. This implies that the scope of temp table will be either to the sesssion or the block of a virtual procedure that creates it.
Session level temp tables are not fail-over safe.
temp tables are non-transactional.
Temp tables do not support update or delete operations.

The following example is a series of statements that loads a temporary table with data from 2 sources, and with a manually inserted record, and then uses that temp table in a subsequent query.

... 
CREATE LOCAL TEMPORARY TABLE TEMP (a integer, b integer, c integer); 
SELECT * INTO temp FROM Src1; SELECT * INTO temp FROM Src2; 
INSERT INTO temp VALUES (1,2,3); 
SELECT a,b,c FROM Src3, temp WHERE Src3.a = temp.b; 
...

See virtual procedures for more on temp table usage.

1.6. SQL Clauses

This section describes the clauses that are used in the various SQL commands described in the previous section. Nearly all these features follow standard SQL syntax and functionality, so any SQL reference can be used for more information.

1.6.1. SELECT Clause

SQL queries start with the SELECT keyword and are often referred to as "SELECT statements". Teiid supports most of the standard SQL query constructs.

Usage:

SELECT [DISTINCT|ALL] ((expression [[AS] name])|(group identifier.STAR))*|STAR ...

Syntax Rules:

Aliased expressions are only used as the output column names and in the ORDER BY clause. They cannot be used in other clauses of the query.
DISTINCT may only be specified if the SELECT symbols are comparable.

1.6.2. FROM Clause

The FROM clause specifies the target table(s) for SELECT, UPDATE, and DELETE statements.

Example Syntax:

FROM {table [AS alias]}
FROM table1 [INNER|LEFT OUTER|RIGHT OUTER|FULL OUTER] JOIN table1 ON join-criteria
FROM table1 CROSS JOIN table1
FROM (subquery) [AS alias]
FROM table1 JOIN table2 MAKEDEP ON join-criteria
FROM table1 JOIN table2 MAKENOTDEP ON join-criteria
FROM table1 left outer join /* optional */ table2 ON join-criteria

DEP Hints

MAKEDEP and MAKENOTDEP are hints used to control dependent join behavior. They should only be used in situations where the optimizer does not chose the most optimal plan based upon query structure, metadata, and costing information.

1.6.3. WHERE Clause

The WHERE clause defines the criteria to limit the records affected by SELECT, UPDATE, and DELETE statements.

The general form of the WHERE is:

WHERE criteria

1.6.4. GROUP BY Clause

The GROUP BY clause denotes that rows should be grouped according to the specified expression values. One row will be returned for each group, after optionally filtering those aggregate rows based on a HAVING clause.

The general form of the GROUP BY is:

GROUP BY expression (,expression)*

Syntax Rules:

Column references in the group by clause must by to unaliased output columns.
Expressions used in the group by must appear in the select clause.
Column references and expessions in the select clause that are not used in the group by clause must appear in aggregate functions.
If an aggregate function is used in the SELECT clause and no GROUP BY is specified, an implicit GROUP BY will be performed with the entire result set as a single group. In this case, every column in the SELECT must be an aggregate function as no other column value will be fixed across the entire group.
The group by columns must be of a comparable type.

1.6.5. HAVING Clause

The HAVING clause operates exactly as a WHERE clause although it operates on the output of a GROUP BY. It supports the same syntax as the WHERE clause.

Syntax Rules:

Expressions used in the group by clause must either contain an aggregate function: COUNT, AVG, SUM, MIN, MAX. or be one of the grouping expressions.

1.6.6. ORDER BY Clause

The ORDER BY clause specifies how the returned records from a SELECT should be sorted. The options are ASC (ascending) and DESC (descending).

Usage:

ORDER BY column1 [ASC|DESC], ...

Syntax Rules:

Sort columns may be specified positionally by a 1 based integer or string literal, by SELECT clause alias name, or by a column reference.
Column references may appear in the SELECT clause as the expression for an aliased column or may reference columns from tables in the FROM clause. If the column reference is not in the SELECT clause the query must not be a set operation, specify SELECT DISTINCT, or contain a GROUP BY clause.
The ORDER BY columns must be of a comparable type.
If an ORDER BY is used in an inline view or view definition without a limit clause, it will be removed by the Teiid optimizer.

Warning

The use of positional ordering is no longer supported by the ANSI SQL standard and is a deprecated feature in Teiid. It is preferable to use alias names in the order by clause.

1.6.7. LIMIT Clause

The LIMIT clause specifies a limit on the number of records returned from the SELECT command. An optional offset (the number of rows to skip) can be specified.

Usage:

LIMIT [offset,] limit

Examples:

LIMIT 100 - returns the first 100 records (rows 1-100)
LIMIT 500, 100 - skips 500 records and returns the next 100 records (rows 501-600)

1.6.8. INTO Clause

Warning

Usage of the INTO Clause for inserting into a table has been been deprecated. An INSERT with a query command should be used instead.

When the into clause is specified with a SELECT, the results of the query are inserted into the specified table. This is often used to insert records into a temporary table. The INTO clause immediately precedes the FROM clause.

Usage:

INTO table FROM ...

Syntax Rules:

The INTO clause is logically applied last in processing, after the ORDER BY and LIMIT clauses.
Teiid's support for SELECT INTO is similar to MS SQL Server. The target of the INTO clause is a table where the result of the rest select command will be inserted. SELECT INTO should not be used UNION query.

1.6.9. OPTION Clause

The OPTION keyword denotes options the user can pass in with the command. These options are Teiid-specific and not covered by any SQL specification.

Usage:

OPTION option, (option)*

Supported options:

SHOWPLAN - returns the query plan along with the results
PLANONLY - returns the query plan, but does not execute the command
MAKEDEP [table, (table)*] - specifies source tables that should be made dependent in the join
MAKENOTDEP [table, (table)*] - prevents a dependent join from being used
DEBUG - prints query planner debug information in the log and returns it through the JDBC API

Examples:

LIMIT 100 - returns the first 100 records (rows 1-100)
LIMIT 500, 100 - skips 500 records and returns the next 100 records (rows 501-600)

1.7. Set Operations

Teiid supports the UNION, UNION ALL, INTERSECT, EXCEPT set operation as a way of combining the results of commands.

Usage:

command (UNION|INTERSECT|EXCEPT) [ALL] command [ORDER BY...]

Syntax Rules:

The output columns will be named by the output columns of the first set operation branch.
Each SELECT must have the same number of output columns and compatible data types for each relative column. Data type conversion will be performed if data types are inconsistent and implicit conversions exist.
If UNION, INTERSECT, or EXCEPT is specified without all, then the output columns must be comparable types.
INTERSECT ALL, and EXCEPT ALL are currently not supported.

1.8. Subqueries

A subquery is a SQL query embedded within another SQL query. The query containing the subquery is the outer query.

Supported subquery types:

Scalar subquery - a subquery that returns only a single column with a single value. Scalar subqueries are a type of expression and can be used where single valued expressions are expected.
Correlated subquery - a subquery that contains a column reference to from the outer query.
Uncorrelated subquery - a subquery that contains no references to the outer sub-query.

Supported subquery locations:

Subqueries in the FROM clause
Subqueries in the WHERE/HAVING Clauses
Subqueries may be used in any expression or CASE CRITERIA in the SELECT clasue.

1.8.1. Inline views

Subqueries in the FROM clause of the outer query (also known as "inline views") can return any number of rows and columns. This type of subquery must always be given an alias.

Example 1.1. Example Subquery in FROM Clause (Inline View)

SELECT a FROM (SELECT Y.b, Y.c FROM Y WHERE Y.d = ‘3’) AS X WHERE a = X.c AND b = X.b

1.8.2. Subqueries in the WHERE and HAVING clauses

Subqueries supported in the criteria of the outer query include subqueries in an IN clause, subqueries using the ANY/SOME or ALL predicate quantifier, and subqueries using the EXISTS predicate.

Example 1.2. Example Subquery in WHERE Using EXISTS

SELECT a FROM X WHERE EXISTS (SELECT b, c FROM Y WHERE c=3)

The following usages of subqueries must each select only one column, but can return any number of rows.

Example 1.3. Example Subqueries in WHERE Clause

SELECT a FROM X WHERE a IN (SELECT b FROM Y WHERE c=3) 
SELECT a FROM X WHERE a >= ANY (SELECT b FROM Y WHERE c=3) 
SELECT a FROM X WHERE a < SOME (SELECT b FROM Y WHERE c=4) 
SELECT a FROM X WHERE a = ALL (SELECT b FROM Y WHERE c=2)