JBoss.orgCommunity Documentation
Teiid supports materialized views. Materialized views are just like other views, but their transformations are pre-computed and stored just like a regular table. When queries are issued against the views through the Teiid Server, the cached results are used. This saves the cost of accessing all the underlying data sources and re-computing the view transforms each time a query is executed.
Materialized views are appropriate when the underlying data does not change rapidly, or when it is acceptable to retrieve data that is "stale" within some period of time, or when it is preferred for end-user queries to access staged data rather than placing additional query load on operational sources.
Caching of relational table or view records (pre-computing all transformations)
Model-based definition of virtual groups to cache (requires Teiid Designer)
User ability to override use of materialized view cache for specific queries through OPTION NO CACHE
When client applications issue queries against a Relational table or view that has been defined as a materialized view, the Teiid query engine automatically routes that query to obtain the results from the cache database.
Individual queries may override the use of materialized views by specifying OPTION NOCACHE on the query. This parameter may specify one or more virtual groups to override (separated by commas, spaces optional). If no virtual groups are specified, materialized views tables will not be used transitively.
Example 4.2. Specific NOCACHE
SELECT * from vg1, vg2, vg3 WHERE … OPTION NOCACHE vg1, vg3
Only the vg1 and vg3 caches will be skipped vg2 or any materialized views nested under vg1 and vg3 will be used.
Option NOCACHE may be specified in virtual group transformation queries. In that way, transformations can specify to always use real-time data obtained directly from a source. The use of caching and non-caching can be mixed in transformation definitions, just as with user queries.
Materialized views are defined in Teiid Designer by setting the materialized property on a table or view in a virtual (view) relational model. Setting this property's value to true (the default is false) allows the data generated for this virtual table to be treated as a materialized view.
It is important to ensure that all key/index information is present as these will be used by the materialization process to enhance the performance of the materialized table.
The target materialized table may also be set in the properties. If the value is left blank, the default, then internal materialization will be used. Otherwise for external materialization, the value should reference the fully qualified name of a table (or possibly view) with the same columns as the materialized view. For most basic scenarios the simplicity of internal materialization makes it the more appealing option. Other considerations for chosing between internal and external materialization are:
Does the cached data need to be fully durable? If yes, then external materialization should be used. Internal materialization should not survive a cluster restart.
Is full control needed of loading and refresh? If yes, then external materialzation should be used. Internal materialization does offer several system supported methods for refreshing, but does not give full access to the materialized table.
External materialized views cache their data in an external database system. External materialized views give the administrator full control over the loading and refresh strategies.
Since the actual physical cache for materialized views is maintained external to the Teiid system, there is no pre-defined policy for clearing and managing the cache. These policies will be defined and enforced by administrators of the Teiid system.
Typical Usage Steps
Create materialized views and corresponding physical materialized target tables in Designer. This can be done through setting the materialized and target table manually, or by selecting the desired views, right clicking, then selecting Modeling->"Create Materialized Views"
Generate the DDL for your physical model materialization target tables. This can be done by selecting the model, right clicking, then choosing Export->"Metadata Modeling"->"Data Definition Language (DDL) File". This script can be used to create the desired schema for your materialization target on whatever source you choose.
Determine a load and refresh strategy. With the schema created the most simplistic approach is to just load the data.
The load can even be done through Teiid with insert into target_table select * from matview option nocache
.
That however may be too simplistic because you index creation may be more performant if deferred until after the table has been created.
Also full snapshot refreshes are best done to a staging table then swapping it for the existing physical table to ensure that the refresh
does not impact user queries and to ensure that the table is valid prior to use.
Internal materialization creates Teiid temporary tables to hold the materialized table. While these tables are not fully durable, they perform well in most circumstances and the data is present at each Teiid instance which removes the single point of failure and network overhead of an external database. Internal materialization also provides more built-in facilities for refreshing and monitoring.
An internal materialized view table is initially in an invalid state (there is no data). The first user query will trigger an implicit loading of the data.
All other queries against the materialized view will block until the load completes.
In some situations administrators may wish to better control when the cache is loaded with a call to SYS.refreshMatView
. The initial load may itself trigger the initial load
of dependent materialized views. After the initial load user queries against the materialized view table will only block if it is in an invalid state.
The valid state may also be controled through the SYS.refreshMatView
procedure.
Example 4.3. Invalidating Refresh
CALL SYS.refreshMatView(viewname=>'schema.matview', invalidate=>true)
matview will be refreshed and user queries will block until the refresh is complete (or fails).
While the initial load may trigger a transitive loading of dependent materialized views,
subsequent refreshes performed with refreshMatView
will use dependent materialized view tables if they exist. Only one load may occur at a time. If a load is already in progress when
the SYS.refreshMatView
procedure is called, it will return -1 immediately rather than preempting the current load.
The cache hint may be used to automatically trigger a full snapshot refresh after a specified time to live.
Example 4.4. Auto-refresh Transformation Query
/*+ cache(ttl:3600000) */ select t.col, t1.col from t, t1 where t.id = t1.id
In advanced use-cases the cache hint may also be used to mark an internal materialized view as updatable.
An updatable internal materialized view may use the SYS.refreshMatViewRow
procedure to update a single row in the materialized table.
To be updatable the materialized view must have a single column primary key. Composite keys are not yet supported by SYS.refreshMatViewRow
.
Example 4.5. Updatable Scenario
Transofrmation Query:
/*+ cache(updatable) */ select t.col, t1.col from t, t1 where t.id = t1.id
Update:
CALL SYS.updateMatViewRow(viewname=>'schema.matview', key=>5)
Given that the schema.matview defines interger column col as it's primary key, the update will check the live source(s) for the row values. If it exists, the materialized view table row will be updated. If it does not exist the correpsonding row will be deleted.
The update query will not use dependent materialized view tables, so care should be taken to ensure that getting a single
row from this transformation query performs well. This may require the use of depedent join hints.
When the updatable option is not specified, accessing the materialized view table is more efficient because modifications do not need to be considered.
Therefore, only specify the updatable option if row based incremental updates are needed. Even when performing row updates, full snapshot refreshes may be needed to ensure consistency.