
Apache Accumulo Translator

The Apache Accumulo Translator, known by the type name accumulo, exposes querying functionality to Accumulo Data Sources. Apache Accumulo is a sorted, distributed key-value store that provides robust, scalable, high-performance data storage and retrieval. This translator provides an easy way to connect to an Accumulo system and a relational way, using SQL, to add records either directly from a user or from other sources that are integrated with Teiid. It also provides the ability to read, update, and delete existing records in the Accumulo store. Teiid can pass the logged-in user's roles as visibility properties to restrict data access.

"versions"
The development was done using Accumulo 1.5.0, Hadoop 2.2.0, and Zookeeper 3.4.5.
This document assumes that the user is familiar with the Accumulo source and has a basic understanding of how Teiid works. This document only contains details about the Accumulo translator.

Intended Use Cases

How the Accumulo translator is used depends heavily on the user's use case(s). Here are some common scenarios.

  • An Accumulo source can be used in Teiid to continually and automatically add or update documents in the Accumulo system from other sources.
  • Access Accumulo through a SQL interface.
  • Make use of cell-level security through enterprise roles.
  • The Accumulo translator can be used as an indexing system to gather data from other enterprise sources such as RDBMS, Web Service, SalesForce, etc., all in a single client call, transparently and without any coding.

Usage

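Apache Accumulo is a distributed key-value store with a unique data model. It allows you to group its key-value pairs in a collection called a "table". The key structure is defined as follows: each key consists of a Row ID, a Column (made up of a Column Family, a Column Qualifier, and a Column Visibility), and a Timestamp, and each key maps to a Value.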
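To make the key layout concrete, the following is a minimal sketch that models an Accumulo-style key (row ID, column family, column qualifier, column visibility, timestamp) and its sort order in plain Java. The class names SketchKey and KeyModelSketch and the sample cells are hypothetical and exist only for illustration; a real client would use the Key class from the Accumulo libraries, and Accumulo compares raw bytes rather than Java strings.

```java
import java.util.TreeMap;

// Illustrative stand-in for an Accumulo key (hypothetical class, not part of
// Accumulo or Teiid). Real clients use org.apache.accumulo.core.data.Key.
final class SketchKey implements Comparable<SketchKey> {
    final String rowId;
    final String columnFamily;
    final String columnQualifier;
    final String columnVisibility;
    final long timestamp;

    SketchKey(String rowId, String columnFamily, String columnQualifier,
              String columnVisibility, long timestamp) {
        this.rowId = rowId;
        this.columnFamily = columnFamily;
        this.columnQualifier = columnQualifier;
        this.columnVisibility = columnVisibility;
        this.timestamp = timestamp;
    }

    // Ascending by row, family, qualifier, visibility; newest timestamp first.
    // Simplification: compares Java strings, whereas Accumulo compares bytes.
    @Override
    public int compareTo(SketchKey other) {
        int c = rowId.compareTo(other.rowId);
        if (c == 0) c = columnFamily.compareTo(other.columnFamily);
        if (c == 0) c = columnQualifier.compareTo(other.columnQualifier);
        if (c == 0) c = columnVisibility.compareTo(other.columnVisibility);
        if (c == 0) c = Long.compare(other.timestamp, timestamp);
        return c;
    }

    @Override
    public String toString() {
        return rowId + " " + columnFamily + ":" + columnQualifier
                + " [" + columnVisibility + "] @" + timestamp;
    }
}

final class KeyModelSketch {
    // Hypothetical sample cells, inserted out of order on purpose.
    static TreeMap<SketchKey, String> sampleTable() {
        TreeMap<SketchKey, String> table = new TreeMap<>();
        table.put(new SketchKey("2", "name", "firstname", "", 1L), "Jane");
        table.put(new SketchKey("1", "name", "lastname", "", 1L), "Doe");
        table.put(new SketchKey("1", "name", "firstname", "", 1L), "John");
        return table;
    }

    public static void main(String[] args) {
        // Iteration follows key order, which is what makes the store "sorted".
        sampleTable().forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Because all cells of one row ID sort next to each other, a relational row can be assembled by reading the consecutive cells that share a row ID and turning each family/qualifier pair into a column.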

Based on the above information, one can define a schema representing Accumulo table structures in Teiid using DDL or using Teiid Designer, with the help of the metadata extension properties defined below. Since no data type information is defined on the columns, all columns are treated as string data types by default. However, when modeling the schema, one can use any of the other data types supported by Teiid to define the data type that a column should be exposed as.

Once this schema is defined and exposed through a VDB in a Teiid database, and an Accumulo Data Source is created, the user can issue "INSERT/UPDATE/DELETE" based SQL calls to insert, update, and delete records in Accumulo, and issue "SELECT" based calls to retrieve records from Accumulo. You can use the full range of SQL with the Teiid system, integrating other sources along with the Accumulo source.

By default, the Accumulo table structure is flat and cannot define relationships among tables. So, a SQL JOIN is performed in the Teiid layer rather than pushed to the source, even if the tables on both sides of the JOIN reside in Accumulo. Currently, any criteria based on EQUALITY and/or COMPARISON using complex AND/OR clauses are handled by the Accumulo translator and are executed at the source.

An example dynamic VDB that uses the Accumulo translator can be defined as

"accumulo-vdb.xml"

The translator does NOT provide a connection to Accumulo. For that purpose, Teiid has a JCA adapter that provides a connection to Accumulo using the Accumulo Java libraries. To define such a connector, see Accumulo Data Sources or see an example in "<jboss-as>/docs/teiid/datasources/accumulo"

If you are using Designer Tooling to create the VDB

  • Create or use a Teiid Designer Model project.
  • Use the "Teiid Connection >> Source Model" importer, create an Accumulo Data Source using the data source creation wizard, and use accumulo as the translator in the importer. The table is created in a source model by the time you finish with this importer.
  • Create a VDB, deploy it into the Teiid Server, and use JDBC, ODBC, OData, etc. to query it.
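Once the VDB is deployed, a client can query it over JDBC. The following is a minimal sketch, not a definitive client: the VDB name "accumulo", the host, the port 31000, the credentials, and the column name FirstName on the "User" table (the example table discussed later on this page) are all assumptions for illustration, and running main requires the Teiid JDBC driver on the classpath and a running Teiid server.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class AccumuloVdbClientSketch {
    // Builds a Teiid JDBC URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>.
    static String teiidUrl(String vdbName, String host, int port) {
        return "jdbc:teiid:" + vdbName + "@mm://" + host + ":" + port;
    }

    public static void main(String[] args) throws SQLException {
        // Assumed values: VDB name, host, port, and credentials are hypothetical.
        String url = teiidUrl("accumulo", "localhost", 31000);
        try (Connection connection = DriverManager.getConnection(url, "user", "password")) {
            // INSERT is translated by Teiid into writes against the Accumulo table.
            try (PreparedStatement insert = connection.prepareStatement(
                    "INSERT INTO \"User\" (ROWID, FirstName) VALUES (?, ?)")) {
                insert.setString(1, "1");
                insert.setString(2, "John");
                insert.executeUpdate();
            }
            // An equality criterion on ROWID is the kind of predicate the
            // translator can evaluate at the source.
            try (PreparedStatement select = connection.prepareStatement(
                    "SELECT ROWID, FirstName FROM \"User\" WHERE ROWID = ?")) {
                select.setString(1, "1");
                try (ResultSet rows = select.executeQuery()) {
                    while (rows.next()) {
                        System.out.println(rows.getString(1) + " " + rows.getString(2));
                    }
                }
            }
        }
    }
}
```

The table name is quoted because USER can clash with a built-in SQL name; adjust the column list to match whatever the importer or your DDL actually produced.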

Properties

The Accumulo translator is capable of traversing Accumulo table structures and building a metadata structure for the Teiid translator. The schema importer can understand simple tables by traversing a single ROWID of data and looking for all the unique keys; based on these, it derives a tabular structure for the Accumulo table. Using the following import properties, you can further refine the import behavior.

Import Properties

Property Name | Description | Required | Default
ColumnNamePattern | How the column name should be formed | false | {CF}_{CQ}
ValueIn | Where the value for the column is defined: CQ or VALUE | false | {VALUE}

{CQ}, {CF}, and {ROWID} are expressions that you can use to define the above properties in any pattern; the respective values of the Column Qualifier, Column Family, or ROWID are substituted at import time. The ROW ID of the Accumulo table is automatically created as the ROWID column and is defined as the Primary Key on the table.
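The pattern substitution described above can be sketched as follows. This is an illustration of the idea only, not the translator's actual implementation: the class ColumnNamePatternSketch, its methods, and the sample cells are hypothetical, and the sketch simply replaces each placeholder with the value taken from a cell of one row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; it mimics the documented placeholders, nothing more.
final class ColumnNamePatternSketch {
    // Replaces {CF}, {CQ}, and {ROWID} with the values from one cell.
    static String expand(String pattern, String rowId, String cf, String cq) {
        return pattern.replace("{CF}", cf)
                      .replace("{CQ}", cq)
                      .replace("{ROWID}", rowId);
    }

    // Derives column names from the cells of a single row. Each cell is a
    // two-element array: {columnFamily, columnQualifier}.
    static List<String> columnsForRow(String pattern, String rowId, List<String[]> cells) {
        Set<String> names = new LinkedHashSet<>(); // keeps order, drops duplicates
        names.add("ROWID");                        // row ID always becomes a column
        for (String[] cell : cells) {
            names.add(expand(pattern, rowId, cell[0], cell[1]));
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // Hypothetical cells for one row of a "User" table.
        List<String[]> cells = Arrays.asList(
                new String[] {"name", "age"},
                new String[] {"name", "firstname"},
                new String[] {"name", "lastname"});
        System.out.println(columnsForRow("{CF}_{CQ}", "1", cells));
        System.out.println(columnsForRow("{CQ}", "1", cells));
    }
}
```

With the default pattern {CF}_{CQ}, the family and qualifier are joined with an underscore; with {CQ}, only the qualifier is used, which gives shorter names but can collide when two families share a qualifier.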

You can also define the metadata for the Accumulo based model using DDL or using the Teiid Designer. When doing so, the Accumulo Translator currently defines the following extended metadata properties, which are set on its Teiid schema model to guide the translator in making proper decisions.
The following properties are described under the NAMESPACE "http://www.teiid.org/translator/accumulo/2013"; for user convenience, this namespace has the alias teiid_accumulo defined in Teiid. To define an extension property, use an expression like "teiid_accumulo:{property-name} value". All the properties below are intended to be used as OPTION properties on COLUMNS. See DDL Metadata for more information on defining DDL based metadata.

Extension Metadata Properties

Property Name | Description | Required | Default
CF | Column Family | true | none
CQ | Column Qualifier | false | empty
VALUE-IN | Where the value of the column is defined. Possible values (VALUE, CQ) | false | VALUE

How to Use the Above Properties

Say, for example, you have a table called "User" in your Accumulo instance, and a scan returned the following data

If you used the default importer from the Accumulo translator (as in the Dynamic VDB defined above), the generated table will look like the one below

Setting the import property "ColumnNamePattern" to "{CQ}" generates the following (note the names of the columns)

Likewise, if the column name is defined by the Column Family, you can set "ColumnNamePattern" to "{CF}", and if the value for that column exists in the Column Qualifier, you can set "ValueIn" to "{CQ}". Using import properties, you can dictate how the table should be modeled.

If you did not use the built-in import (that is, neither Teiid Designer's Teiid Connection >> Source Model importer nor a Dynamic VDB), and would like to manually design the table in Designer like below

Then you must make sure you supply the Extension Metadata Properties defined above on the User table's columns from the Accumulo extended metadata (in Designer, right-click on the Model, select "Model Extension Definitions", and select Accumulo). For example, on the FirstName column, you would supply

and repeat for each and every column, so that Teiid knows how to communicate correctly with Accumulo.
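When the model is written as DDL rather than built in Designer, the same properties go into each column's OPTIONS clause. The sketch below only assembles that clause as text so the shape is visible. It is a hedged illustration: the helper class AccumuloColumnOptionsSketch is hypothetical, the family "name" and qualifier "firstname" are assumed sample values, and the exact clause should be checked against the DDL Metadata documentation.

```java
// Hypothetical helper that renders the teiid_accumulo extension properties
// for one column; it builds text only and does not talk to Teiid.
final class AccumuloColumnOptionsSketch {
    // Wraps a value in single quotes, doubling embedded quotes as in SQL.
    private static String quote(String literal) {
        return "'" + literal.replace("'", "''") + "'";
    }

    // CF is required; CQ and VALUE-IN are optional and omitted when empty.
    static String columnOptions(String cf, String cq, String valueIn) {
        if (cf == null || cf.isEmpty()) {
            throw new IllegalArgumentException("CF (Column Family) is required");
        }
        StringBuilder options = new StringBuilder("OPTIONS (");
        options.append("\"teiid_accumulo:CF\" ").append(quote(cf));
        if (cq != null && !cq.isEmpty()) {
            options.append(", \"teiid_accumulo:CQ\" ").append(quote(cq));
        }
        if (valueIn != null && !valueIn.isEmpty()) {
            options.append(", \"teiid_accumulo:VALUE-IN\" ").append(quote(valueIn));
        }
        return options.append(")").toString();
    }

    public static void main(String[] args) {
        // Assumed sample values for the FirstName column.
        System.out.println("FirstName string " + columnOptions("name", "firstname", "VALUE"));
    }
}
```

Leaving out CQ and VALUE-IN falls back to the defaults listed in the Extension Metadata Properties table, so only CF is strictly needed on every column.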

JCA Resource Adapter

The Teiid-specific Accumulo Resource Adapter should be used with this translator. See Accumulo Data Sources for connecting to an Accumulo source.

Native Queries

Currently this feature is not applicable. Based on user demand, Teiid could expose a way for users to submit a MapReduce job.

Direct Query Procedure

This feature is not applicable for this translator.
