JBoss.org Community Documentation

Chapter 6. Performance Tuning

6.1. Memory Management
6.1.1. Big Data/Memory
6.1.2. Disk Usage
6.2. Threading
6.3. Cache Tuning
6.4. Socket Transports
6.5. LOBs
6.6. Other Considerations

The BufferManager is responsible for tracking both memory and disk usage by Teiid. Configuring the BufferManager properly is one of the most important parts of ensuring high performance. See the <jboss-install>/server/<profile>/deploy/teiid/teiid-jboss-beans.xml file for all BufferManager settings.

The Teiid engine uses batching to reduce the number of rows held in memory at a given time. The batch sizes may be increased if few clients will be accessing the Teiid server simultaneously.

The maxReserveKb setting determines the total size in kilobytes of batches that can be held by the BufferManager in memory. This number does not account for persistent batches held by soft references (such as index pages) or weak references. The default value of -1 auto-calculates a typical maximum based upon the max heap available to the VM. The auto-calculated value assumes a 64-bit architecture and limits buffer usage to 50% of the first gigabyte of memory beyond the first 300 megabytes (which are assumed for use by the AS and other Teiid purposes) and 75% of the memory beyond that.
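The arithmetic behind the auto-calculated value can be illustrated with a minimal sketch. This is only a reading of the rule as stated above, not Teiid's actual implementation; the function name and the integer rounding are assumptions made for illustration.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def estimate_max_reserve_kb(max_heap_kb):
    """Approximate the auto-calculated maxReserveKb (hypothetical helper)."""
    # The first 300 MB are assumed to be used by the AS and other Teiid purposes.
    usable_kb = max(0, max_heap_kb - 300 * KB_PER_MB)
    # 50% of the first gigabyte beyond those 300 MB ...
    first_gb_kb = min(usable_kb, KB_PER_GB)
    # ... and 75% of the memory beyond that.
    remainder_kb = usable_kb - first_gb_kb
    return first_gb_kb // 2 + (remainder_kb * 3) // 4


if __name__ == "__main__":
    for heap_gb in (1, 4, 16):
        reserve_kb = estimate_max_reserve_kb(heap_gb * KB_PER_GB)
        print(f"{heap_gb} GB heap: about {reserve_kb} KB reserved")
```

Under these assumptions, a 4 GB max heap yields a reserve of 2,653,184 KB, or roughly 2.5 GB.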

If enabled, the BufferManager automatically triggers the use of a canonical value cache when more than 25% of the reserve is in use. This can dramatically cut memory usage in situations where similar value sets are being read through Teiid, but it does introduce a lookup cost. If you are processing small or highly similar datasets through Teiid and wish to conserve memory, you should consider enabling value caching.

Note

Memory consumption can be significantly more or less than the nominal target depending upon actual column values and whether value caching is enabled. Large non-built-in type objects can exceed their default size estimate. If out of memory errors occur, set a lower maxReserveKb value. Also note that source LOB values are held by memory references that are not cleared when a batch is persisted. With heavy LOB usage you should ensure that buffers or other memory associated with LOB references are appropriately sized.

The maxProcessingKb setting determines the total size in kilobytes of batches that can be used by active plans regardless of the memory held based on maxReserveKb. The default value of -1 auto-calculates a typical maximum based upon the max heap available to the VM and the max active plans. The auto-calculated value assumes a 64-bit architecture and limits processing batch usage to 10% of memory beyond the first 300 megabytes (which are assumed for use by the AS and other Teiid purposes).
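As a rough illustration of the 10% rule, the sketch below computes the overall processing budget for a given max heap. It is an approximation under stated assumptions: the function name is hypothetical, and the influence of max active plans is deliberately left out because the text does not specify how it enters the calculation.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def estimate_max_processing_kb(max_heap_kb):
    """Approximate the processing batch budget (hypothetical helper).

    Ignores the max active plans factor, which is not specified here.
    """
    # The first 300 MB are assumed to be used by the AS and other Teiid purposes.
    usable_kb = max(0, max_heap_kb - 300 * KB_PER_MB)
    # 10% of the remaining memory is available for processing batches.
    return usable_kb // 10


if __name__ == "__main__":
    print(estimate_max_processing_kb(4 * KB_PER_GB))
```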

In systems where large intermediate results are normal (scrolling cursors or sorting over millions of rows) you can consider increasing the maxProcessingKb and decreasing the maxReserveKb so that each request has access to an effectively smaller buffer space.

Each intermediate result buffer, temporary LOB, and temporary table is stored in its own set of buffer files, where an individual file is limited to maxFileSize megabytes. Consider increasing maxBufferSpace, the storage space available to all such files, if your installation makes use of internal materialization, makes heavy use of SQL/XML, or processes large row counts.

Usage of extremely large VM sizes and/or datasets requires additional considerations. Teiid has a non-negligible amount of overhead per batch/table page, on the order of 100-200 bytes. Depending on the data types involved, each full batch/table page will represent a variable number of rows (a power of two multiple above or below the processor batch size). If you are dealing with datasets with billions of rows and you run into OutOfMemory issues, consider increasing the processor batch size in the <jboss-install>/server/<profile>/deploy/teiid/teiid-jboss-beans.xml file to force the allocation of larger batches and table pages. If the processor batch size is increased and/or you are dealing with extremely wide result sets (several hundred columns), then the default setting of 8MB for the maxStorageObjectSize in the same file may be too low. The sizing for maxStorageObjectSize is in terms of serialized size, which will be much closer to the raw data size than the Java memory footprint estimation used for maxReserveKb. maxStorageObjectSize should not be set too large relative to memoryBufferSpace, since doing so reduces the performance of the memory buffer. The memory buffer supports only 1 concurrent writer for each maxStorageObjectSize of the memoryBufferSpace.
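The trade-off in the last sentence can be made concrete with a small sketch: the number of concurrent writers is the memory buffer size divided by maxStorageObjectSize. The helper name and the choice to express both values in megabytes are assumptions for illustration only, and the 1024 MB buffer is an arbitrary example value.

```python
def concurrent_writers(memory_buffer_space_mb, max_storage_object_size_mb):
    """Writers supported by the memory buffer (hypothetical helper).

    One concurrent writer is supported per maxStorageObjectSize of
    memoryBufferSpace; both arguments are taken in megabytes here.
    """
    if max_storage_object_size_mb <= 0:
        raise ValueError("max_storage_object_size_mb must be positive")
    return memory_buffer_space_mb // max_storage_object_size_mb


if __name__ == "__main__":
    # Example buffer of 1024 MB with the default 8 MB object size,
    # then with the object size raised to 64 MB.
    print(concurrent_writers(1024, 8))
    print(concurrent_writers(1024, 64))
```

With a 1024 MB buffer, raising maxStorageObjectSize from 8 MB to 64 MB drops the supported writers from 128 to 16, which is why the two settings should be sized together.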

The memoryBufferSpace setting controls the amount of on-heap or off-heap memory allocated as byte buffers for use by the Teiid buffer manager. This setting defaults to -1, which automatically determines a value based upon whether the buffer is on or off heap and the value of maxReserveKb.

You can take advantage of the buffer manager memory buffer to access system memory without allocating it to the heap. Setting memoryBufferOffHeap to true in <jboss-install>/server/<profile>/deploy/teiid/teiid-jboss-beans.xml will allocate the Teiid memory buffer off heap. Depending on whether your installation is dedicated to Teiid and the amount of system memory available, this may be preferable to on-heap allocation. The primary benefit is additional memory usage for Teiid without additional garbage collection tuning. This becomes especially important in situations where more than 32GB of memory is desired for the VM. Note that when using off-heap allocation, the memory must still be available to the Java process, and that setting the value of memoryBufferSpace too high may cause the VM to swap rather than reside in memory. With large off-heap buffer sizes (greater than several gigabytes) you may also need to adjust VM settings. For Sun VMs the relevant VM settings are MaxDirectMemorySize and UseLargePages. For example, adding:

-XX:MaxDirectMemorySize=12g -XX:+UseLargePages

to the VM process arguments would allow for an effective allocation of approximately 11GB for the Teiid memory buffer (the memoryBufferSpace setting), leaving room for any additional direct memory that may be needed by the AS or applications running in the AS.

Socket threads are configured for each transport. They handle NIO non-blocking IO operations as well as directly servicing any operation that can run without blocking. For longer running operations, the socket threads queue work with the query engine.

The query engine has several settings that determine its thread utilization. maxThreads sets the total number of threads available for query engine work (processing plans, transaction control operations, processing source queries, etc.). You should consider increasing the maximum threads on systems with a large number of available processors and/or when it is common to issue non-transactional queries that issue a large number of concurrent source requests. maxActivePlans, which should always be smaller than maxThreads, sets the number of the maxThreads that should be used for user query processing. Increasing maxActivePlans should be considered for workloads with a high number of long running queries and/or systems with a large number of available processors. If memory issues arise from increasing the max threads and the max active plans, then consider decreasing the processor/connector batch sizes to limit the base number of memory rows consumed by each plan. userRequestSourceConcurrency, which should always be smaller than maxThreads, sets the number of concurrently executing source queries per user request. Setting this value to 1 forces serial execution of all source queries by the processing thread. The default value is computed as 2*maxThreads/maxActivePlans. Using the respective default values, this means that each user request would be allowed 6 concurrently executing source queries. If the default calculated value is not applicable to your workload, for example if you have queries that generate more concurrent long running source queries, you should adjust this value.
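The default for userRequestSourceConcurrency can be reproduced with a short sketch. The defaults of 64 for maxThreads and 20 for maxActivePlans are assumptions not stated in this chapter (check teiid-jboss-beans.xml for the values in your installation), and the use of integer division is likewise assumed; the function name is hypothetical.

```python
def default_source_concurrency(max_threads, max_active_plans):
    """Compute 2*maxThreads/maxActivePlans (hypothetical helper)."""
    if max_threads <= 0 or max_active_plans <= 0:
        raise ValueError("both settings must be positive")
    # Integer division is assumed, since the setting is a whole number.
    return (2 * max_threads) // max_active_plans


if __name__ == "__main__":
    # Assumed defaults: maxThreads=64, maxActivePlans=20.
    print(default_source_concurrency(64, 20))
```

With the assumed defaults this evaluates to 6, matching the figure given above. Raising maxActivePlans without raising maxThreads lowers the per-request concurrency accordingly.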

Caching can be tuned for cached results (including user query results and procedure results) and prepared plans (including user and stored procedure plans). Even though it is possible to disable or otherwise severely constrain these caches, this would probably never be done in practice as it would lead to poor performance.

Cache statistics can be obtained through the Admin Console or AdminShell. The statistics can be used to help tune cache parameters and ensure a good hit ratio.

Plans are currently fully held in memory and may have a significant memory footprint. When making extensive use of prepared statements and/or virtual procedures, the size of the plan cache may be increased in proportion to the number of gigabytes intended for use by Teiid.

While the result cache parameters control the cache result entries (max number, eviction, etc.), the result batches themselves are accessed through the BufferManager. If the size of the result cache is increased, you may need to tune the BufferManager configuration to ensure there is enough buffer space.

Result set and prepared plan caches have their entries invalidated by data and metadata events. By default these events are captured by running commands through Teiid. See the Developer's Guide for further customization. Teiid stores compiled forms of update plans or trigger actions with the prepared plan, so it is recommended to leave the maxStaleness of the prepared plan cache set to 0 so that metadata changes, for example disabling a trigger, may take effect immediately. The default staleness for result set caching is 60 seconds to improve efficiency with rapidly changing sources. Consider decreasing this value to make the result set cache more consistent with the underlying data. Even with a setting of 0, full transactional consistency is not guaranteed.

Teiid separates the configuration of its socket transports for JDBC, ODBC, and Admin access. Typical installations will not need to adjust the default thread and buffer size settings. The default input and output buffer sizes are set to 0, which will use the system default. Before adjusting these values, keep in mind that each JDBC, ODBC, and Admin client will create a new socket connection. Setting these values to a large buffer size should only be done if the number of clients is constrained. All JDBC/ODBC socket operations are non-blocking, so setting the number of maxThreads higher than the maximum effective parallelism of the machine should not result in greater performance. The default value of 0 for JDBC socket threads will set the max to the number of available processors.

LOBs and XML documents are streamed from the Teiid Server to the Teiid JDBC API. Normally, these values are not materialized in the server memory, which avoids potential out-of-memory issues. When using style sheets or XQuery, whole XML documents must be materialized on the server. Even when using the XMLQuery or XMLTable functions with document projection applied, memory issues may occur for large documents.

LOBs are broken into pieces when being created and streamed. The maximum size of each piece when fetched by the client can be configured with the "lobChunkSizeInKB" property in the <jboss-install>/server/<profile>/deploy/teiid/teiid-jboss-beans.xml file. The default value is 100 KB. When dealing with extremely large LOBs, you may consider increasing this value to decrease the number of round trips needed to stream the result. Setting the value too high may cause the server or client to have memory issues.
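To gauge the effect of the chunk size, the following sketch estimates the number of fetches needed to stream a LOB. It assumes one round trip per chunk and ignores any protocol overhead; the function name is hypothetical and the LOB sizes are example values.

```python
def lob_round_trips(lob_size_kb, chunk_size_kb=100):
    """Estimate chunk fetches needed to stream a LOB (hypothetical helper).

    Assumes one round trip per chunk; chunk_size_kb mirrors the
    lobChunkSizeInKB property, with its 100 KB default.
    """
    if chunk_size_kb <= 0:
        raise ValueError("chunk_size_kb must be positive")
    if lob_size_kb < 0:
        raise ValueError("lob_size_kb must not be negative")
    # Ceiling division: a final partial chunk still costs one round trip.
    return (lob_size_kb + chunk_size_kb - 1) // chunk_size_kb


if __name__ == "__main__":
    one_gb_in_kb = 1024 * 1024
    print(lob_round_trips(one_gb_in_kb))        # default 100 KB chunks
    print(lob_round_trips(one_gb_in_kb, 1024))  # 1 MB chunks
```

Under these assumptions, a 1 GB LOB takes 10,486 fetches at the default chunk size and 1,024 fetches with 1 MB chunks, at the cost of larger per-chunk buffers on both server and client.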

Source LOB values are typically accessed by reference, rather than having the value copied to a temporary location. Thus care must be taken to ensure that source LOBs are returned in a memory-safe manner.

When using Teiid in a development environment, you may consider setting the maxSourceRows property in the <jboss-install>/server/<profile>/deploy/teiid/teiid-jboss-beans.xml file to a reasonably small value (e.g. 10000) to prevent large amounts of data from being pulled from sources. Leaving exceptionOnMaxSourceRows set to true will alert the developer through an exception that an attempt was made to retrieve more than the specified number of rows.