Deprecation

This document is DEPRECATED.

Please consider any information here as out of date. DO NOT use this document.

Instead, refer to http://infinispan.org/documentation.

Please update your bookmarks accordingly.


Introduction

A cache loader is Infinispan's connection to a (persistent) data store. The cache loader fetches data from the store when that data is not in the cache, and when data in the cache is modified, the cache loader is called to write those modifications back to the store. Cache loaders are associated with individual caches, i.e. different caches from the same cache manager might have different cache store configurations.

Configuration

Cache loaders can be configured in a chain. Cache read operations consult the cache loaders in the order in which they were configured until one of them returns a valid, non-null element of data. Write operations are applied to all cache loaders, except those for which ignoreModifications has been set to true. See the configuration section below for details.

  • passivation (false by default) has a significant impact on how Infinispan interacts with the loaders, and is discussed in the next paragraph.
  • shared (false by default) indicates that the cache loader is shared among different cache instances, for example where all instances in a cluster use the same JDBC settings to talk to the same remote, shared database. Setting this to true prevents repeated and unnecessary writes of the same data to the cache loader by different cache instances.
  • preload (false by default) if true, data stored in the cache loader is pre-loaded into memory when the cache starts. This is particularly useful when data in the cache loader is needed immediately after startup and you want to avoid cache operations being delayed by loading this data lazily. It can be used to provide a 'warm cache' on startup; however, there is a performance penalty, as startup time is affected by this process. Note that preloading is done locally, so any data loaded is only stored in the local node. No replication or distribution of the preloaded data happens. Also, Infinispan only preloads up to the maximum number of entries configured for eviction.
  • class attribute (mandatory) defines the class of the cache loader implementation.
  • fetchPersistentState (false by default) determines whether or not to fetch the persistent state of a cache when joining a cluster. The aim here is to take the persistent state of a cache and apply it to the local cache store of the joining node. Hence, if the cache store is configured as shared, fetchPersistentState is ignored, since all caches access the same cache store. Only one configured cache loader may set this property to true; if more than one cache loader does so, a configuration exception will be thrown when starting your cache service.
  • purgeSynchronously (false by default) controls whether expiration purging takes place in the eviction thread: if set to true, the eviction thread blocks until purging is finished; otherwise it returns immediately. If the cache loader supports multi-threaded purge, then purgeThreads (1 by default) threads are used for purging expired entries. Some cache loaders support multi-threaded purge (e.g. FileCacheStore) and some don't (e.g. JDBCCacheStore); check the configuration of the actual cache loader to find out.
  • ignoreModifications (false by default) determines whether write methods are pushed down to the specific cache loader. Situations may arise where transient application data should only reside in a file based cache loader on the same server as the in-memory cache, for example, with a further JDBC based cache loader used by all servers in the network. This feature allows you to write to the 'local' file cache loader but not the shared JDBCCacheLoader.
  • purgeOnStartup empties the specified cache loader (if ignoreModifications is false) when the cache loader starts up.
  • additional attributes configure aspects specific to each cache loader, e.g. the location attribute in the previous example refers to where the FileCacheStore will keep the files that contain data. Other loaders, with more complex configuration, also introduce additional sub-elements to the basic configuration. See for example the JDBC cache store configuration examples below.
  • singletonStore (default for enabled is false) element enables modifications to be stored by only one node in the cluster, the coordinator. Essentially, whenever any data comes in to some node, it is always replicated (or distributed) so as to keep the caches' in-memory states in sync; the coordinator, though, has the sole responsibility of pushing that state to disk. This functionality can be activated by setting the enabled attribute to true on all nodes, but again only the coordinator of the cluster will store the modifications in the underlying cache loader as defined in the loader element. You cannot define a cache loader as shared and enable singletonStore at the same time.
  • pushStateWhenCoordinator (true by default) If true, when a node becomes the coordinator, it will transfer in-memory state to the underlying cache loader. This can be very useful in situations where the coordinator crashes and a new coordinator is elected.
  • async element has to do with cache store persisting data (a)synchronously to the actual store. It is discussed in detail here.
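The read and write rules for a chain of cache loaders can be illustrated with a small, self-contained sketch. It is not Infinispan code: SketchLoader and LoaderChainSketch are hypothetical stand-ins that only model the ordering of reads and the effect of ignoreModifications described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one configured cache loader; not an Infinispan class.
final class SketchLoader {
    final String name;
    final boolean ignoreModifications;
    final Map<String, String> data = new HashMap<>();

    SketchLoader(String name, boolean ignoreModifications) {
        this.name = name;
        this.ignoreModifications = ignoreModifications;
    }
}

// Hypothetical model of a loader chain.
final class LoaderChainSketch {
    private final List<SketchLoader> chain = new ArrayList<>();

    void add(SketchLoader loader) {
        chain.add(loader);
    }

    // Reads consult the loaders in configuration order and stop at the first non-null value.
    String load(String key) {
        for (SketchLoader loader : chain) {
            String value = loader.data.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Writes go to every loader except those flagged with ignoreModifications.
    void store(String key, String value) {
        for (SketchLoader loader : chain) {
            if (!loader.ignoreModifications) {
                loader.data.put(key, value);
            }
        }
    }

    public static void main(String[] args) {
        SketchLoader file = new SketchLoader("file", false);
        SketchLoader jdbc = new SketchLoader("jdbc", true); // read-only from this node's view
        jdbc.data.put("shared", "fromJdbc");

        LoaderChainSketch chain = new LoaderChainSketch();
        chain.add(file);
        chain.add(jdbc);
        chain.store("k", "v");

        System.out.println(chain.load("k"));            // found in the first loader
        System.out.println(chain.load("shared"));       // falls through to the second loader
        System.out.println(jdbc.data.containsKey("k")); // the write skipped the jdbc loader
    }
}
```

This mirrors the 'local file loader plus shared JDBC loader' scenario from the ignoreModifications bullet: the node writes only to its local loader but can still read what the shared loader holds.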

Cache Passivation

A cache loader can be used to enforce entry passivation and activation on eviction in a cache. Cache passivation is the process of removing an object from the in-memory cache and writing it to a secondary data store (e.g., file system, database) on eviction. Cache activation is the process of restoring an object from the data store into the in-memory cache when it is needed. In both cases, the configured cache loader is used to read from and write to the data store.

When the eviction policy in effect evicts an entry from the cache and passivation is enabled, a notification that the entry is being passivated is emitted to the cache listeners and the entry is stored. When a user attempts to retrieve an entry that was evicted earlier, the entry is (lazily) loaded from the cache loader into memory. When the entry and its children have been loaded, they're removed from the cache loader and a notification is emitted to the cache listeners that the entry has been activated. To enable passivation, just set passivation to true (false by default). When passivation is used, only the first cache loader configured is used and all others are ignored.

Cache Loader Behavior with Passivation Disabled vs Enabled

When passivation is disabled, whenever an element is modified, added or removed, that modification is persisted in the backend store via the cache loader. There is no direct relationship between eviction and cache loading. If you don't use eviction, what's in the persistent store is basically a copy of what's in memory. If you do use eviction, what's in the persistent store is basically a superset of what's in memory (i.e. it includes entries that have been evicted from memory).

When passivation is enabled, there is a direct relationship between eviction and the cache loader. Writes to the persistent store via the cache loader only occur as part of the eviction process. Data is deleted from the persistent store when the application reads it back into memory. In this case, what's in memory and what's in the persistent store are two subsets of the total information set, with no intersection between the subsets.

Following is a simple example, showing what state is in RAM and in the persistent store after each step of a six-step process:

  1. Insert keyOne
  2. Insert keyTwo
  3. Eviction thread runs, evicts keyOne
  4. Read keyOne
  5. Eviction thread runs, evicts keyTwo
  6. Remove keyTwo

When passivation is disabled:

  1. Memory: keyOne; Disk: keyOne
  2. Memory: keyOne, keyTwo; Disk: keyOne, keyTwo
  3. Memory: keyTwo; Disk: keyOne, keyTwo
  4. Memory: keyOne, keyTwo; Disk: keyOne, keyTwo
  5. Memory: keyOne; Disk: keyOne, keyTwo
  6. Memory: keyOne; Disk: keyOne

When passivation is enabled:

  1. Memory: keyOne; Disk: (empty)
  2. Memory: keyOne, keyTwo; Disk: (empty)
  3. Memory: keyTwo; Disk: keyOne
  4. Memory: keyOne, keyTwo; Disk: (empty)
  5. Memory: keyOne; Disk: keyTwo
  6. Memory: keyOne; Disk: (empty)
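The two listings can be reproduced with a toy model. The sketch below is not Infinispan code; PassivationSketch is a hypothetical class that keeps two sets, one for memory and one for the store, and applies the rules described in this section so that the six steps can be replayed with passivation disabled and enabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model of one cache and one store; not Infinispan code.
final class PassivationSketch {
    private final boolean passivation;
    private final Set<String> memory = new TreeSet<>();
    private final Set<String> disk = new TreeSet<>();

    PassivationSketch(boolean passivation) {
        this.passivation = passivation;
    }

    void insert(String key) {
        memory.add(key);
        if (!passivation) {
            disk.add(key); // without passivation every modification is persisted
        }
    }

    void evict(String key) {
        boolean wasInMemory = memory.remove(key);
        if (wasInMemory && passivation) {
            disk.add(key); // passivation: the store is written only on eviction
        }
    }

    void read(String key) {
        if (!memory.contains(key) && disk.contains(key)) {
            memory.add(key); // lazy load from the store
            if (passivation) {
                disk.remove(key); // activation: the store copy is deleted
            }
        }
    }

    void remove(String key) {
        memory.remove(key);
        disk.remove(key);
    }

    String state() {
        return "Memory: " + memory + " Disk: " + disk;
    }

    // Replays the six steps of the example and records the state after each one.
    static List<String> trace(boolean passivation) {
        PassivationSketch cache = new PassivationSketch(passivation);
        List<String> states = new ArrayList<>();
        cache.insert("keyOne");
        states.add(cache.state());
        cache.insert("keyTwo");
        states.add(cache.state());
        cache.evict("keyOne");
        states.add(cache.state());
        cache.read("keyOne");
        states.add(cache.state());
        cache.evict("keyTwo");
        states.add(cache.state());
        cache.remove("keyTwo");
        states.add(cache.state());
        return states;
    }

    public static void main(String[] args) {
        System.out.println("Passivation disabled:");
        trace(false).forEach(System.out::println);
        System.out.println("Passivation enabled:");
        trace(true).forEach(System.out::println);
    }
}
```

With passivation enabled the two sets never overlap, which is the "two subsets with no intersection" property stated above; with passivation disabled the store set is always a superset of the memory set.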

File system based cache loaders

Infinispan ships with several cache loaders that utilize the file system as a data store. They all require a location attribute, which maps to a directory to be used as a persistent store (e.g., location="/tmp/myDataStore").

  • FileCacheStore: a simple filesystem-based implementation. Usage on shared filesystems like NFS, Windows shares, etc. should be avoided as these do not implement proper file locking and can cause data corruption. File systems are inherently not transactional, so when attempting to use your cache in a transactional context, failures when writing to the file (which happens during the commit phase) cannot be recovered. Please visit the file cache store configuration documentation for more information on the configurable parameters of this store.
  • BdbjeCacheStore: a cache loader implementation based on Oracle/Sleepycat's BerkeleyDB Java Edition.
  • JdbmCacheStore: a cache loader implementation based on the JDBM engine, a fast and free alternative to BerkeleyDB.
  • LevelDBCacheStore: a cache store implementation based on Google's LevelDB, a fast key-value store.
  • SingleFileCacheStore: starting with Infinispan 6.0, a new filesystem-based cache store implementation is available that requires no extra dependencies. It was created because FileCacheStore didn't perform as expected and caused issues due to the number of files it created. The single-file cache store vastly outperforms the existing FileCacheStore and in some cases, particularly when reading from the store, it even outperforms the LevelDB based cache store. One key aspect of this cache store is that it keeps keys in memory, along with the location of each value in the file. This offers great speed improvements but results in extra memory consumption. To limit this, the cache store's maximum size can be set, but this will only work as expected in a very limited set of use cases. For more detailed information, check the single-file cache store configuration javadoc.

Note that the BerkeleyDB implementation requires a commercial license if distributed with an application (see http://www.oracle.com/database/berkeley-db/index.html for details).

For a detailed description of all the parameters supported by the stores, please consult the javadoc.

JDBC based cache loaders

Based on the type of keys to be persisted, there are three JDBC cache loaders:

  • JdbcBinaryCacheStore - can store any type of keys. It stores all the keys that have the same hash value (hashCode method on key) in the same table row/blob, with the hash value as the primary key. While this offers great flexibility (it can store any key type), it impacts concurrency/throughput. For example, if keys k1, k2 and k3 have the same hash code, they are persisted in the same table row. If three threads then try to concurrently update k1, k2 and k3 respectively, they have to do so sequentially, since all three threads update the same row.
  • JdbcStringBasedCacheStore - stores each key in its own row, increasing throughput under concurrent load. In order to store each key in its own row, it relies on a (pluggable) bijection that maps each key to a String object. The bijection is defined by the Key2StringMapper interface. Infinispan ships a default implementation (smartly named DefaultKey2StringMapper) that knows how to handle primitive types.
  • JdbcMixedCacheStore - it is a hybrid implementation that, based on the key type, delegates to either JdbcBinaryCacheStore or JdbcStringBasedCacheStore.
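The difference between the two row layouts can be sketched without a database. The code below is only an illustration: JdbcRowLayoutSketch is a hypothetical class that groups keys the way the bullets above describe (one row per hash value versus one row per key), and the string form of a key is simply String.valueOf(key), which is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration of the two row layouts; hypothetical, no JDBC involved.
final class JdbcRowLayoutSketch {

    // Binary layout: the primary key is the hash code, so one row (bucket)
    // holds every key that shares that hash code.
    static Map<Integer, List<Object>> binaryRows(List<?> keys) {
        Map<Integer, List<Object>> rows = new HashMap<>();
        for (Object key : keys) {
            rows.computeIfAbsent(key.hashCode(), h -> new ArrayList<>()).add(key);
        }
        return rows;
    }

    // String-based layout: the primary key is the string form of the key,
    // so every key gets its own row.
    static Map<String, Object> stringRows(List<?> keys) {
        Map<String, Object> rows = new HashMap<>();
        for (Object key : keys) {
            rows.put(String.valueOf(key), key);
        }
        return rows;
    }

    public static void main(String[] args) {
        // "Aa", "BB" and "C#" are three distinct strings with the same hashCode().
        List<String> keys = Arrays.asList("Aa", "BB", "C#");
        System.out.println("binary rows: " + binaryRows(keys).size());
        System.out.println("string rows: " + stringRows(keys).size());
    }
}
```

Three concurrent updates to these keys would contend for a single row in the first layout and touch three independent rows in the second, which is the throughput difference the bullets describe.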

Which JDBC cache loader should I use?

It is generally preferable to use JdbcStringBasedCacheStore when you are in control of the key types, as it offers better throughput under heavy load. One scenario in which it cannot be used, though, is when you can't write a Key2StringMapper to map the keys to string objects (e.g. when you don't have control over the types of the keys, for whatever reason). In that case, use either JdbcBinaryCacheStore or JdbcMixedCacheStore. The latter is preferred to the former when the majority of the keys are handled by JdbcStringBasedCacheStore, but you still have some keys you cannot convert through Key2StringMapper.
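As a rough illustration of what such a key-to-string bijection involves, here is a self-contained sketch. SketchKeyMapper is a hypothetical interface defined only for this example; it is not the real Key2StringMapper, whose method names and contract should be taken from the javadoc. UserKey is likewise an invented key type. The chooseStore method models the delegation decision attributed to JdbcMixedCacheStore above.

```java
import java.util.Objects;

// Invented composite key type used only for this example.
final class UserKey {
    final String region;
    final long id;

    UserKey(String region, long id) {
        this.region = region;
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof UserKey)) {
            return false;
        }
        UserKey that = (UserKey) other;
        return id == that.id && region.equals(that.region);
    }

    @Override
    public int hashCode() {
        return Objects.hash(region, id);
    }
}

// Hypothetical stand-in for a key-to-string mapper; NOT the Infinispan interface.
interface SketchKeyMapper {
    boolean isSupportedType(Class<?> keyType);
    String toStringKey(Object key);
    Object fromStringKey(String stringKey);
}

final class UserKeyMapper implements SketchKeyMapper {
    @Override
    public boolean isSupportedType(Class<?> keyType) {
        return keyType == UserKey.class;
    }

    @Override
    public String toStringKey(Object key) {
        UserKey userKey = (UserKey) key;
        return userKey.region + ":" + userKey.id;
    }

    @Override
    public Object fromStringKey(String stringKey) {
        // The id never contains ':', so splitting at the last ':' is unambiguous.
        int separator = stringKey.lastIndexOf(':');
        return new UserKey(stringKey.substring(0, separator),
                Long.parseLong(stringKey.substring(separator + 1)));
    }

    // Models the mixed store's choice: string-based when the key can be mapped, binary otherwise.
    static String chooseStore(SketchKeyMapper mapper, Object key) {
        return mapper.isSupportedType(key.getClass())
                ? "JdbcStringBasedCacheStore"
                : "JdbcBinaryCacheStore";
    }

    public static void main(String[] args) {
        UserKeyMapper mapper = new UserKeyMapper();
        UserKey key = new UserKey("eu", 42L);
        String asString = mapper.toStringKey(key);
        System.out.println(asString);
        System.out.println(key.equals(mapper.fromStringKey(asString)));
        System.out.println(chooseStore(mapper, key));
        System.out.println(chooseStore(mapper, new Object()));
    }
}
```

The mapping must be reversible and must give distinct strings for distinct keys; otherwise two keys would end up in the same row and overwrite each other.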

Connection management (pooling)

In order to obtain a connection to the database, all the JDBC cache loaders rely on a ConnectionFactory implementation. The connection factory is specified programmatically using one of the connectionPool(), dataSource() or simpleConnection() methods on the JdbcBinaryCacheStoreConfigurationBuilder class, or declaratively using one of the <connectionPool />, <dataSource /> or <simpleConnection /> elements. Infinispan ships with three ConnectionFactory implementations:

  • PooledConnectionFactory is a factory based on C3P0. Refer to javadoc for details on configuring it.
  • ManagedConnectionFactory is a connection factory that can be used within managed environments, such as application servers. It knows how to look into the JNDI tree at a certain (configurable) location and delegate connection management to the DataSource. Refer to the javadoc for details on how this can be configured.
  • SimpleConnectionFactory is a factory implementation that creates a database connection on a per-invocation basis. Not recommended in production.

The PooledConnectionFactory is generally recommended for stand-alone deployments (i.e. not running within an application server or servlet container). ManagedConnectionFactory can be used when running in a managed environment where a DataSource is present, so that connection pooling is performed within the DataSource.

Sample configurations

Below is a sample configuration for the JdbcBinaryCacheStore. For a detailed description of all the parameters used, refer to the javadoc. Please note the use of multiple XML schemas, since each cache store has its own schema.

Below is a sample configuration for the JdbcStringBasedCacheStore. For a detailed description of all the parameters used, refer to the javadoc.

Below is a sample configuration for the JdbcMixedCacheStore. For a detailed description of all the parameters used, refer to the javadoc.

Finally, below is an example of a JDBC cache store with a managed connection factory, which is chosen implicitly by specifying a datasource JNDI location:

Apache Derby users
If you're connecting to an Apache Derby database, make sure you set dataColumnType to BLOB:

Cloud cache loader

The CloudCacheStore implementation utilizes JClouds to communicate with cloud storage providers such as Amazon's S3, Rackspace's Cloudfiles or any other such provider supported by JClouds. If you're planning to use Amazon S3 for storage, consider using it with Infinispan. Infinispan itself provides in-memory caching for your data to minimize the number of remote access calls, thus reducing the latency and cost of fetching your Amazon S3 data. With cache replication, you are also able to load data from your local cluster without having to access it remotely every time. Note that Amazon S3 does not support transactions. If transactions are used in your application, there is some possibility of state inconsistency when using this cache loader. However, writes are atomic, in that if a write fails nothing is considered written and data is never corrupted. For a list of configuration options, refer to the javadoc.

Remote cache loader

The RemoteCacheStore is a cache loader implementation that stores data in a remote Infinispan cluster. In order to communicate with the remote cluster, the RemoteCacheStore uses the HotRod client/server architecture. HotRod brings load balancing and fault tolerance of calls, and the possibility to fine-tune the connection between the RemoteCacheStore and the actual cluster. Please refer to HotRod for more information on the protocol, client and server configuration. For a list of RemoteCacheStore configuration options, refer to the javadoc. Example:

In this sample configuration, the remote cache store is configured to use the remote cache named "mycache" on servers "one" and "two". It also configures connection pooling and provides a custom transport executor. Additionally the cache store is asynchronous.

Cassandra cache loader

The CassandraCacheStore was introduced in Infinispan 4.2. Read the specific page for details on implementation and configuration.

Cluster cache loader

The ClusterCacheLoader is a cache loader implementation that retrieves data from other cluster members.

It is a cache loader only, as it doesn't persist anything (it is not a Store); therefore, features like fetchPersistentState (and the like) are not applicable.

A cluster cache loader can be used as a non-blocking (partial) alternative to stateTransfer : keys not already available in the local node are fetched on-demand from other nodes in the cluster. This is a kind of lazy-loading of the cache content.
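The on-demand behaviour can be pictured with a toy model. The sketch below is not the ClusterCacheLoader implementation: SketchNode is a hypothetical class, peers are plain in-process objects rather than remote cluster members, and keeping the fetched value in the local map is an assumption made to show the lazy-loading effect.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a node that lazily fetches missing keys from its peers; hypothetical.
final class SketchNode {
    final Map<String, String> local = new HashMap<>();
    final List<SketchNode> peers = new ArrayList<>();

    String get(String key) {
        String value = local.get(key);
        if (value != null) {
            return value; // already available locally, no remote lookup
        }
        for (SketchNode peer : peers) {
            value = peer.local.get(key);
            if (value != null) {
                local.put(key, value); // assumption: the fetched entry is kept in local memory
                return value;
            }
        }
        return null; // nothing is persisted anywhere, so there is no further fallback
    }

    public static void main(String[] args) {
        SketchNode existing = new SketchNode();
        existing.local.put("k", "v");

        SketchNode joiner = new SketchNode();
        joiner.peers.add(existing);

        System.out.println(joiner.local.containsKey("k")); // not transferred up front
        System.out.println(joiner.get("k"));                // fetched on first access
        System.out.println(joiner.local.containsKey("k")); // now held locally
    }
}
```

The joining node starts empty and only pays for the keys it actually reads, which is why this works as a partial alternative to state transfer.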

For a list of ClusterCacheLoader configuration options, refer to the javadoc.

Note: The ClusterCacheLoader does not support preloading (preload=true). It also won't provide state if fetchPersistentState=true.

JPA cache store

The implementation depends on the JPA 2.0 specification to access the entity metamodel.

In normal use cases, it's recommended to leverage Infinispan for JPA second level cache and/or query cache.  However, if you'd like to use only Infinispan API and you want Infinispan to persist into a cache store using a common format (e.g., a database with well defined schema), then JPA Cache Store could be right for you.

When using JPA Cache Store, the key should be the ID of the entity, while the value should be the entity object.  Only a single @Id or @EmbeddedId annotated property is allowed.  Auto-generated ID is not supported.  Lastly, all entries will be stored as immortal entries.

Sample Usage

For example, given a persistence unit "myPersistenceUnit", and a JPA entity User:

persistence.xml

User entity class (see test for full example)

Then you can configure a cache "usersCache" to use JPA Cache Store, so that when you put data into the cache, the data would be persisted into the database based on JPA configuration.

Normally a single Infinispan cache can store multiple types of key/value pairs, for example:

It's important to note that, when a cache is configured to use a JPA Cache Store, that cache would only be able to store ONE type of data. 

Use of @EmbeddedId is supported so that you can also use composite keys (see the test code for full example).

Lastly, auto-generated IDs (e.g., @GeneratedValue) are not supported. When putting things into the cache with a JPA cache store, the key should be the ID value!

Configuration

Sample Programmatic Configuration

Parameters:

  • persistenceUnitName: JPA persistence unit name in the JPA configuration (persistence.xml) that contains the JPA entity class.
  • entityClass: JPA entity class that is expected to be stored in this cache. Only one class is allowed.

Sample XML Configuration

Parameters:

  • persistenceUnitName: JPA persistence unit name in the JPA configuration (persistence.xml) that contains the JPA entity class.
  • entityClassName: Fully qualified JPA entity class name that is expected to be stored in this cache. Only one class is allowed.

Additional References

Refer to the test case for code samples in action.

Refer to test configurations for configuration samples.

LevelDB cache store

LevelDB is a fast, filesystem-based key-value storage library written at Google. The LevelDB cache store currently uses a Java implementation. It may be possible to use a JNI implementation in the future.

Sample Usage

The LevelDB cache store requires two filesystem directories to be configured, each holding one LevelDB database. One location is used to store non-expired data, while the second location is used to store expired keys pending purge.

EmbeddedCacheManager cacheManager = ...;

Configuration

Sample Programmatic Configuration

Parameters:

  • location: Directory in which LevelDB stores the primary cache store data. The directory will be auto-created if it does not exist.
  • expiredLocation: Directory in which LevelDB stores expiring data pending permanent purge. The directory will be auto-created if it does not exist.
  • expiryQueueSize: Size of the in-memory queue that holds expiring entries before they are flushed into the expired LevelDB store.
  • clearThreshold: There are two methods to clear all entries in LevelDB. One method is to iterate through all entries and remove each entry individually. The other method is to delete the database and re-initialize it. For smaller databases, deleting individual entries is faster than the latter method. This setting is the maximum number of entries allowed before the latter method is used.
  • compressionType: LevelDB data compression setting; see the CompressionType enum for options.
  • blockSize: LevelDB setting; see the LevelDB documentation for performance tuning.
  • cacheSize: LevelDB setting; see the LevelDB documentation for performance tuning.
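The clearThreshold choice between the two clearing methods can be sketched as follows. This is not the LevelDB cache store's code: ClearThresholdSketch is a hypothetical class that uses an in-memory TreeMap in place of a LevelDB database, and treating the threshold as inclusive is an assumption.

```java
import java.util.Iterator;
import java.util.TreeMap;

// Hypothetical model of the clearThreshold decision; a TreeMap stands in for the database.
final class ClearThresholdSketch {
    private TreeMap<String, String> db = new TreeMap<>();
    private final int clearThreshold;

    ClearThresholdSketch(int clearThreshold) {
        this.clearThreshold = clearThreshold;
    }

    void put(String key, String value) {
        db.put(key, value);
    }

    int size() {
        return db.size();
    }

    // Returns the name of the method that was used to clear the data.
    String clear() {
        if (db.size() <= clearThreshold) { // assumption: the threshold itself still counts as "small"
            Iterator<String> keys = db.keySet().iterator();
            while (keys.hasNext()) {
                keys.next();
                keys.remove(); // remove each entry individually
            }
            return "deleted-individually";
        }
        db = new TreeMap<>(); // stands in for deleting the database and re-initializing it
        return "reinitialized";
    }

    public static void main(String[] args) {
        ClearThresholdSketch small = new ClearThresholdSketch(3);
        small.put("a", "1");
        System.out.println(small.clear());

        ClearThresholdSketch large = new ClearThresholdSketch(3);
        for (int i = 0; i < 10; i++) {
            large.put("k" + i, "v");
        }
        System.out.println(large.clear());
    }
}
```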

Sample XML Configuration

Additional References

Refer to the test case for code samples in action.

Refer to test configurations for configuration samples.

Cache Loaders and transactional caches

When a cache is transactional and a cache loader is present, the cache loader is not enlisted in the transaction in which the cache takes part. This means that inconsistencies are possible at the cache loader level: the transaction may succeed in applying the in-memory state but (partially) fail to apply the changes to the store. Manual recovery does not work with cache stores.
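The kind of inconsistency described here can be simulated in a few lines. The sketch below does not show how Infinispan commits transactions; FailingStoreSketch and NonEnlistedStoreSketch are hypothetical classes, and the store failure is injected artificially for one key to show a commit whose in-memory part succeeds while the store part only partially succeeds.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical store that fails for one chosen key.
final class FailingStoreSketch {
    final Map<String, String> rows = new HashMap<>();
    private final String failingKey;

    FailingStoreSketch(String failingKey) {
        this.failingKey = failingKey;
    }

    void write(String key, String value) {
        if (key.equals(failingKey)) {
            throw new IllegalStateException("simulated store failure for " + key);
        }
        rows.put(key, value);
    }
}

final class NonEnlistedStoreSketch {

    // Applies the pending changes to memory, then to the store, with no rollback
    // across the two. Returns true only if every store write succeeded.
    static boolean commit(Map<String, String> pending,
                          Map<String, String> memory,
                          FailingStoreSketch store) {
        memory.putAll(pending); // the in-memory state is applied
        boolean storeFullyWritten = true;
        for (Map.Entry<String, String> change : pending.entrySet()) {
            try {
                store.write(change.getKey(), change.getValue());
            } catch (IllegalStateException e) {
                storeFullyWritten = false; // memory already holds this change
            }
        }
        return storeFullyWritten;
    }

    public static void main(String[] args) {
        Map<String, String> pending = new LinkedHashMap<>();
        pending.put("k1", "v1");
        pending.put("k2", "v2");

        Map<String, String> memory = new HashMap<>();
        FailingStoreSketch store = new FailingStoreSketch("k2");

        System.out.println("store fully written: " + commit(pending, memory, store));
        System.out.println("memory keys: " + memory.keySet().size());
        System.out.println("store keys: " + store.rows.keySet().size());
    }
}
```

After the simulated commit, memory holds both entries while the store holds only one, and nothing in this flow records which store writes need to be retried.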

MongoDB cache loader

The MongoDB cache store was released with Infinispan 5.3, a.k.a. "Tactical Nuclear Penguin".

To communicate with the MongoDB server instance, the cache store uses the official Java driver, version 2.10.1.

To configure the cache store, you just need to add a new entry to the loaders section.

Here is an example for your xml configuration file:

If you prefer the programmatic API here is a snippet:

The connection section contains the information needed to connect to the MongoDB server instance.
The authentication section is optional; it allows you to specify a username and password if you use authentication.
The storage section specifies where the data will be stored.

Properties by section (property: usage, with the default value where one is given):

connection
  • host: The hostname of the server on which MongoDB is running. Default: localhost
  • port: The port used by the MongoDB server instance. Default: 27017
  • timeout: The timeout used by the MongoDB driver when connecting, in ms. Default: 2000
  • acknoledgement: The value used to configure the acknowledgment for write operations (-1 / 0 / 1 / 2+). Default: 1

authentication
  • username: The username used for authentication with the MongoDB server.
  • password: The password used for authentication with the MongoDB server.

storage
  • database: The database used to store elements.
  • collection: The collection which will contain the elements.

For more information about the usage of these configuration properties, refer to the official MongoDB Java driver documentation.

  1. Jul 11, 2012

    Be sure to include the needed dependencies for your cache loader in your project. For instance, if you want to use JDBC cache loader on a Maven based project, add this dependency to your POM:

  2. Jul 31, 2012

    Cache Loaders and transactional caches

    When a cache is transactional and a cache loader is present, the cache loader won't be enlisted in the transaction in which the cache is part. That means that it is possible to have inconsistencies at cache loader level: the transaction to succeed applying the in-memory state but (partially) fail applying the changes to the store. Manual recovery would not work with caches stores.

    This, my friends, is a total deal-breaker for many use cases. Seems like the fix for this has been pushed along for some time now - any hope of seeing it soon?

  3. Sep 06, 2012

    Seems this one does not work

    Got a parser error
    Should be

    1. Sep 07, 2012

      Are you using 5.2 ?

  4. Mar 15, 2013

    In the earlier version of the documentation, the sample configuration shows how to use different connectionFactoryClass.  The  new version here does not have those sample configuration details any more.

    The following connectionFactoryClass attribute does not work!

    <stringKeyedJdbcStore xmlns="urn:infinispan:config:jdbc:5.2" fetchPersistentState="false" ignoreModifications="false" purgeOnStartup="false">
                <connectionPool connectionUrl="jdbc:derby:/var/tmp/tests/testStore;create=true"
                username="sa" driverClass="org.apache.derby.jdbc.EmbeddedDriver"
                connectionFactoryClass="org.infinispan.loaders.jdbc.connectionfactory.PooledConnectionFactory"
                />

    Here is the exception:

    org.infinispan.config.ConfigurationException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[80,15]
    Message: Unexpected attribute 'connectionFactoryClass' encountered
        at org.infinispan.configuration.parsing.ParserRegistry.parse(ParserRegistry.java:87)

    Please add those details back.

  5. Jun 19, 2013

    Looks like there is an error on the first XML example (filestore), there is a trailing slash on the opening fileStore element, and it should not.

    We should read:

    1. Jun 20, 2013

      Thanks Louis, fixed