- Default Configuration Explained
- JTA Transactions Configuration
- Advanced Configuration
- Integration with JBoss Application Server
- Using Infinispan as remote Second Level Cache?
- Frequently Asked Questions
Following some of the principles set out by Brian Stansberry in Using JBoss Cache 3 as a Hibernate 3.5 Second Level Cache and taking in account improvements introduced by Infinispan, an Infinispan JPA/Hibernate second level cache provider has been developed. This wiki explains how to configure JPA/Hibernate to use the Infinispan and for those keen on lower level details, the key design decisions made and differences with previous JBoss Cache based cache providers.
If you're unfamiliar with JPA/Hibernate Second Level Caching, I'd suggest you to read Chapter 2.1 in this guide which explains the different types of data that can be cached.
Query result caching, or entity caching, may or may not improve application performance. Be sure to benchmark your application with and without caching.
1. First of all, to enable JPA/Hibernate second level cache with query result caching enabled, add either of the following:
2. Now, configure the Infinispan cache region factory using one of the two options below:
• If the Infinispan CacheManager instance happens to be bound to JNDI select JndiInfinispanRegionFactory as the cacheregion factory and add the cache manager's JNDI name:
|JBoss Application Server|
JBoss Application Server 6 and 7 deploy a shared Infinispan cache manager that can be used by all services, so when trying to configure applications with Infinispan second level cache, you should use the JNDI name for the cache manager responsible for the second level cache. By default, this is "java:CacheManager/entity". In any other application server, you can deploy your own cache manager and bind the CacheManager to JNDI, but in this cases it's generally preferred that the following method is used.
• If running JPA/Hibernate and Infinispan standalone or within third party Application Server, select the InfinispanRegionFactory as the cache region factory:
This is all the configuration you need to have JPA/Hibernate use Infinispan as cache provider with the default settings. You will still need to define which entities and queries need to be cached as defined in the Hibernate reference documentation, but that configuration aspect is not peculiar to Infinispan.
This default configuration should suit the majority of use cases but sometimes, further configuration is required and to help with such situations, please check the following section where more advanced settings are discussed.
The aim of this section is to explain the default settings for each of the different global data type (entity, collection, query and timestamps) caches, why these were chosen and what are the available alternatives.
- By default, for all entities and collections, whenever an new entity or collection is read from database and needs to be cached, it's only cached locally in order to reduce intra-cluster traffic. This option cannot be changed.
- By default, all entities and collections are configured to use a synchronous invalidation as clustering mode. This means that when an entity is updated, the updated cache will send a message to the other members of the cluster telling them that the entity has been modified. Upon receipt of this message, the other nodes will remove this data from their local cache, if it was stored there. This option can be changed to use replication by configuring entities or collections to use "replicated-entity" cache but it's generally not a recommended choice.
- By default, all entities and collections have initial state transfer disabled since there's no need for it. It's not recommended that this is enabled.
- By default, entities and collections are configured to use READ_COMMITTED as cache isolation level. It would only make sure to configure REPEATABLE_READ if the application evicts/clears entities from the Hibernate Session and then expects to repeatably re-read them in the same transaction. Otherwise, the Session's internal cache provides repeatable-read semantics. If you really need to use REPEATABLE_READ, you can simply configure entities or collections to use "entity-repeatable" cache.
- By default, entities and collections are configured with the following eviction settings:
- Eviction wake up interval is 5 seconds.
- Max number of entries are 10.000
- Max idle time before expiration is 100 seconds
You can change these settings on a per entity or collection basis or per individual entity or collection type.
More information in the "Advanced Configuration" section below.
- By default entites and collections are configured with lazy deserialization which helps deserialization when entities or collections are stored in isolated deployments. If you're sure you'll never deploy your entities or collections in classloader isolated deployment archives, you can disable this setting.
- By default, query cache is configured so that queries are only cached locally. Alternatively, you can configure query caching to use replication by selecting the "replicated-query" as query cache name. However, replication for query cache only makes sense if, and only if, all of this conditions are true:
- Performing the query is quite expensive.
- The same query is very likely to be repeatedly executed on different cluster nodes.
- The query is unlikely to be invalidated out of the cache (Note: Hibernate must aggressively invalidate query results from the cache any time any instance of one of the entity types targeted by the query. All such query results are invalidated, even if the change made to the specific entity instance would not have affected the query result)
- By default, query cache uses the same cache isolation levels and eviction/expiration settings as for entities/collections.
- By default, query cache has initial state transfer disabled. It is not recommended that this is enabled.
- By default, the timestamps cache is configured with asynchronous replication as clustering mode. Local or invalidated cluster modes are not allowed, since all cluster nodes must store all timestamps. As a result, no eviction/expiration is allowed for timestamp caches either.
- By default, the timestamps cache is configured with a cluster cache loader (in Hibernate 3.6.0 or earlier it had state transfer enabled) so that joining nodes can retrieve all timestamps. You shouldn't attempt to disable the cluster cache loader for the timestamps cache.
It is highly recommended that Hibernate is configured with JTA transactions so that both Hibernate and Infinispan cooperate within the same transaction and the interaction works as expected.
Otherwise, if Hibernate is configured for example with JDBC transactions, Hibernate will create a Transaction instance via java.sql.Connection and Infinispan will create a transaction via whatever TransactionManager returned by hibernate.transaction.manager_lookup_class. If hibernate.transaction.manager_lookup_class has not been populated, it will default on the dummy transaction manager. So, any work on the 2nd level cache will be done under a different transaction to the one used to commit the stuff to the database via Hibernate. In other words, your operations on the database and the 2LC are not treated as a single unit. Risks here include failures to update the 2LC leaving it with stale data while the database committed data correctly. It has also been observed that under some circumstances where JTA was not used, commit/rollbacks are not propagated to Infinispan.
To sum up, if you configure Hibernate with Infinispan, apply the following changes to your configuration file:
1. Unless your application uses JPA, you need to select the correct Hibernate transaction factory via the property hibernate.transaction.factory_class:
- If you're running within an application server, it's recommended that you use:
- If you're running in a standalone environment and you wanna enable JTA transaction factory, use:
The reason why JPA does not require a transaction factory class to be set up is because the entity manager already sets it to a variant of CMTTransactionFactory.
2. Select the correct Hibernate transaction manager lookup:
- If you're running within an application server, select the appropriate lookup class according to "JTA Transaction Managers" table.
For Example if you were running with the JBoss Application Server you would set:
- If you are running standalone and you want to add a JTA transaction manager lookup, things get a bit more complicated. Due to a current limitation, Hibernate does not support injecting a JTA TransactionManager or JTA UserTransaction that are not bound to JNDI. In other words, if you want to use JTA, Hibernate expects your TransactionManager to be bound to JNDI and it also expects that UserTransaction instances are retrieved from JNDI. This means that in an standalone environment, you need to add some code that binds your TransactionManager and UserTransaction to JNDI. With this in mind and with the help of one of our community contributors, we've created an example that does just that: JBoss Standalone JTA Example. Once you have the code in place, it's just a matter of selecting the correct Hibernate transaction manager lookup class, based on the JNDI names given. If you take JBossStandaloneJtaExample as an example, you simply have to add:
As you probably have noted through this section, there wasn't a single mention of the need to configure Infinispan's transaction manager lookup and there's a good reason for that. Basically, the code within Infinispan cache provider takes the transaction manager that has been configured at the Hibernate level and uses that. Otherwise, if no Hibernate transaction manager lookup class has been defined, Infinispan uses a default dummy transaction manager.
Since Hibernate 4.0, the way Infinispan hooks into the transaction manager can be configured. By default, since 4.0, Infinispan interacts with the transaction manager as an JTA synchronization, resulting in a faster interaction with the 2LC thanks to some key optimisations that the transaction manager can apply. However if desired, users can configure Infinispan to act as an XA resource (just like it did in 3.6 and earlier) by disabling the use of the synchronization. For example:
The JBoss standalone JTA example referred to in the previous section is inspired by the great work of Guenther Demetz, one of the members of the Infinispan community, who wrote an impressive wiki explaining how to set up standalone JTA with different transaction managers running outside of an EE server, and how to get this to work with an Infinispan backed JPA/Hibernate application.
Infinispan has the capability of exposing statistics via JMX and since Hibernate 3.5.0.Beta4, you can enable such statistics from the Hibernate/JPA configuration file. By default, Infinispan statistics are turned off but when these are disabled via the following method, statistics for the Infinispan Cache Manager and all the managed caches (entity, collection,...etc) are enabled:
The Infinispan cache provider jar file contains an Infinispan configuration file, which is the one used by default when configuring the Infinispan standalone cache region factory. This file contains default cache configurations for all Hibernate data types that should suit the majority of use cases. However, if you want to use a different configuration file, you can configure it via the following property:
For each Hibernate cache data types, Infinispan cache region factory has defined a default cache name to look up in either the default, or the user defined, Infinispan cache configuration file. These default values can be found in the Infinispan cache provider javadoc. You can change these cache names using the following properties:
One of the key improvements brought in by Infinispan is the fact that cache instances are more lightweight than they used to be in JBoss Cache. This has enabled a radical change in the way entity/collection type cache management works. With the Infinispan cache provider, each entity/collection type gets each own cache instance, whereas in old JBoss Cache based cache providers, all entity/collection types would be sharing the same cache instance. As a result of this, locking issues related to updating different entity/collection types concurrently are avoided completely.
This also has an important bearing on the meaning of hibernate.cache.infinispan.entity.cfg and hibernate.cache.infinispan.collection.cfg properties. These properties define the template cache name that should be used for all entity/collection data types. So, with the above hibernate.cache.infinispan.entity.cfg configuration, when a region needs to be created for entity com.acme.Person, the cache instance to be assigned to this entity will be based on a "custom-entity" named cache.
On top of that, this finer grained cache definition enables users to define cache settings on a per entity/collection basis. For example:
Here, we're configuring the Infinispan cache provider so that for com.acme.Person entity type, the cache instance assigned will be based on a "person-entity" named cache, and for com.acme.Person.addresses collection type, the cache instance assigned will be based on a "addresses-collection" named cache. If either of these two named caches did not exist in the Infinispan cache configuration file, the cache provider would create a cache instance for com.acme.Person entity and com.acme.Person.addresses collection based on the default cache in the configuration file.
Furthermore, thanks to the excellent feedback from the Infinispan community and in particular, Brian Stansberry, we've decided to allow users to define the most commonly tweaked Infinispan cache parameters via hibernate.cfg.xml or persistence.xml, for example eviction/expiration settings. So, with the Infinispan cache provider, you can configure eviction/expiration this way:
With the above configuration, you're overriding whatever eviction/expiration settings were defined for the default entity cache name in the Infinispan cache configuration used, regardless of whether it's the default one or user defined. More specifically, we're defining the following:
- All entities to use LRU eviction strategy
- The eviction thread to wake up every 2000 milliseconds
- The maximum number of entities for each entity type to be 5000 entries
- The lifespan of each entity instance to be 600000 milliseconds
- The maximum idle time for each entity instance to be 30000
You can also override eviction/expiration settings on a per entity/collection type basis in such way that the overriden settings only afftect that particular entity (i.e. com.acme.Person) or collection type (i.e. com.acme.Person.addresses). For example:
The aim of these configuration capabilities is to reduce the number of files needed to modify in order to define the most commonly tweaked parameters. So, by enabling eviction/expiration configuration on a per generic Hibernate data type or particular entity/collection type via hibernate.cfg.xml or persistence.xml, users don't have to touch to Infinispan's cache configuration file any more. We believe users will like this approach and so, if you there are any other Infinispan parameters that you often tweak and these cannot be configured via hibernate.cfg.xml or persistence.xml, please let the Infinispan team know by sending an email to firstname.lastname@example.org.
Please note that query/timestamp caches work the same way they did with JBoss Cache based cache providers. In other words, there's a query cache instance and timestamp cache instance shared by all. It's worth noting that eviction/expiration settings are allowed for query cache but not for timestamp cache. So configuring an eviction strategy other than NONE for timestamp cache would result in a failure to start up.
Finally, from Hibernate 3.5.4 and 3.6 onwards, queries with specific cache region names are stored under matching cache instances. This means that you can now set query cache region specific settings. For example, assuming you had a query like this:
The query would be stored under "AccountRegion" cache instance and users could control settings in similar fashion to what was done with entities and collections. So, for example, you could define specific eviction settings for this particular query region doing something like this:
In JBoss Application Server 7, Infinispan is the default second level cache provider and you can find details about its configuration the AS7 JPA reference guide.
Infinispan based Hibernate 2LC was developed as part of Hibernate 3.5 release and so it currently only works within AS 6 or higher. Hibernate 3.5 is not designed to work with AS/EAP 5.x or lower. To be able to run Infinispan based Hibernate 2LC in a lower AS version such as 5.1, the Infinispan 2LC module would require porting to Hibernate 3.3.
Recently, William Decoste has helped migrate the Infinispan 2LC module to Hibernate 3.3, and in this wiki, he explains how to integrate Infinispan Hibernate 2LC with JBoss AS/EAP 5.x.
Lately, several questions (here & here) have appeared in the Infinispan user forums asking whether it'd be possible to have an Infinispan second level cache that instead of living in the same JVM as the Hibernate code, it resides in a remote server, i.e. an Infinispan Hot Rod server. It's important to understand that trying to set up second level cache in this way is generally not a good idea for the following reasons:
- The purpose of a JPA/Hibernate second level cache is to store entities/collections recently retrieved from database or to maintain results of recent queries. So, part of the aim of the second level cache is to have data accessible locally rather than having to go to the database to retrieve it everytime this is needed. Hence, if you decide to set the second level cache to be remote as well, you're losing one of the key advantages of the second level cache: the fact that the cache is local to the code that requires it.
- Setting a remote second level cache can have a negative impact in the overall performance of your application because it means that cache misses require accessing a remote location to verify whether a particular entity/collection/query is cached. With a local second level cache however, these misses are resolved locally and so they are much faster to execute than with a remote second level cache.
There are however some edge cases where it might make sense to have a remote second level cache, for example:
- You are having memory issues in the JVM where JPA/Hibernate code and the second level cache is running. Off loading the second level cache to remote Hot Rod servers could be an interesting way to separate systems and allow you find the culprit of the memory issues more easily.
- Your application layer cannot be clustered but you still want to run multiple application layer nodes. In this case, you can't have multiple local second level cache instances running because they won't be able to invalidate each other for example when data in the second level cache is updated. In this case, having a remote second level cache could be a way out to make sure your second level cache is always in a consistent state, will all nodes in the application layer pointing to it.
- Rather than having the second level cache in a remote server, you want to simply keep the cache in a separate VM still within the same machine. In this case you would still have the additional overhead of talking across to another JVM, but it wouldn't have the latency of across a network. The benefit of doing this is that:
- Size the cache separate from the application, since the cache and the application server have very different memory profiles. One has lots of short lived objects, and the other could have lots of long lived objects.
- To pin the cache and the application server onto different CPU cores (using numactl), and even pin them to different physically memory based on the NUMA nodes.