2.3. Matching JBC Behavior to Types of Data

The preceding discussion described in detail what Hibernate needs to accomplish as it caches data and which JBoss Cache configuration options are available. It should now be clear that the configuration that is best for caching one type of data is not the best (and is sometimes completely incorrect) for other types. Entities likely work best with synchronous invalidation; timestamps require replication; query caching might do best in local mode.

Prior to Hibernate 3.3 and JBoss Cache 2.1, the conflicting requirements of the different cache types led to a real dilemma, particularly if query caching was enabled. This conflict arose because all four cache types needed to share a single underlying cache, with a single configuration. If query caching was enabled, the requirements of the timestamps cache effectively forced the use of synchronous replication, which is the worst performing choice for the more critical entity cache and is often inappropriate for the query cache.

With Hibernate 3.3 and JBoss Cache 2.1 it has become possible, even easy, to use separate underlying JBoss Cache instances for the different cache types. As a result, the entity cache can be optimally configured for entities while the necessary configuration for the timestamps cache is maintained.

Three key changes made this improvement possible:

2.3.1. The RegionFactory Interface

As mentioned previously, Hibernate 3.3 introduced the RegionFactory API as its mechanism for managing the Second Level Cache. This API makes it possible for implementations to know at all times whether they are working with entities, collections, queries or timestamps. That knowledge allows the Hibernate/JBoss Cache integration layer to make the best use of the various options JBoss Cache provides.

A Hibernate user doesn't need to understand the RegionFactory API in any detail; the main point is that, internally, it makes independent management of the different cache types possible.

2.3.2. The CacheManager API

The CacheManager API is a new feature of JBoss Cache 2.1. It provides an API for managing multiple distinct JBoss Cache instances in the same VM. In essence, a CacheManager is instantiated and provided with a set of named cache configurations. An application like the Hibernate/JBoss Cache integration layer accesses the CacheManager and asks for a cache configured with a particular named configuration.

Again, a Hibernate user doesn't need to understand the CacheManager; it's an internal detail. The thing to understand is that the task of a Hibernate Second Level Cache user is to:

  • Provide a set of named JBoss Cache configurations in an XML file (or just use the default set included in the jbc2-configs.xml file found in the org.hibernate.cache.jbc2.builder package in hibernate-jbosscache2.jar).

  • Tell Hibernate which cache configurations to use for entity, collection, query and timestamp caching. In practice, this can be quite simple, as there is a reasonable set of defaults.

See Chapter 3, Configuration for more on how to do this.
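
As a rough illustration of the second task, the sketch below builds the relevant Hibernate settings as a java.util.Properties object. Treat it as a sketch under assumptions rather than a reference: the region factory class name and the hibernate.cache.region.jbc2.* keys are not stated in this section and should be checked against Chapter 3, and the configuration names (entity-invalidation, timestamps-replicated) are hypothetical placeholders for names defined in your own XML file.

```java
import java.util.Properties;

// Sketch only: collects settings that select named JBoss Cache configurations.
// Anything marked "assumed" or "hypothetical" is not confirmed by this section.
final class CacheSelectionSketch {

    static Properties secondLevelCacheSettings() {
        Properties props = new Properties();
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");

        // Assumed: a region factory able to use one cache per data type.
        props.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.jbc2.MultiplexedJBossCacheRegionFactory");

        // Assumed key. The path points at the default file named in the text;
        // replace it with your own XML file if you supply one.
        props.setProperty("hibernate.cache.region.jbc2.configs",
                "org/hibernate/cache/jbc2/builder/jbc2-configs.xml");

        // Assumed keys; the values are hypothetical configuration names that
        // would have to exist in the XML file referenced above.
        props.setProperty("hibernate.cache.region.jbc2.cfg.entity", "entity-invalidation");
        props.setProperty("hibernate.cache.region.jbc2.cfg.collection", "entity-invalidation");
        props.setProperty("hibernate.cache.region.jbc2.cfg.query", "timestamps-replicated");
        props.setProperty("hibernate.cache.region.jbc2.cfg.ts", "timestamps-replicated");
        return props;
    }
}
```

Pointing two data types at the same configuration name, as done here for entities and collections, is how this sketch expresses that they share one underlying cache.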

2.3.3. Sharable JGroups Resources

JGroups is the group communication library JBoss Cache uses to send messages around a cluster. Each cache has a JGroups Channel; different channels around the cluster that have the same name and compatible configurations detect each other and form a group for message transmission.

A Channel is a fairly heavy object, typically using a good number of threads, several sockets and some good sized network I/O buffers. Creating multiple channels in the same VM was therefore costly, and was an administrative burden as well, since each channel would need separate configuration to use different network addresses or ports. Architecturally, this militated against having multiple JBoss Cache instances in an application, since each would need its own Channel.

Added in JGroups 2.5 and much improved in the JGroups 2.6 series is the concept of sharable JGroups resources. Basically, the heavyweight JGroups elements can be shared. An application (e.g. the Hibernate/JBoss Cache integration layer) uses a JGroups ChannelFactory. The ChannelFactory is provided with a set of named channel configurations. When a Channel is needed (e.g. by a JBoss Cache instance), the application asks the ChannelFactory for the channel by name. If different callers ask for a channel with the same name, the ChannelFactory ensures that they get channels that share resources.

The effect of all this is that if a user wants to use four separate JBoss Cache instances, one for entity caching, one for collection caching, one for query caching and one for timestamp caching, those four caches can all share the same underlying JGroups resources.
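
The sharing idea can be pictured with a small toy model. This is not the JGroups API: every class and method name below is hypothetical, and the Transport class merely stands in for the heavyweight threads, sockets and buffers. The sketch assumes a single-threaded caller and shows only that callers asking for the same stack name get distinct channels backed by one shared resource.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; NOT the JGroups API. All names here are hypothetical.
final class SharedChannelSketch {

    // Stands in for the heavyweight parts: threads, sockets, I/O buffers.
    static final class Transport {
        final String stackName;

        Transport(String stackName) {
            this.stackName = stackName;
        }
    }

    // Lightweight handle handed to each cache.
    static final class Channel {
        final String owner;
        final Transport transport;

        Channel(String owner, Transport transport) {
            this.owner = owner;
            this.transport = transport;
        }
    }

    // Hands out channels by stack name (not thread-safe; illustration only).
    static final class Factory {
        private final Map<String, Transport> transports = new HashMap<>();

        // Same stack name -> same Transport, but a new Channel per caller.
        Channel createChannel(String stackName, String owner) {
            Transport shared = transports.computeIfAbsent(stackName, Transport::new);
            return new Channel(owner, shared);
        }

        int transportCount() {
            return transports.size();
        }
    }
}
```

In this model, four caches that all request the same stack name end up with four Channel objects but only one Transport, which is the effect described above.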

The task of a Hibernate Second Level Cache user is to:

  • Provide a set of named JGroups configurations in an XML file (or just use the default set included in the jgroups-stacks.xml file found in the org.hibernate.cache.jbc2.builder package in the hibernate-jbosscache2.jar).

  • Tell Hibernate where to find that set of configurations on the classpath. See Section 3.1, “Configuring the Hibernate Session Factory” for details on how to do this. This is not necessary if the default set included in hibernate-jbosscache2.jar is used.

  • In the JBoss Cache configurations you are using, specify the name of the channel you want to use. This should be one of the named configurations in the JGroups XML file. The default set of JBoss Cache configurations found in the hibernate-jbosscache2.jar already has appropriate default choices. See Section 3.2.3.4, “JGroups Channel Configuration” for details on how to set this if you don't wish to use the defaults.

See Section 3.3, “JGroups Configuration” for more on JGroups.

2.3.4. Bringing It All Together

So, we've seen that Hibernate caches up to four different types of data (entities, collections, queries and timestamps) and that Hibernate 3.3 with JBoss Cache 2.1 gives you the flexibility to use a separate underlying JBoss Cache, with different behavior, for each type. You can actually deploy four separate caches, one for each type.

In practice, four separate caches are unnecessary. For example, entity and collection caching have similar enough semantics that there is no reason not to share a JBoss Cache instance between them. Queries can usually use that same cache as well. Alternatively, queries and timestamps can share a JBoss Cache instance configured for replication, with the hibernate.cache.region.jbc2.query.localonly=true setting letting you turn off replication for the queries if you want to.

Here's a decision tree you can follow:

  1. Decide if you want to enable query caching.

  2. Decide if you want to use invalidation or replication for your entities and collections. Invalidation is generally recommended for entities and collections.

    • If you want invalidation, and you want query caching, you will need two JBoss Cache instances, one with synchronous invalidation for the entities and collections, and one with synchronous replication for the timestamps. The queries will go in the timestamp cache if you want them to replicate; they can go with the entities and collections otherwise.

    • If you want invalidation but don't want query caching, you can use a single JBoss Cache instance, configured for synchronous invalidation.

    • If you want replication, whether or not you want query caching, you can use a single JBoss Cache instance, configured for synchronous replication.

  3. If you are using query caching, the choices above leave your timestamps either sharing a cache with other data types or in a cache by themselves. Either way, the cache being used for timestamps must have initial state transfer enabled. If the timestamps are sharing a cache with entities, collections or queries, decide whether you want initial state transfer for that other data. See Section 2.2.5, “Initial State Transfer” for the implications of this. If you don't want initial state transfer for the other data, you'll need to have a separate cache for the timestamps.

  4. Finally, if your queries are sharing a cache configured for replication, decide if you want the cached query results to replicate. (The timestamps cache must replicate.) If not, you'll want to set the hibernate.cache.region.jbc2.query.localonly=true option when you configure your SessionFactory.
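
The decision tree above can be sketched as a small piece of plain Java. This is only an illustration of the reasoning, not part of Hibernate or JBoss Cache: the class, enum and method names are all hypothetical, and the four boolean inputs are one possible way to encode the choices made in steps 1 through 4.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Sketch of the decision tree; every name in this class is hypothetical.
final class CachePlanSketch {

    enum DataType { ENTITIES, COLLECTIONS, QUERIES, TIMESTAMPS }

    enum Mode { SYNC_INVALIDATION, SYNC_REPLICATION }

    // One underlying JBoss Cache instance and the data types stored in it.
    static final class CacheSpec {
        final Mode mode;
        final EnumSet<DataType> contents = EnumSet.noneOf(DataType.class);
        // True only for the cache that holds the timestamps (step 3).
        boolean stateTransferRequired;

        CacheSpec(Mode mode, DataType... types) {
            this.mode = mode;
            for (DataType type : types) {
                contents.add(type);
            }
        }
    }

    static final class Plan {
        final List<CacheSpec> caches = new ArrayList<>();
        // Mirrors hibernate.cache.region.jbc2.query.localonly (step 4).
        boolean queryLocalOnly;
    }

    static Plan decide(boolean queryCaching, boolean replicateEntities,
                       boolean replicateQueries, boolean stateTransferForSharedData) {
        Plan plan = new Plan();
        Mode mode = replicateEntities ? Mode.SYNC_REPLICATION : Mode.SYNC_INVALIDATION;
        CacheSpec main = new CacheSpec(mode, DataType.ENTITIES, DataType.COLLECTIONS);
        plan.caches.add(main);
        if (!queryCaching) {
            return plan; // steps 3 and 4 only matter with query caching
        }

        // Step 2: pick the cache that will hold the timestamps.
        CacheSpec timestamps;
        if (replicateEntities) {
            main.contents.add(DataType.QUERIES);
            timestamps = main;
        } else {
            timestamps = new CacheSpec(Mode.SYNC_REPLICATION);
            plan.caches.add(timestamps);
            CacheSpec queryHome = replicateQueries ? timestamps : main;
            queryHome.contents.add(DataType.QUERIES);
        }

        // Step 3: split the timestamps off if the data sharing their cache
        // should not receive an initial state transfer.
        if (!timestamps.contents.isEmpty() && !stateTransferForSharedData) {
            timestamps = new CacheSpec(Mode.SYNC_REPLICATION);
            plan.caches.add(timestamps);
        }
        timestamps.contents.add(DataType.TIMESTAMPS);
        timestamps.stateTransferRequired = true;

        // Step 4: queries held in a replicated cache can opt out of replication.
        for (CacheSpec cache : plan.caches) {
            if (cache.mode == Mode.SYNC_REPLICATION
                    && cache.contents.contains(DataType.QUERIES)
                    && !replicateQueries) {
                plan.queryLocalOnly = true;
            }
        }
        return plan;
    }
}
```

For instance, decide(true, false, false, true) (query caching on, invalidation for entities, queries not replicated) describes two caches: a synchronous invalidation cache holding entities, collections and queries, and a synchronous replication cache holding only the timestamps.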

Once you've made these decisions, you know whether you need just one underlying JBoss Cache instance, or more than one. Next we'll see how to actually configure the setup you've selected.