Hibernate.orgCommunity Documentation

Chapter 5. Transactions and Concurrency

5.1. Entity manager and transaction scopes
5.1.1. Unit of work
5.1.2. Long units of work
5.1.3. Considering object identity
5.1.4. Common concurrency control issues
5.2. Database transaction demarcation
5.2.1. Non-managed environment
5.2.2. Using JTA
5.2.3. Exception handling
5.3. EXTENDED Persistence Context
5.3.1. Container Managed Entity Manager
5.3.2. Application Managed Entity Manager
5.4. Optimistic concurrency control
5.4.1. Application version checking
5.4.2. Extended entity manager and automatic versioning
5.4.3. Detached objects and automatic versioning

The most important point about Hibernate Entity Manager and concurrency control is that it is very easy to understand. Hibernate Entity Manager directly uses JDBC connections and JTA resources without adding any additional locking behavior. We highly recommend you spend some time with the JDBC, ANSI, and transaction isolation specification of your database management system. Hibernate Entity Manager only adds automatic versioning but does not lock objects in memory or change the isolation level of your database transactions. Basically, use Hibernate Entity Manager like you would use direct JDBC (or JTA/CMT) with your database resources.

We start the discussion of concurrency control in Hibernate with the granularity of EntityManagerFactory, and EntityManager, as well as database transactions and long units of work..

In this chapter, and unless explicitly expressed, we will mix and match the concept of entity manager and persistence context. One is an API and programming object, the other a definition of scope. However, keep in mind the essential difference. A persistence context is usually bound to a JTA transaction in Java EE, and a persistence context starts and ends at transaction boundaries (transaction-scoped) unless you use an extended entity manager. Please refer to Section 1.2.3, “Persistence context scope” for more information.

A EntityManagerFactory is an expensive-to-create, threadsafe object intended to be shared by all application threads. It is created once, usually on application startup.

An EntityManager is an inexpensive, non-threadsafe object that should be used once, for a single business process, a single unit of work, and then discarded. An EntityManager will not obtain a JDBC Connection (or a Datasource) unless it is needed, so you may safely open and close an EntityManager even if you are not sure that data access will be needed to serve a particular request. (This becomes important as soon as you are implementing some of the following patterns using request interception.)

To complete this picture you also have to think about database transactions. A database transaction has to be as short as possible, to reduce lock contention in the database. Long database transactions will prevent your application from scaling to highly concurrent load.

What is the scope of a unit of work? Can a single Hibernate EntityManager span several database transactions or is this a one-to-one relationship of scopes? When should you open and close a Session and how do you demarcate the database transaction boundaries?

First, don't use the entitymanager-per-operation antipattern, that is, don't open and close an EntityManager for every simple database call in a single thread! Of course, the same is true for database transactions. Database calls in an application are made using a planned sequence, they are grouped into atomic units of work. (Note that this also means that auto-commit after every single SQL statement is useless in an application, this mode is intended for ad-hoc SQL console work.)

The most common pattern in a multi-user client/server application is entitymanager-per-request. In this model, a request from the client is send to the server (where the JPA persistence layer runs), a new EntityManager is opened, and all database operations are executed in this unit of work. Once the work has been completed (and the response for the client has been prepared), the persistence context is flushed and closed, as well as the entity manager object. You would also use a single database transaction to serve the clients request. The relationship between the two is one-to-one and this model is a perfect fit for many applications.

This is the default JPA persistence model in a Java EE environment (JTA bounded, transaction-scoped persistence context); injected (or looked up) entity managers share the same persistence context for a particular JTA transaction. The beauty of JPA is that you don't have to care about that anymore and just see data access through entity manager and demarcation of transaction scope on session beans as completely orthogonal.

The challenge is the implementation of this (and other) behavior outside an EJB3 container: not only has the EntityManager and resource-local transaction to be started and ended correctly, but they also have to be accessible for data access operations. The demarcation of a unit of work is ideally implemented using an interceptor that runs when a request hits the non-EJB3 container server and before the response will be send (i.e. a ServletFilter if you are using a standalone servlet container). We recommend to bind the EntityManager to the thread that serves the request, using a ThreadLocal variable. This allows easy access (like accessing a static variable) in all code that runs in this thread. Depending on the database transaction demarcation mechanism you chose, you might also keep the transaction context in a ThreadLocal variable. The implementation patterns for this are known as ThreadLocal Session and Open Session in View in the Hibernate community. You can easily extend the HibernateUtil shown in the Hibernate reference documentation to implement this pattern, you don't need any external software (it's in fact very trivial). Of course, you'd have to find a way to implement an interceptor and set it up in your environment. See the Hibernate website for tips and examples. Once again, remember that your first choice is naturally an EJB3 container - preferably a light and modular one such as JBoss application server.

The entitymanager-per-request pattern is not the only useful concept you can use to design units of work. Many business processes require a whole series of interactions with the user interleaved with database accesses. In web and enterprise applications it is not acceptable for a database transaction to span a user interaction with possibly long waiting time between requests. Consider the following example:

We call this unit of work, from the point of view of the user, a long running application transaction. There are many ways how you can implement this in your application.

A first naive implementation might keep the EntityManager and database transaction open during user think time, with locks held in the database to prevent concurrent modification, and to guarantee isolation and atomicity. This is of course an anti-pattern, a pessimistic approach, since lock contention would not allow the application to scale with the number of concurrent users.

Clearly, we have to use several database transactions to implement the application transaction. In this case, maintaining isolation of business processes becomes the partial responsibility of the application tier. A single application transaction usually spans several database transactions. It will be atomic if only one of these database transactions (the last one) stores the updated data, all others simply read data (e.g. in a wizard-style dialog spanning several request/response cycles). This is easier to implement than it might sound, especially if you use JPA entity manager and persistence context features:

Both entitymanager-per-request-with-detached-objects and entitymanager-per-application-transaction have advantages and disadvantages, we discuss them later in this chapter in the context of optimistic concurrency control.

TODO: This note should probably come later.

An application may concurrently access the same persistent state in two different persistence contexts. However, an instance of a managed class is never shared between two persistence contexts. Hence there are two different notions of identity:

Then for objects attached to a particular persistence context (i.e. in the scope of an EntityManager) the two notions are equivalent, and JVM identity for database identity is guaranteed by the Hibernate Entity Manager. However, while the application might concurrently access the "same" (persistent identity) business object in two different persistence contexts, the two instances will actually be "different" (JVM identity). Conflicts are resolved using (automatic versioning) at flush/commit time, using an optimistic approach.

This approach leaves Hibernate and the database to worry about concurrency; it also provides the best scalability, since guaranteeing identity in single-threaded units of work only doesn't need expensive locking or other means of synchronization. The application never needs to synchronize on any business object, as long as it sticks to a single thread per EntityManager. Within a persistence context, the application may safely use == to compare entities.

However, an application that uses == outside of a persistence context might see unexpected results. This might occur even in some unexpected places, for example, if you put two detached instances into the same Set. Both might have the same database identity (i.e. they represent the same row), but JVM identity is by definition not guaranteed for instances in detached state. The developer has to override the equals() and hashCode() methods in persistent classes and implement his own notion of object equality. There is one caveat: Never use the database identifier to implement equality, use a business key, a combination of unique, usually immutable, attributes. The database identifier will change if a transient entity is made persistent (see the contract of the persist() operation). If the transient instance (usually together with detached instances) is held in a Set, changing the hashcode breaks the contract of the Set. Attributes for good business keys don't have to be as stable as database primary keys, you only have to guarantee stability as long as the objects are in the same Set. See the Hibernate website for a more thorough discussion of this issue. Also note that this is not a Hibernate issue, but simply how Java object identity and equality has to be implemented.

Never use the anti-patterns entitymanager-per-user-session or entitymanager-per-application (of course, there are rare exceptions to this rule, e.g. entitymanager-per-application might be acceptable in a desktop application, with manual flushing of the persistence context). Note that some of the following issues might also appear with the recommended patterns, make sure you understand the implications before making a design decision:

Database (or system) transaction boundaries are always necessary. No communication with the database can occur outside of a database transaction (this seems to confuse many developers who are used to the auto-commit mode). Always use clear transaction boundaries, even for read-only operations. Depending on your isolation level and database capabilities this might not be required but there is no downside if you always demarcate transactions explicitly. You'll have to do operations outside a transaction, though, when you'll need to retain modifications in an EXTENDED persistence context.

A JPA application can run in non-managed (i.e. standalone, simple Web- or Swing applications) and managed Java EE environments. In a non-managed environment, an EntityManagerFactory is usually responsible for its own database connection pool. The application developer has to manually set transaction boundaries, in other words, begin, commit, or rollback database transactions itself. A managed environment usually provides container-managed transactions, with the transaction assembly defined declaratively through annotations of EJB session beans, for example. Programmatic transaction demarcation is then no longer necessary, even flushing the EntityManager is done automatically.

Usually, ending a unit of work involves four distinct phases:

We'll now have a closer look at transaction demarcation and exception handling in both managed- and non-managed environments.

If an JPA persistence layer runs in a non-managed environment, database connections are usually handled by Hibernate's pooling mechanism behind the scenes. The common entity manager and transaction handling idiom looks like this:

// Non-managed environment idiom

EntityManager em = emf.createEntityManager();
EntityTransaction tx = null;
try {
    tx = em.getTransaction();
    tx.begin();
    // do some work
    ...
    tx.commit();
}
catch (RuntimeException e) {
    if ( tx != null && tx.isActive() ) tx.rollback();
    throw e; // or display error message
}
finally {
    em.close();
}

You don't have to flush() the EntityManager explicitly - the call to commit() automatically triggers the synchronization.

A call to close() marks the end of an EntityManager. The main implication of close() is the release of resources - make sure you always close and never outside of guaranteed finally block.

You will very likely never see this idiom in business code in a normal application; fatal (system) exceptions should always be caught at the "top". In other words, the code that executes entity manager calls (in the persistence layer) and the code that handles RuntimeException (and usually can only clean up and exit) are in different layers. This can be a challenge to design yourself and you should use J2EE/EJB container services whenever they are available. Exception handling is discussed later in this chapter.

If your persistence layer runs in an application server (e.g. behind EJB3 session beans), every datasource connection obtained internally by the entity manager will automatically be part of the global JTA transaction. Hibernate offers two strategies for this integration.

If you use bean-managed transactions (BMT), the code will look like this:

// BMT idiom

@Resource public UserTransaction utx;
@Resource public EntityManagerFactory factory;
public void doBusiness() {
    EntityManager em = factory.createEntityManager();
    try {
    // do some work
    ...
    utx.commit();
}
catch (RuntimeException e) {
    if (utx != null) utx.rollback();
    throw e; // or display error message
}
finally {
    em.close();
}

With Container Managed Transactions (CMT) in an EJB3 container, transaction demarcation is done in session bean annotations or deployment descriptors, not programatically. The EntityManager will automatically be flushed on transaction completion (and if you have injected or lookup the EntityManager, it will be also closed automatically). If an exception occurs during the EntityManager use, transaction rollback occurs automatically if you don't catch the exception. Since EntityManager exceptions are RuntimeExceptions they will rollback the transaction as per the EJB specification (system exception vs. application exception).

It is important to let Hibernate EntityManager define the hibernate.transaction.factory_class (ie not overriding this value). Remember to also set org.hibernate.transaction.manager_lookup_class.

If you work in a CMT environment, you might also want to use the same entity manager in different parts of your code. Typically, in a non-managed environment you would use a ThreadLocal variable to hold the entity manager, but a single EJB request might execute in different threads (e.g. session bean calling another session bean). The EJB3 container takes care of the persistence context propagation for you. Either using injection or lookup, the EJB3 container will return an entity manager with the same persistence context bound to the JTA context if any, or create a new one and bind it (see Section 1.2.4, “Persistence context propagation” .)

Our entity manager/transaction management idiom for CMT and EJB3 container-use is reduced to this:

//CMT idiom through injection

@PersistenceContext(name="sample") EntityManager em;

Or this if you use Java Context and Dependency Injection (CDI).

@Inject EntityManager em;

In other words, all you have to do in a managed environment is to inject the EntityManager, do your data access work, and leave the rest to the container. Transaction boundaries are set declaratively in the annotations or deployment descriptors of your session beans. The lifecycle of the entity manager and persistence context is completely managed by the container.

Due to a silly limitation of the JTA spec, it is not possible for Hibernate to automatically clean up any unclosed ScrollableResults or Iterator instances returned by scroll() or iterate(). You must release the underlying database cursor by calling ScrollableResults.close() or Hibernate.close(Iterator) explicitly from a finally block. (Of course, most applications can easily avoid using scroll() or iterate() at all from the CMT code.)

If the EntityManager throws an exception (including any SQLException), you should immediately rollback the database transaction, call EntityManager.close() (if createEntityManager() has been called) and discard the EntityManager instance. Certain methods of EntityManager will not leave the persistence context in a consistent state. No exception thrown by an entity manager can be treated as recoverable. Ensure that the EntityManager will be closed by calling close() in a finally block. Note that a container managed entity manager will do that for you. You just have to let the RuntimeException propagate up to the container.

The Hibernate entity manager generally raises exceptions which encapsulate the Hibernate core exception. Common exceptions raised by the EntityManager API are

The HibernateException, which wraps most of the errors that can occur in a Hibernate persistence layer, is an unchecked exception. Note that Hibernate might also throw other unchecked exceptions which are not a HibernateException. These are, again, not recoverable and appropriate action should be taken.

Hibernate wraps SQLExceptions thrown while interacting with the database in a JDBCException. In fact, Hibernate will attempt to convert the exception into a more meaningful subclass of JDBCException. The underlying SQLException is always available via JDBCException.getCause(). Hibernate converts the SQLException into an appropriate JDBCException subclass using the SQLExceptionConverter attached to the SessionFactory. By default, the SQLExceptionConverter is defined by the configured dialect; however, it is also possible to plug in a custom implementation (see the javadocs for the SQLExceptionConverterFactory class for details). The standard JDBCException subtypes are:

All application managed entity manager and container managed persistence contexts defined as such are EXTENDED. This means that the persistence context type goes beyond the transaction life cycle. We should then understand what happens to operations made outside the scope of a transaction.

In an EXTENDED persistence context, all read only operations of the entity manager can be executed outside a transaction (find(), getReference(), refresh(), detach() and read queries). Some modifications operations can be executed outside a transaction, but they are queued until the persistence context join a transaction: this is the case of persist(), merge(), remove(). Some operations cannot be called outside a transaction: flush(), lock(), and update/delete queries.

The only approach that is consistent with high concurrency and high scalability is optimistic concurrency control with versioning. Version checking uses version numbers, or timestamps, to detect conflicting updates (and to prevent lost updates). Hibernate provides for three possible approaches to writing application code that uses optimistic concurrency. The use cases we show are in the context of long application transactions but version checking also has the benefit of preventing lost updates in single database transactions.

In an implementation without much help from the persistence mechanism, each interaction with the database occurs in a new EntityManager and the developer is responsible for reloading all persistent instances from the database before manipulating them. This approach forces the application to carry out its own version checking to ensure application transaction isolation. This approach is the least efficient in terms of database access. It is the approach most similar to EJB2 entities:

// foo is an instance loaded by a previous entity manager

em = factory.createEntityManager();
EntityTransaction t = em.getTransaction();
t.begin();
int oldVersion = foo.getVersion();
Foo dbFoo = em.find( foo.getClass(), foo.getKey() ); // load the current state
if ( dbFoo.getVersion()!=foo.getVersion ) throw new StaleObjectStateException();
dbFoo.setProperty("bar");
t.commit();
em.close();

The version property is mapped using @Version, and the entity manager will automatically increment it during flush if the entity is dirty.

Of course, if you are operating in a low-data-concurrency environment and don't require version checking, you may use this approach and just skip the version check. In that case, last commit wins will be the default strategy for your long application transactions. Keep in mind that this might confuse the users of the application, as they might experience lost updates without error messages or a chance to merge conflicting changes.

Clearly, manual version checking is only feasible in very trivial circumstances and not practical for most applications. Often not only single instances, but complete graphs of modified objects have to be checked. Hibernate offers automatic version checking with either detached instances or an extended entity manager and persistence context as the design paradigm.

A single persistence context is used for the whole application transaction. The entity manager checks instance versions at flush time, throwing an exception if concurrent modification is detected. It's up to the developer to catch and handle this exception (common options are the opportunity for the user to merge his changes or to restart the business process with non-stale data).

In an EXTENDED persistence context, all operations made outside an active transaction are queued. The EXTENDED persistence context is flushed when executed in an active transaction (at worse at commit time).

The Entity Manager is disconnected from any underlying JDBC connection when waiting for user interaction. In an application-managed extended entity manager, this occurs automatically at transaction completion. In a stateful session bean holding a container-managed extended entity manager (i.e. a SFSB annotated with @PersistenceContext(EXTENDED)), this occurs transparently as well. This approach is the most efficient in terms of database access. The application need not concern itself with version checking or with merging detached instances, nor does it have to reload instances in every database transaction. For those who might be concerned by the number of connections opened and closed, remember that the connection provider should be a connection pool, so there is no performance impact. The following examples show the idiom in a non-managed environment:

// foo is an instance loaded earlier by the extended entity manager

em.getTransaction.begin(); // new connection to data store is obtained and tx started
foo.setProperty("bar");
em.getTransaction().commit();  // End tx, flush and check version, disconnect

The foo object still knows which persistence context it was loaded in. With getTransaction.begin(); the entity manager obtains a new connection and resumes the persistence context. The method getTransaction().commit() will not only flush and check versions, but also disconnects the entity manager from the JDBC connection and return the connection to the pool.

This pattern is problematic if the persistence context is too big to be stored during user think time, and if you don't know where to store it. E.g. the HttpSession should be kept as small as possible. As the persistence context is also the (mandatory) first-level cache and contains all loaded objects, we can probably use this strategy only for a few request/response cycles. This is indeed recommended, as the persistence context will soon also have stale data.

It is up to you where you store the extended entity manager during requests, inside an EJB3 container you simply use a stateful session bean as described above. Don't transfer it to the web layer (or even serialize it to a separate tier) to store it in the HttpSession. In a non-managed, two-tiered environment the HttpSession might indeed be the right place to store it.