JBoss Community Archive (Read Only)

Infinispan 6.0

Transaction recovery

When to use recovery

Consider a distributed transaction in which money is transferred from an account stored in the database to an account stored in Infinispan. When TransactionManager.commit() is invoked, both resources prepare successfully (1st phase). During commit (2nd phase), the database successfully applies the changes whilst Infinispan fails before receiving the commit request from the TransactionManager. At this point the system is in an inconsistent state: the money has been taken from the database account but is not yet visible in Infinispan (locks are only released during the 2nd phase of 2PC). Recovery deals with this situation to make sure that data in both the database and Infinispan ends up in a consistent state.
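The failure scenario above can be illustrated with a toy model of two-phase commit. This is only a sketch: the Resource class, its failsBeforeCommit flag and the runTwoPhaseCommit method are hypothetical names invented for illustration, and none of it is Infinispan or TransactionManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of two-phase commit; not Infinispan or TransactionManager code. */
class InDoubtDemo {

    enum State { ACTIVE, PREPARED, COMMITTED }

    /** Hypothetical resource that can be told to fail before the commit arrives. */
    static class Resource {
        final String name;
        final boolean failsBeforeCommit;
        State state = State.ACTIVE;

        Resource(String name, boolean failsBeforeCommit) {
            this.name = name;
            this.failsBeforeCommit = failsBeforeCommit;
        }

        void prepare() {
            state = State.PREPARED;
        }

        void commit() {
            if (failsBeforeCommit) {
                // The commit request never reaches the resource.
                throw new IllegalStateException(name + " is unreachable");
            }
            state = State.COMMITTED;
        }
    }

    /** Runs both phases and returns the names of the resources left in doubt. */
    static List<String> runTwoPhaseCommit(List<Resource> resources) {
        for (Resource r : resources) {
            r.prepare();               // 1st phase: every resource prepares
        }
        List<String> inDoubt = new ArrayList<>();
        for (Resource r : resources) {
            try {
                r.commit();            // 2nd phase
            } catch (IllegalStateException e) {
                inDoubt.add(r.name);   // prepared, but never committed
            }
        }
        return inDoubt;
    }

    public static void main(String[] args) {
        Resource database = new Resource("database", false);
        Resource infinispan = new Resource("infinispan", true);
        List<String> inDoubt = runTwoPhaseCommit(Arrays.asList(database, infinispan));
        System.out.println("In doubt: " + inDoubt);
    }
}
```

In this model the database ends up COMMITTED while the second resource stays PREPARED, which is the inconsistent state that recovery is meant to resolve.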

How does it work

Recovery is coordinated by the TransactionManager (TM). The TM works with Infinispan to determine the list of in-doubt transactions that require manual intervention and informs the system administrator (SA), e.g. by email or through logs. This process is TM specific, but generally requires some configuration on the TM's side.

Knowing the in-doubt transaction ids, the SA can now connect to the Infinispan cluster and replay the commit of transactions or force the rollback. Infinispan provides JMX tooling for this - this is explained extensively in the Reconciliate state section.

Configuring recovery   

Recovery is not enabled by default in Infinispan. If it is disabled, the TransactionManager won't be able to work with Infinispan to determine the in-doubt transactions. To enable recovery through XML configuration:

<transaction useEagerLocking="true" eagerLockSingleNode="true">
    <recovery enabled="true" recoveryInfoCacheName="noRecovery"/>
</transaction>

Note: the recoveryInfoCacheName attribute is not mandatory. More information about it can be found in the Recovery cache section below.

Alternatively you can enable it through the fluent configuration API as follows:

//simply calling .recovery() enables it
configuration.fluent().transaction().recovery();

//then you can disable it
configuration.fluent().transaction().recovery().disable();

//or just check its status
boolean isRecoveryEnabled = configuration.isTransactionRecoveryEnabled();

Recovery can be enabled/disabled on a per-cache level: e.g. it is possible to have a transaction spanning a cache that has it enabled and another one that doesn't.

Enable JMX support

Important: In order to use JMX for managing recovery, JMX support must be explicitly enabled. More about enabling JMX here.

Recovery cache

In order to track in-doubt transactions and be able to replay them, Infinispan caches all transaction state for future use. This state is held only for in-doubt transactions; for successfully completed transactions it is removed once the commit/rollback phase has completed.

This in-doubt transaction data is held within a local cache: this allows one to configure swapping this info to disk through a cache loader in case it gets too big. This cache can be specified through the "recoveryInfoCacheName" configuration attribute. If not specified, Infinispan will configure a local cache for you.

It is possible (though not mandated) to share the same recovery cache between all the Infinispan caches that have recovery enabled. If the default recovery cache is overridden, then the specified recovery cache must use a TransactionManagerLookup that returns a different TM than the one used by the cache itself.

Integration with the TM

Even though this is TM specific, generally the TM would need a reference to an XAResource implementation in order to run XAResource.recover on it. To obtain a reference to an Infinispan XAResource, the following API can be used:

XAResource xar = cache.getAdvancedCache().getXAResource();  

Note: It is common practice to run recovery in a different process from the one running the transaction. At the moment it is not possible to do this with Infinispan: recovery must be run from the same process in which the Infinispan instance exists. This limitation will be dropped once transactions over HotRod are available.
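As a rough illustration of what a TM does with the XAResource reference, the sketch below runs a single recovery scan using the standard javax.transaction.xa API. It is an assumption about typical TM behaviour, not the code of any particular TM. The stubResource and xid helpers are hypothetical stand-ins so that the example runs without Infinispan; in a real deployment the XAResource would come from cache.getAdvancedCache().getXAResource().

```java
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

class RecoveryScanSketch {

    /** Asks the resource for its prepared-but-unfinished transactions in one scan. */
    static List<Xid> listInDoubt(XAResource xar) {
        try {
            // Starting and ending the scan in the same call returns everything at once.
            Xid[] xids = xar.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
            return xids == null ? Collections.<Xid>emptyList() : Arrays.asList(xids);
        } catch (XAException e) {
            throw new IllegalStateException("Recovery scan failed", e);
        }
    }

    /** Minimal Xid implementation, used only for this demonstration. */
    static Xid xid(final int formatId, final byte[] globalTxId, final byte[] branchQualifier) {
        return new Xid() {
            public int getFormatId() { return formatId; }
            public byte[] getGlobalTransactionId() { return globalTxId; }
            public byte[] getBranchQualifier() { return branchQualifier; }
        };
    }

    /** Hypothetical stand-in for the Infinispan XAResource: only recover() is answered. */
    static XAResource stubResource(final Xid... prepared) {
        return (XAResource) Proxy.newProxyInstance(
                RecoveryScanSketch.class.getClassLoader(),
                new Class<?>[] { XAResource.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("recover")) {
                        return prepared;
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        XAResource xar = stubResource(xid(1, new byte[] {1, 2}, new byte[] {3}));
        for (Xid x : listInDoubt(xar)) {
            System.out.println("In-doubt transaction, format id " + x.getFormatId());
        }
    }
}
```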

Reconciliate state

The TM informs the SA about in-doubt transactions in a proprietary way. At this stage it is assumed that the SA knows the transaction's XID (a byte array).

A normal recovery flow is:

1. SA connects to an Infinispan server through JMX, and lists the in-doubt transactions

The image below is taken with JConsole connecting to an Infinispan node that has an in-doubt transaction.

images/author/download/attachments/3737124/showInDoubtTx.png

The status of each in-doubt transaction is displayed (in this example "PREPARED"). There might be multiple elements in the status field, e.g. "PREPARED" and "COMMITTED" in case the transaction committed on certain nodes but not on all of them.

2. SA visually maps the XID received from the TM to an Infinispan internal id, represented as a number. This step is needed because the XID, a byte array, cannot conveniently be passed to the JMX tool (e.g. JConsole) and then re-assembled on Infinispan's side.

3. SA forces the transaction's commit/rollback through the corresponding JMX operation, based on the internal id.

The image below is obtained by forcing the commit of the transaction based on its internal id.

images/author/download/attachments/3737124/forceCommit.png

Note: All JMX operations described above can be executed on any node, regardless of where the transaction originated.
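The flow above can also be driven programmatically instead of through JConsole. The sketch below shows the general shape of such JMX calls against an in-memory stand-in MBean. Everything specific in it is an assumption made for illustration: the RecoveryAdmin interface, the operation names showInDoubtTransactions and forceCommit, their return strings, and the object name. Check the MBean actually exposed by your Infinispan node before relying on any of these.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class JmxRecoverySketch {

    /** Hypothetical management interface modelled on the steps described above. */
    public interface RecoveryAdmin {
        String showInDoubtTransactions();
        String forceCommit(long internalId);
    }

    /** In-memory stand-in; a real node would expose its own MBean instead. */
    static class FakeRecoveryAdmin implements RecoveryAdmin {
        private final Map<Long, String> inDoubt = new LinkedHashMap<>();

        FakeRecoveryAdmin() {
            inDoubt.put(1L, "PREPARED"); // internal id 1 is in doubt
        }

        public String showInDoubtTransactions() {
            return inDoubt.toString();
        }

        public String forceCommit(long internalId) {
            return inDoubt.remove(internalId) != null
                    ? "Commit successful"
                    : "Transaction not found: " + internalId;
        }
    }

    /** Registers the stand-in under an assumed, purely illustrative object name. */
    static ObjectName register(MBeanServer server) {
        try {
            ObjectName name = new ObjectName("example:type=Cache,component=RecoveryAdmin");
            server.registerMBean(
                    new StandardMBean(new FakeRecoveryAdmin(), RecoveryAdmin.class), name);
            return name;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes a String-returning operation; signature holds the parameter type names. */
    static String invoke(MBeanServer server, ObjectName name, String operation,
                         Object[] params, String[] signature) {
        try {
            return (String) server.invoke(name, operation, params, signature);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = register(server);
        // Step 1: list the in-doubt transactions.
        System.out.println(invoke(server, name, "showInDoubtTransactions",
                new Object[0], new String[0]));
        // Step 3: force the commit based on the internal id found in step 2.
        System.out.println(invoke(server, name, "forceCommit",
                new Object[] { 1L }, new String[] { long.class.getName() }));
    }
}
```

Against a remote node the same invoke call would be made on an MBeanServerConnection obtained from a JMX connector rather than on the local MBeanServer.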

Force commit/rollback based on XID

XID-based JMX operations for forcing the commit/rollback of in-doubt transactions are available as well: these methods receive byte[] arrays describing the XID instead of the number associated with the transaction (as previously described in step 2). These can be useful e.g. if one wants to set up an automatic completion job for certain in-doubt transactions. Such a job is plugged into the TM's recovery process and has access to the TM's XID objects.
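An automatic completion job would need to turn each XID object into JMX invocation arguments. The sketch below shows one way to do that with the standard javax.transaction.xa.Xid interface. The parameter order (format id, global transaction id, branch qualifier) is an assumption, as are the helper names toParams, toHex and xid; verify the actual operation signature on the MBean before using it.

```java
import javax.transaction.xa.Xid;

class XidJmxArgs {

    /** JMX type names for an assumed (int, byte[], byte[]) operation signature. */
    static final String[] SIGNATURE = {
        int.class.getName(), byte[].class.getName(), byte[].class.getName()
    };

    /** Splits an Xid into the three values of the assumed signature. */
    static Object[] toParams(Xid xid) {
        return new Object[] {
            xid.getFormatId(),
            xid.getGlobalTransactionId(),
            xid.getBranchQualifier()
        };
    }

    /** Renders bytes as lowercase hex, handy when comparing XIDs by eye. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Minimal Xid implementation, used only for this demonstration. */
    static Xid xid(final int formatId, final byte[] globalTxId, final byte[] branchQualifier) {
        return new Xid() {
            public int getFormatId() { return formatId; }
            public byte[] getGlobalTransactionId() { return globalTxId; }
            public byte[] getBranchQualifier() { return branchQualifier; }
        };
    }

    public static void main(String[] args) {
        Xid x = xid(7, new byte[] { 0x0a, (byte) 0xff }, new byte[] { 1 });
        Object[] params = toParams(x);
        System.out.println("formatId=" + params[0]
                + " globalTxId=" + toHex((byte[]) params[1])
                + " branchQualifier=" + toHex((byte[]) params[2]));
    }
}
```

The params and SIGNATURE arrays are in the form expected by MBeanServerConnection.invoke(name, operationName, params, signature).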

Want to know more?

The recovery design document describes the internals of the transaction recovery implementation in more detail.

JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-11 09:39:12 UTC, last content change 2011-05-19 21:46:04 UTC.