Hibernate uses a fetching strategy to retrieve associated objects when the application needs to navigate the association. Fetch strategies can be declared in the O/R mapping metadata, or overridden by a particular HQL or Criteria query.
Hibernate defines the following fetching strategies:
Join fetching: Hibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
Select fetching: a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you access the association.
Subselect fetching: a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you access the association.
Batch fetching: an optimization strategy for select fetching. Hibernate retrieves a batch of entity instances or collections in a single SELECT by specifying a list of primary or foreign keys.
Hibernate also distinguishes between:
Immediate fetching: an association, collection or attribute is fetched immediately when the owner is loaded.
Lazy collection fetching: a collection is fetched when the application invokes an operation upon that collection. This is the default for collections.
"Extra-lazy" collection fetching: individual elements of the collection are accessed from the database as needed. Hibernate tries not to fetch the whole collection into memory unless absolutely needed. It is suitable for large collections.
Proxy fetching: a single-valued association is fetched when a method other than the identifier getter is invoked upon the associated object.
"No-proxy" fetching: a single-valued association is fetched when the instance variable is accessed. Compared to proxy fetching, this approach is less lazy; the association is fetched even when only the identifier is accessed. It is also more transparent, since no proxy is visible to the application. This approach requires buildtime bytecode instrumentation and is rarely necessary.
Lazy attribute fetching: an attribute or single valued association is fetched when the instance variable is accessed. This approach requires buildtime bytecode instrumentation and is rarely necessary.
We have two orthogonal notions here: when the association is fetched and how it is fetched. It is important that you do not confuse them. We use fetch to tune performance. We can use lazy to define a contract for what data is always available in any detached instance of a particular class.
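The two notions map onto two separate mapping attributes. As a sketch (the permissions collection is hypothetical; the values shown here are the defaults for a collection):

```xml
<set name="permissions" lazy="true" fetch="select">
    <key column="userId"/>
    <one-to-many class="Permission"/>
</set>
```

Here lazy controls when the collection is fetched (on first access), while fetch controls how it is fetched (with a second SELECT).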
By default, Hibernate uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for most associations in the majority of applications.
If you set hibernate.default_batch_fetch_size, Hibernate will use the batch fetch optimization for lazy fetching. This optimization can also be enabled at a more granular level.
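For example, a global batch size can be set in the configuration (the value 16 is an arbitrary illustration, not a recommendation):

```
hibernate.default_batch_fetch_size 16
```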
Please be aware that access to a lazy association outside of the context of an open Hibernate session will result in an exception. For example:
s = sessions.openSession();
Transaction tx = s.beginTransaction();

User u = (User) s.createQuery("from User u where u.name=:userName")
    .setString("userName", userName).uniqueResult();
Map permissions = u.getPermissions();

tx.commit();
s.close();

Integer accessLevel = (Integer) permissions.get("accounts"); // Error!
Since the permissions collection was not initialized when the Session was closed, the collection will not be able to load its state. Hibernate does not support lazy initialization for detached objects. This can be fixed by moving the code that reads from the collection to just before the transaction is committed.
Alternatively, you can use a non-lazy collection or association, by specifying lazy="false" for the association mapping. However, it is intended that lazy initialization be used for almost all collections and associations. If you define too many non-lazy associations in your object model, Hibernate will fetch the entire database into memory in every transaction.
On the other hand, you can use join fetching, which is non-lazy by nature, instead of select fetching in a particular transaction. We will now explain how to customize the fetching strategy. In Hibernate, the mechanisms for choosing a fetch strategy are identical for single-valued associations and collections.
Select fetching (the default) is extremely vulnerable to N+1 selects problems, so we might want to enable join fetching in the mapping document:
<set name="permissions" fetch="join">
    <key column="userId"/>
    <one-to-many class="Permission"/>
</set>
<many-to-one name="mother" class="Cat" fetch="join"/>
The fetch strategy defined in the mapping document affects:
retrieval via get() or load()
retrieval that happens implicitly when an association is navigated
Criteria queries
HQL queries, if subselect fetching is used
Irrespective of the fetching strategy you use, the defined non-lazy graph is guaranteed to be loaded into memory. This might, however, result in several immediate selects being used to execute a particular HQL query.
Usually, the mapping document is not used to customize fetching. Instead, we keep the default behavior, and override it for a particular transaction, using left join fetch in HQL. This tells Hibernate to fetch the association eagerly in the first select, using an outer join. In the Criteria query API, you would use setFetchMode(FetchMode.JOIN).
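For instance, the query from the earlier permissions example could be rewritten in HQL to fetch the collection eagerly (userName is a named parameter as before):

```
from User u left join fetch u.permissions where u.name = :userName
```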
If you want to change the fetching strategy used by get() or load(), you can use a Criteria query. For example:
User user = (User) session.createCriteria(User.class)
    .setFetchMode("permissions", FetchMode.JOIN)
    .add( Restrictions.idEq(userId) )
    .uniqueResult();
This is Hibernate's equivalent of what some ORM solutions call a "fetch plan".
A completely different approach to problems with N+1 selects is to use the second-level cache.
Lazy fetching for collections is implemented using Hibernate's own implementation of persistent collections. However, a different mechanism is needed for lazy behavior in single-ended associations. The target entity of the association must be proxied. Hibernate implements lazy initializing proxies for persistent objects using runtime bytecode enhancement which is accessed via the bytecode provider.
At startup, Hibernate generates proxies by default for all persistent classes and uses them to enable lazy fetching of many-to-one and one-to-one associations.
The mapping file may declare an interface to use as the proxy interface for that class, with the proxy attribute. By default, Hibernate uses a subclass of the class. The proxied class must implement a default constructor with at least package visibility. This constructor is recommended for all persistent classes.
There are potential problems to note when extending this approach to polymorphic classes. For example:
<class name="Cat" proxy="Cat">
    ......
    <subclass name="DomesticCat">
        .....
    </subclass>
</class>
Firstly, instances of Cat will never be castable to DomesticCat, even if the underlying instance is an instance of DomesticCat:
Cat cat = (Cat) session.load(Cat.class, id); // instantiate a proxy (does not hit the db)
if ( cat.isDomesticCat() ) { // hit the db to initialize the proxy
    DomesticCat dc = (DomesticCat) cat; // Error!
    ....
}
Secondly, it is possible to break proxy ==:
Cat cat = (Cat) session.load(Cat.class, id); // instantiate a Cat proxy
DomesticCat dc = (DomesticCat) session.load(DomesticCat.class, id); // acquire new DomesticCat proxy!
System.out.println(cat==dc); // false
However, the situation is not quite as bad as it looks. Even though we now have two references to different proxy objects, the underlying instance will still be the same object:
cat.setWeight(11.0); // hit the db to initialize the proxy
System.out.println( dc.getWeight() ); // 11.0
Third, you cannot use a bytecode provider generated proxy for a final class or a class with any final methods.
Finally, if your persistent object acquires any resources upon instantiation (e.g. in initializers or default constructor), then those resources will also be acquired by the proxy. The proxy class is an actual subclass of the persistent class.
These problems are all due to fundamental limitations in Java's single inheritance model. To avoid these problems, your persistent classes must each implement an interface that declares its business methods. You should specify these interfaces in the mapping file, where CatImpl implements the interface Cat and DomesticCatImpl implements the interface DomesticCat. For example:
<class name="CatImpl" proxy="Cat">
    ......
    <subclass name="DomesticCatImpl" proxy="DomesticCat">
        .....
    </subclass>
</class>
Then proxies for instances of Cat and DomesticCat can be returned by load() or iterate().
Cat cat = (Cat) session.load(CatImpl.class, catid);
Iterator iter = session.createQuery("from CatImpl as cat where cat.name='fritz'").iterate();
Cat fritz = (Cat) iter.next();
list() does not usually return proxies.
Relationships are also lazily initialized. This means you must declare any properties to be of type Cat, not CatImpl.
Certain operations do not require proxy initialization:
equals(): if the persistent class does not override equals()
hashCode(): if the persistent class does not override hashCode()
The identifier getter method
Hibernate will detect persistent classes that override equals() or hashCode().
By choosing lazy="no-proxy" instead of the default lazy="proxy", you can avoid problems associated with typecasting. However, buildtime bytecode instrumentation is required, and all operations will result in immediate proxy initialization.
A LazyInitializationException will be thrown by Hibernate if an uninitialized collection or proxy is accessed outside of the scope of the Session, i.e., when the entity owning the collection or holding the reference to the proxy is in the detached state.
Sometimes a proxy or collection needs to be initialized before closing the Session. You can force initialization by calling cat.getSex() or cat.getKittens().size(), for example. However, this can be confusing to readers of the code and it is not convenient for generic code.
The static methods Hibernate.initialize() and Hibernate.isInitialized() provide the application with a convenient way of working with lazily initialized collections or proxies. Hibernate.initialize(cat) will force the initialization of a proxy, cat, as long as its Session is still open. Hibernate.initialize( cat.getKittens() ) has a similar effect for the collection of kittens.
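Putting this together, a sketch of initializing a proxy and a collection before the Session is closed:

```
Cat cat = (Cat) session.load(Cat.class, catId);
Hibernate.initialize(cat);                // hits the db if the proxy is uninitialized
Hibernate.initialize( cat.getKittens() ); // initializes the kittens collection
session.close();
// cat and cat.getKittens() can now safely be used in the detached state
```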
Another option is to keep the Session open until all required collections and proxies have been loaded. In some application architectures, particularly where the code that accesses data using Hibernate and the code that uses it are in different application layers or different physical processes, it can be a problem to ensure that the Session is open when a collection is initialized. There are two basic ways to deal with this issue:
In a web-based application, a servlet filter can be used to close the Session only at the end of a user request, once the rendering of the view is complete (the Open Session in View pattern). Of course, this places heavy demands on the correctness of the exception handling of your application infrastructure. It is vitally important that the Session is closed and the transaction ended before returning to the user, even when an exception occurs during rendering of the view. See the Hibernate Wiki for examples of this "Open Session in View" pattern.
In an application with a separate business tier, the business logic must "prepare" all collections that the web tier needs before returning. This means that the business tier should load all the data required for a particular use case and return it to the presentation/web tier already initialized. Usually, the application calls Hibernate.initialize() for each collection that will be needed in the web tier (this call must occur before the session is closed) or retrieves the collection eagerly using a Hibernate query with a FETCH clause or a FetchMode.JOIN in Criteria. This is usually easier if you adopt the Command pattern instead of a Session Facade.
You can also attach a previously loaded object to a new Session with merge() or lock() before accessing uninitialized collections or other proxies. Hibernate does not, and certainly should not, do this automatically, since it would introduce impromptu transaction semantics.
Sometimes you do not want to initialize a large collection, but still need some information about it, like its size, for example, or a subset of the data.
You can use a collection filter to get the size of a collection without initializing it:
( (Integer) s.createFilter( collection, "select count(*)" ).list().get(0) ).intValue()
The createFilter() method is also used to efficiently retrieve subsets of a collection without needing to initialize the whole collection:
s.createFilter( lazyCollection, "").setFirstResult(0).setMaxResults(10).list();
Using batch fetching, Hibernate can load several uninitialized proxies if one proxy is accessed. Batch fetching is an optimization of the lazy select fetching strategy. There are two ways you can configure batch fetching: on the class level and the collection level.
Batch fetching for classes/entities is easier to understand. Consider the following example: at runtime you have 25 Cat instances loaded in a Session, and each Cat has a reference to its owner, a Person. The Person class is mapped with a proxy, lazy="true". If you now iterate through all cats and call getOwner() on each, Hibernate will, by default, execute 25 SELECT statements to retrieve the proxied owners. You can tune this behavior by specifying a batch-size in the mapping of Person:
<class name="Person" batch-size="10">...</class>
With this batch-size specified, Hibernate still executes queries on demand when an uninitialized proxy is accessed, as above. The difference is that instead of querying only the proxied entity being accessed, it queries a batch of owners at once. When you then access the other owners, most of them will already have been initialized by the batch fetch, so only a few queries (far fewer than 25) are executed. This behavior is controlled by the batch-size attribute and the batch fetch style configuration.
The batch fetch style configuration (hibernate.batch_fetch_style) is a performance improvement introduced in 4.2.0. Three different strategies are provided: legacy, padded and dynamic.
LEGACY
The legacy algorithm keeps a set of pre-built batch sizes based on org.hibernate.internal.util.collections.ArrayHelper#getBatchSizes. Batches are performed using the next-smaller pre-built batch size from the number of existing batchable identifiers. In the above example, with a batch-size setting of 25, the pre-built batch sizes would be [25, 12, 10, 9, 8, 7, ..., 1]. Since there are 25 owners to initialize, only one query will be executed, using all 25 identifiers. If, however, there were only 24 persons, three queries (of sizes 12, 10 and 2) would be executed to load all the owners, looking like this:
select * from owner where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
select * from owner where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
select * from owner where id in (?, ?)
PADDED
This is similar to the legacy algorithm in that it uses the pre-built batch sizes based on the same org.hibernate.internal.util.collections.ArrayHelper#getBatchSizes. The difference is that Hibernate uses the next-bigger batch size and pads the extra identifier placeholders. Using the same example, initializing 25 owners again requires only a single query to batch load all the owners. An attempt to batch load 24 owners, however, would also result in a single batch of size 25; the identifiers to load would be "padded" (that is, repeated) to make up the difference.
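The padding idea can be illustrated with a small standalone sketch. This is not Hibernate's actual code; the class and method names are hypothetical, and it only shows how missing identifiers can be filled by repeating an existing one so a fixed-size statement can be reused:

```java
import java.util.Arrays;

public class PaddedBatchSketch {
    // Sketch of the "padded" idea: when fewer identifiers are available
    // than the chosen batch size, the remaining placeholders are filled
    // by repeating an existing identifier, so one prepared statement of
    // fixed size can serve the whole batch.
    static Long[] pad(Long[] ids, int batchSize) {
        Long[] padded = Arrays.copyOf(ids, batchSize);
        for (int i = ids.length; i < batchSize; i++) {
            padded[i] = ids[ids.length - 1]; // repeat the last real id
        }
        return padded;
    }

    public static void main(String[] args) {
        Long[] ids = { 1L, 2L, 3L };
        // All 5 placeholders are bound; the last two are repeats of id 3
        System.out.println(Arrays.toString(pad(ids, 5))); // [1, 2, 3, 3, 3]
    }
}
```

Repeating an identifier is harmless because the IN-list semantics are unaffected by duplicates.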
DYNAMIC
Dynamically builds the SQL based on the actual number of available identifiers, still limited by the batch-size defined on the entity.
You can also enable batch fetching of collections. For example, if each Person has a lazy collection of Cats, and 10 persons are currently loaded in the Session, iterating through all persons will generate 10 SELECTs, one for every call to getCats(). If you enable batch fetching for the cats collection in the mapping of Person, Hibernate can pre-fetch collections:
<class name="Person">
    <set name="cats" batch-size="3">
        ...
    </set>
</class>
For example, with a batch-size of 3 and the legacy batch style, Hibernate will load 3, 3, 3, 1 collections in four SELECTs. Again, the value of the attribute depends on the expected number of uninitialized collections in a particular Session.
Batch fetching of collections is particularly useful if you have a nested tree of items, i.e. the typical bill-of-materials pattern. However, a nested set or a materialized path might be a better option for read-mostly trees.
If one lazy collection or single-valued proxy has to be fetched, Hibernate will load all of them, re-running the original query in a subselect. This works in the same way as batch-fetching but without the piecemeal loading.
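Subselect fetching is enabled in the mapping with fetch="subselect", for example on a collection (the cats collection and its key column are hypothetical):

```xml
<set name="cats" fetch="subselect">
    <key column="ownerId"/>
    <one-to-many class="Cat"/>
</set>
```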
Another way to affect the fetching strategy for loading associated objects is through something called a fetch profile, which is a named configuration associated with the org.hibernate.SessionFactory but enabled, by name, on the org.hibernate.Session. Once enabled on an org.hibernate.Session, the fetch profile will be in effect for that org.hibernate.Session until it is explicitly disabled.
So what does that mean? Let's explain by way of an example which shows the different available approaches to configuring a fetch profile:
Example 20.1. Specifying a fetch profile using @FetchProfile
@Entity
@FetchProfile(name = "customer-with-orders", fetchOverrides = {
    @FetchProfile.FetchOverride(entity = Customer.class, association = "orders", mode = FetchMode.JOIN)
})
public class Customer {
    @Id
    @GeneratedValue
    private long id;

    private String name;
    private long customerNumber;

    @OneToMany
    private Set<Order> orders;

    // standard getter/setter
    ...
}
Example 20.2. Specifying a fetch profile using <fetch-profile> outside the <class> node
<hibernate-mapping>
    <class name="Customer">
        ...
        <set name="orders" inverse="true">
            <key column="cust_id"/>
            <one-to-many class="Order"/>
        </set>
    </class>
    <class name="Order">
        ...
    </class>
    <fetch-profile name="customer-with-orders">
        <fetch entity="Customer" association="orders" style="join"/>
    </fetch-profile>
</hibernate-mapping>
Example 20.3. Specifying a fetch profile using <fetch-profile> inside the <class> node
<hibernate-mapping>
    <class name="Customer">
        ...
        <set name="orders" inverse="true">
            <key column="cust_id"/>
            <one-to-many class="Order"/>
        </set>
        <fetch-profile name="customer-with-orders">
            <fetch association="orders" style="join"/>
        </fetch-profile>
    </class>
    <class name="Order">
        ...
    </class>
</hibernate-mapping>
Normally, when you get a reference to a particular customer, that customer's set of orders will be lazy, meaning we will not yet have loaded those orders from the database. Normally this is a good thing. Now let's say that you have a certain use case where it is more efficient to load the customer and their orders together. One way is certainly to use a "dynamic fetching" strategy via an HQL or Criteria query. But another option is to use a fetch profile. The following code will load both the customer and their orders:
Example 20.4. Activating a fetch profile for a given Session
Session session = ...;
session.enableFetchProfile( "customer-with-orders" ); // name matches from mapping
Customer customer = (Customer) session.get( Customer.class, customerId );
@FetchProfile definitions are global and it does not matter on which class you place them. You can place the @FetchProfile annotation either onto a class or a package (package-info.java). In order to define multiple fetch profiles for the same class or package, @FetchProfiles can be used.
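For instance, two profiles could be grouped as follows (the second profile name and the payments association are illustrative, not from the example model above):

```
@FetchProfiles({
    @FetchProfile(name = "customer-with-orders", fetchOverrides = {
        @FetchProfile.FetchOverride(entity = Customer.class, association = "orders", mode = FetchMode.JOIN)
    }),
    @FetchProfile(name = "customer-with-payments", fetchOverrides = {
        @FetchProfile.FetchOverride(entity = Customer.class, association = "payments", mode = FetchMode.JOIN)
    })
})
@Entity
public class Customer { ... }
```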
Currently only join style fetch profiles are supported, but the plan is to support additional styles. See HHH-3414 for details.
Hibernate supports the lazy fetching of individual properties. This optimization technique is also known as fetch groups. Please note that this is mostly a marketing feature; optimizing row reads is much more important than optimization of column reads. However, only loading some properties of a class could be useful in extreme cases. For example, when legacy tables have hundreds of columns and the data model cannot be improved.
To enable lazy property loading, set the lazy attribute on your particular property mappings:
<class name="Document">
    <id name="id">
        <generator class="native"/>
    </id>
    <property name="name" not-null="true" length="50"/>
    <property name="summary" not-null="true" length="200" lazy="true"/>
    <property name="text" not-null="true" length="2000" lazy="true"/>
</class>
Lazy property loading requires buildtime bytecode instrumentation. If your persistent classes are not enhanced, Hibernate will ignore lazy property settings and return to immediate fetching.
For bytecode instrumentation, use the following Ant task:
<target name="instrument" depends="compile">
    <taskdef name="instrument" classname="org.hibernate.tool.instrument.InstrumentTask">
        <classpath path="${jar.path}"/>
        <classpath path="${classes.dir}"/>
        <classpath refid="lib.class.path"/>
    </taskdef>
    <instrument verbose="true">
        <fileset dir="${testclasses.dir}/org/hibernate/auction/model">
            <include name="*.class"/>
        </fileset>
    </instrument>
</target>
A different way of avoiding unnecessary column reads, at least for read-only transactions, is to use the projection features of HQL or Criteria queries. This avoids the need for buildtime bytecode processing and is certainly a preferred solution.
You can force the usual eager fetching of properties using fetch all properties in HQL.
A Hibernate Session is a transaction-level cache of persistent data. It is possible to configure a cluster or JVM-level (SessionFactory-level) cache on a class-by-class and collection-by-collection basis. You can even plug in a clustered cache. Be aware that caches are not aware of changes made to the persistent store by another application. They can, however, be configured to regularly expire cached data.
You have the option to tell Hibernate which caching implementation to use by specifying the name of a class that implements org.hibernate.cache.spi.CacheProvider using the property hibernate.cache.provider_class. Hibernate is bundled with a number of built-in integrations with the open-source cache providers that are listed in Table 20.1, "Cache Providers". You can also implement your own and plug it in as outlined above. Note that versions prior to Hibernate 3.2 use EhCache as the default cache provider.
Table 20.1. Cache Providers
Cache | Provider class | Type | Cluster Safe | Query Cache Supported |
---|---|---|---|---|
ConcurrentHashMap (only for testing purposes, in the hibernate-testing module) | org.hibernate.testing.cache.CachingRegionFactory | memory | yes | |
EHCache | org.hibernate.cache.ehcache.EhCacheRegionFactory | memory, disk, transactional, clustered | yes | yes |
Infinispan | org.hibernate.cache.infinispan.InfinispanRegionFactory | clustered (ip multicast), transactional | yes (replication or invalidation) | yes (clock sync req.) |
As we have done in previous chapters, we will look at the two different possibilities for configuring caching: first via annotations and then via Hibernate mapping files.
By default, entities are not part of the second-level cache, and we recommend you stick to this setting. However, you can override this by setting the shared-cache-mode element in your persistence.xml file or by using the javax.persistence.sharedCache.mode property in your configuration. The following values are possible:
ENABLE_SELECTIVE (default and recommended value): entities are not cached unless explicitly marked as cacheable.
DISABLE_SELECTIVE: entities are cached unless explicitly marked as not cacheable.
ALL: all entities are always cached even if marked as non-cacheable.
NONE: no entities are cached even if marked as cacheable. This option makes sense for disabling the second-level cache altogether.
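In persistence.xml this looks like (the persistence unit name is illustrative):

```xml
<persistence-unit name="examplePU">
    ...
    <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
    ...
</persistence-unit>
```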
The cache concurrency strategy used by default can be set globally via the hibernate.cache.default_cache_concurrency_strategy configuration property. The values for this property are:
read-only
read-write
nonstrict-read-write
transactional
It is recommended to define the cache concurrency strategy per entity rather than using a global one. Use the @org.hibernate.annotations.Cache annotation for that.
Example 20.5. Definition of cache concurrency strategy via @Cache
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Forest { ... }
Hibernate also lets you cache the content of a collection, or the identifiers if the collection contains other entities. Use the @Cache annotation on the collection property.
Example 20.6. Caching collections using annotations
@OneToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
@JoinColumn(name="CUST_ID")
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public SortedSet<Ticket> getTickets() {
return tickets;
}
Example 20.7, "@Cache annotation with attributes" shows the @org.hibernate.annotations.Cache annotation with its attributes. It allows you to define the caching strategy and region of a given second-level cache.
Example 20.7. @Cache annotation with attributes
@Cache(
    CacheConcurrencyStrategy usage();
    String region() default "";
    String include() default "all";
)
usage: the given cache concurrency strategy (NONE, READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, TRANSACTIONAL)
region (optional): the cache region (defaults to the fqcn of the class or the fq role name of the collection)
include (optional): all to include all properties, non-lazy to include only non-lazy properties (defaults to all)
Let's now take a look at Hibernate mapping files. There, the <cache> element of a class or collection mapping is used to configure the second-level cache. Looking at Example 20.8, "The Hibernate <cache> mapping element", the parallels to the annotations are obvious.
Example 20.8. The Hibernate <cache> mapping element
<cache
    usage="transactional|read-write|nonstrict-read-write|read-only"
    region="RegionName"
    include="all|non-lazy"
/>
As an alternative to <cache>, you can use <class-cache> and <collection-cache> elements in hibernate.cfg.xml.
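For example (the class and collection role names are illustrative):

```xml
<class-cache class="org.example.Cat" usage="read-write"/>
<collection-cache collection="org.example.Cat.kittens" usage="read-write"/>
```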
Let's now have a closer look at the different usage strategies.
If your application needs to read, but not modify, instances of a persistent class, a read-only cache can be used. This is the simplest and best performing strategy. It is even safe for use in a cluster.
If the application needs to update data, a read-write cache might be appropriate. This cache strategy should never be used if serializable transaction isolation level is required. If the cache is used in a JTA environment, you must specify the property hibernate.transaction.manager_lookup_class, naming a strategy for obtaining the JTA TransactionManager. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called. If you want to use this strategy in a cluster, you should ensure that the underlying cache implementation supports locking. The built-in cache providers do not support locking.
If the application only occasionally needs to update data (i.e. if it is extremely unlikely that two transactions would try to update the same item simultaneously), and strict transaction isolation is not required, a nonstrict-read-write cache might be appropriate. If the cache is used in a JTA environment, you must specify hibernate.transaction.manager_lookup_class. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called.
The transactional cache strategy provides support for fully transactional cache providers such as JBoss TreeCache. Such a cache can only be used in a JTA environment, and you must specify hibernate.transaction.manager_lookup_class.
None of the cache providers support all of the cache concurrency strategies.
The following table shows which providers are compatible with which concurrency strategies.
Table 20.2. Cache Concurrency Strategy Support
Cache | read-only | nonstrict-read-write | read-write | transactional |
---|---|---|---|---|
ConcurrentHashMap (not intended for production use) | yes | yes | yes | |
EHCache | yes | yes | yes | yes |
Infinispan | yes | | | yes |
Whenever you pass an object to save(), update() or saveOrUpdate(), and whenever you retrieve an object using load(), get(), list(), iterate() or scroll(), that object is added to the internal cache of the Session. When flush() is subsequently called, the state of that object will be synchronized with the database. If you do not want this synchronization to occur, or if you are processing a huge number of objects and need to manage memory efficiently, the evict() method can be used to remove the object and its collections from the first-level cache.
Example 20.9. Explicitly evicting a cached instance from the first-level cache using Session.evict()
ScrollableResult cats = sess.createQuery("from Cat as cat").scroll(); // a huge result set
while ( cats.next() ) {
    Cat cat = (Cat) cats.get(0);
    doSomethingWithACat(cat);
    sess.evict(cat);
}
The Session also provides a contains() method to determine if an instance belongs to the session cache.
To evict all objects from the session cache, call Session.clear().
For the second-level cache, there are methods defined on SessionFactory for evicting the cached state of an instance, an entire class, a collection instance or an entire collection role.
Example 20.10. Second-level cache eviction via SessionFactory.evict() and SessionFactory.evictCollection()
sessionFactory.evict(Cat.class, catId); // evict a particular Cat
sessionFactory.evict(Cat.class); // evict all Cats
sessionFactory.evictCollection("Cat.kittens", catId); // evict a particular collection of kittens
sessionFactory.evictCollection("Cat.kittens"); // evict all kitten collections
The CacheMode controls how a particular session interacts with the second-level cache:
CacheMode.NORMAL: reads items from and writes items to the second-level cache.
CacheMode.GET: reads items from the second-level cache, but does not write to it except when updating data.
CacheMode.PUT: writes items to the second-level cache, but does not read from it.
CacheMode.REFRESH: writes items to the second-level cache, but does not read from it; bypasses the effect of hibernate.cache.use_minimal_puts, forcing a refresh of the second-level cache for all items read from the database.
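The cache mode is set on the session, for example:

```
Session session = sessionFactory.openSession();
session.setCacheMode(CacheMode.GET); // read from the second-level cache, but add no new entries
```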
To browse the contents of a second-level or query cache region, use the Statistics API:
Example 20.11. Browsing the second-level cache entries via the Statistics API
Map cacheEntries = sessionFactory.getStatistics()
    .getSecondLevelCacheStatistics(regionName)
    .getEntries();
You will need to enable statistics and, optionally, force Hibernate to keep the cache entries in a more readable format:
Example 20.12. Enabling Hibernate statistics
hibernate.generate_statistics true
hibernate.cache.use_structured_entries true
Query result sets can also be cached. This is only useful for queries that are run frequently with the same parameters.
Caching of query results introduces some overhead in terms of your application's normal transactional processing. For example, if you cache the results of a query against Person, Hibernate will need to keep track of when those results should be invalidated because changes have been committed against Person. That, coupled with the fact that most applications simply gain no benefit from caching query results, leads Hibernate to disable caching of query results by default. To use query caching, you will first need to enable the query cache:
hibernate.cache.use_query_cache true
This setting creates two new cache regions:
org.hibernate.cache.internal.StandardQueryCache, holding the cached query results

org.hibernate.cache.spi.UpdateTimestampsCache, holding timestamps of the most recent updates to queryable tables. These are used to validate the results as they are served from the query cache.
If you configure your underlying cache implementation to use expiry or timeouts, it is very important that the cache timeout of the underlying cache region for the UpdateTimestampsCache be set to a higher value than the timeouts of any of the query caches. In fact, we recommend that the UpdateTimestampsCache region not be configured for expiry at all. Note, in particular, that an LRU cache expiry policy is never appropriate.
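For example, with Ehcache as the underlying provider, the region backing the UpdateTimestampsCache can be made eternal while query regions carry a timeout. The region names match the defaults mentioned above; the sizes and timeout values here are illustrative only.

```xml
<!-- ehcache.xml (illustrative values) -->
<ehcache>
    <!-- Never expire update timestamps -->
    <cache name="org.hibernate.cache.spi.UpdateTimestampsCache"
           maxElementsInMemory="5000"
           eternal="true"
           overflowToDisk="false"/>

    <!-- Query regions may use a timeout, as long as it is shorter than
         the (here: infinite) lifetime of the timestamps region -->
    <cache name="org.hibernate.cache.internal.StandardQueryCache"
           maxElementsInMemory="1000"
           eternal="false"
           timeToLiveSeconds="300"
           overflowToDisk="false"/>
</ehcache>
```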
As mentioned above, most queries do not benefit from caching of
their results. So by default, individual queries are not cached even
after enabling query caching. To enable results caching for a particular
query, call org.hibernate.Query.setCacheable(true)
.
This call allows the query to look for existing cache results or add its
results to the cache when it is executed.
The query cache does not cache the state of the actual entities in the cache; it caches only identifier values and results of value type. For this reason, the query cache should always be used in conjunction with the second-level cache for those entities expected to be cached as part of a query result cache (just as with collection caching).
If you require fine-grained control over query cache expiration
policies, you can specify a named cache region for a particular query by
calling Query.setCacheRegion()
.
List blogs = sess.createQuery("from Blog blog where blog.blogger = :blogger")
    .setEntity("blogger", blogger)
    .setMaxResults(15)
    .setCacheable(true)
    .setCacheRegion("frontpages")
    .list();
If you want to force the query cache to refresh one of its regions
(disregard any cached results it finds there) you can use
org.hibernate.Query.setCacheMode(CacheMode.REFRESH)
.
In conjunction with the region you have defined for the given query,
Hibernate will selectively force the results cached in that particular
region to be refreshed. This is particularly useful in cases where
underlying data may have been updated via a separate process and is a
far more efficient alternative to bulk eviction of the region via
org.hibernate.SessionFactory.evictQueries()
.
Hibernate internally needs an entry (org.hibernate.engine.spi.EntityEntry) to track the current state of an object with respect to its persistent state when the object is associated with a Session. However, maintaining this association was a rather heavy operation because of the many other rules that had to be applied. Since 4.2.0, there is an improvement designed for this purpose, which reduces session-related memory and CPU overhead.
Basically, the idea is that instead of using a customized map (rather heavy, and usually identified as a hotspot) to do the lookup, the entry is obtained from the entity itself:
EntityEntry entry = ((ManagedEntity) entity).$$_hibernate_getEntityEntry();
There are three ways to benefit from this improvement:
An entity can choose to implement this interface itself; it is then the entity's responsibility to maintain the bi-directional association that provides access to information about an instance's association to a Session/EntityManager. For more information about org.hibernate.engine.spi.ManagedEntity, see its javadoc.
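The pattern looks roughly like the sketch below. The ManagedEntity interface and EntityEntry class here are simplified stand-ins for org.hibernate.engine.spi.ManagedEntity and org.hibernate.engine.spi.EntityEntry; the real types carry more methods.

```java
// Simplified sketch of an entity maintaining its own EntityEntry
// reference. The interface below is a stand-in, not Hibernate's
// actual org.hibernate.engine.spi.ManagedEntity.
public class ManagedEntitySketch {

    static class EntityEntry { } // stand-in for Hibernate's EntityEntry

    interface ManagedEntity {
        Object $$_hibernate_getEntityInstance();
        EntityEntry $$_hibernate_getEntityEntry();
        void $$_hibernate_setEntityEntry(EntityEntry entry);
    }

    static class Customer implements ManagedEntity {
        private Long id;
        // transient so the entry is not serialized with the entity state
        private transient EntityEntry entry;

        public Object $$_hibernate_getEntityInstance() { return this; }
        public EntityEntry $$_hibernate_getEntityEntry() { return entry; }
        public void $$_hibernate_setEntityEntry(EntityEntry entry) { this.entry = entry; }
    }

    public static void main(String[] args) {
        Customer c = new Customer();
        EntityEntry e = new EntityEntry();
        c.$$_hibernate_setEntityEntry(e); // done by the session when attaching
        // the session can now read the entry directly, without a map lookup
        System.out.println(c.$$_hibernate_getEntityEntry() == e);
    }
}
```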
Sometimes you may not want to implement an intrusive interface, perhaps due to portability concerns. That is fine: Hibernate will take care of this internally with a wrapper class that implements the interface, together with an internal cache that maps each entity instance to its wrapper.
Obviously, this is the easiest option, since it does not require any change to the project source code, but it also costs more memory and CPU usage compared to the first approach.
Besides the above two approaches, Hibernate also provides a third choice: build-time bytecode enhancement. Applications can use enhanced entity classes, annotated with either javax.persistence.Entity or composite javax.persistence.Embeddable.
To use the task org.hibernate.tool.enhance.EnhancementTask,
define a taskdef and call the task, as shown below. This code uses a
pre-defined classpathref and a property referencing the compiled classes
directory.
<taskdef name="enhance"
         classname="org.hibernate.tool.enhance.EnhancementTask"
         classpathref="enhancement.classpath" />
<enhance>
    <fileset dir="${ejb-classes}/org/hibernate/auction/model" includes="**/*.class"/>
</enhance>
The EnhancementTask is intended as a total replacement for InstrumentTask. Further, it is also incompatible with InstrumentTask, so any existing instrumented classes will need to be built from source again.
The Maven Plugin uses a Mojo descriptor to attach the Mojo to the compile phase for your project.
<dependencies>
    <dependency>
        <groupId>org.hibernate.javax.persistence</groupId>
        <artifactId>hibernate-jpa-[SPEC-VERSION]-api</artifactId>
        <version>[IMPL-VERSION]</version>
        <scope>compile</scope>
    </dependency>
</dependencies>

<plugins>
    <plugin>
        <groupId>org.hibernate.orm.tooling</groupId>
        <artifactId>hibernate-enhance-maven-plugin</artifactId>
        <version>VERSION</version>
        <executions>
            <execution>
                <goals>
                    <goal>enhance</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
</plugins>
The Gradle plugin adds an enhance task using the output directory of the compile task as the source location of entity class files to enhance.
apply plugin: 'java'
apply plugin: 'maven'
apply plugin: 'enhance'

buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath 'org.hibernate:hibernate-gradle-plugin:VERSION'
    }
}

dependencies {
    compile group: 'org.hibernate.javax.persistence', name: 'hibernate-jpa-[SPEC-VERSION]-api', version: '[IMPL-VERSION]'
    compile group: 'org.hibernate', name: 'hibernate-gradle-plugin', version: 'VERSION'
}
In the previous sections we have covered collections and their applications. In this section we explore some more issues in relation to collections at runtime.
Hibernate defines three basic kinds of collections:
collections of values
one-to-many associations
many-to-many associations
This classification distinguishes the various table and foreign key relationships but does not tell us quite everything we need to know about the relational model. To fully understand the relational structure and performance characteristics, we must also consider the structure of the primary key that is used by Hibernate to update or delete collection rows. This suggests the following classification:
indexed collections
sets
bags
All indexed collections (maps, lists, and arrays) have a primary
key consisting of the <key>
and
<index>
columns. In this case, collection
updates are extremely efficient. The primary key can be efficiently
indexed and a particular row can be efficiently located when Hibernate
tries to update or delete it.
Sets have a primary key consisting of
<key>
and element columns. This can be less
efficient for some types of collection element, particularly composite
elements or large text or binary fields, as the database may not be able
to index a complex primary key as efficiently. However, for one-to-many
or many-to-many associations, particularly in the case of synthetic
identifiers, it is likely to be just as efficient. If you want
SchemaExport
to actually create the primary key of a
<set>
, you must declare all columns as
not-null="true"
.
<idbag>
mappings define a surrogate key,
so they are efficient to update. In fact, they are the best case.
Bags are the worst case since they permit duplicate element values
and, as they have no index column, no primary key can be defined.
Hibernate has no way of distinguishing between duplicate rows. Hibernate
resolves this problem by completely removing in a single
DELETE
and recreating the collection whenever it
changes. This can be inefficient.
For a one-to-many association, the "primary key" may not be the physical primary key of the database table. Even in this case, the above classification is still useful. It reflects how Hibernate "locates" individual rows of the collection.
From the discussion above, it should be clear that indexed collections and sets allow the most efficient operation in terms of adding, removing and updating elements.
There is, arguably, one more advantage that indexed collections
have over sets for many-to-many associations or collections of values.
Because of the structure of a Set
, Hibernate does not
UPDATE
a row when an element is "changed". Changes to
a Set
always work via INSERT
and
DELETE
of individual rows. Once again, this
consideration does not apply to one-to-many associations.
After observing that arrays cannot be lazy, you can conclude that lists, maps and idbags are the most performant (non-inverse) collection types, with sets not far behind. You can expect sets to be the most common kind of collection in Hibernate applications. This is because the "set" semantics are most natural in the relational model.
However, in well-designed Hibernate domain models, most
collections are in fact one-to-many associations with
inverse="true"
. For these associations, the update is
handled by the many-to-one end of the association, and so considerations
of collection update performance simply do not apply.
There is a particular case, however, in which bags, and also
lists, are much more performant than sets. For a collection with
inverse="true"
, the standard bidirectional
one-to-many relationship idiom, for example, we can add elements to a
bag or list without needing to initialize (fetch) the bag elements. This
is because, unlike a set
,
Collection.add()
or
Collection.addAll()
must always return true for a bag
or List
. This can make the following common code much
faster:
Parent p = (Parent) sess.load(Parent.class, id);
Child c = new Child();
c.setParent(p);
p.getChildren().add(c); //no need to fetch the collection!
sess.flush();
Deleting collection elements one by one can sometimes be extremely
inefficient. Hibernate knows not to do that in the case of a
newly-empty collection (if you called list.clear()
,
for example). In this case, Hibernate will issue a single
DELETE
.
Suppose you added a single element to a collection of size twenty
and then remove two elements. Hibernate will issue one
INSERT
statement and two DELETE
statements, unless the collection is a bag. This is certainly
desirable.
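The bookkeeping behind this can be pictured as a diff between the collection snapshot loaded with the session and its current state, with one statement per changed row. The sketch below is a plain-Java illustration of that idea, not Hibernate's actual collection code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: diff a collection snapshot against its current state to
// count the row-level statements a set-style mapping would need.
// This mirrors the idea only; it is not Hibernate's implementation.
public class CollectionDiff {

    static int[] countStatements(Set<String> snapshot, Set<String> current) {
        Set<String> toInsert = new HashSet<String>(current);
        toInsert.removeAll(snapshot); // rows present now but not before
        Set<String> toDelete = new HashSet<String>(snapshot);
        toDelete.removeAll(current);  // rows present before but not now
        return new int[] { toInsert.size(), toDelete.size() };
    }

    public static void main(String[] args) {
        // start with three elements, add one and remove two
        Set<String> snapshot = new HashSet<String>(Arrays.asList("a", "b", "c"));
        Set<String> current  = new HashSet<String>(Arrays.asList("a", "d"));
        int[] stmts = countStatements(snapshot, current);
        System.out.println(stmts[0] + " INSERT, " + stmts[1] + " DELETE");
        // prints "1 INSERT, 2 DELETE"
    }
}
```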
However, suppose that we remove eighteen elements, leaving two, and then add three new elements. There are two possible ways to proceed:
delete eighteen rows one by one and then insert three rows
remove the whole collection in one SQL
DELETE
and insert all five current elements one
by one
Hibernate cannot know that the second option is probably quicker. It would probably be undesirable for Hibernate to be that intuitive as such behavior might confuse database triggers, etc.
Fortunately, you can force this behavior (i.e. the second strategy) at any time by discarding (i.e. dereferencing) the original collection and returning a newly instantiated collection with all the current elements.
One-shot-delete does not apply to collections mapped
inverse="true"
.
Optimization is not much use without monitoring and access to
performance numbers. Hibernate provides a full range of figures about its
internal operations. Statistics in Hibernate are available per
SessionFactory
.
You can access SessionFactory
metrics in two
ways. Your first option is to call
sessionFactory.getStatistics()
and read or display
the Statistics
yourself.
Hibernate can also use JMX to publish metrics if you enable the
StatisticsService
MBean. You can enable a single
MBean for all your SessionFactory
or one per factory.
See the following code for minimalistic configuration examples:
// MBean service registration for a specific SessionFactory
Hashtable tb = new Hashtable();
tb.put("type", "statistics");
tb.put("sessionFactory", "myFinancialApp");
ObjectName on = new ObjectName("hibernate", tb); // MBean object name

StatisticsService stats = new StatisticsService(); // MBean implementation
stats.setSessionFactory(sessionFactory); // Bind the stats to a SessionFactory
server.registerMBean(stats, on); // Register the MBean on the server
// MBean service registration for all SessionFactory's
Hashtable tb = new Hashtable();
tb.put("type", "statistics");
tb.put("sessionFactory", "all");
ObjectName on = new ObjectName("hibernate", tb); // MBean object name

StatisticsService stats = new StatisticsService(); // MBean implementation
server.registerMBean(stats, on); // Register the MBean on the server
You can activate and deactivate the monitoring for a
SessionFactory
:
at configuration time, set
hibernate.generate_statistics
to
false
at runtime:
sf.getStatistics().setStatisticsEnabled(true)
or
hibernateStatsBean.setStatisticsEnabled(true)
Statistics can be reset programmatically using the
clear()
method. A summary can be sent to a logger
(info level) using the logSummary()
method.
Hibernate provides a number of metrics, from basic information to
more specialized information that is only relevant in certain scenarios.
All available counters are described in the
Statistics
interface API, in three categories:
Metrics related to the general Session
usage, such as number of open sessions, retrieved JDBC connections,
etc.
Metrics related to the entities, collections, queries, and caches as a whole (aka global metrics).
Detailed metrics related to a particular entity, collection, query or cache region.
For example, you can check the cache hit, miss, and put ratio of entities, collections and queries, and the average time a query takes. Be aware that the number of milliseconds is subject to approximation in Java. Hibernate is tied to the JVM's precision; on some platforms, this might only be accurate to 10 milliseconds.
Simple getters are used to access the global metrics (i.e. not
tied to a particular entity, collection, cache region, etc.). You can
access the metrics of a particular entity, collection or cache region
through its name, and through its HQL or SQL representation for queries.
Please refer to the Statistics
,
EntityStatistics
,
CollectionStatistics
,
SecondLevelCacheStatistics
, and
QueryStatistics
API Javadoc for more information. The
following code is a simple example:
Statistics stats = HibernateUtil.sessionFactory.getStatistics();

double queryCacheHitCount  = stats.getQueryCacheHitCount();
double queryCacheMissCount = stats.getQueryCacheMissCount();
double queryCacheHitRatio =
    queryCacheHitCount / (queryCacheHitCount + queryCacheMissCount);
log.info("Query Hit ratio: " + queryCacheHitRatio);

EntityStatistics entityStats =
    stats.getEntityStatistics( Cat.class.getName() );
long changes =
    entityStats.getInsertCount()
    + entityStats.getUpdateCount()
    + entityStats.getDeleteCount();
log.info(Cat.class.getName() + " changed " + changes + " times");
You can work on all entities, collections, queries and region
caches, by retrieving the list of names of entities, collections,
queries and region caches using the following methods:
getQueries()
, getEntityNames()
,
getCollectionRoleNames()
, and
getSecondLevelCacheRegionNames()
.