A fetching strategy is the strategy Hibernate uses to retrieve associated objects when the application needs to navigate an association. Fetch strategies may be declared in the O/R mapping metadata, or overridden by a particular HQL or Criteria query.
Hibernate3 defines the following fetching strategies:
Join fetching - Hibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
Select fetching - a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Subselect fetching - a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
Batch fetching - an optimization strategy for select fetching - Hibernate retrieves a batch of entity instances or collections in a single SELECT, by specifying a list of primary keys or foreign keys.
Hibernate also distinguishes between:
Immediate fetching - an association, collection or attribute is fetched immediately, when the owner is loaded.
Lazy collection fetching - a collection is fetched when the application invokes an operation upon that collection. (This is the default for collections.)
"Extra-lazy" collection fetching - individual elements of the collection are accessed from the database as needed. Hibernate tries not to fetch the whole collection into memory unless absolutely needed (suitable for very large collections).
Proxy fetching - a single-valued association is fetched when a method other than the identifier getter is invoked upon the associated object.
"No-proxy" fetching - a single-valued association is fetched when the instance variable is accessed. Compared to proxy fetching, this approach is less lazy (the association is fetched even when only the identifier is accessed) but more transparent, since no proxy is visible to the application. This approach requires build-time bytecode instrumentation and is rarely necessary.
Lazy attribute fetching - an attribute or single-valued association is fetched when the instance variable is accessed. This approach requires build-time bytecode instrumentation and is rarely necessary.
We have two orthogonal notions here: when is the association fetched, and how is it fetched (what SQL is used). Don't confuse them! We use fetch to tune performance. We may use lazy to define a contract for what data is always available in any detached instance of a particular class.
By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for almost all associations in almost all applications.
Note: if you set hibernate.default_batch_fetch_size, Hibernate will use the batch fetch optimization for lazy fetching (this optimization may also be enabled at a more granular level).
However, lazy fetching poses one problem that you must be aware of. Access to a lazy association outside of the context of an open Hibernate session will result in an exception. For example:
    s = sessions.openSession();
    Transaction tx = s.beginTransaction();
    User u = (User) s.createQuery("from User u where u.name=:userName")
        .setString("userName", userName).uniqueResult();
    Map permissions = u.getPermissions();
    tx.commit();
    s.close();
    Integer accessLevel = (Integer) permissions.get("accounts");  // Error!
Since the permissions collection was not initialized when the Session was closed, the collection will not be able to load its state. Hibernate does not support lazy initialization for detached objects. The fix is to move the code that reads from the collection to just before the transaction is committed.
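The failure and the fix can be modeled without Hibernate. The following is only a toy sketch: ToySession, LazyPermissions and the sample data are hypothetical stand-ins, not Hibernate classes, and an IllegalStateException plays the role of Hibernate's LazyInitializationException. It shows that the first read of a lazy collection succeeds while the session is open and fails once it has been closed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class LazyAccessSketch {
    // Hypothetical stand-in for a Session: it only tracks whether it is open.
    static final class ToySession {
        private boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    // Hypothetical stand-in for a lazy persistent collection.
    static final class LazyPermissions {
        private final ToySession session;
        private final Supplier<Map<String, Integer>> loader;
        private Map<String, Integer> state; // stays null until the first access

        LazyPermissions(ToySession session, Supplier<Map<String, Integer>> loader) {
            this.session = session;
            this.loader = loader;
        }

        Integer get(String key) {
            if (state == null) {
                if (!session.isOpen()) {
                    // Hibernate would throw LazyInitializationException at this point.
                    throw new IllegalStateException("session closed before initialization");
                }
                state = loader.get();
            }
            return state.get(key);
        }
    }

    // Pretends to run the second SELECT that loads the collection.
    static Map<String, Integer> loadPermissions() {
        Map<String, Integer> rows = new HashMap<>();
        rows.put("accounts", 3); // made-up sample value
        return rows;
    }

    // The fix: read from the collection before the session is closed.
    static Integer readBeforeClose() {
        ToySession s = new ToySession();
        LazyPermissions permissions = new LazyPermissions(s, LazyAccessSketch::loadPermissions);
        Integer accessLevel = permissions.get("accounts");
        s.close();
        return accessLevel;
    }

    // The problem: the first read happens after the session is closed.
    static Integer readAfterClose() {
        ToySession s = new ToySession();
        LazyPermissions permissions = new LazyPermissions(s, LazyAccessSketch::loadPermissions);
        s.close();
        return permissions.get("accounts"); // throws IllegalStateException
    }

    public static void main(String[] args) {
        System.out.println(readBeforeClose());
        try {
            readAfterClose();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```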
Alternatively, we could use a non-lazy collection or association, by specifying lazy="false" for the association mapping.
However, it is intended that lazy initialization be used for almost all
collections and associations. If you define too many non-lazy associations
in your object model, Hibernate will end up needing to fetch the entire
database into memory in every transaction!
On the other hand, we often want to choose join fetching (which is non-lazy by nature) instead of select fetching in a particular transaction. We'll now see how to customize the fetching strategy. In Hibernate3, the mechanisms for choosing a fetch strategy are identical for single-valued associations and collections.
Select fetching (the default) is extremely vulnerable to N+1 selects problems, so we might want to enable join fetching in the mapping document:
    <set name="permissions" fetch="join">
        <key column="userId"/>
        <one-to-many class="Permission"/>
    </set>

    <many-to-one name="mother" class="Cat" fetch="join"/>
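To make the N+1 selects problem concrete, here is a rough sketch that only counts statements. CountingDb and its methods are hypothetical and stand in for a real database; no Hibernate API is involved. With select fetching, loading N users and then navigating each user's permissions costs 1 + N SELECTs, while a single joined query costs one.

```java
import java.util.ArrayList;
import java.util.List;

final class NPlusOneSketch {
    // Hypothetical stand-in for a database that only counts the SELECTs it receives.
    static final class CountingDb {
        int selects = 0;

        List<Integer> selectUserIds(int n) {
            selects++;
            List<Integer> ids = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                ids.add(i);
            }
            return ids;
        }

        List<String> selectPermissions(int userId) {
            selects++;
            List<String> permissions = new ArrayList<>();
            permissions.add("accounts-" + userId);
            return permissions;
        }

        // One outer-join style query that returns every user's permissions at once.
        List<List<String>> selectUsersJoinPermissions(int n) {
            selects++;
            List<List<String>> rows = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                List<String> permissions = new ArrayList<>();
                permissions.add("accounts-" + i);
                rows.add(permissions);
            }
            return rows;
        }
    }

    // Select fetching: one SELECT for the users, then one per navigated association.
    static int selectsWithSelectFetching(int users) {
        CountingDb db = new CountingDb();
        for (int id : db.selectUserIds(users)) {
            db.selectPermissions(id);
        }
        return db.selects;
    }

    // Join fetching: users and permissions arrive in the same SELECT.
    static int selectsWithJoinFetching(int users) {
        CountingDb db = new CountingDb();
        db.selectUsersJoinPermissions(users);
        return db.selects;
    }

    public static void main(String[] args) {
        System.out.println(selectsWithSelectFetching(100)); // 101
        System.out.println(selectsWithJoinFetching(100));   // 1
    }
}
```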
The fetch strategy defined in the mapping document affects:
retrieval via get() or load()
retrieval that happens implicitly when an association is navigated
Criteria queries
HQL queries if subselect fetching is used
No matter what fetching strategy you use, the defined non-lazy graph is guaranteed to be loaded into memory. Note that this might result in several immediate selects being used to execute a particular HQL query.
Usually, we don't use the mapping document to customize fetching. Instead, we keep the default behavior, and override it for a particular transaction, using left join fetch in HQL. This tells Hibernate to fetch the association eagerly in the first select, using an outer join. In the Criteria query API, you would use setFetchMode(FetchMode.JOIN).
If you want to change the fetching strategy used by get() or load(), simply use a Criteria query, for example:
    User user = (User) session.createCriteria(User.class)
        .setFetchMode("permissions", FetchMode.JOIN)
        .add( Restrictions.idEq(userId) )
        .uniqueResult();
(This is Hibernate's equivalent of what some ORM solutions call a "fetch plan".)
A completely different way to avoid problems with N+1 selects is to use the second-level cache.
Lazy fetching for collections is implemented using Hibernate's own implementation of persistent collections. However, a different mechanism is needed for lazy behavior in single-ended associations. The target entity of the association must be proxied. Hibernate implements lazy initializing proxies for persistent objects using runtime bytecode enhancement (via the excellent CGLIB library).
By default, Hibernate3 generates proxies (at startup) for all persistent classes and uses them to enable lazy fetching of many-to-one and one-to-one associations.
The mapping file may declare an interface to use as the proxy interface for that class, with the proxy attribute. By default, Hibernate uses a subclass of the class. Note that the proxied class must implement a default constructor with at least package visibility. We recommend this constructor for all persistent classes!
There are some gotchas to be aware of when extending this approach to polymorphic classes, e.g.:
    <class name="Cat" proxy="Cat">
        ......
        <subclass name="DomesticCat">
            .....
        </subclass>
    </class>
Firstly, instances of Cat will never be castable to DomesticCat, even if the underlying instance is an instance of DomesticCat:
    Cat cat = (Cat) session.load(Cat.class, id);  // instantiate a proxy (does not hit the db)
    if ( cat.isDomesticCat() ) {                  // hit the db to initialize the proxy
        DomesticCat dc = (DomesticCat) cat;       // Error!
        ....
    }
Secondly, it is possible to break proxy ==.
    Cat cat = (Cat) session.load(Cat.class, id);            // instantiate a Cat proxy
    DomesticCat dc =
        (DomesticCat) session.load(DomesticCat.class, id);  // acquire new DomesticCat proxy!
    System.out.println(cat==dc);                            // false
However, the situation is not quite as bad as it looks. Even though we now have two references to different proxy objects, the underlying instance will still be the same object:
    cat.setWeight(11.0);  // hit the db to initialize the proxy
    System.out.println( dc.getWeight() );  // 11.0
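Both effects follow from the proxy being a subclass of the class it was requested for. The sketch below imitates this with hand-written delegating subclasses. CatProxy and DomesticCatProxy are hypothetical and only approximate what a generated proxy does (there is no lazy loading here); the Cat and DomesticCat classes are minimal versions assumed for illustration.

```java
final class ClassProxySketch {
    static class Cat {
        private double weight;
        double getWeight() { return weight; }
        void setWeight(double weight) { this.weight = weight; }
        boolean isDomesticCat() { return false; }
    }

    static class DomesticCat extends Cat {
        @Override boolean isDomesticCat() { return true; }
    }

    // Hypothetical analogue of a generated Cat proxy: a subclass of Cat that delegates.
    static class CatProxy extends Cat {
        private final Cat target;
        CatProxy(Cat target) { this.target = target; }
        @Override double getWeight() { return target.getWeight(); }
        @Override void setWeight(double weight) { target.setWeight(weight); }
        @Override boolean isDomesticCat() { return target.isDomesticCat(); }
    }

    // Hypothetical analogue of a generated DomesticCat proxy.
    static class DomesticCatProxy extends DomesticCat {
        private final DomesticCat target;
        DomesticCatProxy(DomesticCat target) { this.target = target; }
        @Override double getWeight() { return target.getWeight(); }
        @Override void setWeight(double weight) { target.setWeight(weight); }
    }

    // The proxy is a Cat subclass, so the cast fails although the target is a DomesticCat.
    static boolean castToSubclassFails() {
        Cat cat = new CatProxy(new DomesticCat());
        try {
            DomesticCat dc = (DomesticCat) cat;
            return false;
        } catch (ClassCastException e) {
            return cat.isDomesticCat(); // the underlying instance is still domestic
        }
    }

    // Two distinct proxy objects still share the state of one underlying instance.
    static double weightSeenThroughOtherProxy() {
        DomesticCat underlying = new DomesticCat();
        Cat cat = new CatProxy(underlying);
        DomesticCat dc = new DomesticCatProxy(underlying);
        cat.setWeight(11.0);
        return dc.getWeight();
    }

    public static void main(String[] args) {
        System.out.println(castToSubclassFails());         // true
        System.out.println(weightSeenThroughOtherProxy()); // 11.0
    }
}
```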
Third, you may not use a CGLIB proxy for a final class or a class with any final methods.
Finally, if your persistent object acquires any resources upon instantiation (e.g. in initializers or default constructor), then those resources will also be acquired by the proxy. The proxy class is an actual subclass of the persistent class.
These problems are all due to fundamental limitations in Java's single inheritance model. If you wish to avoid these problems, your persistent classes must each implement an interface that declares its business methods. You should specify these interfaces in the mapping file, e.g.:
    <class name="CatImpl" proxy="Cat">
        ......
        <subclass name="DomesticCatImpl" proxy="DomesticCat">
            .....
        </subclass>
    </class>
where CatImpl implements the interface Cat and DomesticCatImpl implements the interface DomesticCat. Then proxies for instances of Cat and DomesticCat may be returned by load() or iterate(). (Note that list() does not usually return proxies.)
    Cat cat = (Cat) session.load(CatImpl.class, catid);
    Iterator iter = session.createQuery("from CatImpl as cat where cat.name='fritz'").iterate();
    Cat fritz = (Cat) iter.next();
Relationships are also lazily initialized. This means you must declare any properties to be of type Cat, not CatImpl.
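The reason interface proxies avoid the typecasting problem can be sketched with the JDK's own java.lang.reflect.Proxy. This is not how Hibernate builds its proxies; lazyProxy, the property names and the sample values are assumptions made for illustration, and the sketch assumes the proxy is created against the DomesticCat interface. Because the proxy implements the interface rather than extending a class, the cast to DomesticCat succeeds, and the target is loaded on the first method call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

final class InterfaceProxySketch {
    interface Cat { String getName(); }
    interface DomesticCat extends Cat { String getOwnerName(); }

    static class CatImpl implements Cat {
        private final String name;
        CatImpl(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    static class DomesticCatImpl extends CatImpl implements DomesticCat {
        private final String ownerName;
        DomesticCatImpl(String name, String ownerName) {
            super(name);
            this.ownerName = ownerName;
        }
        @Override public String getOwnerName() { return ownerName; }
    }

    // Hypothetical helper: returns a proxy that implements proxyInterface and
    // obtains the real target from loader on the first method call.
    static <T> T lazyProxy(Class<T> proxyInterface, Supplier<? extends T> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private T target; // stays null until the proxy is first used

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get();
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow what the target itself threw
                }
            }
        };
        Object proxy = Proxy.newProxyInstance(
                proxyInterface.getClassLoader(), new Class<?>[] { proxyInterface }, handler);
        return proxyInterface.cast(proxy);
    }

    static String castAndRead() {
        Cat cat = lazyProxy(DomesticCat.class, () -> new DomesticCatImpl("fritz", "Anna"));
        DomesticCat dc = (DomesticCat) cat; // succeeds: the proxy implements the interface
        return dc.getOwnerName() + "/" + cat.getName();
    }

    public static void main(String[] args) {
        System.out.println(castAndRead()); // Anna/fritz
    }
}
```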
Certain operations do not require proxy initialization:
equals(), if the persistent class does not override equals()
hashCode(), if the persistent class does not override hashCode()
The identifier getter method
Hibernate will detect persistent classes that override equals() or hashCode().
By choosing lazy="no-proxy" instead of the default lazy="proxy", we can avoid the problems associated with typecasting. However, we will require build-time bytecode instrumentation, and all operations will result in immediate proxy initialization.
A LazyInitializationException will be thrown by Hibernate if an uninitialized collection or proxy is accessed outside of the scope of the Session, i.e. when the entity owning the collection or having the reference to the proxy is in the detached state.
Sometimes we need to ensure that a proxy or collection is initialized before closing the Session. Of course, we can always force initialization by calling cat.getSex() or cat.getKittens().size(), for example. But that is confusing to readers of the code and is not convenient for generic code.
The static methods Hibernate.initialize() and Hibernate.isInitialized() provide the application with a convenient way of working with lazily initialized collections or proxies. Hibernate.initialize(cat) will force the initialization of a proxy, cat, as long as its Session is still open. Hibernate.initialize( cat.getKittens() ) has a similar effect for the collection of kittens.
Another option is to keep the Session open until all needed collections and proxies have been loaded. In some application architectures, particularly where the code that accesses data using Hibernate and the code that uses it are in different application layers or different physical processes, it can be a problem to ensure that the Session is open when a collection is initialized. There are two basic ways to deal with this issue:
In a web-based application, a servlet filter can be used to close the Session only at the very end of a user request, once the rendering of the view is complete (the Open Session in View pattern). Of course, this places heavy demands on the correctness of the exception handling of your application infrastructure. It is vitally important that the Session is closed and the transaction ended before returning to the user, even when an exception occurs during rendering of the view. See the Hibernate Wiki for examples of this "Open Session in View" pattern.
In an application with a separate business tier, the business logic must "prepare" all collections that will be needed by the web tier before returning. This means that the business tier should load all the data required for a particular use case and return it, already initialized, to the presentation/web tier. Usually, the application calls Hibernate.initialize() for each collection that will be needed in the web tier (this call must occur before the session is closed) or retrieves the collection eagerly using a Hibernate query with a FETCH clause or a FetchMode.JOIN in Criteria. This is usually easier if you adopt the Command pattern instead of a Session Facade.
You may also attach a previously loaded object to a new Session with merge() or lock() before accessing uninitialized collections (or other proxies). No, Hibernate does not, and certainly should not, do this automatically, since it would introduce ad hoc transaction semantics!
Sometimes you don't want to initialize a large collection, but still need some information about it (like its size) or a subset of the data.
You can use a collection filter to get the size of a collection without initializing it:
( (Integer) s.createFilter( collection, "select count(*)" ).list().get(0) ).intValue()
The createFilter() method is also used to efficiently retrieve subsets of a collection without needing to initialize the whole collection:
s.createFilter( lazyCollection, "").setFirstResult(0).setMaxResults(10).list();
Hibernate can make efficient use of batch fetching, that is, Hibernate can load several uninitialized proxies (or collections) if one proxy is accessed. Batch fetching is an optimization of the lazy select fetching strategy. There are two ways you can tune batch fetching: on the class and the collection level.
Batch fetching for classes/entities is easier to understand. Imagine you have the following situation at runtime: you have 25 Cat instances loaded in a Session, and each Cat has a reference to its owner, a Person. The Person class is mapped with a proxy, lazy="true". If you now iterate through all cats and call getOwner() on each, Hibernate will by default execute 25 SELECT statements to retrieve the proxied owners. You can tune this behavior by specifying a batch-size in the mapping of Person:
<class name="Person" batch-size="10">...</class>
Hibernate will now execute only three queries, in the pattern 10, 10, 5.
You may also enable batch fetching of collections. For example, if each Person has a lazy collection of Cats, and 10 persons are currently loaded in the Session, iterating through all persons will generate 10 SELECTs, one for every call to getCats(). If you enable batch fetching for the cats collection in the mapping of Person, Hibernate can pre-fetch collections:
    <class name="Person">
        <set name="cats" batch-size="3">
            ...
        </set>
    </class>
With a batch-size of 3, Hibernate will load 3, 3, 3, 1 collections in four SELECTs. Again, the value of the attribute depends on the expected number of uninitialized collections in a particular Session.
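The two patterns quoted in this section (10, 10, 5 for 25 proxies with a batch-size of 10, and 3, 3, 3, 1 for 10 collections with a batch-size of 3) follow from simple chunking. The sketch below reproduces only that arithmetic. BatchFetchSketch and selectSizes are hypothetical names, and this is a simplified model rather than Hibernate's actual loader logic.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchFetchSketch {
    // Splits `pending` uninitialized proxies or collections into SELECTs
    // that each load at most `batchSize` of them.
    static List<Integer> selectSizes(int pending, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = pending; remaining > 0; remaining -= batchSize) {
            sizes.add(Math.min(batchSize, remaining));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(selectSizes(25, 10)); // [10, 10, 5]
        System.out.println(selectSizes(10, 3));  // [3, 3, 3, 1]
    }
}
```

In this model a larger batch-size means fewer SELECTs, each with a longer list of keys, which is why the section ties the value to the expected number of uninitialized proxies or collections in a Session.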
Batch fetching of collections is particularly useful if you have a nested tree of items, i.e. the typical bill-of-materials pattern. (Although a nested set or a materialized path might be a better option for read-mostly trees.)
With subselect fetching, if one lazy collection or single-valued proxy has to be fetched, Hibernate loads all of them, re-running the original query in a subselect. This works in the same way as batch fetching, without the piecemeal loading.
Hibernate3 supports the lazy fetching of individual properties. This optimization technique is also known as fetch groups. Please note that this is mostly a marketing feature, as in practice, optimizing row reads is much more important than optimizing column reads. However, loading only some properties of a class might be useful in extreme cases, when legacy tables have hundreds of columns and the data model cannot be improved.
To enable lazy property loading, set the lazy attribute on your particular property mappings:
    <class name="Document">
        <id name="id">
            <generator class="native"/>
        </id>
        <property name="name" not-null="true" length="50"/>
        <property name="summary" not-null="true" length="200" lazy="true"/>
        <property name="text" not-null="true" length="2000" lazy="true"/>
    </class>
Lazy property loading requires build-time bytecode instrumentation! If your persistent classes are not enhanced, Hibernate will silently ignore lazy property settings and fall back to immediate fetching.
For bytecode instrumentation, use the following Ant task:
    <target name="instrument" depends="compile">
        <taskdef name="instrument" classname="org.hibernate.tool.instrument.InstrumentTask">
            <classpath path="${jar.path}"/>
            <classpath path="${classes.dir}"/>
            <classpath refid="lib.class.path"/>
        </taskdef>
        <instrument verbose="true">
            <fileset dir="${testclasses.dir}/org/hibernate/auction/model">
                <include name="*.class"/>
            </fileset>
        </instrument>
    </target>
A different (better?) way to avoid unnecessary column reads, at least for read-only transactions, is to use the projection features of HQL or Criteria queries. This avoids the need for build-time bytecode processing and is certainly a preferred solution.
You may force the usual eager fetching of properties using fetch all properties in HQL.