Fetching

Fetching, essentially, is the process of grabbing data from the database and making it available to the application. Tuning how an application does fetching is one of the biggest factors in determining how an application will perform. Fetching too much data, in terms of width (values/columns) and/or depth (results/rows), adds unnecessary overhead in terms of both JDBC communication and ResultSet processing. Fetching too little data might cause additional fetching to be needed. Tuning how an application fetches data presents a great opportunity to influence the application overall performance.

The basics

The concept of fetching breaks down into two different questions.

  • When should the data be fetched? Now? Later?

  • How should the data be fetched?

"now" is generally termed eager or immediate. "later" is generally termed lazy or delayed.

There are a number of scopes for defining fetching:

static

Static definition of fetching strategies is done in the mappings. The statically-defined fetch strategies is used in the absence of any dynamically defined strategies

SELECT

Performs a separate SQL select to load the data. This can either be EAGER (the second select is issued immediately) or LAZY (the second select is delayed until the data is needed). This is the strategy generally termed N+1.

JOIN

Inherently an EAGER style of fetching. The data to be fetched is obtained through the use of an SQL outer join.

BATCH

Performs a separate SQL select to load a number of related data items using an IN-restriction as part of the SQL WHERE-clause based on a batch size. Again, this can either be EAGER (the second select is issued immediately) or LAZY (the second select is delayed until the data is needed).

SUBSELECT

Performs a separate SQL select to load associated data based on the SQL restriction used to load the owner. Again, this can either be EAGER (the second select is issued immediately) or LAZY (the second select is delayed until the data is needed).

dynamic (sometimes referred to as runtime)

Dynamic definition is really use-case centric. There are multiple ways to define dynamic fetching:

fetch profiles

defined in mappings, but can be enabled/disabled on the Session.

HQL/JPQL

and both Hibernate and JPA Criteria queries have the ability to specify fetching, specific to said query.

entity graphs

Starting in Hibernate 4.2 (JPA 2.1) this is also an option.

Direct fetching vs entity queries

To see the difference between direct fetching and entity queries in regard to eagerly fetched associations, consider the following entities:

Example 1. Domain model
@Entity(name = "Department")
public static class Department {

    @Id
    private Long id;

    //Getters and setters omitted for brevity
}

@Entity(name = "Employee")
public static class Employee {

    @Id
    private Long id;

    @NaturalId
    private String username;

    @ManyToOne(fetch = FetchType.EAGER)
    private Department department;

    //Getters and setters omitted for brevity
}

The Employee entity has a @ManyToOne association to a Department which is fetched eagerly.

When issuing a direct entity fetch, Hibernate executed the following SQL query:

Example 2. Direct fetching example
Employee employee = entityManager.find( Employee.class, 1L );
select
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.username as username2_1_0_,
    d.id as id1_0_1_
from
    Employee e
left outer join
    Department d
        on e.department_id=d.id
where
    e.id = 1

The LEFT JOIN clause is added to the generated SQL query because this association is required to be fetched eagerly.

On the other hand, if you are using an entity query that does not contain a JOIN FETCH directive to the Department association:

Example 3. Entity query fetching example
Employee employee = entityManager.createQuery(
        "select e " +
        "from Employee e " +
        "where e.id = :id", Employee.class)
.setParameter( "id", 1L )
.getSingleResult();
select
    e.id as id1_1_,
    e.department_id as departme3_1_,
    e.username as username2_1_
from
    Employee e
where
    e.id = 1

select
    d.id as id1_0_0_
from
    Department d
where
    d.id = 1

Hibernate uses a secondary select instead. This is because the entity query fetch policy cannot be overridden, so Hibernate requires a secondary select to ensure that the EAGER association is fetched prior to returning the result to the user.

If you forget to JOIN FETCH all EAGER associations, Hibernate is going to issue a secondary select for each and every one of those which, in turn, can lean to N+1 query issues.

For this reason, you should prefer LAZY associations.

Applying fetch strategies

Let’s consider these topics as it relates to a simple domain model and a few use cases.

Example 4. Sample domain model
@Entity(name = "Department")
public static class Department {

    @Id
    private Long id;

    @OneToMany(mappedBy = "department")
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity
}

@Entity(name = "Employee")
public static class Employee {

    @Id
    private Long id;

    @NaturalId
    private String username;

    @Column(name = "pswd")
    @ColumnTransformer(
        read = "decrypt( 'AES', '00', pswd  )",
        write = "encrypt('AES', '00', ?)"
    )
    private String password;

    private int accessLevel;

    @ManyToOne(fetch = FetchType.LAZY)
    private Department department;

    @ManyToMany(mappedBy = "employees")
    private List<Project> projects = new ArrayList<>();

    //Getters and setters omitted for brevity
}

@Entity(name = "Project")
public class Project {

    @Id
    private Long id;

    @ManyToMany
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity
}

The Hibernate recommendation is to statically mark all associations lazy and to use dynamic fetching strategies for eagerness. This is unfortunately at odds with the JPA specification which defines that all one-to-one and many-to-one associations should be eagerly fetched by default. Hibernate, as a JPA provider, honors that default.

No fetching

For the first use case, consider the application login process for an Employee. Let’s assume that login only requires access to the Employee information, not Project nor Department information.

Example 5. No fetching example
Employee employee = entityManager.createQuery(
    "select e " +
    "from Employee e " +
    "where " +
    "    e.username = :username and " +
    "    e.password = :password",
    Employee.class)
.setParameter( "username", username)
.setParameter( "password", password)
.getSingleResult();

In this example, the application gets the Employee data. However, because all associations from Employee are declared as LAZY (JPA defines the default for collections as LAZY) no other data is fetched.

If the login process does not need access to the Employee information specifically, another fetching optimization here would be to limit the width of the query results.

Example 6. No fetching (scalar) example
Integer accessLevel = entityManager.createQuery(
    "select e.accessLevel " +
    "from Employee e " +
    "where " +
    "    e.username = :username and " +
    "    e.password = :password",
    Integer.class)
.setParameter( "username", username)
.setParameter( "password", password)
.getSingleResult();

Dynamic fetching via queries

For the second use case, consider a screen displaying the Projects for an Employee. Certainly access to the Employee is needed, as is the collection of Projects for that Employee. Information about Departments, other Employees or other Projects is not needed.

Example 7. Dynamic JPQL fetching example
Employee employee = entityManager.createQuery(
    "select e " +
    "from Employee e " +
    "left join fetch e.projects " +
    "where " +
    "    e.username = :username and " +
    "    e.password = :password",
    Employee.class)
.setParameter( "username", username)
.setParameter( "password", password)
.getSingleResult();
Example 8. Dynamic query fetching example
CriteriaBuilder builder = entityManager.getCriteriaBuilder();
CriteriaQuery<Employee> query = builder.createQuery( Employee.class );
Root<Employee> root = query.from( Employee.class );
root.fetch( "projects", JoinType.LEFT);
query.select(root).where(
    builder.and(
        builder.equal(root.get("username"), username),
        builder.equal(root.get("password"), password)
    )
);
Employee employee = entityManager.createQuery( query ).getSingleResult();

In this example we have an Employee and their Projects loaded in a single query shown both as an HQL query and a JPA Criteria query. In both cases, this resolves to exactly one database query to get all that information.

Dynamic fetching via JPA entity graph

JPA 2.1 introduced entity graphs so the application developer has more control over fetch plans.

Example 9. Fetch graph example
@Entity(name = "Employee")
@NamedEntityGraph(name = "employee.projects",
    attributeNodes = @NamedAttributeNode("projects")
)
Employee employee = entityManager.find(
    Employee.class,
    userId,
    Collections.singletonMap(
        "javax.persistence.fetchgraph",
        entityManager.getEntityGraph( "employee.projects" )
    )
);

Although the JPA standard specifies that you can override an EAGER fetching association at runtime using the javax.persistence.fetchgraph hint, currently, Hibernate does not implement this feature, so EAGER associations cannot be fetched lazily. For more info, check out the HHH-8776 Jira issue.

When executing a JPQL query, if an EAGER association is omitted, Hibernate will issue a secondary select for every association needed to be fetched eagerly, which can lead dto N+1 query issues.

For this reason, it’s better to use LAZY associations, and only fetch them eagerly on a per-query basis.

JPA entity subgraphs

An entity graph specifies which attributes to be fetched, but it limited to a single entity only. To fetch associations from a child entity, you need to use the @NamedSubgraph annotation.

If we have a Project parent entity which has an employees child associations, and we’d like to fetch the department for the Employee child association.

Example 10. Fetch graph with a subgraph mapping
@Entity(name = "Project")
@NamedEntityGraph(name = "project.employees",
    attributeNodes = @NamedAttributeNode(
        value = "employees",
        subgraph = "project.employees.department"
    ),
    subgraphs = @NamedSubgraph(
        name = "project.employees.department",
        attributeNodes = @NamedAttributeNode( "department" )
    )
)
public static class Project {

    @Id
    private Long id;

    @ManyToMany
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity
}

When fetching this entity graph, Hibernate generates the following SQL query:

Example 11. Fetch graph with a subgraph mapping
Project project = doInJPA( this::entityManagerFactory, entityManager -> {
    return entityManager.find(
        Project.class,
        1L,
        Collections.singletonMap(
            "javax.persistence.fetchgraph",
            entityManager.getEntityGraph( "project.employees" )
        )
    );
} );
select
    p.id as id1_2_0_, e.id as id1_1_1_, d.id as id1_0_2_,
    e.accessLevel as accessLe2_1_1_,
    e.department_id as departme5_1_1_,
    decrypt( 'AES', '00', e.pswd  ) as pswd3_1_1_,
    e.username as username4_1_1_,
    p_e.projects_id as projects1_3_0__,
    p_e.employees_id as employee2_3_0__
from
    Project p
inner join
    Project_Employee p_e
        on p.id=p_e.projects_id
inner join
    Employee e
        on p_e.employees_id=e.id
inner join
    Department d
        on e.department_id=d.id
where
    p.id = ?

-- binding parameter [1] as [BIGINT] - [1]

Dynamic fetching via Hibernate profiles

Suppose we wanted to leverage loading by natural-id to obtain the Employee information in the "projects for and employee" use-case. Loading by natural-id uses the statically defined fetching strategies, but does not expose a means to define load-specific fetching. So we would leverage a fetch profile.

Example 12. Fetch profile example
@Entity(name = "Employee")
@FetchProfile(
    name = "employee.projects",
    fetchOverrides = {
        @FetchProfile.FetchOverride(
            entity = Employee.class,
            association = "projects",
            mode = FetchMode.JOIN
        )
    }
)
session.enableFetchProfile( "employee.projects" );
Employee employee = session.bySimpleNaturalId( Employee.class ).load( username );

Here the Employee is obtained by natural-id lookup and the Employee’s Project data is fetched eagerly. If the Employee data is resolved from cache, the Project data is resolved on its own. However, if the Employee data is not resolved in cache, the Employee and Project data is resolved in one SQL query via join as we saw above.

Batch fetching

Hibernate offers the @BatchSize annotation, which can be used when fetching uninitialized entity proxies.

Considering the following entity mapping:

Example 13. @BatchSize mapping example
@Entity(name = "Department")
public static class Department {

    @Id
    private Long id;

    @OneToMany(mappedBy = "department")
    //@BatchSize(size = 5)
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity

}

@Entity(name = "Employee")
public static class Employee {

    @Id
    private Long id;

    @NaturalId
    private String name;

    @ManyToOne(fetch = FetchType.LAZY)
    private Department department;

    //Getters and setters omitted for brevity
}

Considering that we have previously fetched several Department entities, and now we need to initialize the employees entity collection for each particular Department, the @BatchSize annotations allows us to load multiple Employee entities in a single database roundtrip.

Example 14. @BatchSize fetching example
List<Department> departments = entityManager.createQuery(
    "select d " +
    "from Department d " +
    "inner join d.employees e " +
    "where e.name like 'John%'", Department.class)
.getResultList();

for ( Department department : departments ) {
    log.infof(
        "Department %d has {} employees",
        department.getId(),
        department.getEmployees().size()
    );
}
SELECT
    d.id as id1_0_
FROM
    Department d
INNER JOIN
    Employee employees1_
    ON d.id=employees1_.department_id

SELECT
    e.department_id as departme3_1_1_,
    e.id as id1_1_1_,
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.name as name2_1_0_
FROM
    Employee e
WHERE
    e.department_id IN (
        0, 2, 3, 4, 5
    )

SELECT
    e.department_id as departme3_1_1_,
    e.id as id1_1_1_,
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.name as name2_1_0_
FROM
    Employee e
WHERE
    e.department_id IN (
        6, 7, 8, 9, 1
    )

As you can see in the example above, there are only two SQL statements used to fetch the Employee entities associated to multiple Department entities.

Without @BatchSize, you’d run into a N+1 query issue, so, instead of 2 SQL statements, there would be 10 queries needed for fetching the Employee child entities.

However, although @BatchSize is better than running into an N+1 query issue, most of the time, a DTO projection or a JOIN FETCH is a much better alternative since it allows you to fetch all the required data with a single query.

The @Fetch annotation mapping

Besides the FetchType.LAZY or FetchType.EAGER JPA annotations, you can also use the Hibernate-specific @Fetch annotation that accepts one of the following FetchMode(s):

SELECT

The association is going to be fetched lazily using a secondary select for each individual entity, collection, or join load. It’s equivalent to JPA FetchType.LAZY fetching strategy.

JOIN

Use an outer join to load the related entities, collections or joins when using direct fetching. It’s equivalent to JPA FetchType.EAGER fetching strategy.

SUBSELECT

Available for collections only. When accessing a non-initialized collection, this fetch mode will trigger loading all elements of all collections of the same role for all owners associated with the persistence context using a single secondary select.

FetchMode.SELECT

To demonstrate how FetchMode.SELECT works, consider the following entity mapping:

Example 15. FetchMode.SELECT mapping example
@Entity(name = "Department")
public static class Department {

    @Id
    private Long id;

    @OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
    @Fetch(FetchMode.SELECT)
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity

}

@Entity(name = "Employee")
public static class Employee {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String username;

    @ManyToOne(fetch = FetchType.LAZY)
    private Department department;

    //Getters and setters omitted for brevity

}

Considering there are multiple Department entities, each one having multiple Employee entities, when executing the following test case, Hibernate fetches every uninitialized Employee collection using a secondary SELECT statement upon accessing the child collection for the first time:

Example 16. FetchMode.SELECT mapping example
List<Department> departments = entityManager.createQuery(
    "select d from Department d", Department.class )
.getResultList();

log.infof( "Fetched %d Departments", departments.size());

for (Department department : departments ) {
    assertEquals( 3, department.getEmployees().size() );
}
SELECT
    d.id as id1_0_
FROM
    Department d

-- Fetched 2 Departments

SELECT
    e.department_id as departme3_1_0_,
    e.id as id1_1_0_,
    e.id as id1_1_1_,
    e.department_id as departme3_1_1_,
    e.username as username2_1_1_
FROM
    Employee e
WHERE
    e.department_id = 1

SELECT
    e.department_id as departme3_1_0_,
    e.id as id1_1_0_,
    e.id as id1_1_1_,
    e.department_id as departme3_1_1_,
    e.username as username2_1_1_
FROM
    Employee e
WHERE
    e.department_id = 2

The more Department entities are fetched by the first query, the more secondary SELECT statements are executed to initialize the employees collections. Therefore, FetchMode.SELECT can lead to N+1 query issues.

FetchMode.SUBSELECT

To demonstrate how FetchMode.SUBSELECT works, we are going to modify the FetchMode.SELECT mapping example to use FetchMode.SUBSELECT:

Example 17. FetchMode.SUBSELECT mapping example
@OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
@Fetch(FetchMode.SUBSELECT)
private List<Employee> employees = new ArrayList<>();

Now, we are going to fetch all Department entities that match a given filtering criteria and then navigate their employees collections.

Hibernate is going to avoid the N+1 query issue by generating a single SQL statement to initialize all employees collections for all Department entities that were previously fetched. Instead of using passing all entity identifiers, Hibernate simply reruns the previous query that fetched the Department entities.

Example 18. FetchMode.SUBSELECT mapping example
List<Department> departments = entityManager.createQuery(
    "select d " +
    "from Department d " +
    "where d.name like :token", Department.class )
.setParameter( "token", "Department%" )
.getResultList();

log.infof( "Fetched %d Departments", departments.size());

for (Department department : departments ) {
    assertEquals( 3, department.getEmployees().size() );
}
SELECT
    d.id as id1_0_
FROM
    Department d
where
    d.name like 'Department%'

-- Fetched 2 Departments

SELECT
    e.department_id as departme3_1_1_,
    e.id as id1_1_1_,
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.username as username2_1_0_
FROM
    Employee e
WHERE
    e.department_id in (
        SELECT
            fetchmodes0_.id
        FROM
            Department fetchmodes0_
        WHERE
            d.name like 'Department%'
    )

FetchMode.JOIN

To demonstrate how FetchMode.JOIN works, we are going to modify the FetchMode.SELECT mapping example to use FetchMode.JOIN instead:

Example 19. FetchMode.JOIN mapping example
@OneToMany(mappedBy = "department")
@Fetch(FetchMode.JOIN)
private List<Employee> employees = new ArrayList<>();

Now, we are going to fetch one Department and navigate its employees collections.

The reason why we are not using a JPQL query to fetch multiple Department entities is because the FetchMode.JOIN strategy would be overridden by the query fetching directive.

To fetch multiple relationships with a JPQL query, the JOIN FETCH directive must be used instead.

Therefore, FetchMode.JOIN is useful for when entities are fetched directly, via their identifier or natural-id.

Also, the FetchMode.JOIN acts as a FetchType.EAGER strategy. Even if we mark the association as FetchType.LAZY, the FetchMode.JOIN will load the association eagerly.

Hibernate is going to avoid the secondary query by issuing an OUTER JOIN for the employees collection.

Example 20. FetchMode.JOIN mapping example
Department department = entityManager.find( Department.class, 1L );

log.infof( "Fetched department: %s", department.getId());

assertEquals( 3, department.getEmployees().size() );
SELECT
    d.id as id1_0_0_,
    e.department_id as departme3_1_1_,
    e.id as id1_1_1_,
    e.id as id1_1_2_,
    e.department_id as departme3_1_2_,
    e.username as username2_1_2_
FROM
    Department d
LEFT OUTER JOIN
    Employee e
        on d.id = e.department_id
WHERE
    d.id = 1

-- Fetched department: 1

This time, there was no secondary query because the child collection was loaded along with the parent entity.

@LazyCollection

The @LazyCollection annotation is used to specify the lazy fetching behavior of a given collection. The possible values are given by the LazyCollectionOption enumeration:

TRUE

Load it when the state is requested.

FALSE

Eagerly load it.

EXTRA

Prefer extra queries over full collection loading.

The TRUE and FALSE values are deprecated since you should be using the JPA FetchType attribute of the [annotations-jpa-elementcollection], [annotations-jpa-onetomany], or [annotations-jpa-manytomany] collection.

The EXTRA value has no equivalent in the JPA specification, and it’s used to avoid loading the entire collection even when the collection is accessed for the first time. Each element is fetched individually using a secondary query.

Example 21. LazyCollectionOption.EXTRA mapping example
@Entity(name = "Department")
public static class Department {

    @Id
    private Long id;

    @OneToMany(mappedBy = "department", cascade = CascadeType.ALL)
    @OrderColumn(name = "order_id")
    @LazyCollection( LazyCollectionOption.EXTRA )
    private List<Employee> employees = new ArrayList<>();

    //Getters and setters omitted for brevity

}

@Entity(name = "Employee")
public static class Employee {

    @Id
    private Long id;

    @NaturalId
    private String username;

    @ManyToOne(fetch = FetchType.LAZY)
    private Department department;

    //Getters and setters omitted for brevity

}

LazyCollectionOption.EXTRA only works for ordered collections, either List(s) that are annotated with @OrderColumn or Map(s).

For bags (e.g. regular List(s) of entities that do not preserve any certain ordering), the @LazyCollection(LazyCollectionOption.EXTRA)` behaves like any other FetchType.LAZY collection (the collection is fetched entirely upon being accessed for the first time).

Now, considering we have the following entities:

Example 22. LazyCollectionOption.EXTRA Domain Model example
Department department = new Department();
department.setId( 1L );
entityManager.persist( department );

for (long i = 1; i <= 3; i++ ) {
    Employee employee = new Employee();
    employee.setId( i );
    employee.setUsername( String.format( "user_%d", i ) );
    department.addEmployee(employee);
}

When fetching the employee collection entries by their position in the List, Hibernate generates the following SQL statements:

Example 23. LazyCollectionOption.EXTRA fetching example
Department department = entityManager.find(Department.class, 1L);

int employeeCount = department.getEmployees().size();

for(int i = 0; i < employeeCount; i++ ) {
    log.infof( "Fetched employee: %s", department.getEmployees().get( i ).getUsername());
}
SELECT
    max(order_id) + 1
FROM
    Employee
WHERE
    department_id = ?

-- binding parameter [1] as [BIGINT] - [1]

SELECT
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.username as username2_1_0_
FROM
    Employee e
WHERE
    e.department_id=?
    AND e.order_id=?

-- binding parameter [1] as [BIGINT]  - [1]
-- binding parameter [2] as [INTEGER] - [0]

SELECT
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.username as username2_1_0_
FROM
    Employee e
WHERE
    e.department_id=?
    AND e.order_id=?

-- binding parameter [1] as [BIGINT]  - [1]
-- binding parameter [2] as [INTEGER] - [1]

SELECT
    e.id as id1_1_0_,
    e.department_id as departme3_1_0_,
    e.username as username2_1_0_
FROM
    Employee e
WHERE
    e.department_id=?
    AND e.order_id=?

-- binding parameter [1] as [BIGINT]  - [1]
-- binding parameter [2] as [INTEGER] - [2]

Therefore, the child entities were fetched one after the other without triggering a full collection initialization.

For this reason, caution is advised because LazyCollectionOption.EXTRA lazy collections are prone to N+1 query issues.