Hibernate.org Community Documentation

Hibernate OGM Reference Guide


Preface
1. Goals
2. What we have today
3. Use cases
1. How to get help and contribute on Hibernate OGM
1.1. How to get help
1.2. How to contribute
1.2.1. How to build Hibernate OGM
1.2.2. How to contribute code effectively
2. Getting started with Hibernate OGM
3. Architecture
3.1. General architecture
3.2. How is data persisted
3.3. How is data queried
4. Configure and start Hibernate OGM
4.1. Bootstrapping Hibernate OGM
4.1.1. Using JPA
4.1.2. Using Hibernate ORM native APIs
4.2. Environments
4.2.1. In a Java EE container
4.2.2. In a standalone JTA environment
4.2.3. Without JTA
4.3. Configuration options
4.4. Configuring Hibernate Search
4.5. How to package Hibernate OGM applications for WildFly 8 and JBoss EAP 6
4.5.1. Packaging Hibernate OGM applications for WildFly 8
4.5.2. Packaging Hibernate OGM applications for JBoss EAP 6
5. Datastores
5.1. Infinispan
5.1.1. Configure Infinispan
5.1.2. Manage data size
5.1.3. Clustering: deploy multiple Infinispan nodes
5.1.4. Transactions
5.1.5. Storing a Lucene index in Infinispan
5.2. Ehcache
5.2.1. Configure Ehcache
5.2.2. Transactions
5.3. MongoDB
5.3.1. Configuring MongoDB
5.3.2. Storage principles
5.3.3. Transactions
5.3.4. Queries
5.4. Neo4j
5.4.1. How to add Neo4j integration
5.4.2. Configuring Neo4j
5.4.3. Storage principles
5.4.4. Transactions
5.4.5. Queries
5.5. CouchDB
5.5.1. Configuring CouchDB
5.5.2. Storage principles
5.5.3. Transactions
5.5.4. Queries
6. Map your entities
6.1. Supported entity mapping
6.2. Supported Types
6.2.1. Types mapped as native Java Types
6.2.2. Types mapped as Strings
6.3. Supported association mapping
7. Query your entities
7.1. Using JP-QL
7.2. Using Hibernate Search
7.3. Using the Criteria API

Hibernate Object/Grid Mapper (OGM) is a persistence engine providing Java Persistence (JPA) support for NoSQL datastores. It reuses Hibernate ORM’s object life cycle management and (de)hydration engine but persists entities into a NoSQL store (key/value, document, column-oriented, etc) instead of a relational database. It reuses the Java Persistence Query Language (JP-QL) as an interface to querying stored data.

The project is still very young and very ambitious at the same time. Many things are on the roadmap (more NoSQL, query, denormalization engine, etc). If you wish to help, please check Chapter 1, How to get help and contribute on Hibernate OGM.

Hibernate OGM is released under the LGPL open source license.

Warning

This documentation and this project are work in progress. Please give us feedback on

  • what you like
  • what you don’t like
  • what is confusing

Check Section 1.2, “How to contribute” on how to contact us.

Hibernate OGM:

NoSQL can be very disconcerting as it is composed of many disparate solutions with different benefits and drawbacks. Speaking only of the main ones, NoSQL is at least categorized in four families:


Each has different benefits and drawbacks, and one solution might fit a use case better than another. However, access patterns and APIs differ from one product to the other.

Hibernate OGM is not expected to be the Rosetta stone used to interact with all NoSQL solutions in all use cases. But for people modeling their data as a domain model, it provides distinctive advantages over raw APIs and has the benefit of providing an API and semantics known to Java developers. Reusing the same programmatic model and trying different (No)SQL engines will hopefully help people to explore alternative datastores.

Hibernate OGM also aims at helping people scale traditional relational databases by providing a NoSQL front-end and keeping the same JPA APIs and domain model.

Hibernate OGM is a young project. The code, the direction and the documentation are all in flux and being built by the community. Join and help us shape it!

First of all, make sure to read this reference documentation. This is the most comprehensive formal source of information. Of course, it is not perfect: feel free to come and ask for help, comment or propose improvements in our Hibernate OGM forum.

You can also:

  • open bug reports in JIRA
  • propose improvements on the development mailing list
  • join us on IRC to discuss developments and improvements (#hibernate-dev on freenode.net; you need to be registered on freenode: the room does not accept "anonymous" users).

Welcome!

There are many ways to contribute:

  • report bugs in JIRA
  • give feedback in the forum, IRC or the development mailing list
  • improve the documentation
  • fix bugs or contribute new features
  • propose and code a datastore dialect for your favorite NoSQL engine

Hibernate OGM’s code is available on GitHub at https://github.com/hibernate/hibernate-ogm.

If you are familiar with JPA, you are almost good to go :-) We will nevertheless walk you through the first few steps of persisting and retrieving an entity using Hibernate OGM.

Before we can start, make sure you have the following tools configured:

Hibernate OGM is published in the JBoss hosted Maven repository. Adjust your ~/.m2/settings.xml file according to the guidelines found on this webpage. In this example we will use Infinispan as the targeted datastore.

Add org.hibernate.ogm:hibernate-ogm-bom:4.1.0.Beta7 to your dependency management block and org.hibernate.ogm:hibernate-ogm-infinispan:4.1.0.Beta7 to your project dependencies:


<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.hibernate.ogm</groupId>
            <artifactId>hibernate-ogm-bom</artifactId>
            <version>4.1.0.Beta7</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.hibernate.ogm</groupId>
        <artifactId>hibernate-ogm-infinispan</artifactId>
    </dependency>
</dependencies>

The former is a so-called "bill of materials" POM which specifies a matching set of versions for Hibernate OGM and its dependencies. That way you never need to specify a version explicitly within your dependencies block; the versions are taken from the BOM automatically.

Note

If you’re deploying your application onto JBoss WildFly or JBoss EAP, you don’t need to add the Hibernate OGM modules to your deployment unit but you can rather add them as modules to the application server itself. Refer to Section 4.5, “How to package Hibernate OGM applications for WildFly 8 and JBoss EAP 6” to learn more.

We will use the JPA APIs in this tutorial. While Hibernate OGM depends on JPA 2.1, it is marked as provided in the Maven POM file. If you run outside a Java EE container, make sure to explicitly add the dependency:


<dependency>
    <groupId>org.hibernate.javax.persistence</groupId>
    <artifactId>hibernate-jpa-2.1-api</artifactId>
</dependency>

Let’s now map our first Hibernate OGM entity.

@Entity
public class Dog {
   @Id @GeneratedValue(strategy = GenerationType.TABLE, generator = "dog")
   @TableGenerator(
      name = "dog",
      table = "sequences",
      pkColumnName = "key",
      pkColumnValue = "dog",
      valueColumnName = "seed"
   )
   public Long getId() { return id; }
   public void setId(Long id) { this.id = id; }
   private Long id;
   public String getName() { return name; }
   public void setName(String name) { this.name = name; }
   private String name;
   @ManyToOne
   public Breed getBreed() { return breed; }
   public void setBreed(Breed breed) { this.breed = breed; }
   private Breed breed;
}

@Entity
public class Breed {
   @Id @GeneratedValue(generator = "uuid")
   @GenericGenerator(name="uuid", strategy="uuid2")
   public String getId() { return id; }
   public void setId(String id) { this.id = id; }
   private String id;
   public String getName() { return name; }
   public void setName(String name) { this.name = name; }
   private String name;
}

I lied to you, we have already mapped two entities! If you are familiar with JPA, you can see that there is nothing specific to Hibernate OGM in our mapping.

In this tutorial, we will use JBoss Transactions for our JTA transaction manager. So let’s add the JTA API and JBoss Transactions to our POM as well. The final list of dependencies should look like this:


<dependencies>
    <!-- Hibernate OGM Infinispan module; pulls in the OGM core module -->
    <dependency>
        <groupId>org.hibernate.ogm</groupId>
        <artifactId>hibernate-ogm-infinispan</artifactId>
    </dependency>

    <!-- standard APIs dependencies - provided in a Java EE container -->
    <dependency>
        <groupId>org.hibernate.javax.persistence</groupId>
        <artifactId>hibernate-jpa-2.1-api</artifactId>
    </dependency>
    <dependency>
        <groupId>org.jboss.spec.javax.transaction</groupId>
        <artifactId>jboss-transaction-api_1.2_spec</artifactId>
    </dependency>

    <!-- JBoss Transactions dependency - this (or another implementation) is
         provided in a Java EE container -->
    <dependency>
        <groupId>org.jboss.jbossts</groupId>
        <artifactId>jbossjta</artifactId>
    </dependency>
</dependencies>

Next we need to define the persistence unit. Create a META-INF/persistence.xml file.


<?xml version="1.0"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd"
             version="2.0">

    <persistence-unit name="ogm-jpa-tutorial" transaction-type="JTA">
        <!-- Use Hibernate OGM provider: configuration will be transparent -->
        <provider>org.hibernate.ogm.jpa.HibernateOgmPersistence</provider>
        <properties>
            <!-- property is optional if you want to use Infinispan, otherwise adjust to your favorite
                NoSQL Datastore provider.
            <property name="hibernate.ogm.datastore.provider" value="infinispan"/>
            -->
            <!-- defines which JTA Transaction we plan to use -->
            <property name="hibernate.transaction.jta.platform"
                      value="org.hibernate.service.jta.platform.internal.JBossStandAloneJtaPlatform"/>
        </properties>
    </persistence-unit>
</persistence>

Let’s now persist a set of entities and retrieve them.

//accessing JBoss's Transaction can be done differently but this one works nicely
TransactionManager tm = getTransactionManager();
//build the EntityManagerFactory as you would build it in Hibernate ORM
EntityManagerFactory emf = Persistence.createEntityManagerFactory(
    "ogm-jpa-tutorial");
final Logger logger = Logger.getLogger(DogBreedRunner.class); //org.jboss.logging.Logger, which provides infof()
[..]
//Persist entities the way you are used to in plain JPA
tm.begin();
logger.infof("About to store dog and breed");
EntityManager em = emf.createEntityManager();
Breed collie = new Breed();
collie.setName("Collie");
em.persist(collie);
Dog dina = new Dog();
dina.setName("Dina");
dina.setBreed(collie);
em.persist(dina);
Long dinaId = dina.getId();
em.flush();
em.close();
tm.commit();
[..]
//Retrieve your entities the way you are used to in plain JPA
tm.begin();
logger.infof("About to retrieve dog and breed");
em = emf.createEntityManager();
dina = em.find(Dog.class, dinaId);
logger.infof("Found dog %s of breed %s", dina.getName(), dina.getBreed().getName());
em.flush();
em.close();
tm.commit();
[..]
emf.close();
[..]
//helper declared at class level: looks up the JBoss Transactions manager via reflection
private static final String JBOSS_TM_CLASS_NAME = "com.arjuna.ats.jta.TransactionManager";
public static TransactionManager getTransactionManager() throws Exception {
    Class<?> tmClass = DogBreedRunner.class.getClassLoader().loadClass(JBOSS_TM_CLASS_NAME);
    return (TransactionManager) tmClass.getMethod("transactionManager").invoke(null);
}

Note

Some JVMs do not handle mixed IPv4/IPv6 stacks properly (older Mac OS X JDKs in particular). If you experience trouble starting the Infinispan cluster, pass the property -Djava.net.preferIPv4Stack=true to your JVM or upgrade to a recent JDK version. jdk7u6 (b22) is known to work on Mac OS X.

Note

There are some additional constraints related to transactions when working with Neo4j. You will find more details in the Neo4j transactions section: Section 5.4.4, “Transactions”

A working example can be found in Hibernate OGM’s distribution under hibernate-ogm-documentation/examples/gettingstarted.

What have we seen?

  • Hibernate OGM is a JPA implementation and is used as such both for mapping and in API usage
  • It is configured as a specific JPA provider: org.hibernate.ogm.jpa.HibernateOgmPersistence

Let’s explore more in the next chapters.

Note

Hibernate OGM defines an abstraction layer represented by DatastoreProvider and GridDialect to separate the OGM engine from the datastores interaction. It has successfully abstracted various key/value stores and MongoDB. We are working on testing it on other NoSQL families.

In this chapter we will explore:

  • the general architecture
  • how the data is persisted in the NoSQL datastore
  • how we support JP-QL queries

Let’s start with the general architecture.

Hibernate OGM is really made possible by the reuse of a few key components:


Hibernate OGM reuses as much as possible from the Hibernate ORM infrastructure. There is no need to rewrite an entirely new JPA engine. The Persisters and the Loaders (two interfaces used by Hibernate ORM) have been rewritten to persist data in the NoSQL store. These implementations are the core of Hibernate OGM. We will see in Section 3.2, “How is data persisted” how the data is structured.

The particularities between NoSQL stores are abstracted by the notion of a DatastoreProvider and a GridDialect.

  • DatastoreProvider abstracts how to start and maintain a connection between Hibernate OGM and the datastore.
  • GridDialect abstracts how the data itself, including associations, is persisted.

Think of them as the JDBC layer for our NoSQL stores.
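As a rough illustration of this split, here is a minimal sketch using only the JDK. The interfaces SketchDatastoreProvider and SketchGridDialect and the map-backed InMemorySketchStore are hypothetical names invented for this example. They are not the actual Hibernate OGM contracts, whose methods differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the connection lifecycle contract (not the real OGM interface).
interface SketchDatastoreProvider {
    void start();
    void stop();
    boolean isStarted();
}

// Hypothetical stand-in for the persistence contract (not the real OGM interface).
interface SketchGridDialect {
    void putTuple(String key, Map<String, Object> tuple);
    Map<String, Object> getTuple(String key);
    void removeTuple(String key);
}

// A toy in-memory "datastore" playing both roles, similar in spirit to a Map based provider.
final class InMemorySketchStore implements SketchDatastoreProvider, SketchGridDialect {
    private final Map<String, Map<String, Object>> data = new HashMap<String, Map<String, Object>>();
    private boolean started;

    public void start() { started = true; }
    public void stop() { started = false; data.clear(); }
    public boolean isStarted() { return started; }

    public void putTuple(String key, Map<String, Object> tuple) {
        requireStarted();
        data.put(key, new HashMap<String, Object>(tuple)); // defensive copy
    }

    public Map<String, Object> getTuple(String key) {
        requireStarted();
        return data.get(key);
    }

    public void removeTuple(String key) {
        requireStarted();
        data.remove(key);
    }

    private void requireStarted() {
        if (!started) {
            throw new IllegalStateException("datastore not started");
        }
    }
}
```

The engine code only ever talks to the two interfaces, which is what would allow a different store to be swapped in without touching the CRUD logic.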

Other than these, all the Create/Read/Update/Delete (CRUD) operations are implemented by the Hibernate ORM engine (object hydration and dehydration, cascading, lifecycle etc).

As of today, we have implemented the following datastore providers:

  • a Map based datastore provider (for testing)
  • an Infinispan based datastore provider to persist your entities in Infinispan
  • an Ehcache based datastore provider to persist your entities in Ehcache
  • a MongoDB based datastore provider to persist data in a MongoDB database
  • a Neo4j based datastore provider to persist data in the Neo4j graph database
  • a CouchDB based datastore provider to persist data in the CouchDB document store

To implement JP-QL queries, Hibernate OGM parses the JP-QL string and calls the appropriate translator functions to build a native query. If the query is too complex for the native capabilities of the NoSQL store, the Teiid query engine is used as an intermediary engine to implement the missing features (typically joins between entities, aggregation). Finally, if the underlying engine does not have any query support, we use Hibernate Search as an external query engine.

Reality is a bit more nuanced. We will discuss the subject of querying in more detail in Section 3.3, “How is data queried”.

Hibernate OGM works best in a JTA environment. The easiest solution is to deploy it on a Java EE container. Alternatively, you can use a standalone JTA TransactionManager. We explain how in Section 4.2.2, “In a standalone JTA environment”.
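The decision order described above can be sketched in a few lines. This is a deliberately simplified illustration: the class QueryStrategySketch, the Strategy enum and the two boolean inputs are hypothetical and do not correspond to actual Hibernate OGM types.

```java
// Hypothetical names throughout: this only mirrors the decision order described in the text.
final class QueryStrategySketch {
    enum Strategy { NATIVE_TRANSLATOR, TEIID_INTERMEDIARY, HIBERNATE_SEARCH }

    // storeHasQuerySupport: the datastore offers some native query capability
    // tooComplexForStore: the JP-QL query uses features (joins, aggregation) the store cannot run
    static Strategy choose(boolean storeHasQuerySupport, boolean tooComplexForStore) {
        if (!storeHasQuerySupport) {
            return Strategy.HIBERNATE_SEARCH;   // no query support at all: external query engine
        }
        if (tooComplexForStore) {
            return Strategy.TEIID_INTERMEDIARY; // the intermediary engine fills in missing features
        }
        return Strategy.NATIVE_TRANSLATOR;      // simple query: translate to a native query
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false));
        System.out.println(choose(true, true));
        System.out.println(choose(false, false));
    }
}
```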

Let’s now see how and in which structure data is persisted in the NoSQL data store.

Hibernate OGM tries to reuse as much as possible the relational model concepts, at least when they are practical and make sense in OGM’s case. For very good reasons, the relational model brought peace in the database landscape over 30 years ago. In particular, Hibernate OGM inherits the following traits:

If the application data model is too tightly coupled with your persistent data model, a few issues arise including:

Entities are stored as tuples of values by Hibernate OGM. More specifically, each entity is conceptually represented by a Map<String,Object> where the key represents the column name (often the property name but not always) and the value represents the column value as a basic type. We favor basic types over complex ones to increase portability (across platforms and across type / class schema evolution over time). For example a URL object is stored as its String representation.
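To make the tuple idea concrete, here is a minimal sketch using only the JDK. The class EntityTupleSketch, its helper methods and the column names are assumptions made for illustration. Hibernate OGM builds its tuples internally, and not through code like this.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

final class EntityTupleSketch {
    // Builds the conceptual tuple for a hypothetical entity with id, name and homepage columns.
    static Map<String, Object> toTuple(Long id, String name, URL homepage) {
        Map<String, Object> tuple = new LinkedHashMap<String, Object>();
        tuple.put("id", id);
        tuple.put("name", name);
        tuple.put("homepage", homepage.toString()); // complex type stored as its String form
        return tuple;
    }

    // Small helper turning the checked exception into an unchecked one.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints {id=1, name=Dina, homepage=http://example.com/dina}
        System.out.println(toTuple(1L, "Dina", url("http://example.com/dina")));
    }
}
```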

The key identifying a given entity instance is composed of:


The GridDialect specific to the NoSQL datastore you target is then responsible for converting this map into the most natural model:

  • for a key/value store or a data grid, we use the logical key as the key in the grid and we store the map as the value. Note that it’s an approximation and some key/value providers will use more tailored approaches.
  • for a document oriented store, the map is represented by a document and each entry in the map corresponds to a property in a document.

Associations are also stored as tuples, or more specifically as a set of tuples. Hibernate OGM stores the information necessary to navigate from an entity to its associations. This is a departure from the pure relational model but it ensures that association data is reachable via key lookups based on the information contained in the entity tuple we want to navigate from. Note that this leads to some level of duplication as information has to be stored for both sides of the association.

The key in which association data are stored is composed of:

  • the table name
  • the column name(s) representing the foreign key to the entity we come from
  • the column value(s) representing the foreign key to the entity we come from

Using this approach, we favor fast read and (slightly) slower writes.
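The following sketch shows how such a composite key makes association navigation a single lookup. AssociationKeySketch is a hypothetical class written for this example, and the column name breed_id is an assumed join column. The actual key type used by Hibernate OGM is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key type: table name + foreign key column name(s) + foreign key value(s).
final class AssociationKeySketch {
    final String table;
    final List<String> columnNames;
    final List<Object> columnValues;

    AssociationKeySketch(String table, List<String> columnNames, List<Object> columnValues) {
        this.table = table;
        this.columnNames = columnNames;
        this.columnValues = columnValues;
    }

    // Convenience factory for the common single-column foreign key.
    static AssociationKeySketch of(String table, String columnName, Object columnValue) {
        return new AssociationKeySketch(table,
                Collections.singletonList(columnName),
                Collections.singletonList(columnValue));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AssociationKeySketch)) {
            return false;
        }
        AssociationKeySketch other = (AssociationKeySketch) o;
        return table.equals(other.table)
                && columnNames.equals(other.columnNames)
                && columnValues.equals(other.columnValues);
    }

    @Override
    public int hashCode() {
        return 31 * (31 * table.hashCode() + columnNames.hashCode()) + columnValues.hashCode();
    }

    public static void main(String[] args) {
        // The "grid": association key -> set of tuples, one tuple per associated row.
        Map<AssociationKeySketch, List<Map<String, Object>>> grid =
                new HashMap<AssociationKeySketch, List<Map<String, Object>>>();

        Map<String, Object> row = new HashMap<String, Object>();
        row.put("id", 1L);
        row.put("breed_id", "b-42");
        List<Map<String, Object>> dogsOfBreed = new ArrayList<Map<String, Object>>();
        dogsOfBreed.add(row);
        grid.put(of("Dog", "breed_id", "b-42"), dogsOfBreed);

        // Navigating from the breed to its dogs is one key lookup, no scan needed.
        System.out.println(grid.get(of("Dog", "breed_id", "b-42")).size());
    }
}
```

The write path has to maintain these association entries in addition to the entity tuples, which is where the slightly slower writes come from.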


Note that this approach has benefits and drawbacks:

  • it ensures that all CRUD operations are doable via key lookups
  • it favors reads over writes (for associations)
  • but it duplicates data

Note

We might offer alternative association data persistence options in the future based on feedback.

Again, there are specificities in how data is inherently stored in the specific NoSQL store. For example, in document oriented stores, the association information including the identifier to the associated entities can be stored in the entity owning the association. This is a more natural model for documents.
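The difference between the two layouts can be sketched with plain maps. Everything in this example is illustrative: the class AssociationLayoutSketch, the key notation such as Dog[breed_id=b-42] and the dogs property are assumptions, not the format any dialect actually writes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class AssociationLayoutSketch {
    // Key/value flavour: the association lives under its own key, next to the entity tuple.
    static Map<String, Object> keyValueLayout() {
        Map<String, Object> breed = new LinkedHashMap<String, Object>();
        breed.put("id", "b-42");
        breed.put("name", "Collie");

        Map<String, Object> grid = new LinkedHashMap<String, Object>();
        grid.put("Breed#b-42", breed);                         // entity entry
        grid.put("Dog[breed_id=b-42]", Arrays.asList(1L, 2L)); // separate association entry
        return grid;
    }

    // Document flavour: identifiers of the associated entities are embedded in the owning document.
    static Map<String, Object> documentLayout() {
        Map<String, Object> breedDocument = new LinkedHashMap<String, Object>();
        breedDocument.put("id", "b-42");
        breedDocument.put("name", "Collie");
        breedDocument.put("dogs", Arrays.asList(1L, 2L));
        return breedDocument;
    }

    public static void main(String[] args) {
        System.out.println(keyValueLayout());
        System.out.println(documentLayout());
    }
}
```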

TODO: this sentence might be worth a diagram to show the difference with the key/value store.

Some identifier generators need to store a seed in the datastore (sequences, for example). The seed is stored as the value whose key is composed of:

  • the table name
  • the column name representing the segment
  • the column value representing the segment
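A minimal sketch of this seed storage follows, reusing the values of the @TableGenerator mapping from the getting started chapter (table sequences, segment column key, segment value dog). The class IdSeedSketch and its key notation are hypothetical. The sketch is not thread-safe and ignores details such as allocation sizes.

```java
import java.util.HashMap;
import java.util.Map;

final class IdSeedSketch {
    private final Map<String, Long> store = new HashMap<String, Long>();

    // Key built from the table name, the segment column name and the segment column value.
    static String seedKey(String table, String segmentColumn, String segmentValue) {
        return table + "[" + segmentColumn + "=" + segmentValue + "]";
    }

    // Returns the next identifier and stores the updated seed under the composite key.
    long nextValue(String table, String segmentColumn, String segmentValue) {
        String key = seedKey(table, segmentColumn, segmentValue);
        Long current = store.get(key);
        long next = (current == null) ? 1L : current + 1L;
        store.put(key, next);
        return next;
    }

    public static void main(String[] args) {
        IdSeedSketch seeds = new IdSeedSketch();
        System.out.println(seeds.nextValue("sequences", "key", "dog")); // prints 1
        System.out.println(seeds.nextValue("sequences", "key", "dog")); // prints 2
    }
}
```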

Make sure to check the chapter dedicated to the NoSQL store you target to find the specificities.

Many NoSQL stores have no notion of schema. Likewise, the tuple stored by Hibernate OGM is not tied to a particular schema: the tuple is represented by a Map, not a typed Map specific to a given entity type. Nevertheless, JPA does describe a schema thanks to:

  • the class schema
  • the JPA physical annotations like @Table and @Column.

While tied to the application, it offers some robustness and explicit understanding when the schema is changed as the schema is right in front of the developers' eyes. This is an intermediary model between the strictly typed relational model and the totally schema-less approach pushed by some NoSQL families.

Since Hibernate OGM wants to offer all of JPA, it needs to support JP-QL queries. Hibernate OGM parses the JP-QL query string and extracts its meaning. From there, several options are available depending on the capabilities of the NoSQL store you target:

If the NoSQL datastore has some query capabilities and if the JP-QL query is simple enough to be executed by the datastore, then the JP-QL parser directly pushes the query generation to the NoSQL specific query translator. The query returns the list of matching identifiers and uses Hibernate OGM to return managed objects.

Some of the JP-QL features are not supported by NoSQL solutions. Two typical examples are joins between entities - which you should limit anyway in a NoSQL environment - and aggregations like average, max, min etc. When the NoSQL store does not support the query, we use Teiid - a database federation engine - to build simpler queries executed against the datastore and perform the join or aggregation operations in Teiid itself.

Finally some NoSQL stores have poor query support, or none at all. In this case Hibernate OGM can use Hibernate Search as its indexing and query engine. Hibernate Search is able to index and query objects - entities - and run full-text queries. It uses the well known Apache Lucene to do that but adds a few interesting characteristics like clustering support and an object oriented abstraction including an object oriented query DSL. Let’s have a look at the architecture of Hibernate OGM when using Hibernate Search:


In this situation, Hibernate ORM Core pushes change events to Hibernate Search which will index entities accordingly and keep the index and the datastore in sync. The JP-QL query parser delegates the query translation to the Hibernate Search query translator and executes the query on top of the Lucene indexes. Indexes can be stored in various fashions:

  • on a file system (the default in Lucene)
  • in Infinispan via the Infinispan Lucene directory implementation: the index is then distributed across several servers transparently
  • in NoSQL stores like Voldemort that can natively store Lucene indexes
  • in NoSQL stores that can be used as overflow to Infinispan: in this case Infinispan is used as an intermediary layer to serve the index efficiently but persists the index in another NoSQL store.
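The idea of keeping an index in sync with the datastore through change events can be sketched with two in-memory maps. This is only a toy model under stated assumptions: IndexSyncSketch is a hypothetical class, the "index" is a simple token-to-identifiers map, and none of this reflects how Hibernate Search or Lucene are actually implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class IndexSyncSketch {
    private final Map<Long, String> datastore = new HashMap<Long, String>();       // id -> name
    private final Map<String, Set<Long>> index = new HashMap<String, Set<Long>>(); // token -> ids

    // "Change event" for an insert or update: write to the store, then re-index the entity.
    void persist(Long id, String name) {
        remove(id); // drop stale index entries if the entity already existed
        datastore.put(id, name);
        for (String token : tokens(name)) {
            Set<Long> ids = index.get(token);
            if (ids == null) {
                ids = new TreeSet<Long>();
                index.put(token, ids);
            }
            ids.add(id);
        }
    }

    // "Change event" for a delete: remove from the store and from every index entry.
    void remove(Long id) {
        String old = datastore.remove(id);
        if (old == null) {
            return;
        }
        for (String token : tokens(old)) {
            Set<Long> ids = index.get(token);
            if (ids != null) {
                ids.remove(id);
                if (ids.isEmpty()) {
                    index.remove(token);
                }
            }
        }
    }

    // Queries run against the index only and return matching identifiers.
    Set<Long> search(String token) {
        Set<Long> ids = index.get(token.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.<Long>emptySet() : Collections.unmodifiableSet(ids);
    }

    private static String[] tokens(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }
}
```

As in the query flow described earlier, the search returns identifiers, which would then be used to load managed objects from the datastore.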

Note that for complex queries involving joins or aggregation, Hibernate OGM can use Teiid as an intermediary query engine that will delegate to Hibernate Search.

Note that you can use Hibernate Search even if you do plan to use the NoSQL datastore query capabilities. Hibernate Search offers a few interesting options:

  • clusterability
  • full-text queries - i.e. Google for your entities
  • geospatial queries
  • query faceting (i.e. dynamic categorization of the query results by price, brand etc)

What’s the progress status on queries?

Well… now is a good time to remind you that Hibernate OGM is open source and that contributing to such a cutting-edge project is a lot of fun. Check out Chapter 1, How to get help and contribute on Hibernate OGM for more details.

But to answer your question, we have finished the skeleton of the architecture as well as the JP-QL parser implementation. The Hibernate Search query translator can execute simple queries already. However, we do not yet have a NoSQL specific query translator but the approach is quite clear to us. Teiid for complex queries is also not integrated but work is being done to facilitate that integration soon. Native Hibernate Search queries are fully supported.

Hibernate OGM favors ease of use and convention over configuration. This makes its configuration quite simple by default.

Hibernate OGM can be used via the Hibernate native APIs (Session) or via the JPA APIs (EntityManager). Depending on your choice, the bootstrapping strategy is slightly different.

The good news is that if you use JPA as your primary API, the configuration is extremely simple. Hibernate OGM is seen as a persistence provider which you need to configure in your persistence.xml. That’s it! The provider name is org.hibernate.ogm.jpa.HibernateOgmPersistence.


There are a couple of things to notice:

You also need to configure which NoSQL datastore you want to use and how to connect to it. We will detail how to do that later in Chapter 5, Datastores. In this case, we have used the default settings for Infinispan.

From there, simply bootstrap JPA the way you are used to with Hibernate ORM:

  • via Persistence.createEntityManagerFactory
  • by injecting the EntityManager / EntityManagerFactory in a Java EE container
  • by using your favorite injection framework (CDI - Weld, Spring, Guice)

If you want to bootstrap Hibernate OGM using the native Hibernate APIs, use the class org.hibernate.ogm.cfg.OgmConfiguration.


There are a couple of things to notice:

You also need to configure which NoSQL datastore you want to use and how to connect to it. We will detail how to do that later in Chapter 5, Datastores. In this case, we have used the default settings for Infinispan.

Hibernate OGM runs in various environments, pretty much what you are used to with Hibernate ORM. There are however environments where it works better and has been more thoroughly tested.

You don’t have to do much in this case. You need three specific settings:

If you use JPA, simply set the transaction-type to JTA and the transaction factory will be set for you.

If you use Hibernate ORM native APIs only, then set hibernate.transaction.factory_class to either:

Set the JTA platform to the right Java EE container. The property is hibernate.transaction.jta.platform and must contain the fully qualified class name of the lookup implementation. The available values are listed in Hibernate ORM’s configuration section. For example, in WildFly, use org.hibernate.service.jta.platform.internal.JBossAppServerJtaPlatform.

In your persistence.xml, you also need to define an existing datasource. It is not needed by Hibernate OGM and won’t be used but the JPA specification mandates this setting.


java:DefaultDS will work for out of the box WildFly deployments.

There is a set of common misconceptions in the Java community about JTA:

None of that is true, of course. Let me show you how to use JBoss Transaction in a standalone environment with Hibernate OGM.

In Hibernate OGM, make sure to set the following properties:

On the JBoss Transaction side, add JBoss Transaction to your classpath. If you use Maven, it should look like this:


The next step is to get access to the transaction manager. The easiest solution is to do as in the following example:

TransactionManager transactionManager =
   com.arjuna.ats.jta.TransactionManager.transactionManager();

Then use the standard JTA APIs to demarcate your transaction and you are done!


That was not too hard, was it? Note that application frameworks like Seam or Spring Framework should be able to initialize the transaction manager and call it to demarcate transactions for you. Check their respective documentation.

The most important options when configuring Hibernate OGM are related to the datastore. They are explained in Chapter 5, Datastores.

Otherwise, most options from Hibernate ORM and Hibernate Search are applicable when using Hibernate OGM. You can pass them as you are used to do either in your persistence.xml file, your hibernate.cfg.xml file or programmatically.

More interesting is a list of options that do not apply to Hibernate OGM and that should not be set:

  • hibernate.dialect
  • hibernate.connection.* and in particular hibernate.connection.provider_class
  • hibernate.show_sql and hibernate.format_sql
  • hibernate.default_schema and hibernate.default_catalog
  • hibernate.use_sql_comments
  • hibernate.jdbc.*
  • hibernate.hbm2ddl.auto and hibernate.hbm2ddl.import_file
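If you migrate an existing Hibernate ORM configuration, a small check like the following sketch can flag leftover settings from the list above. The class IgnoredOptionsSketch and its method are hypothetical helpers written for this guide. Hibernate OGM does not ship such a utility as far as this document states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class IgnoredOptionsSketch {
    // Options listed above that are matched by their exact name.
    private static final List<String> EXACT = Arrays.asList(
            "hibernate.dialect",
            "hibernate.show_sql", "hibernate.format_sql",
            "hibernate.default_schema", "hibernate.default_catalog",
            "hibernate.use_sql_comments",
            "hibernate.hbm2ddl.auto", "hibernate.hbm2ddl.import_file");

    // Option families listed above with a wildcard (hibernate.connection.*, hibernate.jdbc.*).
    private static final List<String> PREFIXES = Arrays.asList(
            "hibernate.connection.", "hibernate.jdbc.");

    // Returns the sorted names of all settings that do not apply to Hibernate OGM.
    static List<String> findInapplicable(Map<String, ?> settings) {
        List<String> found = new ArrayList<String>();
        for (String key : settings.keySet()) {
            if (EXACT.contains(key) || hasIgnoredPrefix(key)) {
                found.add(key);
            }
        }
        Collections.sort(found);
        return found;
    }

    private static boolean hasIgnoredPrefix(String key) {
        for (String prefix : PREFIXES) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```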

Provided you’re deploying on WildFly 8 or JBoss EAP 6, there is an additional way to add the OGM dependencies to your application.

In WildFly 8 and JBoss EAP 6, class loading is based on modules that have to define explicit dependencies on other modules. Modules allow you to share the same artifacts across multiple applications, giving you smaller and quicker deployments.

More details about modules are described in Class Loading in WildFly.

You can download the pre-packaged module ZIP from:

Unpack the archive into the modules folder of your WildFly 8 installation. The modules included are:

  • org.hibernate:ogm, the core Hibernate OGM library and the Infinispan datastore provider.
  • org.hibernate.ogm.<%DATASTORE%>:main, one module for each datastore provider besides Infinispan, with <%DATASTORE%> being one of ehcache, mongodb etc. You only need to add those modules which you actually intend to use.
  • Several shared dependencies such as org.hibernate.hql:<%VERSION%> (containing the query parser) and others

There are two ways to include the dependencies in your project:

Using the manifest
Add this entry to the MANIFEST.MF in your archive (replace <%DATASTORE%> with the right value for your chosen datastore):
Dependencies: org.hibernate:ogm services, org.hibernate.ogm.<%DATASTORE%>:main services
Using jboss-deployment-structure.xml
This is a JBoss-specific descriptor. Add a WEB-INF/jboss-deployment-structure.xml in your archive with the following content (replace <%DATASTORE%> with the right value for your chosen datastore):

<jboss-deployment-structure>
    <deployment>
        <dependencies>
            <module name="org.hibernate" slot="ogm" services="export" />
            <module name="org.hibernate.ogm.<%DATASTORE%>" slot="main" services="export" />
        </dependencies>
    </deployment>
</jboss-deployment-structure>

More information about the descriptor can be found in the WildFly documentation.

You can download the pre-packaged module ZIP from:

Unpack the archive into the modules folder of your JBoss EAP 6 installation. The modules included are:

  • org.hibernate:ogm, the core Hibernate OGM library and the Infinispan datastore provider.
  • org.hibernate.ogm.<%DATASTORE%>:main, one module for each datastore provider besides Infinispan, with <%DATASTORE%> being one of ehcache, mongodb etc. You only need to add those modules which you actually intend to use.
  • javax.persistence.api:main, this will upgrade the JPA API version in JBoss EAP to 2.1
  • javax.ws.rs.api:main, org.jboss.resteasy.resteasy-*:main, this will upgrade the RESTEasy version in JBoss EAP to 3.0.x (JAX-RS 2). Note that the RESTEasy modules are obtained from the official RESTEasy distribution (which itself contains a module ZIP) and re-packaged into the module ZIP of Hibernate OGM for your convenience
  • Several shared dependencies such as org.hibernate.hql:<%VERSION%> (containing the query parser) and others

Warning

The existing modules for javax.persistence.api, javax.ws.rs.api:main and org.javassist:main are going to be replaced with the version required by OGM. Efforts are underway to avoid this in future revisions (see OGM-499).

There are two ways to include the dependencies in your project:

Using the manifest
Add this entry to the MANIFEST.MF in your archive (replace <%DATASTORE%> with the right value for your chosen datastore):
Dependencies: org.hibernate:ogm, org.hibernate.ogm.<%DATASTORE%>:main services
Using jboss-deployment-structure.xml
This is a JBoss-specific descriptor. Add a WEB-INF/jboss-deployment-structure.xml in your archive with the following content (replace <%DATASTORE%> with the right value for your chosen datastore):

<jboss-deployment-structure>
    <deployment>
        <dependencies>
            <module name="org.hibernate" slot="ogm" />
            <module name="org.hibernate.ogm.<%DATASTORE%>" slot="main" services="export" />
        </dependencies>
    </deployment>
</jboss-deployment-structure>

If you are working with the CouchDB backend, you furthermore need to add the following to your jboss-deployment-structure.xml:


...
<deployment>
    ...
    <dependencies>
        ...
        <module name="org.jboss.resteasy.resteasy-jackson2-provider" services="import" />
    </dependencies>
    <exclusions>
        <module name="org.jboss.resteasy.resteasy-jackson-provider" />
    </exclusions>
</deployment>

This causes the RESTEasy Jackson 2 provider to be enabled rather than the Jackson 1 provider which is used in JBoss EAP 6.x by default. That step is required as the CouchDB dialect implementation depends on Jackson 2.

Currently Hibernate OGM supports the following datastores:

  • Map: stores data in an in-memory Java map. Use it only for unit tests.
  • Infinispan: stores data into Infinispan (data grid)
  • Ehcache: stores data into Ehcache (cache)
  • MongoDB: stores data into MongoDB (document store)
  • Neo4j: stores data into Neo4j (graph database)
  • CouchDB: stores data into CouchDB (document store)

More are planned. If you are interested, come talk to us (see Chapter 1, How to get help and contribute on Hibernate OGM).

Hibernate OGM interacts with NoSQL datastores via two contracts:

  • a datastore provider which is responsible for starting and stopping the connection(s) with the datastore and for propping up the datastore if needed
  • a grid dialect which is responsible for converting a Hibernate OGM operation into a datastore-specific operation

The main thing you need to do is to configure which datastore provider you want to use. This is done via the hibernate.ogm.datastore.provider option. Possible values are

  • the fully qualified class name of a DatastoreProvider implementation or
  • preferably, one of the following shortcuts: map (only to be used for unit tests), infinispan, ehcache, mongodb, neo4j or couchdb

Note

When bootstrapping a session factory or entity manager factory programmatically, you should use the constants declared on OgmProperties to specify configuration properties such as hibernate.ogm.datastore.provider.

In this case you can also specify the provider as the class object of a datastore provider type, or pass an instance of a datastore provider type:

Map<String, Object> properties = new HashMap<String, Object>();


// pass the type
properties.put( OgmProperties.DATASTORE_PROVIDER, MyDatastoreProvider.class );
// or an instance
properties.put( OgmProperties.DATASTORE_PROVIDER, new MyDatastoreProvider() );
EntityManagerFactory emf = Persistence.createEntityManagerFactory( "my-pu", properties );

You also need to add the relevant Hibernate OGM module to your classpath. In Maven that would look like:


<dependency>
    <groupId>org.hibernate.ogm</groupId>
    <artifactId>hibernate-ogm-infinispan</artifactId>
    <version>4.1.0.Beta7</version>
</dependency>

The module names are hibernate-ogm-infinispan, hibernate-ogm-ehcache, hibernate-ogm-mongodb, hibernate-ogm-neo4j and hibernate-ogm-couchdb. The map datastore is included in the Hibernate OGM engine module.

By default, a datastore provider chooses the best grid dialect transparently but you can manually override that setting with the hibernate.ogm.datastore.grid_dialect option. Use the fully qualified class name of the GridDialect implementation. Most users should ignore this setting entirely and live happy.
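
As a rough illustration of the two options discussed above, the following sketch assembles the configuration properties using plain string keys. The class name DatastoreSettingsSketch and the dialect class com.example.MyGridDialect are hypothetical; in a real application you would rather use the constants declared on OgmProperties.

```java
import java.util.HashMap;
import java.util.Map;

public class DatastoreSettingsSketch {

    /**
     * Collects the datastore-related settings.
     *
     * @param provider a shortcut such as "infinispan" or a fully qualified provider class name
     * @param gridDialectClassName optional override of the grid dialect, may be null
     */
    public static Map<String, Object> settings(String provider, String gridDialectClassName) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("hibernate.ogm.datastore.provider", provider);
        if (gridDialectClassName != null) {
            // Only needed when the dialect chosen by the provider should be overridden
            properties.put("hibernate.ogm.datastore.grid_dialect", gridDialectClassName);
        }
        return properties;
    }

    public static void main(String[] args) {
        // The common case: let the provider pick the grid dialect
        System.out.println(settings("infinispan", null));
        // The rare case: hypothetical custom dialect
        System.out.println(settings("infinispan", "com.example.MyGridDialect"));
    }
}
```

The resulting map can be passed to Persistence.createEntityManagerFactory() in the same way as in the programmatic bootstrapping example shown earlier.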

Infinispan is an open source in-memory data grid focusing on high performance. As a data grid, you can deploy it on multiple servers - referred to as nodes - and connect to it as if it were a single storage engine: it will cleverly distribute both the computation effort and the data storage.

It is trivial to set up on a single node, in your local JVM, so you can easily try Hibernate OGM. But Infinispan really shines in multiple node deployments: you will need to configure some networking details but nothing changes in terms of application behaviour, while performance and data size can scale linearly.

From all its features we’ll only describe those relevant to Hibernate OGM; for a complete description of all its capabilities and configuration options, refer to the Infinispan project documentation at infinispan.org.

Two steps basically:

Hibernate OGM will not use a single Cache but three and is going to use them for different purposes; so that you can configure the Caches meant for each role separately.

We’ll explain in the following paragraphs how you can take advantage of this and which aspects of Infinispan you’re likely to want to reconfigure from their defaults. All attributes and elements from Infinispan which we don’t mention are safe to ignore. Refer to the Infinispan User Guide for the guru level performance tuning and customizations.

An Infinispan configuration file is an XML file complying with the Infinispan schema; the basic structure is shown in the following example:


The global section contains elements which affect the whole instance; mainly of interest for Hibernate OGM users is the transport element in which we’ll set JGroups configuration overrides.

In the namedCache section (or in default if we want to affect all named caches) we’ll likely want to configure clustering modes, eviction policies and CacheStores.

In its default configuration Infinispan stores all data in the heap of the JVM. In this barebone mode it is conceptually not very different from using a HashMap: the data must fit in the heap of your VM, and stopping, killing or crashing your application loses all data with no way to recover it.

To store data permanently (out of the JVM memory) a CacheStore should be enabled. The infinispan-core.jar includes a simple implementation able to store data in simple binary files, on any read/write mounted filesystem; this is an easy starting point, but the real stuff is to be found in the additional modules found in the Infinispan distribution. Here you can find many more implementations to store your data in anything from JDBC connected relational databases, other NoSQL engines, to cloud storage services or other Infinispan clusters. Finally, implementing a custom CacheStore is a trivial programming exercise.

To limit the consumption of the precious heap space, you can activate a passivation or an eviction policy. There are several strategies to play with; for now, just note that you will likely need one to avoid running out of memory when storing too many entries in the bounded JVM memory space. Of course you don’t need to choose one while experimenting with limited data sizes. Enabling such a strategy has no functional impact on your Hibernate OGM application, only a performance one: entries stored in the Infinispan in-memory space are accessed much more quickly than entries read from any CacheStore.

A CacheStore can be configured as write-through, committing all changes to the CacheStore before returning (and in the same transaction) or as write-behind. A write-behind configuration is normally not encouraged in storage engines, as a failure of the node implies some data might be lost without receiving any notification about it, but this problem is mitigated in Infinispan because of its capability to combine CacheStore write-behind with a synchronous replication to other Infinispan nodes.


In this example we enabled both eviction and a CacheStore (the loader element). LIRS is one of the choices we have for eviction strategies. Here it is configured to keep (approximately) 2000 entries in live memory and evict the remaining as a memory usage control strategy.

The CacheStore is enabling passivation, which means that the entries which are evicted are stored on the filesystem.

Warning

You could configure an eviction strategy while not configuring a passivating CacheStore! That is a valid configuration for Infinispan but will have the evictor permanently remove entries. Hibernate OGM will break in such a configuration.
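
To make the difference between eviction with and without passivation more tangible, here is a purely conceptual sketch in plain Java. It does not use the Infinispan API and is not how Infinispan is implemented: a LinkedHashMap plays the role of the bounded in-memory container (with a simple LRU policy instead of LIRS) and a HashMap stands in for the CacheStore. All names are made up for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class PassivatingCacheSketch<K, V> {

    // Stands in for a CacheStore (e.g. files on disk)
    private final Map<K, V> store = new HashMap<>();
    private final boolean passivation;
    private final LinkedHashMap<K, V> memory;

    public PassivatingCacheSketch(final int maxEntries, boolean passivation) {
        this.passivation = passivation;
        // Access-ordered map: the eldest entry is the least recently used one
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= maxEntries) {
                    return false;
                }
                if (PassivatingCacheSketch.this.passivation) {
                    // Passivation: the evicted entry is moved to the store
                    store.put(eldest.getKey(), eldest.getValue());
                }
                // Without passivation the evicted entry is simply lost
                return true;
            }
        };
    }

    public void put(K key, V value) {
        memory.put(key, value);
    }

    public V get(K key) {
        V value = memory.get(key);
        if (value == null && store.containsKey(key)) {
            // Activation: bring the entry back into memory (may evict another one)
            value = store.remove(key);
            memory.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        PassivatingCacheSketch<String, String> safe = new PassivatingCacheSketch<>(2, true);
        PassivatingCacheSketch<String, String> lossy = new PassivatingCacheSketch<>(2, false);
        for (String key : new String[] { "a", "b", "c" }) {
            safe.put(key, "value-" + key);
            lossy.put(key, "value-" + key);
        }
        System.out.println(safe.get("a"));   // still available via the store
        System.out.println(lossy.get("a"));  // null: the entry was evicted for good
    }
}
```

The second cache in main() mirrors the configuration the warning above cautions against: once an entry is evicted, it is gone, which is why such a setup breaks a persistence engine that expects stored entities to remain readable.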

Tip

Currently with Infinispan 5.1, the FileCacheStore is neither very fast nor very efficient: we picked it for ease of setup. For a production system it’s worth looking at the large collection of high performance and cloud friendly cachestores provided by the Infinispan distribution.

The best thing about Infinispan is that all nodes are treated equally and it requires almost no beforehand capacity planning: to add more nodes to the cluster you just have to start new JVMs, on the same or different physical server, having your same Infinispan configuration and your same application.

Infinispan supports several clustering cache modes; each mode provides the same API and functionality but with different performance, scalability and availability options:

To use the replication or distribution cache modes Infinispan will use JGroups to discover and connect to the other nodes.

In the default configuration, JGroups will attempt to autodetect peer nodes using a multicast socket; this works out of the box in most network environments but will require some extra configuration in cloud environments (which often block multicast packets) or in case of strict firewalls. See the JGroups reference documentation, specifically the section on Discovery Protocols, to customize the detection of peer nodes.

Nowadays, the JVM defaults to using the IPv6 network stack; this works fine with JGroups, but only if IPv6 is configured correctly. It is often useful to force the JVM to use IPv4.

It is also useful to let JGroups know which networking interface you want to use; especially if you have multiple interfaces it might not guess correctly.


Note

You don’t need to use IPv4: JGroups is compatible with IPv6 provided you have routing properly configured and valid addresses assigned.

The jgroups.bind_addr needs to match a placeholder name in your JGroups configuration in case you don’t use the default one.
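
The two settings discussed above are usually passed as JVM system properties on the command line. The sketch below merely assembles such flags; java.net.preferIPv4Stack is the standard JVM switch for forcing IPv4, while the class name JGroupsJvmFlags and the address 192.168.122.1 are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class JGroupsJvmFlags {

    /** Returns the JVM flags forcing IPv4 and binding JGroups to the given address. */
    public static List<String> flags(String bindAddress) {
        List<String> flags = new ArrayList<>();
        // Standard JVM system property: prefer the IPv4 stack over IPv6
        flags.add("-Djava.net.preferIPv4Stack=true");
        // Placeholder used by the default JGroups configuration
        flags.add("-Djgroups.bind_addr=" + bindAddress);
        return flags;
    }

    public static void main(String[] args) {
        // 192.168.122.1 is a made-up address; use the one of your network interface
        System.out.println(String.join(" ", flags("192.168.122.1")));
    }
}
```

These properties should be given when the JVM is launched; setting them from application code after networking classes have been initialized may have no effect.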

The default configuration uses distribution as cache mode and uses the jgroups-tcp.xml configuration for JGroups, which is contained in the Infinispan jar as the default configuration for Infinispan users. Let’s see how to reconfigure this:


In the example above we specify a custom JGroups configuration file and set the cache mode for the default cache to distribution; this is going to be inherited by the ENTITIES and the ASSOCIATIONS caches. But for IDENTIFIERS we have chosen (for the sake of this example) to use replication.

Now that you have clustering configured, start the service on multiple nodes. Each node will need the same configuration and jars.

Tip

We have just shown how to override the clustering mode and the networking stack for the sake of completeness, but you don’t have to!

Start with the default configuration and see if that fits you. You can fine tune these settings when you are closer to going into production.

Hibernate Search, which can be used for advanced query capabilities (see Chapter 7, Query your entities), needs some place to store the indexes for its embedded Apache Lucene engine.

A common place to store these indexes is the filesystem, which is the default for Hibernate Search; however, if your goal is to scale your NoSQL engine on multiple nodes, you need to share this index. Network-shared filesystems are a possibility but we don’t recommend that. Often the best option is to store the index in whatever NoSQL database you are using (or a different dedicated one).

Tip

You might find this section useful even if you don’t intend to store your data in Infinispan.

The Infinispan project provides an adaptor to plug into Apache Lucene, so that it writes the indexes in Infinispan and searches data in it. Since Infinispan can be used as an application cache to other NoSQL storage engines by using a CacheStore (see Section 5.1.2, “Manage data size”) you can use this adaptor to store the Lucene indexes in any NoSQL store supported by Infinispan:

  • Cassandra
  • Filesystem (but locked correctly at the Infinispan level)
  • MongoDB
  • HBase
  • JDBC databases
  • JDBM
  • BDBJE
  • A secondary (independent) Infinispan grid
  • Any Cloud storage service supported by JClouds

How to configure it? Here is a simple cheat sheet to get you started with this type of setup:

  • Add org.hibernate:hibernate-search-infinispan:4.5.1.Final to your dependencies
  • set these configuration properties:

    • hibernate.search.default.directory_provider = infinispan
    • hibernate.search.default.exclusive_index_use = false
    • hibernate.search.infinispan.configuration_resourcename = [infinispan configuration filename]
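
When bootstrapping programmatically, the three properties of this cheat sheet can be collected in a map, as in the following sketch. The class name and the file name infinispan-lucene.xml are hypothetical; the property names and the first two values are taken from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LuceneIndexInInfinispanSketch {

    /** Collects the Hibernate Search settings for storing the index in Infinispan. */
    public static Map<String, Object> settings(String infinispanConfigurationFile) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("hibernate.search.default.directory_provider", "infinispan");
        properties.put("hibernate.search.default.exclusive_index_use", "false");
        properties.put("hibernate.search.infinispan.configuration_resourcename",
                infinispanConfigurationFile);
        return properties;
    }

    public static void main(String[] args) {
        // "infinispan-lucene.xml" is an assumed file name
        for (Map.Entry<String, Object> entry : settings("infinispan-lucene.xml").entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```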

The referenced Infinispan configuration should define a CacheStore to load/store the index in the NoSQL engine of choice. It should also define three cache names:


This configuration is not going to scale well on write operations: to do that you should read about the master/slave and sharding options in Hibernate Search. The complete explanation and configuration options can be found in the Hibernate Search Reference Guide.

Some NoSQL stores support storing Lucene indexes directly, in which case you might skip the Infinispan Lucene integration by implementing a custom DirectoryProvider for Hibernate Search. You’re very welcome to share the code and have it merged in Hibernate Search for others to use, inspect, improve and maintain.

When combined with Hibernate ORM, Ehcache is commonly used as a second-level cache, caching data that is stored in a relational database. When used with Hibernate OGM it is not "just a cache" but the main storage engine for your data.

This is not the reference manual for Ehcache itself: we’re going to list only how Hibernate OGM should be configured to use Ehcache; for all the tuning and advanced options please refer to the Ehcache Documentation.

Two steps:

MongoDB is a document oriented datastore written in C++ with strong emphasis on ease of use.

This implementation is based upon the MongoDB Java driver. The currently supported version is 2.10.1.

The following properties are available to configure MongoDB support:

MongoDB datastore configuration properties

hibernate.ogm.datastore.provider
To use MongoDB as a datastore provider, this property must be set to mongodb
hibernate.ogm.option.configurator
The fully-qualified class name or an instance of a programmatic option configurator (see Section 5.3.1.2, “Programmatic configuration”)
hibernate.ogm.datastore.host
The hostname of the MongoDB instance. The default value is 127.0.0.1.
hibernate.ogm.datastore.port
The port used by the MongoDB instance. The default value is 27017.
hibernate.ogm.datastore.database
The database to connect to. This property has no default value.
hibernate.ogm.datastore.username
The username used when connecting to the MongoDB server. This property has no default value.
hibernate.ogm.datastore.password
The password used to connect to the MongoDB server. This property has no default value. This property is ignored if the username isn’t specified.
hibernate.ogm.mongodb.connection_timeout
Defines the timeout used by the driver when the connection to the MongoDB instance is initiated. This configuration is expressed in milliseconds. The default value is 5000.
hibernate.ogm.datastore.document.association_storage
Defines the way OGM stores association information in MongoDB. The following two strategies exist (values of the org.hibernate.ogm.datastore.document.options.AssociationStorageType enum): IN_ENTITY (store association information within the entity) and ASSOCIATION_DOCUMENT (store association information in a dedicated document per association). IN_ENTITY is the default and recommended option unless the association navigation data is much bigger than the core of the document and leads to performance degradation.
hibernate.ogm.mongodb.association_document_storage

Defines how to store association documents (applies only if the ASSOCIATION_DOCUMENT association storage strategy is used). Possible strategies are (values of the org.hibernate.ogm.datastore.mongodb.options.AssociationDocumentType enum):

  • GLOBAL_COLLECTION (default): stores the association information in a unique MongoDB collection for all associations
  • COLLECTION_PER_ASSOCIATION stores the association in a dedicated MongoDB collection per association
hibernate.ogm.mongodb.write_concern
Defines the write concern setting to be applied when issuing writes against the MongoDB datastore. Possible settings are (values of the WriteConcernType enum): ERRORS_IGNORED, ACKNOWLEDGED, UNACKNOWLEDGED, FSYNCED, JOURNALED, REPLICA_ACKNOWLEDGED, MAJORITY and CUSTOM. When set to CUSTOM, a custom WriteConcern implementation type has to be specified.
hibernate.ogm.mongodb.write_concern_type
Specifies a custom WriteConcern implementation type (fully-qualified name, class object or instance). This is useful in cases where the pre-defined configurations are not sufficient, e.g. if you want to ensure that writes are propagated to a specific number of replicas or given "tag set". Only takes effect if hibernate.ogm.mongodb.write_concern is set to CUSTOM.
hibernate.ogm.mongodb.read_preference
Specifies the ReadPreference to be applied when issuing reads against the MongoDB datastore. Possible settings are (values of the ReadPreferenceType enum): PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED and NEAREST. It’s currently not possible to plug in custom read preference types. If you’re interested in such a feature, please let us know.

For more information on write concerns and read preferences, please refer to the official MongoDB documentation. The write concern option is case insensitive and its default value is ACKNOWLEDGED.

Note

When bootstrapping a session factory or entity manager factory programmatically, you should use the constants accessible via MongoDBProperties when specifying the configuration properties listed above. Common properties shared between (document) stores are declared on OgmProperties and DocumentStoreProperties, respectively. To ease migration between stores, it is recommended to reference these constants directly from there.
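
The following sketch shows how a handful of the properties listed above might be assembled programmatically, using plain string keys so that it has no dependencies beyond the JDK. The class name MongoDbSettingsSketch and the validation logic are assumptions made for illustration and do not reflect how Hibernate OGM itself checks these values; in application code, prefer the constants from MongoDBProperties, OgmProperties and DocumentStoreProperties.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class MongoDbSettingsSketch {

    // Pre-defined write concern names as listed for hibernate.ogm.mongodb.write_concern
    private static final List<String> WRITE_CONCERNS = Arrays.asList(
            "ERRORS_IGNORED", "ACKNOWLEDGED", "UNACKNOWLEDGED", "FSYNCED",
            "JOURNALED", "REPLICA_ACKNOWLEDGED", "MAJORITY");

    public static Map<String, Object> settings(String database, String writeConcern) {
        // The option is documented as case insensitive, so normalize before checking
        String normalized = writeConcern.toUpperCase(Locale.ROOT);
        if ("CUSTOM".equals(normalized)) {
            throw new IllegalArgumentException(
                    "CUSTOM also requires hibernate.ogm.mongodb.write_concern_type, "
                    + "which this sketch does not cover");
        }
        if (!WRITE_CONCERNS.contains(normalized)) {
            throw new IllegalArgumentException("Unknown write concern: " + writeConcern);
        }

        Map<String, Object> properties = new HashMap<>();
        properties.put("hibernate.ogm.datastore.provider", "mongodb");
        properties.put("hibernate.ogm.datastore.host", "127.0.0.1"); // documented default
        properties.put("hibernate.ogm.datastore.port", "27017");     // documented default
        properties.put("hibernate.ogm.datastore.database", database);
        properties.put("hibernate.ogm.mongodb.write_concern", normalized);
        return properties;
    }

    public static void main(String[] args) {
        // "zoo" is a made-up database name
        System.out.println(settings("zoo", "journaled"));
    }
}
```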

Hibernate OGM allows you to configure store-specific options via Java annotations. When working with the MongoDB backend, you can specify the following settings:

The following shows an example:


The @WriteConcern annotation on the entity level expresses that all writes should be done using the JOURNALED setting. Similarly, the @ReadPreference annotation advises the engine to preferably read that entity from the primary node if possible. The other two annotations on the type level specify that all associations of the Zoo class should be stored in separate association documents, using a dedicated collection per association. This setting applies to the animals and employees associations. Only the elements of the visitors association will be stored in the document of the corresponding Zoo entity, as per the configuration of that specific property, which takes precedence over the entity-level configuration.

In addition to the annotation mechanism, Hibernate OGM also provides a programmatic API for applying store-specific configuration options. This can be useful if you can’t modify certain entity types or don’t want to add store-specific configuration annotations to them. The API allows you to set options in a type-safe fashion on the global, entity and property levels.

When working with MongoDB, you can currently configure the following options using the API:

To set these options via the API, you need to create an OptionConfigurator implementation as shown in the following example:


The call to configureOptionsFor(), passing the store-specific identifier type MongoDB, provides the entry point into the API. Following the fluent API pattern, you then can configure global options (writeConcern(), readPreference()) and navigate to single entities or properties to apply options specific to these (associationStorage() etc.). The call to writeConcern() for the Animal entity shows how a specific write concern type can be used. Here RequiringReplicaCountOf is a custom implementation of WriteConcern which ensures that writes are propagated to a given number of replicas before a write is acknowledged.

Options given on the property level take precedence over entity-level options. So e.g. the animals association of the Zoo class would be stored using the in-entity strategy, while all other associations of the Zoo entity would be stored using separate association documents.

Similarly, entity-level options take precedence over options given on the global level. Global-level options specified via the API complement the settings given via configuration properties. In case a setting is given via a configuration property and the API at the same time, the latter takes precedence.

Note that for a given level (property, entity, global), an option set via annotations is overridden by the same option set programmatically. This allows you to change settings in a more flexible way if required.
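
One way to read the precedence rules described above is as an ordered lookup: the most specific level wins, and within a level the programmatic API wins over annotations (or, on the global level, over configuration properties). The sketch below encodes this reading in plain Java; it is an illustration of the rules as stated in this section, not Hibernate OGM's actual resolution code, and all names are hypothetical.

```java
import java.util.Arrays;

public class OptionPrecedenceSketch {

    /**
     * Returns the first value that is set, going from the most specific
     * to the least specific source. Unset sources are passed as null.
     */
    public static String effective(String propertyApi, String propertyAnnotation,
            String entityApi, String entityAnnotation,
            String globalApi, String configurationProperty) {
        for (String candidate : Arrays.asList(propertyApi, propertyAnnotation,
                entityApi, entityAnnotation, globalApi, configurationProperty)) {
            if (candidate != null) {
                return candidate;
            }
        }
        return null; // nothing configured: the store's default applies
    }

    public static void main(String[] args) {
        // Property-level IN_ENTITY beats entity-level ASSOCIATION_DOCUMENT
        System.out.println(effective("IN_ENTITY", null, "ASSOCIATION_DOCUMENT", null, null, null));
        // On the same level, the API value overrides the annotation value
        System.out.println(effective(null, null, "ASSOCIATION_DOCUMENT", "IN_ENTITY", null, null));
    }
}
```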

To register an option configurator, specify its class name using the hibernate.ogm.option.configurator property. When bootstrapping a session factory or entity manager factory programmatically, you also can pass in an OptionConfigurator instance or the class object representing the configurator type.

Hibernate OGM tries to make the mapping to the underlying datastore as natural as possible so that third party applications not using Hibernate OGM can still read and update the same datastore. We worked particularly hard on the MongoDB model to offer various classic mappings between your object model and the MongoDB documents.

Entities are stored as MongoDB documents and not as BLOBs which means each entity property will be translated into a document field. You can use the name property of the @Table and @Column annotations to rename the collections and the document’s fields if you need to.

Note that embedded objects are mapped as nested documents.
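
To picture the resulting document shape, the sketch below models a document as nested maps: one field per entity property and a nested map for an embedded object. The Zoo entity with its name and address properties is a made-up example and the code does not involve Hibernate OGM or the MongoDB driver; it only illustrates the structure described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EntityDocumentSketch {

    /** Builds a map mirroring the document of a hypothetical Zoo entity. */
    public static Map<String, Object> zooDocument(long id, String name, String street, String city) {
        // The embedded Address object becomes a nested document
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", street);
        address.put("city", city);

        Map<String, Object> zoo = new LinkedHashMap<>();
        zoo.put("_id", id);          // the identifier is stored in the _id field
        zoo.put("name", name);       // one field per entity property
        zoo.put("address", address); // nested document for the embedded object
        return zoo;
    }

    public static void main(String[] args) {
        System.out.println(zooDocument(1L, "City Zoo", "1 Main Street", "Springfield"));
    }
}
```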


The _id field of a MongoDB document is directly used to store the identifier columns mapped in the entities. You can use simple identifiers (e.g. of type long with a table-based id generator or of type String with a GUID generator) as well as embedded identifiers.

Generally, it is recommended though to work with MongoDB’s object id data type. This will facilitate the integration with other applications possibly expecting that common MongoDB id type. To do so, you have two options:

In both cases the id will be stored as native ObjectId in the datastore.

You can assign id values yourself or (preferably) take advantage of the IDENTITY generation strategy which will automatically assign an id during insert. The following shows an example:


Note

You also can use GenerationType.AUTO to store ids as object id in MongoDB. This requires though the property hibernate.id.new_generator_mappings to be set to false.

Embedded identifiers are stored as an embedded document within the _id field. Hibernate OGM converts the @Id property into the _id document field, so whatever you name the entity id, it will always be stored in _id (the recommended approach in MongoDB). In particular, this means that MongoDB will automatically index your _id fields. Let’s look at an example:


Hibernate OGM MongoDB offers three strategies to store navigation information for associations. To switch between these strategies, either use the @AssociationStorage and @AssociationDocumentStorage annotations (see Section 5.3.1.1, “Annotation based configuration”), the API for programmatic configuration (see Section 5.3.1.2, “Programmatic configuration”) or specify a default strategy via the hibernate.ogm.datastore.document.association_storage and hibernate.ogm.mongodb.association_document_storage configuration properties.

The three possible strategies are:

  • IN_ENTITY (default)
  • ASSOCIATION_DOCUMENT, using a global collection for all associations
  • ASSOCIATION_DOCUMENT, using a dedicated collection for each association

You can express queries in a few different ways:

Hibernate OGM also supports certain forms of native queries for MongoDB. Currently two forms of native queries are available via the MongoDB backend:

The former always maps results to entity types. The latter either maps results to entity types or to certain supported forms of projection. Note that parameterized queries are not supported by MongoDB, so don’t expect Query#setParameter() to work.

Warning

Specifying native MongoDB queries using the CLI syntax is an EXPERIMENTAL feature for the time being. Currently only find() and count() queries are supported via the CLI syntax. Further query types (including updating queries) may be supported in future revisions.

No cursor operations such as sort() are supported. Instead use the corresponding MongoDB query modifiers such as $orderby within the criteria parameter.

JSON parameters passed via the CLI syntax must be specified using the strict mode. The only relaxation of this is that single quotes may be used when specifying attribute names/values to facilitate embedding queries within Java strings.

Note that results of projections are returned as retrieved from the MongoDB driver at the moment and are not (yet) converted using suitable Hibernate OGM type implementations.

You can execute native queries as shown in the following example:


The result of a query is a managed entity (or a list thereof) or a projection of attributes in form of an object array, just like you would get from a JP-QL query.


Note

As OgmSession extends org.hibernate.Session (which originally has been designed with relational databases in mind only) you could also invoke createSQLQuery() to create a native query. But for the sake of comprehensibility, you should prefer createNativeQuery(), and in fact createSQLQuery() has been deprecated on OgmSession.

Native queries can also be created using the @NamedNativeQuery annotation:


Hibernate OGM stores data in a natural way, so you can still execute queries using the MongoDB driver. The main drawback is that the results are going to be raw MongoDB documents and not managed entities.

Neo4j is a robust (fully ACID) transactional property graph database. This kind of database is suited to problems that can be represented as a graph, such as social relationships or road maps.

At the moment only the support for the embedded Neo4j is included in OGM.

This is our first version and a bit experimental. In particular we plan on using node navigation much more than index lookup in a future version.

The following properties are available to configure Neo4j support:

Hibernate OGM tries to make the mapping to the underlying datastore as natural as possible so that third party applications not using Hibernate OGM can still read and update the same datastore.

Entities are stored as Neo4j nodes, which means each entity property will be translated into a property of the node. The name of the table mapping the entity is used as label.

You can use the name property of the @Table and @Column annotations to rename the label and the node’s properties.

An additional label ENTITY is added to the node.



Currently an embeddable element is stored saving the values on the node that represents the entity.

Hibernate OGM will create unique constraints for the identifier of entities and for the properties defined as unique using:

  • @NaturalId
  • @Column(unique = true)
  • @Table(uniqueConstraints = @UniqueConstraint(columnNames = { "column_name" }))

    Neo4j does not support constraints on more than one property.
    For this reason, Hibernate OGM will create a unique constraint only when
    defined on a single column and it will ignore the ones defined on multiple columns.
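
The single-column rule above can be expressed as a simple filter over the declared constraints. The following sketch is an illustration in plain Java only; the class name, method name and column names are hypothetical and the code is not taken from Hibernate OGM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UniqueConstraintFilterSketch {

    /**
     * Given the column lists of all declared unique constraints, returns the
     * columns for which a unique constraint would be created: only constraints
     * spanning exactly one column are kept, multi-column ones are ignored.
     */
    public static List<String> constrainedColumns(List<List<String>> declaredConstraints) {
        List<String> result = new ArrayList<>();
        for (List<String> columns : declaredConstraints) {
            if (columns.size() == 1) {
                result.add(columns.get(0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> declared = Arrays.asList(
                Arrays.asList("isbn"),                      // kept
                Arrays.asList("first_name", "last_name"));  // ignored: two columns
        System.out.println(constrainedColumns(declared));
    }
}
```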

An association, bidirectional or unidirectional, is always mapped using one relationship, beginning at the owning side of the association. This is possible because in Neo4j relationships can be navigated in both directions.

The type of the relationships depends on the type of the association, but in general it is the role of the association on the main side. The only property stored on the relationship is going to be the index of the association when required, for example when the association is annotated with @OrderColumn or when a java.util.Map is used.





Hibernate OGM supports the table generation strategy as well as the sequence generation strategy with Neo4j. It is generally recommended to work with the latter, as it allows a slightly more efficient querying for the next sequence value.

Sequence-based generators are represented by nodes in the following form:


Each sequence generator node is labelled with SEQUENCE. The sequence name can be specified via @SequenceGenerator#sequenceName(). A unique constraint is applied to the property sequence_name in order to ensure uniqueness of sequences.

If required, you can set the initial value of a sequence and the increment size via @SequenceGenerator#initialValue() and @SequenceGenerator#allocationSize(), respectively. The options @SequenceGenerator#catalog() and @SequenceGenerator#schema() are not supported.

Table-based generators are represented by nodes in the following form:


Each table generator node is labelled with TABLE_BASED_SEQUENCE and the table name as specified via @TableGenerator#table(). The sequence name is to be given via @TableGenerator#pkColumnValue(). The node properties holding the sequence name and value can be configured via @TableGenerator#pkColumnName() and @TableGenerator#valueColumnName(), respectively. A unique constraint is applied to the property sequence_name to prevent the same sequence name from being used twice within the same "table".

If required, you can set the initial value of a sequence and the increment size via @TableGenerator#initialValue() and @TableGenerator#allocationSize(), respectively. The options @TableGenerator#catalog(), @TableGenerator#schema(), @TableGenerator#uniqueConstraints() and @TableGenerator#indexes() are not supported.

You can express queries in a few different ways:

Hibernate OGM also supports Cypher queries for Neo4j. You can execute Cypher queries as shown in the following example:


The result of a query is a managed entity (or a list thereof) or a projection of attributes in form of an object array, just like you would get from a JP-QL query.


Note

As OgmSession extends org.hibernate.Session (which originally has been designed with relational databases in mind only) you could also invoke createSQLQuery() to create a native query. But for the sake of comprehensibility, you should prefer createNativeQuery(), and in fact createSQLQuery() has been deprecated on OgmSession.

Native queries can also be created using the @NamedNativeQuery annotation:


Hibernate OGM stores data in a natural way, so you can still execute queries using your favorite tool. The main drawback is that the results are going to be raw Neo4j elements and not managed entities.

CouchDB is a document-oriented datastore which stores your data in form of JSON documents and exposes its API via HTTP based on REST principles. It is thus very easy to access from a wide range of languages and applications.

Note

Support for CouchDB is considered an EXPERIMENTAL feature as of this release. In particular you should be prepared for possible changes to the persistent representation of mapped objects in future releases. Should you find any bugs or have feature requests for this dialect, then please open a ticket in the OGM issue tracker.

Hibernate OGM uses the excellent RESTEasy library to talk to CouchDB stores, so there is no need to include any of the Java client libraries for CouchDB in your classpath.

The following properties are available to configure CouchDB support in Hibernate OGM:

CouchDB datastore configuration properties

hibernate.ogm.datastore.provider
To use CouchDB as a datastore provider, this property must be set to couchdb
hibernate.ogm.option.configurator
The fully-qualified class name or an instance of a programmatic option configurator (see Section 5.5.1.2, “Programmatic configuration”)
hibernate.ogm.datastore.host
The hostname of the CouchDB instance. The default value is 127.0.0.1.
hibernate.ogm.datastore.port
The port used by the CouchDB instance. The default value is 5984.
hibernate.ogm.datastore.database
The database to connect to. This property has no default value.
hibernate.ogm.datastore.create_database
Whether to create the specified database in case it does not exist or not. Can be true or false (default). Note that the specified user must have the right to create databases if set to true.
hibernate.ogm.datastore.username
The username used when connecting to the CouchDB server. Note that this user must have the right to create design documents in the chosen database. This property has no default value. Hibernate OGM currently does not support accessing CouchDB via HTTPS; if you’re interested in such functionality, let us know.
hibernate.ogm.datastore.password
The password used to connect to the CouchDB server. This property has no default value. This property is ignored if the username isn’t specified.
hibernate.ogm.datastore.document.association_storage
Defines the way OGM stores association information in CouchDB. The following two strategies exist (values of the org.hibernate.ogm.datastore.document.options.AssociationStorageType enum): IN_ENTITY (store association information within the entity) and ASSOCIATION_DOCUMENT (store association information in a dedicated document per association). IN_ENTITY is the default and recommended option unless the association navigation data is much bigger than the core of the document and leads to performance degradation.
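
As a minimal sketch, the properties listed above could be collected in a plain map before bootstrapping. The database name and the credentials are hypothetical, and literal keys are used only to keep the snippet free of dependencies; passing the enum name IN_ENTITY as a string is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CouchDBSettingsExample {

    /** Collects the configuration properties for a CouchDB-backed persistence unit. */
    static Map<String, String> couchDbSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hibernate.ogm.datastore.provider", "couchdb");
        settings.put("hibernate.ogm.datastore.host", "127.0.0.1");    // documented default, shown for clarity
        settings.put("hibernate.ogm.datastore.port", "5984");         // documented default, shown for clarity
        settings.put("hibernate.ogm.datastore.database", "zoo_db");   // hypothetical name; there is no default
        settings.put("hibernate.ogm.datastore.create_database", "true");
        settings.put("hibernate.ogm.datastore.username", "ogm_user"); // hypothetical credentials
        settings.put("hibernate.ogm.datastore.password", "secret");
        // Assumption: the enum constant name is accepted as a string value.
        settings.put("hibernate.ogm.datastore.document.association_storage", "IN_ENTITY");
        return settings;
    }

    public static void main(String[] args) {
        // Print only the keys so that the password is not echoed.
        couchDbSettings().keySet().forEach(System.out::println);
    }
}
```

With JPA, such a map can be handed to Persistence.createEntityManagerFactory(String, Map) as the second argument, where it overrides the values given in persistence.xml.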

Note

When bootstrapping a session factory or entity manager factory programmatically, you should use the constants accessible via CouchDBProperties when specifying the configuration properties listed above. Common properties shared between (document) stores are declared on OgmProperties and DocumentStoreProperties, respectively. To ease migration between stores, it is recommended to reference these constants directly from there.

In addition to the annotation mechanism, Hibernate OGM also provides a programmatic API for applying store-specific configuration options. This can be useful if you can’t modify certain entity types or don’t want to add store-specific configuration annotations to them. The API allows you to set options in a type-safe fashion on the global, entity and property levels.

When working with CouchDB, you can currently configure the following option using the API:

  • association storage strategy (IN_ENTITY or ASSOCIATION_DOCUMENT)

To set this option via the API, you need to create an OptionConfigurator implementation as shown in the following example:


The call to configureOptionsFor(), passing the store-specific identifier type CouchDB, provides the entry point into the API. Following the fluent API pattern, you can then configure global options and navigate to single entities or properties to apply options specific to these.

Options given on the property level take precedence over entity-level options. For example, the visitors association of the Zoo class would be stored using the in-entity strategy, while all other associations of the Zoo entity would be stored using separate association documents.

Similarly, entity-level options take precedence over options given on the global level. Global-level options specified via the API complement the settings given via configuration properties. In case a setting is given via a configuration property and the API at the same time, the latter takes precedence.

Note that for a given level (property, entity, global), an option set via annotations is overridden by the same option set programmatically. This allows you to change settings in a more flexible way if required.
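
The precedence rules described above can be modelled in a few lines of plain Java. This is only a sketch of the rules as stated in this section, not the code Hibernate OGM uses; the class and method names are hypothetical, and it assumes that the level is considered first and the source (API or annotation) second.

```java
final class OptionPrecedenceSketch {

    enum Storage { IN_ENTITY, ASSOCIATION_DOCUMENT }

    /** One configuration level (property or entity); either source may be absent (null). */
    static final class Level {
        final Storage viaApi;
        final Storage viaAnnotation;

        Level(Storage viaApi, Storage viaAnnotation) {
            this.viaApi = viaApi;
            this.viaAnnotation = viaAnnotation;
        }

        /** Within one level, a programmatic setting overrides an annotation. */
        Storage effective() {
            return viaApi != null ? viaApi : viaAnnotation;
        }
    }

    /** Resolves the strategy: property level, then entity level, then global API, then configuration property. */
    static Storage resolve(Level property, Level entity, Storage globalViaApi, Storage globalViaProperty) {
        if (property.effective() != null) {
            return property.effective();
        }
        if (entity.effective() != null) {
            return entity.effective();
        }
        if (globalViaApi != null) {
            return globalViaApi;
        }
        if (globalViaProperty != null) {
            return globalViaProperty;
        }
        return Storage.IN_ENTITY; // documented default strategy
    }
}
```

Called with a property-level IN_ENTITY setting and a global ASSOCIATION_DOCUMENT setting, resolve() returns IN_ENTITY for that property, while a property without its own setting falls through to the global value.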

To register an option configurator, specify its class name using the hibernate.ogm.option.configurator property. When bootstrapping a session factory or entity manager factory programmatically, you can also pass in an OptionConfigurator instance or the class object representing the configurator type.

Hibernate OGM tries to make the mapping to the underlying datastore as natural as possible so that third party applications not using Hibernate OGM can still read and update the same datastore. The following describes how entities and associations are mapped to CouchDB documents by Hibernate OGM.

Entities are stored as CouchDB documents and not as BLOBs, which means each entity property is translated into a document field. You can use the name property of the @Table and @Column annotations to change the table name recorded for an entity and the names of the document’s fields if you need to.

CouchDB provides a built-in mechanism for detecting concurrent updates to one and the same document. For that purpose each document has an attribute named _rev (for "revision") which is to be passed back to the store when doing an update. So when writing back a document and the document’s revision has been altered by another writer in parallel, CouchDB will raise an optimistic locking error (you could then e.g. re-read the current document version and try another update).

For this mechanism to work, you need to declare a property for the _rev attribute in all your entity types and mark it with the @Version and @Generated annotations. The former marks it as a property used for optimistic locking, while the latter advises Hibernate OGM to refresh that property after writes since its value is managed by the datastore.
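
The revision check itself can be illustrated with a toy in-memory store. This is a simplified model and not CouchDB client code: real _rev values are strings assigned by CouchDB, whereas the sketch uses a plain integer counter, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class RevisionCheckSketch {

    /** Signals that the revision passed by the writer is no longer the current one. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    /** Toy store that keeps one integer revision per document id. */
    static final class ToyStore {
        private final Map<String, Integer> revisions = new HashMap<>();
        private final Map<String, String> bodies = new HashMap<>();

        /** Returns the current revision, or 0 if the document does not exist yet. */
        int currentRevision(String id) {
            return revisions.getOrDefault(id, 0);
        }

        String body(String id) {
            return bodies.get(id);
        }

        /** Writes the body only if expectedRevision is still current; returns the new revision. */
        int put(String id, int expectedRevision, String body) {
            int current = currentRevision(id);
            if (current != expectedRevision) {
                throw new ConflictException("expected revision " + expectedRevision + " but found " + current);
            }
            bodies.put(id, body);
            revisions.put(id, current + 1);
            return current + 1;
        }
    }

    /** Re-reads the current revision and retries once after a conflict. */
    static int putWithOneRetry(ToyStore store, String id, int knownRevision, String body) {
        try {
            return store.put(id, knownRevision, body);
        } catch (ConflictException e) {
            return store.put(id, store.currentRevision(id), body);
        }
    }
}
```

Note that the retry in this sketch simply overwrites the concurrent change; an application would usually re-read the document and merge or re-apply its modification before writing again.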

The following shows an example of an entity and its persistent representation in CouchDB.


Note that CouchDB doesn’t have a concept of "tables" or "collections" as e.g. MongoDB does; instead, all documents are stored in one large bucket. Thus Hibernate OGM needs to add two additional attributes: $type, which contains the type of a document (entity vs. association documents), and $table, which specifies the entity name as derived from the type or given via the @Table annotation.

Note

Attributes whose name starts with the "$" character are managed by Hibernate OGM and thus should not be modified manually. Also it is not recommended to start the names of your attributes with the "$" character to avoid collisions with attributes possibly introduced by Hibernate OGM in future releases.

Embedded objects are mapped as nested documents. The following listing shows an example:


Hibernate OGM CouchDB provides two strategies to store navigation information for associations. To switch between these strategies, either use the @AssociationStorage annotation (see Section 5.5.1.1, “Annotation based configuration”), the API for programmatic configuration (see Section 5.5.1.2, “Programmatic configuration”) or specify a global default strategy via the hibernate.ogm.datastore.document.association_storage configuration property.

The possible strategies are IN_ENTITY (default) and ASSOCIATION_DOCUMENT.

TODO:

  • Talk about supported approaches (properties, embedded objects, inheritance)
  • Talk about associations
  • Talk about identifier recommendations

This section is a work in progress. If you find something that does not work as expected, let us know and we will update it (and fix the problem of course).

Pretty much all entity related constructs should work out of the box in Hibernate OGM. @Entity, @Table, @Column, @Enumerated, @Temporal, @Cacheable and the like will work as expected. If you want an example, check out Chapter 2, Getting started with Hibernate OGM or the documentation of Hibernate ORM. Let’s concentrate on the features that differ or are simply not supported by Hibernate OGM.

The various inheritance strategies are not supported by Hibernate OGM; only the table per concrete class strategy is used. This is not so much a limitation but rather an acknowledgment of the dynamic nature of NoSQL schemas. If you feel the need to support other strategies, let us know (see Section 1.2, “How to contribute”). Simply do not use @Inheritance or @DiscriminatorColumn.

Secondary tables are not supported by Hibernate OGM at the moment. If you need this feature, let us know (see Section 1.2, “How to contribute”).

All SQL related constructs as well as HQL centered mappings are not supported in Hibernate OGM. Here is a list of features that will not work:

  • Named queries
  • Native queries

All standard JPA id generators are supported: IDENTITY, SEQUENCE, TABLE and AUTO. If you need support for additional generators, let us know (see Section 1.2, “How to contribute”). We recommend you use a UUID based generator as this type of generator allows maximum scalability to the underlying data grid as no cluster-wide counter is necessary.
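
To illustrate why UUID based identifiers scale well, the following minimal sketch generates an identifier locally with the JDK alone; no shared counter or datastore round trip is involved. The class name is hypothetical. In an entity mapping, one way to obtain comparable behaviour is Hibernate ORM’s uuid2 generator strategy, declared via @GenericGenerator and referenced from @GeneratedValue.

```java
import java.util.UUID;

final class UuidIdSketch {

    /** Generates a random (version 4) identifier without consulting any shared state. */
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Each node of a cluster can run this independently of the others.
        System.out.println(newId());
    }
}
```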


All association types are supported (@OneToOne, @OneToMany, @ManyToOne, @ManyToMany). Likewise, all collection types are supported (Set, Map, List). The way Hibernate OGM stores association information is however quite different from the traditional RDBMS representation. Check Section 3.2, “How is data persisted” for more information.

Keep in mind that collections with many entries won’t perform very well in Hibernate OGM (at least today) as all of the association navigation for a given entity is stored in a single key. If your collection is made of 1 million elements, Hibernate OGM stores 1 million tuples in the association key.

Querying a NoSQL database is a complex task, especially as not all NoSQL solutions support all forms of query. One of the goals of Hibernate OGM is to deal with this complexity so that users don’t have to. However, not all of that is implemented yet, and depending on your use case there might be better approaches you can take advantage of.

If you skipped to this section without reading Chapter 3, Architecture, we suggest reading at least Section 3.3, “How is data queried” as it will greatly help you choose a query approach.

For Hibernate OGM we developed a brand new JP-QL parser which is already able to convert simple queries into Hibernate Search queries (for MongoDB, queries are transformed into native MongoDB queries instead, as described in Section 5.3.4, “Queries”).

Note that the following preconditions must be met:

  • no join, aggregation, or other relational operations are implied
  • the target entities and properties are indexed by Hibernate Search

You can make use of the following JP-QL constructs:

  • simple comparisons using "<", "<=", "=", ">=" and ">"
  • IS NULL and IS NOT NULL
  • the boolean operators AND, OR, NOT
  • LIKE, IN and BETWEEN
  • ORDER BY

If this is not sufficient for your use case, you may instead work with either Hibernate Search full-text queries or the native query technology of the NoSQL storage you are using.

To provide an example of what kind of queries would work:
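
A minimal sketch, assuming a hypothetical Hypothesis entity whose description and position properties are annotated with @Field and whose class is annotated with @Indexed. The strings below only combine the constructs listed above; whether every combination is accepted by a given release of the parser has not been verified here.

```java
import java.util.Arrays;
import java.util.List;

final class SupportedQueriesSketch {

    /** JP-QL strings that stay within the listed constructs: no joins, no aggregation. */
    static List<String> examples() {
        return Arrays.asList(
            // simple equality with a named parameter
            "select h from Hypothesis h where h.description = :desc",
            // range comparison combined with AND
            "select h from Hypothesis h where h.position >= :min and h.position < :max",
            // null check and ordering
            "select h from Hypothesis h where h.description is not null order by h.position",
            // BETWEEN, OR and LIKE
            "select h from Hypothesis h where h.position between 1 and 10 or h.description like 'tom%'",
            // IN and NOT
            "select h from Hypothesis h where h.position in (1, 2, 3) and not h.description = :desc"
        );
    }
}
```

Each string would be passed to the query creation method of the entity manager or session, and the named parameters (:desc, :min, :max) would be bound before execution.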


We actually did use Hibernate Search already in the previous example; specifically the annotations @Indexed and @Field are Hibernate Search specific. In this example the query was defined using a JP-QL string and then defining parameters; that’s useful if all you have is a JP-QL query, but it is limiting.

Hibernate Search maps the properties annotated with @Field to Lucene Documents, and manages the Lucene indexes so that you can then perform Lucene Queries.

In short, Apache Lucene is a full-text indexing and query engine with excellent query performance. Feature-wise, full-text means you can do much more than a simple equality match as we did in the previous example.

Let’s show another example, now creating a Lucene Query instead:


Assuming our database contains a Hypothesis instance having description "tomorrow we release", the query above will not find the entity because we disabled text analysis in the previous mapping.

If we enable text analysis (which is the default):


Now the entity would match a query on "tomorrow" as we’re unlocking text similarity queries!
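
The difference between the two mappings can be sketched without Lucene. The model below is deliberately simplified and uses hypothetical names: a field without analysis is indexed as one single term, while an analyzed field is lower-cased and split into words (real analyzers do considerably more, such as stemming or stop word removal).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AnalysisSketch {

    /** Without analysis, the whole value is indexed as one single term. */
    static Set<String> termsWithoutAnalysis(String value) {
        return Collections.singleton(value);
    }

    /** With a very simplified analyzer, the value is lower-cased and split on whitespace. */
    static Set<String> termsWithAnalysis(String value) {
        return new HashSet<>(Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    /** A term query matches only if the index holds exactly that term. */
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }
}
```

For the description "tomorrow we release", a query on "tomorrow" matches only the analyzed variant, because the variant without analysis holds the full sentence as its single term.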

Text similarity can be very powerful as it can be configured for specific languages or domain specific terminology; it can deal with typos and synonyms, and above all it can return results by relevance.

It is worth noting that the Lucene index is a vector space of term occurrence statistics, so extracting tags from text, computing the frequencies of strings and correlating this data is straightforward, which makes it very easy to build efficient data analysis applications.

For a full explanation of all its capabilities and configuration options, see the Hibernate Search reference documentation.

While the potential of Lucene queries is very high, they are not suited for all use cases. Let’s see some of the limitations of Lucene queries as our main query engine:

  • Lucene doesn’t support Joins. Any to-One relations can be mapped fine, and the Lucene community is making progress on other forms, but restrictions on OneToMany or ManyToMany can’t be implemented today.
  • Since we apply changes to the index at commit time, your updates won’t affect queries until you commit (we might improve on this).
  • While queries are extremely fast, write operations are not as fast (but we can make it scale).