Hibernate.org Community Documentation

Chapter 9. Spatial

9.1. Enable indexing of Spatial Coordinates
9.1.1. Indexing coordinates for Double Range Queries
9.1.2. Indexing coordinates in a Grid with Quad Trees
9.1.3. Implementing the Coordinates interface
9.2. Performing Spatial Queries
9.2.1. Returning distance to query point in the search results
9.3. Multiple Coordinate pairs
9.4. Insight: implementation details of Quad Tree indexing
9.4.1. At indexing level
9.4.2. At search level

With the Spatial extensions you can combine fulltext queries with restrictions based on distance from a point in space, filter results based on distances from coordinates, or sort results by such a distance criterion.

The spatial support of Hibernate Search has a few goals:

For example, you might search for that Italian place whose name is approximately "Il Ciociaro" and which is somewhere within 2 km of your office.

To be able to filter an @Indexed @Entity on a distance criterion you need to add the @Spatial annotation (org.hibernate.search.annotations.Spatial) and specify one or more sets of coordinates.

There are different techniques to index point coordinates. Hibernate Search Spatial offers a choice between two strategies: indexing coordinates for double range queries, and indexing coordinates in a grid with quad trees.

We will now describe both methods so you can make a suitable choice; of course you can pick different strategies for each set of coordinates. These strategies are selected by specifying spatialMode, an attribute of the @Spatial annotation.
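As a minimal sketch of such a mapping, the class below marks an entity with @Spatial and selects a strategy through spatialMode. So that the snippet compiles without Hibernate Search on the classpath, it declares stand-in versions of the annotations; in a real project you would import them from org.hibernate.search.annotations instead. The SpatialMode constants RANGE and GRID, the RANGE default and the Hotel fields are assumptions made for illustration and may differ in your version.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-ins for the real org.hibernate.search.annotations types (assumed shapes).
enum SpatialMode { RANGE, GRID }

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
@interface Spatial {
    String name() default "";                            // empty name = default coordinates
    SpatialMode spatialMode() default SpatialMode.RANGE; // assumed default strategy
}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface Latitude { }

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface Longitude { }

// In a real mapping this class would also carry @Entity and @Indexed.
@Spatial(spatialMode = SpatialMode.GRID)
class Hotel {
    @Latitude
    Double latitude;

    @Longitude
    Double longitude;

    Hotel(Double latitude, Double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }
}
```

Omitting the spatialMode attribute in this sketch falls back to the stand-in default, which is how an entity would opt for the other strategy.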

Instead of using the @Latitude and @Longitude annotations you can choose to implement the org.hibernate.search.spatial.Coordinates interface.

As we will see in Section 9.3, “Multiple Coordinate pairs”, a @Spatial @Indexed @Entity can have multiple @Spatial annotations; when the entity itself implements Coordinates, the implemented methods refer to the default Spatial name, that is, the default pair of coordinates.

An alternative is to use properties implementing the Coordinates interface; this way you can have multiple Spatial instances:

When using this form the @Spatial.name automatically defaults to the property name.
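Both forms can be sketched as follows. The Coordinates interface is declared locally as a stand-in, assumed to expose the same two getters as org.hibernate.search.spatial.Coordinates, so that the snippet compiles on its own. The class names GeoPoint, Restaurant and Traveller are hypothetical, and the Hibernate Search annotations appear only in comments.

```java
// Stand-in for org.hibernate.search.spatial.Coordinates (assumed to expose these two getters).
interface Coordinates {
    Double getLatitude();
    Double getLongitude();
}

// Hypothetical immutable value type that can be used as a coordinates property.
final class GeoPoint implements Coordinates {
    private final Double latitude;
    private final Double longitude;

    GeoPoint(Double latitude, Double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    @Override public Double getLatitude() { return latitude; }
    @Override public Double getLongitude() { return longitude; }
}

// Form 1: the entity itself implements Coordinates.
// A real mapping would annotate the class with @Spatial, @Indexed and @Entity.
class Restaurant implements Coordinates {
    private final Double latitude;
    private final Double longitude;

    Restaurant(Double latitude, Double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    @Override public Double getLatitude() { return latitude; }
    @Override public Double getLongitude() { return longitude; }
}

// Form 2: properties implementing Coordinates.
// A real mapping would put @Spatial on each getter; the spatial name
// would then default to the property name ("home", "work").
class Traveller {
    private final GeoPoint home;
    private final GeoPoint work;

    Traveller(GeoPoint home, GeoPoint work) {
        this.home = home;
        this.work = work;
    }

    Coordinates getHome() { return home; }
    Coordinates getWork() { return work; }
}
```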

The Hibernate Search DSL has been extended to support the spatial feature. You can build a query to search around a pair of coordinates (latitude,longitude) or around a bean implementing the Coordinates interface.

As with any fulltext queries, also for Spatial queries you:

A fully working example can be found in the source code, in the testsuite. See SpatialIndexingTest.testSpatialAnnotationOnClassLevel() and the Hotel class.

As an alternative to passing separate latitude and longitude values, you can also pass an object implementing the Coordinates interface:

To get the distance to the center of the search returned with the results you just need to project it:

  • Use FullTextQuery.setProjection with FullTextQuery.SPATIAL_DISTANCE as one of the projected fields.

  • Call FullTextQuery.setSpatialParameters with the latitude, longitude and the name of the spatial field used to build the spatial query. Note that using coordinates different from the center used for the query will produce unexpected results.

Distance projection and null values

Using distance projection on entities that are not @Spatial enabled and/or with a non-spatial Query will have unexpected results, as entities that are not spatially indexed and/or have null values for latitude or longitude will be considered to be at (0,0)/(lat,0)/(0,long).

Using distance projection with a spatial query on spatially indexed entities that may have null values for latitude and/or longitude is safe, as they will not be found by the spatial query and will not have a distance calculated.

To sort the results by distance to the center of the search you will have to build a Sort object using a DistanceSortField:

The DistanceSortField must be constructed using the same coordinates and the same spatial field used to build the spatial query; otherwise the sorting will use a different center than the query. This repetition is needed to allow you to define Queries with any tool.

Sorting and null values

Using distance sort on entities that are not @Spatial enabled and/or with a non-spatial Query will also have unexpected results, as entities that are not spatially indexed and/or have null values for latitude or longitude will be considered to be at (0,0)/(lat,0)/(0,long).

Using distance sort with a spatial query on spatially indexed entities that may have null values for latitude and/or longitude is safe, as they will not be found by the spatial query and so will not be sorted.

You can associate multiple pairs of coordinates with the same entity, as long as each pair is uniquely identified by a different name. This is achieved by stacking multiple @Spatial annotations in a @Spatials annotation, and specifying the name attribute on each @Spatial annotation.

In Example 9.5, “Search for an Hotel by distance” we used onDefaultCoordinates(), which points to the coordinates defined by a @Spatial annotation whose name attribute was not specified.

To target an alternative pair of coordinates at query time, we need to specify the pair by name using onCoordinates(String) instead of onDefaultCoordinates():

This section is meant to provide technical insight into quad-tree (grid) indexing: how coordinates are mapped to the index and how queries are implemented.

When Hibernate Search indexes the entity annotated with @Spatial, it instantiates a SpatialFieldBridge to transform the latitude and longitude fields accessed via the Coordinates interface to the multiple index fields stored in the Lucene index.

Principle of the spatial index: the spatial index used in Hibernate Search is a QuadTree (http://en.wikipedia.org/wiki/Quadtree).

To perform computations in a flat coordinate system, the latitude and longitude field values are projected with a sinusoidal projection (http://en.wikipedia.org/wiki/Sinusoidal_projection). The space of origin values is:

[-90 -> +90],]-180 -> 180]

for latitude,longitude coordinates and projected space is:

]-pi -> +pi],[-pi/2 -> +pi/2]

for cartesian x,y coordinates (beware of fields order inversion: x is longitude and y is latitude).
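The projection itself can be sketched in a few lines. This is the textbook sinusoidal projection on a unit sphere, not code taken from Hibernate Search; the class and method names are hypothetical, and the inputs are assumed to be already normalized to the origin space given above.

```java
final class SinusoidalProjection {
    private SinusoidalProjection() { }

    /** Projected x: the longitude in radians, scaled by the cosine of the latitude. */
    static double projectX(double latitudeDeg, double longitudeDeg) {
        return Math.toRadians(longitudeDeg) * Math.cos(Math.toRadians(latitudeDeg));
    }

    /** Projected y: simply the latitude in radians. */
    static double projectY(double latitudeDeg) {
        return Math.toRadians(latitudeDeg);
    }
}
```

On the equator x covers the full ]-pi, +pi] range, while towards the poles the cosine factor shrinks x to 0, which is why boxes of equal size in projected space do not cover equal shapes on Earth.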

The index is divided into n levels labeled from 0 to n-1.

At level 0 the projected space is the whole Earth. At level 1 the projected space is divided into 4 rectangles (called boxes as in bounding box):

[-pi,-pi/2]->[0,0], [-pi,0]->[0,+pi/2], [0,-pi/2]->[+pi,0] and [0,0]->[+pi,+pi/2]

At level n+1 each box of level n is divided into 4 new boxes, and so on. The number of boxes at level n is 4^n.

Each box is given an id, in this format: [Box index on the X axis]|[Box index on the Y axis]. To calculate the index of a box on an axis we divide the axis range into 2^n slots and find the slot the box belongs to. At level n the indexes on an axis range from -(2^n)/2 to (2^n)/2. For instance, the 5th level has 4^5 = 1024 boxes with 32 indexes on each axis (32x32 is 1024), and the box with id "0|8" covers the [0,8/32*pi]->[1/16*pi,9/32*pi] rectangle in projected space.
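The id computation can be sketched as below for levels 1 and above. The sketch assumes that the slot index is obtained by flooring the coordinate divided by the slot width, which simplifies the handling of points lying exactly on a box edge; the class and method names are hypothetical and not part of the Hibernate Search API.

```java
final class QuadTreeBoxes {
    static final double X_RANGE = 2 * Math.PI; // projected x spans ]-pi, +pi]
    static final double Y_RANGE = Math.PI;     // projected y spans [-pi/2, +pi/2]

    private QuadTreeBoxes() { }

    /** Slot index of a coordinate when its axis range is cut into 2^level slots (assumed: floor). */
    static int cellIndex(double coordinate, double range, int level) {
        return (int) Math.floor((1L << level) * coordinate / range);
    }

    /** Box id in the "[x index]|[y index]" format for a projected point. */
    static String boxId(double x, double y, int level) {
        return cellIndex(x, X_RANGE, level) + "|" + cellIndex(y, Y_RANGE, level);
    }

    /** Number of boxes at a level: 4^level. */
    static long boxCount(int level) {
        return 1L << (2 * level);
    }
}
```

For instance, QuadTreeBoxes.boxId(0.01, 0.8, 5) yields "0|8", and QuadTreeBoxes.boxCount(5) yields 1024.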

Beware! The boxes are rectangles in projected space but the related area on Earth is not a rectangle!

Now that we have all these boxes at all these levels, we will index points "into" them.

For a point (lat,long) we calculate its projection (x,y) and then we calculate for each level of the spatial index, the ids of the boxes it belongs to.

At each level the point is in one and only one box. For points on the edges, the boxes are considered exclusive on the left side and inclusive on the right, i.e. ]start,end] (the points are normalized before projection to [-90,+90],]-180,+180]).

We store in the Lucene document corresponding to the entity being indexed one field for each level of the quad tree. The field is named [spatial index fields name]_HSSI_[n], where [spatial index fields name] is given either by the parameter of the class-level annotation or derived from the name of the spatially annotated method of the entity, HSSI stands for Hibernate Search Spatial Index, and n is the level of the quad tree.

We also store the latitude and longitude as a Numeric field under [spatial index fields name]_HSSI_Latitude and [spatial index fields name]_HSSI_Longitude fields. They will be used to filter precisely results by distance in the second stage of the search.
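The naming scheme described above can be written down as a small helper. It only builds the field names from the patterns given in the text; the class and method names are hypothetical, and the number of levels is passed in rather than taken from any Hibernate Search setting.

```java
import java.util.ArrayList;
import java.util.List;

final class SpatialFieldNames {
    private SpatialFieldNames() { }

    /** Name of the field holding the box id for one quad tree level. */
    static String levelField(String spatialName, int level) {
        return spatialName + "_HSSI_" + level;
    }

    /** Name of the numeric field holding the latitude. */
    static String latitudeField(String spatialName) {
        return spatialName + "_HSSI_Latitude";
    }

    /** Name of the numeric field holding the longitude. */
    static String longitudeField(String spatialName) {
        return spatialName + "_HSSI_Longitude";
    }

    /** All field names one document carries for levels 0 to levels-1, plus the two numeric fields. */
    static List<String> allFields(String spatialName, int levels) {
        List<String> names = new ArrayList<>();
        for (int level = 0; level < levels; level++) {
            names.add(levelField(spatialName, level));
        }
        names.add(latitudeField(spatialName));
        names.add(longitudeField(spatialName));
        return names;
    }
}
```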

Now that we have all these fields, what are they used for?

When you ask for a spatial search by providing a search disc (center + radius), we calculate the ids of the boxes that cover the search disc in the projected space, fetch all the documents that belong to these boxes (thus narrowing the number of documents for which we have to calculate the distance to the center), and then filter this subset with a real distance calculation. This is called two-level spatial filtering.
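The two-level idea can be illustrated with plain Java. Note that this sketch is not the Hibernate Search implementation: it substitutes a simple latitude/longitude degree grid for the projected quad tree, uses the haversine formula with an assumed mean Earth radius for the exact distance, and ignores the poles and the 180° meridian. All names and the cell size are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TwoLevelFilter {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius (assumption)
    static final double CELL_DEG = 0.5;           // hypothetical cell size in degrees

    private TwoLevelFilter() { }

    /** A named point; stands in for an indexed document. */
    static final class Place {
        final String name;
        final double lat;
        final double lon;

        Place(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Id of the grid cell containing a point, in "x|y" form. */
    static String cellId(double lat, double lon) {
        return (long) Math.floor(lon / CELL_DEG) + "|" + (long) Math.floor(lat / CELL_DEG);
    }

    /** Great-circle distance in km (haversine formula). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Level 1: ids of all cells overlapping the bounding box of the search disc. */
    static Set<String> coveringCells(double lat, double lon, double radiusKm) {
        double r = radiusKm / EARTH_RADIUS_KM; // angular radius in radians
        double dLat = Math.toDegrees(r);
        double dLon = Math.toDegrees(Math.asin(Math.sin(r) / Math.cos(Math.toRadians(lat))));
        long minX = (long) Math.floor((lon - dLon) / CELL_DEG);
        long maxX = (long) Math.floor((lon + dLon) / CELL_DEG);
        long minY = (long) Math.floor((lat - dLat) / CELL_DEG);
        long maxY = (long) Math.floor((lat + dLat) / CELL_DEG);
        Set<String> cells = new HashSet<>();
        for (long x = minX; x <= maxX; x++) {
            for (long y = minY; y <= maxY; y++) {
                cells.add(x + "|" + y);
            }
        }
        return cells;
    }

    /** Level 2 applied to the level 1 candidates: keep only places truly within the radius. */
    static List<String> search(List<Place> places, double lat, double lon, double radiusKm) {
        Set<String> cells = coveringCells(lat, lon, radiusKm);
        List<String> hits = new ArrayList<>();
        for (Place p : places) {
            if (!cells.contains(cellId(p.lat, p.lon))) {
                continue; // cheap rejection: not in any covering cell
            }
            if (distanceKm(lat, lon, p.lat, p.lon) <= radiusKm) {
                hits.add(p.name); // exact check passed
            }
        }
        return hits;
    }
}
```

The first level may let through points that lie in a covering cell but outside the disc; the second level removes them. The exact distance is therefore computed only for the candidates that survive the cell lookup.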