Chapter 3. Graph Model

One of the central concepts within ModeShape is that of its graph model. Information is structured into a hierarchy of nodes with properties, where nodes in the hierarchy are identified by their path (and/or identifier properties). Properties are identified by a name that incorporates a namespace and local name, and contain one or more property values consisting of normal Java strings, names, paths, URIs, booleans, longs, doubles, decimals, binary content, dates, UUIDs, references to other nodes, or any other serializable object.

This graph model is used throughout ModeShape: it forms the basis for the connector framework, it is used by the sequencing framework for the generated output, and it is what the JCR implementation uses internally to access and operate on the repository content.

Therefore, this chapter provides essential information that will be essential to really understanding how the connectors, sequencers, and other ModeShape features work.

3.1. Names

ModeShape uses names to identify quite a few different types of objects. As we'll soon see, each property of a node is given by a name, and each segment in a path is comprised of a name. Therefore, names are a very important concept.

ModeShape names consist of a local part that is qualified with a namespace. The local part can consist of any character, and the namespace is identified by a URI. Namespaces were introduced in the previous chapter and are managed by the ExecutionContext's namespace registry. Namespaces help reduce the risk of clashes in names that have an equivalent same local part.

All names are immutable, which means that once a Name object is created, it will never change. This characteristic makes it much easier to write thread-safe code - the objects never change and therefore require no locks or synchronization to guarantee atomic reads. This is a technique that is more and more often found in newer languages and frameworks that simplify concurrent operations.

Name is also a interface rather than a concrete class:

@Immutable
public interface Name extends Comparable<Name>, Serializable, Readable {

    /**
     * Get the local name part of this qualified name.
     * @return the local name; never null
     */
    String getLocalName();

    /**
     * Get the URI for the namespace used in this qualified name.
     * @return the URI; never null but possibly empty
     */
    String getNamespaceUri();
}

This means that you need to use a factory to create Name instances.

The use of a factory may seem like a disadvantage and unnecessary complexity, but there actually are several benefits. First, it hides the concrete implementations, which is very appealing if an optimized implementation can be chosen for particular situations. It also simplifies the usage, since Name only has a few methods. Third, it allows the factory to cache or pool instances where appropriate to help conserve memory. Finally, the very same factory actually serves as a conversion mechanism from other forms. We'll actually see more of this later in this chapter, when we talk about other kinds of property values.

The factory for creating Name objects is called NameFactory and is available within the ExecutionContext, via the getValueFactories() method.

We'll see how names are used later on, but one more point to make: Name is both serializable and comparable, and all implementations should support equals(...) and hashCode() so that Name can be used as a key in a hash-based map. Name also extends the Readable interface, which we'll learn more about later in this chapter.

3.2. Paths

Another important concept in ModeShape's graph model is that of a path, which provides a way of locating a node within a hierarchy. ModeShape's Path object is an immutable ordered sequence of Path.Segment objects. A small portion of the interface is shown here:

@Immutable
public interface Path extends Comparable<Path>, Iterable<Path.Segment>, Serializable, Readable {

    /**
     * Return the number of segments in this path.
     * @return the number of path segments
     */
    public int size();

    /**
     * Return whether this path represents the root path.
     * @return true if this path is the root path, or false otherwise
     */
    public boolean isRoot();

    /**
     * {@inheritDoc}
     */
    public Iterator<Path.Segment> iterator();

    /**
     * Obtain a copy of the segments in this path. None of the segments are encoded.
     * @return the array of segments as a copy
     */
    public Path.Segment[] getSegmentsArray();

    /**
     * Get an unmodifiable list of the path segments.
     * @return the unmodifiable list of path segments; never null
     */
    public List<Path.Segment> getSegmentsList();
    /**
     * Get the last segment in this path.
     * @return the last segment, or null if the path is empty
     */
    public Path.Segment getLastSegment();

    /**
     * Get the segment at the supplied index.
     * @param index the index
     * @return the segment
     * @throws IndexOutOfBoundsException if the index is out of bounds
     */
    public Path.Segment getSegment( int index );

    /**
     * Return an iterator that walks the paths from the root path down to this path. This method 
     * always returns at least one path (the root returns an iterator containing itself).
     * @return the path iterator; never null
     */
    public Iterator<Path> pathsFromRoot();

    /**
     * Return a new path consisting of the segments starting at beginIndex index (inclusive). 
     * This is equivalent to calling path.subpath(beginIndex,path.size()-1).
     * @param beginIndex the beginning index, inclusive.
     * @return the specified subpath
     * @exception IndexOutOfBoundsException if the beginIndex is negative or larger 
     *            than the length of this Path object
     */
    public Path subpath( int beginIndex );

    /**
     * Return a new path consisting of the segments between the beginIndex index (inclusive)
     * and the endIndex index (exclusive).
     * @param beginIndex the beginning index, inclusive.
     * @param endIndex the ending index, exclusive.
     * @return the specified subpath
     * @exception IndexOutOfBoundsException if the beginIndex is negative, or 
     *            endIndex is larger than the length of this Path 
     *            object, or beginIndex is larger than endIndex.
     */
    public Path subpath( int beginIndex, int endIndex );

    ...
}

There are actually quite a few methods (not shown above) for obtaining related paths: the path of the parent, the path of an ancestor, resolving a path relative to this path, normalizing a path (by removing "." and ".." segments), finding the lowest common ancestor shared with another path, etc. There are also a number of methods that compare the path with others, including determining whether a path is above, equal to, or below this path.

Each Path.Segment is an immutable pair of a Name and same-name-sibling (SNS) index. When two sibling nodes have the same name, then the first sibling will have SNS index of "1" and the second will be given a SNS index of "2". (This mirrors the same-name-sibling index behavior of JCR paths.)

@Immutable
public static interface Path.Segment extends Cloneable, Comparable<Path.Segment>, Serializable, Readable 
{

    /**
     * Get the name component of this segment.
     * @return the segment's name
     */
    public Name getName();

    /**
     * Get the index for this segment, which will be 1 by default.
     * @return the index
     */
    public int getIndex();

    /**
     * Return whether this segment has an index that is not "1"
     * @return true if this segment has an index, or false otherwise.
     */
    public boolean hasIndex();

    /**
     * Return whether this segment is a self-reference (or ".").
     * @return true if the segment is a self-reference, or false otherwise.
     */
    public boolean isSelfReference();

    /**
     * Return whether this segment is a reference to a parent (or "..")
     * @return true if the segment is a parent-reference, or false otherwise.
     */
    public boolean isParentReference();
}

Like Name, the only way to create a Path or a Path.Segment is to use the PathFactory, which is available within the ExecutionContext via the getValueFactories() method.

3.3. Properties

The ModeShape graph model allows nodes to hold multiple properties, where each property is identified by a unique Name and may have one or more values. Like many of the other classes used in the graph model, Property is an immutable object that, once constructed, can never be changed and therefore provides a consistent snapshot of the state of a property as it existed at the time it was read.

ModeShape properties can hold a wide range of value objects, including normal Java strings, names, paths, URIs, booleans, longs, doubles, decimals, binary content, dates, UUIDs, references to other nodes, or any other serializable object. All but three of these are the standard Java classes: dates are represented by an immutable DateTime class; binary content is represented by an immutable Binary interface patterned after the interface of the same name in JSR-283; and Reference is an immutable interface patterned after the corresponding interface is JSR-170 and JSR-283.

The Property interface defines methods for obtaining the name and property values:

@Immutable
public interface Property extends Iterable<Object>, Comparable<Property>, Readable {

    /**
     * Get the name of the property.
     * 
     * @return the property name; never null
     */
    Name getName();

    /**
     * Get the number of actual values in this property.
     * @return the number of actual values in this property; always non-negative
     */
    int size();

    /**
     * Determine whether the property currently has multiple values.
     * @return true if the property has multiple values, or false otherwise.
     */
    boolean isMultiple();

    /**
     * Determine whether the property currently has a single value.
     * @return true if the property has a single value, or false otherwise.
     */
    boolean isSingle();

    /**
     * Determine whether this property has no actual values. This method may return true 
     * regardless of whether the property has a single value or multiple values.
     * This method is a convenience method that is equivalent to size() == 0.
     * @return true if this property has no values, or false otherwise
     */
    boolean isEmpty();

    /**
     * Obtain the property's first value in its natural form. This is equivalent to calling
     * isEmpty() ? null : iterator().next()
     * @return the first value, or null if the property is {@link #isEmpty() empty}
     */
    Object getFirstValue();

    /**
     * Obtain the property's values in their natural form. This is equivalent to calling iterator().
     * A valid iterator is returned if the property has single valued or multi-valued.
     * The resulting iterator is immutable, and all property values are immutable.
     * @return an iterator over the values; never null
     */
    Iterator<?> getValues();

    /**
     * Obtain the property's values as an array of objects in their natural form.
     * A valid iterator is returned if the property has single valued or multi-valued, or a
     * null value is returned if the property is {@link #isEmpty() empty}.
     * The resulting array is a copy, guaranteeing immutability for the property.
     * @return the array of values
     */
    Object[] getValuesAsArray();
}

Creating Property instances is done by using the PropertyFactory object owned by the ExecutionContext. This factory defines methods for creating properties with a Name and various representation of values, including variable-length arguments, arrays, Iterator, and Iterable.

When it comes to using the property values, ModeShape takes a non-traditional approach. Many other graph models (including JCR) mark each property with a data type and then require all property values adhere to this data type. When the property values are obtained, they are guaranteed to be of the correct type. However, many times the property's data type may not match the data type expected by the caller, and so a conversion may be required and thus has to be coded.

The ModeShape graph model uses a different tact. Because callers almost always have to convert the values to the types they can handle, ModeShape skips the steps of associating the Property with a data type and ensuring the values match. Instead, ModeShape simply provides a very easy mechanism to convert the property values to the type desired by the caller. In fact, the conversion mechanism is exactly the same as the factories that create the values in the first place.

3.4. Values and Value Factories

ModeShape properties can hold a variety of value object types: strings, names, paths, URIs, booleans, longs, doubles, decimals, binary content, dates, UUIDs, references to other nodes, or any other serializable object. To assist in the creation of these values and conversion into other types, ModeShape defines a ValueFactory interface. This interface is parameterized with the type of value that is being created, but defines methods for creating those values from all of the other known value types:

public interface ValueFactory<T> {

    /**
     * Get the PropertyType of values created by this factory.
     * @return the value type; never null
     */
    PropertyType getPropertyType();

		/*
		 * Methods to create a value by converting from another value type.
		 * If the supplied value is the same type as returned by this factory,
		 * these methods simply return the supplied value.
		 * All of these methods throw a ValueFormatException if the supplied value
		 * could not be converted to this type.
		 */
    T create( String value ) throws ValueFormatException;
    T create( String value, TextDecoder decoder ) throws ValueFormatException;
    T create( int value ) throws ValueFormatException;
    T create( long value ) throws ValueFormatException;
    T create( boolean value ) throws ValueFormatException;
    T create( float value ) throws ValueFormatException;
    T create( double value ) throws ValueFormatException;
    T create( BigDecimal value ) throws ValueFormatException;
    T create( Calendar value ) throws ValueFormatException;
    T create( Date value ) throws ValueFormatException;
    T create( DateTime value ) throws ValueFormatException;
    T create( Name value ) throws ValueFormatException;
    T create( Path value ) throws ValueFormatException;
    T create( Reference value ) throws ValueFormatException;
    T create( URI value ) throws ValueFormatException;
    T create( UUID value ) throws ValueFormatException;
    T create( byte[] value ) throws ValueFormatException;
    T create( Binary value ) throws ValueFormatException, IoException;
    T create( InputStream stream, long approximateLength ) throws ValueFormatException, IoException;
    T create( Reader reader, long approximateLength ) throws ValueFormatException, IoException;
    T create( Object value ) throws ValueFormatException, IoException;

    /*
     * Methods to create an array of values by converting from another array of values.
     * If the supplied values are the same type as returned by this factory,
     * these methods simply return the supplied array.
     * All of these methods throw a ValueFormatException if the supplied values
     * could not be converted to this type.
		 */
    T[] create( String[] values ) throws ValueFormatException;
    T[] create( String[] values, TextDecoder decoder ) throws ValueFormatException;
    T[] create( int[] values ) throws ValueFormatException;
    T[] create( long[] values ) throws ValueFormatException;
    T[] create( boolean[] values ) throws ValueFormatException;
    T[] create( float[] values ) throws ValueFormatException;
    T[] create( double[] values ) throws ValueFormatException;
    T[] create( BigDecimal[] values ) throws ValueFormatException;
    T[] create( Calendar[] values ) throws ValueFormatException;
    T[] create( Date[] values ) throws ValueFormatException;
    T[] create( DateTime[] values ) throws ValueFormatException;
    T[] create( Name[] values ) throws ValueFormatException;
    T[] create( Path[] values ) throws ValueFormatException;
    T[] create( Reference[] values ) throws ValueFormatException;
    T[] create( URI[] values ) throws ValueFormatException;
    T[] create( UUID[] values ) throws ValueFormatException;
    T[] create( byte[][] values ) throws ValueFormatException;
    T[] create( Binary[] values ) throws ValueFormatException, IoException;
    T[] create( Object[] values ) throws ValueFormatException, IoException;

    /**
     * Create an iterator over the values (of an unknown type). The factory converts any 
     * values as required.  This is useful when wanting to iterate over the values of a property,
     * where the resulting iterator exposes the desired type.
     * @param values the values
     * @return the iterator of type T over the values, or null if the supplied parameter is null
     * @throws ValueFormatException if the conversion from an iterator of objects could not be performed
     * @throws IoException If an unexpected problem occurs during the conversion.
     */
    Iterator<T> create( Iterator<?> values ) throws ValueFormatException, IoException;
    Iterable<T> create( Iterable<?> valueIterable ) throws ValueFormatException, IoException;
}

This makes it very easy to convert one or more values (of any type, including mixtures) into corresponding value(s) that are of the desired type. For example, converting the first value of a property (regardless of type) to a String is simple:

ValueFactory<String> stringFactory = ...
Property property = ...
String value = stringFactory.create( property.getFirstValue() );

Likewise, iterating over the values in a property and converting them is just as easy:

ValueFactory<String> stringFactory = ...
Property property = ...
for ( String value : stringFactory.create(property) ) {
    // do something with the values
}

What we've glossed over so far, however, is how to obtain the correct ValueFactory for the desired type. If you remember back in the previous chapter, ExecutionContext has a getValueFactories() method that return a ValueFactories interface:

This interface exposes a ValueFactory for each of the types, and even has methods to obtain a ValueFactory given the PropertyType enumeration. So, the previous examples could be expanded a bit:

ValueFactory<String> stringFactory = context.getValueFactories().getStringFactory();
Property property = ...
String value = stringFactory.create( property.getFirstValue() );

and

ValueFactory<String> stringFactory = context.getValueFactories().getStringFactory();
Property property = ...
for ( String value : stringFactory.create(property) ) {
    // do something with the values
}

You might have noticed that several of the ValueFactories methods return subinterfaces of ValueFactory. These add type-specific methods that are more commonly needed in certain cases. For example, here is the NameFactory interface:

public interface NameFactory extends ValueFactory<Name> {

    Name create( String namespaceUri, String localName );
    Name create( String namespaceUri, String localName, TextDecoder decoder );

    NamespaceRegistry getNamespaceRegistry();
}

and here is the DateTimeFactory interface, which adds methods for creating DateTime values for the current time as well as for specific instants in time:

public interface DateTimeFactory extends ValueFactory<DateTime> {

    /**
     * Create a date-time instance for the current time in the local time zone.
     */
    DateTime create();

    /**
     * Create a date-time instance for the current time in UTC.
     */
    DateTime createUtc();

    DateTime create( DateTime original, long offsetInMillis );
    DateTime create( int year, int monthOfYear, int dayOfMonth,
                     int hourOfDay, int minuteOfHour, int secondOfMinute, int millisecondsOfSecond );
    DateTime create( int year, int monthOfYear, int dayOfMonth,
                     int hourOfDay, int minuteOfHour, int secondOfMinute, int millisecondsOfSecond,
                     int timeZoneOffsetHours );
    DateTime create(	int year, int monthOfYear, int dayOfMonth,
                     int hourOfDay, int minuteOfHour, int secondOfMinute, int millisecondsOfSecond,
                     int timeZoneOffsetHours, String timeZoneId );
}

The PathFactory interface defines methods for creating relative and absolute Path objects using combinations of other Path objects and Names and Path.Segments, and introduces methods for creating Path.Segment objects:

public interface PathFactory extends ValueFactory<Path> {

    Path createRootPath();
    Path createAbsolutePath( Name... segmentNames );
    Path createAbsolutePath( Path.Segment... segments );
    Path createAbsolutePath( Iterable<Path.Segment> segments );

    Path createRelativePath();
    Path createRelativePath( Name... segmentNames );
    Path createRelativePath( Path.Segment... segments );
    Path createRelativePath( Iterable<Path.Segment> segments );

    Path create( Path parentPath, Path childPath );
    Path create( Path parentPath, Name segmentName, int index );
    Path create( Path parentPath, String segmentName, int index );
    Path create( Path parentPath, Name... segmentNames );
    Path create( Path parentPath, Path.Segment... segments );
    Path create( Path parentPath, Iterable<Path.Segment> segments );
    Path create( Path parentPath, String subpath );

    Path.Segment createSegment( String segmentName );
    Path.Segment createSegment( String segmentName, TextDecoder decoder );
    Path.Segment createSegment( String segmentName, int index );
    Path.Segment createSegment( Name segmentName );
    Path.Segment createSegment( Name segmentName, int index );
}

And finally, the BinaryFactory defines methods for creating Binary objects from a variety of binary formats, as well as a method that looks for a cached Binary instance given the supplied secure hash:

public interface BinaryFactory extends ValueFactory<Binary> {

    /**
     * Create a value from the binary content given by the supplied input, the approximate length, 
     * and the SHA-1 secure hash of the content. If the secure hash is null, then a secure hash is
     * computed from the content. If the secure hash is not null, it is assumed to be the hash for 
     * the content and may not be checked.
     */
    Binary create( InputStream stream, long approximateLength, byte[] secureHash ) 
                          throws ValueFormatException, IoException;
    Binary create( Reader reader, long approximateLength, byte[] secureHash ) 
                          throws ValueFormatException, IoException;

    /**
     * Create a binary value from the given file.
     */
    Binary create( File file ) throws ValueFormatException, IoException;

    /**
     * Find an existing binary value given the supplied secure hash. If no such binary value exists, 
     * null is returned. This method can be used when the caller knows the secure hash (e.g., from 
     * a previously-held Binary object), and would like to reuse an existing binary value 
     * (if possible) rather than recreate the binary value by processing the stream contents. This is
     * especially true when the size of the binary is quite large.
     * 
     * @param secureHash the secure hash of the binary content, which was probably obtained from a
     *        previously-held Binary object; a null or empty value is allowed, but will always 
     *        result in returning null
     * @return the existing Binary value that has the same secure hash, or null if there is no 
     *        such value available at this time
     */
    Binary find( byte[] secureHash );
}

ModeShape provides efficient implementations of all of these interfaces: the ValueFactory interfaces and subinterfaces; the Path, Path.Segment, Name, Binary, DateTime, and Reference interfaces; and the ValueFactories interface returned by the ExecutionContext. In fact, some of these interfaces have multiple implementations that are optimized for specific but frequently-occurring conditions.

3.5. Readable, TextEncoder, and TextDecoder

As shown above, the Name, Path.Segment, Path, and Property interfaces all extend the Readable interface, which defines a number of getString(...) methods that can produce a (readable) string representation of of that object. Recall that all of these objects contain names with namespace URIs and local names (consisting of any characters), and so obtaining a readable string representation will require converting the URIs to prefixes, escaping certain characters in the local names, and formatting the prefix and escaped local name appropriately. The different getString(...) methods of the Readable interface accept various combinations of NamespaceRegistry and TextEncoder parameters:

@Immutable
public interface Readable {

    /**
     * Get the string form of the object. A default encoder is used to encode characters.
     * @return the encoded string
     */
    public String getString();

    /**
     * Get the encoded string form of the object, using the supplied encoder to encode characters.
     * @param encoder the encoder to use, or null if the default encoder should be used
     * @return the encoded string
     */
    public String getString( TextEncoder encoder );

    /**
     * Get the string form of the object, using the supplied namespace registry to convert any 
     * namespace URIs to prefixes. A default encoder is used to encode characters.
     * @param namespaceRegistry the namespace registry that should be used to obtain the prefix
     *        for any namespace URIs
     * @return the encoded string
     * @throws IllegalArgumentException if the namespace registry is null
     */
    public String getString( NamespaceRegistry namespaceRegistry );

    /**
     * Get the encoded string form of the object, using the supplied namespace registry to convert 
     * the any namespace URIs to prefixes.
     * @param namespaceRegistry the namespace registry that should be used to obtain the prefix for 
     *        the namespace URIs
     * @param encoder the encoder to use, or null if the default encoder should be used
     * @return the encoded string
     * @throws IllegalArgumentException if the namespace registry is null
     */
    public String getString( NamespaceRegistry namespaceRegistry,
                             TextEncoder encoder );

    /**
     * Get the encoded string form of the object, using the supplied namespace registry to convert 
     * the names' namespace URIs to prefixes and the supplied encoder to encode characters, and using 
     * the second delimiter to encode (or convert) the delimiter used between the namespace prefix 
     * and the local part of any names.
     * @param namespaceRegistry the namespace registry that should be used to obtain the prefix 
     *        for the namespace URIs in the names
     * @param encoder the encoder to use for encoding the local part and namespace prefix of any names, 
     *        or null if the default encoder should be used
     * @param delimiterEncoder the encoder to use for encoding the delimiter between the local part 
     *        and namespace prefix of any names, or null if the standard delimiter should be used
     * @return the encoded string
     */
    public String getString( NamespaceRegistry namespaceRegistry,
                             TextEncoder encoder, TextEncoder delimiterEncoder );
}

We've seen the NamespaceRegistry in the previous chapter, but we've haven't yet talked about the TextEncoder interface. A TextEncoder merely does what you'd expect: it encodes the characters in a string using some implementation-specific algorithm. ModeShape provides a number of TextEncoder implementations, including:

The Jsr283Encoder escapes characters that are not allowed in JCR names, per the JSR-283 specification. Specifically, these are the '*', '/', ':', '[', ']', and '|' characters, which are escaped by replacing them with the Unicode characters U+F02A, U+F02F, U+F03A, U+F05B, U+F05D, and U+F07C, respectively.
The NoOpEncoder does no conversion.
The UrlEncoder converts text to be used within the different parts of a URL, as defined by Section 2.3 of RFC 2396. Note that this class does not encode a complete URL (since java.net.URLEncoder and java.net.URLDecoder should be used for such purposes).
The XmlNameEncoder converts any UTF-16 unicode character that is not a valid XML name character according to the World Wide Web Consortium (W3C) Extensible Markup Language (XML) 1.0 (Fourth Edition) Recommendation, escaping such characters as _xHHHH_, where HHHH stands for the four-digit hexadecimal UTF-16 unicode value for the character in the most significant bit first order. For example, the name "Customer_ID" is encoded as "Customer_x0020_ID".
The XmlValueEncoder escapes characters that are not allowed in XML values. Specifically, these are the '&', '<', '>', '"', and ''', which are all escaped to "&", '<', '>', '"', and '''.
The FileNameEncoder escapes characters that are not allowed in file names on Linux, OS X, or Windows XP. Unsafe characters are escaped as described in the UrlEncoder.
The SecureHashTextEncoder performs a secure hash of the input text and returns that hash as the encoded text. This encoder can be configured to use different secure hash algorithms and to return a fixed number of characters from the hash.

All of these classes also implement the TextDecoder interface, which defines a method that decodes an encoded string using the opposite transformation.

Of course, you can provide alternative implementations, and supply them to the appropriate getString(...) methods as required.

3.6. Locations

In addition to Path objects, nodes can be identified by one or more identification properties. These really are just Property instances with names that have a special meaning (usually to connectors). ModeShape also defines a Location class that encapsulates:

the Path to the node; or
one or more identification properties that are likely source-specific and that are represented with Property objects; or
a combination of both.

So, when a client knows the path and/or the identification properties, they can create a Location object and then use that to identify the node. Location is a class that can be instantiated through factory methods on the class:

public abstract class Location implements Iterable<Property>, Comparable<Location> {

    public static Location create( Path path ) { ... }
    public static Location create( UUID uuid ) { ... }
    public static Location create( Path path, UUID uuid ) { ... }
    public static Location create( Path path, Property idProperty ) { ... }
    public static Location create( Path path, Property firstIdProperty, 
                                     Property... remainingIdProperties ) { ... }
    public static Location create( Path path, Iterable<Property idProperties ) { ... }
    public static Location create( Property idProperty ) { ... }
    public static Location create( Property firstIdProperty, 
                                     Property... remainingIdProperties ) { ... }
    public static Location create( Iterable<Property> idProperties ) { ... }
    public static Location create( List<Property> idProperties ) { ... }
    ...
}

Like many of the other classes and interfaces, Location is immutable and cannot be changed once created. However, there are methods on the class to create a copy of the Location object with a different Path, a different UUID, or different identification properties:

public abstract class Location implements Iterable<Property>, Comparable<Location> {
    ...
    public Location with( Property newIdProperty );
    public Location with( Path newPath );
    public Location with( UUID uuid );
    ...
}

One more thing about locations: we'll see later in the next chapter how they are used to make requests to the connectors. When creating the requests, clients usually have an incomplete location (e.g., a path but no identification properties). When processing the requests, connectors provide an actual location that contains the path and all identification properties. If actual Location objects are then reused in subsequent requests by the client, the connectors will have the benefit of having both the path and identification properties and may be able to more efficiently locate the identified node.

3.7. Graph API

ModeShape's Graph API was designed as a lightweight public API for working with graph information. The Graph class is the primary class in API, and each instance represents a single, independent view of a single graph. Graph instances don't maintain state, so every request (or batch of requests) operates against the underlying graph and then returns immutable snapshots of the requested state at the time the request was made.

There are several ways to obtain a Graph instance, as we'll see in later chapters. For the time being, the important thing to understand is what a Graph instance represents and how it interacts with the underlying content to return representations of portions of that underlying graph content.

The Graph class basically represents an internal domain specific language (DSL), designed to be easy to use in an application. The Graph API makes extensive use of interfaces and method chaining, so that methods return a concise interface that has only those methods that make sense at that point. In fact, this should be really easy if your IDE has code completion. Just remember that under the covers, a Graph is just building Request objects, submitting them to the connector, and then exposing the results.

The next few subsections describe how to use a Graph instance.

3.7.1. Using Workspaces

ModeShape graphs have the notion of workspaces that provide different views of the content. Some graphs may have one workspace, while others may have multiple workspaces. Some graphs will allow a client to create new workspaces or destroy existing workspaces, while other graphs will not allow adding or removing workspaces. Some graphs may have workspaces that may show the same (or very similar) content, while other graphs may have workspaces that contain completely independent content.

The Graph object is always bound to a workspace, which initially is the default workspace. To find out what the name of the default workspace is, simply ask for the current workspace after creating the Graph:

Workspace current = graph.getCurrentWorkspace();

To obtain the list of workspaces available in a graph, simply ask for them:

Set<String> workspaceNames = graph.getWorkspaces();

Once you know the name of a particular workspace, you can specify that the graph should use it:

graph.useWorkspace("myWorkspace");

From this point forward, all requests will apply to the workspace named "myWorkspace". At any time, you can use a different workspace, which will affect all subsequent requests made using the graph. To go back to the default workspace, simply supply a null name:

graph.useWorkspace(null);

Of course, creating a new workspace is just as easy:

graph.createWorkspace().named("newWorkspace");

This will attempt to create a workspace named "newWorkspace", which will fail if that workspace already exists. You may want to create a new workspace with a name that should be altered if the name you supply is already used. The following code shows how you can do this:

graph.createWorkspace().namedSomethingLike("newWorkspace");

If there is no existing workspace named "newWorkspace", a new one will be created with this name. However, if "newWorkspace" already exists, this call will create a workspace with a name that is some alteration of the supplied name.

You can also clone workspaces, too:

graph.createWorkspace().clonedFrom("original").named("something");

graph.createWorkspace().clonedFrom("original").namedSomethingLike("something");

As you can see, it's very easy to specify which workspace you want to use or to create new workspaces. You can also find out which workspace the graph is currently using:

String current = graph.getCurrentWorkspaceName();

or, if you want, you can get more information about the workspace:

Workspace current = graph.getCurrentWorkspace();
String name = current.getName();
Location rootLocation = current.getRoot();

3.7.2. Working with Nodes

Now let's switch to working with nodes. This first example returns a map of properties (keyed by property name) for a node at a specific Path:

Path path = ...
Map<Name,Property> propertiesByName = graph.getPropertiesByName().on(path);

This next example shows how the graph can be used to obtain and loop over the properties of a node:

Path path = ...
for ( Property property : graph.getProperties().on(path) ) {
	  ...
}

Likewise, the next example shows how the graph can be used to obtain and loop over the children of a node:

Path path = ...
for ( Location child : graph.getChildren().of(path) ) {
    Path childPath = child.getPath();
	  ...
}

Notice that the examples pass a Path instance to the on(...) and of(...) methods. Many of the Graph API methods take a variety of parameter types, including String, Paths, Locations, UUID, or Property parameters. This should make it easy to use in many different situations.

Of course, changing content is more interesting and offers more interesting possibilities. Here are a few examples:

Path path = ...
Location location = ...
Property idProp1 = ...
Property idProp2 = ...
UUID uuid = ...
graph.move(path).into(idProp1, idProp2);
graph.copy(path).into(location);
graph.delete(uuid);
graph.delete(idProp1,idProp2);

The methods shown above work immediately, as soon as each request is built. However, there is another way to use the Graph object, and that is in a batch mode. Simply create a Graph.Batch object using the batch() method, create the requests on that batch object, and then execute all of the commands on the batch by calling its execute() method. That execute() method returns a Results interface that can be used to read the node information retrieved by the batched requests.

Method chaining works really well with the batch mode, since multiple commands can be assembled together very easily:

Path path = ...
String path2 = ...
Location location = ...
Property idProp1 = ...
Property idProp2 = ...
UUID uuid = ...
graph.batch().move(path).into(idProp1, idProp2)
       .and().copy(path2).into(location)
       .and().delete(uuid)
       .execute();
Results results = graph.batch().read(path2)
                           .and().readChildren().of(idProp1,idProp2)
                           .and().readSugraphOfDepth(3).at(uuid2)
                           .execute();
for ( Location child : results.getNode(path2) ) {
    ...
}

Of course, this section provided just a hint of the Graph API. The Graph interface is actually quite complete and offers a full-featured approach for reading and updating a graph. For more information, see the Graph JavaDocs.

3.8. Requests

ModeShape Graph objects operate upon the underlying graph content, but we haven't really talked about how that works. Recall that the Graph objects don't maintain any stateful representation of the content, but instead submit requests to the underlying graph and return representations of the requested portions of the content. This section focuses on what those requests look like, since they'll actually become very important when working with connectors in the next chapter.

A graph Request is an encapsulation of a command that is to be executed by the underlying graph owner (typically a connector). Request objects can take many different forms, as there are different classes for each kind of request. Each request contains the information needed to complete the processing, and it also is the place where the results (or error) are recorded.

The Graph object creates the Request objects using Location objects to identify the node (or nodes) that are the subject of the request. The Graph can either submit the request immediately, or it can batch multiple requests together into "units of work". The submitted requests are then processed by the underlying system (e.g., connector) and returned back to the Graph object, which then extracts and returns the results.

3.8.1. Basic Requests

There are actually quite a few different types of Request classes:

ReadNodeRequest: A request to read a node's properties and children from the named workspace in the source. The node may be specified by path and/or by identification properties. The connector returns all properties and the locations for all children, or sets a PathNotFoundException error on the request if the node did not exist in the workspace. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties). The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
VerifyNodeExistsRequest: A request to verify the existence of a node at the specified location in the named workspace of the source. The connector returns all the actual location for the node if it exists, or sets a PathNotFoundException error on the request if the node does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadAllPropertiesRequest: A request to read all of the properties of a node from the named workspace in the source. The node may be specified by path and/or by identification properties. The connector returns all properties that were found on the node, or sets a PathNotFoundException error on the request if the node did not exist in the workspace. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties). The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadPropertyRequest: A request to read a single property of a node from the named workspace in the source. The node may be specified by path and/or by identification properties, and the property is specified by name. The connector returns the property if found on the node, or sets a PathNotFoundException error on the request if the node or property did not exist in the workspace. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties). The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadAllChildrenRequest: A request to read all of the children of a node from the named workspace in the source. The node may be specified by path and/or by identification properties. The connector returns an ordered list of locations for each child found on the node, an empty list if the node had no children, or sets a PathNotFoundException error on the request if the node did not exist in the workspace. If the node is found, the connector sets on the request the actual location of the parent node (including the path and identification properties). The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadBlockOfChildrenRequest: A request to read a block of children of a node, starting with the n^th child from the named workspace in the source. This is designed to allow paging through the children, which is much more efficient for large numbers of children. The node may be specified by path and/or by identification properties, and the block is defined by a starting index and a count (i.e., the block size). The connector returns an ordered list of locations for each of the node's children found in the block, or an empty list if there are no children in that range. The connector also sets on the request the actual location of the parent node (including the path and identification properties) or sets a PathNotFoundException error on the request if the parent node did not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadNextBlockOfChildrenRequest: A request to read a block of children of a node, starting with the children that immediately follow a previously-returned child from the named workspace in the source. This is designed to allow paging through the children, which is much more efficient for large numbers of children. The node may be specified by path and/or by identification properties, and the block is defined by the location of the node immediately preceding the block and a count (i.e., the block size). The connector returns an ordered list of locations for each of the node's children found in the block, or an empty list if there are no children in that range. The connector also sets on the request the actual location of the parent node (including the path and identification properties) or sets a PathNotFoundException error on the request if the parent node did not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
ReadBranchRequest: A request to read a portion of a subgraph that has as its root a particular node, up to a maximum depth. This request is an efficient mechanism when a branch (or part of a branch) is to be navigated and processed, and replaces some non-trivial code to read the branch iteratively using multiple ReadNodeRequests. The connector reads the branch to the specified maximum depth, returning the properties and children for all nodes found in the branch. The connector also sets on the request the actual location of the branch's root node (including the path and identification properties). The connector sets a PathNotFoundException error on the request if the node at the top of the branch does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
CompositeRequest: A request that actually comprises multiple requests (none of which will be a composite). The connector simply processes all of the requests in the composite request, but should set on the composite request any error (usually the first error) that occurs during processing of the contained requests.

3.8.2. Change Requests

ChangeRequest is a subclass of Request that provides a base class for all the requests that request a change be made to the content. As we'll see later, these ChangeRequest objects also get reused by the observation system.

There specific subclasses of ChangeRequest are:

CreateNodeRequest: A request to create a node at the specified location and setting on the new node the properties included in the request. The connector creates the node at the desired location, adjusting any same-name-sibling indexes as required. (If an SNS index is provided in the new node's location, existing children with the same name after that SNS index will have their SNS indexes adjusted. However, if the requested location does not include a SNS index, the new node is added after all existing children, and it's SNS index is set accordingly.) The connector also sets on the request the actual location of the new node (including the path and identification properties).. The connector sets a PathNotFoundException error on the request if the parent node does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
RemovePropertiesRequest: A request to remove a set of properties on an existing node. The request contains the location of the node as well as the names of the properties to be removed. The connector performs these changes and sets on the request the actual location (including the path and identification properties) of the node. The connector sets a PathNotFoundException error on the request if the node does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
UpdatePropertiesRequest: A request to set or update properties on an existing node. The request contains the location of the node as well as the properties to be set and those to be deleted. The connector performs these changes and sets on the request the actual location (including the path and identification properties) of the node. The connector sets a PathNotFoundException error on the request if the node does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
RenameNodeRequest: A request to change the name of a node. The connector changes the node's name, adjusts all SNS indexes accordingly, and returns the actual locations (including the path and identification properties) of both the original location and the new location. The connector sets a PathNotFoundException error on the request if the node does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
CopyBranchRequest: A request to copy a portion of a subgraph that has as its root a particular node, up to a maximum depth. The request includes the name of the workspace where the original node is located as well as the name of the workspace where the copy is to be placed (these may be the same, but may be different). The connector copies the branch from the original location, up to the specified maximum depth, and places a copy of the node as a child of the new location. The connector also sets on the request the actual location (including the path and identification properties) of the original location as well as the location of the new copy. The connector sets a PathNotFoundException error on the request if the node at the top of the branch does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if one of the named workspaces does not exist.
MoveBranchRequest: A request to move a subgraph that has a particular node as its root. The connector moves the branch from the original location and places it as child of the specified new location. The connector also sets on the request the actual location (including the path and identification properties) of the original and new locations. The connector will adjust SNS indexes accordingly. The connector sets a PathNotFoundException error on the request if the node that is to be moved or the new location do not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
DeleteBranchRequest: A request to delete an entire branch specified by a single node's location. The connector deletes the specified node and all nodes below it, and sets the actual location, including the path and identification properties, of the node that was deleted. The connector sets a PathNotFoundException error on the request if the node being deleted does not exist in the workspace. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.

3.8.3. Workspace Read Requests

There are also requests that read information about workspaces:

GetWorkspacesRequest: A request to obtain the names of the existing workspaces that are accessible to the caller.
VerifyWorkspaceRequest: A request to verify that a workspace with a particular name exists. The connector returns the actual location for the root node if the workspace exists, as well as the actual name of the workspace (e.g., the default workspace name if a null name is supplied).

3.8.4. Workspace Change Requests

And there are also requests that deal with changing workspaces (and thus extend ChangeRequest):

CreateWorkspaceRequest: A request to create a workspace with a particular name. The connector returns the actual location for the root node if the workspace exists, as well as the actual name of the workspace (e.g., the default workspace name if a null name is supplied). The connector sets a InvalidWorkspaceException error on the request if the named workspace already exists.
DestroyWorkspaceRequest: A request to destroy a workspace with a particular name. The connector sets a InvalidWorkspaceException error on the request if the named workspace does not exist.
CloneWorkspaceRequest: A request to clone one named workspace as another new named workspace. The connector sets a InvalidWorkspaceException error on the request if the original workspace does not exist, or if the new workspace already exists.

3.8.5. Search Requests

Several requests are designed to push searches and queries down to the connector, if connectors support such operations:

SearchRequest: A request to query a named workspace using a supplied query. The connector returns tuples containing the columns and resulting values, plus statistics about the execution of the query.
FullTextSearchRequest: A request to search a named workspace using a supplied full-text search string and optional offset and limit values. The connector returns tuples containing the columns and resulting values, plus statistics about the execution of the query.

3.8.6. Function Requests

One type of request allows a function to be passed to the connector:

FunctionRequest: A request that executes a supplied function at a particular location within a named workspace. The inputs to the function can be set on the request (as a series of name-value pairs), and when executed the function will set the outputs as name-value pairs on the request. This request is extremely useful for (complex) operations that must first read information from the workspace and then perform other actions.

This section covered the different kinds of Request classes. The next section provides a easy way to encapsulate how a component should responds to these requests, and after that we'll see how these Request objects are also used in the observation framework.

3.9. Request processors

ModeShape connectors are typically the components that receive these Request objects. We'll dive deep into connectors in the next chapter, but before we do there is one more component related to Requests that should be discussed.

The RequestProcessor class is an abstract class that defines a process(...) method for each concrete Request subclass. In other words, there is a process(CompositeRequest) method, a process(ReadNodeRequest) method, and so on. This makes it easy to implement behavior that responds to the different kinds of Request classes: simply subclass the RequestProcessor, override all of the abstract methods, and optionally overriding any of the other methods that have a default implementation.

Note

The RequestProcessor abstract class contains default implementations for quite a few of the process(...) methods, and these will be sufficient but probably not efficient or optimum. If you can provide a more efficient implementation given your source, feel free to do so. However, if performance is not a big issue, all of the concrete methods will provide the correct behavior. Keep things simple to start out - you can always provide better implementations later.

3.10. Observation

The ModeShape graph model also incorporates an observation framework that allows components to register and be notified when changes occur within the content owned by a graph.

Many event frameworks define the listeners and sources as interfaces. While this is often useful, it requires that the implementations properly address the thread-safe semantics of managing and calling the listeners. The ModeShape observation framework uses abstract or concrete classes to minimize the effort required for implementing ChangeObserver or Observable. These abstract classes provide implementations for a number of utility methods (such as the unregister() method on ChangeObserver) that also save effort and code.

However, one of the more important reasons for providing classes is that ChangeObserver uses weak references to track the Observable instances, and the ChangeObservers class uses weak references for the listeners. This means that an observer does not prevent Observable instances from being garbage collected, nor do observers prevent Observable instances from being garbage collected. These abstract class provide all this functionality for free.

3.10.1. Observable

Any component that can have changes and be observed can implement the Observable interface. This interface allows Observers to register (or be registered) to receive notifications of the changes. However, a concrete and thread-safe implementation of this interface, called ChangeObservers, is available and should be used where possible, since it automatically manages the registered ChangeObserver instances and properly implements the register and unregister mechanisms.

3.10.2. Observers

Components that are to recieve notifications of changes are called observers. To create an observer, simply extend the ChangeObserver abstract class and provide an implementation of the notify(Changes) method. Then, register the observer with an Observable using its register(ChangeObserver) method. The observer's notify(Changes) method will then be called with the changes that have been made to the Observable.

When an observer is no longer needed, it should be unregistered from all Observable instances with which it was registered. The ChangeObserver class automatically tracks which Observable instances it is registered with, and calling the observer's unregister() will unregister the observer from all of these Observables. Alternatively, an observer can be unregistered from a single Observable using the Observable's unregister(ChangeObserver) method.

3.10.3. Changes

The Changes class represents the set of individual changes that have been made during a single, atomic operation. Each Changes instance has information about the source of the changes, the timestamp at which the changes occurred, and the individual changes that were made. These individual changes take the form of ChangeRequest objects, which we'll see more of in the next chapter. Each request is frozen, meaning it is immutable and will not change. Also none of the change requests will be marked as cancelled.

Using the actual ChangeRequest objects as the "events" has a number of advantages. First, the existing ChangeRequest subclasses already contain the information to accurately and completely describe the operation. Reusing these classes means we don't need a duplicate class structure or come up with a generic event class.

Second, the requests have all the state required for an event, plus they often will have more. For example, the DeleteBranchRequest has the actual location of the branch that was deleted (and in this way is not much different than a more generic event), but the CreateNodeRequest has the actual location of the created node along with the properties of that node. Additionally, the RemovePropertyRequest has the actual location of the node along with the name of the property that was removed. In many cases, these requests have all the information a more general event class might have but then hopefully enough information for many observers to use directly without having to read the graph to decide what actually changed.

Third, the requests that make up a Changes instance can actually be replayed. Consider the case of a cache that is backed by a RepositorySource, which might use an observer to keep the cache in sync. As the cache is notified of Changes, the cache can simply replay the changes against its source.

As we'll see in the next chapter, each connector is responsible for propagating the ChangeRequest objects to the connector's Observer. But that's not the only use of Observers. We'll also see later how the sequencing system uses Observers to monitor for changes in the graph content to determine which, if any, sequencers should be run. And, the JCR implementation also uses the observation framework to propagate those changes to JCR clients.

3.11. Summary

In this chapter, we introduced ModeShape's graph model and showed the different kinds of objects used to represent nodes, paths, names, and properties. We saw how all of these objects are actually immutable, and how the low-level Graph API uses this characteristic to provide a stateless and thread-safe interface for working with repository content using the request model used to read, update, and change content.

Next, we'll dive into the connector framework, which builds on top of the graph model and request model, allowing ModeShape to access the graph content stored in many different kinds of systems.