JBoss.org Community Documentation

Chapter 9. Using the JCR API with ModeShape

9.1. What's new in JCR 2.0?
9.1.1. Connecting
9.1.2. Identifiers
9.1.3. Binary Values
9.1.4. Node Type Management
9.1.5. Queries
9.1.6. Workspace Management
9.1.7. Observation
9.1.8. Locking
9.1.9. Versioning
9.1.10. Importing and Exporting
9.1.11. Orderable Child Nodes
9.1.12. Paths
9.1.13. getItem(String)
9.2. Obtaining a JCR Repository
9.2.1. URL formats
9.2.2. Accessing Repositories from JNDI
9.2.3. Cleaning Up after JcrRepositoryFactory
9.3. ModeShape's JcrEngine
9.4. Creating JCR Sessions
9.4.1. Using JAAS
9.4.2. Using Custom Security
9.4.3. Using HTTP Servlet security
9.4.4. Guest (Anonymous) User Access
9.5. JCR Specification Support
9.5.1. Required features
9.5.2. Optional features
9.5.3. TCK Compatibility features
9.5.4. JCR Security
9.5.5. Built-In Node Types
9.5.6. Custom Node Type Registration
9.6. Summary

The Content Repository for Java Technology API 2.0 provides a standard Java API for working with content repositories. Abbreviated "JCR", this API was developed as part of the Java Community Process under JSR-170 (JCR 1.0) and has been revised and improved as JCR 2.0 under JSR-283. Some of the improvements make it possible for your application to be written entirely against the JCR 2.0 API.

Note

In the interests of brevity, this chapter does not attempt to reproduce the JSR-283 specification nor provide an exhaustive definition of ModeShape JCR capabilities. Rather, this chapter will describe any deviations from the specification as well as any ModeShape-specific public APIs and configuration. So, for a detailed explanation of the JCR API and its many interfaces and methods, see the JSR-283 specification.

Using ModeShape within your application is actually quite straightforward, and with JCR 2.0 it is possible for your application to do everything using only the JCR 2.0 API. Your application will first obtain a javax.jcr.Repository instance, and will use that object to create sessions through which your application will read, modify, search, or monitor content in the repository. JCR sessions are designed to be lightweight, so it is perfectly fine (and actually recommended) for your application to create many short-lived sessions while generally avoiding longer-lived sessions. In fact, javax.jcr.Session objects are not required to be thread-safe (and are not in ModeShape), so your application should avoid using a single Session instance in multiple threads.

Before we get started talking about how to use ModeShape via the standard JCR 2.0 API, it's worth spending a little time talking about the changes in JCR 2.0 compared with JCR 1.0.

Although an application written against the JCR 1.0 API will for the most part work very well against a JCR 2.0 repository, there are a few improvements to the JCR 2.0 API that your application will likely want to leverage.

Let's look at some of the more important changes in the JCR 2.0 API. This is neither a definitive nor a complete comparison, so please consult the JSR-283 specification for the details.

JCR 1.0 did not specify a way for client applications to obtain the Repository instance, though the JCR 1.0 specification did state this is typically done through JNDI. Consequently, JCR clients either used the JNDI approach or were required to use implementation-specific code. Often, client applications abstracted this process to minimize their reliance upon implementation-specific interfaces.

While the JNDI approach still works, JCR 2.0 introduces a new mechanism that makes it possible to find a Repository instance using only the JCR API. This is covered in more detail later in this chapter, but suffice it to say that ModeShape does support this new RepositoryFactory approach.

How this affects your application: If your application used an implementation-specific approach to obtaining a Repository instance, you might consider changing it to use the new RepositoryFactory mechanism.

JCR 1.0 has always supported storing binary values in properties, but clients could do little more than just stream the bytes for each value. JCR 2.0 introduces a Binary interface that defines a way to get the size of the binary value, an InputStream to the value, a method for random access to the value's bytes, and a way to dispose of the binary value when completed (allowing the implementation to better clean up memory and other resources).

How this affects your application: The way your existing JCR application accesses and sets binary values will still work, but the methods are now deprecated. Therefore, you will very likely want to change to use the new Binary interface. For example, code that previously accessed the input stream directly from the Property:

Property property = ...

InputStream stream = property.getInputStream();
try {
   // Read stream
} finally {
   stream.close();
}

can be minimally changed to first get the Binary value and then get the stream from this Binary value:

Property property = ...

InputStream stream = property.getBinary().getInputStream();
try {
   // Read stream
} finally {
   stream.close();
}

This second example does not use any deprecated methods, but it also does not explicitly dispose of the Binary object. This works in ModeShape, because closing the InputStream automatically disposes of the Binary object.

You may also consider whether your application may benefit from the new Binary.getSize() or Binary.read(byte[],long) methods.

JCR 1.0 made it possible for applications to query the repository using the XPath and JCR-SQL query languages. JCR 2.0 keeps (mostly) the same Java interfaces for executing queries, but it deprecates the XPath and JCR-SQL query languages and introduces a new declarative language called "JCR-SQL2" that is a substantial improvement over JCR-SQL. JCR 2.0 also introduces a new query object model (called "JCR-QOM") for defining queries using a programmatic API.

ModeShape supports all of these languages (XPath, JCR-SQL, JCR-SQL2, JCR-QOM), and also supports a full-text query language that is defined by the full-text search expression in the JCR-SQL2 language. Additionally, ModeShape extends most of these languages to support richer and more capable queries.

How this affects your application: Your application can continue to use XPath and JCR-SQL queries. However, your application may benefit from switching from JCR-SQL to JCR-SQL2 and its greater capabilities and expressive power. Leverage some of the ModeShape extensions to make your JCR-SQL2 queries even more powerful.

Versioning of nodes was defined as an optional feature of the JCR 1.0 API. The JCR 2.0 API expanded upon versioning by defining a simple versioning model, introducing the VersionManager interface, and making some semantic changes as well. For example, restoring a version that contained a versioned child in its subgraph no longer automatically restores the versioned child. This behavior was ambiguous in the JCR 1.0 specification, and ModeShape 1.x performed the restore operation recursively down the graph. The JCR 2.0 specification more clearly requires a non-recursive restore. ModeShape 2.0.0.Final follows the JCR 2.0 behavior and now supports the "full versioning" model.

How this affects your application: If your application is already using the JCR 1.0 versioning feature, be aware that many of the version-related methods on Node were deprecated in JCR 2.0 and moved to the new VersionManager interface. Also, any reliance upon ModeShape's recursive restore operation must be changed, per the JCR 2.0 specification.

Before your application can use a JCR repository, it has to find it. As mentioned above, the JCR 2.0 API defines a new RepositoryFactory interface that can be used with the Java Standard Edition Service Loader mechanism to obtain a Repository instance, all using the JCR API alone:



Map<String,String> parameters = ...
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}

This code looks for all RepositoryFactory implementations on the classpath (assuming those implementations properly defined the service provider within their JARs), and will ask each to create a repository given the supplied parameters. Thus, the parameters are specific to the implementation you want to use.

Note

With JCR 1.0, applications could only find a Repository instance using implementation-specific code. This new JCR 2.0 approach is a bit more complicated, but should work with most JCR 2.0 implementations and does not require using any implementation classes. And your application can even load the parameters from a configuration resource, meaning nothing in your application depends on a particular JCR implementation.
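As a minimal sketch of loading the parameters from a configuration resource, the following uses only java.util.Properties. The RepositoryParameters class and the configRepository.xml file name are hypothetical and exist only for illustration; the "org.modeshape.jcr.URL" key is the ModeShape property described in this section.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper (not part of JCR or ModeShape) that reads factory parameters from properties text. */
final class RepositoryParameters {

    private RepositoryParameters() {
    }

    /** Reads key=value pairs and copies them into the Map<String,String> shape passed to getRepository(Map). */
    static Map<String, String> load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> parameters = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            parameters.put(name, props.getProperty(name));
        }
        return parameters;
    }

    public static void main(String[] args) {
        // In a real application this text would come from a file or a classpath resource.
        String text = "org.modeshape.jcr.URL=file:configRepository.xml?repositoryName=MyRepository\n";
        Map<String, String> parameters = load(new StringReader(text));
        System.out.println(parameters);
    }
}
```

The resulting map can be handed to each RepositoryFactory in the ServiceLoader loop, so switching JCR implementations becomes a matter of editing the properties resource.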

ModeShape uses a single property named "org.modeshape.jcr.URL" with a value that is a URL that either resolves to a ModeShape configuration file, such as

 file://path/to/configFile.xml?repositoryName=MyRepository 

or points to a JcrEngine instance in JNDI:

 jndi://name/in/jndi?repositoryName=MyRepository 

Pointing directly to a configuration file often works well in stand-alone applications, while using JNDI works great for applications deployed to server platforms (e.g., an application server or servlet container) where multiple applications might want to use the same JCR repository. We'll see in the next section how to configure ModeShape's JcrEngine explicitly and register it in JNDI.

So, here's the ServiceLoader example again, but with ModeShape-specific parameters:



String configUrl = ... ; // URL that points to your configuration file
Map<String,String> parameters = Collections.singletonMap("org.modeshape.jcr.URL", configUrl);
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}

Once you've gotten hold of a Repository instance, you can use it to create Sessions, using code similar to:



Credentials credentials = ...; // JCR credentials
String workspaceName = ...;  // Name of repository workspace
Session session = repository.login(credentials,workspaceName);

We'll talk about the various ways of creating sessions later in this chapter.

The value of configUrl in the code snippets above would be something like file:relativePathToConfigFile?repositoryName=yourRepositoryName. In this example, the configuration file that specifies the repository setup will be loaded from the file path relativePathToConfigFile and the repository named yourRepositoryName will be returned. If there is no repository with that name or the configuration file does not exist at that path, getRepository(Map) will return null. The format for the configuration file is the same as used above when loading a JcrConfiguration object from a configuration file.

If an absolute path to the configuration file works better, a value for configUrl like file:///absolutePathToConfigFile?repositoryName=yourRepositoryName could have been used instead. Note the addition of the three forward slashes after the protocol portion of the URL (i.e., file:). This indicates that the following path is an absolute path.

This method can be used to load files from the file system or from the classpath. If there is no file found at the given path, the same path will be used to try to load the configuration file as a resource through the classloader.

Behind the scenes, the JcrRepositoryFactory is checking to see if it has already configured and started a JcrEngine for the named configuration file. If it has already created a JcrEngine for this configuration, then that JcrEngine is reused. Otherwise, a new JcrEngine is created and configured based on the given configuration file. Either way, the JcrEngine is used to get the reference to the returned Repository.
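The reuse behavior described above can be pictured as a cache keyed by configuration file. The following is only a sketch of that idea and not ModeShape's actual implementation: FakeEngine and EngineCache are hypothetical stand-ins, and keying by the configuration path string is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative stand-in for an engine; ModeShape's real JcrEngine is not used here. */
final class FakeEngine {
    final String configPath;

    FakeEngine(String configPath) {
        this.configPath = configPath;
    }
}

/** Hypothetical cache that builds at most one engine per configuration path. */
final class EngineCache {
    private final Map<String, FakeEngine> engines = new ConcurrentHashMap<>();

    /** Returns the engine already built for this configuration, or builds and remembers a new one. */
    FakeEngine engineFor(String configPath) {
        return engines.computeIfAbsent(configPath, FakeEngine::new);
    }

    /** Number of engines created so far. */
    int size() {
        return engines.size();
    }
}
```

Asking the cache twice for the same path yields the same FakeEngine instance, while a different path produces a second engine.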

Given a slightly different URL, the same code used above can be reused to get a Repository from a JcrEngine that has been previously deployed through JNDI. JNDI URLs take the form jndi:///nameOfJndiResource?repositoryName=yourRepositoryName. The nameOfJndiResource is passed directly to a JNDI lookup. If no JcrEngine exists at the given name, the getRepository(Map) method will return null. Also, any additional parameters besides JcrRepositoryFactory.URL that are provided in the parameters map in the getRepository(Map) method will be used in the constructor for the InitialContext used to look up the JNDI reference.

Accessing a repository through JNDI differs slightly from accessing a repository from a configuration file in that the JcrRepositoryFactory will never create a new JcrEngine instance in response to a getRepository invocation with a JNDI URL.
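To make the URL formats above concrete, here is a small sketch that splits such a URL into its scheme, location, and repositoryName parameter. The RepositoryUrl class is hypothetical and is not how JcrRepositoryFactory parses URLs; in particular, dropping all three slashes for jndi URLs is an assumption based on the statement that the name is passed directly to the JNDI lookup.

```java
/** Hypothetical value class that splits a ModeShape-style repository URL into its parts. */
final class RepositoryUrl {
    final String scheme;          // for example "file" or "jndi"
    final String location;        // file path or JNDI name, without the query string
    final String repositoryName;  // value of the repositoryName parameter, or null if absent
    final boolean absolute;       // true when the URL used the "///" form

    private RepositoryUrl(String scheme, String location, String repositoryName, boolean absolute) {
        this.scheme = scheme;
        this.location = location;
        this.repositoryName = repositoryName;
        this.absolute = absolute;
    }

    static RepositoryUrl parse(String url) {
        int colon = url.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Missing scheme: " + url);
        }
        String scheme = url.substring(0, colon);
        String rest = url.substring(colon + 1);

        // Separate the query string (everything after '?') from the location.
        String query = "";
        int question = rest.indexOf('?');
        if (question >= 0) {
            query = rest.substring(question + 1);
            rest = rest.substring(0, question);
        }

        boolean absolute = rest.startsWith("///");
        String location = rest;
        if (absolute) {
            // Assumption: file paths keep one leading slash, JNDI names keep none.
            location = rest.substring("jndi".equals(scheme) ? 3 : 2);
        }

        // Pick the repositoryName parameter out of the query string, if present.
        String repositoryName = null;
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("repositoryName")) {
                repositoryName = pair.substring(eq + 1);
            }
        }
        return new RepositoryUrl(scheme, location, repositoryName, absolute);
    }
}
```

With this sketch, file:relativePathToConfigFile?repositoryName=yourRepositoryName yields a relative location, while the file:/// form yields a location that starts with a single slash.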

As a preceding section notes, it is possible for the JcrRepositoryFactory to create one or more JcrEngine instances. Although the JSR-283 specification does not specify a way to shut down engines or repositories created as a side effect of JcrRepositoryFactory use, ModeShape has an extension to the JSR-283 API that provides this capability.



org.modeshape.jcr.api.RepositoryFactory repoFactory = new JcrRepositoryFactory();
// Create any number of JcrEngines by calling repoFactory.getRepository(params);
// Do some stuff with the repository
// Now clean up the repository when finished
repoFactory.shutdown(30, TimeUnit.SECONDS);

The code listed above will instantiate a new JcrRepositoryFactory and use a ModeShape-specific method to shut down any JcrEngines created by JcrRepositoryFactory, waiting up to 30 seconds for each of them to shut down gracefully. Behind the scenes, the shutdown(long, TimeUnit) method iterates over an internal collection of JcrEngines and calls shutdown and awaitTermination(long, TimeUnit) on each engine.
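The iteration described above might look roughly like the following sketch. It is not ModeShape's code: EngineRegistry is a hypothetical class, and java.util.concurrent.ExecutorService is used as a stand-in for JcrEngine only because it happens to offer shutdown() and awaitTermination(long, TimeUnit) methods with the same names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry that remembers every "engine" it creates so all can be shut down together. */
final class EngineRegistry {
    private final List<ExecutorService> engines = new ArrayList<>();

    /** Creates and remembers a new stand-in engine. */
    ExecutorService create() {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        engines.add(engine);
        return engine;
    }

    /** Shuts down each engine in turn, waiting up to the timeout for each; returns true if all terminated. */
    boolean shutdown(long timeout, TimeUnit unit) {
        boolean allTerminated = true;
        for (ExecutorService engine : engines) {
            engine.shutdown();
            try {
                allTerminated &= engine.awaitTermination(timeout, unit);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report that shutdown did not complete.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return allTerminated;
    }
}
```

Note that in this sketch the timeout applies to each engine separately, so the total wait can be as long as the timeout multiplied by the number of engines.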

Although the preferred mechanism to obtain a Repository object is through the RepositoryFactory interface described above, there are times when an application wants or needs more control over an actual ModeShape engine. The engine encapsulates everything necessary to run one or more JCR repositories and manages the underlying repository sources, the pools of connections to the sources, the sequencers, the MIME type detector(s), and the Repository implementations.

Creating a new JcrEngine instance is very easy if you already have a valid JcrConfiguration instance as described in the previous chapter. Once you have a valid JcrConfiguration instance, all you have to do is build and start the engine:



JcrConfiguration config = ...
JcrEngine engine = config.build();
engine.start();
 

Obtaining a JCR Repository instance is a matter of simply asking the engine for it by the name defined in the configuration:



javax.jcr.Repository repository = engine.getRepository("Name of repository");
 

At this point, your application can proceed by working with the JCR API.

And, once you're finished with the JcrEngine, you should shut it down:



engine.shutdown();
engine.awaitTermination(3,TimeUnit.SECONDS);    // optional
 

When the shutdown() method is called, the Repository instances managed by the engine are marked as being shut down, and they will not be able to create new Sessions. However, any existing Sessions or ongoing operations (e.g., event notifications) present at the time of the shutdown() call will be allowed to finish. In essence, shutdown() is a graceful request, and since it may take some time to complete, you can wait until the shutdown has completed by calling awaitTermination(...) as shown above. This method will block until the engine has shut down or until the supplied time duration has passed (whichever comes first). You can call the awaitTermination(...) method repeatedly if needed.
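These graceful-shutdown semantics closely resemble those of java.util.concurrent.ExecutorService, so they can be observed without a ModeShape engine. The sketch below uses an executor purely as an analogy: GracefulShutdownDemo and its Outcome class are hypothetical, and the sleeping task stands in for an ongoing operation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Analogy only: an ExecutorService plays the role of the engine. */
final class GracefulShutdownDemo {

    /** What was observed while shutting down. */
    static final class Outcome {
        boolean inFlightCompleted;  // the operation running at shutdown time finished normally
        boolean newWorkRejected;    // work submitted after shutdown() was refused
        int awaitCalls;             // how many awaitTermination calls were needed
    }

    static Outcome demonstrate() {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean(false);
        Outcome outcome = new Outcome();

        // An "ongoing operation" that is still running when shutdown() is called.
        engine.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            finished.set(true);
        });

        try {
            started.await();
            engine.shutdown();  // graceful request: does not interrupt running work
            try {
                engine.execute(() -> { });
            } catch (RejectedExecutionException expected) {
                outcome.newWorkRejected = true;
            }
            // Wait in short slices, calling awaitTermination repeatedly until it succeeds.
            do {
                outcome.awaitCalls++;
            } while (!engine.awaitTermination(50, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        outcome.inFlightCompleted = finished.get();
        return outcome;
    }
}
```

Waiting in short slices, as the loop does, lets an application log progress or give up early instead of blocking for one long timeout.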

Once you have obtained a reference to the JCR Repository, you can create a JCR session using one of its login(...) methods. The JSR-283 specification provides four login methods, but the behavior of these methods depends on the kind of authentication system your application is using.

The login() method allows the implementation to choose its own security context to create a session in the default workspace for the repository. The ModeShape JCR implementation uses the security context from the current JAAS AccessControlContext. This means the method will throw a LoginException if it is not executed as a PrivilegedAction, unless the JcrRepository.Options.ANONYMOUS_USER_ROLES option allows guest access (see below for an example of how to configure guest user access). Here is one example of how this might work:

Subject subject = ...;
Session session = Subject.doAsPrivileged(subject, new PrivilegedExceptionAction<Session>() {
    public Session run() throws Exception {
        return repository.login();
    }
}, AccessController.getContext());

Another variant of this is to use the AccessControlContext directly, which then operates against the current Subject:

Session session = AccessController.doPrivileged( new PrivilegedExceptionAction<Session>() {
    public Session run() throws Exception {
        return repository.login();
    }
});

Either of these approaches will yield a session with the same user name and roles as subject. The login(String workspaceName) method is comparable and allows the workspace to be specified by name:

Subject subject = ...;
final String workspaceName = ...;
Session session = Subject.doAsPrivileged(subject, new PrivilegedExceptionAction<Session>() {
    public Session run() throws Exception {
        return repository.login(workspaceName);
    }
}, AccessController.getContext());

The JCR API also allows supplying a JCR Credentials object directly as part of the login process, although ModeShape imposes some requirements on what types of Credentials may be supplied. The simplest way is to provide a JCR SimpleCredentials object. These credentials will be validated against the JAAS realm named "modeshape-jcr", unless another realm name is provided as an option during the JCR repository configuration. For example:

String userName = ...;
char[] password = ...;
Session session = repository.login(new SimpleCredentials(userName, password));

Similarly, the login(Credentials credentials, String workspaceName) method enables passing the credentials and a workspace name:

String userName = ...;
char[] password = ...;
String workspaceName = ...;
Session session = repository.login(new SimpleCredentials(userName, password), workspaceName);

If a LoginContext is available for the user, that can be used as part of the credentials to authenticate the user with ModeShape instead. This snippet uses an anonymous class to provide the login context, but any class with a LoginContext getLoginContext() method can be used as well.

final LoginContext loginContext = ...;
Session session = repository.login(new Credentials() {
    public LoginContext getLoginContext() {
        return loginContext;
    }
}, workspaceName);

Not all applications can or want to use JAAS for their authentication system, so ModeShape provides a way to integrate your own custom security provider. The first step is to provide a custom implementation of SecurityContext that integrates with your application security, allowing ModeShape to discover the authenticated user's name, determine whether the authenticated user has been assigned particular roles (see the JCR Security section), and to notify your application security system that the authenticated session (for JCR) has ended.

The next step is to wrap your SecurityContext instance within an instance of SecurityContextCredentials, and pass it as the Credentials parameter in one of the two login(...) methods:

SecurityContext securityContext = new CustomSecurityContext(...);
Session session = repository.login(new SecurityContextCredentials(securityContext));
			

Once the Session is obtained, the repository content can be accessed and modified like any other JCR repository.

Servlet-based applications can make use of the servlet's existing authentication mechanism from HttpServletRequest. Please note that the example below assumes that the servlet has a security constraint that prevents unauthenticated access.

HttpServletRequest request = ...;
SecurityContext securityContext = new ServletSecurityContext(request);
Session session = repository.login(new SecurityContextCredentials(securityContext));

You'll note that this is just a specialization of the custom security context approach, since the ServletSecurityContext just implements the SecurityContext interface and delegates to the HttpServletRequest. Feel free to use this class in your servlet-based applications.

We believe that the ModeShape JCR implementation is JCR-compliant, but we are awaiting final certification of compliance. Additionally, the JCR specification allows implementors some latitude in certain implementation details. The sections below clarify ModeShape's current and planned behavior. As always, please consult the current list of known issues and bugs.

ModeShape 2.0.0.Final implements all of the JCR 2.0 required features:

ModeShape supports several query languages, including the JCR-SQL2 and JCR-QOM query languages defined in JSR-283, and the XPath and JCR-SQL languages defined in JSR-170 but deprecated in JSR-283. ModeShape also supports a fulltext search language that is defined by the full-text search expression grammar used in the second parameter of the CONTAINS(...) function of the JCR-SQL2 language. We just pulled it out and made it available as a first-class query language.

The ModeShape project has not yet been certified to be fully compliant with the JCR 2.0 specification, but plans to attain this certification in the very near future.

However, the ModeShape project also runs the JCR TCK unit tests from the reference implementation every night. These tests technically do not represent the official TCK, but are used within the TCK. Most of these unit tests are run in the modeshape-jcr module against the in-memory repository to ensure our JCR implementation behaves correctly, and the same tests are run in the modeshape-integration-tests module against a variety of connectors to ensure they're implemented correctly. The modeshape-jcr-tck module runs all of these TCK unit tests, and currently there are only a handful of failures due to known issues (see the JCR specification support section for details).

ModeShape 2.0.0.Final currently passes 1371 of the 1391 JCR TCK tests, where 17 of these 20 failures appear to be bugs in the TCK tests (see JCR-2648, JCR-2661, JCR-2662, and JCR-2663). The remaining 3 failures are due to known issues (see MODE-760 and MODE-786).

Although the JSR-283 specification requires implementation of the Session.checkPermission(String, String) method, it allows implementors to choose the granularity of their access controls. ModeShape supports coarse-grained, role-based access control at the repository and workspace level.

ModeShape has extended the set of JCR-defined actions ("add_node", "set_property", "remove", and "read") with additional actions ("register_type", "register_namespace", "unlock_any", "create_workspace" and "delete_workspace"). The "register_type" and "register_namespace" permissions control the ability to register (and unregister) node types and namespaces, respectively. The "unlock_any" permission grants the user the ability to unlock any locked node or branch (as opposed to users without that permission, who can only unlock nodes or branches that they have locked themselves or for which they hold the lock token). Finally, the "create_workspace" and "delete_workspace" permissions grant the user the ability to create workspaces and delete workspaces, respectively, using the corresponding methods on Workspace. Permissions to perform these actions are aggregated in roles that can be assigned to users.

ModeShape currently defines three roles: readonly, readwrite, and admin. If the Credentials passed into Repository.login(...) (or the Subject from the AccessControlContext, if one of the no-credential login methods was used) have any of these roles, the session will have the corresponding access to all workspaces within the repository. The mapping from the roles to the actions that they allow is provided below, for any values of path.


It is also possible to grant access only to one or more repositories on a single ModeShape server or to one or more named workspaces within a repository. The format for role names is defined below:


It is also possible to grant more than one role to the same user. For example, the user "jsmith" could be granted the roles "readonly.production", "readwrite.production.jsmith", and "readwrite.staging" to allow read-only access to any workspace on a production repository, read/write access to a personal workspace on the same production repository, and read/write access to any workspace in a staging repository.
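The "jsmith" example suggests that a role name consists of the role, optionally followed by a repository name and then a workspace name, separated by periods. Assuming that format (it is inferred from the example names, not taken from ModeShape's source), a role name could be interpreted as in the sketch below. RoleGrant is a hypothetical class and not part of ModeShape.

```java
/** Hypothetical model of a scoped role name such as "readwrite.production.jsmith". */
final class RoleGrant {
    final String role;        // for example "readonly", "readwrite" or "admin"
    final String repository;  // null means the role applies to any repository
    final String workspace;   // null means the role applies to any workspace

    private RoleGrant(String role, String repository, String workspace) {
        this.role = role;
        this.repository = repository;
        this.workspace = workspace;
    }

    /** Splits "role", "role.repository" or "role.repository.workspace" into its parts. */
    static RoleGrant parse(String roleName) {
        // Limit of 3 keeps any further periods inside the workspace part (an assumption).
        String[] parts = roleName.split("\\.", 3);
        String repository = parts.length > 1 ? parts[1] : null;
        String workspace = parts.length > 2 ? parts[2] : null;
        return new RoleGrant(parts[0], repository, workspace);
    }

    /** True if this grant covers the given repository and workspace. */
    boolean appliesTo(String repositoryName, String workspaceName) {
        return (repository == null || repository.equals(repositoryName))
            && (workspace == null || workspace.equals(workspaceName));
    }
}
```

Under this reading, "readonly.production" applies to every workspace in the production repository, while "readwrite.production.jsmith" applies only to the jsmith workspace there.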

As a final note, the ModeShape JCR implementation may have additional security roles added in the future. A CONNECT role is already being used by the ModeShape REST Server to control whether users have access to the repository through that means.

ModeShape supports all of the built-in node types described in the JSR-283 specification. ModeShape also defines some custom node types in the mode namespace, but none of these node types (other than mode:resource) are intended to be used by developers integrating with ModeShape and may be changed or removed at any time.

Although the JSR-283 specification does not require support for registration and unregistration of custom types, ModeShape supports this extremely useful feature. Custom node types can be added at startup, as noted above, at runtime through a ModeShape-specific interface that accepts CND files, or through the JSR-283 node type template methods. All three of these node type registration mechanisms are supported equally within ModeShape, although the CND approach for defining node types is recommended.

Note

ModeShape also supports defining custom node types to load at startup. This is discussed in more detail in the previous chapter.

Node types can be defined like so:

Session session = ... ;
Workspace workspace = session.getWorkspace();

// Obtain the ModeShape-specific node type manager ...
JcrNodeTypeManager nodeTypeManager = (JcrNodeTypeManager) workspace.getNodeTypeManager();

// Declare a mixin node type named "searchable" (with no namespace)
NodeTypeTemplate nodeType = nodeTypeManager.createNodeTypeTemplate();
nodeType.setName("searchable");
nodeType.setMixin(true);

// Add a mandatory child named "source" with a required primary type of "nt:file" 
NodeDefinitionTemplate childNode = nodeTypeManager.createNodeDefinitionTemplate();
childNode.setName("source");
childNode.setMandatory(true);
childNode.setRequiredPrimaryTypeNames(new String[] { "nt:file" });
childNode.setDefaultPrimaryTypeName("nt:file");
nodeType.getNodeDefinitionTemplates().add(childNode);

// Add a multi-valued STRING property named "keywords"
PropertyDefinitionTemplate property = nodeTypeManager.createPropertyDefinitionTemplate();
property.setName("keywords");
property.setMultiple(true);
property.setRequiredType(PropertyType.STRING);
nodeType.getPropertyDefinitionTemplates().add(property);

// Register the custom node type
nodeTypeManager.registerNodeType(nodeType,false);

Residual properties and child node definitions can also be defined simply by not calling setName on the template.

Custom node types can be defined more succinctly through the CND file format defined by the JCR 2.0 specification. In fact, this is how JBoss ModeShape defines its built-in node types. An example CND file that declares the same node type as above would be:

[searchable] mixin
- keywords (string) multiple
+ source (nt:file) = nt:file mandatory

This definition could then be registered as part of the repository configuration, using the JcrConfiguration class (see the previous chapter). Or, you can also use a Session to declare the node types in a CND file, but this also requires ModeShape-specific interfaces and classes:

String pathToCndFileInClassLoader = ...;
CndNodeTypeSource nodeTypeSource = new CndNodeTypeSource(pathToCndFileInClassLoader);

for (Problem problem : nodeTypeSource.getProblems()) {
    System.err.println(problem);
}
if (!nodeTypeSource.isValid()) {
    throw new IllegalStateException("Problems loading node types");
}

Session session = ... ;
// Obtain the ModeShape-specific node type manager ...
Workspace workspace = session.getWorkspace();
JcrNodeTypeManager nodeTypeManager = (JcrNodeTypeManager) workspace.getNodeTypeManager();
nodeTypeManager.registerNodeTypes(nodeTypeSource);

The CndNodeTypeSource class implements the JcrNodeTypeSource interface, so other implementations can be defined as well. For more information, see the JavaDoc for JcrNodeTypeSource.

ModeShape also supports a simple means of unregistering types, although it is not possible to unregister types that are currently being used by nodes or as required primary types or supertypes of other types. Unused node types can be unregistered with the following code:

String unusedNodeTypeName = ...;

Session session = ... ;
// Obtain the ModeShape-specific node type manager ...
Workspace workspace = session.getWorkspace();
JcrNodeTypeManager nodeTypeManager = (JcrNodeTypeManager) workspace.getNodeTypeManager();
nodeTypeManager.unregisterNodeType(Collections.singleton(unusedNodeTypeName));

In this chapter, we covered how to use JCR with ModeShape and learned about how it implements the JCR specification. Now that you know how ModeShape repositories work and how to use JCR to work with ModeShape repositories, we'll move on in the next chapter to show how you can use ModeShape to query and search your JCR data.