Using ModeShape within your application is actually quite straightforward, and with JCR 2.0 it is possible for your application to do everything using only the JCR 2.0 API. Your application will first obtain a javax.jcr.Repository instance, and will use that object to create sessions through which your application will read, modify, search, or monitor content in the repository.
However, before you can use ModeShape, you need to configure it, and that's what this chapter covers.
- Configuring ModeShape
- Configuration Files
- Programmatic Configuration
- Loading from a Configuration Repository
- JCR Repository options
- Repository system content
- Query index directory
- Authentication and Authorization
- Using ModeShape in Web Applications
- Deploying ModeShape to JBoss AS 5 and 6
- Embedding ModeShape in web apps for deployment to other servers
- Setting the Classpath
- Building against ModeShape via Maven
- Building against ModeShape via JARs
- What's next
There are three ways to configure ModeShape:
- Load from a file is conceptually the most straightforward and requires the least amount of Java code, but it does require having a configuration file. This is easy, allows one to manage configurations in version control, enables your application to use only the standard JCR API, and will likely be the best approach for most applications. If you're not sure, use this approach.
- Programmatic configuration allows an application to define and edit a configuration using Java code. This is useful when you cannot pre-define your configuration, or when you want to start with a baseline configuration, make programmatic changes based upon some inputs or preferences, and then save the configuration to a file. However, this requires that you write your application directly against ModeShape-specific interfaces and classes.
- Load from a configuration repository is an advanced technique that allows multiple JcrEngine instances (usually in different processes, perhaps on different machines) to easily access a shared configuration.
Each of these approaches has its own advantages, so the choice of which one to use is entirely up to you.
By far the easiest approach to defining your ModeShape configuration is to use a configuration file. As mentioned above, you'll want to do this if your application uses the standard and implementation-independent RepositoryFactory mechanism to obtain the JCR Repository reference.
Here is an example configuration file used in the repository example covered in the Getting Started document (though it has been slightly simplified for clarity):
Most likely you'll define your configuration in a file. But there are some situations where it's far easier - even necessary - to programmatically configure ModeShape. For example, you may not be able to predefine a configuration, because it needs parameters and information known only at runtime.
One obvious approach is to write code that takes this new information and generates a ModeShape configuration file. The challenge here is that a sizable amount of code may be required just to write out the XML file in the correct format.
Perhaps an easier approach is to use the ModeShape JcrConfiguration class to programmatically construct the configuration, and then have it write the configuration out to a file. You can even load a starting configuration, programmatically modify it, and write it out to a file. From there, your application can use the standard and implementation-independent JCR API to find and use the Repository instances.
The JcrConfiguration class is used by ModeShape to read in the configuration files, but it was also designed to have an easy-to-use API that makes it easy to configure each of the different kinds of components, especially when using an IDE with code completion. The next few sections describe how to configure the various parts of a ModeShape configuration.
Each repository source definition must include the name of the RepositorySource class as well as each bean property that should be set on the object:
This example defines an in-memory source with the name "source A", a description, and a single "defaultWorkspaceName" bean property. Different RepositorySource implementations will differ in the bean properties that are required and optional. Of course, the class can be specified as a Class reference or a string (followed by whether the class should be loaded from the classpath or from a specific classpath).
|Each time repositorySource(String) is called, it will either load the existing definition with the supplied name or will create a new definition if one does not already exist. To remove a definition, simply call remove() on the result of repositorySource(String). The set of existing definitions can be accessed with the repositorySources() method.|
Each repository must be defined to use a named repository source, but all other aspects (e.g., namespaces, node types, options) are optional.
This example defines a repository that uses the "source 1" repository source (which could be a federated source, an in-memory source, a database store, or any other source). Additionally, this example adds the node types in the "myCustomNodeTypes.cnd" file as those that will be made available when the repository is accessed. It also defines the "http://www.example.com/acme" namespace, and finally sets the "JAAS_LOGIN_CONFIG_NAME" option to define the name of the JAAS login configuration that should be used by the ModeShape repository.
|Each time repository(String) is called, it will either load the existing definition with the supplied name or will create a new definition if one does not already exist. To remove a definition, simply call remove() on the result of repository(String). The set of existing definitions can be accessed with the repositories() method.|
Each defined sequencer must specify the name of the StreamSequencer implementation class as well as the path expressions defining which nodes should be sequenced and the output paths defining where the sequencer output should be placed (often as a function of the input path expression).
This example defines a sequencer named "Image Sequencer" that uses the ImageMetadataSequencer class (loaded from the classpath) to sequence the "jcr:data" property on any new or changed nodes that are named "jcr:content" below a parent node with a name ending in ".jpg", ".jpeg", ".gif", ".bmp", ".pcx", ".iff", ".ras", ".pbm", ".pgm", ".ppm" or ".psd". The output of the sequencing operation should be placed at the "/images/$1" node, where the "$1" value is captured as the name of the parent node. (The capture groups work the same way as in regular expressions.) Of course, the class can be specified as a Class reference or a string (followed by whether the class should be loaded from the classpath or from a specific classpath).
|Each time sequencer(String) is called, it will either load the existing definition with the supplied name or will create a new definition if one does not already exist. To remove a definition, simply call remove() on the result of sequencer(String). The set of existing definitions can be accessed with the sequencers() method.|
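The capture-group behavior described above can be illustrated with plain java.util.regex. This is only an analogy: the pattern below is an ordinary regular expression, not ModeShape's path-expression syntax, and the class name, method name and sample paths are assumptions made for the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SequencerPathSketch {
    // Ordinary regex standing in for a sequencer path expression (illustration only).
    // Group 1 captures the name of the parent node, e.g. "beach.jpg".
    private static final Pattern INPUT = Pattern.compile(
        ".*/([^/]+\\.(?:jpg|jpeg|gif|bmp|pcx|iff|ras|pbm|pgm|ppm|psd))/jcr:content/@jcr:data");

    /** Returns the output path for a matching input path, or null when the input does not match. */
    static String outputPathFor(String inputPath) {
        Matcher m = INPUT.matcher(inputPath);
        if (!m.matches()) {
            return null;
        }
        // "$1" in the output path corresponds to the first capture group.
        return "/images/" + m.group(1);
    }

    public static void main(String[] args) {
        // prints "/images/beach.jpg"
        System.out.println(outputPathFor("/photos/2010/beach.jpg/jcr:content/@jcr:data"));
        // prints "null" because ".pdf" is not one of the listed extensions
        System.out.println(outputPathFor("/docs/report.pdf/jcr:content/@jcr:data"));
    }
}
```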
Note that in addition to including a description for the configuration, it is also possible to set sequencer-specific properties using the setProperty(String,String) method. When ModeShape uses this configuration to set up a sequencing operation, it will instantiate the StreamSequencer class and will call a JavaBean-style setter method for each property. For example, calling setProperty("foo","val1") on the sequencer configuration will mean that ModeShape will instantiate the sequencer implementation and will look for a setFoo(String) method on the sequencer implementation class, and use that method (if found) to pass the "val1" value to the instance.
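The JavaBean-style lookup described above can be sketched with standard Java reflection. This is not ModeShape's own code: DemoSequencer is a hypothetical stand-in for a sequencer implementation, and applyProperty is an assumed helper that mimics looking for a setFoo(String) method and using it only if it is found.

```java
import java.lang.reflect.Method;

final class BeanPropertySketch {
    /** Hypothetical stand-in for a sequencer implementation with a single bean property. */
    public static final class DemoSequencer {
        private String foo;
        public void setFoo(String foo) { this.foo = foo; }
        public String getFoo() { return foo; }
    }

    /** Calls set<Name>(String) on the target if such a method exists; returns whether it was found. */
    static boolean applyProperty(Object target, String name, String value) {
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            Method m = target.getClass().getMethod(setter, String.class);
            m.invoke(target, value);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // no matching setter, so the property is skipped
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not call " + setter, e);
        }
    }

    public static void main(String[] args) {
        DemoSequencer sequencer = new DemoSequencer();
        applyProperty(sequencer, "foo", "val1");
        System.out.println(sequencer.getFoo()); // prints "val1"
    }
}
```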
Each defined MIME type detector must specify the name of the MimeTypeDetector implementation class as well as any other bean properties required by the implementation.
Of course, the class can be specified as Class reference or a string (followed by whether the class should be loaded from the classpath or from a specific classpath).
|Each time mimeTypeDetector(String) is called, it will either load the existing definition with the supplied name or will create a new definition if one does not already exist. To remove a definition, simply call remove() on the result of mimeTypeDetector(String). The set of existing definitions can be accessed with the mimeTypeDetectors() method.|
Regardless of how the JcrConfiguration is loaded, it can also be stored to a file or stream in an XML format that can then be reloaded in the future to recreate the configuration. This makes it very easy to programmatically generate a configuration file once while being able to load that same configuration at a later time (or on a different instance).
This will create a file at pathToFile that contains the current configuration in XML format. Any changes made after the most recent call to the save() method on the JcrConfiguration object will not be saved in the configuration repository, and thus will not be in the generated file. The generated XML will not be formatted, so it may be a bit hard to read. (Any good XML editor will be able to format it for readability.)
So far, we've seen how to load a configuration from a file, and how to programmatically define a configuration and write it out to a file. In this section, we'll see how ModeShape can load its configuration from another repository.
|This really is a very advanced way to define your configuration, so this is recommended only for those that are already very comfortable with ModeShape and its lower-level graph API and connector API.|
The first step is to create and configure the RepositorySource instance that we'll use to access the repository where the configuration is stored. Then, create a JcrConfiguration instance and load from this source:
The loadFrom(...) method can be called any number of times, but each time it is called it completely wipes out any current notion of the configuration and replaces it with the configuration found in the source.
There is an optional second parameter that defines the name of the workspace in the supplied source where the configuration content can be found. It is not needed if the workspace is the source's default workspace. There is an optional third parameter that defines the Path within the configuration repository identifying the parent node of the various configuration nodes. If not specified, it assumes "/". This makes it possible for the configuration content to be located at a different location in the hierarchical structure. (This is not often required, but it is very useful if your ModeShape configuration file is embedded within another XML file.)
Once the JcrConfiguration has been loaded from a RepositorySource, the JcrConfiguration instance can be used to modify the configuration and then save those changes back to the repository. This technique can be used to place a configuration into a repository (such as a database) for the first time:
Now you can load this configuration in multiple processes, using the approach mentioned above.
ModeShape JCR repositories have a number of behaviors that can be controlled from within the configuration. These are known as repository options, and all have sensible defaults. However, they do allow you to better configure the JCR repository instances to best suit your needs.
As mentioned earlier, these options can be set programmatically or within the configuration file. When setting up the configuration programmatically, the actual enum literal values must be used, and all values are String literals:
When using a configuration file, you set the option within the "mode:options" fragment under the "mode:repository" section. Each option fragment typically looks something like this:
where the "jcr:name" XML attribute value contains the lower-camel-case form of the option literal, and the "mode:value" XML attribute value contains the repository option value. In the example above, the "jaasLoginConfigName" is the option name, and "modeshape-jcr" is the option value. An alternative representation is to set the name using the XML element name and set the primary type with an XML attribute. Thus, this fragment is equivalent to the previous listing:
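The naming convention above (the configuration file uses the lower-camel-case form of the enumeration literal) can be sketched as a small conversion routine. This is an illustration of the convention only, not the code ModeShape uses; the class and method names are assumptions.

```java
import java.util.Locale;

final class OptionNameSketch {
    /** Converts an upper-case, underscore-separated literal into lower camel case. */
    static String toLowerCamelCase(String enumLiteral) {
        StringBuilder sb = new StringBuilder();
        for (String word : enumLiteral.toLowerCase(Locale.ROOT).split("_")) {
            if (word.isEmpty()) {
                continue; // tolerate doubled or leading underscores
            }
            if (sb.length() == 0) {
                sb.append(word); // first word stays lower case
            } else {
                sb.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLowerCamelCase("JAAS_LOGIN_CONFIG_NAME")); // prints "jaasLoginConfigName"
        System.out.println(toLowerCamelCase("QUERY_INDEX_DIRECTORY"));  // prints "queryIndexDirectory"
    }
}
```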
The following table describes all of the current repository options.
|jaasLoginConfigName||The JAAS application configuration name that specifies which login module should be used to validate credentials. By default, "modeshape-jcr" is used. Set the option with an empty (zero-length) value to completely turn off JAAS authentication (see the Built-In Providers section for details). The enumeration literal is Option.JAAS_LOGIN_CONFIG_NAME|
|systemSourceName||The name of the source (and optionally the workspace in the source) where the "/jcr:system" branch should be stored. The format is "name of workspace@name of source", or simply "name of source" if the default workspace is to be used. If this option is not used, a transient in-memory source will be used. Note that all leading and trailing whitespace is removed from both the source name and workspace name. Thus, a value of "@" implies a zero-length workspace name and zero-length source name. Also, any use of the '@' character in source and workspace names must be escaped with a preceding backslash. The enumeration literal is Option.SYSTEM_SOURCE_NAME|
|anonymousUserRoles||A comma-delimited list of default roles provided for anonymous access. A null or empty value for this option means that anonymous access is disabled. The enumeration literal is Option.ANONYMOUS_USER_ROLES|
|exposeWorkspaceNamesInDescriptor||A boolean flag that indicates whether a complete list of workspace names should be exposed in the custom repository descriptor "org.modeshape.jcr.api.Repository.REPOSITORY_WORKSPACES". If this option is set to true, then any code that can access the repository can retrieve a complete list of workspace names through the javax.jcr.Repository.getDescriptor(String) method without logging in. The default value is 'true', meaning that the descriptor is populated. Since some ModeShape installations may consider the list of workspace names to be restricted information and limit the ability of some or all users to see a complete list of workspace names, this option can be set to "false" to disable this capability. If this option is set to "false", the "org.modeshape.jcr.api.Repository.REPOSITORY_WORKSPACES" descriptor will not be set. The enumeration literal is Option.EXPOSE_WORKSPACE_NAMES_IN_DESCRIPTOR|
|repositoryJndiLocation||A string property that, when specified, tells the JcrEngine where to put the Repository in JNDI. This assumes that you have write access to the JNDI tree. If no value is set, then the Repository will not be bound to JNDI. The enumeration literal is Option.REPOSITORY_JNDI_LOCATION|
|queryExecutionEnabled||A boolean flag that specifies whether this repository is expected to execute searches and queries. If client applications will never perform searches or queries, then maintaining the query indexes is an unnecessary overhead, and can be disabled. Note that this is merely a hint, and that searches and queries might still work when this is set to 'false'. The default is 'true', meaning that clients can execute searches and queries. The enumeration literal is Option.QUERY_EXECUTION_ENABLED|
|queryIndexDirectory||The system may maintain a set of indexes that improve the performance of searching and querying the content. The size of these indexes depends upon the size of the content being stored, and thus may consume a significant amount of space. This option defines a location on the file system where this repository may (if needed) store indexes so they don't consume large amounts of memory. If specified, the value must be a valid path to a writable directory on the file system. If the path specifies a nonexistent location, the repository may attempt to create the missing directories. The path may be absolute or relative to the location where this VM was started. If the specified location is not a readable and writable directory (or cannot be created as such), then this will generate an exception when the repository is created. The default value is null, meaning the search indexes may not be stored on the local file system and, if needed, will be stored within memory. The enumeration literal is Option.QUERY_INDEX_DIRECTORY|
|queryIndexesUpdatedSynchronously||An advanced boolean flag that specifies whether updates to the indexes (if used) should be made synchronously, meaning that a call to Session.save() will not return until the search indexes have been completely updated. The benefit of synchronous updates is that a search or query performed immediately after a save() will operate upon content that was just changed. The downside is that the save() operation will take longer. With asynchronous updates, however, the only work done during a save() invocation is that required to persist the changes in the underlying repository source, while changes to the search indexes are made in a different thread that may not run immediately. In this case, there may be an indeterminate lag before searching or querying after a save() will operate upon the changed content. The default value is 'false', meaning the updates are performed asynchronously. The enumeration literal is Option.QUERY_INDEXES_UPDATED_SYNCHRONOUSLY|
|queryIndexesRebuiltSynchronously||An advanced boolean flag that specifies whether the indexes should be rebuilt synchronously when the repository restarts. If this flag is set to 'true', query indexes for each workspace in the repository will be rebuilt synchronously the first time that the repository is accessed (e.g., at the first login). If this flag is set to 'false', the query indexes for each workspace in the repository will be rebuilt asynchronously. Rebuilding the indexes synchronously can cause very significant latency in the initial repository access if the repository contains a significant amount of content that must be reindexed. Updating the indexes asynchronously eliminates this latency, but repository queries may generate inconsistent results while the indexes are being updated. That is, query results may refer to content that is no longer in the repository or may fail to include appropriate results for nodes that had been added to the repository. The default value is 'true', meaning the rebuilds are performed synchronously. The enumeration literal is Option.QUERY_INDEXES_REBUILT_SYNCHRONOUSLY|
|rebuildQueryIndexOnStartup|| An advanced setting that specifies the strategy used to determine which query indexes need to be rebuilt when the repository restarts. ModeShape currently supports two strategies:
|projectNodeTypes||An advanced boolean flag that defines whether or not the node types should be exposed as content under the "/jcr:system/jcr:nodeTypes" node. Value is either "true" or "false" (default). The enumeration literal is Option.PROJECT_NODE_TYPES|
|readDepth||An advanced integer flag that specifies the depth of the subgraphs that should be loaded from the connectors during normal read operations. The default value is 1. The enumeration literal is Option.READ_DEPTH|
|indexReadDepth||An advanced integer flag that specifies the depth of the subgraphs that should be loaded from the connectors during indexing operations. The default value is 4. The enumeration literal is Option.INDEX_READ_DEPTH|
|tablesIncludeColumnsForInheritedProperties||An advanced boolean flag that dictates whether the property definitions inherited from supertypes should be represented in the corresponding queryable table with columns. The JCR specification gives implementations some flexibility, so ModeShape allows this to be controlled. When this option is set to "false", then each table has only those columns representing the (single-valued) property definitions explicitly defined by the node type. When this option is set to "true" (the default), each table will contain columns for each of the (single-valued) property definitions explicitly defined on the node type and inherited by the node type from all of the supertypes. The enumeration literal is Option.TABLES_INCLUDE_COLUMNS_FOR_INHERITED_PROPERTIES|
|performReferentialIntegrityChecks||An advanced boolean flag that specifies whether referential integrity checks should be performed upon Session.save(). If set to "true" (the default), referential integrity checks are performed to ensure that nodes referenced by other nodes cannot be removed. If the value is set to "false", then these referential integrity checks will not be performed when removing nodes. Many people generally discourage the use of REFERENCE properties because of the overhead and the need for referential integrity. These concerns are somewhat mitigated by the introduction in JCR 2.0 of the WEAKREFERENCE property type, which is excluded from referential integrity checks. This option is available for those cases where REFERENCE properties are not used within your content, and thus the referential integrity checks will never find violations. In these cases, you may disable these checks to slightly improve performance of delete operations. The enumeration literal is Option.PERFORM_REFERENTIAL_INTEGRITY_CHECKS|
|versionHistoryStructure|| An advanced flag that specifies the structure used to store version histories under the "/jcr:system/jcr:versionStorage" branch. The JCR 2.0 specification does not predefine any particular structure, but ModeShape supports two types:
|removeDerivedContentWithOriginal||An advanced boolean flag that dictates whether content derived from other content (e.g., that output by sequencers) should be automatically (re)moved when the content from which it was derived is (re)moved from the repository. For example, consider that a file is uploaded and sequenced, and that the content derived from the file is stored in the repository. When that file is (re)moved, this option dictates whether the derived content should also be (re)moved automatically. By default this option has a value of "true", ensuring that all derived content is deleted whenever the original content is deleted. A value of "false" will leave the derived content. The enumeration literal is Option.REMOVE_DERIVED_CONTENT_WITH_ORIGINAL|
|useAnonymousAccessOnFailedLogin||A boolean flag that indicates whether any failed, non-anonymous login attempts will automatically cause the Session to be created using the anonymous context. If anonymous logins are not enabled (with the anonymousUserRoles option), then the login will still fail. By default this option has a value of "false", ensuring that non-anonymous login attempts either succeed as the requested user or fail. The enumeration literal is Option.USE_ANONYMOUS_ACCESS_ON_FAILED_LOGIN|
|useSecurityContextCredentials||Older versions of ModeShape allowed client applications to pass in Credentials implementations that had a getSecurityContext() method that returned a SecurityContext object, which ModeShape would then use for authorization. However, since ModeShape now provides support for customized authentication and authorization modules, this is no longer needed and has been deprecated. If, however, your applications were written to use this SecurityContextCredentials implementation, then you can enable this option to turn the old behavior back on. Note, however, that this option will be removed in the next major release. Value is either "true" or "false" (default). The enumeration literal is Option.USE_SECURITY_CONTEXT_CREDENTIALS|
|Setting the useAnonymousAccessOnFailedLogin option to "true" and setting the anonymousUserRoles to a valid value means that all login attempts will succeed, but named login attempts may actually succeed in an anonymous context. You can programmatically determine which context is being used by checking the value of Session.getUserID().|
Each JCR repository contains information about the system in the "/jcr:system" area of the repository content. All of this system content applies to the whole repository (e.g., namespaces, node types, locks, versions, etc.) and therefore every session for each workspace sees the exact same "/jcr:system" content.
ModeShape implements this behavior by storing all "/jcr:system" content in a separate workspace, and then using federation to project that content into each workspace. This ensures that all workspaces see the same content, without having to duplicate the "/jcr:system" content in each workspace and ensure those copies stay in sync. Federation is better than duplication.
By default, ModeShape creates this separate system workspace in a transient, in-memory store. This works well for some simple cases, but it doesn't work when using clustering, versioning, or dynamically registering namespaces or adding or changing node types. This is because these features all rely upon changing or adding content in the "/jcr:system" area. For example, version histories are stored under "/jcr:system/jcr:versionStorage", node types under "/jcr:system/jcr:nodeTypes", and namespaces under "/jcr:system/mode:namespaces".
In these situations, it is necessary to persist the system content in a repository source, and if clustering is enabled this source needs to be accessible to all members of the cluster. Many times, the easiest approach is to simply define an extra workspace in your repository source where the system content can be stored. It's also possible to define a separate repository source with a separate workspace for each repository's system content. (Using a separate source is required when the repository is using a single repository source that can only store limited kinds of nodes, like the file system connector or Subversion connector that can only store nt:file and nt:folder nodes.)
You should always configure each ModeShape repository with a source for its system workspace by using the SYSTEM_SOURCE_NAME repository option with a value that defines the name of the source and the name of the workspace in that source where the system content should be stored, in the format:
This specifies the system content should be stored in the workspace named "workspaceName" in the "sourceName" repository source.
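To make the value format concrete, here is a minimal parsing sketch based only on the rules stated in the systemSourceName option description: the workspace name comes before the '@', surrounding whitespace is trimmed, and a literal '@' inside a name is escaped with a backslash. The class is hypothetical and is not ModeShape's parser; in particular, splitting at the first unescaped '@' is an assumption.

```java
final class SystemSourceNameSketch {
    final String workspaceName; // null means "use the source's default workspace"
    final String sourceName;

    private SystemSourceNameSketch(String workspaceName, String sourceName) {
        this.workspaceName = workspaceName;
        this.sourceName = sourceName;
    }

    /** Splits "workspace@source" (or just "source") at the first unescaped '@'. */
    static SystemSourceNameSketch parse(String value) {
        int at = -1;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '@') {
                at = i;
                break;
            }
        }
        if (at < 0) {
            return new SystemSourceNameSketch(null, unescape(value.trim()));
        }
        return new SystemSourceNameSketch(
            unescape(value.substring(0, at).trim()),
            unescape(value.substring(at + 1).trim()));
    }

    private static String unescape(String s) {
        return s.replace("\\@", "@");
    }

    public static void main(String[] args) {
        SystemSourceNameSketch parsed = parse("system@store");
        System.out.println(parsed.workspaceName + " in " + parsed.sourceName); // prints "system in store"
    }
}
```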
The system content can be stored in any repository source capable of storing any content and, in the case of clustering, that is accessible across multiple processes. For most people, this will mean a relational database. Here is an abbreviated example of an XML configuration that defines a source for the system storage (in a MySQL database) and a repository that uses it:
Of course, you can always use a separate workspace in your primary source, too:
ModeShape maintains a set of index files that are used to process queries and searches, using the Lucene search engine. By default, these indexes are kept in memory (primarily because it's easy to configure). But most production configurations should not store them in-memory but should instead store these index files on the local file system.
You can configure where each ModeShape repository stores its indexes using the "QUERY_INDEX_DIRECTORY" repository option (see JcrRepository.Option) when using the programmatic API, or the "queryIndexDirectory" repository option in a ModeShape configuration file. The value of this setting should be the absolute or relative path to the folder where the indexes should be stored. In this directory, ModeShape will store the index files for each workspace in a folder named similarly to the workspace. Note that ModeShape will dynamically create these workspace folders as required.
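Before deploying, it can help to check that the configured directory behaves the way the option expects: a relative path resolves against the directory where the JVM was started, missing directories can be created, and the result is readable and writable. The sketch below uses only java.nio.file and is not ModeShape code; the class name is hypothetical, and the one-subfolder-per-workspace layout in workspaceFolder is an assumption about the naming, not a documented guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class IndexDirectorySketch {
    /** Resolves the configured value and ensures it is a readable, writable directory. */
    static Path prepare(String configuredPath) {
        // Relative paths resolve against the JVM's working directory.
        Path dir = Paths.get(configuredPath).toAbsolutePath().normalize();
        try {
            // Creates any missing directories; fails if a regular file is in the way.
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create index directory " + dir, e);
        }
        if (!Files.isReadable(dir) || !Files.isWritable(dir)) {
            throw new IllegalStateException("Index directory is not readable and writable: " + dir);
        }
        return dir;
    }

    /** Assumed layout for illustration: one subfolder per workspace name. */
    static Path workspaceFolder(Path indexDir, String workspaceName) {
        return indexDir.resolve(workspaceName);
    }
}
```

For example, prepare("data/car_repository/indexes") returns the absolute form of that path under the current working directory, creating it if necessary.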
For example, here is part of a ModeShape configuration file that specifies these index files should be stored in the "data/car_repository/indexes" folder, relative to the folder where the JVM process was started:
ModeShape 2.6 introduced pluggable authentication and authorization modules. Several modules are included and configured out-of-the-box, but it is now possible to implement and configure customized authentication and authorization logic. This section describes how these modules work, what's there out-of-the-box, and how to implement and add your own modules.
The AuthenticationProvider interface defines a single method:
When a client calls one of the Repository's login methods, ModeShape calls the authenticate method on each of the AuthenticationProvider implementations registered with the Repository. As soon as one provider returns a non-null ExecutionContext, the caller is authenticated and ModeShape uses that ExecutionContext within the resulting Session.
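The "first provider that returns non-null wins" behavior can be sketched with a simplified model. The Provider interface below is a hypothetical stand-in, not ModeShape's AuthenticationProvider (whose method takes more parameters and returns an ExecutionContext); here the returned context is reduced to a user name.

```java
import java.util.Arrays;
import java.util.List;

final class ProviderChainSketch {
    /** Hypothetical, simplified stand-in for an authentication provider. */
    interface Provider {
        /** Returns a context (here just a user name) on success, or null to decline. */
        String authenticate(Object credentials);
    }

    /** Asks each provider in order and returns the first non-null result, or null if all decline. */
    static String login(List<Provider> providers, Object credentials) {
        for (Provider provider : providers) {
            String context = provider.authenticate(credentials);
            if (context != null) {
                return context; // later providers are not consulted
            }
        }
        return null; // no provider authenticated the caller
    }

    public static void main(String[] args) {
        List<Provider> providers = Arrays.<Provider>asList(
            c -> "alice".equals(c) ? "alice" : null, // accepts only "alice"
            c -> c == null ? "<anonymous>" : null);  // accepts missing credentials
        System.out.println(login(providers, "alice")); // prints "alice"
        System.out.println(login(providers, null));    // prints "<anonymous>"
        System.out.println(login(providers, "bob"));   // prints "null"
    }
}
```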
When the client uses the Session and attempts to perform actions on the content, ModeShape uses the ExecutionContext's SecurityContext to determine whether the user has the necessary privileges. If the SecurityContext object implements the AuthorizationProvider interface, then ModeShape will call the hasPermission(...) method, passing in the ExecutionContext, the repository name, the name of the source used for the repository, the workspace name, the path of the node upon which the actions are being applied, and the array of actions (see ModeShapePermissions for the possible values):
If the SecurityContext does not implement AuthorizationProvider, then ModeShape uses role-based authorization by mapping the actions into roles and then for each role calling the hasRole(...) method on SecurityContext. Only if all of these invocations return true will the operation be allowed to continue.
ModeShape comes with several AuthenticationProvider implementations that are automatically configured with every Repository, depending upon other settings and options. These providers are as follows:
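The all-roles-must-pass rule can be sketched as follows. The SecurityContext interface here is a one-method stand-in written for this example, and the action-to-role map is passed in by the caller because the actual mapping is not spelled out in this chapter; the action and role names used in main are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

final class RoleCheckSketch {
    /** Minimal stand-in for a security context that answers role questions. */
    interface SecurityContext {
        boolean hasRole(String roleName);
    }

    /** Permits the operation only if every requested action maps to a role the context has. */
    static boolean isPermitted(SecurityContext context, Map<String, String> roleByAction, String... actions) {
        for (String action : actions) {
            String role = roleByAction.get(action);
            if (role == null || !context.hasRole(role)) {
                return false; // a single failed check rejects the whole operation
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> roleByAction = new HashMap<>();
        roleByAction.put("read", "readonly");           // assumed mapping
        roleByAction.put("set_property", "readwrite");  // assumed mapping

        SecurityContext readOnlyUser = "readonly"::equals;
        System.out.println(isPermitted(readOnlyUser, roleByAction, "read"));                 // prints "true"
        System.out.println(isPermitted(readOnlyUser, roleByAction, "read", "set_property")); // prints "false"
    }
}
```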
- JaasProvider uses JAAS for all authentication and role-based authorization. This provider authenticates clients that log in to the Repository with a SimpleCredentials object, where the username and password match those in the JAAS policy, or with a JaasCredentials object constructed with a specific and already-authenticated JAAS LoginContext. This provider can be disabled by setting the jaasLoginConfigName configuration option to an empty (i.e., zero-length) value; otherwise, the option defines the name of the JAAS login configuration and will default to "modeshape-jcr" if not explicitly set. (This provider also works in some J2EE containers, in which the JAAS Subject is not available via the standard JAAS API and instead requires use of the JACC API, which many J2EE containers support.)
- SeamSecurityProvider delegates all authentication and role-based authorization to the Seam Security framework. This provider authenticates clients that log in to the Repository with no need to pass a Credentials object. Note this does require obtaining a session for each servlet request, which is actually how the JCR API was intended to be used within web applications. This provider is automatically enabled when the Seam Security Identity class is found on the classpath.
- ServletProvider delegates all authentication and role-based authorization to the servlet framework. This provider authenticates clients that log in to the Repository with a ServletCredentials object, which can be constructed with the HttpServletRequest. Note this does require obtaining a session for each servlet request, which is actually how the JCR API was intended to be used within web applications. This provider is automatically enabled when the HttpServletSession class is found on the classpath.
- AnonymousProvider will allow clients without Credentials to operate upon the repository, and will use role-based authorization based upon the roles defined by the anonymousUserRoles configuration option. This provider authenticates clients that provide an AnonymousCredentials to the Repository's login(...) methods or use one of the login(...) methods that does not take a Credentials object.
The SecurityContextProvider is also configured only when the useSecurityContextCredentials configuration option is set to 'true'. This provider authenticates clients that pass a SecurityContextCredentials object, and delegates all authentication to the embedded SecurityContext. This deprecated approach is not enabled by default, and will be removed in the next major release of ModeShape. It remains in place to enable applications that use this older and less attractive approach to upgrade to ModeShape 2.6 (or later) without breaking their authentication mechanism.
It is possible to provide your own authentication and authorization logic by providing one (or more) classes that implement the AuthenticationProvider interface, specifying the names of these classes in the configuration (see below), and making the classes available on the correct classpath.
Implementing the AuthenticationProvider interface is straightforward. Your class needs a no-arg constructor, and the authenticate method must authenticate the credentials for the named repository and workspace. If the credentials are not authenticated, return null. Otherwise, create an ExecutionContext instance (from the ExecutionContext supplied in the repositoryContext parameter) that contains an appropriate SecurityContext instance for the authenticated user. As mentioned above, the SecurityContext should also implement the AuthorizationProvider interface for non-role-based authorization.
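The contract described above (a no-arg constructor, optional bean properties, and an authenticate method that returns null for credentials it does not accept) can be sketched with plain Java stand-ins. Everything here is hypothetical: `Provider`, `AuthenticatedUser`, and `FixedUserProvider` are illustrative names, not ModeShape or JCR types, and the idea that several providers are consulted in order until one returns a non-null result is an assumption made only for this sketch.

```java
import java.util.List;

/** Illustrative stand-ins only; none of these types belong to ModeShape or JCR. */
class ProviderContractSketch {

    /** Hypothetical result of a successful authentication. */
    static final class AuthenticatedUser {
        final String name;

        AuthenticatedUser(String name) {
            this.name = name;
        }
    }

    /** Hypothetical provider contract: return null when the credentials are not accepted. */
    interface Provider {
        AuthenticatedUser authenticate(String user, String password,
                                       String repositoryName, String workspaceName);
    }

    /** Has a no-arg constructor and is configured through bean-style setters. */
    static final class FixedUserProvider implements Provider {
        private String user = "guest";
        private String password = "guest";

        public FixedUserProvider() {
        }

        public void setUser(String user) {
            this.user = user;
        }

        public void setPassword(String password) {
            this.password = password;
        }

        @Override
        public AuthenticatedUser authenticate(String user, String password,
                                              String repositoryName, String workspaceName) {
            // Returning null signals "not authenticated by this provider".
            if (this.user.equals(user) && this.password.equals(password)) {
                return new AuthenticatedUser(user);
            }
            return null;
        }
    }

    /** Assumed chaining rule: the first provider that returns non-null wins. */
    static AuthenticatedUser authenticate(List<Provider> providers, String user, String password,
                                          String repositoryName, String workspaceName) {
        for (Provider provider : providers) {
            AuthenticatedUser result =
                provider.authenticate(user, password, repositoryName, workspaceName);
            if (result != null) {
                return result;
            }
        }
        return null; // no provider accepted the credentials
    }
}
```

A real implementation would return an ExecutionContext carrying a SecurityContext instead of the `AuthenticatedUser` placeholder used here.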
For example, let's imagine that our JCR application has its own authentication and authorization system. We can integrate with that by creating a new Credentials implementation called MyAppCredentials to encapsulate any information needed by the authentication/authorization system, which we'll assume is accessed by a singleton class SecurityService. We can then implement AuthenticationProvider as follows:
where the MyAppSecurityContext is as follows:
Then we just need to configure the Repository to use this provider. In the ModeShape configuration files, there is an optional "mode:authenticationProviders" child element of "mode:repository", and within this fragment you can define zero or more authentication providers by specifying a name, the class, an optional description, and optionally any bean properties that should be called upon instantiation. (Note that the class will be instantiated only once per Repository instance). Here's an example configuration file:
ModeShape 2.1 introduced the ability to have a cluster of JcrEngine instances distributed across multiple processes while behaving as though everything was happening in a single process. With clusters, the workload can be distributed across multiple machines, increasing tolerance against failure while allowing ModeShape to scale out to handle more workload.
ModeShape clustering uses the powerful, flexible and mature JGroups library to handle all network communication within the cluster. JGroups provides a wealth of capabilities, including automatically detecting new engines in the cluster (called discovery), reliable multicast communication, and automatic determination of the master node in the cluster. JGroups has a flexible protocol stack, works across firewalls, WANs and LANs, and supports multiple transport protocols, failure detection, reliable unicast and multicast message transmission, and encryption.
By default, clustering is not enabled. This means that each JcrEngine instance is self-contained and will not be aware of changes made in other JcrEngine instances. This is perfect in many lightweight or embedded scenarios, because it does not introduce any overhead associated with network communication.
However, clustering ModeShape is very easy and requires only a few simple steps:
- Enable clustering in the ModeShape configuration (more on this in a bit).
- Include the modeshape-clustering module in your application, either by JAR file or Maven dependency.
- Start (or deploy) multiple JcrEngine instances using the same configuration. For embedded scenarios, this means simply instantiating multiple JcrEngine instances in multiple processes. In other cases, this means deploying ModeShape to multiple servers (using the WebDAV server, the REST server, or JNDI with your own applications).
Your JCR-based application doesn't need to change in any other way. Any EventListener implementations registered in Sessions on any of the engines will be notified of all events, regardless of whether those events were due to changes in the local or remote engines.
It also doesn't matter how many Repository instances are defined in the configuration and managed by each JcrEngine instance: each engine in the cluster can manage multiple named repositories. ModeShape ensures that all Sessions for a named repository see the changes made to that repository, regardless of where those sessions are located in the cluster. Likewise, those same changes will not be visible to the sessions for any other named repository.
A ModeShape configuration can have a "clustering" fragment that defines the name of the cluster and the JGroups configuration:
The "clusterName" is a string that is a logical name of the cluster; all engines connecting to the same name form a cluster. Any messages multicast from one engine in the cluster will be received by all other members of the cluster. Again, the cluster name is independent of the repositories managed by the engines.
The "configuration" value is a string that is one of:
- the absolute file system path to the file containing the JGroups XML configuration;
- the relative file system path to the file containing the JGroups XML configuration, relative to the current working directory of the Java process;
- the name of a resource on the classpath containing the JGroups XML configuration;
- the URL that can be resolved to the JGroups XML configuration; or
- the string representation of JGroups configuration, either in XML format or the older string format.
The format of this JGroups configuration will be described in the next section. If the "configuration" property is not given, ModeShape will use the default JGroups configuration (as defined by the specific JGroups version).
|Note that all engines in the cluster must have the same JGroups configuration. In fact, all engines in the cluster will almost always have exactly the same ModeShape configuration.|
Here is an example of a "clustering" fragment defining a cluster named "modeshape-cluster" using the JGroups configuration defined in the "jgroups-modeshape.xml" file at the supplied URL:
This next example uses the JGroups configuration defined in the "jgroups-modeshape.xml" resource file on the classpath (or as an absolute path on a *nix system):
Next is an example that specifies the JGroups configuration using the older string representation of the form:
Of course, the "configuration" property can be specified as a child element, too (line breaks added for readability):
And finally an example that specifies the JGroups configuration using the newer XML representation (line breaks added for readability):
Note that this example uses a child XML element for the "configuration", along with a CDATA section, so that the XML configuration can be nested within the ModeShape configuration.
|Remember to specify the system workspace name for each repository that is clustered.|
The JGroups configuration defines a protocol stack that is used for messaging, starting with the bottom-most protocol and ending with the top-most protocol.
An example of the newer-style JGroups XML format is:
The older-style JGroups string format is of the form:
This format is harder to read and is generally discouraged. Nevertheless, here's an example of the older string format defining the same stack as the previous XML example (line breaks have been added for readability):
For more details on how to configure the JGroups stack, see the JGroups Manual.
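To make the older string format more concrete, here is a minimal sketch of a parser for it. It assumes the commonly seen shape `PROTOCOL(name=value;name=value):PROTOCOL(...)`, with protocols separated by colons and listed bottom-most first. The class name `JGroupsStackStringSketch` is hypothetical, and this is neither JGroups' nor ModeShape's own parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits an old-style JGroups stack string into its protocols. */
class JGroupsStackStringSketch {

    /** Returns protocol name -> properties, in the order written (bottom-most first). */
    static Map<String, Map<String, String>> parse(String stack) {
        Map<String, Map<String, String>> protocols = new LinkedHashMap<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i <= stack.length(); i++) {
            // Treat the end of the string as one final separator.
            char c = (i < stack.length()) ? stack.charAt(i) : ':';
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ':' && depth == 0) {
                // Only colons outside parentheses separate protocols.
                String spec = stack.substring(start, i).trim();
                start = i + 1;
                if (!spec.isEmpty()) {
                    addProtocol(spec, protocols);
                }
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unbalanced parentheses in: " + stack);
        }
        return protocols;
    }

    private static void addProtocol(String spec, Map<String, Map<String, String>> protocols) {
        int open = spec.indexOf('(');
        String name = (open < 0) ? spec : spec.substring(0, open).trim();
        Map<String, String> properties = new LinkedHashMap<>();
        if (open >= 0) {
            int close = spec.lastIndexOf(')');
            String body = spec.substring(open + 1, close);
            for (String pair : body.split(";")) {
                if (pair.trim().isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Expected name=value but found: " + pair);
                }
                properties.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        // Note: a protocol listed twice would overwrite its earlier entry in this sketch.
        protocols.put(name, properties);
    }
}
```

For instance, parsing `UDP(mcast_addr=228.10.10.10;mcast_port=45588):PING(timeout=2000):FD` (sample values chosen only for illustration) yields three entries in the order UDP, PING, FD, which mirrors the bottom-to-top ordering of the stack.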
|JGroups is also used in Infinispan, JBoss AS, and other open source projects, and many of the JGroups configurations will work with ModeShape deployed in those same environments. For example, this blog post describes how to configure JGroups with three autodiscovery options available on Amazon EC2.|
Sometimes your applications can simply define a configuration file and use the RepositoryFactory to access its repositories. This is very straightforward, and this is useful for many simple applications because the application will then own the ModeShape instance(s).
Web applications are a different story. Often, you would rather your web application not contain the code that initializes the JCR repository, but instead configure ModeShape as a central, shared service that all of your web applications can simply reference and use.
Unfortunately, there is no single way to deploy ModeShape into any web or application server, since they all have slightly different deployment and configuration techniques. The remainder of this section will talk about how to deploy ModeShape to two popular open source servers.
The JBoss Application Server (or JBoss AS) is a very popular open source Java application server, with an extremely healthy and active community. ModeShape offers a way to deploy ModeShape into JBoss AS versions 5.x or 6.x as a central, shared service that can be monitored and administered using the embedded console.
ModeShape provides a downloadable ZIP file that can be unzipped into any JBoss AS profile. When you do this, that profile will contain all the files necessary for ModeShape to run when the server is started. The default configuration is for a single, in-memory repository with two users. However, for anything beyond basic experimentation, you will want to edit the configuration files to define a more robust, persistent and secure configuration.
There are two distribution ZIP files, one for JBoss AS 5.x and another for JBoss AS 6.x, and both files contain several components:
- JAR files for the JCR 2.0 API and ModeShape's small extensions to the JCR API on the global classpath (that is, in the "lib/" directory). These APIs are available to all deployed applications, services and components. The JCR API contains the "javax.jcr" packages and has no other dependencies. ModeShape's extensions define interfaces in the "org.modeshape.jcr.api" packages; these extend a few of the standard JCR API interfaces and add several methods to make them more useful.
- The ModeShape Service, represented as an exploded JAR file in the "deploy" directory. This is where the JcrEngine is running, though any application (or other JBoss service) can access its JCR Repository instances using the standard RepositoryFactory approach (covered in the next chapter) with JNDI URLs:
By default, there is a single in-memory repository named "repository", but this can be changed by simply editing the "deploy/modeshape-services.jar/managedConfigRepository.xml" configuration file. All of ModeShape's standard sequencers and connectors (and JARs for their dependencies) are included, meaning they can be configured for use without worrying about adding JARs to the classpath. Feel free to remove any of the JARs that are not needed for your custom configuration.
- A pair of JAAS properties files, located in the "conf/props/" directory, that come out of the box with an "admin" user (with password "admin") that has full read, write, and administration privileges, and a "guest" user (with password "guest") that has only read and write privileges. Simply edit these files to change users, passwords, and roles, or to configure JAAS differently.
- The ModeShape RESTful API, represented as an exploded WAR file in the "deploy" directory. This allows remote applications to interact with ModeShape to access and manipulate repository content using a RESTful API that uses JSON in the requests and responses. All ModeShape repositories can be accessed, and authentication is done using the ModeShape JAAS configuration.
- The ModeShape WebDAV API, represented as an exploded WAR file in the "deploy" directory. This web application allows external clients to access and manipulate the content in the ModeShape repositories using the standard WebDAV protocol. For example, you can mount a repository (or parts of it) as a network drive on most operating systems, and then upload or download files and folders using standard OS operations and graphical tools. All ModeShape repositories can be accessed, and authentication is done using the ModeShape JAAS configuration.
- A plugin for the embedded JBoss AS console, represented as a WAR file in the "deploy" directory. This plugin also works with the RHQ administration, monitoring, alerting, operational control and configuration system. (We plan to add more metrics and operations over the next few releases, as we gain more experience using the ModeShape RHQ plugin.)
- A JDBC driver that allows applications also deployed on the same JBoss AS instance to query the repositories through JDBC. This driver is on the global classpath so it can be used in any deployed component. A single JDBC DataSource is also configured in the "deploy/modeshape-services.jar/modeshape-jdbc-ds.xml" file to use the single default in-memory repository available out of the box. Simply edit this file to add or change the DataSource definitions. The driver can also be used in a separate JVM to issue queries and access database metadata.
- A remote client JAR that can be used by Java applications to use JDBC or the RESTful API to remotely access a ModeShape repository deployed on JBoss AS. This JAR includes ModeShape's full JDBC driver.
Here are the contents of this file:
Your web application or JBoss service can use one of the JCR Repository instances running inside the ModeShape service by simply using the RepositoryFactory technique described earlier, with a URL such as:
Be sure to use the correct repository name.
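As a rough illustration of how such a URL and its lookup parameters might be assembled, consider the sketch below. Two things in it are assumptions rather than facts taken from this chapter: the parameter key `org.modeshape.jcr.URL` and the URL shape `jndi:<jndi-name>?repositoryName=<name>`. Verify both against the ModeShape version you deploy. The class name `RepositoryUrlSketch` is hypothetical.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper that prepares the parameters for a RepositoryFactory lookup. */
class RepositoryUrlSketch {

    // Assumption: the parameter key under which ModeShape 2.x expects the repository URL.
    static final String URL_PARAMETER = "org.modeshape.jcr.URL";

    /** Assumed URL shape: "jndi:<jndi-name>?repositoryName=<name>" (no escaping is applied). */
    static String jndiUrl(String jndiName, String repositoryName) {
        if (jndiName == null || jndiName.isEmpty()) {
            throw new IllegalArgumentException("jndiName is required");
        }
        if (repositoryName == null || repositoryName.isEmpty()) {
            throw new IllegalArgumentException("repositoryName is required");
        }
        return "jndi:" + jndiName + "?repositoryName=" + repositoryName;
    }

    /** Builds the single-entry map that would be passed to the factory. */
    static Map<String, String> parameters(String jndiName, String repositoryName) {
        return Collections.singletonMap(URL_PARAMETER, jndiUrl(jndiName, repositoryName));
    }
}
```

With the default repository, `RepositoryUrlSketch.parameters("jcr/local", "repository")` produces a map whose only value is `jndi:jcr/local?repositoryName=repository`. In a JCR 2.0 application that map would then be offered to each `javax.jcr.RepositoryFactory` discovered through `java.util.ServiceLoader`, keeping the first non-null `Repository` returned by `getRepository(parameters)`.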
Since the JCR API JAR is on the global classpath, your web application can use the JCR API without having to include the JAR file in your application's WAR file. In fact, your application will likely get ClassCastExceptions if it does include the JCR API in its WAR file. Plus, if needed, your application can use ModeShape's "org.modeshape.jcr.api" extensions to the JCR API (again, on the global classpath), and should not need or use any of the classes or interfaces in the ModeShape implementation.
Each kind of web server or application server is different, but all servlet containers do provide a way of configuring objects and placing them into JNDI. ModeShape provides a JndiRepositoryFactory class that implements javax.naming.spi.ObjectFactory and that can be used in the server's configuration. The JndiRepositoryFactory accepts two properties:
- configFile is required and specifies the path to the configuration file resource, which must be available on the classpath
- repositoryName is optional and specifies the name of a JCR repository that exists in the JCR configuration that should be registered in JNDI; if not provided, then the ModeShape engine will be registered in JNDI at the specified location.
Here's an example of a fragment of the conf/context.xml for Tomcat that registers the ModeShape engine in JNDI at "jcr/local":
The web application can then use the newer pattern specified by the JCR 2.0 specification to use the ServiceLoader and RepositoryFactory:
Alternatively, it's possible to use this JndiRepositoryFactory class to start up ModeShape and register an individual JCR Repository instance. Here's an example of a fragment of the conf/context.xml for Tomcat that registers the "Cars" repository in JNDI at "jcr/local/Cars":
The web application can then directly lookup the Repository instance in JNDI, as recommended in the older JCR 1.0 specification:
or via the newer pattern using the JCR 2.0 RepositoryFactory-style lookup approach:
Before the server can start, however, all of the ModeShape jars need to be placed on the classpath for the server. JAAS also needs to be configured, and this can be done using the application server's configuration or in your web application if you're using a simple servlet container. For more details, see the Reference Guide.
Then, your web application needs to reference the Resource and state its requirements in its web.xml:
Note that the value of the resource-env-ref-name field must match the value of the name attribute on the <Resource> tag in the context.xml described above.
At this point, your web application can perform the lookup of the Repository object by using JNDI directly (or the more standard RepositoryFactory technique shown in the next chapter), create and use a Session, and then close the Session. Here's an example of a JSP page that does this:
Since this uses a servlet container, there is no JAAS implementation configured, so note the loading of IDTrust to create the JAAS realm. (To make this work in Tomcat, the security folder that contains the jaas.conf.xml, users.properties, and roles.properties needs to be moved into the %CATALINA_HOME% directory.)
|If you deploy your application to JBoss AS or EAP and deploy ModeShape as a service, your application doesn't have to do anything with JAAS, since that's provided by the platform.|
Before you deploy ModeShape into your application or its environment, you need to make sure that all of the ModeShape JARs are on the appropriate classpath. Two different scenarios are covered in this section: Maven-based, and using JARs with the traditional classpath.
By far the easiest way to use ModeShape is to use Maven, because with just a few lines of code, Maven will automatically pull all the JARs and source for all of the ModeShape libraries as well as everything those libraries need. All of ModeShape's artifacts for each release are published in the new JBoss Maven repository under the "org.modeshape" group ID.
The JBoss Maven repository not only contains all of the artifacts for ModeShape and other open source projects hosted at JBoss.org, but it also proxies quite a few other repositories that contain many other third-party libraries.
So if you're using Maven (or Ivy), first make sure your project knows about this new JBoss Maven repository. One way to do this is to add the following to your project POM (you'll still likely want to use other Maven repositories for third-party artifacts):
Or, you can add this information to your ~/.m2/settings.xml file. For more information, see the JBoss wiki page.
Then, simply modify your project's POM by adding dependencies on the ModeShape JCR library:
This adds only the minimal libraries required to use ModeShape. If your application is going to use clustering, you'll need to also depend upon the clustering module:
You also need to add dependencies for each of the connectors and sequencers you want to use. Here is the list of available sequencers:
Here is the list of available connectors:
The sequencer and connector libraries you choose, plus every third-party library they need, will be pulled in automatically by Maven into your project.
ModeShape is designed to use the same logging framework as your application, and it uses SLF4J to accomplish this. In other words, ModeShape depends upon the SLF4J API library, but requires you to provide a logging implementation as well as the appropriate SLF4J binding JAR.
For example, if your application is using Log4J, your application will already have a dependency on it. So that ModeShape log messages are sent to the same logging system used in your application, you need to add a dependency on the SLF4J-to-Log4J binding JAR:
Of course, SLF4J works with other logging frameworks, too. Some logging implementations (such as LogBack) implement the SLF4J API natively, meaning they require no binding JAR. For details on the options and how to configure them, see the SLF4J manual.
If your application doesn't use Maven, you'll need to obtain the ModeShape JARs and place them onto your application's classpath. ModeShape provides a single download with all of the JARs for all ModeShape components and all dependencies. This file contains the following:
- modeshape-jcr-2.8.0.Final-jar-with-dependencies.jar contains all of the classes (except those under javax.jcr) necessary to run the core ModeShape JCR repository engine using the in-memory connector and the federating connector;
- one modeshape-connector-<type>-2.8.0.Final-jar-with-dependencies.jar for each type of connector, each containing all of the classes necessary for that connector, designed to be added to the classpath after the modeshape-jcr-2.8.0.Final-jar-with-dependencies.jar file;
- one modeshape-sequencer-<type>-2.8.0.Final-jar-with-dependencies.jar for each type of sequencer, each containing all of the classes necessary for that sequencer, designed to be added to the classpath after the modeshape-jcr-2.8.0.Final-jar-with-dependencies.jar file;
- modeshape-mimetype-detector-aperture-2.8.0.Final-jar-with-dependencies.jar containing all of the classes necessary for detecting the MIME type of files based upon their name and/or content, designed to be added to the classpath after the modeshape-jcr-2.8.0.Final-jar-with-dependencies.jar file.
Note that the core engine is required in all configurations. The jcr-2.0.jar file is not included and must be provided by you. And, as mentioned in the previous section, ModeShape uses SLF4J for logging and you must provide a logging implementation as well as the appropriate SLF4J binding JAR.
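The ordering described above (core engine first, then the connector and sequencer JARs) can be sketched as a small helper that assembles a classpath string. The class name `ModeShapeClasspathSketch` is hypothetical, the connector and sequencer type names passed in are whatever your download actually contains, and the position of jcr-2.0.jar at the front is an arbitrary choice for this sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the JAR file names in the order described above. */
class ModeShapeClasspathSketch {

    private static final String SUFFIX = "-jar-with-dependencies.jar";

    static String classpath(String version, List<String> connectorTypes,
                            List<String> sequencerTypes) {
        List<String> jars = new ArrayList<>();
        // Not part of the ModeShape download; you supply the JCR API JAR yourself.
        jars.add("jcr-2.0.jar");
        // The core engine is required in every configuration and comes before the extensions.
        jars.add("modeshape-jcr-" + version + SUFFIX);
        for (String type : connectorTypes) {
            jars.add("modeshape-connector-" + type + "-" + version + SUFFIX);
        }
        for (String type : sequencerTypes) {
            jars.add("modeshape-sequencer-" + type + "-" + version + SUFFIX);
        }
        // A logging implementation and its SLF4J binding JAR still have to be added separately.
        return String.join(File.pathSeparator, jars);
    }
}
```

Calling `classpath("2.8.0.Final", ...)` with one connector type and one sequencer type returns four entries joined by the platform's path separator, ready to pass to `java -cp`.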
This chapter outlines how you configure ModeShape, how to deploy ModeShape into your application, and how to set up your application's environment with the required ModeShape JARs. The next chapter talks about how your application can use the JCR API to access ModeShape repositories.