
The various components of ModeShape are designed as Plain Old Java Objects (POJOs). Rather than making assumptions about its environment, each component requires that any external dependencies it needs to operate be supplied to it. This pattern is known as Dependency Injection; it keeps the components simpler and allows a great deal of flexibility and customization in how they are configured.

The approach that ModeShape takes is simple: a single POJO represents everything about the environment in which components operate. Called ExecutionContext, it contains references to most of the essential facilities, including: security (authentication and authorization); namespace registry; name factories; factories for properties and property values; logging; and access to class loaders (given a classpath). Most of the ModeShape components require an ExecutionContext and thus have access to all these facilities.

The ExecutionContext is a concrete class that is instantiated with its no-argument constructor, for example: ExecutionContext context = new ExecutionContext();

The fact that so many of the ModeShape components take ExecutionContext instances gives us some interesting possibilities. For example, one execution context instance can be used as the highest-level (or "application-level") context for all of the services (e.g., RepositoryService, SequencingService, etc.). Then, an execution context could be created for each user that will be performing operations, and that user's context can be passed around to not only provide security information about the user but also to allow the activities being performed to be recorded for user feedback, monitoring and/or auditing purposes.

As mentioned above, the starting point is to create a default execution context with the no-argument constructor; it will have all the default components.

Once you have this top-level context, you can start creating subcontexts with different components, and different security contexts. (Of course, you can create a subcontext from any instance.) To create a subcontext, simply use one of the with(...) methods on the parent context. We'll show examples later on in this chapter.
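The subcontext idea can be illustrated without ModeShape on the classpath. The following is only a minimal sketch: SketchContext, its two fields, and its with(...) overloads are hypothetical stand-ins invented for this example, not the real ExecutionContext API. It shows the pattern described above, where a with(...) call leaves the parent untouched and returns a new context that differs in one component.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the real ModeShape class: an immutable context
// whose with(...) methods return a new subcontext instead of mutating the parent.
final class SketchContext {
    private final String userName;                 // null means "no authenticated user"
    private final Map<String, String> namespaces;  // prefix -> namespace URI

    // The no-argument constructor creates a context with default components.
    SketchContext() {
        this(null, Collections.<String, String>emptyMap());
    }

    private SketchContext(String userName, Map<String, String> namespaces) {
        this.userName = userName;
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.namespaces = Collections.unmodifiableMap(new HashMap<>(namespaces));
    }

    // Returns a subcontext that differs from this one only in its user.
    SketchContext with(String newUserName) {
        return new SketchContext(newUserName, namespaces);
    }

    // Returns a subcontext that differs from this one only in its namespaces.
    SketchContext with(Map<String, String> newNamespaces) {
        return new SketchContext(userName, newNamespaces);
    }

    String getUserName() {
        return userName;
    }

    Map<String, String> getNamespaces() {
        return namespaces;
    }
}
```

Because every context is immutable, the application-level context can be shared freely while each user works with a separate subcontext derived from it.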

Security

ModeShape uses a simple abstraction layer to isolate it from the security infrastructure used within an application. A SecurityContext represents the context of an authenticated user, and is defined as an interface:

Every ExecutionContext has a SecurityContext instance, though the top-level (default) execution context does not represent an authenticated user. But you can create a subcontext for a user authenticated via JAAS:

In the case of JAAS, you might not have the password but would rather prompt the user. In that case, simply create a subcontext with a different security context:

Of course if your application has a non-JAAS authentication and authorization system, you can simply provide your own implementation of SecurityContext:

These ExecutionContexts then represent the authenticated user in any component that uses the context.
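As a rough illustration of a custom, non-JAAS implementation, here is a self-contained sketch. The SecurityContext interface declared in it is a simplified stand-in written for this example; its method names (getUserName, hasRole, logout) are assumptions and should be checked against the actual ModeShape interface. InMemorySecurityContext is likewise hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the interface described above; the method names
// are assumptions and may not match the real ModeShape interface.
interface SecurityContext {
    String getUserName();

    boolean hasRole(String roleName);

    void logout();
}

// A custom implementation for an application-specific (non-JAAS) security system.
final class InMemorySecurityContext implements SecurityContext {
    private final String userName;
    private final Set<String> roles;
    private volatile boolean loggedIn = true;

    InMemorySecurityContext(String userName, Set<String> roles) {
        this.userName = userName;
        this.roles = Collections.unmodifiableSet(new HashSet<>(roles));
    }

    @Override
    public String getUserName() {
        // No user is reported once the context has been logged out.
        return loggedIn ? userName : null;
    }

    @Override
    public boolean hasRole(String roleName) {
        // After logout the context no longer grants any role.
        return loggedIn && roles.contains(roleName);
    }

    @Override
    public void logout() {
        loggedIn = false;
    }
}
```

An instance of such a class would then be supplied when creating the subcontext for that user.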

JAAS

One of the SecurityContext implementations provided by ModeShape is the JaasSecurityContext, which delegates any authentication or authorization requests to a Java Authentication and Authorization Service (JAAS) provider. This is the standard approach for authenticating and authorizing in Java.

There are quite a few JAAS providers available, but one of the best and most powerful providers is JBoss Security, the open source security framework used by JBoss. JBoss Security offers a number of JAAS login modules, including:

  • User-Roles Login Module is a simple javax.security.auth.spi.LoginModule implementation that uses usernames and passwords stored in a properties file.
  • Client Login Module prompts the user for their username and password.
  • Database Server Login Module uses a JDBC database to authenticate principals and associate them with roles.
  • LDAP Login Module uses an LDAP directory to authenticate principals. Two implementations are available.
  • Certificate Login Module authenticates using X509 certificates, obtaining roles from either property files or a JDBC database.
  • Operating System Login Module authenticates using the operating system's mechanism.

There are many others as well. JBoss Security also provides other capabilities, such as using XACML policies or using federated single sign-on. For more detail, see the JBoss Security project.

User-Roles Configuration

When assigning users to roles via properties files (e.g., when using the User-Roles Login Module), each line contains the comma-separated list of roles for a particular user, and is of the form:

<username>=<role>[,<role>,...] where:

  • <username> is the name of the user,
  • <role> is an expression describing a role for the user and which adheres to the format

<role>=<roleName>[.<sourceName>[.<workspaceName>]] where:

  • <roleName> is one of "admin", "readonly", "readwrite", or (for WebDAV and RESTful access) "connect"
  • <sourceName> is the name of the repository source to which the role is granted; if absent, the role will be granted for all repository sources
  • <workspaceName> is the name of the repository workspace to which the role is granted; if absent, the role will be granted for all workspaces in the repository

For example, the following line provides all roles to user 'jsmith' for all workspaces in all repositories:

jsmith=admin,connect,readonly,readwrite

while the following line provides connect and read access to all repositories, but write access only to the workspaces in the 'customers' repository:

jsmith=connect,readonly,readwrite.customers

When using the JBoss AS kit, the properties file handling these assignments is: modeshape-roles.properties.
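To make the role format concrete, the sketch below parses a line of the form described above into its parts. RoleGrant and its methods are hypothetical helpers written for this illustration; they are not part of ModeShape or JBoss Security. One assumption is worth stating: the expression is split on at most two dots, so a source name that itself contains a dot would be misread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ModeShape: models one role expression of
// the form <roleName>[.<sourceName>[.<workspaceName>]].
final class RoleGrant {
    final String roleName;
    final String sourceName;    // null = all repository sources
    final String workspaceName; // null = all workspaces in the repository

    private RoleGrant(String roleName, String sourceName, String workspaceName) {
        this.roleName = roleName;
        this.sourceName = sourceName;
        this.workspaceName = workspaceName;
    }

    // Parses a single role expression such as "readwrite.customers".
    static RoleGrant parse(String expression) {
        // Assumption: split on at most two dots; any further dots stay in the workspace name.
        String[] parts = expression.trim().split("\\.", 3);
        return new RoleGrant(parts[0],
                             parts.length > 1 ? parts[1] : null,
                             parts.length > 2 ? parts[2] : null);
    }

    // Parses a full line of the form <username>=<role>[,<role>,...] and returns its grants.
    static List<RoleGrant> parseLine(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Expected <username>=<roles>: " + line);
        }
        List<RoleGrant> grants = new ArrayList<>();
        for (String role : line.substring(eq + 1).split(",")) {
            if (!role.trim().isEmpty()) {
                grants.add(parse(role));
            }
        }
        return grants;
    }

    // True if this grant covers the given repository source and workspace.
    boolean appliesTo(String source, String workspace) {
        return (sourceName == null || sourceName.equals(source))
            && (workspaceName == null || workspaceName.equals(workspace));
    }
}
```

For the second example line above, the "readwrite" grant carries the source name "customers" and no workspace name, so it applies to every workspace of that one repository.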

Web application security

If ModeShape is being used within a web application, then it is probably desirable to reuse the security infrastructure of the application server. This can be accomplished by implementing the SecurityContext interface with an implementation that delegates to the HttpServletRequest. Then, for each request, create a SecurityContextCredentials instance around your SecurityContext, and use those credentials to obtain a JCR Session.

Here is an example of the SecurityContext implementation that uses the servlet request:

Then use this to create a Session:

We'll see later in the JCR chapter how this can be used to obtain a JCR Session for the authenticated user.
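The delegation approach can be sketched as follows. To keep the example free of servlet and ModeShape dependencies, it declares two stand-ins: RequestView, a hypothetical interface that mirrors only the getUserPrincipal() and isUserInRole(String) methods of HttpServletRequest, and a simplified SecurityContext whose method names are assumptions. Wrapping the result in SecurityContextCredentials and logging in to the JCR repository is not shown.

```java
import java.security.Principal;

// Hypothetical stand-in that mirrors two methods of HttpServletRequest.
interface RequestView {
    Principal getUserPrincipal();

    boolean isUserInRole(String role);
}

// Simplified stand-in; the method names are assumptions about the real interface.
interface SecurityContext {
    String getUserName();

    boolean hasRole(String roleName);

    void logout();
}

// Delegates role checks to the container-managed security of the current request.
final class RequestSecurityContext implements SecurityContext {
    private final String userName;
    private RequestView request; // cleared on logout

    RequestSecurityContext(RequestView request) {
        Principal principal = request.getUserPrincipal();
        // An unauthenticated request has no principal, hence no user name.
        this.userName = principal != null ? principal.getName() : null;
        this.request = request;
    }

    @Override
    public String getUserName() {
        return userName;
    }

    @Override
    public boolean hasRole(String roleName) {
        return request != null && request.isUserInRole(roleName);
    }

    @Override
    public void logout() {
        // Drop the request so that no further role checks succeed.
        request = null;
    }
}
```

In a real web application the constructor would take the HttpServletRequest itself, and a new instance would be created for each request.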

Namespace Registry

As we saw earlier, every ExecutionContext has a registry of namespaces. Namespaces are used throughout the graph API (as we'll see soon), and the prefix associated with each namespace makes for more readable string representations. The namespace registry tracks all of these namespaces and prefixes, and allows registrations to be added, modified, or removed. The interface for the NamespaceRegistry shows how these operations are done:

This interface exposes Namespace objects that are immutable:

ModeShape actually uses several implementations of NamespaceRegistry, but you can even implement your own and create ExecutionContexts that use it:
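The following sketch shows the kind of bookkeeping a namespace registry performs. SimpleNamespaceRegistry is a hypothetical, much reduced class written for this example; it does not implement the real NamespaceRegistry interface, and its method names are assumptions. It keeps prefixes and URIs in a one-to-one mapping and uses the prefix to produce a readable string form of a name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical registry; the real NamespaceRegistry has more operations.
final class SimpleNamespaceRegistry {
    private final Map<String, String> uriByPrefix = new LinkedHashMap<>();

    // Binds the URI to the prefix and returns the URI previously bound to that prefix, or null.
    synchronized String register(String prefix, String namespaceUri) {
        String previous = uriByPrefix.get(prefix);
        // Keep the mapping one-to-one: drop any prefix already bound to this URI.
        uriByPrefix.values().removeIf(namespaceUri::equals);
        uriByPrefix.put(prefix, namespaceUri);
        return previous;
    }

    // Removes the registration for the URI; returns true if one existed.
    synchronized boolean unregister(String namespaceUri) {
        return uriByPrefix.values().removeIf(namespaceUri::equals);
    }

    synchronized String getNamespaceForPrefix(String prefix) {
        return uriByPrefix.get(prefix);
    }

    synchronized String getPrefixForNamespaceUri(String namespaceUri) {
        for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
            if (entry.getValue().equals(namespaceUri)) {
                return entry.getKey();
            }
        }
        return null;
    }

    // Renders a name in prefixed form when a prefix is registered,
    // otherwise falls back to the longer {uri}localName form.
    synchronized String readable(String namespaceUri, String localName) {
        String prefix = getPrefixForNamespaceUri(namespaceUri);
        return prefix != null ? prefix + ":" + localName
                              : "{" + namespaceUri + "}" + localName;
    }
}
```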

Class Loaders

ModeShape is designed around extensions: sequencers, connectors, MIME type detectors, and class loader factories. The core part of ModeShape is relatively small and has few dependencies, while many of the "interesting" components are extensions that plug into and are used by different parts of the core or by layers above (such as the JCR implementation). The core doesn't really care what the extensions do or what external libraries they require, as long as the extension fulfills its end of the extension contract.

This means that you only need the core modules of ModeShape on the application classpath, while the extensions do not have to be on the application classpath. And because the core modules of ModeShape have few dependencies, the risk of ModeShape libraries conflicting with the application's is lower. Extensions, on the other hand, will likely have a lot of unique dependencies. By separating the core of ModeShape from the class loaders used to load the extensions, your application is isolated from the extensions and their dependencies.

Of course, you can put all the JARs on the application classpath, too. This is what the examples in the Getting Started document do.

But in this case, how does ModeShape load all the extension classes? You may have noticed earlier that ExecutionContext implements the ClassLoaderFactory interface with a single method:

This means that any component that has a reference to an ExecutionContext has the ability to create a class loader with a supplied class path. As we'll see later, the connectors and sequencers are all defined with a class and optional class path. This is where that class path comes in.

The actual meaning of the class path, however, is a function of the implementation. ModeShape uses a StandardClassLoaderFactory that just loads the classes using the Thread's current context class loader (or, if there is none, delegates to the class loader that loaded the StandardClassLoaderFactory class). Of course, it's possible to provide other ClassLoaderFactory implementations. Then, just create a subcontext with your implementation:
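A minimal sketch of such a factory is shown here. The ClassLoaderFactory interface in the block is a stand-in declared for this example, and its exact signature is an assumption; ContextClassLoaderFactory is a hypothetical name. The implementation mimics the default behavior described above: it ignores the supplied class path and returns the thread's context class loader, falling back to the loader of its own class.

```java
// Simplified stand-in for the single-method interface described above;
// the exact signature is an assumption.
interface ClassLoaderFactory {
    ClassLoader getClassLoader(String... classpath);
}

// Mirrors the default behavior described in the text. In this sketch the
// supplied class path entries are ignored entirely.
final class ContextClassLoaderFactory implements ClassLoaderFactory {
    @Override
    public ClassLoader getClassLoader(String... classpath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Fall back to the loader that loaded this factory class.
        return loader != null ? loader : ContextClassLoaderFactory.class.getClassLoader();
    }
}
```

A factory that actually interprets the class path (as file locations, URLs, or Maven coordinates, for instance) would implement the same method and build a dedicated class loader from the entries.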

The modeshape-classloader-maven project has a class loader factory implementation that parses the names into Maven coordinates, then uses those coordinates to look up artifacts in a Maven 2 repository. The artifact's POM file is used to determine the dependencies, which is done transitively to obtain the complete dependency graph. The resulting class loader has access to these artifacts in dependency order.

This class loader is not ready for use, however, since there is no tooling to help populate the repository.

MIME Type Detectors

ModeShape often needs the ability to determine the MIME type for some binary content. When uploading content into a repository, we may want to add the MIME type as metadata. Or, we may want to make some processing decisions based upon the MIME type. So, ModeShape has a small pluggable framework for determining the MIME type by using the name of the file (e.g., extensions) and/or by reading the actual content.

ModeShape defines a MimeTypeDetector interface that abstracts the implementation that actually determines the MIME type given the name and content. If the detector is able to determine the MIME type, it simply returns it as a string. If not, it merely returns null. Note, however, that a detector must be thread-safe. Here is the interface:

To use a detector, simply invoke the method and supply the name of the content (e.g., the name of the file, with the extension) and the InputStream to the actual binary content. The result is a String containing the MIME type (e.g., "text/plain") or null if the MIME type cannot be determined. Note that the name or InputStream may be null, making this a very versatile utility.

Once again, you can obtain a MimeTypeDetector from the ExecutionContext. ModeShape provides and uses by default an implementation that uses only the name (the content is ignored), looking at the name's extension and looking for a match in a small listing (loaded from the org/modeshape/graph/mime.types file on the classpath). You can add extensions by copying this file, adding or correcting the entries, and then placing your updated file in the expected location on the classpath.

Of course, you can always use a different MimeTypeDetector by creating a subcontext and supplying your implementation:
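As an illustration of a name-based detector, consider the sketch below. The MimeTypeDetector interface in it is a stand-in declared for this example; the method name mimeTypeOf and its signature are assumptions. ExtensionMimeTypeDetector is hypothetical, and its three-entry table is hard-coded for brevity rather than loaded from a mime.types file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified stand-in; the method name and signature are assumptions.
interface MimeTypeDetector {
    // Returns the MIME type, or null if it cannot be determined.
    // Either argument may be null.
    String mimeTypeOf(String name, InputStream content) throws IOException;
}

// Looks only at the extension of the name and ignores the content.
// Thread-safe: the table is filled in the constructor and never modified afterwards.
final class ExtensionMimeTypeDetector implements MimeTypeDetector {
    private final Map<String, String> typeByExtension = new HashMap<>();

    ExtensionMimeTypeDetector() {
        // A tiny illustrative table; a real detector would use a much larger listing.
        typeByExtension.put("txt", "text/plain");
        typeByExtension.put("html", "text/html");
        typeByExtension.put("pdf", "application/pdf");
    }

    @Override
    public String mimeTypeOf(String name, InputStream content) {
        if (name == null) {
            return null; // nothing to go on, since the content is not inspected
        }
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null; // no extension present
        }
        String extension = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return typeByExtension.get(extension);
    }
}
```

A content-based detector would implement the same method but read from the InputStream, which is why the stand-in interface declares IOException.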

Text Extractors

ModeShape can store all kinds of content, and ModeShape makes it easy to perform full-text searches on that content. To support searching, ModeShape extracts the text from the various properties on each node. The way it does this for most property types (e.g., STRING, LONG, DATE, PATH, NAME, etc.) is simply to read and use the literal values. But BINARY properties are another story: there's no way to index the binary content directly. Instead, ModeShape has a small pluggable framework for extracting useful text from the binary content, based upon the MIME type of the content itself.

The process works like this: when a BINARY property needs to be indexed for search, ModeShape determines the MIME type of the content and checks whether there is a text extractor capable of handling that MIME type. If so, it passes the content to the text extractor, gets back a string of text, and indexes that text.

ModeShape provides two text extractors out-of-the-box. The Teiid VDB text extractor operates only upon Teiid virtual database (i.e., ".vdb") files and extracts the virtual database's logical name, description, and version, plus the logical name, description, source name, source translator name, and JNDI name for each of the virtual database's models.

The second out-of-the-box extractor is capable of extracting text from a wider variety of file types, including Microsoft Office, PDF, HTML, plain text, and XML. This extractor uses the Tika toolkit from Apache, so a number of other file formats are supported. However, these other file formats require additional libraries that are not included out of the box. This is discussed in more detail in a later chapter.

Text extraction can be an intensive process, so it is not enabled by default. But enabling the text extractors in ModeShape's configuration is actually pretty easy. When using a configuration file, simply add a "<mode:textExtractors>" fragment under the "<configuration>" root element. Within the "<mode:textExtractors>" element place one or more "<mode:textExtractor>" fragments specifying at least the extractor's name and fully-qualified Java class.

For example, here is the fragment that defines the Teiid text extractor and the Tika text extractor. Note that the Teiid text extractor has no options and is pretty simple, while the Tika extractor allows much more control over the MIME types that should be processed:

It's also possible to define your own text extractors by implementing the TextExtractor interface:

As mentioned above, the "supportsMimeType" method will be called first, and only if your implementation returns true for a given MIME type will the "extractFrom" method be called. The supplied TextExtractorContext object provides information about the text being processed, while the TextExtractorOutput is a simple interface that your extractor uses to record one or more strings containing the extracted text.

If you need text extraction in sequencers or connectors, you can always get a TextExtractor instance from the ExecutionContext. That TextExtractor implementation is actually a composite of all of the text extractors defined in the configuration.
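The extractor contract and the composite idea can be sketched together. All three interfaces in the block below are simplified stand-ins declared for this example: the method names supportsMimeType and extractFrom come from the text above, but the parameter order, the recordText method, and the getMimeType method are assumptions. PlainTextExtractor and CompositeTextExtractor are hypothetical classes, and the plain-text extractor assumes UTF-8 content.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Stand-ins for the types described above; everything beyond the names
// supportsMimeType and extractFrom is an assumption.
interface TextExtractorOutput {
    void recordText(String text);
}

interface TextExtractorContext {
    String getMimeType();
}

interface TextExtractor {
    boolean supportsMimeType(String mimeType);

    void extractFrom(InputStream stream, TextExtractorOutput output,
                     TextExtractorContext context) throws IOException;
}

// Handles only text/plain and records the whole content as one string.
final class PlainTextExtractor implements TextExtractor {
    @Override
    public boolean supportsMimeType(String mimeType) {
        return "text/plain".equals(mimeType);
    }

    @Override
    public void extractFrom(InputStream stream, TextExtractorOutput output,
                            TextExtractorContext context) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        // Assumption: the content is UTF-8 encoded.
        output.recordText(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }
}

// Delegates to the first extractor that supports the MIME type of the content.
final class CompositeTextExtractor implements TextExtractor {
    private final List<TextExtractor> delegates;

    CompositeTextExtractor(List<TextExtractor> delegates) {
        this.delegates = delegates;
    }

    @Override
    public boolean supportsMimeType(String mimeType) {
        for (TextExtractor delegate : delegates) {
            if (delegate.supportsMimeType(mimeType)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void extractFrom(InputStream stream, TextExtractorOutput output,
                            TextExtractorContext context) throws IOException {
        for (TextExtractor delegate : delegates) {
            if (delegate.supportsMimeType(context.getMimeType())) {
                delegate.extractFrom(stream, output, context);
                return; // only the first matching extractor handles the content
            }
        }
    }
}
```

Callers are expected to check supportsMimeType first; in this sketch, the composite simply records nothing when no delegate matches.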

Of course, you can always use a different TextExtractor by creating a subcontext and supplying your implementation:

Property factory and value factories

Two other components are made available by the ExecutionContext. The PropertyFactory is an interface that can be used to create Property instances, which are used throughout the graph API. The ValueFactories interface provides access to a number of different factories for different kinds of property values. These will be discussed in much more detail in the next chapter. But like the other components that are in an ExecutionContext, you can create subcontexts with a different PropertyFactory implementation or with a different ValueFactories implementation.

Of course, implementing your own factories is a pretty advanced topic, and it will likely be something you do not need to do in your application.

Summary

In this chapter, we introduced the ExecutionContext as a representation of the environment in which many of the ModeShape components operate. ExecutionContext provides a very simple but powerful way to inject commonly-needed facilities throughout the system.

In the next chapter, we'll dive into the Graph API and introduce the notions of nodes, paths, names, and properties, which are essential and used throughout ModeShape.
