Chapter 6. Creating custom sequencers

6.1. Creating the Maven 2 project
6.2. Implementing the StreamSequencer interface
6.3. Testing custom sequencers
6.4. Deploying custom sequencers

The current release of JBoss DNA comes with six sequencers. However, it is easy to create your own sequencers and then configure JBoss DNA to use them in your own application.

Creating a custom sequencer involves the following steps:

Create a Maven 2 project for your sequencer;
Implement the org.jboss.dna.spi.sequencers.StreamSequencer interface with your own implementation, and create unit tests to verify the functionality and expected behavior;
Add the sequencer configuration to the JBoss DNA SequencingService in your application as described in the previous chapter; and
Deploy the JAR file with your implementation (as well as any dependencies), and make them available to JBoss DNA in your application.

It's that simple.

6.1. Creating the Maven 2 project

The first step is to create the Maven 2 project that you can use to compile your code and build the JARs. Maven 2 automates a lot of the work, and since you're already set up to use Maven, using Maven for your project will save you a lot of time and effort. Of course, you don't have to use Maven 2, but then you'll have to get the required libraries and manage the compiling and building process yourself.

Note

In the future, JBoss DNA may provide a Maven archetype for creating sequencer projects. If you'd find this useful and would like to help create it, please join the community.

Note

The dna-sequencer-images project is a small, self-contained sequencer implementation with only minimal dependencies. Starting with this project's source and modifying it to suit your needs may be the easiest way to get started. See the Subversion repository: http://anonsvn.jboss.org/repos/dna/trunk/sequencers/dna-sequencer-images/

You can create your Maven project any way you'd like. For examples, see the Maven 2 documentation. Once you've done that, just add the dependencies in your project's pom.xml dependencies section:




<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-common</artifactId>
  <version>0.4</version>
</dependency>
<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-graph</artifactId>
  <version>0.4</version>
</dependency>
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
</dependency>

These are the minimum dependencies required for compiling a sequencer. Of course, you'll have to add any other dependencies that your sequencer needs.

As for testing, you probably will want to add more dependencies, such as those listed here:




<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.4</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.hamcrest</groupId>
  <artifactId>hamcrest-library</artifactId>
  <version>1.1</version>
  <scope>test</scope>
</dependency>
<!-- Logging with Log4J -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.4.3</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.14</version>
  <scope>test</scope>
</dependency>

Testing JBoss DNA sequencers does not require a JCR repository or the JBoss DNA services. (For more detail, see the testing section.) However, if you want to do integration testing with a JCR repository and the JBoss DNA services, you'll need additional dependencies for these libraries.




<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-repository</artifactId>
  <version>0.4</version>
  <scope>test</scope>
</dependency>
<!-- Java Content Repository API -->
<dependency>
  <groupId>javax.jcr</groupId>
  <artifactId>jcr</artifactId>
  <version>1.0.1</version>
  <scope>test</scope>
</dependency>
<!-- Apache Jackrabbit (JCR Implementation) -->
<dependency>
  <groupId>org.apache.jackrabbit</groupId>
  <artifactId>jackrabbit-api</artifactId>
  <version>1.4</version>
  <scope>test</scope>
  <!-- Exclude these since they are included in JDK 1.5 -->
  <exclusions>
    <exclusion>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
    </exclusion>
    <exclusion>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.apache.jackrabbit</groupId>
  <artifactId>jackrabbit-core</artifactId>
  <version>1.4.5</version>
  <scope>test</scope>
  <!-- Exclude these since they are included in JDK 1.5 -->
  <exclusions>
    <exclusion>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
    </exclusion>
    <exclusion>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
    </exclusion>
  </exclusions>
</dependency>

At this point, your project should be set up correctly, and you're ready to move on to writing the Java implementation for your sequencer.

6.2. Implementing the StreamSequencer interface

After creating the project and setting up the dependencies, the next step is to create a Java class that implements the org.jboss.dna.spi.sequencers.StreamSequencer interface. This interface is very straightforward and involves a single method:



public interface StreamSequencer {

    /**
     * Sequence the data found in the supplied stream, placing the output
     * information into the supplied map.
     *
     * @param stream the stream with the data to be sequenced; never null
     * @param output the output from the sequencing operation; never null
     * @param context the context for the sequencing operation; never null
     */
    void sequence( InputStream stream, SequencerOutput output, SequencerContext context );

}

The job of a stream sequencer is to process the data in the supplied stream, and place into the SequencerOutput any information that is to go into the JCR repository. JBoss DNA figures out when your sequencer should be called (of course, using the sequencing configuration you'll add in a bit), and then makes sure the generated information is saved in the correct place in the repository.

The SequencerContext provides information about the current sequencing operation, including the location and properties of the node being sequenced, the MIME type of the node being sequenced, and a location to record problems that aren't severe enough to warrant throwing an exception.

The SequencerOutput class is fairly easy to use. There are basically two methods you need to call. One method sets the property values, while the other sets references to other nodes in the repository. Use these methods to describe the properties of the nodes you want to create, using relative paths for the nodes and valid JCR property names for properties and references. JBoss DNA will ensure that nodes are created or updated whenever they're needed.



public interface SequencerOutput {

  /**
   * Set the supplied property on the supplied node.  The allowable
   * values are any of the following:
   *   - primitives (which will be autoboxed)
   *   - String instances
   *   - String arrays
   *   - byte arrays
   *   - InputStream instances
   *   - Calendar instances
   *
   * @param nodePath the path to the node containing the property;
   * may not be null
   * @param property the name of the property to be set
   * @param values the value(s) for the property; may be empty if
   * any existing property is to be removed
   */
  void setProperty( String nodePath, String property, Object... values );

  /**
   * Set the supplied reference on the supplied node.
   *
   * @param nodePath the path to the node containing the property;
   * may not be null
   * @param property the name of the property to be set
   * @param paths the paths to the referenced property, which may be
   * absolute paths or relative to the sequencer output node;
   * may be empty if any existing property is to be removed
   */
  void setReference( String nodePath, String property, String... paths );

}

JBoss DNA will create nodes of type nt:unstructured unless you specify the value for the jcr:primaryType property. You can also specify the values for the jcr:mixinTypes property if you want to add mixins to any node.
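The calls described above can be sketched in a few lines. This is a minimal, self-contained illustration rather than JBoss DNA code: the SequencerOutput interface below is a local stand-in that copies the two method signatures shown earlier, RecordingOutput is a hypothetical helper that only records what was set, and the report:* node and property names are invented for the example (a real repository would need that namespace and any custom node types registered).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OutputSketch {

    // Local stand-in mirroring the two SequencerOutput methods shown above;
    // a real sequencer would use the JBoss DNA interface instead.
    interface SequencerOutput {
        void setProperty( String nodePath, String property, Object... values );
        void setReference( String nodePath, String property, String... paths );
    }

    // Hypothetical helper that records each call so the result can be inspected.
    static class RecordingOutput implements SequencerOutput {
        final Map<String, List<Object>> properties = new LinkedHashMap<>();
        final Map<String, List<String>> references = new LinkedHashMap<>();

        public void setProperty( String nodePath, String property, Object... values ) {
            properties.put(nodePath + "/" + property, new ArrayList<>(Arrays.asList(values)));
        }

        public void setReference( String nodePath, String property, String... paths ) {
            references.put(nodePath + "/" + property, new ArrayList<>(Arrays.asList(paths)));
        }
    }

    // Describes one output node: explicit primary type, one mixin,
    // one ordinary property, and one reference to another output node.
    static void describeReport( SequencerOutput output ) {
        String node = "report:summary"; // relative path; hypothetical node name
        output.setProperty(node, "jcr:primaryType", "nt:unstructured");
        output.setProperty(node, "jcr:mixinTypes", "mix:referenceable");
        output.setProperty(node, "report:pageCount", 12); // int is autoboxed
        output.setReference(node, "report:author", "report:people/jsmith");
    }

    public static void main( String[] args ) {
        RecordingOutput output = new RecordingOutput();
        describeReport(output);
        System.out.println(output.properties);
        System.out.println(output.references);
    }
}
```

Setting jcr:primaryType and jcr:mixinTypes goes through the same setProperty call as any other property, so a sequencer needs no separate API for node typing.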

For a complete example of a sequencer, let's look at the org.jboss.dna.sequencers.image.ImageMetadataSequencer implementation:



public class ImageMetadataSequencer implements StreamSequencer {

    public static final String METADATA_NODE = "image:metadata";
    public static final String IMAGE_PRIMARY_TYPE = "jcr:primaryType";
    public static final String IMAGE_MIXINS = "jcr:mixinTypes";
    public static final String IMAGE_MIME_TYPE = "jcr:mimeType";
    public static final String IMAGE_ENCODING = "jcr:encoding";
    public static final String IMAGE_FORMAT_NAME = "image:formatName";
    public static final String IMAGE_WIDTH = "image:width";
    public static final String IMAGE_HEIGHT = "image:height";
    public static final String IMAGE_BITS_PER_PIXEL = "image:bitsPerPixel";
    public static final String IMAGE_PROGRESSIVE = "image:progressive";
    public static final String IMAGE_NUMBER_OF_IMAGES = "image:numberOfImages";
    public static final String IMAGE_PHYSICAL_WIDTH_DPI = "image:physicalWidthDpi";
    public static final String IMAGE_PHYSICAL_HEIGHT_DPI = "image:physicalHeightDpi";
    public static final String IMAGE_PHYSICAL_WIDTH_INCHES = "image:physicalWidthInches";
    public static final String IMAGE_PHYSICAL_HEIGHT_INCHES = "image:physicalHeightInches";

    /**
     * {@inheritDoc}
     */
    public void sequence( InputStream stream, SequencerOutput output,
                          SequencerContext context ) {
        ImageMetadata metadata = new ImageMetadata();
        metadata.setInput(stream);
        metadata.setDetermineImageNumber(true);
        metadata.setCollectComments(true);

        // Process the image stream and extract the metadata ...
        if (!metadata.check()) {
            metadata = null;
        }

        // Generate the output graph if we found useful metadata ...
        if (metadata != null) {
            // Place the image metadata into the output map ...
            output.setProperty(METADATA_NODE, IMAGE_PRIMARY_TYPE, "image:metadata");
            // output.setProperty(METADATA_NODE, IMAGE_MIXINS, "");
            output.setProperty(METADATA_NODE, IMAGE_MIME_TYPE, metadata.getMimeType());
            // output.setProperty(METADATA_NODE, IMAGE_ENCODING, "");
            output.setProperty(METADATA_NODE, IMAGE_FORMAT_NAME, metadata.getFormatName());
            output.setProperty(METADATA_NODE, IMAGE_WIDTH, metadata.getWidth());
            output.setProperty(METADATA_NODE, IMAGE_HEIGHT, metadata.getHeight());
            output.setProperty(METADATA_NODE, IMAGE_BITS_PER_PIXEL, metadata.getBitsPerPixel());
            output.setProperty(METADATA_NODE, IMAGE_PROGRESSIVE, metadata.isProgressive());
            output.setProperty(METADATA_NODE, IMAGE_NUMBER_OF_IMAGES, metadata.getNumberOfImages());
            output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_DPI, metadata.getPhysicalWidthDpi());
            output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_DPI, metadata.getPhysicalHeightDpi());
            output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_INCHES, metadata.getPhysicalWidthInch());
            output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_INCHES, metadata.getPhysicalHeightInch());
        }
    }
}

Notice how the image metadata is extracted and the output graph is generated. A single node is created with the name image:metadata and with the image:metadata node type. No mixins are defined for the node, but several properties are set on the node using the values obtained from the image metadata. After this method returns, the constructed graph will be saved to the repository in all of the places defined by its configuration. (This is why only relative paths are used in the sequencer.)
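The same pattern carries over to other content types. As a much smaller illustration, here is a sketch of a sequencer that counts the lines in a text stream. It makes several assumptions: the three interfaces are trimmed-down local stand-ins so that the example compiles on its own (the real SequencerOutput also has setReference, and the real SequencerContext is not empty), and the text:metadata node and text:lineCount property are hypothetical names that are not part of JBoss DNA.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class LineCountSketch {

    // Trimmed-down stand-ins for the JBoss DNA types discussed above.
    interface SequencerOutput {
        void setProperty( String nodePath, String property, Object... values );
    }

    interface SequencerContext {
    }

    interface StreamSequencer {
        void sequence( InputStream stream, SequencerOutput output, SequencerContext context );
    }

    // Hypothetical sequencer: reads the stream as UTF-8 text and reports the line count.
    static class LineCountSequencer implements StreamSequencer {
        static final String METADATA_NODE = "text:metadata"; // relative path only

        public void sequence( InputStream stream, SequencerOutput output, SequencerContext context ) {
            int lines = 0;
            try {
                BufferedReader reader = new BufferedReader(
                    new InputStreamReader(stream, StandardCharsets.UTF_8));
                while (reader.readLine() != null) {
                    lines++;
                }
            } catch (IOException e) {
                // Produce no output if the content cannot be read; a real sequencer
                // would probably also report the problem through its context.
                return;
            }
            output.setProperty(METADATA_NODE, "jcr:primaryType", "nt:unstructured");
            output.setProperty(METADATA_NODE, "text:lineCount", lines);
        }
    }

    // Runs the sequencer over in-memory content and returns what it recorded.
    static Map<String, Object> run( String content ) {
        final Map<String, Object> recorded = new LinkedHashMap<>();
        SequencerOutput output = new SequencerOutput() {
            public void setProperty( String nodePath, String property, Object... values ) {
                recorded.put(nodePath + "/" + property, values.length == 1 ? values[0] : values);
            }
        };
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        new LineCountSequencer().sequence(new ByteArrayInputStream(bytes), output,
                                          new SequencerContext() {});
        return recorded;
    }

    public static void main( String[] args ) {
        System.out.println(run("first\nsecond\nthird"));
    }
}
```

As in the image sequencer, all output is gathered before any property is set, so content that cannot be processed leaves the output untouched.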

6.3. Testing custom sequencers

The sequencing framework was designed to make testing sequencers much easier. In particular, the StreamSequencer interface does not make use of the JCR API. So instead of requiring a fully-configured JCR repository and JBoss DNA system, unit tests for a sequencer can focus on testing that the content is processed correctly and the desired output graph is generated.

Note

For a complete example of a sequencer unit test, see the ImageMetadataSequencerTest unit test in the org.jboss.dna.sequencer.images package of the dna-sequencer-images project.

The following code fragment shows one way of testing a sequencer, using JUnit 4.4 assertions and some of the classes made available by JBoss DNA. Of course, this example code does not do any error handling and does not make all the assertions a real test would.



StreamSequencer sequencer = new ImageMetadataSequencer();
MockSequencerOutput output = new MockSequencerOutput();
MockSequencerContext context = new MockSequencerContext();
InputStream stream = null;
try {
    stream = this.getClass().getClassLoader().getResource("caution.gif").openStream();
    sequencer.sequence(stream,output,context);   // writes to 'output'
    assertThat(output.getPropertyValues("image:metadata", "jcr:primaryType"),
               is(new Object[] {"image:metadata"}));
    assertThat(output.getPropertyValues("image:metadata", "jcr:mimeType"),
               is(new Object[] {"image/gif"}));
    // ... make more assertions here
    assertThat(output.hasReferences(), is(false));
} finally {
    // The stream is still null if the resource could not be opened
    if (stream != null) stream.close();
}

It's also useful to test that a sequencer produces no output for something it should not understand:



StreamSequencer sequencer = new ImageMetadataSequencer();
MockSequencerOutput output = new MockSequencerOutput();
MockSequencerContext context = new MockSequencerContext();
InputStream stream = null;
try {
    stream = this.getClass().getClassLoader().getResource("caution.pict").openStream();
    sequencer.sequence(stream,output,context);   // writes to 'output'
    assertThat(output.hasProperties(), is(false));
    assertThat(output.hasReferences(), is(false));
} finally {
    // The stream is still null if the resource could not be opened
    if (stream != null) stream.close();
}

These are just two simple tests that show ways of testing a sequencer. Some tests may get quite involved, especially if a lot of output data is produced.

It may also be useful to create some integration tests that configure JBoss DNA to use a custom sequencer and then upload content using the JCR API, verifying that the custom sequencer did run. However, remember that JBoss DNA runs sequencers asynchronously in the background, so you must synchronize your tests to ensure that the sequencers have a chance to run before checking the results. One way of doing this (although, granted, not always reliable) is to wait for a second after uploading your content, shut down the SequencingService and await its termination, and then check that the sequencer output has been saved to the JCR repository. For an example of this technique, see the SequencingClientTest unit test in the example application.
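Where a fixed one-second wait proves too fragile, one alternative (a general testing technique, not something JBoss DNA provides) is to poll for the expected result until a timeout expires. The sketch below assumes nothing about the JBoss DNA API: awaitCondition is a hypothetical helper, and the AtomicBoolean in main merely simulates background work completing. In an integration test, the condition would instead check the repository for the sequencer's output.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class AwaitSketch {

    // Polls 'condition' until it returns true or the timeout elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean awaitCondition( BooleanSupplier condition, long timeout, TimeUnit unit ) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(20); // short pause between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main( String[] args ) {
        final AtomicBoolean done = new AtomicBoolean(false);
        // Simulates background sequencing that finishes after a short delay.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                return;
            }
            done.set(true);
        });
        worker.start();
        System.out.println(awaitCondition(done::get, 5, TimeUnit.SECONDS));
    }
}
```

Polling returns as soon as the output appears, so a passing test finishes quickly, while a generous timeout gives slow machines room without making every run pay for it.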

6.4. Deploying custom sequencers

The first step in deploying a sequencer is to add or change the sequencer configuration (e.g., SequencerConfig) in the SequencingService. This was covered in the previous chapter.

The second step is to make the sequencer implementation available to JBoss DNA. At this time, the JAR containing your new sequencer, as well as any JARs that your sequencer depends on, should be placed on your application classpath.
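A quick way to confirm that the JAR really is visible is to try loading the sequencer class by name at application startup. The following is only a sketch: isOnClasspath is a hypothetical helper, and com.example.MySequencer stands in for your own implementation class.

```java
class ClasspathCheckSketch {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath( String className ) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheckSketch.class.getClassLoader();
        }
        try {
            // 'false' loads the class without running its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // the class itself is missing
        } catch (LinkageError e) {
            return false; // found, but a superclass or interface could not be loaded
        }
    }

    public static void main( String[] args ) {
        // Hypothetical class name; substitute your sequencer implementation.
        System.out.println(isOnClasspath("com.example.MySequencer"));
    }
}
```

A check like this catches a missing implementation JAR early. Because it only loads the class, a dependency that is used solely inside method bodies may still turn out to be missing when the sequencer first runs.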

Note

A future goal of JBoss DNA is to allow sequencers, connectors, and other extensions to be easily deployed into a runtime repository. This process will not only be much simpler, but it will also provide JBoss DNA with the information necessary to update configurations and create the appropriate class loaders for each extension. Having separate class loaders for each extension helps prevent the pollution of the common classpath, facilitates an isolated runtime environment to eliminate any dependency conflicts, and may potentially enable hot redeployment of newer extension versions.