JBoss.orgCommunity Documentation

Chapter 33. JCR API Extensions

33.1. API and usage
33.2. Configuration
33.3. Implementation notices

eXo JCR implementation offers new extended feature beyond JCR specification. Sometimes one JCR Node has hundreds or even thousands of child nodes. This situation is highly not recommended for content repository data storage, but sometimes it occurs. JCR Team is pleased to announce new feature that will help to have a deal with huge child lists. They can be iterated in a "lazy" manner now giving improvement in term of performance and RAM usage.

Lazy child nodes iteration feature is accessible via extended interface org.exoplatform.services.jcr.core.ExtendedNode, the inheritor of javax.jcr.Node. It provides a new single method shown below:

   /**
    * Returns a NodeIterator over all child Nodes of this Node. Does not include properties 
    * of this Node. If this node has no child nodes, then an empty iterator is returned.
    * 
    * @return A NodeIterator over all child Nodes of this <code>Node</code>.
    * @throws RepositoryException If an error occurs.
    */
   public NodeIterator getNodesLazily() throws RepositoryException;

From the view of end-user or client application, getNodesLazily() works similar to JCR specified getNodes() returning NodeIterator. "Lazy" iterator supports the same set of features as an ordinary NodeIterator, including skip() and excluding remove() features. "Lazy" implementation performs reading from DB by pages. Each time when it has no more elements stored in memory, it reads next set of items from persistent layer. This set is called "page". Must admit that getNodesLazily feature fully supports session and transaction changes log, so it's a functionally-full analogue of specified getNodes() operation. So when having a deal with huge list of child nodes, getNodes() can be simply and safely substituted with getNodesLazily().

JCR gives an experimental opportunity to replace all getNodes() invocations with getNodesLazily() calls. It handles a boolean system property named "org.exoplatform.jcr.forceUserGetNodesLazily" that internally replaces one call with another, without any code changes. But be sure using it only for development purposes. This feature can be used with top level products using eXo JCR to perform a quick compatibility and performance tests without changing any code. This is not recommended to be used as a production solution.

In order to enable add the "-Dorg.exoplatform.jcr.forceUserGetNodesLazily=true" to the java system properties.

The "lazy" iterator reads the child nodes "page" after "page" into the memory. In this context, a "page" is a set of nodes that is read at once. The size of the page is by default 100 nodes and can be configured though workspace container configuration using "lazy-node-iterator-page-size" parameter. For example:

<container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
   <properties>
      <property name="source-name" value="jdbcjcr" />
      <property name="multi-db" value="true" />
      <property name="max-buffer-size" value="200k" />
      <property name="swap-directory" value="target/temp/swap/ws" />
      <property name="lazy-node-iterator-page-size" value="50" />
      ...
   </properties>

It's not recommended to configure a large number for the page size.

Current "lazy" child nodes iterator supports caching, when pages are cached atomically in safe and optimized way. Cache is always kept in consistent state using invalidation if child list changed. Take in account the following difference in getNodes and getNodesLazily. Specification defined getNodes method reads whole list of nodes, so child items added after invocation will never be in results. GetNodesLazily doesn't acquire full list of nodes, so child items added after iterator creation can be found in result. So getNodesLazily can represent some kind of "real-time" results. But it is highly depend on numerous conditions and should not be used as a feature, it more likely implementation specific issue typical for "lazy-pattern".