Chapter 1. eXoJCR

Add a new residual property definition with name "downloadCount" to the existing node type "myNodeType".

There are two limitations that do not allow us to make the task with a single call of registerNodeType method.

Existing nodes of the type "myNodeType", which does not contain properties "downloadCount" that conflicts with node type what we need.
Registered node type "myNodeType" will not allow us to add properties "downloadCount" because it has no such specific properties.

To complete the task, we need to make 3 steps:

Change the existing node type "myNodeType" by adding the mandatory property "downloadCount".
Add the node type "myNodeType" with the property "downloadCount" to all the existing node types.
Change the definition of the property "downloadCount" of the node type "myNodeType" to mandatory.

1.9.5.4. Changing the list of super types

NodeTypeValue testNValue = nodeTypeManager.getNodeTypeValue("exo:myNodeType");

List<String> superType  = testNValue.getDeclaredSupertypeNames();
superType.add("mix:versionable");
testNValue.setDeclaredSupertypeNames(superType);

nodeTypeManager.registerNodeType(testNValue, ExtendedNodeTypeManager.REPLACE_IF_EXISTS);

1.10. Registry Service

1.10.1. Concept

The Registry Service is one of the key parts of the infrastructure built around eXo JCR. Each JCR that is based on service, applications, etc may have its own configuration, settings data and other data that have to be stored persistently and used by the approptiate service or application. ( We call it "Consumer").

The service acts as a centralized collector (Registry) for such data. Naturally, a registry storage is JCR based i.e. stored in some JCR workspace (one per Repository) as an Item tree under /exo:registry node.

Despite the fact that the structure of the tree is well defined (see the scheme below), it is not recommended for other services to manipulate data using JCR API directly for better flexibility. So the Registry Service acts as a mediator between a Consumer and its settings.

The proposed structure of the Registry Service storage is divided into 3 logical groups: services, applications and users:

 exo:registry/          <-- registry "root" (exo:registry)
   exo:services/        <-- service data storage (exo:registryGroup)
     service1/
       Consumer data    (exo:registryEntry)
     ...
   exo:applications/    <-- application data storage (exo:registryGroup)
     app1/
       Consumer data    (exo:registryEntry)
     ...
   exo:users/           <-- user personal data storage (exo:registryGroup)
     user1/
       Consumer data    (exo:registryEntry)
     ...

Each upper level eXo Service may store its configuration in eXo Registry. At first, start from xml-config (in jar etc) and then from Registry. In configuration file, you can add force-xml-configuration parameter to component to ignore reading parameters initialization from RegistryService and to use file instead:

<value-param>
  <name>force-xml-configuration</name>
  <value>true</value>
</value-param>

1.10.2. The API

The main functionality of the Registry Service is pretty simple and straightforward, it is described in the Registry abstract class as the following:

public abstract class Registry
{

   /**
    * Returns Registry node object which wraps Node of "exo:registry" type (the whole registry tree)
    */
   public abstract RegistryNode getRegistry(SessionProvider sessionProvider) throws RepositoryConfigurationException,
      RepositoryException;

   /**
    * Returns existed RegistryEntry which wraps Node of "exo:registryEntry" type
    */
   public abstract RegistryEntry getEntry(SessionProvider sessionProvider, String entryPath)
      throws PathNotFoundException, RepositoryException;

   /**
    * creates an entry in the group. In a case if the group does not exist it will be silently
    * created as well
    */
   public abstract void createEntry(SessionProvider sessionProvider, String groupPath, RegistryEntry entry)
      throws RepositoryException;

   /**
    * updates an entry in the group
    */
   public abstract void recreateEntry(SessionProvider sessionProvider, String groupPath, RegistryEntry entry)
      throws RepositoryException;

   /**
    * removes entry located on entryPath (concatenation of group path / entry name)
    */
   public abstract void removeEntry(SessionProvider sessionProvider, String entryPath) throws RepositoryException;

}

As you can see it looks like a simple CRUD interface for the RegistryEntry object which wraps registry data for some Consumer as a Registry Entry. The Registry Service itself knows nothing about the wrapping data, it is Consumer's responsibility to manage and use its data in its own way.

To create an Entity Consumer you should know how to serialize the data to some XML structure and then create a RegistryEntry from these data at once or populate them in a RegistryEntry object (using RegistryEntry(String entryName) constructor and then obtain and fill a DOM document).

Example of RegistryService using:

    RegistryService regService = (RegistryService) container
    .getComponentInstanceOfType(RegistryService.class);

    RegistryEntry registryEntry = regService.getEntry(sessionProvider,
            RegistryService.EXO_SERVICES + "/my-service");

    Document doc = registryEntry.getDocument();
    
    String mySetting = getElementsByTagName("tagname").item(index).getTextContent();
     .....

1.10.3. Configuration

RegistryService has two optional params: value parameter mixin-names and properties parameter locations. The mixin-names is used for adding additional mixins to nodes exo:registry, exo:applications, exo:services, exo:users and exo:groups of RegistryService. This allows the top level applications to manage these nodes in special way. Locations is used to mention where exo:registry is placed for each repository. The name of each property is interpreted as a repository name and its value as a workspace name (a system workspace by default).

<component>
   <type>org.exoplatform.services.jcr.ext.registry.RegistryService</type>
   <init-params>
      <values-param>
         <name>mixin-names</name>
         <value>exo:hideable</value>      
      </values-param>
      <properties-param>         
      <name>locations</name>
         <property name="db1" value="ws2"/>
      </properties-param>
   </init-params>
</component>

1.11. Namespace altering

Since version 1.11, eXo JCR implementation supports namespaces altering.

1.11.1. Adding new namespace

ExtendedNamespaceRegistry namespaceRegistry = (ExtendedNamespaceRegistry) workspace.getNamespaceRegistry();
namespaceRegistry.registerNamespace("newMapping", "http://dumb.uri/jcr");

1.11.2. Changing existing namespace

ExtendedNamespaceRegistry namespaceRegistry = (ExtendedNamespaceRegistry) workspace.getNamespaceRegistry();
namespaceRegistry.registerNamespace("newMapping", "http://dumb.uri/jcr");
namespaceRegistry.registerNamespace("newMapping2", "http://dumb.uri/jcr");

1.11.3. Removing existing namespace

ExtendedNamespaceRegistry namespaceRegistry = (ExtendedNamespaceRegistry) workspace.getNamespaceRegistry();
namespaceRegistry.registerNamespace("newMapping", "http://dumb.uri/jcr");
namespaceRegistry.unregisterNamespace("newMapping");

1.12. Node Types and Namespaces

Support of node types and namespaces is required by the JSR-170 specification. Beyond the methods required by the specification, eXo JCR has its own API extension for the Node type registration as well as the ability to declaratively define node types in the Repository at the start-up time.

1.12.1. Node Types definition

Node type registration extension is declared in org.exoplatform.services.jcr.core.nodetype.ExtendedNodeTypeManager interface

Your custom service can register some neccessary predefined node types at the start-up time. The node definition should be placed in a special XML file (see DTD below) and declared in the service's configuration file thanks to eXo component plugin mechanism, described as follows:

<external-component-plugins>
  <target-component>org.exoplatform.services.jcr.RepositoryService</target-component>
      <component-plugin>
        <name>add.nodeType</name>
        <set-method>addPlugin</set-method>
        <type>org.exoplatform.services.jcr.impl.AddNodeTypePlugin</type>
        <init-params>
          <values-param>
            <name>autoCreatedInNewRepository</name>
            <description>Node types configuration file</description>
            <value>jar:/conf/test/nodetypes-tck.xml</value>
            <value>jar:/conf/test/nodetypes-impl.xml</value>
          </values-param>
    <values-param> 
            <name>repo1</name> 
            <description>Node types configuration file for repository with name repo1</description> 
            <value>jar:/conf/test/nodetypes-test.xml</value> 
          </values-param>
    <values-param> 
            <name>repo2</name> 
            <description>Node types configuration file for repository with name repo2</description> 
            <value>jar:/conf/test/nodetypes-test2.xml</value> 
          </values-param>
        </init-params>
      </component-plugin>

There are two types of registration. The first type is the registration of node types in all created repositories, it is configured in values-param with the name autoCreatedInNewRepository. The second type is registration of node types in specified repository and it is configured in values-param with the name of repository.

Node type definition file format:

  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE nodeTypes [
   <!ELEMENT nodeTypes (nodeType)*>
      <!ELEMENT nodeType (supertypes?|propertyDefinitions?|childNodeDefinitions?)>

      <!ATTLIST nodeType
         name CDATA #REQUIRED
         isMixin (true|false) #REQUIRED
         hasOrderableChildNodes (true|false)
         primaryItemName CDATA
      >
      <!ELEMENT supertypes (supertype*)>
      <!ELEMENT supertype (CDATA)>
   
      <!ELEMENT propertyDefinitions (propertyDefinition*)>

      <!ELEMENT propertyDefinition (valueConstraints?|defaultValues?)>
      <!ATTLIST propertyDefinition
         name CDATA #REQUIRED
         requiredType (String|Date|Path|Name|Reference|Binary|Double|Long|Boolean|undefined) #REQUIRED
         autoCreated (true|false) #REQUIRED
         mandatory (true|false) #REQUIRED
         onParentVersion (COPY|VERSION|INITIALIZE|COMPUTE|IGNORE|ABORT) #REQUIRED
         protected (true|false) #REQUIRED
         multiple  (true|false) #REQUIRED
      >    
    <!-- For example if you need to set ValueConstraints [], 
      you have to add an empty element <valueConstraints/>. 
      The same order is for other properties like defaultValues, requiredPrimaryTypes etc.
      -->  
      <!ELEMENT valueConstraints (valueConstraint*)>
      <!ELEMENT valueConstraint (CDATA)>
      <!ELEMENT defaultValues (defaultValue*)>
      <!ELEMENT defaultValue (CDATA)>

      <!ELEMENT childNodeDefinitions (childNodeDefinition*)>

      <!ELEMENT childNodeDefinition (requiredPrimaryTypes)>
      <!ATTLIST childNodeDefinition
         name CDATA #REQUIRED
         defaultPrimaryType  CDATA #REQUIRED
         autoCreated (true|false) #REQUIRED
         mandatory (true|false) #REQUIRED
         onParentVersion (COPY|VERSION|INITIALIZE|COMPUTE|IGNORE|ABORT) #REQUIRED
         protected (true|false) #REQUIRED
         sameNameSiblings (true|false) #REQUIRED
      >
      <!ELEMENT requiredPrimaryTypes (requiredPrimaryType+)>
      <!ELEMENT requiredPrimaryType (CDATA)>  
]>

1.12.2. Namespaces definition

Default namespaces are registered by repository at the start-up time

Your custom service can extend a set of namespaces with some application specific ones, declaring it in service's configuration file thanks to eXo component plugin mechanism, described as follows:

      <component-plugin> 
          <name>add.namespaces</name>
          <set-method>addPlugin</set-method>
          <type>org.exoplatform.services.jcr.impl.AddNamespacesPlugin</type>
          <init-params>
            <properties-param>
              <name>namespaces</name>
              <property name="test" value="http://www.test.org/test"/>
            </properties-param>      
          </init-params>                  
      </component-plugin>

1.13. eXo JCR configuration

This section provides you the knowledge about eXo JCR configuration in details, including the basic and advanced configuration.

1.13.1. Related documents

1.13.2. Portal and Standalone configuration

Like other eXo services, eXo JCR can be configured and used in the portal or embedded mode (as a service embedded in GateIn) and in standalone mode.

In Embedded mode, JCR services are registered in the Portal container and the second option is to use a Standalone container. The main difference between these container types is that the first one is intended to be used in a Portal (Web) environment, while the second one can be used standalone (see the comprehensive page Service Configuration for Beginners for more details).

The following setup procedure is used to obtain a Standalone configuration (see more in Container configuration):

Configuration that is set explicitly using StandaloneContainer.addConfigurationURL(String url) or StandaloneContainer.addConfigurationPath(String path) before getInstance()
Configuration from $base:directory/exo-configuration.xml or $base:directory/conf/exo-configuration.xml file. Where $base:directory is either AS's home directory in case of J2EE AS environment or just the current directory in case of a standalone application.
/conf/exo-configuration.xml in the current classloader (e.g. war, ear archive)
Configuration from $service_jar_file/conf/portal/configuration.xml. WARNING: Don't rely on some concrete jar's configuration if you have more than one jar containing conf/portal/configuration.xml file. In this case choosing a configuration is unpredictable.

JCR service configuration looks like:

<component>
  <key>org.exoplatform.services.jcr.RepositoryService</key>
  <type>org.exoplatform.services.jcr.impl.RepositoryServiceImpl</type>
</component>
<component>
  <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
  <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
  <init-params>
    <value-param>
      <name>conf-path</name>
      <description>JCR repositories configuration file</description>
      <value>jar:/conf/standalone/exo-jcr-config.xml</value>
    </value-param>
    <value-param>
      <name>max-backup-files</name>
      <value>5</value>
    </value-param>
    <properties-param>
      <name>working-conf</name>
      <description>working-conf</description>
      <property name="source-name" value="jdbcjcr" />
      <property name="dialect" value="hsqldb" />
      <property name="persister-class-name" value="org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister" />
    </properties-param>
  </init-params>
</component>

conf-path : a path to a RepositoryService JCR Configuration.

max-backup-files : max number of backup files. This option lets you specify the number of stored backups. Number of backups can't exceed this value. File which will exceed the limit will replace the oldest file.

working-conf : optional; JCR configuration persister configuration. If there isn't a working-conf, the persister will be disabled.

1.13.3. JCR Configuration

The Configuration is defined in an XML file (see DTD below).

JCR Service can use multiple Repositories and each repository can have multiple Workspaces.

From v.1.9 JCR, repositories configuration parameters support human-readable formats of values. They are all case-insensitive:

Numbers formats: K,KB - kilobytes, M,MB - megabytes, G,GB - gigabytes, T,TB - terabytes. Examples: 100.5 - digit 100.5, 200k - 200 Kbytes, 4m - 4 Mbytes, 1.4G - 1.4 Gbytes, 10T - 10 Tbytes

Time format endings: ms - milliseconds, m - minutes, h - hours, d - days, w - weeks, if no ending - seconds. Examples: 500ms - 500 milliseconds, 20 - 20 seconds, 30m - 30 minutes, 12h - 12 hours, 5d - 5 days, 4w - 4 weeks.

1.13.4. Repository service configuration (JCR repositories configuration)

Service configuration may be placed in jar:/conf/standalone/exo-jcr-config.xml for standalone mode. For portal mode, it is located in the portal web application portal/WEB-INF/conf/jcr/repository-configuration.xml.

default-repository: The name of a default repository (one returned by RepositoryService.getRepository()).

repositories: The list of repositories.

1.13.5. Repository configuration

name: The name of a repository.

default-workspace: The name of a workspace obtained using Session's login() or login(Credentials) methods (ones without an explicit workspace name).

system-workspace: The name of workspace where /jcr:system node is placed.

security-domain: The name of a security domain for JAAS authentication.

access-control: The name of an access control policy. There can be 3 types: optional - ACL is created on-demand(default), disable - no access control, mandatory - an ACL is created for each added node(not supported yet).

authentication-policy: The name of an authentication policy class.

workspaces: The list of workspaces.

session-max-age: The time after which an idle session will be removed (called logout). If session-max-age is not set up, idle session will never be removed.

lock-remover-max-threads: Number of threads that can serve LockRemover tasks. Default value is 1. Repository may have many workspaces, each workspace have own LockManager. JCR supports Locks with defined lifetime. Such a lock must be removed is it become expired. That is what LockRemovers does. But LockRemovers is not an independent timer-threads, its a task that executed each 30 seconds. Such a task is served by ThreadPoolExecutor which may use different number of threads.

1.13.6. Workspace configuration

name: The name of a workspace

container: Workspace data container (physical storage) configuration.

initializer: Workspace initializer configuration.

cache: Workspace storage cache configuration.

query-handler: Query handler configuration.

auto-init-permissions: DEPRECATED in JCR 1.9 (use initializer). Default permissions of the root node. It is defined as a set of semicolon-delimited permissions containing a group of space-delimited identities (user, group, etc, see Organization service documentation for details) and the type of permission. For example, any read; :/admin read;:/admin add_node; :/admin set_property;:/admin remove means that users from group admin have all permissions and other users have only a 'read' permission.

1.13.7. Value Storage plugin configuration (for data container):

Note

The value-storage element is optional. If you don't include it, the values will be stored as BLOBs inside the database.

value-storage: Optional value Storage plugin definition.

class: A value storage plugin class name (attribute).

properties: The list of properties (name-value pairs) for a concrete Value Storage plugin.

filters: The list of filters defining conditions when this plugin is applicable.

1.13.8. Initializer configuration (optional)

class: Initializer implementation class.

properties: The list of properties (name-value pairs). Properties are supported.

root-nodetype: The node type for root node initialization.

root-permissions: Default permissions of the root node. It is defined as a set of semicolon-delimited permissions containing a group of space-delimited identities (user, group etc, see Organization service documentation for details) and the type of permission. For example any read; :/admin read;:/admin add_node; :/admin set_property;:/admin remove means that users from group admin have all permissions and other users have only a 'read' permission.

Configurable initializer adds a capability to override workspace initial startup procedure (used for Clustering).

1.13.9. Cache configuration

enabled: If workspace cache is enabled or not.

class: Cache implementation class, optional from 1.9. Default value is. org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl.

Cache can be configured to use concrete implementation of WorkspaceStorageCache interface. JCR core has two implementation to use:

LinkedWorkspaceStorageCacheImpl - default, with configurable read behavior and statistic.
WorkspaceStorageCacheImpl - pre 1.9, still can be used.

properties: The list of properties (name-value pairs) for Workspace cache.

max-size: Cache maximum size (maxSize prior to v.1.9).

live-time: Cached item live time (liveTime prior to v.1.9).

From 1.9 LinkedWorkspaceStorageCacheImpl supports additional optional parameters.

statistic-period: Period (time format) of cache statistic thread execution, 5 minutes by default.

statistic-log: If true cache statistic will be printed to default logger (log.info), false by default or not.

statistic-clean: If true cache statistic will be cleaned after was gathered, false by default or not.

cleaner-period: Period of the eldest items remover execution, 20 minutes by default.

blocking-users-count: Number of concurrent users allowed to read cache storage, 0 - unlimited by default.

1.13.10. Query Handler configuration

class: A Query Handler class name.

properties: The list of properties (name-value pairs) for a Query Handler (indexDir).

Properties and advanced features described in Search Configuration.

1.13.11. Lock Manager configuration

time-out: Time after which the unused global lock will be removed.

persister: A class for storing lock information for future use. For example, remove lock after jcr restart.

path: A lock folder. Each workspace has its own one.

Note

Also see lock-remover-max-threads repository configuration parameter.

<!ELEMENT repository-service (repositories)>
<!ATTLIST repository-service default-repository NMTOKEN #REQUIRED>
<!ELEMENT repositories (repository)>
<!ELEMENT repository (security-domain,access-control,session-max-age,authentication-policy,workspaces)>
<!ATTLIST repository
  default-workspace NMTOKEN #REQUIRED
  name NMTOKEN #REQUIRED
  system-workspace NMTOKEN #REQUIRED
>
<!ELEMENT security-domain (#PCDATA)>
<!ELEMENT access-control (#PCDATA)>
<!ELEMENT session-max-age (#PCDATA)>
<!ELEMENT authentication-policy (#PCDATA)>
<!ELEMENT workspaces (workspace+)>
<!ELEMENT workspace (container,initializer,cache,query-handler)>
<!ATTLIST workspace name NMTOKEN #REQUIRED>
<!ELEMENT container (properties,value-storages)>
<!ATTLIST container class NMTOKEN #REQUIRED>
<!ELEMENT value-storages (value-storage+)>
<!ELEMENT value-storage (properties,filters)>
<!ATTLIST value-storage class NMTOKEN #REQUIRED>
<!ELEMENT filters (filter+)>
<!ELEMENT filter EMPTY>
<!ATTLIST filter property-type NMTOKEN #REQUIRED>
<!ELEMENT initializer (properties)>
<!ATTLIST initializer class NMTOKEN #REQUIRED>
<!ELEMENT cache (properties)>
<!ATTLIST cache 
  enabled NMTOKEN #REQUIRED
  class NMTOKEN #REQUIRED
>
<!ELEMENT query-handler (properties)>
<!ATTLIST query-handler class NMTOKEN #REQUIRED>
<!ELEMENT access-manager (properties)>
<!ATTLIST access-manager class NMTOKEN #REQUIRED>
<!ELEMENT lock-manager (time-out,persister)>
<!ELEMENT time-out (#PCDATA)>
<!ELEMENT persister (properties)>
<!ELEMENT properties (property+)>
<!ELEMENT property EMPTY>

1.13.12. Help application to prohibit the use of closed sessions

Products that use eXo JCR, sometimes missuse it since they continue to use a session that has been closed through a method call on a node, a property or even the session itself. To prevent bad practices we propose three modes which are the folllowing:

If the system property exo.jcr.prohibit.closed.session.usage has been set to true, then a RepositoryException will be thrown any time an application will try to access to a closed session. In the stack trace, you will be able to know the call stack that closes the session.
If the system property exo.jcr.prohibit.closed.session.usage has not been set and the system property exo.product.developing has been set to true, then a warning will be logged in the log file with the full stack trace in order to help identifying the root cause of the issue. In the stack trace, you will be able to know the call stack that closes the session.
If none of the previous system properties have been set, then we will ignore that the issue and let the application use the closed session as it was possible before without doing anything in order to allow applications to migrate step by step.

1.13.13. Help application to allow the use of closed datasources

Since usage of closed session affects usage of closed datasource we propose three ways to resolve such kind of isses:

If the system property exo.jcr.prohibit.closed.datasource.usage is set to true (default value) then a SQLException will be thrown any time an application will try to access to a closed datasource. In the stack trace, you will be able to know the call stack that closes the datasource.
If the system property exo.jcr.prohibit.closed.datasource.usage is set to false and the system property exo.product.developing is set to true, then a warning will be logged in the log file with the full stack trace in order to help identifying the root cause of the issue. In the stack trace, you will be able to know the call stack that closes the datasource.
If the system property exo.jcr.prohibit.closed.datasource.usage is set to false and the system property exo.product.developing is set to false usage of closed datasource will be allowed and nothing will be logged or thrown.

1.13.14. Getting the effective configuration at Runtime of all the repositories

The effective configuration of all the repositories and their workspaces can be known thanks to the method getConfigurationXML() that is exposed through JMX at the RepositoryServiceConfiguration level in case of a PortalContainer the name of the related MBean will be of type exo:portal=${portal-container-name},service=RepositoryServiceConfiguration. This method will give you the effective configuration in XML format that has been really interpreted by the the JCR core. This could be helpful to understand how your repositories/workspaces are configured especially if you would like to overwrite the configuration for some reasons.

1.13.15. Configuration of workspaces using system properties

You can configure values of properties defined in the file repository-configuration.xml using System Properties. This is quite helpful especially when you want to change the default configuration of all the workspaces for example if we want to disable the rdms indexing for all the workspace without this kind of improvement it is very error prone. For all components that can be configured thanks to properties such as container, value-storage, workspace-initializer, cache, query-handler, lock-manager, access-manager and persister the logic for example for the component 'container' and the property called 'foo' will be the following:

If we have a system property called exo.jcr.config.force.workspace.repository_collaboration.container.foo that has been defined, its value will be used for the configuration of the repository 'repository' and the workspace 'collaboration'
If we have a system property called exo.jcr.config.force.repository.repository.container.foo that has been defined, its value will be used for the configuration of all the workspaces of the repository 'repository' except the workspaces for which we configured the same property using system properties defined in #1
If we have a system property called exo.jcr.config.force.all.container.foo that has been defined, its value will be used for the configuration of all the workspaces except the workspaces for which we configured the same property using system properties defined in #1 or #2
If we have a property 'foo' configured for the repository 'repository' and the workspace 'collaboration' and we have no system properties corresponding to rule #1, #2 and #3, we will use this value (current behavior)
If the previous rules don't allow to give a value to the property 'foo', we will then check the default value in the following order exo.jcr.config.default.workspace.repository_collaboration.container.foo, exo.jcr.config.default.repository.repository.container.foo, exo.jcr.config.default.all.container.foo

To turn on this feature you need to define a component called SystemParametersPersistenceConfigurator. A simple example:

  <component>
    <key>org.exoplatform.services.jcr.config.SystemParametersPersistenceConfigurator</key>
    <type>org.exoplatform.services.jcr.config.SystemParametersPersistenceConfigurator</type>
    <init-params>
      <value-param>
        <name>file-path</name>
        <value>target/temp</value>
      </value-param>
      <values-param>
        <name>unmodifiable</name>
        <value>cache.test-parameter-I</value>
      </values-param>
      <values-param>
        <name>before-initialize</name>
        <value>value-storage.enabled</value>
      </values-param>
    </init-params>
  </component>

To make the configuration process easier here you can define thee parameters.

file-path — this is mandatory parameter which defines the location of the file where all parameters configured on pervious launch of AS are stored.
unmodifiable — this defines the list of parameters which cannot be modified using system properties
before-initialize — this defines the list of parameters which can be set only for not initialized workspaces (e.g. during the first start of the AS)

The parameter in the list have the following format: {component-name}.{parameter-name}. This takes affect for every workspace component called {component-name}.

Please take into account that if this component is not defined in the configuration, the workspace configuration overriding using system properties mechanism will be disabled. In other words: if you don't configure SystemParametersPersistenceConfigurator, the system properties are ignored.

1.14. Multi-language support in eXo JCR RDB backend

Whenever relational database is used to store multilingual text data of eXo Java Content Repository, it is necessary to adapt configuration in order to support UTF-8 encoding. Here is a short HOWTO instruction for several supported RDBMS with examples.

The configuration file you have to modify: .../webapps/portal/WEB-INF/conf/jcr/repository-configuration.xml

Note

Datasource jdbcjcr used in examples can be configured via InitialContextInitializer component.

1.14.1. Oracle

In order to run multilanguage JCR on an Oracle backend Unicode encoding for characters set should be applied to the database. Other Oracle globalization parameters don't make any impact. The only property to modify is NLS_CHARACTERSET.

We have tested NLS_CHARACTERSET = AL32UTF8 and it works well for many European and Asian languages.

Example of database configuration (used for JCR testing):

NLS_LANGUAGE             AMERICAN
NLS_TERRITORY            AMERICA
NLS_CURRENCY             $
NLS_ISO_CURRENCY         AMERICA
NLS_NUMERIC_CHARACTERS   .,
NLS_CHARACTERSET         AL32UTF8
NLS_CALENDAR             GREGORIAN
NLS_DATE_FORMAT          DD-MON-RR
NLS_DATE_LANGUAGE        AMERICAN
NLS_SORT                 BINARY
NLS_TIME_FORMAT          HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT     DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT       HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT  DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY        $
NLS_COMP                 BINARY
NLS_LENGTH_SEMANTICS     BYTE
NLS_NCHAR_CONV_EXCP      FALSE
NLS_NCHAR_CHARACTERSET   AL16UTF16

Warning

JCR doesn't use NVARCHAR columns, so that the value of the parameter NLS_NCHAR_CHARACTERSET does not matter for JCR.

Create database with Unicode encoding and use Oracle dialect for the Workspace Container:

<workspace name="collaboration">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr" />
              <property name="dialect" value="oracle" />
              <property name="multi-db" value="false" />
              <property name="max-buffer-size" value="200k" />
              <property name="swap-directory" value="target/temp/swap/ws" />
            </properties>
          .....

1.14.2. DB2

DB2 Universal Database (DB2 UDB) supports UTF-8 and UTF-16/UCS-2. When a Unicode database is created, CHAR, VARCHAR, LONG VARCHAR data are stored in UTF-8 form. It's enough for JCR multi-lingual support.

Example of UTF-8 database creation:

DB2 CREATE DATABASE dbname USING CODESET UTF-8 TERRITORY US

Create database with UTF-8 encoding and use db2 dialect for Workspace Container on DB2 v.9 and higher:

<workspace name="collaboration">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr" />
              <property name="dialect" value="db2" />
              <property name="multi-db" value="false" />
              <property name="max-buffer-size" value="200k" />
              <property name="swap-directory" value="target/temp/swap/ws" />
            </properties>
          .....

Note

For DB2 v.8.x support change the property "dialect" to db2v8.

1.14.3. MySQL

JCR MySQL-backend requires special dialect MySQL-UTF8 to be used for internationalization support. But the database default charset should be latin1 to use limited index space effectively (1000 bytes for MyISAM engine, 767 for InnoDB). If database default charset is multibyte, a JCR database initialization error is thrown concerning index creation failure. In other words, JCR can work on any singlebyte default charset of database, with UTF8 supported by MySQL server. But we have tested it only on latin1 database default charset.

Repository configuration, workspace container entry example:

<workspace name="collaboration">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr" />
              <property name="dialect" value="mysql-utf8" />
              <property name="multi-db" value="false" />
              <property name="max-buffer-size" value="200k" />
              <property name="swap-directory" value="target/temp/swap/ws" />
            </properties>
          .....

You will need also to indicate the charset name either at server level using the server parameter --character-set-server (find more details there ) or at datasource configuration level by adding a new property as below:

          <property name="connectionProperties" value="useUnicode=yes;characterEncoding=utf8;characterSetResults=UTF-8;" />

1.14.4. PostgreSQL/PostgrePlus

On PostgreSQL/PostgrePlus-backend, multilingual support can be enabled in different ways:

Using the locale features of the operating system to provide locale-specific collation order, number formatting, translated messages, and other aspects. UTF-8 is widely used on Linux distributions by default, so it can be useful in such case.
Providing a number of different character sets defined in the PostgreSQL/PostgrePlus server, including multiple-byte character sets, to support storing text of any languages, and providing character set translation between client and server. We recommend to use UTF-8 database charset, it will allow any-to-any conversations and make this issue transparent for the JCR.

Create database with UTF-8 encoding and use a PgSQL dialect for Workspace Container:

<workspace name="collaboration">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr" />
              <property name="dialect" value="pgsql" />
              <property name="multi-db" value="false" />
              <property name="max-buffer-size" value="200k" />
              <property name="swap-directory" value="target/temp/swap/ws" />
            </properties>
          .....

1.15. How to host several JCR instances on the same database instance?

Frequently, a single database instance must be shared by several other applications. But some of our customers have also asked for a way to host several JCR instances in the same database instance. To fulfill this need, we had to review our queries and scope them to the current schema; it is now possible to have one JCR instance per DB schema instead of per DB instance. To benefit of the work done for this feature you will need to apply the configuration changes described below.

1.15.1. LockManager configuration

To enable this feature you need to replace org.jboss.cache.loader.JDBCCacheLoader with org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader in JBossCache configuration file.

Here is an example of this very part of the configuration:

<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="500" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />

   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>

   <loaders passivation="false" shared="true">
      <!-- All the data of the JCR locks needs to be loaded at startup -->
      <preload>
         <node fqn="/" />
      </preload>  
      <!--
      For another cache-loader class you should use another template with
      cache-loader specific parameters
      -->
      <loader class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader" async="false" fetchPersistentState="false"
         ignoreModifications="false" purgeOnStartup="false">
         <properties>
            cache.jdbc.table.name=${jbosscache-cl-cache.jdbc.table.name}
            cache.jdbc.table.create=${jbosscache-cl-cache.jdbc.table.create}
            cache.jdbc.table.drop=${jbosscache-cl-cache.jdbc.table.drop}
            cache.jdbc.table.primarykey=${jbosscache-cl-cache.jdbc.table.primarykey}
            cache.jdbc.fqn.column=${jbosscache-cl-cache.jdbc.fqn.column}
            cache.jdbc.fqn.type=${jbosscache-cl-cache.jdbc.fqn.type}
            cache.jdbc.node.column=${jbosscache-cl-cache.jdbc.node.column}
            cache.jdbc.node.type=${jbosscache-cl-cache.jdbc.node.type}
            cache.jdbc.parent.column=${jbosscache-cl-cache.jdbc.parent.column}
            cache.jdbc.datasource=${jbosscache-cl-cache.jdbc.datasource}
         </properties>
      </loader>
   </loaders>
</jbosscache>

You can also obtain file example from GitHub.

1.15.2. HibernateService configuration

If you use HibernateService for JDBC connections management you will need to specify explicitly the default schema by setting "hibernate.default_schema" property in the configuration of HibernateService.

Here is an example:

<component>
    <key>org.exoplatform.services.database.HibernateService</key>
    <jmx-name>database:type=HibernateService</jmx-name>
    <type>org.exoplatform.services.database.impl.HibernateServiceImpl</type>
    <init-params>
      <properties-param>
        <name>hibernate.properties</name>
        <description>Default Hibernate Service</description>
        ...........
        <property name="hibernate.default_schema" value="${gatein.idm.datasource.schema:}"/>
      </properties-param>
    </init-params>
</component>

1.16. Search Configuration

Search is an important function in eXo JCR, so it is very necessary for you to know how to configure the eXo JCR Search tool.

1.16.1. XML Configuration

JCR index configuration. You can find this file here: .../portal/WEB-INF/conf/jcr/repository-configuration.xml

<repository-service default-repository="db1">
  <repositories>
    <repository name="db1" system-workspace="ws" default-workspace="ws">
       ....
      <workspaces>
        <workspace name="ws">
       ....
          <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
            <properties>
              <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
              <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
              <property name="synonymprovider-config-path" value="/synonyms.properties" />
              <property name="indexing-configuration-path" value="/indexing-configuration.xml" />
              <property name="query-class" value="org.exoplatform.services.jcr.impl.core.query.QueryImpl" />
            </properties>
          </query-handler>
        ... 
        </workspace>
     </workspaces>
    </repository>        
  </repositories>
</repository-service>

1.16.2. Configuration parameters

Table 1.2.

Parameter	Default	Description	Since
index-dir	none	The location of the index directory. This parameter is mandatory. Up to 1.9, this parameter called "indexDir"	1.0
use-compoundfile	true	Advises lucene to use compound files for the index files.	1.9
min-merge-docs	100	Minimum number of nodes in an index until segments are merged.	1.9
volatile-idle-time	3	Idle time in seconds until the volatile index part is moved to a persistent index even though minMergeDocs is not reached.	1.9
max-merge-docs	Integer.MAX_VALUE	Maximum number of nodes in segments that will be merged. The default value changed in JCR 1.9 to Integer.MAX_VALUE.	1.9
merge-factor	10	Determines how often segment indices are merged.	1.9
max-field-length	10000	The number of words that are fulltext indexed at most per property.	1.9
cache-size	1000	Size of the document number cache. This cache maps uuids to lucene document numbers	1.9
force-consistencycheck	false	Runs a consistency check on every startup. If false, a consistency check is only performed when the search index detects a prior forced shutdown.	1.9
auto-repair	true	Errors detected by a consistency check are automatically repaired. If false, errors are only written to the log.	1.9
query-class	QueryImpl	Class name that implements the javax.jcr.query.Query interface.This class must also extend from the class: org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl.	1.9
document-order	true	If true and the query does not contain an 'order by' clause, result nodes will be in document order. For better performance when queries return a lot of nodes set to 'false'.	1.9
result-fetch-size	Integer.MAX_VALUE	The number of results when a query is executed. Default value: Integer.MAX_VALUE (-> all).	1.9
excerptprovider-class	DefaultXMLExcerpt	The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider and should be used for the rep:excerpt() function in a query.	1.9
support-highlighting	false	If set to true additional information is stored in the index to support highlighting using the rep:excerpt() function.	1.9
synonymprovider-class	none	The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider. The default value is null (-> not set).	1.9
synonymprovider-config-path	none	The path to the synonym provider configuration file. This path interpreted is relative to the path parameter. If there is a path element inside the SearchIndex element, then this path is interpreted and relative to the root path of the path. Whether this parameter is mandatory or not, it depends on the synonym provider implementation. The default value is null (-> not set).	1.9
indexing-configuration-path	none	The path to the indexing configuration file.	1.9
indexing-configuration-class	IndexingConfigurationImpl	The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration.	1.9
force-consistencycheck	false	If setting to true, a consistency check is performed, depending on the parameter forceConsistencyCheck. If setting to false, no consistency check is performed on startup, even if a redo log had been applied.	1.9
spellchecker-class	none	The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker.	1.9
spellchecker-more-popular	true	If setting true, spellchecker returns only the suggest words that are as frequent or more frequent than the checked word. If setting false, spellchecker returns null (if checked word exit in dictionary), or spellchecker will return most close suggest word.	1.10
spellchecker-min-distance	0.55f	Minimal distance between checked word and proposed suggest word.	1.10
errorlog-size	50(Kb)	The default size of error log file in Kb.	1.9
upgrade-index	false	Allows JCR to convert an existing index into the new format. Also, it is possible to set this property via system property, for example: -Dupgrade-index=true Indexes before JCR 1.12 will not run with JCR 1.12. Hence you have to run an automatic migration: Start JCR with -Dupgrade-index=true. The old index format is then converted in the new index format. After the conversion the new format is used. On the next start, you don't need this option anymore. The old index is replaced and a back conversion is not possible - therefore better take a backup of the index before. (Only for migrations from JCR 1.9 and later.)	1.12
analyzer	org.apache.lucene.analysis.standard.StandardAnalyzer	Class name of a lucene analyzer to use for fulltext indexing of text.	1.12

Note

The Maximum number of clauses permitted per BooleanQuery, can be changed via the System property org.apache.lucene.maxClauseCount. The default value of this parameter is Integer.MAX_VALUE.

1.16.3. Global Search Index

1.16.3.1. Global Search Index Configuration

The global search index is configured in the above-mentioned configuration file (portal/WEB-INF/conf/jcr/repository-configuration.xml) in the tag "query-handler".

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">

In fact, when using Lucene, you should always use the same analyzer for indexing and for querying, otherwise the results are unpredictable. You don't have to worry about this, eXo JCR does this for you automatically. If you don't like the StandardAnalyzer configured by default, just replace it by your own.

If you don't have a handy QueryHandler, you should learn how to create a customized Handler in 5 minutes.

1.16.3.2. Customized Search Indexes and Analyzers

By default Exo JCR uses the Lucene standard Analyzer to index contents. This analyzer uses some standard filters in the method that analyzes the content:

public TokenStream tokenStream(String fieldName, Reader reader) {
    StandardTokenizer tokenStream = new StandardTokenizer(reader, replaceInvalidAcronym);
    tokenStream.setMaxTokenLength(maxTokenLength);
    TokenStream result = new StandardFilter(tokenStream);
    result = new LowerCaseFilter(result);
    result = new StopFilter(result, stopSet);
    return result;
  }

The first one (StandardFilter) removes 's (as 's in "Peter's") from the end of words and removes dots from acronyms.
The second one (LowerCaseFilter) normalizes token text to lower case.
The last one (StopFilter) removes stop words from a token stream. The stop set is defined in the analyzer.

For specific cases, you may wish to use additional filters like ISOLatin1AccentFilter, which replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalents.

In order to use a different filter, you have to create a new analyzer, and a new search index to use the analyzer. You put it in a jar, which is deployed with your application.

1.16.3.2.1. Creating the filter

The ISOLatin1AccentFilter is not present in the current Lucene version used by eXo. You can use the attached file. You can also create your own filter, the relevant method is

public final Token next(final Token reusableToken) throws java.io.IOException

which defines how chars are read and used by the filter.

1.16.3.2.2. Creating the analyzer

The analyzer has to extends org.apache.lucene.analysis.standard.StandardAnalyzer, and overload the method

public TokenStream tokenStream(String fieldName, Reader reader)

to put your own filters. You can have a glance at the example analyzer attached to this article.

1.16.3.2.3. Creating the search index

Now, we have the analyzer, we have to write the SearchIndex, which will use the analyzer. Your have to extends org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex. You have to write the constructor, to set the right analyzer, and the method

public Analyzer getAnalyzer() {
    return MyAnalyzer;
  }

to return your analyzer. You can see the attached SearchIndex.

Note

Since 1.12 version, we can set Analyzer directly in configuration. So, creation new SearchIndex only for new Analyzer is redundant.

1.16.3.2.4. Configuring your application to use your SearchIndex

In portal/WEB-INF/conf/jcr/repository-configuration.xml, you have to replace each

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">

by your own class

<query-handler class="mypackage.indexation.MySearchIndex">

1.16.3.2.5. Configure your application to use your Analyzer

In portal/WEB-INF/conf/jcr/repository-configuration.xml, you have to add parameter "analyzer" to each query-handler config:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="analyzer" value="org.exoplatform.services.jcr.impl.core.MyAnalyzer"/>
      ...
   </properties>
</query-handler>

When you start exo, your SearchIndex will start to index contents with the specified filters.

1.16.4. Indexing Adjustments

1.16.4.1. IndexingConfiguration

Starting with version 1.9, the default search index implementation in JCR allows you to control which properties of a node are indexed. You also can define different analyzers for different nodes.

The configuration parameter is called indexingConfiguration and per default is not set. This means all properties of a node are indexed.

If you wish to configure the indexing behavior, you need to add a parameter to the query-handler element in your configuration file.

<property name="indexing-configuration-path" value="/indexing_configuration.xml"/>

Index configuration path can indicate any file located on the file system, in the jar or war files.

Please note that you have to declare the namespace prefixes in the configuration element that you are using throughout the XML file!

1.16.4.2. Indexing rules

1.16.4.2.1. Node Scope Limit

To optimize the index size, you can limit the node scope so that only certain properties of a node type are indexed.

With the below configuration, only properties named Text are indexed for nodes of type nt:unstructured. This configuration also applies to all nodes whose type extends from nt:unstructured.

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured">
    <property>Text</property>
  </index-rule>
</configuration>

1.16.4.2.2. Indexing Boost Value

It is also possible to configure a boost value for the nodes that match the index rule. The default boost value is 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) will yield a higher score value and appear as more relevant.

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured"
              boost="2.0">
    <property>Text</property>
  </index-rule>
</configuration>

If you do not wish to boost the complete node but only certain properties, you can also provide a boost value for the listed properties:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured">
    <property boost="3.0">Title</property>
    <property boost="1.5">Text</property>
  </index-rule>
</configuration>

1.16.4.2.3. Conditional Index Rules

You may also add a condition to the index rule and have multiple rules with the same nodeType. The first index rule that matches will apply and all remain ones are ignored:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured"
              boost="2.0"
              condition="@priority = 'high'">
    <property>Text</property>
  </index-rule>
  <index-rule nodeType="nt:unstructured">
    <property>Text</property>
  </index-rule>
</configuration>

In the above example, the first rule only applies if the nt:unstructured node has a priority property with a value 'high'. The condition syntax supports only the equals operator and a string literal.

You may also refer properties in the condition that are not on the current node:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured"
              boost="2.0"
              condition="ancestor::*/@priority = 'high'">
    <property>Text</property>
  </index-rule>
  <index-rule nodeType="nt:unstructured"
              boost="0.5"
              condition="parent::foo/@priority = 'low'">
    <property>Text</property>
  </index-rule>
  <index-rule nodeType="nt:unstructured"
              boost="1.5"
              condition="bar/@priority = 'medium'">
    <property>Text</property>
  </index-rule>
  <index-rule nodeType="nt:unstructured">
    <property>Text</property>
  </index-rule>
</configuration>

The indexing configuration also allows you to specify the type of a node in the condition. Please note however that the type match must be exact. It does not consider sub types of the specified node type.

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured"
              boost="2.0"
              condition="element(*, nt:unstructured)/@priority = 'high'">
    <property>Text</property>
  </index-rule>
</configuration>

1.16.4.2.4. Exclusion from the Node Scope Index

Per default all configured properties are fulltext indexed if they are of type STRING and included in the node scope index. A node scope search finds normally all nodes of an index. That is, the select jcr:contains(., 'foo') returns all nodes that have a string property containing the word 'foo'. You can exclude explicitly a property from the node scope index:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured">
    <property nodeScopeIndex="false">Text</property>
  </index-rule>
</configuration>

1.16.4.2.5. Nodes exclusion From Query Results

You have an ability to disable the indexing on nodes that are sub nodes of excluded paths and/or that are of a given type. To get this done you simply need to add some lines to the configuration file:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.3.dtd">
<configuration xmlns:exo="http://www.exoplatform.com/jcr/exo/1.0">
  <exclude nodeType="exo:hiddenable"/>
  <exclude path="/my[2]/path"/>
  <exclude nodeType="exo:foo" path="/my/other[2]/path"/>
</configuration>

This will exclude nodes of type "exo:hiddenable" and nodes with the path "/my[2]/path" from the results. As you see you can also combine exclusions.

1.16.4.3. Indexing Aggregates

Sometimes it is useful to include the contents of descendant nodes into a single node to easier search on content that is scattered across multiple nodes.

JCR allows you to define indexed aggregates, basing on relative path patterns and primary node types.

The following example creates an indexed aggregate on nt:file that includes the content of the jcr:content node:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <aggregate primaryType="nt:file">
    <include>jcr:content</include>
  </aggregate>
</configuration>

You can also restrict the included nodes to a certain type:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <aggregate primaryType="nt:file">
    <include primaryType="nt:resource">jcr:content</include>
  </aggregate>
</configuration>

You may also use the * to match all child nodes:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <aggregate primaryType="nt:file">
    <include primaryType="nt:resource">*</include>
  </aggregate>
</configuration>

If you wish to include nodes up to a certain depth below the current node, you can add multiple include elements. E.g. the nt:file node may contain a complete XML document under jcr:content:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <aggregate primaryType="nt:file">
    <include>*</include>
    <include>*/*</include>
    <include>*/*/*</include>
  </aggregate>
</configuration>

1.16.4.4. Property-Level Analyzers

1.16.4.4.1. Example

In this configuration section, you define how a property has to be analyzed. If there is an analyzer configuration for a property, this analyzer is used for indexing and searching of this property. For example:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <analyzers> 
        <analyzer class="org.apache.lucene.analysis.KeywordAnalyzer">
            <property>mytext</property>
        </analyzer>
        <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer">
            <property>mytext2</property>
        </analyzer>
  </analyzers> 
</configuration>

The configuration above means that the property "mytext" for the entire workspace is indexed (and searched) with the Lucene KeywordAnalyzer, and property "mytext2" with the WhitespaceAnalyzer. Using different analyzers for different languages is particularly useful.

The WhitespaceAnalyzer tokenizes a property, the KeywordAnalyzer takes the property as a whole.

1.16.4.4.2. Characteristics of Node Scope Searches

When using analyzers, you may encounter an unexpected behavior when searching within a property compared to searching within a node scope. The reason is that the node scope always uses the global analyzer.

Let's suppose that the property "mytext" contains the text : "testing my analyzers" and that you haven't configured any analyzers for the property "mytext" (and not changed the default analyzer in SearchIndex).

If your query is for example:

xpath = "//*[jcr:contains(mytext,'analyzer')]"

This xpath does not return a hit in the node with the property above and default analyzers.

Also a search on the node scope

xpath = "//*[jcr:contains(.,'analyzer')]"

won't give a hit. Realize that you can only set specific analyzers on a node property, and that the node scope indexing/analyzing is always done with the globally defined analyzer in the SearchIndex element.

Now, if you change the analyzer used to index the "mytext" property above to

<analyzer class="org.apache.lucene.analysis.Analyzer.GermanAnalyzer">
     <property>mytext</property>
</analyzer>

and you do the same search again, then for

xpath = "//*[jcr:contains(mytext,'analyzer')]"

you would get a hit because of the word stemming (analyzers - analyzer).

The other search,

xpath = "//*[jcr:contains(.,'analyzer')]"

still would not give a result, since the node scope is indexed with the global analyzer, which in this case does not take into account any word stemming.

In conclusion, be aware that when using analyzers for specific properties, you might find a hit in a property for some search text, and you do not find a hit with the same search text in the node scope of the property!

Note

Both index rules and index aggregates influence how content is indexed in JCR. If you change the configuration, the existing content is not automatically re-indexed according to the new rules. You, therefore, have to manually re-index the content when you change the configuration!

1.16.4.5. Advanced features

eXo JCR supports some advanced features, which are not specified in JSR 170:

Get a text excerpt with highlighted words that matches the query: ExcerptProvider.
Search a term and its synonyms: SynonymSearch
Search similar nodes: SimilaritySearch
Check spelling of a full text query statement: SpellChecker
Define index aggregates and rules: IndexingConfiguration.

1.17. JCR Configuration persister

eXo JCR allows using persister to store configuration. In this section, you will understand how to use and configure eXo JCR persister.

1.17.1. Idea

JCR Repository Service uses org.exoplatform.services.jcr.config.RepositoryServiceConfiguration component to read its configuration.

<component>
    <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
    <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
    <init-params>
      <value-param>
        <name>conf-path</name>
        <description>JCR configuration file</description>
        <value>/conf/standalone/exo-jcr-config.xml</value>
      </value-param>
    </init-params>
  </component>

In the example, Repository Service will read the configuration from the file /conf/standalone/exo-jcr-config.xml.

But in some cases, it's required to change the configuration on the fly. And know that the new one will be used. Additionally we wish not to modify the original file.

In this case, we have to use the configuration persister feature which allows to store the configuration in different locations.

1.17.2. Usage

On startup RepositoryServiceConfiguration component checks if a configuration persister was configured. In that case, it uses the provided ConfigurationPersister implementation class to instantiate the persister object.

Configuration with persister:

<component>
    <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
    <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
    <init-params>
      <value-param>
        <name>conf-path</name>
        <description>JCR configuration file</description>
        <value>/conf/standalone/exo-jcr-config.xml</value>
      </value-param>
      <properties-param>
        <name>working-conf</name>
        <description>working-conf</description>
        <property name="source-name" value="jdbcjcr" />
        <property name="dialect" value="mysql" />
        <property name="persister-class-name" value="org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister" />
      </properties-param>
    </init-params>
  </component>

eXo JCR persistent data container can work in two configuration modes:

source-name: JNDI source name configured in InitialContextInitializer component. (sourceName prior v.1.9.) Find more in database configuration.
dialect: SQL dialect which will be used with database from source-name. Find more in database configuration.
persister-class-name - class name of ConfigurationPersister interface implementation. (persisterClassName prior v.1.9.)

ConfigurationPersister interface:

/**
   * Init persister.
   * Used by RepositoryServiceConfiguration on init. 
   * @return - config data stream
   */
  void init(PropertiesParam params) throws RepositoryConfigurationException;
  
  /**
   * Read config data.
   * @return - config data stream
   */
  InputStream read() throws RepositoryConfigurationException;
  
  /**
   * Create table, write data.
   * @param confData - config data stream
   */
  void write(InputStream confData) throws RepositoryConfigurationException;
  
  /**
   * Tell if the config exists.
   * @return - flag
   */
  boolean hasConfig() throws RepositoryConfigurationException;

JCR Core implementation contains a persister which stores the repository configuration in the relational database using JDBC calls - org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister.

The implementation will crate and use table JCR_CONFIG in the provided database.

But the developer can implement his own persister for his particular usecase.

1.18. JDBC Data Container Config

Multi-database: One database for each workspace (used in standalone eXo JCR service mode)
Single-database: All workspaces persisted in one database (used in embedded eXo JCR service mode, e.g. in GateIn)

The data container uses the JDBC driver to communicate with the actual database software, i.e. any JDBC-enabled data storage can be used with eXo JCR implementation.

Currently the data container is tested with the following configurations:

MySQL 5.0.18 MYSQL Connector/J 5.0.8
MySQL 5.1.36 MYSQL Connector/J 5.1.14
MySQL 5.5.17 MYSQL Connector/J 5.1.18
MySQL Cluster (NDB engine)
PostgreSQL 8.2.4 JDBC4 Driver, Version 8.2-507
PostgreSQL 8.3.7 JDBC4 Driver, Version 8.3-606
PostgreSQL 8.4.14 JDBC4 Driver, Version 8.4-702
PostgreSQL 9.1.5 JDBC4 Driver, Version 9.1-902
PostgreSQL 9.2.4 JDBC4 Driver, Version 9.2-1002
Enterprise DB Postgres Plus Advanced Server 9.2.1 JDBC4 Driver, Version 9.2.1.3
Oracle DB 10g R2 (10.2.0.4), JDBC Driver Oracle 10g R2 (10.2.0.4)
Oracle DB 11g R1 (11.1.0.6.0), JDBC Driver Oracle 11g R1 (11.1.0.6.0)
Oracle DB 11g R2 (11.2.0.1.0), JDBC Driver Oracle 11g R2 (11.2.0.1.0)
DB2 9.7.4 IBM Data Server Driver for JDBC and SQLJ (JCC Driver) v.9.7
MS SQL Server 2005 SP3 JDBC Driver 3.0
MS SQL Server 2008 JDBC Driver 3.0
MS SQL Server 2008 R2 JDBC Driver 3.0
Sybase 15.0.3 ASE Driver: Sybase jConnect JDBC driver v7 (Build 26502)
Sybase ASE 15.7 Driver: Sybase jConnect JDBC driver v7 (build 26666)
HSQLDB (2.0.0)
H2 (1.3.161)

Each database software supports ANSI SQL standards but also has its own specifics. So, each database has its own configuration in eXo JCR as a database dialect parameter. If you need a more detailed configuration of the database, it's possible to do that by editing the metadata SQL-script files.

SQL-scripts you can obtain from jar-file exo.jcr.component.core-XXX.XXX.jar:conf/storage/. They also can be found at GitHub here.

In the next two tables correspondence between the scripts and databases is shown.

Table 1.3. Single-database
MySQL DB	jcr-sjdbc.mysql.sql
MySQL DB with utf-8	jcr-sjdbc.mysql-utf8.sql
MySQL DB with MyISAM*	jcr-sjdbc.mysql-myisam.sql
MySQL DB with MyISAM and utf-8*	jcr-sjdbc.mysql-myisam-utf8.sql
MySQL DB with NDB engine	jcr-sjdbc.mysql-ndb.sql
MySQL DB with NDB engine and utf-8	jcr-sjdbc.mysql-ndb-utf8.sql
PostgresSQL and Postgre Plus	jcr-sjdbc.pqsql.sql
Oracle DB	jcr-sjdbc.ora.sql
DB2	jcr-sjdbc.db2.sql
MS SQL Server	jcr-sjdbc.mssql.sql
Sybase	jcr-sjdbc.sybase.sql
HSQLDB	jcr-sjdbc.sql
H2	jcr-sjdbc.h2.sql

Table 1.4. Multi-database
MySQL DB	jcr-mjdbc.mysql.sql
MySQL DB with utf-8	jcr-mjdbc.mysql-utf8.sql
MySQL DB with MyISAM*	jcr-mjdbc.mysql-myisam.sql
MySQL DB with MyISAM and utf-8*	jcr-mjdbc.mysql-myisam-utf8.sql
MySQL DB with NDB engine	jcr-mjdbc.mysql-ndb.sql
MySQL DB with NDB engine and utf-8	jcr-mjdbc.mysql-ndb-utf8.sql
PostgresSQL and Postgre Plus	jcr-mjdbc.pqsql.sql
Oracle DB	jcr-mjdbc.ora.sql
DB2	jcr-mjdbc.db2.sql
MS SQL Server	jcr-mjdbc.mssql.sql
Sybase	jcr-mjdbc.sybase.sql
HSQLDB	jcr-mjdbc.sql
H2	jcr-mjdbc.h2.sql

In case the non-ANSI node name is used, it's necessary to use a database with MultiLanguage support. Some JDBC drivers need additional parameters for establishing a Unicode friendly connection. E.g. under mysql it's necessary to add an additional parameter for the JDBC driver at the end of JDBC URL. For instance: jdbc:mysql://exoua.dnsalias.net/portal?characterEncoding=utf8

There are preconfigured configuration files for HSQLDB. Look for these files in /conf/portal and /conf/standalone folders of the jar-file exo.jcr.component.core-XXX.XXX.jar or source-distribution of eXo JCR implementation.

By default, the configuration files are located in service jars /conf/portal/configuration.xml (eXo services including JCR Repository Service) and exo-jcr-config.xml (repositories configuration). In GateIn product, JCR is configured in portal web application portal/WEB-INF/conf/jcr/jcr-configuration.xml (JCR Repository Service and related serivces) and repository-configuration.xml (repositories configuration).

Read more about Repository configuration.

1.18.1. General recommendations for database configuration

Please note, that JCR requires at least READ_COMMITED isolation level and other RDBMS configurations can cause some side-effects and issues. So, please, make sure proper isolation level is configured on database server side.

1.18.1.1. DB2 configuration

RDBMS reindexing feature use queries based on LIMIT and OFFSET clauses which are not enabled by default. However, you can ensure they are enabled by executing the following
```
$ db2set DB2_COMPATIBILITY_VECTOR=MYS
$ db2stop
$ db2start
```
Statistics is collected automatically starting from DB2 Version 9, however it is needed to launch statistics collection manually during the very first start, otherwise it could be very long. You need to run 'RUNSTATS' command
```
RUNSTATS ON TABLE <scheme>.<table> WITH DISTRIBUTION AND INDEXES ALL
```
for JCR_SITEM (or JCR_MITEM) and JCR_SVALUE (or JCR_MVALUE) tables.
If you don't want to enable the LIMIT/OFFSET clauses, you can still use "db2-mys" as dialect however please note that the indexing is 120 times slower.

1.18.1.2. MySQL configuration

MyISAM is not supported due to its lack of transaction support and integrity check, so use it only if you don't expect any support and if performances in read accesses are more important than the consistency in your use-case. This dialect is only dedicated to the community.
MySQL relies on collected statistics for keeping track of data distribution in tables and for optimizing join statements, but you can manually call 'ANALYZE' to update statistics if needed. For example
```
ANALYZE TABLE JCR_SITEM, JCR_SVALUE
```

1.18.1.3. PostgreSQL/PostgrePlus configuration

Be aware, when using the RDBMS reindexing, you need to set "enable_seqscan" to "off" or "default_statistics_target" to at least "50"
Though PostgreSQL/PostgrePlus server performs query optimization automatically, you can manualy call 'ANALYZE' command to collect statistics which can influence the performance. For example
```
ANALYZE JCR_SITEM
ANALYZE JCR_SVALUE
```
If for a version prior to 9.1, the parameter standard_conforming_strings is enabled, you need to use "pgsql-scs" as dialect

1.18.1.4. MS SQL configuration

One more mandatory JCR requirement for underlying databases is a case sensitive collation. Microsoft SQL Server both 2005 and 2008 customers must configure their server with collation corresponding to personal needs and requirements, but obligatorily case sensitive. For more information please refer to Microsoft SQL Server documentation page "Selecting a SQL Server Collation" here.
MS SQL DB server's optimizer automatically processes queries to increase performance. Optimization is based on statistical data which is collected automatically, but you can manually call Transact-SQL command 'UPDATE STATISTICS' which in very few situations may increase performance. For example
```
UPDATE STATISTICS JCR_SITEM
UPDATE STATISTICS JCR_SVALUE
```

1.18.1.5. Sybase configuration

Sybase DB Server optimizer automatically processes queries to increase performance. Optimization is based on statistical data which is collected automatically, but you can manually call Transact-SQL command 'update statistics' which in very few situations may increase performance. For example
```
update statistics JCR_SITEM
update statistics JCR_SVALUE
```

1.18.1.6. Oracle configuration

Oracle DB automatically collects statistics to optimize performance of queries, but you can manually call 'ANALYZE' command to start collecting statistics immediately which may improve performance. For example

ANALYZE TABLE JCR_SITEM COMPUTE STATISTICS
ANALYZE TABLE JCR_SVALUE COMPUTE STATISTICS
ANALYZE TABLE JCR_SREF COMPUTE STATISTICS
ANALYZE INDEX JCR_PK_SITEM COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SITEM_PARENT_FK COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SITEM_PARENT COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SITEM_PARENT_NAME COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SITEM_PARENT_ID COMPUTE STATISTICS
ANALYZE INDEX JCR_PK_SVALUE COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SVALUE_PROPERTY COMPUTE STATISTICS
ANALYZE INDEX JCR_PK_SREF COMPUTE STATISTICS
ANALYZE INDEX JCR_IDX_SREF_PROPERTY COMPUTE STATISTICS
ANALYZE INDEX JCR_PK_SCONTAINER COMPUTE STATISTICS

1.18.2. Isolated-database Configuration

Isolated-database configuration allows to configure single database for repository but separate database tables for each workspace. First step is to configure the data container in the org.exoplatform.services.naming.InitialContextInitializer service. It's the JNDI context initializer, which registers (binds) naming resources (DataSources) for data containers.

We configure the database connection parameters:

 <external-component-plugins>
    <target-component>org.exoplatform.services.naming.InitialContextInitializer</target-component>
    <component-plugin>
      <name>bind.datasource</name>
      <set-method>addPlugin</set-method>
      <type>org.exoplatform.services.naming.BindReferencePlugin</type>
      <init-params>
        <value-param>
          <name>bind-name</name>
          <value>jdbcjcr</value>
        </value-param>
        <value-param>
          <name>class-name</name>
          <value>javax.sql.DataSource</value>
        </value-param>
        <value-param>
          <name>factory</name>
          <value>org.apache.commons.dbcp.BasicDataSourceFactory</value>
        </value-param>
          <properties-param>
            <name>ref-addresses</name>
            <description>ref-addresses</description>
            <property name="driverClassName" value="org.postgresql.Driver"/>
            <property name="url" value="jdbc:postgresql://exoua.dnsalias.net/portal"/>
            <property name="username" value="exoadmin"/>
            <property name="password" value="exo12321"/>
          </properties-param>
      </init-params>
    </component-plugin>
  </external-component-plugins>

driverClassName, e.g. "org.hsqldb.jdbcDriver", "com.mysql.jdbc.Driver", "org.postgresql.Driver"
url, e.g. "jdbc:hsqldb:file:target/temp/data/portal", "jdbc:mysql://exoua.dnsalias.net/jcr"
username, e.g. "sa", "exoadmin"
password, e.g. "", "exo12321"

When the data container configuration is done, we can configure the repository service. Each workspace will be configured for the same data container.

In this way, we have configured two workspace which will be persisted in different database tables.

<workspaces>
   <workspace name="ws">
      <!-- for system storage -->
      <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
         <properties>
            <property name="source-name" value="jdbcjcr" />
            <property name="db-structure-type" value="isolated" />
            ...
         </properties>
         ...
      </container>
      ...
   </workspace>

   <workspace name="ws1">
      <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
         <properties>
            <property name="source-name" value="jdbcjcr" />
            <property name="db-structure-type" value="isolated" />
            ...
         </properties>
         ...
      </container>
      ...
   </workspace>
</workspaces>

Note

Starting from v.1.9 repository configuration parameters supports human-readable formats of values (e.g. 200K - 200 Kbytes, 30m - 30 minutes etc)

1.18.3. Multi-database Configuration

Note

This configuration option is now deprecated. Use isolated database configuration instead.

You need to configure each workspace in a repository. You may have each one on different remote servers as far as you need.

First of all configure the data containers in the org.exoplatform.services.naming.InitialContextInitializer service. It's the JNDI context initializer which registers (binds) naming resources (DataSources) for data containers.

When the data container configuration is done, we can configure the repository service. Each workspace will be configured for its own data container.

<component>
   <key>org.exoplatform.services.naming.InitialContextInitializer</key>
   <type>org.exoplatform.services.naming.InitialContextInitializer</type>
   <component-plugins>
      <component-plugin>
         <name>bind.datasource</name>
         <set-method>addPlugin</set-method>
         <type>org.exoplatform.services.naming.BindReferencePlugin</type>
         <init-params>
            <value-param>
               <name>bind-name</name>
               <value>jdbcjcr</value>
            </value-param>
            <value-param>
               <name>class-name</name>
               <value>javax.sql.DataSource</value>
            </value-param>
            <value-param>
               <name>factory</name>
               <value>org.apache.commons.dbcp.BasicDataSourceFactory</value>
            </value-param>
            <properties-param>
               <name>ref-addresses</name>
               <description>ref-addresses</description>
               <property name="driverClassName" value="org.hsqldb.jdbcDriver"/>
               <property name="url" value="jdbc:hsqldb:file:target/temp/data/portal"/>
               <property name="username" value="sa"/>
               <property name="password" value=""/>
            </properties-param>
         </init-params>
      </component-plugin>
      <component-plugin>
         <name>bind.datasource</name>
         <set-method>addPlugin</set-method>
         <type>org.exoplatform.services.naming.BindReferencePlugin</type>
         <init-params>
            <value-param>
               <name>bind-name</name>
               <value>jdbcjcr1</value>
            </value-param>
            <value-param>
               <name>class-name</name>
               <value>javax.sql.DataSource</value>
            </value-param>
            <value-param>
               <name>factory</name>
               <value>org.apache.commons.dbcp.BasicDataSourceFactory</value>
            </value-param>
            <properties-param>
               <name>ref-addresses</name>
               <description>ref-addresses</description>
               <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
               <property name="url" value="jdbc:mysql://exoua.dnsalias.net/jcr"/>
               <property name="username" value="exoadmin"/>
               <property name="password" value="exo12321"/>
               <property name="maxActive" value="50"/>
               <property name="maxIdle" value="5"/>
               <property name="initialSize" value="5"/>
            </properties-param>
         </init-params>
      </component-plugin>
   <component-plugins>
</component>

In this way, we have configured two workspace which will be persisted in two different databases (ws in HSQLDB, ws1 in MySQL).

<workspaces>
   <workspace name="ws">
      <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
         <properties>
            <property name="source-name" value="jdbcjcr"/>
            <property name="db-structure-type" value="multi"/>
            ...
         </properties>
      </container>
      ...
   </workspace>

   <workspace name="ws1">
      <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
         <properties>
            <property name="source-name" value="jdbcjcr1"/>
            <property name="db-structure-type" value="multi"/>
            ...
         </properties>
      </container>
      ...
   </workspace>
</workspaces>

1.18.4. Single-database configuration

It's simplier to configure a single-database data container. We have to configure one naming resource.

And configure repository workspaces in repositories configuration with this one database.

<external-component-plugins>
    <target-component>org.exoplatform.services.naming.InitialContextInitializer</target-component>
    <component-plugin>
        <name>bind.datasource</name>
        <set-method>addPlugin</set-method>
        <type>org.exoplatform.services.naming.BindReferencePlugin</type>
        <init-params>
          <value-param>
            <name>bind-name</name>
            <value>jdbcjcr</value>
          </value-param>
          <value-param>
            <name>class-name</name>
            <value>javax.sql.DataSource</value>
          </value-param>
          <value-param>
            <name>factory</name>
            <value>org.apache.commons.dbcp.BasicDataSourceFactory</value>
          </value-param>
          <properties-param>
            <name>ref-addresses</name>
            <description>ref-addresses</description>
            <property name="driverClassName" value="org.postgresql.Driver"/>
            <property name="url" value="jdbc:postgresql://exoua.dnsalias.net/portal"/>
            <property name="username" value="exoadmin"/>
            <property name="password" value="exo12321"/>
            <property name="maxActive" value="50"/>
            <property name="maxIdle" value="5"/>
            <property name="initialSize" value="5"/>
          </properties-param>
        </init-params>
    </component-plugin>
  </external-component-plugins>

In this way, we have configured two workspaces which will be persisted in one database (PostgreSQL).

<workspaces>
  <workspace name="ws">
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
      <properties>
        <property name="source-name" value="jdbcjcr"/>
        <property name="db-structure-type" value="single" />
        ...
      </properties>
    </container>
    ...
  </workspace>

  <workspace name="ws1">
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
    <properties>
      <property name="source-name" value="jdbcjcr"/>
      <property name="db-structure-type" value="single" />
      ...
    </properties>
    ...
  </workspace>
</workspaces>

1.18.4.1. Dynamic Workspace Creation

Workspaces can be added dynamically during runtime.

This can be performed in two steps:

Firstly, ManageableRepository.configWorkspace(WorkspaceEntry wsConfig) - register a new configuration in RepositoryContainer and create a WorkspaceContainer.
Secondly, the main step, ManageableRepository.createWorkspace(String workspaceName) - creation of a new workspace.

1.18.5. Simple and Complex queries

eXo JCR provides two ways for interact with Database - JDBCStorageConnection that uses simple queries and CQJDBCStorageConection that uses complex queries for reducing amount of database callings.

Simple queries will be used if you chose org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer:

<workspaces>
  <workspace name="ws">
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
    ...
  </workspace>
</worksapces>

Complex queries will be used if you chose org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer:

<workspaces>
  <workspace name="ws">
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
    ...
  </workspace>
</worksapces>

Why we should use a Complex Queries?

They are optimised to reduce amount of requests to database.

Why we should use a Simple Queries?

Simple queries implemented in way to support as many database dialects as possible.

Simple queries do not use sub queries, left or right joins.

1.18.6. Forse Query Hints

Some databases supports hints to increase query performance (like Oracle, MySQL, etc). eXo JCR have separate Complex Query implementation for Orcale dialect, that uses query hints to increase performance for few important queries.

To enable this option put next configuration property:

<workspace name="ws">
  <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
    <properties>
      <property name="dialect" value="oracle"/>
      <property name="force.query.hints" value="true" />
      ......

Query hints enabled by default.

eXo JCR uses query hints only for Complex Query Oracle dialect. For all other dialects this parameter is ignored.

1.18.7. Notes for Microsoft Windows users

The current configuration of eXo JCR uses Apache DBCP connection pool (org.apache.commons.dbcp.BasicDataSourceFactory). It's possible to set a big value for maxActive parameter in configuration.xml. That means usage of lots of TCP/IP ports from a client machine inside the pool (i.e. JDBC driver). As a result, the data container can throw exceptions like "Address already in use". To solve this problem, you have to configure the client's machine networking software for the usage of shorter timeouts for opened TCP/IP ports.

Microsoft Windows has MaxUserPort, TcpTimedWaitDelay registry keys in the node HKEY_LOCAL_MACHINESYSTEMCurrentControlSetServicesTcpipParameters, by default these keys are unset, set each one with values like these:

"TcpTimedWaitDelay"=dword:0000001e, sets TIME_WAIT parameter to 30 seconds, default is 240.
"MaxUserPort"=dword:00001b58, sets the maximum of open ports to 7000 or higher, default is 5000.

A sample registry file is below:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"MaxUserPort"=dword:00001b58
"TcpTimedWaitDelay"=dword:0000001e

1.19. External Value Storages

By default JCR Values are stored in the Workspace Data container along with the JCR structure (i.e. Nodes and Properties). eXo JCR offers an additional option of storing JCR Values separately from Workspace Data container, which can be extremely helpful to keep Binary Large Objects (BLOBs) for example.

Value storage configuration is a part of Repository configuration, find more details there.

Tree-based storage is recommended for most of cases. If you run an application on Amazon EC2 - the S3 option may be interesting for architecture. Simple 'flat' storage is good in speed of creation/deletion of values, it might be a compromise for a small storages.

1.19.1. Tree File Value Storage

Holds Values in tree-like FileSystem files. path property points to the root directory to store the files.

This is a recommended type of external storage, it can contain large amount of files limited only by disk/volume free space.

A disadvantage is that it's a higher time on Value deletion due to unused tree-nodes remove.

<value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
     <properties>
       <property name="path" value="data/values"/>
     </properties>
     <filters>
       <filter property-type="Binary" min-value-size="1M"/>
     </filters>

Where :

id: The value storage unique identifier, used for linking with properties stored in workspace container.

path: A location where value files will be stored.

Each file value storage can have the filter(s) for incoming values. A filter can match values by property type (property-type), property name (property-name), ancestor path (ancestor-path) and/or size of values stored (min-value-size, in bytes). In code sample, we use a filter with property-type and min-value-size only. I.e. storage for binary values with size greater of 1MB. It's recommended to store properties with large values in file value storage only.

Another example shows a value storage with different locations for large files (min-value-size a 20Mb-sized filter). A value storage uses ORed logic in the process of filter selection. That means the first filter in the list will be asked first and if not matched the next will be called etc. Here a value matches the 20 MB-sized filter min-value-size and will be stored in the path "data/20Mvalues", all other in "data/values".

<value-storages>
  <value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
    <properties>
      <property name="path" value="data/20Mvalues"/>
    </properties>
    <filters>
      <filter property-type="Binary" min-value-size="20M"/>
    </filters>
  <value-storage>
  <value-storage id="Storage #2" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
    <properties>
      <property name="path" value="data/values"/>
    </properties>
    <filters>
      <filter property-type="Binary" min-value-size="1M"/>
    </filters>
  <value-storage>
<value-storages>

1.19.2. Simple File Value Storage

Note

But if you're sure in your file-system or data amount is small it may be useful for you as haves a faster speed of Value removal.

Hold Values in flat FileSystem files. path property points to root directory in order to store files

<value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.SimpleFileValueStorage">
     <properties>
       <property name="path" value="data/values"/>
     </properties>
     <filters>
       <filter property-type="Binary" min-value-size="1M"/>
     </filters>

1.19.3. Content Addressable Value storage (CAS) support

eXo JCR supports Content-addressable storage feature for Values storing.

Note

Content-addressable storage, also referred to as associative storage and abbreviated CAS, is a mechanism for storing information that can be retrieved based on its content, not its storage location. It is typically used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations.

Content Addressable Value storage stores unique content once. Different properties (values) with same content will be stored as one data file shared between those values. We can tell the Value content will be shared across some Values in storage and will be stored on one physical file.

Storage size will be decreased for application which governs potentially same data in the content.

Note

For example: if you have 100 different properties containing the same data (e.g. mail attachment), the storage stores only one single file. The file will be shared with all referencing properties.

If property Value changes, it is stored in an additional file. Alternatively the file is shared with other values, pointing to the same content.

The storage calculates Value content address each time the property was changed. CAS write operations are much more expensive compared to the non-CAS storages.

Content address calculation based on java.security.MessageDigest hash computation and tested with MD5 and SHA1 algorithms.

Note

CAS storage works most efficiently on data that does not change often. For data that changes frequently, CAS is not as efficient as location-based addressing.

CAS support can be enabled for Tree and Simple File Value Storage types.

To enable CAS support, just configure it in JCR Repositories configuration as we do for other Value Storages.

<workspaces>
        <workspace name="ws">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr"/>
              <property name="dialect" value="oracle"/>
              <property name="multi-db" value="false"/>
              <property name="max-buffer-size" value="200k"/>
              <property name="swap-directory" value="target/temp/swap/ws"/>
            </properties>
            <value-storages>
<!------------------- here ----------------------->
              <value-storage id="ws" class="org.exoplatform.services.jcr.impl.storage.value.fs.CASableTreeFileValueStorage">
                <properties>
                  <property name="path" value="target/temp/values/ws"/>
                  <property name="digest-algo" value="MD5"/>
                  <property name="vcas-type" value="org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl"/>
                  <property name="jdbc-source-name" value="jdbcjcr"/>
                  <property name="jdbc-dialect" value="oracle"/>
                </properties>
                <filters>
                  <filter property-type="Binary"/>
                </filters>
              </value-storage>
            </value-storages>

Properties:

digest-algo: Digest hash algorithm (MD5 and SHA1 were tested);

vcas-type: Value CAS internal data type, JDBC backed is currently implemented org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImp;l

jdbc-source-name: JDBCValueContentAddressStorageImpl specific parameter, database will be used to save CAS metadata. It's simple to use same as in workspace container;

jdbc-dialect: JDBCValueContentAddressStorageImpl specific parameter, database dialect. It's simple to use the same as in workspace container;

1.19.4. Disabling value storage

JCR allows to disable value storage by adding property into configuration. For interal usage and testing purpose only.

<property name="enabled" value="false" />

Be careful, all stored values will be unaccessible.

1.20. Workspace Data Container

Each Workspace of JCR has its own persistent storage to hold workspace's items data. eXo Content Repository can be configured so that it can use one or more workspaces that are logical units of the repository content. Physical data storage mechanism is configured using mandatory element container. The type of container is described in the attribute class = fully qualified name of org.exoplatform.services.jcr.storage.WorkspaceDataContainer subclass like

<container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
  <properties>
    <property name="source-name" value="jdbcjcr1"/>
    <property name="dialect" value="hsqldb"/>
    <property name="multi-db" value="true"/>
    <property name="max-buffer-size" value="200K"/>
    <property name="swap-directory" value="target/temp/swap/ws"/>
    <property name="lazy-node-iterator-page-size" value="50"/>
    <property name="acl-bloomfilter-false-positive-probability" value="0.1d"/>
    <property name="acl-bloomfilter-elements-number" value="1000000"/>
    <property name="check-sns-new-connection" value="false"/>
    <property name="batch-size" value="1000"/>
  </properties>

Workspace Data Container specific parameters:

max-buffer-size: A threshold in bytes, if a value size is greater, then it will be spooled to a temporary file. Default value is 200k.
swap-directory: A location where the value will be spooled if no value storage is configured but a max-buffer-size is exceeded. Default value is the value of "java.io.tmpdir" system property.
lazy-node-iterator-page-size: "Lazy" child nodes iterator settings. Defines size of page, the number of nodes that are retrieved from persistent storage at once. Default value is 100.
acl-bloomfilter-false-positive-probability: ACL Bloom-filter settings. ACL Bloom-filter desired false positive probability. Range [0..1]. Default value 0.1d.
acl-bloomfilter-elements-number: ACL Bloom-filter settings. Expected number of ACL-elements in the Bloom-filter. Default value 1000000.
check-sns-new-connection: Defines if we need to create new connection for checking if an older same-name sibling exists. Default value is "false".
trigger-events-for-descendants-on-rename: Indicates whether or not each descendant item must be included into the changes log in case of a rename. If it is set to false, it will allow to increase performance on rename operations if there is a big amount of nodes under the source parent node but it will decrease the performance with a small amount of sub nodes. If it is set to true, we will get the exact opposite, the performance will be better in case of small amount of sub nodes and worse in case of big amount of sub nodes. When this parameter is not set, the application will rely on the parameter max-descendant-nodes-allowed-on-move to add or not the descendant items to the changes log. If this parameter is not set but the parameter trigger-events-for-descendants-on-move is set, it will have the same value.
trigger-events-for-descendants-on-move: Indicates whether or not each descendant item must be included into the changes log in case of a move. If it is set to false, it will allow to increase performance on move operations if there is a big amount of nodes under the source parent node but it will decrease the performance with a small amount of sub nodes. If it is set to true, we will get the exact opposite, the performance will be better in case of small amount of sub nodes and worse in case of big amount of sub nodes. When this parameter is not set, the application will rely on the parameter max-descendant-nodes-allowed-on-move to add or not the descendant items to the changes log.
max-descendant-nodes-allowed-on-move: The maximum amount of descendant nodes allowed before considering that the descendant items should not be included into the changes log. This allows to have the best possible performances whatever the total amount of sub nodes. The default value is 100. This parameter is only used if and only if trigger-events-for-descendants-on-move is not set and in case of a rename trigger-events-for-descendants-on-rename is not set.

Note

Bloom filters are not supported by all the cache implementations so far only the inplementation for infinispan supports it. They are used to avoid read nodes that definitely do not have ACL. acl-bloomfilter-false-positive-probability and acl-bloomfilter-elements-number used to configure such filters.More about Bloom filters you can read here.

eXo JCR has an RDB (JDBC) based, production ready Workspace Data Container.

JDBC Workspace Data Container specific parameters:

source-name: JDBC data source name, registered in JDNI by InitialContextInitializer. ( sourceName prior v.1.9). This property is mandatory.
dialect: Database dialect, one of "hsqldb", "h2", "mysql", "mysql-myisam", "mysql-utf8", "mysql-myisam-utf8", "pgsql", "pgsql-scs", "oracle", "oracle-oci", "mssql", "sybase", "derby", "db2" ,"db2-mys", "db2v8". The default value is "auto".
multi-db: Enable multi-database container with this parameter (if "true"). Otherwise (if "false") configured for single-database container. Please, be aware, that this property is currently deprecated. It is advised to use db-structure-type instead.
db-structure-type: Can be set to isolated, multi, single to set corresponding configuration for data container. This property is mandatory.
db-tablename-suffix: If db-structure-type is set to isolated, tables, used by repository service, have the following format:
- JCR_I${db-tablename-suffix} - for items
- JCR_V${db-tablename-suffix} - for values
- JCR_R${db-tablename-suffix} - for references
  db-tablename-suffix by default equals to workspace name, but can be set via configuration to any suitable.
batch-size: the batch size. Default value is -1 (disabled)

Workspace Data Container MAY support external storages for javax.jcr.Value (which can be the case for BLOB values for example) using the optional element value-storages. Data Container will try to read or write Value using underlying value storage plugin if the filter criteria (see below) match the current property.

<value-storages>
  <value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
    <properties>
      <property name="path" value="data/values"/>
    </properties>
    <filters>
     <filter property-type="Binary" min-value-size="1M"/><!-- Values large of 1Mbyte -->
    </filters>
.........
</value-storages>

Where value-storage is the subclass of org.exoplatform.services.jcr.storage.value.ValueStoragePlugin and properties are optional plugin specific parameters.

filters : Each file value storage can have the filter(s) for incoming values. If there are several filter criteria, they all have to match (AND-Condition).

A filter can match values by property type (property-type), property name (property-name), ancestor path (ancestor-path) and/or the size of values stored (min-value-size, e.g. 1M, 4.2G, 100 (bytes)).

In a code sample, we use a filter with property-type and min-value-size only. That means that the storage is only for binary values whose size is greater than 1Mbyte.

It's recommended to store properties with large values in a file value storage only.

1.20.1. Database's dialects

1.20.1.1. PostgreSQL/PostgrePlus database

PostgreSQL/PostgrePlus's dialect is set automatically. The dialect depends on the version of database. If you change default value of standard_conforming_strings parameter than you must configure one of the following dialects manually:

PgSQL - this dialect is used if standard_conforming_strings is set to off. This is default value for version before 9.1.
PgSQL-SCS - this dialect is used if standard_conforming_strings is set to on. This is default value for version after 9.1.

1.20.1.2. DB2 database

As well as PostgreSQL, DB2's dialect is set automatically depends on the version of database. If you change the default value of DB2_COMPATIBILITY_VECTOR parameter than you must configure one of the following dialects manually:

DB2V8 - this dialect is used if version of database is lower than 9

DB2 - this dialect is used if version of database not lower than 9 and DB2_COMPATIBILITY_VECTOR is se to 0
DB2-MYS - this dialect is used if version of database not lower than 9 and DB2_COMPATIBILITY_VECTOR is se to MYS. This is default value for version begining from 9.7.2.

1.20.1.3. MySQL database

mysql - this dialect is used if needed to create JCR tables with InnoDB engine (by default)
mysql-utf8 - this dialect is used if needed to create JCR tables with InnoDB engine with UTF-8 encoding support
mysql-myisam - this dialect is used if needed to create JCR tables with MyISAM engine
mysql-myisam-utf8 - this dialect is used if needed to create JCR tables with MyISAM engine with UTF-8 encoding support
mysql-ndb - this dialect is used if needed to create JCR tables with NDB engine (mysql cluster)
mysql-ndb-utf8 - this dialect is used if needed to create JCR tables with NDB engine (mysql cluster) with UTF-8 encoding support

Note

Since MySQL NDB engine does not support foreign keys, which may lead to improper item removal and as consequence to InvalidItemStateException. In this case you will need to use consistency checker tool.

1.21. REST Services on Groovy

Starting from version 1.9, JCR Service supports REST services creation on Groovy script.

The feature bases on RESTful framework and uses ResourceContainer concept.

1.21.1. Usage

Scripts should extend ResourceContainer and should be stored in JCR as a node of type exo:groovyResourceContainer.

Detailed REST services step-by-step implementation check there Create REST service step by step.

Component configuration enables Groovy services loader:

<component>
  <type>org.exoplatform.services.jcr.ext.script.groovy.GroovyScript2RestLoader</type>
  <init-params>
    <object-param>
      <name>observation.config</name>
      <object type="org.exoplatform.services.jcr.ext.script.groovy.GroovyScript2RestLoader$ObservationListenerConfiguration">
        <field name="repository">
          <string>repository</string>
        </field>
        <field name="workspaces">
          <collection type="java.util.ArrayList">
            <value>
              <string>collaboration</string>
            </value>
          </collection>
        </field>
      </object>
    </object-param>
  </init-params>
</component>

1.22. Configuring JBoss AS with eXo JCR in cluster

1.22.1. Launching Cluster

1.22.1.1. Deploying eXo JCR to JBoss As

To deploy eXo JCR to JBoss, do the following steps:

Download the latest version of eXo JCR .ear file distribution.
Copy <jcr.ear> into <%jboss_home%/server/default/deploy>
Put exo-configuration.xml to the root <%jboss_home%/exo-configuration.xml>

Configure JAAS by inserting XML fragment shown below into <%jboss_home%/server/default/conf/login-config.xml>

<application-policy name="exo-domain">
   <authentication>
      <login-module code="org.exoplatform.services.security.j2ee.JbossLoginModule" flag="required"></login-module>
   </authentication>
</application-policy>

Ensure that you use JBossTS Transaction Service and JBossCache Transaction Manager. Your exo-configuration.xml must contain such parts:

<component>
   <key>org.jboss.cache.transaction.TransactionManagerLookup</key>
   <type>org.jboss.cache.GenericTransactionManagerLookup</type>^
</component>

<component>
   <key>org.exoplatform.services.transaction.TransactionService</key>
   <type>org.exoplatform.services.transaction.jbosscache.JBossTransactionsService</type>
   <init-params>
      <value-param>
         <name>timeout</name>
         <value>300</value>
      </value-param>
   </init-params>
</component>

Start server:
- bin/run.sh for Unix
- bin/run.bat for Windows
Try accessing http://localhostu:8080/browser with root/exo as login/password if you have done everything right, you'll get access to repository browser.

1.22.1.2. Configuring JCR to use external configuration

To manually configure repository, create a new configuration file (e.g., exo-jcr-configuration.xml). For details, see JCR Configuration. Your configuration must look like:

<repository-service default-repository="repository1">
   <repositories>
      <repository name="repository1" system-workspace="ws1" default-workspace="ws1">
         <security-domain>exo-domain</security-domain>
         <access-control>optional</access-control>
         <authentication-policy>org.exoplatform.services.jcr.impl.core.access.JAASAuthenticator</authentication-policy>
         <workspaces>
            <workspace name="ws1">
               <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
                  <properties>
                     <property name="source-name" value="jdbcjcr" />
                     <property name="dialect" value="oracle" />
                     <property name="multi-db" value="false" />
                     <property name="update-storage" value="false" />
                     <property name="max-buffer-size" value="200k" />
                     <property name="swap-directory" value="../temp/swap/production" />
                  </properties>
                  <value-storages>
                     see "Value storage configuration" part.
                  </value-storages>
               </container>
               <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
                  <properties>
                     <property name="root-nodetype" value="nt:unstructured" />
                  </properties>
               </initializer>
               <cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
                     see  "Cache configuration" part.
               </cache>
               <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
                  see  "Indexer configuration" part.
               </query-handler>
               <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
                  see  "Lock Manager configuration" part.
               </lock-manager>
            </workspace>
            <workspace name="ws2">
                        ...
            </workspace>
            <workspace name="wsN">
                        ...
            </workspace>
         </workspaces>
      </repository>
   </repositories>
</repository-service>

Then, update RepositoryServiceConfiguration configuration in exo-configuration.xml to use this file:

<component>
   <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
   <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
   <init-params>
      <value-param>
         <name>conf-path</name>
         <description>JCR configuration file</description>
         <value>exo-jcr-configuration.xml</value>
      </value-param>
   </init-params>
</component>

1.22.2. Requirements

1.22.2.1. Environment requirements

Every node of cluster MUST have the same mounted Network File System with the read and write permissions on it.
"/mnt/tornado" - path to the mounted Network File System (all cluster nodes must use the same NFS).
Every node of cluster MUST use the same database.
The same Clusters on different nodes MUST have the same names (e.g., if Indexer cluster in workspace production on the first node has the name "production_indexer_cluster", then indexer clusters in workspace production on all other nodes MUST have the same name "production_indexer_cluster" ).

1.22.2.2. Configuration requirements

Configuration of every workspace in repository must contains of such parts:

Value Storage configuration:

<value-storages>
   <value-storage id="system" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
      <properties>
         <property name="path" value="/mnt/tornado/temp/values/production" />    <!--path within NFS where ValueStorage will hold it's data-->
      </properties>
      <filters>
         <filter property-type="Binary" />
      </filters>
   </value-storage>
</value-storages>

Cache configuration:

<cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
   <properties>
      <property name="jbosscache-configuration" value="jar:/conf/portal/test-jbosscache-data.xml" />     <!--    path to JBoss Cache configuration for data storage -->
      <property name="jgroups-configuration" value="jar:/conf/portal/udp-mux.xml" />                     <!--    path to JGroups configuration -->
      <property name="jbosscache-cluster-name" value="JCR_Cluster_cache" />                              <!--    JBoss Cache data storage cluster name -->
      <property name="jgroups-multiplexer-stack" value="false" />
      <property name="jbosscache-shareable" value="true" />
   </properties>
</cache>

Indexer configuration:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      <property name="changesfilter-class" value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
      <property name="index-dir" value="/mnt/tornado/temp/jcrlucenedb/production" />                       <!--    path within NFS where ValueStorage will hold it's data -->
      <property name="jbosscache-configuration" value="jar:/conf/portal/test-jbosscache-indexer.xml" />    <!--    path to JBoss Cache configuration for indexer -->
      <property name="jgroups-configuration" value="jar:/conf/portal/udp-mux.xml" />                       <!--    path to JGroups configuration -->
      <property name="jbosscache-cluster-name" value="JCR_Cluster_indexer" />                              <!--    JBoss Cache indexer cluster name -->
      <property name="jgroups-multiplexer-stack" value="false" />
      <property name="jbosscache-shareable" value="true" />
   </properties>
</query-handler>

Lock Manager configuration:

<lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
   <properties>
      <property name="time-out" value="15m" />
      <property name="jbosscache-configuration" value="jar:/conf/portal/test-jbosscache-lock.xml" />       <!--    path to JBoss Cache configuration for lock manager -->
      <property name="jgroups-configuration" value="jar:/conf/portal/udp-mux.xml" />                       <!--    path to JGroups configuration -->
      <property name="jbosscache-cluster-name" value="JCR_Cluster_locks" />                                <!--    JBoss Cache locks cluster name -->                    
      <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks"/>                              <!--    the name of the DB table where lock's data will be stored -->
      <property name="jbosscache-cl-cache.jdbc.table.create" value="true"/>
      <property name="jbosscache-cl-cache.jdbc.table.drop" value="false"/>
      <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_pk"/>
      <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn"/>
      <property name="jbosscache-cl-cache.jdbc.node.column" value="node"/>
      <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent"/>
      <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr"/>
      <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
      <property name="jgroups-multiplexer-stack" value="false" />
      <property name="jbosscache-shareable" value="true" />
   </properties>
</lock-manager>

1.23. JBoss Cache configuration

This section will show you how to use and configure Jboss Cache in the clustered environment. Also, you will know how to use a template-based configuration offered by eXo JCR for JBoss Cache instances.

1.23.1. JBoss cache configuration for indexer, lock manager and data container

Each mentioned components uses instances of JBoss Cache product for caching in clustered environment. So every element has its own transport and has to be configured in a proper way. As usual, workspaces have similar configuration but with different cluster-names and may-be some other parameters. The simplest way to configure them is to define their own configuration files for each component in each workspace:

<property name="jbosscache-configuration" value="conf/standalone/test-jbosscache-lock-db1-ws1.xml" />

But if there are few workspaces, configuring them in such a way can be painful and hard-manageable. eXo JCR offers a template-based configuration for JBoss Cache instances. You can have one template for Lock Manager, one for Indexer and one for data container and use them in all the workspaces, defining the map of substitution parameters in a main configuration file. Just simply define ${jbosscache-<parameter name>} inside xml-template and list correct value in JCR configuration file just below "jbosscache-configuration", as shown:

Template:

...
<clustering mode="replication" clusterName="${jbosscache-cluster-name}">
  <stateRetrieval timeout="20000" fetchInMemoryState="false" />
...

and JCR configuration file:

...
<property name="jbosscache-configuration" value="jar:/conf/portal/jbosscache-lock.xml" />
<property name="jbosscache-cluster-name" value="JCR-cluster-locks-db1-ws" />
...

1.23.2. JGroups configuration

JGroups is used by JBoss Cache for network communications and transport in a clustered environment. If property "jgroups-configuration" is defined in component configuration, it will be injected into the JBoss Cache instance on startup.

<property name="jgroups-configuration" value="your/path/to/modified-udp.xml" />

As mentioned above, each component (lock manager, data container and query handler) for each workspace requires its own clustered environment. In other words, they have their own clusters with unique names. By default, each cluster should perform multi-casts on a separate port. This configuration leads to much unnecessary overhead on cluster. That's why JGroups offers multiplexer feature, providing ability to use one single channel for set of clusters. This feature reduces network overheads and increase performance and stability of application. To enable multiplexer stack, you should define appropriate configuration file (upd-mux.xml is pre-shipped one with eXo JCR) and set "jgroups-multiplexer-stack" into "true".

<property name="jgroups-configuration" value="jar:/conf/portal/udp-mux.xml" />
<property name="jgroups-multiplexer-stack" value="true" />

It is now highly recommended to use the shared transport instead of the multiplexer, to do so simply disable the multiplexer stack in the configuration of each component then set the property singleton_name of your JGroups configuration to a unique name.

<property name="jgroups-configuration" value="jar:/conf/portal/udp-mux.xml" />
<property name="jgroups-multiplexer-stack" value="false" />

1.23.3. Allow to share JBoss Cache instances

A JBoss Cache instance is quite resource consuming and by default we will have 3 JBoss Cache instances (one instance for the indexer, one for the lock manager and one for the data container) for each workspace, so if you intend to have a lot of workspaces it could make sense to decide to share one JBoss Cache instance with several cache instances of the same type (i.e. indexer, lock manager or data container). This feature is disabled by default and can be enabled at component configuration level (i.e. indexer configuration, lock manager configuration and/or data container configuration) by setting the property "jbosscache-shareable" to true as below:

<property name="jbosscache-shareable" value="true" />

Once enabled this feature will allow the JBoss Cache instance used by the component to be re-used by another components of the same type (i.e. indexer, lock manager or data container) with the exact same JBoss Cache configuration (except the eviction configuration that cans be different), which means that all the parameters of type ${jbosscache-<parameter name>} must be identical between the components of same type of different workspaces. In other words, if we use the same values for the parameters of type ${jbosscache-<parameter name>} in each workspace, we will have only 3 JBoss Cache instances (one instance for the indexer, one for the lock manager and one for the data container) used whatever the total amount of workspaces defined.

1.23.4. Configure the maximum invalidations

In clustered environment, jbosscache-max-invalidations parameter is used by JBossCacheWorkspaceStorageCache to indicate the maximum total amount of invalidation operations that can be done outside the current transaction in auto commit mode to prevent potential deadlocks. This parameter is dedicated to cluster mode only and is needed to ensure that even with big transactions that do a lot of invalidations, we won't have too many JGroups calls which would dramatically affect the scalability and the performance of the whole platform. The value of this parameter must be big enough to prevent deadlocks on operations that add and/or remove concurrently JCR nodes to/from the same parent node but also it should not be too big to avoid affecting too much the scalability and the performance of the whole platform. Default value is 10.

This parameter can be configured at cache configuration as below:

<property name="jbosscache-max-invalidations" value="20" />

1.23.5. Shipped JBoss Cache configuration templates

eXo JCR implementation is shipped with ready-to-use JBoss Cache configuration templates for JCR's components. They are situated in application package in /conf/porta/ folder.

1.23.5.1. Data container template

Data container template is "jbosscache-data.xml":

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="500" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />

   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>

   <!-- Eviction configuration -->
   <eviction wakeUpInterval="5000">
      <default algorithmClass="org.jboss.cache.eviction.ExpirationAlgorithm"
         actionPolicyClass="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.ParentNodeEvictionActionPolicy"
         eventQueueSize="1000000">
         <property name="maxNodes" value="1000000" />
         <property name="warnNoExpirationKey" value="false" />
      </default>
   </eviction>
</jbosscache>

Table 1.5. Template variables

Variable	Description
jbosscache-cluster-name	cluster name (must be unique)

1.23.5.2. Lock manager template

It's template name is "jbosscache-lock.xml"

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="500" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />
   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>
   <loaders passivation="false" shared="true">
      <preload>
         <node fqn="/" />
      </preload>
      <loader class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader" async="false" fetchPersistentState="false"
         ignoreModifications="false" purgeOnStartup="false">
         <properties>
            cache.jdbc.table.name=${jbosscache-cl-cache.jdbc.table.name}
            cache.jdbc.table.create=${jbosscache-cl-cache.jdbc.table.create}
            cache.jdbc.table.drop=${jbosscache-cl-cache.jdbc.table.drop}
            cache.jdbc.table.primarykey=${jbosscache-cl-cache.jdbc.table.primarykey}
            cache.jdbc.fqn.column=${jbosscache-cl-cache.jdbc.fqn.column}
            cache.jdbc.fqn.type=${jbosscache-cl-cache.jdbc.fqn.type}
            cache.jdbc.node.column=${jbosscache-cl-cache.jdbc.node.column}
            cache.jdbc.node.type=${jbosscache-cl-cache.jdbc.node.type}
            cache.jdbc.parent.column=${jbosscache-cl-cache.jdbc.parent.column}
            cache.jdbc.datasource=${jbosscache-cl-cache.jdbc.datasource}
         </properties>
      </loader>
   </loaders>
</jbosscache>

Note

To prevent any consistency issue regarding the lock data please ensure that your cache loader is org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader and that your database engine is transactional.

Table 1.6. Template variables

Variable	Description
jbosscache-cluster-name	cluster name (must be unique)
jbosscache-cl-cache.jdbc.table.name	the name of the table.
jbosscache-cl-cache.jdbc.table.create	can be true or false. Indicates whether to create the able during startup. If true, the table is created if it doesn't already exist. The default value is true.
jbosscache-cl-cache.jdbc.table.drop	can be true or false. Indicates whether to drop the table during shutdown. The default value is true.
jbosscache-cl-cache.jdbc.table.primarykey	the name of the primary key for the table.
jbosscache-cl-cache.jdbc.fqn.column	FQN column name. The default value is 'fqn'.
jbosscache-cl-cache.jdbc.fqn.type	FQN column type. The default value is 'varchar(255)'.
jbosscache-cl-cache.jdbc.node.column	node contents column name. The default value is 'node'.
jbosscache-cl-cache.jdbc.node.type	node contents column type. The default value is 'blob'. This type must specify a valid binary data type for the database being used.
jbosscache-cl-cache.jdbc.parent.column	Parent column name. The default value is 'parent'.
jbosscache-cl-cache.jdbc.datasource	JNDI name of the DataSource.

1.23.5.3. Query handler (indexer) template

Have a look at "jbosscache-indexer.xml"

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">
   <locking useLockStriping="false" concurrencyLevel="500" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />
   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>
</jbosscache>

Table 1.7. Template variables

Variable	Description
jbosscache-cluster-name	cluster name (must be unique)

1.24. LockManager configuration

What LockManager does?

In general, LockManager stores Lock objects, so it can give a Lock object or can release it.

Also, LockManager is responsible for removing Locks that live too long. This parameter may be configured with "time-out" property.

JCR provides one basic implementations of LockManager:

org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl

1.24.1. CacheableLockManagerImpl

CacheableLockManagerImpl stores Lock objects in JBoss-cache, so Locks are replicable and affect on cluster, not only a single node. Also, JBoss-cache has JDBCCacheLoader, so Locks will be stored to the database.

You can enable LockManager by adding lock-manager-configuration to workspace-configuration.

Wher time-out parameter represents interval to remove Expired Locks. LockRemover separates threads, that periodically ask LockManager to remove Locks that live so long.

<workspace name="ws">
   ...
   <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
      <properties>
         <property name="time-out" value="15m" />
         ...
      </properties>
   </lock-manager>               
   ...
</workspace>

1.24.1.1. Configuration

The configuration uses the template JBoss-cache configuration for all LockManagers.

Lock template configuration

test-jbosscache-lock.xml

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="500" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />

   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>

   <loaders passivation="false" shared="true">
      <!-- All the data of the JCR locks needs to be loaded at startup -->
      <preload>
         <node fqn="/" />
      </preload>  
      <!--
      For another cache-loader class you should use another template with
      cache-loader specific parameters
      ->
      <loader class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader" async="false" fetchPersistentState="false"
         ignoreModifications="false" purgeOnStartup="false">
         <properties>
            cache.jdbc.table.name=${jbosscache-cl-cache.jdbc.table.name}
            cache.jdbc.table.create=${jbosscache-cl-cache.jdbc.table.create}
            cache.jdbc.table.drop=${jbosscache-cl-cache.jdbc.table.drop}
            cache.jdbc.table.primarykey=${jbosscache-cl-cache.jdbc.table.primarykey}
            cache.jdbc.fqn.column=${jbosscache-cl-cache.jdbc.fqn.column}
            cache.jdbc.fqn.type=${jbosscache-cl-cache.jdbc.fqn.type}
            cache.jdbc.node.column=${jbosscache-cl-cache.jdbc.node.column}
            cache.jdbc.node.type=${jbosscache-cl-cache.jdbc.node.type}
            cache.jdbc.parent.column=${jbosscache-cl-cache.jdbc.parent.column}
            cache.jdbc.datasource=${jbosscache-cl-cache.jdbc.datasource}
         </properties>
      </loader>
   </loaders>
</jbosscache>

Note

To prevent any consistency issue regarding the lock data + please ensure that your cache loader is org.exoplatform.services.jcr.impl.core.lock.jbosscache.JDBCCacheLoader and that your database engine is transactional.

As you see, all configurable parameters are filled by templates and will be replaced by LockManagers configuration parameters:

<lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
   <properties>
      <property name="time-out" value="15m" />
      <property name="jbosscache-configuration" value="test-jbosscache-lock.xml" />
      <property name="jgroups-configuration" value="udp-mux.xml" />
      <property name="jgroups-multiplexer-stack" value="true" />
      <property name="jbosscache-cluster-name" value="JCR-cluster-locks-ws" />
      <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks_ws" />
      <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
      <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
      <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_ws_pk" />
      <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
      <property name="jbosscache-cl-cache.jdbc.fqn.type" value="AUTO"/>
      <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
      <property name="jbosscache-cl-cache.jdbc.node.type" value="AUTO"/>
      <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
      <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
      <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
      <property name="jbosscache-shareable" value="true" />
   </properties>
</lock-manager>

Configuration requirements:

jbosscache-cl-cache.jdbc.fqn.column and jbosscache-cl-cache.jdbc.node.type is the same as cache.jdbc.fqn.type and cache.jdbc.node.type in JBoss-Cache configuration. You can set those data types according to your database type or set it as AUTO (or do not set at all) and data type will be detected automatically.
As you see, jgroups-configuration is moved to separate the configuration file - udp-mux.xml. In this case, the udp-mux.xml file is a common JGroup configuration for all components (QueryHandler, Cache, LockManager), but we can still create our own configuration.

our udp-mux.xml

<config>
    <UDP
         singleton_name="JCR-cluster" 
         mcast_addr="${jgroups.udp.mcast_addr:228.10.10.10}"
         mcast_port="${jgroups.udp.mcast_port:45588}"
         tos="8" 
         ucast_recv_buf_size="20000000"
         ucast_send_buf_size="640000" 
         mcast_recv_buf_size="25000000" 
         mcast_send_buf_size="640000" 
         loopback="false"
         discard_incompatible_packets="true" 
         max_bundle_size="64000" 
         max_bundle_timeout="30"
         use_incoming_packet_handler="true" 
         ip_ttl="${jgroups.udp.ip_ttl:2}"
         enable_bundling="false" 
         enable_diagnostics="true"
         thread_naming_pattern="cl" 

         use_concurrent_stack="true" 

         thread_pool.enabled="true" 
         thread_pool.min_threads="2"
         thread_pool.max_threads="8" 
         thread_pool.keep_alive_time="5000" 
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="1000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false" 
         oob_thread_pool.queue_max_size="100" 
         oob_thread_pool.rejection_policy="Run" />

    <PING timeout="2000"
            num_initial_members="3"/>
    <MERGE2 max_interval="30000"
            min_interval="10000"/>
   <FD_SOCK />
   <FD timeout="10000" max_tries="5" shun="true" />
   <VERIFY_SUSPECT timeout="1500" />
   <BARRIER />
    <pbcast.NAKACK use_stats_for_retransmission="false"
                   exponential_backoff="150"
                   use_mcast_xmit="true" gc_lag="0"
                   retransmit_timeout="50,300,600,1200"
                   discard_delivered_msgs="true"/>
   <UNICAST timeout="300,600,1200" />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="1000000"/>
   <VIEW_SYNC avg_send_interval="60000" />
    <pbcast.GMS print_local_addr="true" join_timeout="3000"
                shun="false"
                view_bundling="true"/>
    <FC max_credits="500000"
                    min_threshold="0.20"/>
   <FRAG2 frag_size="60000" />
   <!--pbcast.STREAMING_STATE_TRANSFER /-->
   <pbcast.STATE_TRANSFER />
   <pbcast.FLUSH />
</config>

1.24.1.2. Data Types in Different Databases

Table 1.8. FQN type and node type in different databases

DataBase name	Node data type	FQN data type
default	BLOB	VARCHAR(512)
HSSQL	OBJECT	VARCHAR(512)
MySQL	LONGBLOB	VARCHAR(512)
ORACLE	BLOB	VARCHAR2(512)
PostgreSQL/PostgrePlus	bytea	VARCHAR(512)
MSSQL	VARBINARY(MAX)	VARCHAR(512)
DB2	BLOB	VARCHAR(512)
Sybase	IMAGE	VARCHAR(512)
Ingres	long byte	VARCHAR(512)

1.24.1.3. Lock migration from 1.12.x

There are 3 choices:

I. When new Shareable Cache feature is not going to be used and all locks should be kept after migration.

Ensure that the same lock tables used in configuration;
Start the server;

II. When new Shareable Cache feature is not going to be used and all locks should be removed after migration.

Ensure that the same lock tables used in configuration;
Start the sever WITH system property -Dorg.exoplatform.jcr.locks.force.remove=true;
Stop the server;
Start the server (WITHOUT system property -Dorg.exoplatform.jcr.locks.force.remove);

III. When new Shareable Cache feature will be used (in this case all locks are removed after migration).

Start the sever WITH system property -Dorg.exoplatform.jcr.locks.force.remove=true;
Stop the server;
Start the server (WITHOUT system property -Dorg.exoplatform.jcr.locks.force.remove);
(Not mandatory) manually remove old tables for lock;

1.25. QueryHandler configuration

This section shows you how to configure QueryHandler: Indexing in clustered environment.

1.25.1. Indexing in clustered environment

JCR offers multiple indexing strategies. They include both for standalone and clustered environments using the advantages of running in a single JVM or doing the best to use all resources available in cluster. JCR uses Lucene library as underlying search and indexing engine, but it has several limitations that greatly reduce possibilities and limits the usage of cluster advantages. That's why eXo JCR offers three strategies that are suitable for it's own usecases. They are standalone, clustered with shared index, clustered with local indexes and RSync-based. Each one has it's pros and cons.

Stanadlone strategy provides a stack of indexes to achieve greater performance within single JVM.

It combines in-memory buffer index directory with delayed file-system flushing. This index is called "Volatile" and it is invoked in searches also. Within some conditions volatile index is flushed to the persistent storage (file system) as new index directory. This allows to achieve great results for write operations.

Clustered implementation with local indexes is built upon same strategy with volatile in-memory index buffer along with delayed flushing on persistent storage.

As this implementation designed for clustered environment it has additional mechanisms for data delivery within cluster. Actual text extraction jobs done on the same node that does content operations (i.e. write operation). Prepared "documents" (Lucene term that means block of data ready for indexing) are replicated withing cluster nodes and processed by local indexes. So each cluster instance has the same index content. When new node joins the cluster it has no initial index, so it must be created. There are some supported ways of doing this operation. The simplest is to simply copy the index manually but this is not intended for use. If no initial index found JCR uses automated sceneries. They are controlled via configuration (see "index-recovery-mode" parameter) offering full re-indexing from database or copying from another cluster node.

For some reasons having a multiple index copies on each instance can be costly. So shared index can be used instead (see diagram below).

This indexing strategy combines advantages of in-memory index along with shared persistent index offering "near" real time search capabilities. This means that newly added content is accessible via search practically immediately. This strategy allows nodes to index data in their own volatile (in-memory) indexes, but persistent indexes are managed by single "coordinator" node only. Each cluster instance has a read access for shared index to perform queries combining search results found in own in-memory index also. Take in account that shared folder must be configured in your system environment (i.e. mounted NFS folder). But this strategy in some extremely rare cases can have a bit different volatile indexes within cluster instances for a while. In a few seconds they will be up2date.

Shared index is consistent and stable enough, but slow, while local index is fast, but requires much time for re-synchronization, when cluster node is leaving a cluster for a small period of time. RSync-based index solves this problem along with local file system advantages in term of speed.

This strategy is the same shared index, but stores actual data on local file system, instead of shared. Eventually triggering a synchronization job, that woks on the level of file blocks, synchronizing only modified data. Diagram shows it in action. Only single node in the cluster is responsible for modifying index files, this is the Coordinator node. When data persisted, corresponding command fired, starting synchronization jobs all over the cluster.

See more about Search Configuration.

1.25.2. Configuration

1.25.2.1. Query-handler configuration overview

Configuration example:

<workspace name="ws">
   <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
      <properties>
         <property name="index-dir" value="shareddir/index/db1/ws" />
         <property name="changesfilter-class"
            value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
         <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
         <property name="jgroups-configuration" value="udp-mux.xml" />
         <property name="jgroups-multiplexer-stack" value="true" />
         <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
         <property name="max-volatile-time" value="60" />
         <property name="rdbms-reindexing" value="true" />
         <property name="reindexing-page-size" value="1000" />
         <property name="index-recovery-mode" value="from-coordinator" />
         <property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter" />
         <property name="indexing-thread-pool-size" value="16" />
      </properties>
   </query-handler>
</workspace>

Table 1.9. Config properties description

Property name	Description
index-dir	path to index
changesfilter-class	The FQN of the class to use to indicate the policy to use to manage the lucene indexes changes. This class must extend org.exoplatform.services.jcr.impl.core.query.IndexerChangesFilter. This must be set in cluster environment to define the clustering strategy to adopt. To use the Shared Indexes Strategy, you can set it to org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter. I you prefer the Local Indexes Strategy, you can set it to org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter.
jbosscache-configuration	template of JBoss-cache configuration for all query-handlers in repository (search, cache, locks)
jgroups-configuration	This is the path to JGroups configuration that should not be anymore jgroups' stack definitions but a normal jgroups configuration format with the shared transport configured by simply setting the jgroups property singleton_name to a unique name (it must remain unique from one portal container to another). This file is also pre-bundled with templates and is recommended for use.
jgroups-multiplexer-stack	if set to true, it will indicate that the file corresponding to the parameter jgroups-configuration is a actually a file defining a set of jgroups multiplexer stacks. In the XML tag jgroupsConfig within the jboss cache configuration, you will then be able to set the name of the multiplexer stack to use thanks to the attribute multiplexerStack. Please note that the jgroups multiplexer has been deprecated by the jgroups Team and has been replaced by the shared transport so it is highly recommended to not use it anymore.
jbosscache-cluster-name	cluster name (must be unique)
max-volatile-time	max time to live for Volatile Index
rdbms-reindexing	Indicates whether the rdbms re-indexing mechanism must be used, the default value is true.
reindexing-page-size	maximum amount of nodes which can be retrieved from storage for re-indexing purpose, the default value is 100
index-recovery-mode	If the parameter has been set to `from-indexing`, so a full indexing will be automatically launched, if the parameter has been set to `from-coordinator` (default behavior), the index will be retrieved from coordinator
index-recovery-filter	Defines implementation class or classes of RecoveryFilters, the mechanism of index synchronization for Local Index strategy.
async-reindexing	Controls the process of re-indexing on JCR's startup. If flag set, indexing will be launched asynchronously, without blocking the JCR. Default is "false".
indexing-thread-pool-size	Defines the total amount of indexing threads.
max-volatile-size	The maximum volatile index size in bytes until it is written to disk. The default value is 1048576 (1MB).

Note

If you use postgreSQL and the parameter rdbms-reindexing is set to true, the performances of the queries used while indexing can be improved by setting the parameter "enable_seqscan" to "off" or "default_statistics_target" to at least "50" in the configuration of your database. Then you need to restart DB server and make analyze of the JCR_SVALUE (or JCR_MVALUE) table.

Note

If you use DB2 and the parameter rdbms-reindexing is set to true, the performance of the queiries used while indexing can be improved by making statisticks on tables by running "RUNSTATS ON TABLE <scheme>.<table> WITH DISTRIBUTION AND INDEXES ALL" for JCR_SITEM (or JCR_MITEM) and JCR_SVALUE (or JCR_MVALUE) tables.

1.25.2.2. Standalone strategy

When running JCR in standalone usually standalone indexing is used also. Such parameters as "changesfilter-class", "jgroups-configuration" and all the "jbosscache-*" must be skipped and not defined. Like the configuration below.

<workspace name="ws">
   <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
      <properties>
         <property name="index-dir" value="shareddir/index/db1/ws" />
         <property name="max-volatile-time" value="60" />
         <property name="rdbms-reindexing" value="true" />
         <property name="reindexing-page-size" value="1000" />
         <property name="index-recovery-mode" value="from-coordinator" />
      </properties>
   </query-handler>
</workspace>

1.25.2.3. Cluster-ready indexing strategies

1.25.2.3.1. Shared Index

For both cluster-ready implementations JBoss Cache, JGroups and Changes Filter values must be defined. Shared index requires some kind of remote or shared file system to be attached in a system (i.e. NFS, SMB or etc). Indexing directory ("indexDir" value) must point to it. Setting "changesfilter-class" to "org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" will enable shared index implementation.

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      <property name="index-dir" value="/mnt/nfs_drive/index/db1/ws" />
      <property name="changesfilter-class"
         value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
      <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
      <property name="jgroups-configuration" value="udp-mux.xml" />
      <property name="jgroups-multiplexer-stack" value="true" />
      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
      <property name="max-volatile-time" value="60" />
      <property name="rdbms-reindexing" value="true" />
      <property name="reindexing-page-size" value="1000" />
      <property name="index-recovery-mode" value="from-coordinator" />
   </properties>
</query-handler>

1.25.2.3.2. RSync Index

1.25.2.3.2.1. System requirements

Mandatory requirement for Rsync-based indexing strategy is an installed and properly configured RSync utility. It must be accessible by calling "rsync" without defining it's full path, in addition each cluster node should have a running RSync Server supporting "rsync://" protocol. For more details, please refer to RSync and operation system documentations. Sample RSync Server configuration will be shown below. There are some additional limitations also. Path for index for each workspace must be the same across the cluster, i.e. "/var/data/index/<repository-name>/<workspace-name>". Next limitation is RSync Server configuration. It must share some of index's parent folders. For example, "/var/data/index". In other words, index is stored inside of RSync Server shared folder. Configuration details are give below.

1.25.2.3.2.2. Configuration

Configuration has much in common with Shared Index, it just requires some additional parameters for RSync options. If they are present, JCR switches from shared to RSync-based index. Here is an example configuration:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      <property name="index-dir" value="/var/data/index/repository1/production" />
      <property name="changesfilter-class"
         value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
      <property name="jbosscache-configuration" value="jar:/conf/portal/cluster/jbosscache-indexer.xml" />
      <property name="jgroups-configuration" value="jar:/conf/portal/cluster/udp-mux.xml" />
      <property name="jgroups-multiplexer-stack" value="false" />
      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer" />
      <property name="jbosscache-shareable" value="true" />
      <property name="max-volatile-time" value="60" /> 
      <property name="rsync-entry-name" value="index" />
      <property name="rsync-entry-path" value="/var/data/index" />
      <property name="rsync-port" value="8085" />
      <property name="rsync-user" value="rsyncexo" />
      <property name="rsync-password" value="exo" />
   </properties>
</query-handler>

Let's start with authentication: "rsync-user" and "rsync-password". They are optional and can be skipped if RSync Server configured to accept anonymous identity. Before reviewing other RSync index options need to have a look at RSync Server configuration. Sample RSync Server (rsyncd) Configuration

uid = nobody
gid = nobody
use chroot = no
port = 8085
log file = rsyncd.log
pid file = rsyncd.pid
[index]
        path = /var/data/index
        comment = indexes
        read only = true
        auth users = rsyncexo
        secrets file= rsyncd.secrets

This sample configuration shares folder "/var/data/index" as an entry "index". Those parameters should match corresponding properties in JCR configuration. Respectively "rsync-entry-name", "rsync-entry-path", "rsync-port" properties. Notice! Make sure "index-dir" is a descendant folder of RSync shared folder and those paths are the same on each cluster node.

1.25.2.3.3. Local Index

In order to use cluster-ready strategy based on local indexes, when each node has own copy of index on local file system, the following configuration must be applied. Indexing directory must point to any folder on local file system and "changesfilter-class" must be set to "org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter".

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      <property name="index-dir" value="/mnt/nfs_drive/index/db1/ws" />
      <property name="changesfilter-class"
         value="org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter" />
      <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
      <property name="jgroups-configuration" value="udp-mux.xml" />
      <property name="jgroups-multiplexer-stack" value="true" />
      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
      <property name="max-volatile-time" value="60" />
      <property name="rdbms-reindexing" value="true" />
      <property name="reindexing-page-size" value="1000" />
      <property name="index-recovery-mode" value="from-coordinator" />
   </properties>
</query-handler>

1.25.2.3.3.1. Local Index Recover Filters

Common usecase for all cluster-ready applications is a hot joining and leaving of processing units. Node that is joining cluster for the first time or node joining after some downtime, they all must be in a synchronized state. When having a deal with shared value storages, databases and indexes, cluster nodes are synchronized anytime. But it's an issue when local index strategy used. If new node joins cluster, having no index it is retrieved or recreated. Node can be restarted also and thus index not empty. By default existing index is thought to be actual, but can be outdated. JCR offers a mechanism called RecoveryFilters that will automatically retrieve index for the joining node on startup. This feature is a set of filters that can be defined via QueryHandler configuration:

<property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter" />

Filter number is not limited so they can be combined:

<property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter" />
<property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.SystemPropertyRecoveryFilter" />

If any one fires, the index is re-synchronized. Please take in account, that DocNumberRecoveryFilter is used in cases when no filter configured. So, if resynchronization should be blocked, or strictly required on start, then ConfigurationPropertyRecoveryFilter can be used.

This feature uses standard index recovery mode defined by previously described parameter (can be "from-indexing" or "from-coordinator" (default value))

<property name="index-recovery-mode" value="from-coordinator" />

There are couple implementations of filters:

org.exoplatform.services.jcr.impl.core.query.lucene.DummyRecoveryFilter: always returns true, for cases when index must be force resynchronized (recovered) each time;
org.exoplatform.services.jcr.impl.core.query.lucene.SystemPropertyRecoveryFilter : return value of system property "org.exoplatform.jcr.recoveryfilter.forcereindexing". So index recovery can be controlled from the top without changing documentation using system properties;
org.exoplatform.services.jcr.impl.core.query.lucene.ConfigurationPropertyRecoveryFilter : return value of QueryHandler configuration property "index-recovery-filter-forcereindexing". So index recovery can be controlled from configuration separately for each workspace. I.e:
```
<property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.ConfigurationPropertyRecoveryFilter" />
<property name="index-recovery-filter-forcereindexing" value="true" />
```
org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter : checks number of documents in index on coordinator side and self-side. Return true if differs. Advantage of this filter comparing to other, it will skip reindexing for workspaces where index wasn't modified. I.e. there is 10 repositories with 3 workspaces in each one. Only one is really heavily used in cluster : frontend/production. So using this filter will only reindex those workspaces that are really changed, without affecting other indexes thus greatly reducing startup time.

1.25.2.4. JBoss-Cache template configuration

JBoss-Cache template configuration for query handler is about the same for both clustered strategies.

jbosscache-indexer.xml

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">

   <locking useLockStriping="false" concurrencyLevel="50000" lockParentForChildInsertRemove="false"
      lockAcquisitionTimeout="20000" />
   <!-- Configure the TransactionManager -->
   <transaction transactionManagerLookupClass="org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup" />

   <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
      <stateRetrieval timeout="20000" fetchInMemoryState="false" />
      <sync />
   </clustering>
</jbosscache>

See more about template configurations here.

1.25.3. Asynchronous reindexing

Managing a big set of data using JCR in production environment sometimes requires special operations with Indexes, stored on File System. One of those maintenance operations is a recreation of it. Also called "re-indexing". There are various usecases when it's important to do. They include hardware faults, hard restarts, data-corruption, migrations and JCR updates that brings new features related to index. Usually index re-creation requested on server's startup or in runtime.

1.25.3.1. On startup indexing

Common usecase for updating and re-creating the index is to stop the server and manually remove indexes for workspaces requiring it. When server will be started, missing indexes are automatically recovered by re-indexing. JCR Supports direct RDBMS re-indexing, that usually is faster than ordinary and can be configured via QueryHandler parameter "rdbms-reindexing" set to "true" (for more information please refer to "Query-handler configuration overview"). New feature to introduce is asynchronous indexing on startup. Usually startup is blocked until process is finished. Block can take any period of time, depending on amount of data persisted in repositories. But this can be resolved by using an asynchronous approaches of startup indexation. Saying briefly, it performs all operations with index in background, without blocking the repository. This is controlled by the value of "async-reindexing" parameter in QueryHandler configuration. With asynchronous indexation active, JCR starts with no active indexes present. Queries on JCR still can be executed without exceptions, but no results will be returned until index creation completed. Checking index state is possible via QueryManagerImpl:

boolean online = ((QueryManagerImpl)Workspace.getQueryManager()).getQueryHandeler().isOnline();

"OFFLINE" state means that index is currently re-creating. When state changed, corresponding log event is printed. From the start of background task index is switched to "OFFLINE", with following log event :

[INFO] Setting index OFFLINE (repository/production[system]).

When process finished, two events are logged :

[INFO] Created initial index for 143018 nodes (repository/production[system]).
[INFO] Setting index ONLINE (repository/production[system]).

Those two log lines indicates the end of process for workspace given in brackets. Calling isOnline() as mentioned above, will also return true.

1.25.3.2. Hot Asynchronous Workspace Reindexing via JMX

Some hard system faults, error during upgrades, migration issues and some other factors may corrupt the index. Most likely end customers would like the production systems to fix index issues in run-time, without delays and restarts. Current versions of JCR supports "Hot Asynchronous Workspace Reindexing" feature. It allows end-user (Service Administrator) to launch the process in background without stopping or blocking whole application by using any JMX-compatible console (see screenshot below, "JConsole in action").

Server can continue working as expected while index is recreated. This depends on the flag "allow queries", passed via JMX interface to reindex operation invocation. If the flag set, then application continues working. But there is one critical limitation the end-users must be aware. Index is frozen while background task is running. It meant that queries are performed on index present on the moment of task startup and data written into repository after startup won't be available through the search until process finished. Data added during re-indexation is also indexed, but will be available only when task is done. Briefly, JCR makes the snapshot of indexes on asynch task startup and uses it for searches. When operation finished, stale indexes replaced by newly created including newly added data. If flag "allow queries" is set to false, then all queries will throw an exception while task is running. Current state can be acquired using the following JMX operation:

getHotReindexingState() - returns information about latest invocation: start time, if in progress or finish time if done.

1.25.3.3. Notices

First of all, can't launch Hot re-indexing via JMX if index is already in offline mode. It means that index is currently is invoked in some operations, like re-indexing at startup, copying in cluster to another node or whatever. Another important this is Hot Asynchronous Reindexing via JMX and "on startup" reindexing are completely different features. So you can't get the state of startup reindexing using command getHotReindexingState in JMX interface, but there are some common JMX operations:

getIOMode - returns current index IO mode (READ_ONLY / READ_WRITE), belongs to clustered configuration states;
getState - returns current state: ONLINE / OFFLINE.

1.25.4. Advanced tuning

1.25.4.1. Lucene tuning

As mentioned above, JCR Indexing is based on Lucene indexing library as underlying search engine. It uses Directories to store index and manages access to index by Lock Factories. By default JCR implementation uses optimal combination of Directory implementation and Lock Factory implementation. When running on OS different from Windows, NIOFSDirectory implementation used. And SimpleFSDirectory for Windows stations. NativeFSLockFactory is an optimal solution for wide variety of cases including clustered environment with NFS shared resources. But those default can be overridden with the help of system properties. There are two properties: "org.exoplatform.jcr.lucene.store.FSDirectoryLockFactoryClass" and "org.exoplatform.jcr.lucene.FSDirectory.class" that are responsible for changing default behavior. First one defines implementation of abstract Lucene LockFactory class and the second one sets implementation class for FSDirectory instances. For more information please refer to Lucene documentation. But be sure You know what You are changing. JCR allows end users to change implementation classes of Lucene internals, but doesn't guarantee it's stability and functionality.

1.25.4.2. Index optimization

From time to time, the Lucene index needs to be optimized. The process is essentially a defragmentation. Until an optimization is triggered Lucene only marks deleted documents as such, no physical deletions are applied. During the optimization process the deletions will be applied. Optimizing the Lucene index speeds up searches but has no effect on the indexation (update) performance. First of all ensure repository is suspended to avoid any possible inconsistency. It is recommended to schedule optimization. Also checking for pending deletions is supported. If it is so, it is a first signal to index optimization. All operation are available via JMX:

1.26. JBossTransactionsService

JBossTransactionsService implements eXo TransactionService and provides access to JBoss Transaction Service (JBossTS) JTA implementation via eXo container dependency.

TransactionService is used in JCR cache org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache implementaion. See Cluster configuration for example.

1.26.1. Configuration

Example configuration:

  <component>
    <key>org.exoplatform.services.transaction.TransactionService</key>
    <type>org.exoplatform.services.transaction.jbosscache.JBossTransactionsService</type>
    <init-params>
      <value-param>
        <name>timeout</name>
        <value>3000</value>
      </value-param>
    </init-params>   
  </component>

timeout - XA transaction timeout in seconds

1.27. TransactionManagerLookup

JBossCache class is registered as an eXo container component in the configuration.xml file.

  <component>
     <key>org.jboss.cache.transaction.TransactionManagerLookup</key>
     <type>org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup</type>
  </component>

JBossStandaloneJTAManagerLookup is used in a standalone environment, but GenericTransactionManagerLookup is used in the Application Server environment.

1.28. Infinispan integration

eXo JCR can rely on distributed cache such as Infinispan. This article describes the required configuration.

1.28.1. Components configuration requirements

<component>
    <key>org.infinispan.transaction.lookup.TransactionManagerLookup</key>
    <type>org.exoplatform.services.transaction.infinispan.JBossStandaloneJTAManagerLookup</type>
</component>

<component profiles="ispn">
    <key>org.exoplatform.services.transaction.TransactionService</key>
    <type>org.exoplatform.services.transaction.infinispan.JBossTransactionsService</type>
    <init-params>
        <value-param>
            <name>timeout</name>
            <value>3000</value>
        </value-param>
    </init-params>
</component>

<component profiles="ispn">
    <key>org.exoplatform.services.rpc.RPCService</key>
    <type>org.exoplatform.services.rpc.jgv3.RPCServiceImpl</type>
    <init-params>
        <value-param>
            <name>jgroups-configuration</name>
            <value>jar:/conf/udp-mux-v3.xml</value>
        </value-param>
        <value-param>
            <name>jgroups-cluster-name</name>
            <value>RPCService-Cluster</value>
        </value-param>
        <value-param>
            <name>jgroups-default-timeout</name>
            <value>0</value>
        </value-param>
    </init-params>
</component>

1.28.2. Workspaces configuration requirements

Each mentioned below components uses instances of Infinispan Cache product for caching in clustered environment. So every element has it's own transport and has to be configured in a proper way. As usual, workspaces have similar configuration. The simplest way to configure them is to define their own configuration files for each component in each workspace. There are several commons parameters.

"infinispan-configuration" defines path to template based configuration for Infinispan Cache instance.

JGroups is used by Infinispan Cache for network communications and transport in a clustered environment. If property "jgroups-configuration" is defined in component configuration, it will be injected into the Infinispan Cache instance on startup.

The another parameter is "infinispan-cluster-name". This defines the name of the cluster. Needs to be the same for all nodes in a cluster in order to find each other.

Cache configuration:

<cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.infinispan.ISPNCacheWorkspaceStorageCache">
    <properties>
        <property name="infinispan-configuration" value="jar:/conf/portal/cluster/infinispan-data.xml" />
        <property name="jgroups-configuration" value="jar:/conf/udp-mux-v3.xml" />
        <property name="infinispan-cluster-name" value="JCR-cluster" />
    </properties>
</cache>

Indexer configuration

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
    <properties>
        <property name="index-dir" value="${exo.jcr.parent.dir:..}/temp/jcrlucenedb/production" />
        <property name="changesfilter-class" value="org.exoplatform.services.jcr.impl.core.query.ispn.ISPNIndexChangesFilter" />
        <property name="infinispan-configuration" value="jar:/conf/portal/cluster/infinispan-indexer.xml" />
        <property name="jgroups-configuration" value="jar:/conf/udp-mux-v3.xml" />
        <property name="infinispan-cluster-name" value="JCR-cluster" />
        <property name="max-volatile-time" value="60" />
    </properties>
</query-handler>

changesfilter-class - defines cluster-ready index strategy based on Infinispan Cache, it can be either org.exoplatform.services.jcr.impl.core.query.ispn.ISPNIndexChangesFilter (for shared and rsync-based index strategies) or org.exoplatform.services.jcr.impl.core.query.ispn.LocalIndexChangesFilter (for local index)

Lock Manager configuration

<lock-manager class="org.exoplatform.services.jcr.impl.core.lock.infinispan.ISPNCacheableLockManagerImpl">
    <properties>
        <property name="time-out" value="15m" />
        <property name="infinispan-configuration" value="jar:/conf/portal/cluster/infinispan-lock.xml" />
        <property name="jgroups-configuration" value="jar:/conf/udp-mux-v3.xml" />
        <property name="infinispan-cluster-name" value="JCR-cluster" />
        <property name="infinispan-cl-cache.jdbc.table.name" value="lk" />
        <property name="infinispan-cl-cache.jdbc.table.create" value="true" />
        <property name="infinispan-cl-cache.jdbc.table.drop" value="false" />
        <property name="infinispan-cl-cache.jdbc.id.column" value="id" />
        <property name="infinispan-cl-cache.jdbc.data.column" value="data" />
        <property name="infinispan-cl-cache.jdbc.timestamp.column" value="timestamp" />
        <property name="infinispan-cl-cache.jdbc.datasource" value="jdbcjcr" />
        <property name="infinispan-cl-cache.jdbc.dialect" value="${dialect}" />
        <property name="infinispan-cl-cache.jdbc.connectionFactory" value="org.infinispan.loaders.jdbc.connectionfactory.ManagedConnectionFactory" />
    </properties>
</lock-manager>^

infinispan-cl-cache.jdbc.table.name - table name
infinispan-cl-cache.jdbc.table.create - is true or false. Indicates whether to create table at start phase. If true, the table is created if it does not already exist.
infinispan-cl-cache.jdbc.table.drop - is true or false. Indicates whether to drop the table at stop phase.
infinispan-cl-cache.jdbc.id.column - id column name
infinispan-cl-cache.jdbc.data.column - data column name
infinispan-cl-cache.jdbc.timestamp.column - timestamp column name
infinispan-cl-cache.jdbc.datasource - name of the datasource to use to store locks.
infinispan-cl-cache.jdbc.dialect - dialect of the database.
infinispan-cl-cache.jdbc.connectionFactory - connection factory to use with the JDBC Cache Store.

1.28.3. Shipped Infinispan Cache configuration templates

eXo JCR implementation is shipped with ready-to-use Infinispan Cache configuration templates for JCR's components.

1.28.3.1. Data container template

Data container template is "infinispan-data.xml":

<infinispan 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance". 
      xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd". 
      xmlns="urn:infinispan:config:5.1">

    <global>
      <evictionScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
        <properties>
          <property name="threadNamePrefix" value="EvictionThread"/>
        </properties>
      </evictionScheduledExecutor>

      <globalJmxStatistics jmxDomain="exo" enabled="true" allowDuplicateDomains="true"/>

      <transport transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport" clusterName="${infinispan-cluster-name}" distributedSyncTimeout=
        <properties>
          <property name="configurationFile" value="${jgroups-configuration}"/>
        </properties>
      </transport>
    </global>

    <default>
      <clustering mode="replication">
        <stateTransfer timeout="20000" fetchInMemoryState="false" />
        <sync replTimeout="20000"/>
      </clustering>

      <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="500" useLockStriping="true"/>
      <transaction transactionManagerLookupClass="org.exoplatform.services.transaction.infinispan.JBossStandaloneJTAManagerLookup" syncRollbackPhase="true" s
      <jmxStatistics enabled="true"/>
      <eviction strategy="LRU" threadPolicy="DEFAULT" maxEntries="1000000"/>
      <expiration wakeUpInterval="5000"/>
   </default>
</infinispan>

Table 1.10. Template variables

Variable	Description
jgroups-configuration	This is the path to JGroups configuration that should not be anymore jgroups' stack definitions but a normal jgroups configuration format with the shared transport configured by simply setting the jgroups property singleton_name to a unique name (it must remain unique from one portal container to another). This file is also pre-bundled with templates and is recommended for use.
infinispan-cluster-name	This defines the name of the cluster. Needs to be the same for all nodes in a cluster in order to find each other.

1.28.3.2. Lock manager template

Its template name is "infinispan-lock.xml"

<infinispan 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance". 
      xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd". 
      xmlns="urn:infinispan:config:5.1">

    <global>
      <evictionScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
        <properties>
          <property name="threadNamePrefix" value="EvictionThread"/>
        </properties>
      </evictionScheduledExecutor>

      <globalJmxStatistics jmxDomain="exo" enabled="true" allowDuplicateDomains="true"/>

      <transport transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport" clusterName="${infinispan-cluster-name}" distributedSyncTimeout=
        <properties>
          <property name="configurationFile" value="${jgroups-configuration}"/>
        </properties>
      </transport>
    </global>

    <default>
      <clustering mode="replication">
        <stateTransfer timeout="20000" fetchInMemoryState="false" />
        <sync replTimeout="20000"/>
      </clustering>

      <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="500" useLockStriping="false"/>
      <transaction transactionManagerLookupClass="org.exoplatform.services.transaction.infinispan.JBossStandaloneJTAManagerLookup" syncRollbackPhase="true" s
      <jmxStatistics enabled="true"/>
      <eviction strategy="NONE"/>

      <loaders passivation="false" shared="true" preload="true">
        <loader class="org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore" fetchPersistentState="true" ignoreModifications="false" purgeOnStar
          <properties>
             <property name="stringsTableNamePrefix" value="${infinispan-cl-cache.jdbc.table.name}"/>
             <property name="idColumnName" value="${infinispan-cl-cache.jdbc.id.column}"/>
             <property name="dataColumnName" value="${infinispan-cl-cache.jdbc.data.column}"/>
             <property name="timestampColumnName" value="${infinispan-cl-cache.jdbc.timestamp.column}"/>
             <property name="idColumnType" value="${infinispan-cl-cache.jdbc.id.type}"/>
             <property name="dataColumnType" value="${infinispan-cl-cache.jdbc.data.type}"/>
             <property name="timestampColumnType" value="${infinispan-cl-cache.jdbc.timestamp.type}"/>
             <property name="dropTableOnExit" value="${infinispan-cl-cache.jdbc.table.drop}"/>
             <property name="createTableOnStart" value="${infinispan-cl-cache.jdbc.table.create}"/>
             <property name="connectionFactoryClass" value="${infinispan-cl-cache.jdbc.connectionFactory}"/>
             <property name="datasourceJndiLocation" value="${infinispan-cl-cache.jdbc.datasource}"/>
          </properties>
          <async enabled="false"/>
        </loader>
      </loaders>
   </default>

</infinispan>

Table 1.11. Template variables

Variable	Description
jgroups-configuration	This is the path to JGroups configuration that should not be anymore jgroups' stack definitions but a normal jgroups configuration format with the shared transport configured by simply setting the jgroups property singleton_name to a unique name (it must remain unique from one portal container to another). This file is also pre-bundled with templates and is recommended for use.
infinispan-cluster-name	This defines the name of the cluster. Needs to be the same for all nodes in a cluster in order to find each other.
infinispan-cl-cache.jdbc.table.name	The table name.
infinispan-cl-cache.jdbc.id.column	The name of the column id.
infinispan-cl-cache.jdbc.data.column	The name of the column data.
infinispan-cl-cache.jdbc.timestamp.column	The name of the column timestamp.
infinispan-cl-cache.jdbc.id.type	The type of the column id.
infinispan-cl-cache.jdbc.data.type	The type of the column data.
infinispan-cl-cache.jdbc.timestamp.type	The type of the column timestamp.
infinispan-cl-cache.jdbc.table.drop	Can be set to true or false. Indicates whether to drop the table at stop phase.
infinispan-cl-cache.jdbc.table.create	Can be set to true or false. Indicates whether to create table at start phase. If true, the table is created if it does not already exist.
infinispan-cl-cache.jdbc.connectionFactory	The connection factory to use with the JDBC Cache Store.
infinispan-cl-cache.jdbc.datasource	The name of the datasource to use to store locks.

1.28.3.3. Query handler (indexer) template

Have a look at "infinispan-indexer.xml"

<infinispan 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance". 
      xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd". 
      xmlns="urn:infinispan:config:5.1">

    <global>
      <evictionScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
        <properties>
          <property name="threadNamePrefix" value="EvictionThread"/>
        </properties>
      </evictionScheduledExecutor>

      <globalJmxStatistics jmxDomain="exo" enabled="true" allowDuplicateDomains="true"/>

      <transport transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport" clusterName="${infinispan-cluster-name}" distributedSyncTimeout=
        <properties>
          <property name="configurationFile" value="${jgroups-configuration}"/>
        </properties>
      </transport>
    </global>

    <default>
      <clustering mode="replication">
        <stateTransfer timeout="20000" fetchInMemoryState="false" />
        <sync replTimeout="20000"/>
      </clustering>

      <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="500" useLockStriping="false"/>
      <transaction transactionManagerLookupClass="org.exoplatform.services.transaction.infinispan.JBossStandaloneJTAManagerLookup" syncRollbackPhase="true" s
      <jmxStatistics enabled="true"/>
      <eviction strategy="NONE"/>

      <loaders passivation="false" shared="false" preload="false">
        <loader class="${infinispan-cachestore-classname}" fetchPersistentState="false" ignoreModifications="false" purgeOnStartup="false">
          <async enabled="false"/>
        </loader>
      </loaders>
   </default>
</infinispan>

Table 1.12. Template variables

Variable	Description
jgroups-configuration	This is the path to JGroups configuration that should not be anymore jgroups' stack definitions but a normal jgroups configuration format with the shared transport configured by simply setting the jgroups property singleton_name to a unique name (it must remain unique from one portal container to another). This file is also pre-bundled with templates and is recommended for use.
infinispan-cluster-name	This defines the name of the cluster. Needs to be the same for all nodes in a cluster in order to find each other.

1.29. RepositoryCreationService

RepositoryCreationService is the service which is used to create repositories in runtime. The service can be used in a standalone or cluster environment.

1.29.1. Dependencies

RepositoryConfigurationService depends to next components:

DBCreator - DBCreator used to create new database for each unbinded datasource.
BackupManager - BackupManager used to created repository from backup.
RPCService - RPCService used for communication between cluster-nodes
Note
RPCService may not be configured - in this case, RepositoryService will work as standalone service.

1.29.2. How it works

User executes reserveRepositoryName(String repositoryName) - client-node calls coordinator-node to reserve repositoryName. If this name is already reserved or repository with this name exist, client-node will fetch RepositoryCreationException. If not Client will get token string.
than user executes createRepository(String backupId, RepositoryEntry rEntry, String token). Coordinator-node checks the token, and creates Repository.
whan repository become created - user-node broadcast message to all clusterNodes with RepositoryEntry, so each cluster node starts new Repository.

There is two ways to create repositry: make it in single step - just call createRepository(String backupId, RepositoryEntry); or reserve repositoryName at first (reserveRepositoryName(String repositoryName)), than create reserved repository (createRepository(String backupId, RepositoryEntry rEntry, String token)).

1.29.3. Configuration

RepositoryCreationService configuration

<component>
   <key>org.exoplatform.services.jcr.ext.backup.BackupManager</key>
   <type>org.exoplatform.services.jcr.ext.backup.impl.BackupManagerImpl</type>
   <init-params>
      <properties-param>
         <name>backup-properties</name>
         <property name="backup-dir" value="target/backup" />
      </properties-param>
   </init-params>
</component>

<component>
   <key>org.exoplatform.services.database.creator.DBCreator</key>
   <type>org.exoplatform.services.database.creator.DBCreator</type>
   <init-params>
      <properties-param>
         <name>db-connection</name>
         <description>database connection properties</description>
         <property name="driverClassName" value="org.hsqldb.jdbcDriver" />
         <property name="url" value="jdbc:hsqldb:file:target/temp/data/" />
         <property name="username" value="sa" />
         <property name="password" value="" />
      </properties-param>
      <properties-param>
         <name>db-creation</name>
         <description>database creation properties</description>
         <property name="scriptPath" value="src/test/resources/test.sql" />
         <property name="username" value="sa" />
         <property name="password" value="" />
      </properties-param>
   </init-params>
</component>

<component>
   <key>org.exoplatform.services.rpc.RPCService</key>
   <type>org.exoplatform.services.rpc.impl.RPCServiceImpl</type>
   <init-params>
      <value-param>
         <name>jgroups-configuration</name>
         <value>jar:/conf/standalone/udp-mux.xml</value>
      </value-param>
      <value-param>
         <name>jgroups-cluster-name</name>
         <value>RPCService-Cluster</value>
      </value-param>
      <value-param>
         <name>jgroups-default-timeout</name>
         <value>0</value>
      </value-param>
   </init-params>
</component>  

<component>
   <key>org.exoplatform.services.jcr.ext.repository.creation.RepositoryCreationService</key>
   <type>
      org.exoplatform.services.jcr.ext.repository.creation.RepositoryCreationServiceImpl
   </type>
     <init-params> 
         <value-param> 
            <name>factory-class-name</name> 
            <value>org.apache.commons.dbcp.BasicDataSourceFactory</value> 
         </value-param> 
      </init-params>
</component>

factory-class-name - is not mandatory parameter, indicates what the factory need to use to create DataSource objects

1.29.4. RepositoryCreationService Interface

public interface RepositoryCreationService
{
   /**
    * Reserves, validates and creates repository in a simplified form.
    * 
    * @param rEntry - repository Entry - note that datasource must not exist.
    * @param backupId - backup id
    * @param creationProps - storage creation properties 
    * @throws RepositoryConfigurationException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    * @throws RepositoryCreationServiceException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    */
   void createRepository(String backupId, RepositoryEntry rEntry, StorageCreationProperties creationProps)
      throws RepositoryConfigurationException, RepositoryCreationException;

   /**
    * Reserves, validates and creates repository in a simplified form. 
    * 
    * @param rEntry - repository Entry - note that datasource must not exist.
    * @param backupId - backup id
    * @throws RepositoryConfigurationException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    * @throws RepositoryCreationServiceException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    */
   void createRepository(String backupId, RepositoryEntry rEntry) throws RepositoryConfigurationException,
      RepositoryCreationException;

   /**
    * Reserve repository name to prevent repository creation with same name from other place in same time
    * via this service.
    * 
    * @param repositoryName - repositoryName
    * @return repository token. Anyone obtaining a token can later create a repository of reserved name.
    * @throws RepositoryCreationServiceException if can't reserve name
    */
   String reserveRepositoryName(String repositoryName) throws RepositoryCreationException;

   /**
    * Creates repository, using token of already reserved repository name. 
    * Good for cases, when repository creation should be delayed or made asynchronously in dedicated thread. 
    * 
    * @param rEntry - repository entry - note, that datasource must not exist
    * @param backupId - backup id
    * @param rToken - token
    * @param creationProps - storage creation properties
    * @throws RepositoryConfigurationException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    * @throws RepositoryCreationServiceException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    */
   void createRepository(String backupId, RepositoryEntry rEntry, String rToken, StorageCreationProperties creationProps)
      throws RepositoryConfigurationException, RepositoryCreationException;

   /**
    * Creates  repository, using token of already reserved repository name. Good for cases, when repository creation should be delayed or 
    * made asynchronously in dedicated thread. 
    * 
    * @param rEntry - repository entry - note, that datasource must not exist
    * @param backupId - backup id
    * @param rToken - token
    * @throws RepositoryConfigurationException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    * @throws RepositoryCreationServiceException
    *          if some exception occurred during repository creation or repository name is absent in reserved list
    */
   void createRepository(String backupId, RepositoryEntry rEntry, String rToken)
      throws RepositoryConfigurationException, RepositoryCreationException;

   /**
    * Remove previously created repository. 
    * 
    * @param repositoryName - the repository name to delete
    * @param forceRemove - force close all opened sessions 
    * @throws RepositoryCreationServiceException
    *          if some exception occurred during repository removing occurred
    */
   void removeRepository(String repositoryName, boolean forceRemove) throws RepositoryCreationException;

}

1.29.5. Conclusions and restrictions

Each datasource in RepositoryEntry of new Repository must have unbinded datasources. Thats mean, such datasource must have not databases behind them. This restriction exists to avoid corruption of existing repositories data.
RPCService is optional component, but without it, RepositoryCreatorService can not communicate with other cluster-nodes and works as standalone.

1.30. JCR Query Usecases

JCR supports two query languages - JCR and XPath. A query, whether XPath or SQL, specifies a subset of nodes within a workspace, called the result set. The result set constitutes all the nodes in the workspace that meet the constraints stated in the query.

1.30.1. Query Lifecycle

1.30.1.1. Query Creation and Execution

// get QueryManager
QueryManager queryManager = workspace.getQueryManager(); 
// make SQL query
Query query = queryManager.createQuery("SELECT * FROM nt:base ", Query.SQL);
// execute query
QueryResult result = query.execute();

Now we can get result in an iterator of nodes:

// get QueryManager
QueryManager queryManager = workspace.getQueryManager();
// make XPath query
Query query = queryManager.createQuery("//element(*,nt:base)", Query.XPATH);
// execute query
QueryResult result = query.execute();

1.30.1.2. Query Result Processing

// fetch query result
QueryResult result = query.execute();

NodeIterator it = result.getNodes();

or we get the result in a table:

// get column names
String[] columnNames = result.getColumnNames();
// get column rows
RowIterator rowIterator = result.getRows();
while(rowIterator.hasNext()){
   // get next row
   Row row = rowIterator.nextRow();
   // get all values of row
   Value[] values = row.getValues();
}

1.30.1.3. Scoring

The result returns a score for each row in the result set. The score contains a value that indicates a rating of how well the result node matches the query. A high value means a better matching than a low value. This score can be used for ordering the result.

eXo JCR Scoring is a mapping of Lucene scoring. For a more in-depth understanding, please study Lucene documentation.

jcr:score counted in next way - (lucene score)*1000f.

Score may be increased for specified nodes, see Index Boost Value

Also, see an example Order by Score

1.30.2. Query result settings

Set Offset And Limit

1.30.3. Type Constraints

1.30.4. Property Constraints

1.30.5. Path Constraint

1.30.6. Ordering specifying

1.30.7. Fulltext Search

1.30.8. Indexing rules and additional features

1.30.9. Query Examples

1.30.9.1. SetOffset and SetLimit

Select all nodes with primary type 'nt:unstructured' and returns only 3 nodes starting with the second node in the list.

1.30.9.1.1. Common info

QueryImpl class has two methods: one to indicate how many results shall be returned at most, and another to fix the starting position.

setOffset(long offset) - Sets the start offset of the result set.
setLimit(long position) - Sets the maximum size of the result set.

1.30.9.1.2. Repository structure

Repository contains mix:title nodes, where jcr:title has different values.

root

1.30.9.1.3. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured";
QueryImpl query = (QueryImpl)queryManager.createQuery(sqlStatement, Query.SQL);
//return starting with second result
query.setOffset(1);
// return 3 results
query.setLimit(3);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.1.4. Fetching result

In usual case (without using setOffset and setLimit methods), Node iterator returns all nodes (node1...node6). But in our case NodeIterator will return "node2","node3" and "node4".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

\[node1 node2 node3 node4 node5 node6\]

1.30.9.2. Finding All Nodes

Find all nodes in the repository. Only those nodes are found to which the session has READ permission. See also Access Control.

1.30.9.2.1. Repository structure:

Repository contains many different nodes.

root
- folder1 (nt:folder)
  - document1 (nt:file)
  - folder2 (nt:folder)
    document2 (nt:unstructured)
    document3 (nt:folder)

1.30.9.2.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:base";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:base)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.2.3. Fetching result

NodeIterator will return "folder1", "folder2","document1","document2","document3", and each other nodes in workspace if they are here.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes whose primary type is "nt:file".

Table 1.13. Table content

jcr:path	jcr:score
/folder1	1000
/folder1/document1	1000
/folder1/folder2	1000
/folder1/folder2/document2	1000
/folder1/folder2/document3	1000
...	...

1.30.9.3. Finding Nodes by Primary Type

1.30.9.3.1. Repository structure:

The repository contains nodes with different primary types and mixin types.

root

1.30.9.3.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:file";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:file)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.3.3. Fetching result

NodeIterator will return "document2" and "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes in repository, that contain a mixin type "mix:title".

Table 1.14. Table content

jcr:path	jcr:score
/document2	2674
/document3	2674

1.30.9.4. Finding Nodes by Mixin Type

1.30.9.4.1. Repository structure:

The repository contains nodes with different primary types and mixin types.

root

1.30.9.4.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.4.3. Fetching result

The NodeIterator will return "document1" and "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with mixin type 'mix:title' where the prop_pagecount property contains a value less than 90. Only select the title of each node.

Table 1.15. Table content

jcr:title	...	jcr:path	jcr:score
First document	...	/document1	2674
Second document	...	/document3	2674

1.30.9.5. Property Comparison

1.30.9.5.1. Repository structure:

Repository contains several mix:title nodes, where each prop_pagecount contains a different value.

root

1.30.9.5.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT jcr:title FROM mix:title WHERE prop_pagecount < 90";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[@prop_pagecount < 90]/@jcr:title";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.5.3. Fetching result

The NodeIterator will return "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with mixin type 'mix:title' and where the property 'jcr:title' starts with 'P'.

Table 1.16. Table Content

jcr:title	jcr:path	jcr:score
Puss in Boots	/document3	1725

1.30.9.6. LIKE Constraint

Note

1.30.9.6.1. Repository structure:

The repository contains 3 mix:title nodes, where each jcr:title has a different value.

root

1.30.9.6.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE jcr:title LIKE 'P%'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:like(@jcr:title, 'P%')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.6.3. Fetching result

The NodeIterator will return "document2" and "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with a mixin type 'mix:title' and whose property 'jcr:title' starts with 'P%ri'.

Table 1.17. Table content

jcr:title	jcr:description	jcr:path	jcr:score
Prison break	Run, Forest, run ))	/document2	4713
Panopticum	It's imagine film	/document3	5150

1.30.9.7. Escaping in LIKE Statements

As you see "P%rison break" contains the symbol '%'. This symbol is reserved for LIKE comparisons. So what can we do?

Within the LIKE pattern, literal instances of percent ("%") or underscore ("_") must be escaped. The SQL ESCAPE clause allows the definition of an arbitrary escape character within the context of a single LIKE statement. The following example defines the backslash ' \' as escape character:

SELECT * FROM mytype WHERE a LIKE 'foo\%' ESCAPE '\'

XPath does not have any specification for defining escape symbols, so we must use the default escape character (' \').

1.30.9.7.1. Repository structure

The repository contains mix:title nodes, where jcr:title can have different values.

root

1.30.9.7.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE jcr:title LIKE 'P#%ri%' ESCAPE '#'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:like(@jcr:title, 'P\\%ri%')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.7.3. Fetching result

NodeIterator will return "document2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with a mixin type 'mix:title' and where the property 'jcr:title' does NOT start with a 'P' symbol

Table 1.18. Table content

jcr:title	jcr:description	jcr:path	jcr:score
P%rison break	Run, Forest, run ))	/document2	7452

1.30.9.8. NOT Constraint

1.30.9.8.1. Repository Structure

The repository contains a mix:title nodes, where the jcr:title has different values.

root

1.30.9.8.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE NOT jcr:title LIKE 'P%'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

NodeIterator will return "document1".

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[not(jcr:like(@jcr:title, 'P%'))]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.8.3. Fetching the result

Let's get the nodes:

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all fairytales with a page count more than 90 pages.

Table 1.19. Table content

jcr:title	jcr:description	jcr:path	jcr:score
Star wars	Dart rules!!	/document1	4713

1.30.9.9. AND Constraint

How does it sound in jcr terms - Find all nodes with mixin type 'mix:title' where the property 'jcr:description' equals "fairytale" and whose "prop_pagecount" property value is less than 90.

Note

1.30.9.9.1. Repository Structure:

The repository contains mix:title nodes, where prop_pagecount has different values.

root

1.30.9.9.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE jcr:description = 'fairytale' AND prop_pagecount > 90";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[@jcr:description='fairytale' and @prop_pagecount > 90]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.9.3. Fetching the Result

NodeIterator will return "document2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all documents whose title is 'Cinderella' or whose description is 'novel'.

Table 1.20. Table content

jcr:title	jcr:description	prop_pagecount	jcr:path	jcr:score
Cinderella	fairytale	100	/document2	7086

1.30.9.10. OR Constraint

How does it sound in jcr terms? - Find all nodes with a mixin type 'mix:title' whose property 'jcr:title' equals "Cinderella" or whose "jcr:description" property value is "novel".

1.30.9.10.1. Repository Structure

The repository contains mix:title nodes, where jcr:title and jcr:description have different values.

root

1.30.9.10.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE jcr:title = 'Cinderella' OR jcr:description = 'novel'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[@jcr:title='Cinderella' or @jcr:description = 'novel']";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.10.3. Fetching the Result

NodeIterator will return "document1" and "document2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with a mixin type 'mix:title' where the property 'jcr:description' does not exist (is null).

Table 1.21. Table content

jcr:title	jcr:description	jcr:path	jcr:score
War and peace	novel	/document1	3806
Cinderella	fairytale	/document2	3806

1.30.9.11. Property Existence Constraint

1.30.9.11.1. Repository Structure

The repository contains mix:title nodes, in one of these nodes the jcr:description property is null.

root

1.30.9.11.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE jcr:description IS NULL";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = ""//element(*,mix:title)[not(@jcr:description)]"";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.11.3. Fetching the Result

NodeIterator will return "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with a mixin type 'mix:title' and where the property 'jcr:title' equals 'casesensitive' in lower or upper case.

Table 1.22. Table content

jcr:title	jcr:description	jcr:path	jcr:score
Titanic	null	/document3	1947

1.30.9.12. Finding Nodes in a Case-Insensitive Way

1.30.9.12.1. Repository Structure

The repository contains mix:title nodes, whose jcr:title properties have different values.

root

1.30.9.12.2. Query Execution

UPPER case

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE UPPER(jcr:title) = 'CASESENSITIVE'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[fn:upper-case(@jcr:title)='CASESENSITIVE']";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

LOWER case

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE LOWER(jcr:title) = 'casesensitive'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[fn:lower-case(@jcr:title)='casesensitive']";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.12.3. Fetching the Result

NodeIterator will return "document1", "document2" and "document3" (in all examples).

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes of primary type "nt:resource" whose jcr:lastModified property value is greater than 2006-06-04 and less than 2008-06-04.

Table 1.23. Table content

jcr:title	...	jcr:path
CaseSensitive	...	/document1
casesensitive	...	/document2
caseSENSITIVE	...	/document3

1.30.9.13. Date Property Comparison

1.30.9.13.1. Repository Structure

Repository contains nt:resource nodes with different values of jcr:lastModified property

root

1.30.9.13.2. Query Execution

In SQL you have to use the keyword TIMESTAMP for date comparisons. Otherwise, the date would be interpreted as a string. The date has to be surrounded by single quotes (TIMESTAMP 'datetime') and in the ISO standard format: YYYY-MM-DDThh:mm:ss.sTZD ( http://en.wikipedia.org/wiki/ISO_8601 and well explained in a W3C note http://www.w3.org/TR/NOTE-datetime).

You will see that it can be a date only (YYYY-MM-DD) but also a complete date and time with a timezone designator (TZD).

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
StringBuffer sb = new StringBuffer();
sb.append("select * from nt:resource where ");
sb.append("( jcr:lastModified >= TIMESTAMP '");
sb.append("2006-06-04T15:34:15.917+02:00");
sb.append("' )");
sb.append(" and ");
sb.append("( jcr:lastModified <= TIMESTAMP '");
sb.append("2008-06-04T15:34:15.917+02:00");
sb.append("' )");
String sqlStatement = sb.toString();
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

XPath

Compared to the SQL format, you have to use the keyword xs:dateTime and surround the datetime by extra brackets: xs:dateTime('datetime'). The actual format of the datetime also conforms with the ISO date standard.

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
StringBuffer sb = new StringBuffer();
sb.append("//element(*,nt:resource)");
sb.append("[");
sb.append("@jcr:lastModified >= xs:dateTime('2006-08-19T10:11:38.281+02:00')");
sb.append(" and ");
sb.append("@jcr:lastModified <= xs:dateTime('2008-06-04T15:34:15.917+02:00')");
sb.append("]");
String xpathStatement = sb.toString();
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.13.3. Fetching the result

NodeIterator will return "/document3/jcr:content".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node foundNode = it.nextNode();
}

Find all nodes with primary type 'nt:file' whose node name is 'document'. The node name is accessible by a function called "fn:name()".

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

The table content is:

Table 1.24. Table content

jcr:lastModified	...	jcr:path
2007-01-19T15:34:15.917+02:00	...	/document3/jcr:content

1.30.9.14. Node Name Constraint

Note

fn:name() can be used ONLY with an equal('=') comparison.

1.30.9.14.1. Repository Structure

The repository contains nt:file nodes with different names.

root

1.30.9.14.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:file WHERE fn:name() = 'document'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:file)[fn:name() = 'document']";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.14.3. Fetching the Result

The NodeIterator will return the node whose fn:name equals "document".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

Also we can get a table:

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with the primary type 'nt:unstructured' whose property 'multiprop' contains both values "one" and "two".

Table 1.25. Table content

jcr:path	jcr:score
/document1	3575

1.30.9.15. Multivalue Property Comparison

1.30.9.15.1. Repository Structure

The repository contains nt:unstructured nodes with different 'multiprop' properties.

root

1.30.9.15.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE multiprop = 'one' AND multiprop = 'two'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[@multiprop = 'one' and @multiprop = 'two']";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.15.3. Fetching the Result

The NodeIterator will return "node1" and "node2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find a node with the primary type 'nt:file' that is located on the exact path "/folder1/folder2/document1".

Table 1.26. Table content

jcr:primarytyp	jcr:path	jcr:score
nt:unstructured	/node1	3806
nt:unstructured	/node2	3806

1.30.9.16. Exact Path Constraint

1.30.9.16.1. Repository Structure

Repository filled by different nodes. There are several folders which contain other folders and files.

root
- folder1 (nt:folder)
  - folder2 (nt:folder)
    document1 (nt:file) // This document we want to find
    folder3 (nt:folder)
    document1 (nt:file)

1.30.9.16.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// we want find 'document1'
String sqlStatement = "SELECT * FROM nt:file WHERE jcr:path = '/folder1/folder2/document1'";
// create query
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

Remark: The indexes [1] are used in order to get the same result as the SQL statement. SQL by default only returns the first node, whereas XPath fetches by default all nodes.

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// we want to find 'document1'
String xpathStatement = "/jcr:root/folder1[1]/folder2[1]/element(document1,nt:file)[1]";
// create query
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.16.3. Fetching the Result

NodeIterator will return expected "document1".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with the primary type 'nt:folder' that are children of node by path "/root1/root2". Only find children, do not find further descendants.

Table 1.27. Table content

jcr:path	jcr:score
/folder1/folder2/document1	1030

1.30.9.17. Child Node Constraint

1.30.9.17.1. Repository Structure

The repository is filled by "nt:folder" nodes. The nodes are placed in a multilayer tree.

root
- folder1 (nt:folder)
  - folder2 (nt:folder)
    folder3 (nt:folder) // This node we want to find
    folder4 (nt:folder) // This node is not child but a descendant of '/folder1/folder2/'.
    folder5 (nt:folder) // This node we want to find

1.30.9.17.2. Query Execution

The use of "%" in the LIKE statement includes any string, therefore there is a second LIKE statement that excludes that the string contains "/". This way child nodes are included but descendant nodes are excluded.

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:folder WHERE jcr:path LIKE '/folder1/folder2/%' AND NOT jcr:path LIKE '/folder1/folder2/%/%'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "/jcr:root/folder1[1]/folder2[1]/element(*,nt:folder)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.17.3. Fetching the Result

The NodeIterator will return "folder3" and "folder5".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

Find all nodes with the primary type 'nt:folder' that are descendants of the node "/folder1/folder2".

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

The table content is:

Table 1.28. Table content

jcr:path	jcr:score
/folder1/folder2/folder3	1707
/folder1/folder2/folder5	1707

1.30.9.18. Finding All Descendant Nodes

1.30.9.18.1. Repository Structure

The repository contains "nt:folder" nodes. The nodes are placed in a multilayer tree.

root
- folder1 (nt:folder)
  - folder2 (nt:folder)
    folder3 (nt:folder) // This node we want to find
    folder4 (nt:folder) // This node we want to find
    folder5 (nt:folder) // This node we want to find

1.30.9.18.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:folder WHERE jcr:path LIKE '/folder1/folder2/%'";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "/jcr:root/folder1[1]/folder2[1]//element(*,nt:folder)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.18.3. Fetching the Result

The NodeIterator will return "folder3", "folder4" and "folder5" nodes.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Select all nodes with the mixin type ''mix:title' and order them by the 'prop_pagecount' property.

Table 1.29. Table content

jcr:path	jcr:score
/folder1/folder2/folder3	1000
/folder1/folder2/folder3/folder4	1000
/folder1/folder2/folder5	1000

1.30.9.19. Sorting Nodes by Property

1.30.9.19.1. Repository Structure

The repository contains several mix:title nodes, where prop_pagecount has different values.

root

1.30.9.19.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title ORDER BY prop_pagecount ASC";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title) order by @prop_pagecount ascending";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.19.3. Fetching the Result

The NodeIterator will return nodes in the following order "document3", "document1", "document2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all nodes with the primary type 'nt:unstructured' and sort them by the property value of descendant nodes with the relative path '/a/b'.

Table 1.30. Table content

jcr:title	jcr:description	prop_pagecount	jcr:path	jcr:score
Puss in Boots	fairytale	1	/document3	1405
War and peace	roman	4	/document1	1405
Cinderella	fairytale	7	/document2	1405

1.30.9.20. Ordering by Descendant Nodes Property (XPath only)

Note

This ORDER BY construction only works in XPath!

1.30.9.20.1. Repository structure:

root

1.30.9.20.2. Query Execution

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "/jcr:root/* order by a/b/c/@prop descending;
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.20.3. Fetching the Result

NodeIterator will return nodes in the following order - "node3","node2" and "node1".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Select all nodes with the mixin type 'mix:title' containing any word from the set {'brown','fox','jumps'}. Then, sort result by the score in ascending node. This way nodes that match better the query statement are ordered at the last positions in the result list.

Table 1.31. Table content

jcr:primaryType	jcr:path	jcr:score
nt:unstructured	/testroot/node3	1000
nt:unstructured	/testroot/node2	1000
nt:unstructured	/testroot/node1	1000

1.30.9.21. Ordering by Score

1.30.9.21.1. Info

SQL and XPath queries support both score constructions jcr:score and jcr:score()

SELECT * FROM nt:base ORDER BY jcr:score [ASC|DESC]
SELECT * FROM nt:base ORDER BY jcr:score()[ASC|DESC]

//element(*,nt:base) order by jcr:score() [descending]
//element(*,nt:base) order by @jcr:score [descending]

Do not use "ascending" combined with jcr:score in XPath. The following XPath statement may throw an exception:

... order by jcr:score() ascending

Do not set any ordering specifier - ascending is default:

... order by jcr:score()

1.30.9.21.2. Repository Structure

The repository contains mix:title nodes, where the jcr:description has different values.

root

1.30.9.21.3. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(*, 'brown OR fox OR jumps') ORDER BY jcr:score() ASC";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

NodeIterator will return nodes in the following order: "document3", "document2", "document1".

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:contains(., 'brown OR fox OR jumps')] order by jcr:score()";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.21.4. Fetching the Result

Let's get nodes

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Ordering by jcr:path or jcr:name does not supported.

Table 1.32. Table content

jcr:description	...	jcr:path	jcr:score
The fox is a nice animal.	...	/document3	2512
The brown fox lives in the forest.	...	/document2	3595
The quick brown fox jumps over the lazy dog.	...	/document1	5017

1.30.9.22. Ordering by Path or Name

Warning

There is two ways to order results, when path may be used as criteria:

Order by property with value type NAME or PATH (jcr supports it)
Order by jcr:path or jcr:name - sort by exact path or name of node (jcr do not supports it)

If no order specification is supplied in the query statement, implementations may support document order on the result nodes (see jsr-170 / 6.6.4.2 Document Order). And it's sorted by order number.

By default, (if query do not contains any ordering statements) result nodes is sorted by document order.

SELECT * FROM nt:unstructured WHERE jcr:path LIKE 'testRoot/%'

1.30.9.23. Fulltext Search by Property

Find all nodes containing a mixin type 'mix:title' and whose 'jcr:description' contains "forest" string.

1.30.9.23.1. Repository Structure

The repository is filled with nodes of the mixin type 'mix:title' and different values of the 'jcr:description' property.

root

1.30.9.23.2. Query Execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// we want find document which contains "forest" word
String sqlStatement = "SELECT \* FROM mix:title WHERE CONTAINS(jcr:description, 'forest')";
// create query
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// we want find document which contains "forest" word
String xpathStatement = "//element(*,mix:title)[jcr:contains(@jcr:description, 'forest')]";
// create query
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.23.3. Fetching the Result

NodeIterator will return "document2".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find nodes with mixin type 'mix:title' where any property contains 'break' string.

Table 1.33. Table content

jcr:description	...	jcr:path
The brown fox lives in forest.	...	/document2

1.30.9.24. Fulltext Search by All Properties in Node

1.30.9.24.1. Repository structure:

Repository filled with different nodes with mixin type 'mix:title' and different values of 'jcr:title' and 'jcr:description' properties.

root

1.30.9.24.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(*,'break')";
// create query
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// we want find 'document1'
String xpathStatement = "//element(*,mix:title)[jcr:contains(.,'break')]";
// create query
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.24.3. Fetching result

NodeIterator will return "document1" and "document2".

NodeIterator it = result.getNodes();

while(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

In this example, we will create new Analyzer, set it in QueryHandler configuration, and make query to check it.

Table 1.34. Table content

jcr:title	jcr:description	...	jcr:path
Prison break.	Run, Forest, run ))	...	/document2
Titanic	An iceberg breaks a ship.	...	/document3

1.30.9.25. Ignoring Accent Symbols. New Analyzer Setting.

Standard analyzer does not normalize accents like é,è,à. So, a word like 'tréma' will be stored to index as 'tréma'. But if we want to normalize such symbols or not? We want to store 'tréma' word as 'trema'.

There is two ways of setting up new Analyzer (no matter standarts or our):

The first way: Create descendant class of SearchIndex with new Analyzer (see Search Configuration);

There is only one way - create new Analyzer (if there is no previously created and accepted for our needs) and set it in Search index.

The second way: Register new Analyzer in QueryHandler configuration (this one eccepted since 1.12 version);

We will use the last one:

Create new MyAnalyzer

public class MyAnalyzer extends Analyzer
{
   @Override
   public TokenStream tokenStream(String fieldName, Reader reader)
   {
      StandardTokenizer tokenStream = new StandardTokenizer(reader);
      // process all text with standard filter
      // removes 's (as 's in "Peter's") from the end of words and removes dots from acronyms.
      TokenStream result = new StandardFilter(tokenStream);
      // this filter normalizes token text to lower case
      result = new LowerCaseFilter(result);
      // this one replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalents
      result = new ISOLatin1AccentFilter(result);
      // and finally return token stream
      return result;
   }
}

Then, register new MyAnalyzer in configuration

<workspace name="ws">
   ...
   <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
      <properties>
         <property name="analyzer" value="org.exoplatform.services.jcr.impl.core.MyAnalyzer"/>
         ...
      </properties>
   </query-handler>
   ...
</workspace>

After that, check it with query:

Find node with mixin type 'mix:title' where 'jcr:title' contains "tréma" and "naïve" strings.

1.30.9.25.1. Repository structure:

Repository filled by nodes with mixin type 'mix:title' and different values of 'jcr:title' property.

root
- node1 (mix:title) jcr:title = "tréma blabla naïve"
- node2 (mix:title) jcr:description = "trema come text naive"

1.30.9.25.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:title, 'tr\u00E8ma na\u00EFve')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:contains(@jcr:title, 'tr\u00E8ma na\u00EFve')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.25.3. Fetching result

NodeIterator will return "node1" and "node2". How is it possible? Remember that our MyAnalyzer transforms 'tréma' word to 'trema'. So node2 accepts our constraints to.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

Also, we can get a table:

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

The node type nt:file represents a file. It requires a single child node, called jcr:content. This node type represents images and other binary content in a JCRWiki entry. The node type of jcr:conent is nt:resource which represents the actual content of a file.

Table 1.35. Table content

cr:title	...	cr:path
trèma blabla naïve	...	/node1
trema come text naive	...	/node2

1.30.9.26. Finding nt:file node by content of child jcr:content node

Find node with the primary type is 'nt:file' and which whose 'jcr:content' child node contains "cats".

Normally, we can't find nodes (in our case) using just JCR SQL or XPath queries. But we can configure indexing so that nt:file aggregates jcr:content child node.

So, change indexing-configuration.xml:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.2.dtd">
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
    <aggregate primaryType="nt:file">
        <include>jcr:content</include>
        <include>jcr:content/*</include>
        <include-property>jcr:content/jcr:lastModified</include-property>
    </aggregate>
</configuration>

Now the content of 'nt:file' and 'jcr:content' ('nt:resource') nodes are concatenated in a single Lucene document. Then, we can make a fulltext search query by content of 'nt:file'; this search includes the content of child 'jcr:content' node.

1.30.9.26.1. Repository structure:

Repository contains different nt:file nodes.

root

1.30.9.26.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:file WHERE CONTAINS(*,'cats')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:file)[jcr:contains(.,'cats')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.26.3. Fetching the result

NodeIterator will return "document2" and "document3".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

In this example, we will set different boost values for predefined nodes, and will check effect by selecting those nodes and order them by jcr:score.

Table 1.36. Table content

jcr:path	jcr:score
/document2	1030
/document3	1030

1.30.9.27. Changing Priority of Node

The default boost value is 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) will yield a higher score value and appear as more relevant.

Note

See 4.2.2 Index Boost Value Search Configuration

1.30.9.27.1. Indexing configuration

In next configuration, we will set boost values for nt:ustructured nodes 'text' property.

indexing-config.xml:

<!-- 
This rule actualy do nothing. 'text' property has default boost value.
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost1'">
   <!-- default boost: 1.0 -->
   <property>text</property>
</index-rule>

<!-- 
Set boost value as 2.0 for 'text' property in nt:unstructured nodes where property 'rule' equal to 'boost2'
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost2'">
   <!-- boost: 2.0 -->
   <property boost="2.0">text</property>
</index-rule>

<!-- 
Set boost value as 3.0 for 'text' property in nt:unstructured nodes where property 'rule' equal to 'boost3'
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost3'">
   <!-- boost: 3.0 -->
   <property boost="3.0">text</property>
</index-rule>

1.30.9.27.2. Repository structure:

Repository contains many nodes with primary type nt:unstructured. Each node contains 'text' property and 'rule' property with different values.

root

1.30.9.27.3. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(text, 'quick') ORDER BY jcr:score() DESC";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(@text, 'quick')] order by @jcr:score descending";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.27.4. Fetching result

NodeIterator will return nodes in next order "node3", "node2", "node1".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

1.30.9.28. Removing Nodes Property From Indexing Scope

In this example, we will exclude some 'text' property of nt:unstructured node from indexind. And, therefore, node will not be found by the content of this property, even if it accepts all constraints.

First of all, add rules to indexing-configuration.xml:

<index-rule nodeType="nt:unstructured" condition="@rule='nsiTrue'">
    <!-- default value for nodeScopeIndex is true -->
    <property>text</property>
</index-rule>

<index-rule nodeType="nt:unstructured" condition="@rule='nsiFalse'">
    <!-- do not include text in node scope index -->
    <property nodeScopeIndex="false">text</property>
</index-rule>

Note

See Search Configuration

1.30.9.28.1. Repository structure:

Repository contains nt:unstructured nodes, with same 'text'property and different 'rule' properties (even null)

root

1.30.9.28.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(*,'quick')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'quick')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.28.3. Fetching result

NodeIterator will return "node1" and "node3". Node2, as you see, is not in result set.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

Also, we can get a table:

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

In this example, we want to configure indexind in the next way. All properties of nt:unstructured nodes must be excluded from search, except properties whoes names ends with 'Text' string. First of all, add rules to indexing-configuration.xml:

Table 1.37. Table content

jcr:primarytype	jcr:path	jcr:score
nt:unstructured	/node1	3806
nt:unstructured	/node3	3806

1.30.9.29. Regular Expression as Property Name in Indexing Rules

<index-rule nodeType="nt:unstructured"">
   <property isRegexp="true">.*Text</property>
</index-rule>

Note

See Search Configuration

Now, let's check this rule with simple query - select all nodes with primary type 'nt:unstructured' and containing 'quick' string (fulltext search by full node).

1.30.9.29.1. Repository structure:

Repository contains nt:unstructured nodes, with different 'text'-like named properties

root

1.30.9.29.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(*,'quick')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'quick')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.29.3. Fetching result

NodeIterator will return "node1" and "node2". "node3", as you see, is not in result set.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

Also, we can get a table:

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

High-lighting is not default feature so we must set it in jcr-config.xml, also excerpt provider must be defined:

Table 1.38. Table content

jcr:primarytype	jcr:path	jcr:score
nt:unstructured	/node1	3806
nt:unstructured	/node2	3806

1.30.9.30. High-lighting Result of Fulltext Search

It's also called excerption (see Excerpt configuration in Search Configuration and in Searching Repository article).

The goal of this query is to find words "eXo" and "implementation" with fulltext search and high-light this words in result value.

1.30.9.30.1. Base info

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="support-highlighting" value="true" />
      <property name="excerptprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.WeightedHTMLExcerpt"/>
      ...
   <properties>
</query-handler>

Also, remember that we can make indexing rules, as in the example below:

Let's write rule for all nodes with primary node type 'nt:unstructed' where property 'rule' equal to "excerpt" string. For those nodes, we will exclude property "title" from high-lighting and set "text" property as highlightable. Indexing-configuration.xml must containt the next rule:

<index-rule nodeType="nt:unstructured" condition="@rule='excerpt'">
   <property useInExcerpt="false">title</property>
   <property>text</property>
</index-rule>

1.30.9.30.2. Repository structure:

We have single node with primary type 'nt:unstructured'

document (nt:unstructured)

1.30.9.30.3. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT rep:excerpt() FROM nt:unstructured WHERE CONTAINS(*, 'eXo implementation')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

Now let's see on the result table:

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'eXo implementation')]/rep:excerpt(.)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.30.4. Fetching result

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

Find all mix:title nodes where title contains synonims to 'fast' word.

Table 1.39. Table content

rep:excerpt()	jcr:path	jcr:score
<div><span><strong>eXo<strong>is JCR<strong>implementation<strong><span><div>	/testroot/node1	335

As you see, words "eXo" and "implamentation" is highlighted.

Also, we can get exactly "rep:excerpt" value:

RowIterator rows = result.getRows();
Value excerpt = rows.nextRow().getValue("rep:excerpt(.)");
// excerpt will be equal to "<div><span\><strong>eXo</strong> is a JCR <strong>implementation</strong></span></div>"

1.30.9.31. Searching By Synonim

Note

Synonim provider must be configured in indexing-configuration.xml :

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
      <property name="synonymprovider-config-path" value="../../synonyms.properties" />
      ...
   </properties>
</query-handler>

File synonim.properties contains next synonims list:

ASF=Apache Software Foundation
quick=fast
sluggish=lazy

1.30.9.31.1. Repository structure:

Repository contains mix:title nodes, where jcr:title has different values.

root
- document1 (mix:title) jcr:title="The quick brown fox jumps over the lazy dog."

1.30.9.31.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:title, '~fast')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:contains(@jcr:title, '~fast')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.31.3. Fetching result

NodeIterator will return expected document1. This is a purpose of synonim providers. Find by specified word, but return by all synonims to.

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

1.30.9.32. Checking the spelling of Phrase

Check the correct spelling of phrase 'quik OR (-foo bar)' according to data already stored in index.

Note

SpellChecker must be settled in query-handler config.

test-jcr-config.xml:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
   <property name="spellchecker-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$FiveSecondsRefreshInterval" />
      ...
   </properties>
</query-handler>

1.30.9.32.1. Repository structure:

Repository contains node, with string property "The quick brown fox jumps over the lazy dog."

root
- node1 property="The quick brown fox jumps over the lazy dog."

1.30.9.32.2. Query execution

Query looks only for root node, because spell checker looks for suggestions by full index. So complicated query is redundant.

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('quik OR (-foo bar)')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

Get suggestion of coorect spelling our phrase:

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "/jcr:root[rep:spellcheck('quik OR (-foo bar)')]/(rep:spellcheck())";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.32.3. Fetching result

RowIterator it = result.getRows();
Row r = rows.nextRow();
Value v = r.getValue("rep:spellcheck()");
String correctPhrase = v.getString();

So, correct spelling for phrase "quik OR (-foo bar)" is "quick OR (-fox bar)".

1.30.9.33. Finding Similar Nodes

In our example, baseFile will contain text where "terms" word happens many times. That's a reason why the existanse of this word will be used as a criteria of node similarity (for node baseFile).

Note

Higlighting support must be added to configuration. test-jcr-config.xml:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="support-highlighting" value="true" />
      ...
   </properties>
</query-handler>

1.30.9.33.1. Repository structure:

Repository contains many nt:file nodes"

root

1.30.9.33.2. Query execution

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:resource WHERE SIMILAR(.,'/baseFile/jcr:content')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*, nt:resource)[rep:similar(., '/testroot/baseFile/jcr:content')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();

1.30.9.33.3. Fetching result

NodeIterator will return "/baseFile/jcr:content","/target1/jcr:content" and "/target3/jcr:content".

NodeIterator it = result.getNodes();

if(it.hasNext())
{
   Node findedNode = it.nextNode();
}

As you see the base node are also in result set.

String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
   Row row = rit.nextRow();
   // get values of the row
   Value[] values = row.getValues();
}

If you execute an XPath request like this:

Table 1.40. Table content

jcr:path	...	jcr:score
/baseFile/jcr:content	...	2674
/target1/jcr:content	...	2674
/target3/jcr:content	...	2674

1.30.10. Tips and tricks

1.30.10.1. XPath queries containing node names starting with a number

You will have an error : "Invalid request". This happens because XML does not allow names starting with a number - and XPath is part of XML: http://www.w3.org/TR/REC-xml/#NT-Name

// get QueryManager
QueryManager queryManager = workspace.getQueryManager(); 
// make XPath query
Query query = queryManager.createQuery("/jcr:root/Documents/Publie/2010//element(*, exo:article)", Query.XPATH);

Therefore, you cannot do XPath requests using a node name that starts with a number.

Easy workarounds:

Use an SQL request.
Use escaping :

XPath

// get QueryManager
QueryManager queryManager = workspace.getQueryManager(); 
// make XPath query
Query query = queryManager.createQuery("/jcr:root/Documents/Publie/_x0032_010//element(*, exo:article)", Query.XPATH);

1.31. Searching Repository Content

You can find the JCR configuration file at .../portal/WEB-INF/conf/jcr/repository-configuration.xml. Please read also Search Configuration for more information about index configuration.

1.31.1. Bi-directional RangeIterator (since 1.9)

QueryResult.getNodes() will return bi-directional NodeIterator implementation.

Note

Bi-directional NodeIterator is not supported in two cases:

SQL query: select * from nt:base
XPath query: //* .

TwoWayRangeIterator interface:

/**
 * Skip a number of elements in the iterator.
 * 
 * @param skipNum the non-negative number of elements to skip
 * @throws java.util.NoSuchElementException if skipped past the first element
 *           in the iterator.
 */
public void skipBack(long skipNum);

Usage:

NodeIterator iter = queryResult.getNodes();
while (iter.hasNext()) {
  if (skipForward) {
    iter.skip(10); // Skip 10 nodes in forward direction
  } else if (skipBack) {
    TwoWayRangeIterator backIter = (TwoWayRangeIterator) iter; 
    backIter.skipBack(10); // Skip 10 nodes back 
  }
  .......
}

1.31.2. Fuzzy Searches (since 1.0)

JCR supports such features as Lucene Fuzzy Searches Apache Lucene - Query Parser Syntax.

To use it, you have to form a query like the one described below:

QueryManager qman = session.getWorkspace().getQueryManager();
Query q = qman.createQuery("select * from nt:base where contains(field, 'ccccc~')", Query.SQL);
QueryResult res = q.execute();

1.31.3. SynonymSearch (since 1.9)

Searching with synonyms is integrated in the jcr:contains() function and uses the same syntax as synonym searches in Google. If a search term is prefixed by a tilde symbol ( ~ ), also synonyms of the search term are taken into consideration. For example:

SQL: select * from nt:resource where contains(., '~parameter')

XPath: //element(*, nt:resource)[jcr:contains(., '~parameter')

This feature is disabled by default and you need to add a configuration parameter to the query-handler element in your jcr configuration file to enable it.

<param  name="synonymprovider-config-path" value="..you path to configuration file....."/>
<param  name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider"/>

/**
 * <code>SynonymProvider</code> defines an interface for a component that
 * returns synonyms for a given term.
 */
public interface SynonymProvider {

   /**
    * Initializes the synonym provider and passes the file system resource to
    * the synonym provider configuration defined by the configuration value of
    * the <code>synonymProviderConfigPath</code> parameter. The resource may be
    * <code>null</code> if the configuration parameter is not set.
    *
    * @param fsr the file system resource to the synonym provider
    *            configuration.
    * @throws IOException if an error occurs while initializing the synonym
    *                     provider.
    */
   public void initialize(InputStream fsr) throws IOException;

   /**
    * Returns an array of terms that are considered synonyms for the given
    * <code>term</code>.
    *
    * @param term a search term.
    * @return an array of synonyms for the given <code>term</code> or an empty
    *         array if no synonyms are known.
    */
   public String[] getSynonyms(String term);
}

1.31.4. High-lighting (Since 1.9)

An ExcerptProvider retrieves text excerpts for a node in the query result and marks up the words in the text that match the query terms.

By default highlighting words matched the query is disabled because this feature requires that additional information is written to the search index. To enable this feature, you need to add a configuration parameter to the query-handler element in your jcr configuration file to enable it.

<param name="support-highlighting" value="true"/>

Additionally, there is a parameter that controls the format of the excerpt created. In JCR 1.9, the default is set to org.exoplatform.services.jcr.impl.core.query.lucene.DefaultHTMLExcerpt. The configuration parameter for this setting is:

<param name="excerptprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.DefaultXMLExcerpt"/>

1.31.4.1. DefaultXMLExcerpt

This excerpt provider creates an XML fragment of the following form:

<excerpt>
    <fragment>
        <highlight>exoplatform</highlight> implements both the mandatory
        XPath and optional SQL <highlight>query</highlight> syntax.
    </fragment>
    <fragment>
        Before parsing the XPath <highlight>query</highlight> in
        <highlight>exoplatform</highlight>, the statement is surrounded
    </fragment>
</excerpt>

1.31.4.2. DefaultHTMLExcerpt

This excerpt provider creates an HTML fragment of the following form:

<div>
    <span>
        <strong>exoplatform</strong> implements both the mandatory XPath
        and optional SQL <strong>query</strong> syntax.
    </span>
    <span>
        Before parsing the XPath <strong>query</strong> in
        <strong>exoplatform</strong>, the statement is surrounded
    </span>
</div>

1.31.4.3. How to use it

If you are using XPath, you must use the rep:excerpt() function in the last location step, just like you would select properties:

QueryManager qm = session.getWorkspace().getQueryManager();
Query q = qm.createQuery("//*[jcr:contains(., 'exoplatform')]/(@Title|rep:excerpt(.))", Query.XPATH);
QueryResult result = q.execute();
for (RowIterator it = result.getRows(); it.hasNext(); ) {
   Row r = it.nextRow();
   Value title = r.getValue("Title");
   Value excerpt = r.getValue("rep:excerpt(.)");
}

The above code searches for nodes that contain the word exoplatform and then gets the value of the Title property and an excerpt for each result node.

It is also possible to use a relative path in the call Row.getValue() while the query statement still remains the same. Also, you may use a relative path to a string property. The returned value will then be an excerpt based on string value of the property.

Both available excerpt provider will create fragments of about 150 characters and up to 3 fragments.

In SQL, the function is called excerpt() without the rep prefix, but the column in the RowIterator will nonetheless be labled rep:excerpt(.)!

QueryManager qm = session.getWorkspace().getQueryManager();
Query q = qm.createQuery("select excerpt(.) from nt:resource where contains(., 'exoplatform')", Query.SQL);
QueryResult result = q.execute();
for (RowIterator it = result.getRows(); it.hasNext(); ) {
   Row r = it.nextRow();
   Value excerpt = r.getValue("rep:excerpt(.)");
}

1.31.5. SpellChecker

The lucene based query handler implementation supports a pluggable spell checker mechanism. By default, spell checking is not available and you have to configure it first. See parameter spellCheckerClass on page Search Configuration. JCR currently provides an implementation class , which uses the lucene-spellchecker to contribute . The dictionary is derived from the fulltext indexed content of the workspace and updated periodically. You can configure the refresh interval by picking one of the available inner classes of org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker:

OneMinuteRefreshInterval
FiveMinutesRefreshInterval
ThirtyMinutesRefreshInterval
OneHourRefreshInterval
SixHoursRefreshInterval
TwelveHoursRefreshInterval
OneDayRefreshInterval

For example, if you want a refresh interval of six hours, the class name is: org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$SixHoursRefreshInterval. If you use org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker, the refresh interval will be one hour.

The spell checker dictionary is stored as a lucene index under "index-dir"/spellchecker. If it does not exist, a background thread will create it on startup. Similarly, the dictionary refresh is also done in a background thread to not block regular queries.

1.31.5.1. How do I use it?

You can spell check a fulltext statement either with an XPath or a SQL query:

// rep:spellcheck('explatform') will always evaluate to true
Query query = qm.createQuery("/jcr:root[rep:spellcheck('explatform')]/(rep:spellcheck())", Query.XPATH);
RowIterator rows = query.execute().getRows();
// the above query will always return the root node no matter what string we check
Row r = rows.nextRow();
// get the result of the spell checking
Value v = r.getValue("rep:spellcheck()");
if (v == null) {
   // no suggestion returned, the spelling is correct or the spell checker
   // does not know how to correct it.
} else {
   String suggestion = v.getString();
}

And the same using SQL:

// SPELLCHECK('exoplatform') will always evaluate to true
Query query = qm.createQuery("SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('explatform')", Query.SQL);
RowIterator rows = query.execute().getRows();
// the above query will always return the root node no matter what string we check
Row r = rows.nextRow();
// get the result of the spell checking
Value v = r.getValue("rep:spellcheck()");
if (v == null) {
   // no suggestion returned, the spelling is correct or the spell checker
   // does not know how to correct it.
} else {
   String suggestion = v.getString();
}

1.31.6. Similarity (Since 1.12)

Starting with version, 1.12 JCR allows you to search for nodes that are similar to an existing node.

Only terms with at least 4 characters are considered.
Only terms that occur at least 2 times in the source node are considered.
Only terms that occur in at least 5 nodes are considered.

<param name="support-highlighting" value="true"/>

The functions are called rep:similar() (in XPath) and similar() (in SQL) and have two arguments:

relativePath: a relative path to a descendant node or . for the current node. absoluteStringPath: a string literal that contains the path to the node for which to find similar nodes.

Warning

Relative path is not supported yet.

Examples:

//element(*, nt:resource)[rep:similar(., '/parentnode/node.txt/jcr:content')]

Finds nt:resource nodes, which are similar to node by path /parentnode/node.txt/jcr:content.

1.32. Fulltext Search And Affecting Settings

1.32.1. Property content indexing

Each property of a node (if it is indexable) is processed with Lucene analyzer and stored in Lucene index. That's called indexing of a property. After that we can perform a fulltext search among these indexed properties.

1.32.2. Lucene Analyzers

The sense of analyzers is to transform all strings stored in the index in a well-defined condition. The same analyzer(s) is/are used when searching in order to adapt the query string to the index reality.

Therefore, performing the same query using different analyzers can return different results.

Now, let's see how the same string is transformed by different analyzers.

Table 1.41. "The quick brown fox jumped over the lazy dogs"

Analyzer	Parsed
org.apache.lucene.analysis.WhitespaceAnalyzer	[The] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs]
org.apache.lucene.analysis.SimpleAnalyzer	[the] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs]
org.apache.lucene.analysis.StopAnalyzer	[quick] [brown] [fox] [jumped] [over] [lazy] [dogs]
org.apache.lucene.analysis.standard.StandardAnalyzer	[quick] [brown] [fox] [jumped] [over] [lazy] [dogs]
org.apache.lucene.analysis.snowball.SnowballAnalyzer	[quick] [brown] [fox] [jump] [over] [lazi] [dog]
org.apache.lucene.analysis.standard.StandardAnalyzer (configured without stop word - jcr default analyzer)	[the] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs]

Table 1.42. "XY&Z Corporation - xyz@example.com"

Analyzer	Parsed
org.apache.lucene.analysis.WhitespaceAnalyzer	[XY&Z] [Corporation] [-] [xyz@example.com]
org.apache.lucene.analysis.SimpleAnalyzer	[xy] [z] [corporation] [xyz] [example] [com]
org.apache.lucene.analysis.StopAnalyzer	[xy] [z] [corporation] [xyz] [example] [com]
org.apache.lucene.analysis.standard.StandardAnalyzer	[xy&z] [corporation] [xyz@example] [com]
org.apache.lucene.analysis.snowball.SnowballAnalyzer	[xy&z] [corpor] [xyz@exampl] [com]
org.apache.lucene.analysis.standard.StandardAnalyzer (configured without stop word - jcr default analyzer)	[xy&z] [corporation] [xyz@example] [com]

Note

StandardAnalyzer is the default analyzer in exo's jcr search engine. But we do not use stop words.

You can assign your analyzer as described in Search Configuration

1.32.3. How are different properties indexed?

Different properties are indexed in different ways, this affects to if it can be searched like fulltext by property or not.

Only two property types are indexed as fulltext searcheable: STRING and BINARY.

Table 1.43. Fulltext search by different properties

Property Type	Fulltext search by all properties	Fulltext search by exact property
STRING	YES	YES
BINARY	YES	NO

For example, ưe have property jcr:data (it' BINARY). It's stored well, but you will never find any string with query like:

SELECT * FROM nt:resource WHERE CONTAINS(jcr:data, 'some string')

Because, BINARY is not searchable by fulltext search on exact property.

But, next query will return result (off course if node has searched data):

SELECT * FROM nt:resource WHERE CONTAINS( * , 'some string')

1.32.4. Fulltext search query examples

1.32.5. Different analyzers in action

First of all, we will fill repository by nodes with mixin type 'mix:title' and different values of 'jcr:description' property.

root

Let's see analyzers effect closer. In first case, we use base jcr settings, so, as mentioned above, string "The quick brown fox jumped over the lazy dogs" will be transformed to set {[the] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs] }

// make SQL query
QueryManager queryManager = workspace.getQueryManager();
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:description, 'the')";
// create query
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();

NodeIterator will return "document1".

Now change the default analyzer to org.apache.lucene.analysis.StopAnalyzer. Fill repository again (new Analyzer must process nodes properties) and run the same query again. It will return nothing, because stop words like "the" will be excluded from parsed string set.

1.33. JCR API Extensions

eXo JCR implementation offers new extended feature beyond JCR specification. Sometimes one JCR Node has hundreds or even thousands of child nodes. This situation is highly not recommended for content repository data storage, but sometimes it occurs. JCR Team is pleased to announce new feature that will help to have a deal with huge child lists. They can be iterated in a "lazy" manner now giving improvement in term of performance and RAM usage.

1.33.1. API and usage

Lazy child nodes iteration feature is accessible via extended interface org.exoplatform.services.jcr.core.ExtendedNode, the inheritor of javax.jcr.Node. It provides a new single method shown below:

   /**
    * Returns a NodeIterator over all child Nodes of this Node. Does not include properties 
    * of this Node. If this node has no child nodes, then an empty iterator is returned.
    * 
    * @return A NodeIterator over all child Nodes of this <code>Node</code>.
    * @throws RepositoryException If an error occurs.
    */
   public NodeIterator getNodesLazily() throws RepositoryException;

From the view of end-user or client application, getNodesLazily() works similar to JCR specified getNodes() returning NodeIterator. "Lazy" iterator supports the same set of features as an ordinary NodeIterator, including skip() and excluding remove() features. "Lazy" implementation performs reading from DB by pages. Each time when it has no more elements stored in memory, it reads next set of items from persistent layer. This set is called "page". Must admit that getNodesLazily feature fully supports session and transaction changes log, so it's a functionally-full analogue of specified getNodes() operation. So when having a deal with huge list of child nodes, getNodes() can be simply and safely substituted with getNodesLazily().

JCR gives an experimental opportunity to replace all getNodes() invocations with getNodesLazily() calls. It handles a boolean system property named "org.exoplatform.jcr.forceUserGetNodesLazily" that internally replaces one call with another, without any code changes. But be sure using it only for development purposes. This feature can be used with top level products using eXo JCR to perform a quick compatibility and performance tests without changing any code. This is not recommended to be used as a production solution.

1.33.2. Configuration

In order to enable add the "-Dorg.exoplatform.jcr.forceUserGetNodesLazily=true" to the java system properties.

The "lazy" iterator reads the child nodes "page" after "page" into the memory. In this context, a "page" is a set of nodes that is read at once. The size of the page is by default 100 nodes and can be configured though workspace container configuration using "lazy-node-iterator-page-size" parameter. For example:

<container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
   <properties>
      <property name="source-name" value="jdbcjcr" />
      <property name="multi-db" value="true" />
      <property name="max-buffer-size" value="200k" />
      <property name="swap-directory" value="target/temp/swap/ws" />
      <property name="lazy-node-iterator-page-size" value="50" />
      ...
   </properties>

1.33.3. Implementation notices

Current "lazy" child nodes iterator supports caching, when pages are cached atomically in safe and optimized way. Cache is always kept in consistent state using invalidation if child list changed. Take in account the following difference in getNodes and getNodesLazily. Specification defined getNodes method reads whole list of nodes, so child items added after invocation will never be in results. GetNodesLazily doesn't acquire full list of nodes, so child items added after iterator creation can be found in result. So getNodesLazily can represent some kind of "real-time" results. But it is highly depend on numerous conditions and should not be used as a feature, it more likely implementation specific issue typical for "lazy-pattern".

1.34. WebDAV

The WebDAV protocol enables you to use the third party tools to communicate with hierarchical content servers via HTTP. It is possible to add and remove documents or a set of documents from a path on the server. DeltaV is an extension of the WebDav protocol that allows managing document versioning. Locking guarantees protection against multiple access when writing resources. The ordering support allows changing the position of the resource in the list and sort the directory to make the directory tree viewed conveniently. The full-text search makes it easy to find the necessary documents. You can search by using two languages: SQL and XPATH.

In eXo JCR, we plug in the WebDAV layer - based on the code taken from the extension modules of the reference implementation - on the top of our JCR implementation so that it is possible to browse a workspace using the third party tools (it can be Windows folders or Mac ones as well as a Java WebDAV client, such as DAVExplorer or IE using File->Open as a Web Folder).

Now WebDav is an extension of the REST service. To get the WebDav server ready, you must deploy the REST application. Then, you can access any workspaces of your repository by using the following URL:

Standalone mode:

http://host:port/rest/jcr/{RepositoryName}/{WorkspaceName}/{Path}

Portal mode:

http://host:port/portal/rest/private/jcr/{RepositoryName}/{WorkspaceName}/{Path}

When accessing the WebDAV server with the URLhttp://localhost:8080/rest/jcr/repository/production, you might also use "collaboration" (instead of "production") which is the default workspace in eXo products. You will be asked to enter your login and password. Those will then be checked by using the organization service that can be implemented thanks to an InMemory (dummy) module or a DB module or an LDAP one and the JCR user session will be created with the correct JCR Credentials.

Note

If you try the "in ECM" option, add "@ecm" to the user's password. Alternatively, you may modify jaas.conf by adding the domain=ecm option as follows:

exo-domain {
     org.exoplatform.services.security.jaas.BasicLoginModule required domain=ecm;
};

Related documents

Link Producer

1.34.1. Configuration

<component>
  <key>org.exoplatform.services.jcr.webdav.WebDavServiceImpl</key>
  <type>org.exoplatform.services.jcr.webdav.WebDavServiceImpl</type>
  <init-params>

    <!-- default node type which is used for the creation of collections -->
    <value-param>
      <name>def-folder-node-type</name>
      <value>nt:folder</value>
    </value-param>

    <!-- default node type which is used for the creation of files -->
    <value-param>
      <name>def-file-node-type</name>
      <value>nt:file</value>
    </value-param>

    <!-- if MimeTypeResolver can't find the required mime type, 
         which conforms with the file extension, and the mimeType header is absent
         in the HTTP request header, this parameter is used 
         as the default mime type-->
    <value-param>
      <name>def-file-mimetype</name>
      <value>application/octet-stream</value>
    </value-param>

    <!-- This parameter indicates one of the three cases when you update the content of the resource by PUT command.
         In case of "create-version", PUT command creates the new version of the resource if this resource exists.
         In case of "replace" - if the resource exists, PUT command updates the content of the resource and its last modification date.
         In case of "add", the PUT command tries to create the new resource with the same name (if the parent node allows same-name siblings).-->

    <value-param>
      <name>update-policy</name>
      <value>create-version</value>
      <!--value>replace</value -->
      <!-- value>add</value -->
    </value-param>

    <!--
        This parameter determines how service responds to a method that attempts to modify file content.
        In case of "checkout-checkin" value, when a modification request is applied to a checked-in version-controlled resource, the request is automatically preceded by a checkout and followed by a checkin operation.
        In case of "checkout" value, when a modification request is applied to a checked-in version-controlled resource, the request is automatically preceded by a checkout operation.
        In case of "checkin-checkout" value, when a modification request is applied, the request is automatically preceded by a checkin then a checkout operation.
    -->         
    <value-param>
      <name>auto-version</name>
      <value>checkout-checkin</value>
      <!--value>checkout</value -->
    </value-param>

    <!--
        This parameter is responsible for managing Cache-Control header value which will be returned to the client.
        You can use patterns like "text/*", "image/*" or wildcard to define the type of content.
    -->  
    <value-param>
      <name>cache-control</name>
      <value>text/xml,text/html:max-age=3600;image/png,image/jpg:max-age=1800;*/*:no-cache;</value>
    </value-param>
    
    <!--
        This parameter determines the absolute path to the folder icon file, which is shown
        during WebDAV view of the contents
    -->
    <value-param>
      <name>folder-icon-path</name>
      <value>/absolute/path/to/file</value>
    </value-param>

    <!--
        This parameter determines the absolute path to the file icon file, which is shown
        during WebDAV view of the contents
    -->
    <value-param>
      <name>file-icon-path</name>
      <value>/absolute/path/to/file</value>
    </value-param>

    <!-- 
        This parameter is responsible for untrusted user agents definition.
        Content-type headers of listed here user agents should be
        ignored and MimeTypeResolver should be explicitly used instead 
    -->
    <values-param>
      <name>untrusted-user-agents</name>
      <value>Microsoft Office Core Storage Infrastructure/1.0</value>
    </values-param>

    <--
        Allows to define which node type can be used to
        create files via WebDAV.
        Default value: nt:file
    -->
    <values-param>
      <name>allowed-file-node-types</name>
      <value>nt:file</value>
    </values-param>

    <--
        Allows to define which node type can be used to
        create folders via WebDAV.
        Default value: nt:folder
    -->
    <values-param>
      <name>allowed-folder-node-types</name>
      <value>nt:folder</value>
    </values-param>

  </init-params>
</component>

1.34.2. Screenshots

At present, eXo JCR WebDav server is tested by using MS Internet Explorer, Dav Explorer, Xythos Drive, Microsoft Office 2003 (as client), and Ubuntu Linux.

1.34.2.1. MS Internet Explorer

(File -> Open as Web Folder)

1.34.2.2. Dav Explorer

1.34.2.3. Xythos Drive

1.34.2.4. Microsoft Office 2003

(as client) (File->Open with typing http://... href in the file name box)

1.34.2.5. Ubuntu Linux

1.34.3. Comparison table of WebDav and JCR commands

Table 1.44.

WebDav	JCR
COPY	Workspace.copy(...)
DELETE	Node.remove()
GET	Node.getProperty(...); Property.getValue()
HEAD	Node.getProperty(...); Property.getLength()
MKCOL	Node.addNode(...)
MOVE	Session.move(...) or Workspace.move(...)
PROPFIND	Session.getNode(...); Node.getNode(...); Node.getNodes(...); Node.getProperties()
PROPPATCH	Node.setProperty(...); Node.getProperty(...).remove()
PUT	Node.addNode("node","nt:file"); Node.setProperty("jcr:data", "data")
CHECKIN	Node.checkin()
CHECKOUT	Node.checkout()
REPORT	Node.getVersionHistory(); VersionHistory.getAllVersions(); Version.getProperties()
UNCHECKOUT	Node.restore(...)
VERSION-CONTROL	Node.addMixin("mix:versionable")
LOCK	Node.lock(...)
UNLOCK	Node.unlock()
ORDERPATCH	Node.orderBefore(...)
SEARCH	Workspace.getQueryManager(); QueryManager.createQuery(); Query.execute()
ACL	Node.setPermission(...)

1.34.4. Restrictions

There are some restrictions for WebDAV in different Operating systems.

1.34.4.1. Windows 7

When you try to set up a web folder by “adding a network location” or “map a network drive” through My Computer, you can get an error message saying that either “The folder you entered does not appear to be valid. Please choose another” or “Windows cannot access… Check the spelling of the name. Otherwise, there might be…”. These errors may appear when you are using SSL or non-SSL.

To fix this, do as follows:

Go to Windows Registry Editor.
Find a key: \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlset\services\WebClient\Parameters\BasicAuthLevel .
Change the value to 2.

1.34.4.2. Microsoft Office 2010

If you have Microsoft Office 2010 applications or Microsoft Office 2007 applications installed on a client computer. From that client computer, you try to access an Office file that is stored on a web server that is configured for Basic authentication. The connection between your computer and the web server does not use Secure Sockets Layer (SSL). When you try to open or to download the file, you experience the following symptoms:

The Office file does not open or download.
You do not receive a Basic authentication password prompt when you try to open or to download the file.
You do not receive an error message when you try to open the file. The associated Office application starts. However, the selected file does not open.

To enable Basic authentication on the client computer, follow these steps:

Click Start, type regedit in the Start Search box, and then press Enter.
Locate and then click the following registry subkey:
HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Common\Internet
On the Edit menu, point to New, and then click DWORD Value.
Type BasicAuthLevel, and then press Enter.
Right-click BasicAuthLevel, and then click Modify.
In the Value data box, type 2, and then click OK.

1.35. FTP

The JCR-FTP Server represents the standard eXo service, operates as an FTP server with an access to a content stored in JCR repositories in the form of nt:file/nt:folder nodes or their successors. The client of an executed Server can be any FTP client. The FTP server is supported by a standard configuration which can be changed as required.

1.35.1. Configuration Parameters

1.35.1.1. command-port:

<value-param>
   <name>command-port</name>
   <value>21</value>
</value-param>

The value of the command channel port. The value '21' is default.

When you have already some FTP server installed in your system , this parameter needs to be changed (2121 for example) to avoid conflicts or if the port is protected.

1.35.1.2. data-min-port & data-max-port

<value-param>
   <name>data-min-port</name>
   <value>52000</value>
</value-param>

<value-param>
   <name>data-max-port</name>
   <value>53000</value>
</value-param>

These two parameters indicate the minimal and maximal values of the range of ports, used by the server. The usage of the additional data channel is required by the FTP - protocol, which is used to transfer the contents of files and the listing of catalogues. This range of ports should be free from listening by other server-programs.

1.35.1.3. system

<value-param>
   <name>system</name>

   <value>Windows_NT</value>
     or
   <value>UNIX Type: L8</value>
</value-param>

Types of formats of listing of catalogues which are supported.

1.35.1.4. client-side-encoding

<value-param>
   <name>client-side-encoding</name>
      
   <value>windows-1251</value>
     or
   <value>KOI8-R</value>
     
</value-param>

This parameter specifies the coding which is used for dialogue with the client.

1.35.1.5. def-folder-node-type

<value-param>
   <name>def-folder-node-type</name>
   <value>nt:folder</value>
</value-param>

This parameter specifies the type of a node, when an FTP-folder is created.

1.35.1.6. def-file-node-type

<value-param>
   <name>def-file-node-type</name>
   <value>nt:file</value>
</value-param>

This parameter specifies the type of a node, when an FTP - file is created.

1.35.1.7. def-file-mime-type

<value-param>
   <name>def-file-mime-type</name>                 
   <value>application/zip</value>
</value-param>

The mime type of a created file is chosen by using its file extention. In case, a server cannot find the corresponding mime type, this value is used.

1.35.1.8. cache-folder-name

<value-param>
   <name>cache-folder-name</name>
   <value>../temp/ftp_cache</value>
</value-param>

The Path of the cache folder.

1.35.1.9. upload-speed-limit

<value-param>
   <name>upload-speed-limit</name>           
   <value>20480</value>
</value-param>

Restriction of the upload speed. It is measured in bytes.

1.35.1.10. download-speed-limit

<value-param>
   <name>download-speed-limit</name>
   <value>20480</value>          
</value-param>

Restriction of the download speed. It is measured in bytes.

1.35.1.11. timeout

<value-param>
   <name>timeout</name>
   <value>60</value>
</value-param>

Defines the value of a timeout.

1.35.1.12. replace-forbidden-chars

<value-param>
   <name>replace-forbidden-chars</name>
   <value>true</value>
</value-param>

Indicates whether or not the forbidden characters must be replaced.

1.35.1.13. forbidden-chars

<value-param>
   <name>forbidden-chars</name>
   <value>:[]*'"|</value>
</value-param>

Defines the list of forbidden characters.

1.35.1.14. replace-char

<value-param>
   <name>replace-char</name>
   <value>_</value>
</value-param>

Defines the character that will be used to replace the forbidden characters.

1.36. eXo JCR Backup Service

Note

Restore of system workspace is not supported only as part of restoring of whole repository.

1.36.1. Concept

The main purpose of that feature is to restore data in case of system faults and repository crashes. Also, the backup results may be used as a content history.

The concept is based on the export of a workspace unit in the Full, or Full + Incrementals model. A repository workspace can be backup and restored using a combination of these modes. In all cases, at least one Full (initial) backup must be executed to mark a starting point of the backup history. An Incremental backup is not a complete image of the workspace. It contains only changes for some period. So it is not possible to perform an Incremental backup without an initial Full backup.

The Backup service may operate as a hot-backup process at runtime on an in-use workspace. It's a case when the Full + Incrementals model should be used to have a guaranty of data consistency during restoration. An Incremental will be run starting from the start point of the Full backup and will contain changes that have occured during the Full backup, too.

A restore operation is a mirror of a backup one. At least one Full backup should be restored to obtain a workspace corresponding to some points in time. On the other hand, Incrementals may be restored in the order of creation to reach a required state of a content. If the Incremental contains the same data as the Full backup (hot-backup), the changes will be applied again as if they were made in a normal way via API calls.

According to the model there are several modes for backup logic:

Full backup only : Single operation, runs once
Full + Incrementals : Start with an initial Full backup and then keep incrementals changes in one file. Run until it is stopped.
Full + Incrementals(periodic) : Start with an initial Full backup and then keep incrementals with periodic result file rotation. Run until it is stopped.

1.36.2. How it works

1.36.2.1. Implementation details

Full backup/restore is implemented using the JCR SysView Export/Import. Workspace data will be exported into Sysview XML data from root node.

Restoring is implemented, using the special eXo JCR API feature: a dynamic workspace creation. Restoring of the workspace Full backup will create one new workspace in the repository. Then, the SysView XML data will be imported as the root node.

Incremental backup is implemented using the eXo JCR ChangesLog API. This API allows to record each JCR API call as atomic entries in a changelog. Hence, the Incremental backup uses a listener that collects these logs and stores them in a file.

Restoring an incremental backup consists in applying the collected set of ChangesLogs to a workspace in the correct order.

Note

Incremental backup is an experimental feture and not supported, so it must be used with a lot of caution.

1.36.2.2. Work basics

The work of Backup is based on the BackupConfig configuration and the BackupChain logical unit.

BackupConfig describes the backup operation chain that will be performed by the service. When you intend to work with it, the configuration should be prepared before the backup is started.

The configuration contains such values as:

Types of full and incremental backup (fullBackupType, incrementalBackupType): Strings with full names of classes which will cover the type functional.
Incremental period: A period after that a current backup will be stopped and a new one will be started in seconds (long).
Target repository and workspace names: Strings with described names
Destination directory for result files: String with a path to a folder where operation result files will be stored.

BackupChain is a unit performing the backup process and it covers the principle of initial Full backup execution and manages Incrementals operations. BackupChain is used as a key object for accessing current backups during runtime via BackupManager. Each BackupJob performs a single atomic operation - a Full or Incremental process. The result of that operation is data for a Restore. BackupChain can contain one or more BackupJobs. But at least the initial Full job is always there. Each BackupJobs has its own unique number which means its Job order in the chain, the initial Full job always has the number 0.

Backup process, result data and file location

To start the backup process, it's necessary to create the BackupConfig and call the BackupManager.startBackup(BackupConfig) method. This method will return BackupChain created according to the configuration. At the same time, the chain creates a BackupChainLog which persists BackupConfig content and BackupChain operation states to the file in the service working directory (see Configuration).

When the chain starts the work and the initial BackupJob starts, the job will create a result data file using the destination directory path from BackupConfig. The destination directory will contain a directory with an automatically created name using the pattern repository_workspace-timestamp where timestamp is current time in the format of yyyyMMdd_hhmmss (E.g. db1_ws1-20080306_055404). The directory will contain the results of all Jobs configured for execution. Each Job stores the backup result in its own file with the name repository_workspace-timestamp.jobNumber. BackupChain saves each state (STARTING, WAITING, WORKING, FINISHED) of its Jobs in the BackupChainLog, which has a current result full file path.

BackupChain log file and job result files are a whole and consistent unit, that is a source for a Restore.

Note

BackupChain log contains absolute paths to job result files. Don't move these files to another location.

Restore requirements

As mentioned before a Restore operation is a mirror of a Backup. The process is a Full restore of a root node with restoring an additional Incremental backup to reach a desired workspace state. Restoring of the workspace Full backup will create a new workspace in the repository using given RepositoyEntry of existing repository and given (preconfigured) WorkspaceEntry for a new target workspace. A Restore process will restore a root node from the SysView XML data.

Note

The target workspace should not be in the repository. Otherwise, a BackupConfigurationException exception will be thrown.

Finally, we may say that Restore is a process of a new Workspace creation and filling it with a Backup content. In case you already have a target Workspace (with the same name) in a Repository, you have to configure a new name for it. If no target workspace exists in the Repositor, you may use the same name as the Backup one.

1.36.3. Configuration

As an optional extension, the Backup service is not enabled by default. You need to enable it via configuration.

The following is an example configuration :

<component>
  <key>org.exoplatform.services.jcr.ext.backup.BackupManager</key>
  <type>org.exoplatform.services.jcr.ext.backup.impl.BackupManagerImpl</type>
  <init-params>
    <properties-param>
      <name>backup-properties</name>
      <property name="backup-dir" value="target/backup" />
    </properties-param>
  </init-params>
</component>

Where mandatory paramet is:

backup-dir : The path to a working directory where the service will store internal files and chain logs.

Also, there are optional parameters:

incremental-backup-type : The FQN of incremental job class. Must implement org.exoplatform.services.jcr.ext.backup.BackupJob. By default : org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob used.
default-incremental-job-period : The period between incremetal flushes (in seconds). Default is 3600 seconds.
full-backup-type : The FQN of the full backup job class; Must implement org.exoplatform.services.jcr.ext.backup.BackupJob. By default : org.exoplatform.services.jcr.ext.backup.impl.rdbms.FullBackupJob used. Please, notice that file-system based implementation org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob is deprecated and not recommended for use.

Note

The number of rows that should be fetched from the database during backup operation, can be changed thanks to the System property exo.jcr.component.ext.FullBackupJob.fetch-size. The default value of this parameter is 1000.

1.36.4. RDBMS backup

RDBMS backup It is the lastest, currently supportedm used by default and recommended implementation of full backup job for BackupManager service. It is useful in case when database is used to store data.

Brings such advantages:

fast: backup takes only several minutes to perform full backup of repository with 1 million rows in tables;
atomic restore: restore process into existing workspace/repository with same configuration is atomic, it means you don’t loose the data when restore failed, the original data remains;
cluster aware: it is possible to make backup/restore in cluster environment into existing workspace/repository with same configuration;
consistence backup: all threads make waiting until backup is finished and then continue to work, so, there are no data modification during backup process;

1.36.5. Usage

1.36.5.1. Performing a Backup

In the following example, we create a BackupConfig bean for the Full + Incrementals mode, then we ask the BackupManager to start the backup process.

// Obtaining the backup service from the eXo container.
BackupManager backup = (BackupManager) container.getComponentInstanceOfType(BackupManager.class);

// And prepare the BackupConfig instance with custom parameters. 
// full backup & incremental
File backDir = new File("/backup/ws1"); // the destination path for result files
backDir.mkdirs();

BackupConfig config = new BackupConfig();
config.setRepository(repository.getName());
config.setWorkspace("ws1");
config.setBackupDir(backDir);

// Before 1.9.3, you also need to indicate the backupjobs class FDNs
// config.setFullBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob");
// config.setIncrementalBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob");

// start backup using the service manager
BackupChain chain = backup.startBackup(config);

To stop the backup operation, you have to use the BackupChain instance.

// stop backup
backup.stopBackup(chain);

1.36.5.2. Performing a Restore

Restoration involves reloading the backup file into a BackupChainLog and applying appropriate workspace initialization. The following snippet shows the typical sequence for restoring a workspace :

// find BackupChain using the repository and workspace names (return null if not found)
BackupChain chain = backup.findBackup("db1", "ws1");

// Get the RepositoryEntry and WorkspaceEntry
ManageableRepository repo = repositoryService.getRepository(repository);
RepositoryEntry repoconf = repo.getConfiguration();
List<WorkspaceEntry> entries = repoconf.getWorkspaceEntries();
WorkspaceEntry = getNewEntry(entries, workspace); // create a copy entry from an existing one

// restore backup log using ready RepositoryEntry and WorkspaceEntry
File backLog = new File(chain.getLogFilePath());
BackupChainLog bchLog = new BackupChainLog(backLog);

// initialize the workspace
repository.configWorkspace(workspaceEntry);

// run restoration
backup.restore(bchLog, repositoryEntry, workspaceEntry);

1.36.5.2.1. Restoring into an existing workspace

Note

These instructions only applies to regular workspace. Special instructions are provided for System workspace below.

To restore a backup over an existing workspace, you are required to clear its data. Your backup process should follow these steps:

Remove workspace

ManageableRepository repo = repositoryService.getRepository(repository);
repo.removeWorkspace(workspace);

Clean database, value storage, index
Restore (see snippet above)

1.36.5.2.2. System workspace

Note

The BackupWorkspaceInitializer is available in JCR 1.9 and later.

Restoring the JCR System workspace requires to shutdown the system and use of a special initializer.

Follow these steps (this will also work for normal workspaces):

Stop repository (or portal)
Clean database, value storage, index;
In configuration, the workspace set BackupWorkspaceInitializer to refer to your backup.

Repository and Workspace initialization from backup can use the BackupWorkspaceInitializer.

<workspaces>
  <workspace name="production" ... >
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
      ...
    </container>
    <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
      <properties>
         <property name="restore-path" value="D:\java\exo-working\backup\repository_production-20090527_030434"/>
      </properties>
   </initializer>
    ...
</workspace>

Start repository (or portal).

1.36.5.3. Repository and Workspace initialization from backup

Will be configured BackupWorkspaceInitializer in configuration of workspace to restore the Workspace from backup over initializer.

Will be configured BackupWorkspaceInitializer in all configurations workspaces of the Repository to restore the Repository from backup over initializer.

Restoring the repository or workspace requires to shutdown the repository.

Follow these steps:

Stop repository (will be skipped this step if repository or workace is not exists)
Clean database, value storage, index; (will be skipped this step if repository or worksace is new)
In configuration, the workspace/-s set BackupWorkspaceInitializer to refer to your backup.
Start repository

Example of configuration initializer to restore workspace "backup" over BackupWorkspaceInitializer:

<workspaces>
  <workspace name="backup" ... >
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
      ...
    </container>
    <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
      <properties>
         <property name="restore-path" value="D:\java\exo-working\backup\repository_backup-20110120_044734"/>
      </properties>
   </initializer>
    ...
</workspace>

1.36.5.3.1. Restore the Workspace over BackupWorksaceInitializer

Example of configuration initializer to resore the workspace "backup" over BackupWorkspaceInitializer:

Stop repository (will be skipped this step if workspace is not exists)
Clean database, value storage, index; (will be skipped this step if workspace is new)

In configuration, the workspace/-s set BackupWorkspaceInitializer to refer to your backup.

<workspaces>
  <workspace name="backup" ... >
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
      ...
    </container>
    <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
      <properties>
         <property name="restore-path" value="D:\java\exo-working\backup\repository_backup-20110120_044734"/>
      </properties>
   </initializer>
    ...
</workspace>

Start repository

1.36.5.3.2. Restore the Repository over BackupWorksaceInitializer

Example of configuration initializers to restore the repository "repository" over BackupWorkspaceInitializer:

Stop repository (will be skipped this step if repository is not exists)
Clean database, value storage, index; (will be skipped this step if repository is new)

In configuration of repository will be configured initializers of workspace to refer to your backup.

The resore of existing workspace or repositry is available.

...
<workspaces>
 <workspace name="system" ... >
  <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
  ...
  </container>
  <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
   <properties>
    <property name="restore-path" value="D:\java\exo-working\backup\repository_system-20110120_052334"/>
   </properties>
  </initializer>
  ...
 </workspace>

 <workspace name="collaboration" ... >
   <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
   ...
  </container>
  <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
   <properties>
    <property name="restore-path" value="D:\java\exo-working\backup\repository_collaboration-20110120_052341"/>
   </properties>
  </initializer>
  ...
 </workspace>

 <workspace name="backup" ... >
  <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
  ...
  </container>

  <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
   <properties>
    <property name="restore-path" value="D:\java\exo-working\backup\repository_backup-20110120_052417"/>
   </properties>
  </initializer>
  ...
  </workspace>
</workspaces>

Start repository.

1.36.6. Restore existing workspace or repository

For restore will be used spacial methods:

 /**
    * Restore existing workspace. Previous data will be deleted.
    * For getting status of workspace restore can use 
    * BackupManager.getLastRestore(String repositoryName, String workspaceName) method 
    * 
    * @param workspaceBackupIdentifier
    *          backup identifier
    * @param workspaceEntry
    *          new workspace configuration
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingWorkspace(String workspaceBackupIdentifier, String repositoryName, WorkspaceEntry workspaceEntry,
      boolean asynchronous) throws BackupOperationException, BackupConfigurationException;

   /**
    * Restore existing workspace. Previous data will be deleted.
    * For getting status of workspace restore use can use 
    * BackupManager.getLastRestore(String repositoryName, String workspaceName) method 
    * 
    * @param log
    *          workspace backup log
    * @param workspaceEntry
    *          new workspace configuration
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingWorkspace(BackupChainLog log, String repositoryName, WorkspaceEntry workspaceEntry, boolean asynchronous)  throws BackupOperationException, BackupConfigurationException;

   /**
    * Restore existing repository. Previous data will be deleted.
    * For getting status of repository restore can use 
    * BackupManager.getLastRestore(String repositoryName) method 
    * 
    * @param repositoryBackupIdentifier
    *          backup identifier
    * @param repositoryEntry
    *          new repository configuration
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingRepository(String  repositoryBackupIdentifier, RepositoryEntry repositoryEntry, boolean asynchronous)  throws BackupOperationException, BackupConfigurationException;

   /**
    * Restore existing repository. Previous data will be deleted.
    * For getting status of repository restore can use 
    * BackupManager.getLastRestore(String repositoryName) method 
    * 
    * @param log
    *          repository backup log
    * @param repositoryEntry
    *          new repository configuration
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingRepository(RepositoryBackupChainLog log, RepositoryEntry repositoryEntry, boolean asynchronous)
      throws BackupOperationException, BackupConfigurationException;

These methods for restore will do:

remove existed workspace or repository;
clean database;
clean index data;
clean value storage;
restore from backup.

1.36.7. Restore a workspace or a repository using original configuration

The Backup manager allows you to restore a repository or a workspace using the original configuration stored into the backup log:

/**
    * Restore existing workspace. Previous data will be deleted.
    * For getting status of workspace restore can use 
    * BackupManager.getLastRestore(String repositoryName, String workspaceName) method
    * WorkspaceEntry for restore should be contains in BackupChainLog. 
    * 
    * @param workspaceBackupIdentifier
    *          identifier to workspace backup. 
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread) 
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred 
    */
   void restoreExistingWorkspace(String workspaceBackupIdentifier, boolean asynchronous)
            throws BackupOperationException,
            BackupConfigurationException;

   /**
    * Restore existing repository. Previous data will be deleted.
    * For getting status of repository restore can use 
    * BackupManager.getLastRestore(String repositoryName) method.
    * ReprositoryEntry for restore should be contains in BackupChainLog. 
    * 
    * @param repositoryBackupIdentifier
    *          identifier to repository backup.   
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingRepository(String repositoryBackupIdentifier, boolean asynchronous)
            throws BackupOperationException,
            BackupConfigurationException;

   /**
    * WorkspaceEntry for restore should be contains in BackupChainLog. 
    * 
    * @param workspaceBackupIdentifier
    *          identifier to workspace backup. 
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread) 
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred 
    */
   void restoreWorkspace(String workspaceBackupIdentifier, boolean asynchronous) throws BackupOperationException,
            BackupConfigurationException;

   /**
    * ReprositoryEntry for restore should be contains in BackupChainLog. 
    * 
    * @param repositoryBackupIdentifier
    *          identifier to repository backup.   
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreRepository(String repositoryBackupIdentifier, boolean asynchronous) throws BackupOperationException,
            BackupConfigurationException;

    /**
    * Restore existing workspace. Previous data will be deleted.
    * For getting status of workspace restore can use 
    * BackupManager.getLastRestore(String repositoryName, String workspaceName) method
    * WorkspaceEntry for restore should be contains in BackupChainLog. 
    * 
    * @param workspaceBackupSetDir
    *          the directory with backup set  
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread) 
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred 
    */
   void restoreExistingWorkspace(File workspaceBackupSetDir, boolean asynchronous)
            throws BackupOperationException, BackupConfigurationException;

   /**
    * Restore existing repository. Previous data will be deleted.
    * For getting status of repository restore can use 
    * BackupManager.getLastRestore(String repositoryName) method.
    * ReprositoryEntry for restore should be contains in BackupChainLog. 
    * 
    * @param repositoryBackupSetDir
    *          the directory with backup set     
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreExistingRepository(File repositoryBackupSetDir, boolean asynchronous)
            throws BackupOperationException, BackupConfigurationException;

   /**
    * WorkspaceEntry for restore should be contains in BackupChainLog. 
    * 
    * @param workspaceBackupSetDir
    *          the directory with backup set 
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread) 
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred 
    */
   void restoreWorkspace(File workspaceBackupSetDir, boolean asynchronous) throws BackupOperationException,
            BackupConfigurationException;

   /**
    * ReprositoryEntry for restore should be contains in BackupChainLog. 
    * 
    * @param repositoryBackupSetDir
    *          the directory with backup set   
    * @param asynchronous
    *          if 'true' restore will be in asynchronous mode (i.e. in separated thread)
    * @throws BackupOperationException
    *           if backup operation exception occurred 
    * @throws BackupConfigurationException
    *           if configuration exception occurred
    */
   void restoreRepository(File repositoryBackupSetDir, boolean asynchronous) throws BackupOperationException,
            BackupConfigurationException;

1.36.8. Backup set portability

The Backup log is stored during the Backup operation into two different locations: backup-dir directory of BackupService to support interactive operations via Backup API (e.g. console) and backup set files for portability (e.g. on another server).

1.36.9. DB type migration

You can use backup/restore mechanism to migrate between different DB types configuration. Currently three DB types supported (single, multi, isolated) and you can migrate between each of them.

To accomplish migration you simply need to set desired DB type in the repository configuration file of backup set. It is highly recommended to make backup at the DB level before starting the migration process.

Note

After migration process, due to different DB structures, there can remain some unnecessary DB tables, which can be removed safetly.

Before starting migrating the data of your JCR from single/multi data format to isolated data format, you need to have the backupconsole.

See the Building application section for more details.

Or you can download it from ow2 directly.

1.36.9.1. Migrate JCR from single data format to isolated data format

Enable the Backup service

See the Configuration Backup service section for details.

Create a full backup

For example:

             jcrbackup.cmd http://root:exo@localhost:8080/rest start /repository

Return

             Successful :
             status code = 200

Get the backup id

You need get the backup id used in restore action.

For example:

             jcrbackup http://root:exo@localhost:8080 list completed

Return

             The completed (ready to restore) backups information :
             1) Repository backup with id 5dcbc851c0a801c9545eb434947dbe87 :
             repository name           : repository
             backup type               : full only
             started time              : lun., 21 janv. 2013 16:48:21 GMT+01:00
             finished time             : lun., 21 janv. 2013 16:48:25 GMT+01:00

The backup id: 5dcbc851c0a801c9545eb434947dbe87

See the Backup Client Usage section for more details.

Set desired DB type in the repository configuration file of backup

Change db-structure-type to isolated.

For example: In original-repository-config :

              exo-tomcat\temp\backup\repository_repository_backup_1358783301705\original-repository-config

replace

             <property name="db-structure-type" value="single"/>

             <property name="db-structure-type" value="isolated"/>

This change must be done for all workspaces.

Activate the persister config

Before starting the restore operation, ensure that the persister is configured to save the changes of the repository configuration.

If it's not activated, it should be configured, See the JCR Configuration persister section for more details.

Restore repository with original configuation and remove exists

For example:

             jcrbackup.cmd http://root:exo@localhost:8080/rest restore remove-exists 5dcbc851c0a801c9545eb434947dbe87

Return

             Successful :
             status code = 200

Drop the old tables with the old data format

             drop table JCR_SREF;
             drop table JCR_SVALUE;
             drop table JCR_SITEM;

1.36.9.2. Migrate JCR from multi data format to isolated data format

Enable the Backup service

See the Configuration Backup service section for details.

Create a full backup

For example:

              jcrbackup.cmd http://root:exo@localhost:8080/rest start /repository

Return

              Successful :
              status code = 200

Get the backup id

You need get the backup id to launch the restore action.

For example:

              jcrbackup http://root:exo@localhost:8080 list completed

Return

              The completed (ready to restore) backups information :
              1) Repository backup with id 5dcbc851c0a801c9545eb434947dbe87 :
              repository name           : repository
              backup type               : full only
              started time              : lun., 21 janv. 2013 16:48:21 GMT+01:00
              finished time             : lun., 21 janv. 2013 16:48:25 GMT+01:00

The backup id: 5dcbc851c0a801c9545eb434947dbe87

See the Backup Client Usage section for more details.

Set desired DB type in the repository configuration file of backup

Change db-structure-type to isolated.

For example: In original-repository-config :

              exo-tomcat\temp\backup\repository_repository_backup_1358783301705\original-repository-config

replace

              <property name="db-structure-type" value="multi"/>

              <property name="db-structure-type" value="isolated"/>

This change must be done for all workspaces.

Configure the datasource name used for the isolated mode

Make sure that in your repository configuration all the workspaces of a same repository share the same datasource.

Activate the persister config

Before starting the restore operation, ensure that the persister is configured to save the changes of the repository configuration.

If it's not activated, it should be configured, See the JCR Configuration persister section for more details.

Restore repository with original configuation and remove exists

For example:

              jcrbackup.cmd http://root:exo@localhost:8080/rest restore remove-exists 5dcbc851c0a801c9545eb434947dbe87

Return

              Successful :
              status code = 200

Drop the old tables with the old data format

              drop table JCR_MREF;
              drop table JCR_MVALUE;
              drop table JCR_MITEM;

1.37. HTTPBackupAgent and backup client

Warning

For this service, you should configure the org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister in order to save the changes of the repository configuration. See the eXo JCR Configuration article at the 'Portal and Standalone configuration' section.

GateIn uses context /portal/rest, therefore you need to use http://host:port/portal/rest/ instread of http://host:port/rest/

GateIn uses form authentication, so first you need to login (url to form authentication is http://host:port/portal/login) and then perform requests.

The service org.exoplatform.services.jcr.ext.backup.server.HTTPBackupAgent is REST-based front-end to service org.exoplatform.services.jcr.ext.backup.BackupManager. HTTPBackupAgent is representation BackupManager to creation backup, restore, getting status of current or completed backup/restore, etc.

The backup client is http client for HTTPBackupAgent.

1.37.1. HTTPBackupAgent

The HTTPBackupAgent is based on REST (see details about the REST Framework).

HTTPBackupAgent is using POST and GET methods for request.

The HTTPBackupAgent allows :

Start backup
Stop backup
Restore from backup
Delete the workspace
Get information about backup service (BackupManager)
Get information about current backup / restores / completed backups

1.37.1.1. HTTPBackupAgent methods

1.37.1.1.1. Starting Backup Service

/rest/jcr-backup/start/{repo}/{ws}

Start backup on specific workspace

URL: http://host:port/rest/jcr-backup/start/{repo}/{ws}

Formats: json.

Method: POST

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.BackupConfigBean :

{repo} - the repository name;
{ws} - the workspace name;
BackupConfigBean - the JSON to BackupConfigBean.

The BackupConfigBean:

header :
"Content-Type" = "application/json; charset=UTF-8"

body:
<JSON to BackupConfigBean>

{"incrementalRepetitionNumber":<Integer>,"incrementalBackupJobConfig":<JSON to BackupJobConfig>,
"backupType":<Integer>,"fullBackupJobConfig":<JSON to BackupJobConfig>,
"incrementalJobPeriod":<Long>,"backupDir":"<String>"}

Where :

backupType                  - the type of backup:
                                  0 - full backup only;
                                  1 - full and incremental backup.
backupDir                   - the path to backup folder;
incrementalJobPeriod        - the incremental job period;
incrementalRepetitionNumber - the incremental repetition number;
fullBackupJobConfig         - the configuration to full backup, JSON to BackupJobConfig;
incrementalJobPeriod        - the configuration to incremental backup, JSON to BackupJobConfig.

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.BackupJobConfig :

{"parameters":[<JSON to Pair>, ..., <JSON to pair> ],"backupJob":"<String>"}

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.Pair :

backupJob  - the FQN (fully qualified name) to BackupJob class;
parameters - the list of JSON of Pair.

{"name":"<String>","value":"<String>"}

name  - the name of parameter;
value - the value of parameter.

Return when being successful
```
status code = 200
```

/rest/jcr-backup/stop/{id}

status code = 404            - the not found repositry '{repo}' or workspace '{ws}'
status code = 500            - the other unknown errors
failure message in response  - the description of failure

1.37.1.1.2. Stopping Backup Service

Stop backup with identifier {id}.

URL: http://host:port/rest/jcr-backup/stop/{id}

Formats: plain text

{id} - the identifier of backup

Return when being successful
```
status code = 200
```

/rest/jcr-backup/info

status code = 404            - the no active backup with identifier {id}
status code = 500            - the other unknown errors
failure message in response  - the description of failure

1.37.1.1.3. Backup Info Service

Information about the backup service.

URL: http://host:port/rest/jcr-backup/info

Return the JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.BackupServiceInfoBean :

{"backupLogDir":"<String>","defaultIncrementalJobPeriod":<Long>,"fullBackupType":"<String>","incrementalBackupType":"<String>"}

fullBackupType              - the FQN (fully qualified name) of BackupJob class for full backup type;
incrementalBackupType       - the FQN (fully qualified name) of BackupJob class for incremental backup type;
backupLogDir                - path to backup folder;
defaultIncrementalJobPeriod - the default incremental job period.

/rest/jcr-backup/drop-workspace/{repo}/{ws}/{force-session-close}

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.4. Dropping Workspace Service

Delete the workspace from repository /{repo}/{ws}. With this service, you can delete any workspaces regardless of whether the workspace is a backup or has been copied to a backup.

URL: http://host:port/rest/jcr-backup/drop-workspace/{repo}/{ws}/{force-session-close}

Formats: plain text

{repo} - the repository name;
{ws} - the workspace name;
{force-session-close} - the boolean value : true - the open sessions on workspace will be closed; false - will not close open sessions.

Return when being successful.
```
status code = 200
```

/rest/jcr-backup/info/backup

status code = 500            - the other unknown errors;
                             - not found repositry '{repo}' or workspace '{ws}'
failure message in response  - the description of failure

1.37.1.1.5. Backup Info

Information about the current and completed backups

URL: http://host:port/rest/jcr-backup/info/backup

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfoList :

{"backups":[<JSON to ShortInfo>,<JSON to ShortInfo>,...,<JSON to ShortInfo>]}

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfo :

{"startedTime":"<String>","backupId":"<String>","type":<Integer>,"state":<Integer>,"backupType":<Integer>,
"workspaceName":"<String>","finishedTime":"<String>","repositoryName":"<String>"}

type           - the type of ShortInfo :
                   0 - the ShorInfo to completed backup;
                  -1 - the ShorInfo to current (active) backup.
                   1 - the ShorInfo to current restore.
backupType     - the type of backup:
                   0 - full backup only;
                   1 - full and incremental backup.
backupId       - the identifier of backup;
workspaceName  - the name of workspace;
repositoryName - the name of repository.
startedTime    - the date of started backup. The date in format RFC 1123 (for examle "Thu, 16 Apr 2009 14:56:49 EEST").

The ShorInfo to current (active) backup :
  finishedTime - no applicable, always an empty string ("");
  state        - the state of full backup :
                   0 - starting;
                   1 - waiting;
                   2 - working;
                   4 - finished.

The ShorInfo to completed backup :
  finishedTime - the date of finished backup. The date in format RFC 1123;
  state        - no applicable, always zero (0).    

The ShorInfo to current restore :
  finishedTime - the date of finished backup. The date in format RFC 1123;
  state        - the state of restore :
                   1 - started;
                   2 - successful;
                   3 - failure;
                   4 - initialized.

/rest/jcr-backup/info/backup/current Information about the current backups

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.6. Current Backups Information

URL: http://host:port/rest/jcr-backup/info/backup/current

Return when being successful
The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfoList (see item /rest/jcr-backup/info/backup)

/rest/jcr-backup/info/backup/completed Information about the completed backups.

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.7. Completed Backups Information

URL: http://host:port/rest/jcr-backup/info/backup/completed

Return when being successful
The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfoList (see item /rest/jcr-backup/info/backup)

/rest/jcr-backup/info/backup/{repo}/{ws} Information about the current and completed backups for specific workspace.

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.8. Workspace-specific Backup Information

URL: http://host:port/rest/jcr-backup/info/backup/{repo}/{ws}

Formats: json

{repo} - the repository name
{ws} - the workspace name

Return when being successful
The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfoList (see item /rest/jcr-backup/info/backup)

/rest/jcr-backup/info/backup/{id} Detailed information about a current or completed backup with identifier '{id}'.

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.9. Single Backup Information

URL: http://host:port/rest/jcr-backup/info/backup/{id}

Formats: json

{id} - the identifier of backup

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.DetailedInfo :

{"backupConfig":<JSON to BackupConfigBean>,"startedTime":"<String>","backupId":"<String>","type":<Integer>,
"state":<Integer>,"backupType":<Integer>,"workspaceName":"<String>","finishedTime":"<String>",
"repositoryName":"<String>"}

The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.BackupConfigBean (see item /rest/jcr-backup/start/{repo}/{ws}).

type           - the type of DetailedInfo :
                   0 - the DetailedInfo to completed backup;
                  -1 - the DetailedInfo to current (active) backup;
                   1 - the DetailedInfo to restore.
backupType     - the type of backup:
                   0 - full backup only;
                   1 - full and incremental backup.
backupId       - the identifier of backup;
workspaceName  - the name of workspace;
repositoryName - the name of repository;
backupConfig   - the JSON to BackupConfigBean.

The DetailedInfo to current (active) backup :
  startedTime  - the date of started backup. The date in format RFC 1123 (for examle "Thu, 16 Apr 2009 14:56:49 EEST");
  finishedTime - no applicable, always an empty string ("");
  state        - the state of full backup :
                   0 - starting;
                   1 - waiting;
                   2 - working;
                   4 - finished.
The DetailedInfo to completed backup :
  startedTime  - the date of started backup. The date in format RFC 1123 (for examle "Thu, 16 Apr 2009 14:56:49 EEST");
  finishedTime - the date of finished backup. The date in format RFC 1123;
  state        - no applicable, always zero (0).

The DetailedInfo to restore :
  startedTime  - the date of started restore. The date in format RFC 1123 (for examle "Thu, 16 Apr 2009 14:56:49 EEST");
  finishedTime - the date of finished restore;
  state        - the state of restore :
                   1 - started;
                   2 - successful;
                   3 - failure;
                   4 - initialized.

/rest/jcr-backup/info/restore/{repo}/{ws} The information about the last restore on a specific workspace /{repo}/{ws}.

status code = 404            - not found the backup with {id}
status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.10. Restores on a Workspace Information

URL: http://host:port/rest/jcr-backup/info/restore/{repo}/{ws}

Formats: json

{repo} - the repository name
{ws} - the workspace name

Return when being successful
The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.DetailedInfo (see item /rest/jcr-backup/info/backup/{id})

/rest/jcr-backup/info/restores

status code = 404            - the not found the restore for workspace /{repo}/{ws}
status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.11. Restores Information

The information about the last restores.

URL: http://host:port/rest/jcr-backup/info/restores

Return when being successful
The JSON bean of org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfoList (see item /rest/jcr-backup/info/backup)

/rest/jcr-backup/restore/{repo}/{id}

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.1.12. Restoring Service

Restore the workspace from specific backup.

URL: http://host:port/rest/jcr-backup/restore/{repo}/{id}

Formats: json.

Method: POST

The example of JSON bean to org.exoplatform.services.jcr.config.WorkspaceEntry :

{repo} - the repository name;
{id} - the identifier to backup; * WorkspaceEntry - the JSON to WorkspaceEntry.

The RestoreBean:

header :
"Content-Type" = "application/json; charset=UTF-8"

body:
<JSON to WorkspaceEntry>

{ "accessManager" : null,
  "autoInitPermissions" : null,
  "autoInitializedRootNt" : null,
  "cache" : { "parameters" : [ { "name" : "max-size",
            "value" : "10k"
          },
          { "name" : "live-time",
            "value" : "1h"
          }
        ],
      "type" : "org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl"
    },
  "container" : { "parameters" : [ { "name" : "source-name",
            "value" : "jdbcjcr"
          },
          { "name" : "dialect",
            "value" : "hsqldb"
          },
          { "name" : "multi-db",
            "value" : "false"
          },
          { "name" : "max-buffer-size",
            "value" : "200k"
          },
          { "name" : "swap-directory",
            "value" : "../temp/swap/production"
          }
        ],
      "type" : "org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer",
      "valueStorages" : [ { "filters" : [ { "ancestorPath" : null,
                  "minValueSize" : 0,
                  "propertyName" : null,
                  "propertyType" : "Binary"
                } ],
            "id" : "system",
            "parameters" : [ { "name" : "path",
                  "value" : "../temp/values/production"
                } ],
            "type" : "org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage"
          } ]
    },
  "initializer" : { "parameters" : [ { "name" : "root-nodetype",
            "value" : "nt:unstructured"
          } ],
      "type" : "org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer"
    },
  "lockManager" : 
      "timeout" : 15728640
    },
  "name" : "production",
  "queryHandler" : { "analyzer" : {  },
      "autoRepair" : true,
      "bufferSize" : 10,
      "cacheSize" : 1000,
      "documentOrder" : true,
      "errorLogSize" : 50,
      "excerptProviderClass" : "org.exoplatform.services.jcr.impl.core.query.lucene.DefaultHTMLExcerpt",
      "excludedNodeIdentifers" : null,
      "extractorBackLogSize" : 100,
      "extractorPoolSize" : 0,
      "extractorTimeout" : 100,
      "indexDir" : "../temp/jcrlucenedb/production",
      "indexingConfigurationClass" : "org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfigurationImpl",
      "indexingConfigurationPath" : null,
      "maxFieldLength" : 10000,
      "maxMergeDocs" : 2147483647,
      "mergeFactor" : 10,
      "minMergeDocs" : 100,
      "parameters" : [ { "name" : "index-dir",
            "value" : "../temp/jcrlucenedb/production"
          } ],
      "queryClass" : "org.exoplatform.services.jcr.impl.core.query.QueryImpl",
      "queryHandler" : null,
      "resultFetchSize" : 2147483647,
      "rootNodeIdentifer" : "00exo0jcr0root0uuid0000000000000",
      "spellCheckerClass" : null,
      "supportHighlighting" : false,
      "synonymProviderClass" : null,
      "synonymProviderConfigPath" : null,
      "type" : "org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex",
      "useCompoundFile" : false,
      "volatileIdleTime" : 3
    },
  "uniqueName" : "repository_production"
}

Return when being successful
```
status code = 200
```
Return the JSON bean org.exoplatform.services.jcr.ext.backup.server.bean.response.ShortInfo of just started restore. For JSON description see item /rest/jcr-backup/info/backup

/rest/jcr-backup/info/default-ws-config Will be returned the JSON bean to WorkspaceEntry for default workspace.

status code = 403            - the already was restore to workspace /{repo}/{ws}
status code = 404            - the not found repositry '{repo}' or unsupported encoding to workspaceConfig
status code = 500            - the other unknown errors
failure message in response  - the description of failure

1.37.1.1.13. Default Workspace Information

URL: http://host:port/rest/jcr-backup/info/default-ws-config

The JSON bean to org.exoplatform.services.jcr.config.WorkspaceEntry :

{ "accessManager" : null,
  "autoInitPermissions" : null,
  "autoInitializedRootNt" : null,
  "cache" : { "parameters" : [ { "name" : "max-size",
            "value" : "10k"
          },
          { "name" : "live-time",
            "value" : "1h"
          }
        ],
      "type" : "org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl"
    },
  "container" : { "parameters" : [ { "name" : "source-name",
            "value" : "jdbcjcr"
          },
          { "name" : "dialect",
            "value" : "hsqldb"
          },
          { "name" : "multi-db",
            "value" : "false"
          },
          { "name" : "max-buffer-size",
            "value" : "200k"
          },
          { "name" : "swap-directory",
            "value" : "../temp/swap/production"
          }
        ],
      "type" : "org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer",
      "valueStorages" : [ { "filters" : [ { "ancestorPath" : null,
                  "minValueSize" : 0,
                  "propertyName" : null,
                  "propertyType" : "Binary"
                } ],
            "id" : "system",
            "parameters" : [ { "name" : "path",
                  "value" : "../temp/values/production"
                } ],
            "type" : "org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage"
          } ]
    },
  "initializer" : { "parameters" : [ { "name" : "root-nodetype",
            "value" : "nt:unstructured"
          } ],
      "type" : "org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer"
    },
  "lockManager" :
      "timeout" : 15728640
    },
  "name" : "production",
  "queryHandler" : { "analyzer" : {  },
      "autoRepair" : true,
      "bufferSize" : 10,
      "cacheSize" : 1000,
      "documentOrder" : true,
      "errorLogSize" : 50,
      "excerptProviderClass" : "org.exoplatform.services.jcr.impl.core.query.lucene.DefaultHTMLExcerpt",
      "excludedNodeIdentifers" : null,
      "extractorBackLogSize" : 100,
      "extractorPoolSize" : 0,
      "extractorTimeout" : 100,
      "indexDir" : "../temp/jcrlucenedb/production",
      "indexingConfigurationClass" : "org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfigurationImpl",
      "indexingConfigurationPath" : null,
      "maxFieldLength" : 10000,
      "maxMergeDocs" : 2147483647,
      "mergeFactor" : 10,
      "minMergeDocs" : 100,
      "parameters" : [ { "name" : "index-dir",
            "value" : "../temp/jcrlucenedb/production"
          } ],
      "queryClass" : "org.exoplatform.services.jcr.impl.core.query.QueryImpl",
      "queryHandler" : null,
      "resultFetchSize" : 2147483647,
      "rootNodeIdentifer" : "00exo0jcr0root0uuid0000000000000",
      "spellCheckerClass" : null,
      "supportHighlighting" : false,
      "synonymProviderClass" : null,
      "synonymProviderConfigPath" : null,
      "type" : "org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex",
      "useCompoundFile" : false,
      "volatileIdleTime" : 3
    },
  "uniqueName" : "repository_production"
}

Add the components org.exoplatform.services.jcr.ext.backup.server.HTTPBackupAgent and org.exoplatform.services.jcr.ext.backup.BackupManager to services configuration :

status code = 500            - the unknown error
failure message in response  - the description of failure

1.37.1.2. HTTPBackupAgent Configuration

<component>
  <type>org.exoplatform.services.jcr.ext.backup.server.HTTPBackupAgent</type>
</component>

<component>
  <type>org.exoplatform.services.jcr.ext.repository.RestRepositoryService</type>
</component>

<component>
  <key>org.exoplatform.services.jcr.ext.backup.BackupManager</key>
  <type>org.exoplatform.services.jcr.ext.backup.impl.BackupManagerImpl</type>
  <init-params>
    <properties-param>
      <name>backup-properties</name>
      <property name="backup-dir" value="../temp/backup" />
    </properties-param>
  </init-params>
</component>

In case, if you will restore backup in same workspace (so you will drop previous workspace), you need configure RepositoryServiceConfiguration in order to save the changes of the repository configuration. For example

<component>
  <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
  <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
  <init-params>
    <value-param>
      <name>conf-path</name>
      <description>JCR repositories configuration file</description>
      <value>jar:/conf/portal/exo-jcr-config.xml</value>
    </value-param>
    <properties-param>
      <name>working-conf</name>
      <description>working-conf</description>
      <property name="source-name" value="jdbcjcr" />
      <property name="dialect" value="hsqldb" />
      <property name="persister-class-name" value="org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister" />
    </properties-param>
  </init-params>
</component>

See the eXo JCR Configuration article at the 'Portal and Standalone configuration' section for details.

1.37.2. Backup Client

For GateIn should use context "/portal/rest". GateIn uses form authentication, so first you need to login (url to form authentication is http://host:port/portal/login) and then perform requests.

Backup client is support form authentication. For example call command "info" with form authentication to GateIn :

./jcrbackup.sh http://127.0.0.1:8080/portal/rest form POST "/portal/login?initialURI=/portal/private&username=root&password=gtn" info

Backup client is console application.

The backup client is http client for HTTPBackupAgent.

Command signature:

Help info:
 <url_basic_authentication>|<url form authentication>  <cmd> 
 <url_basic_authentication>  :   http(s)//login:password@host:port/<context> 

 <url form authentication>   :   http(s)//host:port/<context> "<form auth parm>" 
     <form auth parm>        :   form <method> <form path>
     <method>                :   POST or GET
     <form path>             :   /path/path?<paramName1>=<paramValue1>&<paramName2>=<paramValue2>...
     Example to <url form authentication> : http://127.0.0.1:8080/portal/rest form POST "/portal/login?initialURI=/portal/private&username=root&password=gtn"

 <cmd>  :   start <repo[/ws]> <backup_dir> [<incr>] 
            stop <backup_id> 
            status <backup_id> 
            restores <repo[/ws]> 
            restore [remove-exists] {{<backup_id>|<backup_set_path>} | {<repo[/ws]> {<backup_id>|<backup_set_path>} [<pathToConfigFile>]}} 
            list [completed] 
            info 
            drop [force-close-session] <repo[/ws]>  
            help  

 start          - start backup of repository or workspace 
 stop           - stop backup 
 status         - information about the current or completed backup by 'backup_id' 
 restores       - information about the last restore on specific repository or workspace 
 restore        - restore the repository or workspace from specific backup 
 list           - information about the current backups (in progress) 
 list completed - information about the completed (ready to restore) backups 
 info           - information about the service backup 
 drop           - delete the repository or workspace 
 help           - print help information about backup console 

 <repo[/ws]>         - /<reponsitory-name>[/<workspace-name>]  the repository or workspace 
 <backup_dir>        - path to folder for backup on remote server 
 <backup_id>         - the identifier for backup 
 <backup_set_dir>    - path to folder with backup set on remote server
 <incr>              - incemental job period 
 <pathToConfigFile>  - path (local) to  repository or workspace configuration 
 remove-exists       - remove fully (db, value storage, index) exists repository/workspace 
 force-close-session - close opened sessions on repository or workspace. 

 All valid combination of parameters for command restore: 
  1. restore remove-exists <repo/ws> <backup_id>       <pathToConfigFile> 
  2. restore remove-exists <repo>    <backup_id>       <pathToConfigFile> 
  3. restore remove-exists <repo/ws> <backup_set_path> <pathToConfigFile> 
  4. restore remove-exists <repo>    <backup_set_path> <pathToConfigFile> 
  5. restore remove-exists <backup_id> 
  6. restore remove-exists <backup_set_path> 
  7. restore <repo/ws> <backup_id>       <pathToConfigFile> 
  8. restore <repo>    <backup_id>       <pathToConfigFile> 
  9. restore <repo/ws> <backup_set_path> <pathToConfigFile> 
 10. restore <repo>    <backup_set_path> <pathToConfigFile> 
 11. restore <backup_id> 
 12. restore <backup_set_path>

1.37.3. Backup Client Usage

1.37.3.1. Building application

Go to folder of "backup client" ${JCR-SRC-HOME}/applications/exo.jcr.applications.backupconsole . - build the application :
```
mvn clean install -P deploy
```
Go to ${JCR-SRC-HOME}/applications/exo.jcr.applications.backupconsole/target/backupconsole-binary and use it.

Note

${JCR-SRC-HOME} the path where eXo JCR sources located

1.37.3.2. Running application

Run jar

java -jar exo.jcr.applications.backupconsole-binary.jar <command>

or use jcrbackup.cmd (or .sh);

1.37.3.3. Getting information about backup service

jcrbackup http://root:exo@127.0.0.1:8080 info

Start full backup only on workspace "backup", the parameter <bakcup_dir> (../temp/backup) should be exists:

The backup service information : 
  full backup type               : org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob
  incremetal backup type         : org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob
  backup log folder              : /home/rainf0x/java/exo-working/JCR-839/new_JCR/exo-tomcat/bin/../temp/backup
  default incremental job period : 3600

1.37.3.4. Starting full backup

jcrbackup http://root:exo@127.0.0.1:8080 start /repository/backup ../temp/backup

Start full and incremental backup on workspace "production":

Successful : 
  status code = 200

1.37.3.5. Starting full and incremental backup on a single workspace

jcrbackup http://root:exo@127.0.0.1:8080 start /repository/production ../temp/backup 10000

Successful : 
 tatus code = 200

1.37.3.6. Getting information about the current backups (in progress)

jcrbackup http://root:exo@127.0.0.1:8080 list

The current backups information : 
  1) Backup with id b46370107f000101014b03ea5fbe8d54 :
    repository name           : repository
    workspace name            : production
    backup type               : full + incremetal
    full backup state         : finished
    incremental backup state  : working
    started time              : Fri, 17 Apr 2009 17:03:16 EEST
  2) Backup with id b462e4427f00010101cf243b4c6015bb :
    repository name           : repository
    workspace name            : backup
    backup type               : full only
    full backup state         : finished
    started time              : Fri, 17 Apr 2009 17:02:41 EEST

1.37.3.7. Getting information about the current backup by 'backup_id'

jcrbackup http://root:exo@127.0.0.1:8080 status b46370107f000101014b03ea5fbe8d54

return:

The current backup information : 
    backup id                : b46370107f000101014b03ea5fbe8d54
    backup folder            : /home/rainf0x/java/exo-working/JCR-839/new_JCR/exo-tomcat/bin/../temp/backup
    repository name          : repository
    workspace name           : production
    backup type              : full + incremetal
    full backup state        : finished
    incremental backup state : working
    started time             : Fri, 17 Apr 2009 17:03:16 EEST

1.37.3.8. Stopping backup by "backup_id"

jcrbackup http://root:exo@127.0.0.1:8080 stop 6c302adc7f00010100df88d29535c6ee

Successful : 
  status code = 200

1.37.3.9. Getting information about the completed (ready to restore) backups

jcrbackup http://root:exo@127.0.0.1:8080 list completed

Restore to workspace "backup3", for restore need the <backup_id> of completed backup and path to file with workspace configuration:

The completed (ready to restore) backups information : 
  1) Backup with id adf6fadc7f00010100053b2cba43513c :
    repository name           : repository
    workspace name            : backup
    backup type               : full only
    started time              : Thu, 16 Apr 2009 11:07:05 EEST

  2) Backup with id b46370107f000101014b03ea5fbe8d54 :
    repository name           : repository
    workspace name            : production
    backup type               : full + incremetal
    started time              : Fri, 17 Apr 2009 17:03:16 EEST

  3) Backup with id aec419cc7f000101004aca277b2b4e9f :
    repository name           : repository
    workspace name            : backup8
    backup type               : full only
    started time              : Thu, 16 Apr 2009 14:51:08 EEST

1.37.3.10. Restoring to workspace

jcrbackup http://root:exo@127.0.0.1:8080 restore /repository/backup3 6c302adc7f00010100df88d29535c6ee /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup3.xml

Get information about the current restore for workspace /repository/backup3:

Successful : 
  status code = 200

1.37.3.11. Getting information about the current restore

jcrbackup http://root:exo@127.0.0.1:8080 restores

Restore to workspace "backup" and remove fully (will be removed content from db, value storage, index) exists workspace, for restore need the <backup_id> of completed backup and path to file with workspace configuration:

The current restores information : 
1) Restore with id 6c302adc7f00010100df88d29535c6ee:
    full backup date        : 2009-04-03T16:34:37.394+03:00
    backup log file         : /home/rainf0x/java/exo-working/JCR-839/exo-tomcat/bin/../temp/backup/backup-6c302adc7f00010100df88d29535c6ee.xml
    repository name         : repository
    workspace name          : backup3
    backup type             : full only
    path to backup folder   : /home/rainf0x/java/exo-working/JCR-839/exo-tomcat/bin/../temp/backup
    restore state           : successful

1.37.3.12. Restoring workspace and remove exists workspace

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /repository/backup 6c302adc7f00010100df88d29535c6ee /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup.xml

Restore to workspace "backup", for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup and path to file with workspace configuration:

Successful : 
  status code = 200

1.37.3.13. Restoring workspace from backup set

jcrbackup http://root:exo@127.0.0.1:8080 restore /repository/backup /tmp/123/repository_backup-20101220_114156 /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup.xml

Restore to workspace "backup" and remove fully (will be removed content from db, value storage, index) exists workspace, for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup and path to file with workspace configuration:

Successful : 
  status code = 200

1.37.3.14. Restoring workspace from backup set and remove exists workspace

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /repository/backup /repository/backup /tmp/123/repository_backup-20101220_114156 /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup.xml

Restore to workspace "backup" with original configuration of workspace (the original configuration was stored in backup set), for restore need the <backup_id> of completed backup:

Successful : 
  status code = 200

1.37.3.15. Restoring workspace with original configuation

jcrbackup http://root:exo@127.0.0.1:8080 restore  6c302adc7f00010100df88d29535c6ee

Restore to workspace "backup" with original configuration of workspace (the original configuration was stored in backup set) and remove fully (will be removed content from db, value storage, index) exists workspace, for restore need the <backup_id> of completed backup:

Successful : 
  status code = 200

1.37.3.16. Restoring workspace with original configuation and remove exists workspace

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists 6c302adc7f00010100df88d29535c6ee

Restore to workspace "backup" with original configuration of workspace (the original configuration was stored in backup set), for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup:

Successful : 
  status code = 200

1.37.3.17. Restoring workspace from backup set with original configuation

jcrbackup http://root:exo@127.0.0.1:8080 restore /tmp/123/repository_backup-20101220_114156

Restore to workspace "backup" and remove fully (will be removed content from db, value storage, index) exists workspace with original configuration of workspace (the original configuration was stored in backup set), for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup:

Successful : 
  status code = 200

1.37.3.18. Restoring workspace from backup set with original configuation and remove exists workspace

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /tmp/123/repository_backup-20101220_114156

Restore to repository "repository" , for restore need the <backup_id> of completed backup and path to file with repository configuration:

Successful : 
  status code = 200

1.37.3.19. Restoring repository

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /repository 6c302adc7f00010100df88d29535c6ee /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config.xml

Restore to repositoy "repository" and remove fully (will be removed content from db, value storage, index) exists repository, for restore need the <backup_id> of completed backup and path to file with repository configuration:

Successful : 
  status code = 200

1.37.3.20. Restoring repository and remove exists repository

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /repository 6c302adc7f00010100df88d29535c6ee /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config.xml

Restore to repository "repository", for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup and path to file with repository configuration:

Successful : 
  status code = 200

1.37.3.21. Restoring repository from backup set

jcrbackup http://root:exo@127.0.0.1:8080 restore /repository /tmp/123/repository_repository_backup_1292833493681 /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config.xml

Restore to repository "repository" and remove fully (will be removed content from db, value storage, index) exists repository, for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup and path to file with repository configuration:

Successful : 
  status code = 200

1.37.3.22. Restoring repository from backup set and remove exists repository

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /repository /repository/backup /tmp/123/repository_repository_backup_1292833493681 /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config.xml

Restore to repository "repository" with original configuration of repository (the original configuration was stored in backup set), for restore need the <backup_id> of completed backup:

Successful : 
  status code = 200

1.37.3.23. Restoring repository with original configuation

jcrbackup http://root:exo@127.0.0.1:8080 restore  6c302adc7f00010100df88d29535c6ee

Restore to repository "repository" with original configuration of repository (the original configuration was stored in backup set) and remove fully (will be removed content from db, value storage, index) exists repository, for restore need the <backup_id> of completed backup:

Successful : 
  status code = 200

1.37.3.24. Restoring repository with original configuation and remove exists repository

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists 6c302adc7f00010100df88d29535c6ee

Restore to repository "repository" with original configuration of repository (the original configuration was stored in backup set), for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup:

Successful : 
  status code = 200

1.37.3.25. Restoring repository from backup set with original configuation

jcrbackup http://root:exo@127.0.0.1:8080 restore /tmp/123/repository_repository_backup_1292833493681

Restore to repository "repository" and remove fully (will be removed content from db, value storage, index) exists repository with original configuration of repository (the original configuration was stored in backup set), for restore need the <backup_set_path> (<backup_set_path> is path to backup set folder on server side) of completed backup:

Successful : 
  status code = 200

1.37.3.26. Restoring repository from backup set with original configuation and remove exists repository

jcrbackup http://root:exo@127.0.0.1:8080 restore remove-exists /tmp/123/repository_repository_backup_1292833493681

Successful : 
  status code = 200

1.37.4. Full example about creating backup and restoring it for workspace 'backup'

1.37.4.1. Creating backup

jcrbackup http://root:exo@127.0.0.1:8080 start /repository/backup ../temp/backup 10000

Successful : 
  status code = 200

1.37.4.2. Getting information about current backups

jcrbackup http://root:exo@127.0.0.1:8080 list

Stop backup with id b469ba957f0001010178febaedf20eb7 :

The current backups information : 
  1) Backup with id b469ba957f0001010178febaedf20eb7 :
    repository name           : repository
    workspace name            : backup
    backup type               : full + incremetal
    full backup state         : finished
    incremental backup state  : working
    started time              : Fri, 17 Apr 2009 17:10:09 EEST

1.37.4.3. Stopping backup by id

jcrbackup http://root:exo@127.0.0.1:8080 stop b469ba957f0001010178febaedf20eb7

Successful : 
  status code = 200

1.37.4.4. Deleting the workspace "backup" and close opened sessions on this workspace

jcrbackup http://root:exo@127.0.0.1:8080 drop force-close-session /repository/backup

Delete/clean the database for workspace "backup" : When we use "single-db", then we will run the SQL queries for clean database :

Successful : 
  status code = 200

1.37.4.5. Restoring the workspace "backup"

delete from JCR_SREF where NODE_ID in (select ID from JCR_SITEM where CONTAINER_NAME = 'backup')
delete from JCR_SVALUE where PROPERTY_ID in (select ID from JCR_SITEM where CONTAINER_NAME = 'backup')
delete from JCR_SITEM where CONTAINER_NAME='backup'

Delete the value storage for workspace "backup"; - delete the index data for workspace "backup"; - restore :
```
jcrbackup http://root:exo@127.0.0.1:8080 restore /repository/backup b469ba957f0001010178febaedf20eb7 /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup.xml
```
Return :
```
Successful : 
  status code = 200
```
The /home/rainf0x/java/exo-working/JCR-839/exo-jcr-config_backup.xml content the configuration for restored workspace "backup":

<repository-service default-repository="repository">
  <repositories>
    <repository name="repository" system-workspace="production" default-workspace="production">
      <security-domain>exo-domain</security-domain>
      <access-control>optional</access-control>
      <authentication-policy>org.exoplatform.services.jcr.impl.core.access.JAASAuthenticator</authentication-policy>
      <workspaces>
        
        <workspace name="backup">
          <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
              <property name="source-name" value="jdbcjcr" />
              <property name="dialect" value="pgsql" />
              <property name="multi-db" value="false" />
              <property name="max-buffer-size" value="200k" />
              <property name="swap-directory" value="../temp/swap/backup" />
            </properties>
            <value-storages>
              <value-storage id="draft" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
                <properties>
                  <property name="path" value="../temp/values/backup" />
                </properties>
                <filters>
                  <filter property-type="Binary"/>
                </filters>
              </value-storage>
            </value-storages>
          </container>
          <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
            <properties>
              <property name="root-nodetype" value="nt:unstructured" />
            </properties>
          </initializer>
          <cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl">
            <properties>
              <property name="max-size" value="10k" />
              <property name="live-time" value="1h" />
            </properties>
          </cache>
          <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
            <properties>
              <property name="index-dir" value="../temp/jcrlucenedb/backup" />
            </properties>
          </query-handler>
          <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
             <properties>
                <property name="time-out" value="15m" />
                <property name="jbosscache-configuration" value="jbosscache-lock.xml" />
                <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks" />
                <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
                <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
                <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_pk" />
                <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
                <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
                <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
                <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
                <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
                <property name="jbosscache-shareable" value="true" />
             </properties>
          </lock-manager>
        </workspace>
      </workspaces>
    </repository>
  </repositories>
</repository-service>

1.37.4.6. Getting information about restore for workspace /repository/backup

jcrbackup http://root:exo@127.0.0.1:8080 restores /repository/backup

If delete default repository that should be restored repository with name as default repository.

The current restores information : 
  Restore with id b469ba957f0001010178febaedf20eb7:
    backup folder           : /home/rainf0x/java/exo-working/JCR-839/new_JCR/exo-tomcat/bin/../temp/backup
    repository name         : repository
    workspace name          : backup
    backup type             : full + incremetal
    restore state           : successful
    started time            : Fri, 17 Apr 2009 16:38:00 EEST
    finished time           : Fri, 17 Apr 2009 16:38:00 EEST

1.37.5. Full example about creating backup and restoring it for repository 'repository'

Note

This usecase needs RestRepositoryService enabled. (Deleting the repository needs it)

<component>
   <type>org.exoplatform.services.jcr.ext.repository.RestRepositoryService</type>
</component>

1.37.5.1. Creating backup

jcrbackup http://root:exo@127.0.0.1:8080 start /repository ../temp/backup 10000

Successful : 
  status code = 200

1.37.5.2. Getting information about current backups

jcrbackup http://root:exo@127.0.0.1:8080 list

Stop backup with id 9a4d40fb7f0000012ec8f0a4ec70b3da :

The current backups information : 
  1) Repository backup with id 9a4d40fb7f0000012ec8f0a4ec70b3da :
    repository name            : repository
    backup type                : full + incremetal
    full backups state         : finished
    incremental backups state  : working
    started time               : Mon, 11 Oct 2010 10:59:35 EEST

1.37.5.3. Stopping backup by id

jcrbackup http://root:exo@127.0.0.1:8080 stop 9a4d40fb7f0000012ec8f0a4ec70b3da

Successful : 
  status code = 200

1.37.5.4. Deleting the repository "repository" and close all opened sessions

jcrbackup http://root:exo@127.0.0.1:8080 drop force-close-session /repository

Successful : 
  status code = 200

1.37.5.5. Restoring the repository "repository"

Delete/clean the database for workspace "repository": When we use "single-db", then we will run the SQL queries for clean database :
```
       drop table JCR_SREF;
       drop table JCR_SVALUE;
       drop table JCR_SITEM;
```
Delete the value storage for repository "repository";
Delete the index data for repository "repository";

Restore:

jcrbackup http://root:exo@127.0.0.1:8080 restore /repository 9a6dba327f000001325dfb228a181b07 /home/rainf0x/exo-jcr-config_backup.xml

The /home/rainf0x/exo-jcr-config_backup.xml content the configuration for restored repository "repository":

Successful : 
  status code = 200

<repository-service default-repository="repository">
   <repositories>
      <repository name="repository" system-workspace="production" default-workspace="production">
         <security-domain>exo-domain</security-domain>
         <access-control>optional</access-control>
         <authentication-policy>org.exoplatform.services.jcr.impl.core.access.JAASAuthenticator</authentication-policy>
         <workspaces>
            <workspace name="production">
               <!-- for system storage -->
               <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
                  <properties>
                     <property name="source-name" value="jdbcjcr" />
                     <property name="multi-db" value="false" />
                     <property name="max-buffer-size" value="200k" />
                     <property name="swap-directory" value="../temp/swap/production" />
                  </properties>
                  <value-storages>
                     <value-storage id="system" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
                        <properties>
                           <property name="path" value="../temp/values/production" />
                        </properties>
                        <filters>
                           <filter property-type="Binary" />
                        </filters>
                     </value-storage>
                  </value-storages>
               </container>
               <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
                  <properties>
                     <property name="root-nodetype" value="nt:unstructured" />
                  </properties>
               </initializer>
               <cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl">
                  <properties>
                     <property name="max-size" value="10k" />
                     <property name="live-time" value="1h" />
                  </properties>
               </cache>
               <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
                  <properties>
                     <property name="index-dir" value="../temp/jcrlucenedb/production" />
                  </properties>
               </query-handler>
               <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
                  <properties>
                     <property name="time-out" value="15m" />
                     <property name="jbosscache-configuration" value="jbosscache-lock.xml" />
                     <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks" />
                     <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
                     <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
                     <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_pk" />
                     <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
                     <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
                     <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
                     <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
                     <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
                     <property name="jbosscache-shareable" value="true" />
                  </properties>
               </lock-manager>
            </workspace>

            <workspace name="backup">
               <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
                  <properties>
                     <property name="source-name" value="jdbcjcr" />
                     <property name="multi-db" value="false" />
                     <property name="max-buffer-size" value="200k" />
                     <property name="swap-directory" value="../temp/swap/backup" />
                  </properties>
                  <value-storages>
                     <value-storage id="draft" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
                        <properties>
                           <property name="path" value="../temp/values/backup" />
                        </properties>
                        <filters>
                           <filter property-type="Binary" />
                        </filters>
                     </value-storage>
                  </value-storages>
               </container>
               <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
                  <properties>
                     <property name="root-nodetype" value="nt:unstructured" />
                  </properties>
               </initializer>
               <cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl">
                  <properties>
                     <property name="max-size" value="10k" />
                     <property name="live-time" value="1h" />
                  </properties>
               </cache>
               <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
                  <properties>
                     <property name="index-dir" value="../temp/jcrlucenedb/backup" />
                  </properties>
               </query-handler>
                <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
                  <properties>
                     <property name="time-out" value="15m" />
                     <property name="jbosscache-configuration" value="jbosscache-lock.xml" />
                     <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks" />
                     <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
                     <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
                     <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_pk" />
                     <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
                     <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
                     <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
                     <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
                     <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
                     <property name="jbosscache-shareable" value="true" />
                  </properties>
               </lock-manager>
            </workspace>

            <workspace name="digital-assets">
               <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
                  <properties>
                     <property name="source-name" value="jdbcjcr" />
                     <property name="multi-db" value="false" />
                     <property name="max-buffer-size" value="200k" />
                     <property name="swap-directory" value="../temp/swap/digital-assets" />
                  </properties>
                  <value-storages>
                     <value-storage id="digital-assets" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
                        <properties>
                           <property name="path" value="../temp/values/digital-assets" />
                        </properties>
                        <filters>
                           <filter property-type="Binary" />
                        </filters>
                     </value-storage>
                  </value-storages>
               </container>
               <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
                  <properties>
                     <property name="root-nodetype" value="nt:folder" />
                  </properties>
               </initializer>
               <cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl">
                  <properties>
                     <property name="max-size" value="5k" />
                     <property name="live-time" value="15m" />
                  </properties>
               </cache>
               <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
                  <properties>
                     <property name="index-dir" value="../temp/jcrlucenedb/digital-assets" />
                  </properties>
               </query-handler>
               <lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
                  <properties>
                     <property name="time-out" value="15m" />
                     <property name="jbosscache-configuration" value="jbosscache-lock.xml" />
                     <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks" />
                     <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
                     <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
                     <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_pk" />
                     <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
                     <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
                     <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
                     <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
                     <property name="jbosscache-cl-cache.jdbc.dialect" value="${dialect}" />
                     <property name="jbosscache-shareable" value="true" />
                  </properties>
               </lock-manager>
            </workspace>
         </workspaces>
      </repository>
   </repositories>
</repository-service>

1.37.5.6. Getting information about restore for repository 'repository'

jcrbackup http://root:exo@127.0.0.1:8080 restores /repository