Chapter 74. JCR FAQ

This is a draft of a future FAQ on JCR usage.

74.1. Kernel

74.1.1. What is the best, standardized way to get the instance of a service?

container.getComponentInstanceOfType(ServiceName.class);

74.2. JCR

74.2.1. JCR core

74.2.1.1. Is it better to use Session.getNodeByUUID or Session.getItem?

Session.getNodeByUUID() is about 2.5 times faster than Session.getItem(String) and only about 25% faster than Node.getNode(String). See the daily performance test results for such comparisons, e.g.

http://tests.exoplatform.org/JCR/1.12.2-GA/rev.2442/daily-performance-testing-results/jcr.core./index.html

74.2.1.2. Does it make sense to make all nodes referenceable in order to use getNodeByUUID all the time?

It can, as long as it fits the business logic. But take into account that paths are human-readable and let you think in terms of hierarchy. If that is important, a location-based approach is preferable.

74.2.1.3. What should I use to check if an Item exists before getting the Value?

Use Session.itemExists(String absPath), Node.hasNode(String relPath) or Node.hasProperty(String relPath). It is also possible to check Node.hasNodes() and Node.hasProperties().

SELECT * FROM nt:unstructured WHERE jcr:path LIKE 'testRoot/%'

When ordering by jcr:path is specified, SQL and XPath behave differently:

SQL: regardless of ascending or descending order, the query returns result nodes in random order:

SELECT * FROM nt:unstructured WHERE jcr:path LIKE 'testRoot/%' ORDER BY jcr:path

XPath: the jcr:path order clause is ignored (so the result is not sorted by path):

/testRoot/*[@jcr:primaryType='nt:unstructured'] order by jcr:path

74.2.1.8. How does the eXo JCR indexer use content encoding?

1. The indexer uses the jcr:encoding property of the nt:resource node (used as the jcr:content child node of nt:file).
2. If no jcr:encoding property is set, the Document Service uses the encoding configured in the service (defaultEncoding).
3. If nothing is configured, the JVM default encoding is used.
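The fallback order above can be sketched in plain Java. This is only an illustration of the three steps, not the indexer's actual code: the class and method names (IndexerEncodingFallback, resolve) are hypothetical, and the two arguments stand in for the jcr:encoding property value and the service's defaultEncoding setting.

```java
import java.nio.charset.Charset;

/** Hypothetical helper mirroring the three-step encoding fallback described above. */
final class IndexerEncodingFallback {

    /**
     * @param jcrEncoding     value of jcr:encoding on the nt:resource node, or null if unset
     * @param defaultEncoding the service's configured defaultEncoding, or null if unset
     */
    static Charset resolve(String jcrEncoding, String defaultEncoding) {
        if (jcrEncoding != null && !jcrEncoding.isEmpty()) {
            return Charset.forName(jcrEncoding);      // step 1: encoding stored on the node
        }
        if (defaultEncoding != null && !defaultEncoding.isEmpty()) {
            return Charset.forName(defaultEncoding);  // step 2: encoding configured in the service
        }
        return Charset.defaultCharset();              // step 3: JVM default encoding
    }
}
```

For example, resolve(null, "UTF-8") falls through to the second step and returns the UTF-8 charset.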

74.2.1.9. Which database server is better for eXo JCR?

If the question is about performance, it is a difficult one, as each database can be tuned to be faster for a particular case. MySQL with the MyISAM engine will be faster, but MySQL has limitations on indexes for multilingual columns (Item names, actually). So, with long Item names (larger of

Oracle and PostgreSQL are also good for performance. DB2 and MSSQL are slower in their default configurations, and Sybase in its default configuration is the slowest.

Also take database server maintenance into account. MySQL and PostgreSQL are simple to install and can work even on limited hardware. Oracle, DB2, MSSQL and Sybase need more effort, and the same holds for maintenance during operation.

Note for Sybase: the "check-sns-new-connection" data container configuration parameter should be set to "true".

For testing or embedded use, HSQLDB is the best choice. Apache Derby and H2 are also supported, but H2, surprisingly, needs a "beta" feature enabled: MVCC=TRUE in the JDBC URL.

74.2.1.10. How to set up eXo JCR for multilingual content on MySQL?

MySQL database should be configured to use single-byte encoding, e.g. "latin1". eXo JCR application (e.g. GateIn) should use JCR dialect "MySQL-UTF8".

In other words, the MySQL database default encoding and the JCR dialect cannot both be UTF8. Use a single-byte encoding (e.g. "latin1") for the database and the "mysql-utf8" dialect for eXo JCR.

Note: the "MySQL-UTF8" dialect cannot be auto-detected; it must be set explicitly in the configuration.

74.2.1.11. Does MySQL have limitations affecting eXo JCR features?

The index key length of the JCR_SITEM (JCR_MITEM) table is reduced to 765 bytes (or 255 chars) for the mysql-utf8 dialect.

74.2.1.12. Does the use of a Sybase database need special options in the eXo JCR configuration?

For JCR to work properly with Sybase, add the property "check-sns-new-connection" with the value false to each workspace data container, like this:

<container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
  <properties>
    <property name="source-name" value="jdbcjcr" />
    <property name="dialect" value="auto" />
    <property name="multi-db" value="true" />
    <property name="update-storage" value="false" />
    <property name="max-buffer-size" value="200k" />
    <property name="swap-directory" value="target/temp/swap/ws" />
    <property name="check-sns-new-connection" value="false" />
  </properties>
</container>

74.2.1.13. How to open and close a session properly to avoid memory leaks?

Session session = repository.login(credentials);
try
{
// here your code
}
finally
{
session.logout();
}

74.2.1.14. Can I use Session after logout?

No. Any instance of Session or Node (acquired through the session) should not be used after logout. At the very least, doing so is strongly discouraged.

74.2.1.15. How to configure JCR for a cluster?

So we have configured JCR in standalone mode and want to reconfigure it for a clustered environment. First of all, let's check whether all requirements are satisfied:

A dedicated RDBMS such as MySQL, Postgres or Oracle, but not HSQLDB;
Shared storage. The simplest option is a shared FS such as NFS or SMB mounted in the operating system, but these are rather slow. The best option is a SAN (Storage Area Network);
A fast network between JCR nodes.

Now we need to configure the Container a bit. Check exo-configuration.xml to make sure you are using the JBossTS Transaction Service and the JBossCache Transaction Manager, as shown below.

<component>
   <key>org.jboss.cache.transaction.TransactionManagerLookup</key>
   <type>org.jboss.cache.GenericTransactionManagerLookup</type>
</component>

<component>
   <key>org.exoplatform.services.transaction.TransactionService</key>
   <type>org.exoplatform.services.transaction.jbosscache.JBossTransactionsService</type>
   <init-params>
      <value-param>
         <name>timeout</name>
         <value>300</value>
      </value-param>
   </init-params>
</component>

The next stage is the JCR configuration itself. We need JBossCache configuration templates for data-cache, indexer-cache and lock-manager-cache; they will be used later when configuring JCR's core components. Pre-bundled templates are available in the EAR or JAR in conf/standalone/cluster. They can be used as is or rewritten if needed. Now it is time to reconfigure each workspace a bit: a few parameters of <cache>, <query-handler> and <lock-manager> need to be changed.

<cache> configuration should look like this:
```
<cache enabled="true"
     class="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
     <properties>
        <property name="jbosscache-configuration" value="test-jbosscache-data.xml" />
        <property name="jgroups-configuration" value="udp-mux.xml" />
        <property name="jgroups-multiplexer-stack" value="true" />
        <property name="jbosscache-cluster-name" value="JCR-cluster-db1-ws" />
     </properties>
</cache>
```

<query-handler> configuration

In the <query-handler> block, replace or add the "changesfilter-class" parameter so that it equals:

<property name="changesfilter-class" value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter"/>

Then add the JBossCache-oriented configuration:

<property name="jbosscache-configuration" value="test-jbosscache-indexer.xml" />
<property name="jgroups-configuration" value="udp-mux.xml" />
<property name="jgroups-multiplexer-stack" value="true" />
<property name="jbosscache-cluster-name" value="JCR-cluster-indexer-db1-ws" />
<property name="max-volatile-time" value="60" />

Those properties have the same meaning and restrictions as in the previous block. The last property, "max-volatile-time", is not mandatory but recommended. It specifies that the latest changes in the index will become visible to each cluster node within 60 seconds at most.

<lock-manager> configuration
This may be the hardest element to configure, because we have to define access to the DB where locks will be stored. Replace the existing lock-manager configuration with the one shown below.
```
<lock-manager class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
   <properties>
      <property name="time-out" value="15m" />
      <property name="jbosscache-configuration" value="test-jbosscache-lock.xml" />
      <property name="jgroups-configuration" value="udp-mux.xml" />
      <property name="jgroups-multiplexer-stack" value="true" />
      <property name="jbosscache-cluster-name" value="JCR-cluster-locks-db1-ws" />
      <property name="jbosscache-cl-cache.jdbc.table.name" value="jcrlocks_db1_ws" />
      <property name="jbosscache-cl-cache.jdbc.table.create" value="true" />
      <property name="jbosscache-cl-cache.jdbc.table.drop" value="false" />
      <property name="jbosscache-cl-cache.jdbc.table.primarykey" value="jcrlocks_db1_ws_pk" />
      <property name="jbosscache-cl-cache.jdbc.fqn.column" value="fqn" />
      <property name="jbosscache-cl-cache.jdbc.node.column" value="node" />
      <property name="jbosscache-cl-cache.jdbc.parent.column" value="parent" />
      <property name="jbosscache-cl-cache.jdbc.datasource" value="jdbcjcr" />
   </properties>
</lock-manager>
```
The first few properties are the same as in the previous components, but here you can also see some unusual "jbosscache-cl-cache.jdbc.*" properties. They define access parameters for the database where locks are persisted.

That's all. JCR is ready to run in a cluster.

74.2.1.16. Is JCR suitable for remote sites* synchronization?

* Remote sites are understood as different buildings separated by a WAN network.

74.2.1.17. How to use the Lucene spellchecker?

There are a few steps:

Enable the Lucene spellchecker in the JCR QueryHandler configuration:

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="spellchecker-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$FiveSecondsRefreshInterval" />
      ...
   </properties>
</query-handler>

Execute a query with the rep:spellcheck function and the word to be checked:

Query query = qm.createQuery("select rep:spellcheck() from nt:base where " + "jcr:path = '/' and spellcheck('word that is checked')", Query.SQL);
RowIterator rows = query.execute().getRows();

Fetch the result:

Row r = rows.nextRow();
Value v = r.getValue("rep:spellcheck()");

If there are no results, there is no suggestion: either the word is correct or the spellchecker's dictionary does not contain any word similar to the checked word.

74.2.1.18. How can I influence the spellchecker results?

There are two parameters in the JCR QueryHandler configuration:

minimal distance between the checked word and the proposed suggestion;

search for more popular suggestions.

<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
   <properties>
      ...
      <property name="spellchecker-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$FiveSecondsRefreshInterval" />
      <property name="spellchecker-more-popular" value="false" />
      <property name="spellchecker-min-distance" value="0.55" />
      ...
   </properties>
</query-handler>

The minimal distance is computed as the Levenshtein distance between the checked word and the spellchecker suggestion.

If "morePopular" is disabled:

if the proposed word exists in the directory, no suggestion is given;
if the proposed word doesn't exist in the directory, the closest word is proposed.

If "morePopular" is enabled:

regardless of whether the word exists or not, the checker proposes the closest word that is more popular than the checked word.
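To make the distance parameter more concrete, here is a minimal sketch of the Levenshtein edit distance in plain Java. Because the configured value (0.55) is fractional, the sketch also shows a normalized similarity in the range 0 to 1; that normalization (1 minus the distance divided by the longer word's length) is an assumption made for illustration, and the class and method names are hypothetical rather than part of eXo JCR.

```java
/** Hypothetical illustration of edit distance between a checked word and a suggestion. */
final class SpellDistance {

    /** Classic dynamic-programming Levenshtein distance (insert, delete, substitute). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;                       // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;                       // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;                  // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1.0 for identical words, 0.0 for completely different ones. */
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / longest;
    }
}
```

Under this assumption, "kitten" and "sitting" are 3 edits apart, which gives a similarity of about 0.57, just above a threshold of 0.55.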

74.2.2. JCR extensions

74.2.2.1. How to restore a repository over an existing repository?

Remove the existing repository using:

RepositoryService.removeRepository(String repositoryName)

Restore the repository using:

BackupManager.restore(RepositoryBackupChainLog log, RepositoryEntry repositoryEntry, boolean asynchronous)

74.2.2.2. How to restore a workspace over an existing workspace?

Remove the existing workspace using:

ManageableRepository.removeWorkspace(String workspaceName)

Restore the workspace using:

BackupManager.restore(BackupChainLog log, String repositoryName, WorkspaceEntry workspaceEntry, boolean asynchronous)

74.2.2.3. Does JCR support hot backup?

Yes, JCR supports hot backup. Use org.exoplatform.services.jcr.ext.backup.BackupManager.

74.2.3. WebDAV

74.2.3.1. I uploaded a file to the WebDAV server using Mac OS Finder, but the file size is '0'. What is wrong?

This is a known Finder bug that first appeared in Mac OS v.10.5.3 and is not yet fixed.

For more details, see the Apple Discussions thread.

74.2.3.2. Can I manage the 'cache-control' value for different media types from the server configuration?

Use the "cache-control" configuration parameter.

The value of this parameter must contain semicolon-separated pairs of the form "MediaType:cache-control value"; several media types can share one value when separated by commas.

For example, if you need to cache all text/xml and text/plain files for 5 minutes (300 sec.) and other text/* files for 10 minutes (600 sec.), use the following configuration:

<component>
   <type>org.exoplatform.services.jcr.webdav.WebDavServiceImpl</type>
   <init-params>
      <value-param>
         <name>cache-control</name>
         <value>text/xml,text/plain:max-age=300;text/*:max-age=600;</value>
      </value-param>
   </init-params>
</component>
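To illustrate how such a value can be read, the following plain-Java sketch parses the example string into rules and looks up a media type. It is not the WebDAV service's implementation: the class name CacheControlConfig and its methods are hypothetical, and the matching order (first rule in configuration order wins, with "type/*" acting as a prefix wildcard) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for a "MediaType:cache-control value" configuration string. */
final class CacheControlConfig {

    // Keeps rules in configuration order so that earlier pairs take precedence.
    private final Map<String, String> rules = new LinkedHashMap<>();

    CacheControlConfig(String config) {
        for (String pair : config.split(";")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                continue;                                  // skip empty or malformed pairs
            }
            String value = pair.substring(colon + 1).trim();
            for (String type : pair.substring(0, colon).split(",")) {
                rules.put(type.trim(), value);             // comma-separated types share a value
            }
        }
    }

    /** Returns the cache-control value of the first matching rule, or null if none matches. */
    String lookup(String mediaType) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            String pattern = rule.getKey();
            boolean matches = pattern.endsWith("/*")
                ? mediaType.startsWith(pattern.substring(0, pattern.length() - 1))
                : pattern.equals(mediaType);
            if (matches) {
                return rule.getValue();
            }
        }
        return null;
    }
}
```

With the example value, text/xml and text/plain resolve to max-age=300, any other text/* type resolves to max-age=600, and unrelated types such as image/png have no rule.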

74.2.3.3. How to perform WebDAV requests using curl?

Simple Requests

For simple requests such as GET, HEAD, MKCOL, COPY, MOVE, DELETE, CHECKIN, CHECKOUT, UNCHECKOUT, LOCK, UNLOCK, VERSIONCONTROL and OPTIONS,

perform:

curl -i -u 'user:pass' -X 'METHOD_NAME' 'resource_url'

For example, to create a folder named test, perform:

curl -i -u 'root:exo' -X MKCOL 'http://localhost:8080/rest/jcr/repository/production/test'

To PUT a test.txt file from your current folder to the "test" folder on the server, perform:

curl -i -u 'root:exo' -X PUT 'http://localhost:8080/rest/jcr/repository/production/test/test.txt' --data-binary @test.txt

Requests with XML body

For requests that contain an XML body, such as ORDER, PROPFIND, PROPPATCH, REPORT and SEARCH,

add -d 'xml_body text' or -d @body.xml

(body.xml must contain a valid XML request body) to your curl command:

curl -i -u 'user:pass' -X 'METHOD_NAME' -H 'Headers' 'resource_url' -d 'xml_body text'

For example, to find all files containing "text", perform:

curl -i -u "root:exo" -X "SEARCH" "http://192.168.0.7:8080/rest/jcr/repository/production/" -d \
"<?xml version='1.0' encoding='UTF-8' ?>
   <D:searchrequest xmlns:D='DAV:'>
      <D:sql>SELECT * FROM nt:base WHERE contains(*, 'text')</D:sql>
</D:searchrequest>"

If you need to add headers to your request, use the -H option.

More information about method parameters can be found in the HTTP Extensions for Distributed Authoring specification.
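The same kinds of requests can also be built programmatically. The sketch below uses the standard java.net.http API (Java 11 or later) to construct a WebDAV request with Basic authentication and an optional XML body. The class WebDavRequests and its build method are hypothetical helpers, and setting Content-Type to text/xml for bodies is an assumption of this sketch, not something the curl commands above do.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper that builds (but does not send) WebDAV requests. */
final class WebDavRequests {

    /**
     * @param method  WebDAV method name, e.g. MKCOL or SEARCH
     * @param xmlBody XML request body, or null for simple requests
     */
    static HttpRequest build(String method, String url, String user, String pass, String xmlBody) {
        // Equivalent of curl's -u 'user:pass': an HTTP Basic Authorization header.
        String token = Base64.getEncoder()
            .encodeToString((user + ":" + pass).getBytes(StandardCharsets.UTF_8));
        HttpRequest.BodyPublisher body = xmlBody == null
            ? HttpRequest.BodyPublishers.noBody()
            : HttpRequest.BodyPublishers.ofString(xmlBody, StandardCharsets.UTF_8);
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
            .header("Authorization", "Basic " + token)
            .method(method, body);             // equivalent of curl's -X METHOD_NAME
        if (xmlBody != null) {
            builder.header("Content-Type", "text/xml; charset=UTF-8"); // assumption of this sketch
        }
        return builder.build();
    }
}
```

A request built this way can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), which returns the status code and the response body.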

74.2.3.4. How does the eXo JCR WebDAV server treat content encoding?

OS clients (Windows, Linux, etc.) do not set an encoding in a request. But the eXo JCR WebDAV server looks for an encoding in the Content-Type header and sets it as jcr:encoding. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html, section 14.17 Content-Type, e.g. Content-Type: text/html; charset=ISO-8859-4. So if a client sets the Content-Type header, e.g. JS code from a page, it will work for a text file as expected.

If a WebDAV request doesn't contain a content encoding, it is possible to write a dedicated action in a customer application. The action would set jcr:encoding using its own logic, e.g. based on IP or user preferences.
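As an illustration of reading the encoding from a Content-Type header, here is a minimal plain-Java sketch. It is not the server's code: the class ContentTypeCharset and the method charsetOf are hypothetical names, and a custom action like the one described above could fall back to its own logic whenever the method returns null.

```java
/** Hypothetical helper that extracts the charset parameter from a Content-Type value. */
final class ContentTypeCharset {

    /** Returns the charset named in the header value, or null if none is present. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {           // parts[0] is the media type itself
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Parameter values may be quoted, e.g. charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

For the header value text/html; charset=ISO-8859-4 this returns ISO-8859-4, while a bare text/plain yields null.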