Disambiguation API

Introduction

As the design page suggests we have a problem with uniquely identifying resources in the search results in a user friendly way (i.e. not by id but rather by context). On various places in the UI different approaches were taken to mitigate this or the problem was just ignored.

Also on different UI pages, different contextual data (apart from the name, type and parent) need to be displayed according to the purpose of each page.

So in another words we need a mechanism of updating search results of unknown "format" with disambiguation data in such a way that would require minimal updates to the existing APIs and UI pages.

Solution

One way of doing things was to update each JPQL/native query we do to contain the disambiguating data and update all the SLSBs and UIBeans to use those instead. The problem with this approach is that the queries are usually used in a number of situations not all of which need disambiguation data, so we would have to "fork" the queries, the SLSB methods using them and update the users of the original method to use the new alternative if needed. Another problem is that of the legacy Struts-based UI that we have conserved and don't want to touch unless absolutely necessary. This would need to be updated in non-trivial ways to support the new data format.

The other way of approaching this is to separate out the disambiguation into a separate call that would operate on original result list. This way, the only places we need to update are the UIBeans and associated pages. The rest of the API stays the same as it was, the legacy UI can stay there, no breakage in sight. The price for this is two calls into the EJB layer instead of just one and performing 3 or 4 SQL queries instead of just 1 or 2 (the data and the count query) per search.

API

The ResourceManagerLocal SLSB has a new method

    /**
     * Given a list of results, this method produces an object 
     * decorates the provided original results with data needed 
     * to disambiguate the results with respect to resource names, 
     * their types and ancestory.
     * <p>
     * The disambiguation result contains information on what types 
     * of information are needed to make the resources in the original 
     * result unambiguous and contains the decorated original data in 
     * the same order as the supplied result list.
     *
     * @see ResourceNamesDisambiguationResult
     *
     * @param <T> the type of the result elements
     * @param results the results to disambiguate
     * @param alwayIncludeParent if true, the parent disambiguation 
     *        will always be included in the result even if the results 
     *        wouldn't have to be disambiguated using parents.
     * @param resourceIdExtractor an object able to extract resource 
     *        id from an instance of type parameter.
     * @return the disambiguation result or null on error
     */
    ResourceNamesDisambiguationResult<T> disambiguate(List<T> results, 
                                                      boolean alwaysIncludeParent, 
                                                      IntExtractor<T> resourceIdExtractor);

This method, as one would hope, performs the disambiguation. The result list contains objects of any type and the resourceIdExtractor is used to extract a resource id out from those objects. The resource id is needed in the disambiguation procedure and since we have to support varied types of results (as mentioned in the introduction) the disambiguation method needs to be told how to work with those. It would be very cumbersome to define an overload of this method for every conceivable type of search results that need disambiguated.

Each "user" of the disambiguate method needs to supply an instance of an implementation of the IntExtractor<T> interface. This is a simple one-method interface perfectly implemented using an anonymous inner class in a somewhat functional style of programming.

The disambiguation result consists of the following:

booleans specifying whether the parent names, type names or plugin names are needed to make the results as a whole unique (the plugin name might be needed if there are two distinct types with the same name).
the decorated results.

The getResolution() method of the disambiguation result returns a List<DisambiguationReport<T>>. The disambiguation report contains the appropriate element from the original result list along with the type and plugin name of the corresponding resource and a list of parents that are needed to disambiguate the element in the result list (the parents are actually not the full blown resource objects but rather only an id and name "flyweight" representation of them).

Application in the UI

So now we have a way of disambiguating the search results and we want to update the UI to show them.

Most of the results in the RHQ GUI are represented as PagedListDataModel<T> that is the RHQ way of paging through search results. We can take advantage of that and extend this class to automatically provide the disambiguation functionality (to reduce the amount of work needed in the UIBeans themselves):

/**
 * This is an extension to the {@link PagedListDataModel} that 
 * automatically performs the disambiguation of the resource 
 * names contained in the pages of fetched data.
 * <p>
 * This class implements the {@link PagedListDataModel#fetchPage(PageControl)} 
 * method and defers the actual loading of the data to a new 
 * {@link #fetchDataForPage(PageControl)} method. The result of that call
 * is supplied to the {@link ResourceManagerLocal#disambiguate(List, boolean, IntExtractor)}
 * method and the disambiguated results are then returned from the 
 * {@link #fetchPage(PageControl)} method.
 *
 * @author Lukas Krejci
 */
public abstract class ResourceNameDisambiguatingPagedListDataModel<T> extends
    PagedListDataModel<DisambiguationReport<T>> {
    ...
}

The UIBeans usually have a getDataModel() method providing data to the results table. This method returns an instance of a private class of the UIBean that fetches the actual data (this class extends from the PagedListDataModel<T>). So all we need to do is to make that class to inherit from the new ResourceNameDisambiguatingPagedListDataModel<T> and modify it slightly to conform to that new super class (i.e. suggest in the constructor whether to alwaysLoadParents, change the name of the fetchPage method to fetchDataForPage and implement the getResourceIdExtractor() method).

With that the UIBean is converted.

The XHTML page using that UIBean to show the results needs to be updated to reflect the changed data types and to actually show the disambiguated parents.

Suppose that the rich:dataTable showing the results stores the current row in a variable called item.

Inside the table we need to change references to properties of the item from #{item.property} to #{item.original.property}. To reference the same data as before the change.
To show the ancestry of the resource, one needs to insert the following rich:column:

<rich:column>
    <f:facet name="header">
        <h:outputText value="Parent"/>
    </f:facet>
    <onc:resourcePartialLineage parents="#{item.parents}" renderLinks="true"/>
</rich:column>

The important thing to note here is that the implementation of the disambiguation disallows the results to be sorted by the ancestry, so you cannot define any sorting on that column.
The new onc:resourcePartialLineage component takes care of the rendering of the parents. The renderLinks attribute defines whether to render the individual parents as links to their detail pages or just in clear text.

Based on the actual disambiguation results it is possible to render the columns in the results table conditionally.
To be able to do that, you need to first store the data model in a variable like this:

            <ui:param name="model" value="#{MyUIBean.dataModel}"/>

use it in the rich:dataTable by specifying it as its value (value="#{model}") and then you can do:

<rich:column rendered="#{model.currentPageNeedsTypeResolution}">
    <f:facet name="header">
        <h:outputText value="Resource Type">
    </f:facet>
    <h:outputText value="#{item.resourceType}"/>
</rich:column>

RHQ4 Changes new!

There remains a need for disambiguation (hereafter D12N) in the new GWT-based GUI for RHQ4. The RHQ3 solution has some disadvantages that we wanted to address as we move forward, given that we will now need to work D12N into the new GUI from scratch.

Although the RHQ3 design is flexible in the varying D12N it can perform, it has the following disadvantages:

It requires additional database round trips to fetch the required information.
It performs a complicated and expensive JOIN query to determine the ancestry.
It has a max-child-depth restriction.
Repeated work for the same information.
Query time increases with Inventory size.
Processing the D12N report can itself be somewhat complicated.

Additionally, in the GWT-based RHQ4 interface the additional database calls adds more complexity to the async services.

The new approach - Pre-Computed Ancestry

The solution is to take advantage of the fact that resource ancestry rarely changes and is already established in a top-down fashion. So, determining the ancestry in advance and keeping it persisted allows it to be returned along with the Resource entity itself.

How it Works

Basically, the ancestry is determined and stored at resource-creation time. It is an encoded value define recursively as follows:

resourceAncestry=parentResourceResourceTypeId_:_parentResourceId_:_parentResourceName_::_parentResourceAncestry

NOTE: platform resources have no parent and therefore have a null ancestry.

Because parent resources are always inventoried prior to children, we can assume the parent's ancestry is already set. The value is stored as a varchar2(4000), providing worst-case space for seven levels of parentage (which is eight levels of hierarchy). Of course resource names rarely come close to the max-length of 500 so in reality there really is no practical depth limit.

For cases where Resource entities are returned from the original query (almost always, especially in the RHQ4 interface and when using Criteria fetch) this approach eliminates additional db-round trips. The only exception being if you want to display the type ancestry and the necessary ResourceType entities have yet to be cached.

Since the resource names are stores verbatim a displayable name-ancestry can be created very quickly. And it can include links to the resource views utilizing the provided resource ids. Given a Set of result resources it is also possible (although probably not necessary) to perform set-wise operations to trim the ancestry of repeated ancestry (for example, if all resources have the same platform, eliminate it from all the display strings).

Updating the Ancestry

There is a minimal cost associated with maintaining the pre-computed ancestry in the case of ancestry change. Ancestry can change in the following situations:

Resource name change.
If a resource has a name change (typically user-initiated) then all of its child lineage will need their ancestry updated.
Resource type reassignment.
During a plugin update a resource type could have its location in the hierarchy change. In this situation resources of the changed type, and their child lineage will need their ancestry updated.

Neither of these is a common occurrence and meta-data update can only occur at plugin upgrade-time. And update time should not be a problem, even for large inventories.

RHQ3 Upgrade

For RHQ3 to RHQ4 upgrades the legacy inventory will need to be updated with pre-computed ancestry values. This one-time process will be performed as an upgrade JavaTask.

RHQ4 Approach

This is a current design discussion.

Current Design (subject to change)

Thanks for the input so far. What I'm hearing:

The rhq3 solution favored the ability to disambiguate given the on-screen information.
There are several corner-cases that make a single display format difficult.
The rhq3 solution can be difficult to parse and/or understand given lack of visual uniformity (the trade-off for putting it all on screen).
Try and utilize smargwt as best as possible.

So, after playing with this stuff for a bit I've been looking at two different formats for the Resource/Name column in various views. In the Inventory Views we have just the resource name and the ancestry resource names (note that here we also have Plugin and Type columns but that is more the exception than the rule):

images/author/download/attachments/73139364/delete.jpg

And in other places, like portlets, we have a more verbose resource field:

images/author/download/attachments/73139364/delete-1.jpg

Currently, in both views the Ancestry provides "hover" information that gives type info to supplement the resource name info:

images/author/download/attachments/73139364/delete-2.jpg

Note that there has been no attempt made in rhq4 to mimic the contextual, varying display, approach used in rhq3. At the moment the column formats for name and ancestry are fixed.

I think I'm in favor of the following approach:

The Name/Resource column should be consistently named. For consistency throughout the UI I would recommend 'Resource'. 'Name' is not sufficient as resource information is often embedded in views of a different contxt, like in the portlet above. Also 'Resource Ancestry' makes more sense than 'Name Ancestry'.
On screen we limit the 'Resource' column value to the resource name (like shown above for the inventory view), and 'Ancestry' to the resource name ancestry (as shown above). This provides a consistent, clean viewing experience that in many cases will provide enough D12N for the user, in the given context. Since all of the values are links the fonts and colors are uniform.
Provide Hover information for both Resource and Ancestry values. This is more than what we have at the moment, which is just for Ancestry. Although expanding rows are nice, we use that elsewhere in the GUI for different purposes. Typically to provide detail about the above row (like audit info) as opposed to a more verbose rendition of the same. Also, it requires a click which is a bit more work than hover for performing D12N between rows.
For the Resource I think the hover should contain the Plugin, Type and Resource Name. A format like seen above, simple, unchanged for different locales, maybe:
- PluginName TypeName ResourceName
- PluginName TypeName ResourceName
- PluginName / TypeName ResourceName
- PluginName, TypeName ResourceName
- [PluginName] TypeName Resource Name
For the Ancestry things are a little trickier. I don't love the current hover format, the staggered rows are hard to parse. But the separation of name and type into two hierarchies is fairly clear. Additional ideas:
- Single line (could get very long) :
  - [RHQAgent] RHQ Agent JVM JVM > [RHQAgent] RHQ Agent RHQ Agent > [Platforms] Windows jshaughn
- I'm a bit afraid of a tall, tree-like format, but it may work:
  - [Platforms] Windows jshaughn
  - > [RHQAgent] RHQ Agent RHQ Agent
  - > [RHQAgent] RHQ Agent JVM JVM
- or maybe even a table format (if I could figue out how to format that so the columns are even):
```
RHQAgent  RHQ Agent JVM  JVM
RHQAgent  RHQ Agent      RHQ Agent
Platforms Windows        jshaughn
```
For inventory views, keep Plugin and Type columns. It allows for more sorting options and can provide D12N help. I suggest actually we reorder the columns to be Ancestry, Plugin, Type, Description. I think this is most useful for D12N, description is generally not looked at, I think. Also, in general display plugin to the left of type. I think it's a bit clearer to prefix the type with its defining plugin. Although verbose, since plugins can and do share type names, I think we have to keep plugin name in the D12N display.
In certain views, like selectors, there is only one column, In this case the hover should combine the Resource and Ancestry info.

So, the approach is basically to favor a clean, uniform look over immediate coverage of all D12N scenarios. And to use hover for more verbosity when required.

Coding Tips

This section is for any tips for usage of ancestry strings in RHQ4.

AncestryUtil class

This class holds any general purpose methods for processing ancestry strings. Typically decoding options or possibly trim utilities.

ResourceDatasource and ResourceCompositeDatasource

These are the main datasource classes fetching Resource entities and will embed logic to provide decoded ancestry strings for display.

Performance

Some minor befor/after performance tests have been performed on the ProblemResourcesPortlet. This portlet had already been updated with the rhq3 DisambiguationReport approach. A comparison and analysis follows:

The test was simple and performed with a small inventory of 319 resources, all DOWN, and therefore all problem resources.

D12N approach	user	Avg Time spent in getProblemResources (ms)
rhq3	admin	77
rhq3	non-admin	98
rhq4	admin	22
rhq4	non-admin	42

Analyisis

performance-wise at the RPC level we see a 2.3x to 3.5x speedup in rhq4.
non-admin is slower do to permission level joining that both approaches must perform.
not all work is at the RPC level but the expectation is that non-rpc work is roughly equivalent. Both approaches perform client-side processing of the returned data to generate the displayable content.
the major gain for rhq4 is the omission of the second db call, which performs the disambiguation query and returns the DisambiguationReport.
the expectation is that rhq3 performance will degrade as inventory grows whereas rhq4 should remain relative to the number of returned rows.

JBoss Community Archive (Read Only)

RHQ 4.9