JBoss Community Archive (Read Only)

RHQ 4.4.0

This is the RHQ 4.4 release. It was released on May 9th, 2012.

Upgrade note

If you are upgrading from RHQ 4.2, you must first make a manual change to your database. Apply this change only when upgrading from RHQ 4.2, not from earlier versions. Execute the following SQL to update the schema version from 2.114 to 2.115:

update rhq_system_config
   set property_value='2.115', default_property_value='2.115'
 where property_key='DB_SCHEMA_VERSION';

After this update, proceed with the upgrade normally.
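The manual change above can be guarded so that it only takes effect when the schema version really is 2.114. The following is a minimal sketch, not part of the RHQ tooling: it uses an in-memory SQLite database as a stand-in (RHQ itself runs against a real database server), the three-column table layout is a simplification, and the helper names `make_demo_db` and `bump_schema_version` are hypothetical.

```python
import sqlite3


def make_demo_db(version):
    """Create a stand-in rhq_system_config table (simplified, assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rhq_system_config ("
        "property_key TEXT PRIMARY KEY, "
        "property_value TEXT, "
        "default_property_value TEXT)"
    )
    conn.execute(
        "INSERT INTO rhq_system_config VALUES ('DB_SCHEMA_VERSION', ?, ?)",
        (version, version),
    )
    conn.commit()
    return conn


def bump_schema_version(conn, expected="2.114", target="2.115"):
    """Set DB_SCHEMA_VERSION to `target` only if it currently equals `expected`.

    Returns True if exactly one row was changed, False otherwise.
    """
    cur = conn.execute(
        "UPDATE rhq_system_config "
        "SET property_value = ?, default_property_value = ? "
        "WHERE property_key = 'DB_SCHEMA_VERSION' AND property_value = ?",
        (target, target, expected),
    )
    conn.commit()
    return cur.rowcount == 1


def current_version(conn):
    row = conn.execute(
        "SELECT property_value FROM rhq_system_config "
        "WHERE property_key = 'DB_SCHEMA_VERSION'"
    ).fetchone()
    return row[0]
```

The extra `property_value = ?` condition is the only difference from the statement shown above; it leaves databases from other versions untouched.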

Please note
  • The autodiscovery portlet is no longer included on the default global dashboard. To view the discovery queue, go to Inventory->Discovery Queue. Alternatively, edit the global dashboard and add the portlet.

  • Some browsers (most likely WebKit-based ones such as Safari and Chrome) will not automatically forward you from the installer to the login page. If that happens, manually switch to

    http://localhost:7080/coregui/

  • RHQ 4.4 requires Java 6. It may work on Java 7, but this has not yet been extensively tested (feedback is welcome).

New Features (since RHQ 4.3)

Availability Improvements

There have been several changes made to enhance availability collection, reporting and alerting. For more on all of the changes to availability see: Availability Improvements.

The UNKNOWN and DISABLED Availability Types

In addition to the DOWN and UP availability types, RHQ now has UNKNOWN and DISABLED.

UNKNOWN

The UNKNOWN availability allows RHQ to better represent resources whose current availability is not known. The best example of this is when an agent is down. Its managed resources may be up or down; the server cannot tell.

DISABLED

RHQ now allows users to mark resources DISABLED. This availability type is primarily assigned by users, not by agent reporting. DISABLED resources ignore availability reported by the agent. This is useful for planned outages, or for resources that are expected to be down or have been taken down administratively. Since DISABLED resources are not DOWN, they are omitted from dashboard portlets and availability alerting scenarios.

Behavioral Change: Agent Down Handling

When an agent is down, its platform resource is marked DOWN, but all of the platform's children are now marked UNKNOWN to reflect that the RHQ server is not receiving updates for them. In the past the children were also marked DOWN. Note that DISABLED children are left as DISABLED (see more below).
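The agent-down handling described above can be illustrated with a small sketch. This is not RHQ server code: the `Resource` class and `backfill_agent_down` function are hypothetical, and applying the rule to every descendant (not only direct children) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical stand-in for an inventoried resource."""
    name: str
    availability: str = "UP"
    children: List["Resource"] = field(default_factory=list)


def backfill_agent_down(platform):
    """Mark the platform DOWN and its descendants UNKNOWN, keeping DISABLED as-is."""
    platform.availability = "DOWN"
    stack = list(platform.children)
    while stack:
        res = stack.pop()
        if res.availability != "DISABLED":
            res.availability = "UNKNOWN"
        # Assumption: the rule applies to the whole subtree under the platform.
        stack.extend(res.children)
```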

Behavioral Change: Alerting

Existing Goes DOWN alert conditions will not fire when a resource is set to UNKNOWN, so the new availability assignment for down agents can affect existing alerting. The intent is to be more accurate and avoid false positives. If the prior behavior is desired, update the alert conditions to Goes NOT UP, which is a new option.

Behavioral Change: Group Availability

The introduction of new availability types forced changes to the way group availability is determined. Group availability is now determined with the following algorithm, evaluated top to bottom in the table below:

Member Availability      Group Availability
Empty Group              Grey / EMPTY
All DOWN                 Red / DOWN
Some DOWN or UNKNOWN     Yellow / WARN
Some DISABLED            Orange / DISABLED
All UP                   Green / UP
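Read top to bottom, the group availability rules amount to a first-match evaluation. The following is a minimal sketch of that reading; the function name `group_availability` and the use of plain strings for availability types are illustrative assumptions, not the RHQ implementation.

```python
def group_availability(members):
    """Return the group availability for a list of member availability strings.

    Rules are checked in order and the first match wins.
    """
    if not members:
        return "EMPTY"       # Grey
    if all(m == "DOWN" for m in members):
        return "DOWN"        # Red
    if any(m in ("DOWN", "UNKNOWN") for m in members):
        return "WARN"        # Yellow
    if any(m == "DISABLED" for m in members):
        return "DISABLED"    # Orange
    return "UP"              # Green
```

Because the order matters, a group with one DOWN and one DISABLED member evaluates to WARN rather than DISABLED.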

Behavioral Change: Remote API

The remote API method ResourceManagerRemote.getLiveResourceAvailability() no longer returns null when the availability is unknown. It now properly returns AvailabilityType.UNKNOWN. This may affect existing remote clients or CLI scripts.

Goes DOWN Alerting

Note that 'Goes DOWN' alert conditions remain unchanged and are unaffected by the upgrade. They are satisfied as before, when the resource's availability changes from NOT DOWN to DOWN. Resources moving to UNKNOWN or DISABLED will not meet the condition. There is now a 'Goes NOT UP' operator that matches when the state moves from UP to any other availability type.
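The difference between the two operators can be expressed as two predicates over an availability change. This is an illustrative sketch only; the function names are hypothetical and availability types are represented as plain strings.

```python
def goes_down(old, new):
    """'Goes DOWN': the availability changes from anything but DOWN to DOWN."""
    return old != "DOWN" and new == "DOWN"


def goes_not_up(old, new):
    """'Goes NOT UP': the availability changes from UP to any other type."""
    return old == "UP" and new != "UP"
```

For a change from UP to UNKNOWN, `goes_down` does not match while `goes_not_up` does, which is why switching operators restores the earlier alerting behavior for down agents.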

Availability Duration Alerting

In addition to Availability Change alert conditions, it is now possible to create Availability Duration conditions. 'Stays DOWN for Xm' matches if a resource goes from UP to DOWN and stays DOWN for X minutes. 'Stays NOT UP' is similar, but applies to changes from UP to any other availability.
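A duration condition combines a change with an elapsed time. The sketch below assumes a hypothetical history format: a chronological list of `(timestamp_seconds, availability)` entries that records changes only, so the last entry is the current state. It is not how RHQ stores or evaluates availability.

```python
def _stays(history, now, minutes, matches_target):
    """Shared check: last change was UP -> target, and it has held long enough."""
    if len(history) < 2:
        return False
    (_, previous), (changed_at, current) = history[-2], history[-1]
    if previous != "UP" or not matches_target(current):
        return False
    return now - changed_at >= minutes * 60


def stays_down(history, now, minutes):
    """'Stays DOWN for Xm': went from UP to DOWN and is still DOWN after X minutes."""
    return _stays(history, now, minutes, lambda a: a == "DOWN")


def stays_not_up(history, now, minutes):
    """'Stays NOT UP for Xm': went from UP to any other type and has not returned."""
    return _stays(history, now, minutes, lambda a: a != "UP")
```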

Prioritized Availability Collection

This is a major change. Previously, all resources were checked on every availability scan, by default every five minutes. This could cause 'peak and valley' issues with CPU and/or memory spikes. It also provided no way to favor checking of critical resources and lower the priority of the many non-critical, service-level resources. With the changes:

  • RHQ provides resource-level granularity for collecting availability information.

  • Every non-platform resource type has a built-in metric called "AvailabilityType".

    • The value is in seconds

Upgrade note

The new Availability metric schedule will be added automatically to all types in updated plugins. So, for upgrades, new versions (updated MD5) of current plugins must be deployed. Custom plugins must be rebuilt and redeployed to get the new metric schedule.

Behavioral Change: Availability Check Intervals

Previously, an availability check was performed on all resources at a 5 minute interval, and all resources were checked in one pass. Now, availability checking is driven by the Availability metric schedule. If no interval is set in the plugin descriptor, the resource type's default availability check interval is based on its category:

  • Server: 60s (1 minute)

  • Service: 600s (10 minutes)

  • Platform: not applicable. Platform availability is determined by agent activity, not by getAvailability() calls.

This means that Availability collection intervals can be set, like other metric schedules, at the Template, Group and Resource levels, and can be changed at the user's discretion. If the metric is disabled, the affected resources defer to their parent's availability type.
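The default intervals and the parent fallback can be summarized in a short sketch. The names below (`DEFAULT_INTERVAL_SECONDS`, `default_check_interval`, `effective_availability`) are hypothetical and only model the rules stated above; they are not part of the plugin API.

```python
# Category defaults from the list above, in seconds.
DEFAULT_INTERVAL_SECONDS = {"SERVER": 60, "SERVICE": 600}


def default_check_interval(category, descriptor_interval=None):
    """Return the availability check interval in seconds, or None for platforms.

    `descriptor_interval` models a value set in the plugin descriptor.
    """
    if category == "PLATFORM":
        return None  # determined by agent activity, not by scheduled checks
    if descriptor_interval is not None:
        return descriptor_interval
    return DEFAULT_INTERVAL_SECONDS[category]


def effective_availability(own, parent, metric_enabled=True):
    """With the availability metric disabled, defer to the parent's availability."""
    return own if metric_enabled else parent
```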

Behavioral Change: Agent Avail Prompt Command

The Avail prompt command generated either a changes-only or a full report, and that is still true. However, it used to perform an avail check on every resource. With the introduction of prioritized availability checking this is no longer the case: the avail check is performed only if there is no current availability for the resource, or if its scheduled time has passed. There is a new option, --force, that can be specified to force the availability checks. Note that this option will increase execution time.
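The decision the avail command now makes for each resource can be sketched as a single predicate. The function and parameter names are hypothetical; only the three conditions come from the description above.

```python
def should_check(current_availability, next_due, now, force=False):
    """Decide whether a resource's availability is checked in this run.

    current_availability: last known availability, or None if there is none.
    next_due / now: times in seconds on the same clock.
    force: models the --force option.
    """
    if force:
        return True
    if current_availability is None:
        return True
    return now >= next_due
```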

For best performance, it is recommended that the collection interval for resources of little interest be set to a large value, or that collection be disabled for them.

Behavioral Change: Availability Check Approach

Availability checking now happens incrementally. The availability job runs at 30 second intervals and not every resource is checked on each pass. Instead, checking is spread out, still respecting the desired intervals as much as possible, but in a fashion that avoids the 'peak and valley' issues of the past.
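To see why spreading checks out avoids the 'peak and valley' pattern, consider the following toy simulation. The round-robin offset scheme is an assumption chosen for illustration; the source does not describe how the agent actually distributes checks, and both function names are hypothetical.

```python
def staggered_offsets(count, interval, pass_seconds=30):
    """Assign each resource a first-check offset, one slot per pass, round-robin."""
    slots = max(1, interval // pass_seconds)
    return [(i % slots) * pass_seconds for i in range(count)]


def checks_per_pass(offsets, interval, horizon, pass_seconds=30):
    """Count how many resources are due on each pass within `horizon` seconds."""
    counts = []
    for t in range(0, horizon, pass_seconds):
        due = sum(1 for off in offsets if t >= off and (t - off) % interval == 0)
        counts.append(due)
    return counts
```

With 20 services on a 600 second interval, `checks_per_pass(staggered_offsets(20, 600), 600, 600)` yields one check on each of the 20 passes, whereas `checks_per_pass([0] * 20, 600, 600)` puts all 20 checks into the first pass.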

Other Behavioral Changes

Behavioral Change: Agent Max Quiet Time

Back-filling of an agent's platform resources was previously performed after a 15 minute period of no communication from the agent. This period is set by the AGENT_MAX_QUIET_TIME_ALLOWED system setting. It applied both to an agent shut down gracefully and to one that went down unexpectedly. The upgrade now sets this value to 5 minutes; the reduction is possible because of architectural improvements. In addition, agents that are shut down gracefully are now back-filled immediately.
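The two triggers for back-filling can be written as one check. This is a sketch under assumptions: the constant and function names are hypothetical, and whether the real comparison is inclusive at exactly 5 minutes is not stated in the source.

```python
# New value of the AGENT_MAX_QUIET_TIME_ALLOWED setting after upgrade, in seconds.
AGENT_MAX_QUIET_TIME_SECONDS = 5 * 60


def should_backfill(seconds_since_last_contact, graceful_shutdown=False,
                    max_quiet=AGENT_MAX_QUIET_TIME_SECONDS):
    """Back-fill immediately on graceful shutdown, otherwise after the quiet time."""
    if graceful_shutdown:
        return True
    # Assumption: inclusive comparison at the boundary.
    return seconds_since_last_contact >= max_quiet
```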

Behavioral Change: Operations that Affect Availability

Operations can now request an immediate availability check after completion. All of the RHQ plugins have been updated to do so for any Start/Stop/Restart operations. As a result, availability should typically be updated within 60s of the operation completing, and the change is reflected in the UI once it is refreshed.

REST API

The REST API has been enhanced. This API is included to start the effort of building a REST interface into RHQ, so that the server is more accessible from other tools and languages.

This API IS NOT STABLE. Do not rely on it. IT WILL CHANGE.

To access the API, go to http://localhost:7080/rest/. See also Design-REST and this blog post.

  • RESTEasy has been updated to version 2.3.2.Final

GUI

  • Updated Japanese translations by Fusayuki Minamoto

  • Initial Russian translations by Denis Krusko

  • The reports under /coregui/#Reports can now be exported in CSV format.

    • Several of the reports offer filtering capabilities to generate a fixed data set.

Server

Plugins

  • [bug 805987] The platform plugin now reports metrics for the actual free and actual used system memory.

  • The JBoss AS 7 plugin has been renamed. If you install RHQ 4.4 into an existing RHQ 4.3 database and had the as7 plugin installed before, you should remove it before the upgrade.

See also

There is now an RHQ samples project on GitHub that collects additional sample code that works together with RHQ. It also contains examples of accessing the REST API from programming languages other than Java.

Known Issues

  • The embedded agent may fail to find the server or register with it. This means that it will not be able to discover and manage any resources. Please use an external agent. BZ 819766

  • In the group display, resource counts may look wrong when you have resources sitting in the autodiscovery queue. BZ 819897

Translations

The GWT part of the UI has been partially translated into German, Portuguese, Japanese, Chinese and Russian. The language should be selected automatically depending on your browser settings. You can explicitly access other translations by appending a locale specifier to the URL. For example, to select the German translation, append ?locale=de to the base URL, e.g. http://localhost:7080/coregui/?locale=de.

Supported locales are:

  • zh for Chinese

  • de for German

  • ja for Japanese

  • pt for Portuguese

  • ru for Russian
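Building such a URL from the supported locales listed above can be sketched as follows. The helper name `coregui_url` is hypothetical, and the default base URL is the example address used in these notes.

```python
from urllib.parse import urlencode

# Locale codes listed above.
SUPPORTED_LOCALES = {"zh", "de", "ja", "pt", "ru"}


def coregui_url(base="http://localhost:7080/coregui/", locale=None):
    """Return the UI URL, optionally forcing one of the supported locales."""
    if locale is None:
        return base  # let the browser settings decide
    if locale not in SUPPORTED_LOCALES:
        raise ValueError("unsupported locale: " + locale)
    return base + "?" + urlencode({"locale": locale})
```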

Please ping us if you want to help translate the UI into your language. Translations are done via the translations project on GitHub, which also has some instructions on how to start.

Bug reporting

Please report all bugs you find in Bugzilla. If you find a bug that is already recorded in the list above, please leave a comment on it, especially if it needs special steps to reproduce.

List of resolved Bugzilla entries

Please consult Bugzilla with a target release of RHQ 4.4.0 for a list of resolved issues.

Download

You can download the release here.

JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-13 07:55:33 UTC, last content change 2013-09-18 19:40:20 UTC.