JBoss.org Community Documentation

Chapter 3. Visualising the Runtime Governance Information

3.1. Accessing the Runtime Governance UI
3.2. Services
3.3. Situations
3.3.1. Situations List
3.3.2. Situation Details
3.4. Analytics
3.4.1. Dashboard
3.4.2. Changing the Time Frame and Refresh Cycle
3.4.3. Filtering by selection
3.4.4. Segmenting information by query
3.4.5. Adhoc queries
3.4.6. Customizing and sharing the Dashboard

This section describes how to use the Runtime Governance User Interface (UI).

To access the Runtime Governance UI, start the server and then open the URL: <host>/rtgov-ui
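
As a minimal sketch, the address can be assembled from the server's base URL and the rtgov-ui context path. The helper name `rtgov_ui_url` and the base URL `http://localhost:8080` are assumptions made for illustration only; substitute the host and port of your own server.

```python
def rtgov_ui_url(host: str) -> str:
    """Return the Runtime Governance UI address for a given server base URL."""
    # Strip any trailing slash so the result never contains '//' before the path.
    return host.rstrip("/") + "/rtgov-ui"


# Assumed local server address; replace with your own host and port.
print(rtgov_ui_url("http://localhost:8080"))
```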


Once displayed, the page will request a username and password. After you have logged in successfully, you will be presented with the top-level dashboard:


The dashboard provides access to three types of information related to runtime governance. These will be discussed in more detail in the following sections.

The Services page lists the services that have been deployed to a service container (e.g. SwitchYard) and are being monitored by RTGov.


The list shows the service name, the application in which it is contained (if any), the external interface(s) it implements and, finally, the bindings through which it can be accessed.

Note

For SwitchYard services, this list will only contain the public (promoted) services. However, activity information will be collected for internal component services as well.

Selecting a service name navigates to the details page:


This page shows high-level information about the service and, where appropriate, any promoted references it has to other external services.

The Dependencies tab can be used to view dynamic dependency information about the service. This information is based on a short-term rolling window, and will therefore only show the relationships associated with recent invocations:


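The effect of a rolling window on the dependency view can be illustrated with a small conceptual model. This is only a sketch: the window length, the tuple layout, the function name `recent_dependencies` and the `LogisticsService` name are hypothetical, and the text above does not describe how RTGov implements the window internally.

```python
from datetime import datetime, timedelta


def recent_dependencies(invocations, now, window):
    """Return the (caller, callee) pairs seen within the rolling window."""
    cutoff = now - window
    # Keep only relationships whose invocation time falls inside the window.
    return {(caller, callee) for caller, callee, when in invocations if when >= cutoff}


now = datetime(2024, 1, 1, 12, 0, 0)
invocations = [
    ("OrderService", "InventoryService", now - timedelta(minutes=1)),
    # Hypothetical older invocation that has dropped out of the window.
    ("OrderService", "LogisticsService", now - timedelta(hours=2)),
]
print(recent_dependencies(invocations, now, timedelta(minutes=5)))
```

In this model a relationship disappears from the view once its most recent invocation is older than the window, which matches the behaviour described above.
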
This section shows the Situations that are reported when RTGov policies detect issues that need to be brought to the attention of users.


The left hand panel provides a variety of options for filtering the list of situations.

The list contains the following columns:

  • Severity - an icon to indicate how severe the situation is.
  • Type - identifies the nature of the situation (e.g. SLA Violation, Exception, etc.).
  • Status - where the situation is in its lifecycle (see further down for description of the lifecycle). The status Open represents all non-resolved states.
  • Subject - the subject of the situation, which will generally be a service type and operation, with optional fault name.
  • Timestamp - when it occurred.
  • Description - further details about the situation.
  • Action - show properties for the situation.

At the bottom left is a collapsed region containing controls for performing bulk actions.


These actions can operate either on the filtered situations (as shown here), or all situations. The actions themselves, from left to right, are resubmit (i.e. resubmit an associated message to the target service), export and remove.

When a new Situation occurs, if the user is already viewing the situations page, then a small notification will be displayed in the top right corner:


If this notification is expanded, it will list some of the details for the new Situations:


The user can either select one of the entries to navigate to its details, or alternatively use the refresh button to update the list.

To view the details associated with a Situation in the list, select its type field, which will navigate to the details page.


This page shows the details of the situation, including properties and context data.

The Call Trace tab shows the call stack associated with the business transaction, if appropriate context information has been recorded with the situation.


Selecting a node in the call trace displays further details in the right hand panel.

The optional Message tab is displayed if the Situation has an associated business message.


If the service and operation associated with the situation support resubmission of messages (for a SwitchYard service, this requires an SCA binding and a one-way operation), and the Situation is not in a RESOLVED state, then the user will be able to edit the message content and press the Resubmit button. This will result in resubmission information being displayed:


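The resubmission conditions for a SwitchYard service can be summarised as a simple predicate. The sketch below is illustrative only: the function name, its parameters and the `IN_PROGRESS` status value are hypothetical, and only the RESOLVED state is taken from the description above.

```python
def can_resubmit(has_sca_binding: bool, is_one_way: bool, status: str) -> bool:
    """Model of the resubmission condition described for SwitchYard services."""
    # Resubmission needs an SCA binding, a one-way operation,
    # and a situation that has not yet been resolved.
    return has_sca_binding and is_one_way and status != "RESOLVED"


# "IN_PROGRESS" is a hypothetical example of a non-resolved status.
print(can_resubmit(True, True, "IN_PROGRESS"))
print(can_resubmit(True, True, "RESOLVED"))
```
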
RTGov uses Elasticsearch to store the activity information, and Kibana to provide a dashboard for analysing that information.

The "out of the box" dashboard layout presents the following information:



The Kibana dashboard enables a user to filter the information being viewed by:

  • pressing the magnifying glass symbol associated with some information of interest (see action in the image below)
  • pressing the no-entry symbol associated with the information to be excluded (see action in the image above)
  • selecting the information of interest from a pie chart (e.g. selecting a fault, as shown in the image below)

As well as being able to focus on or exclude information based on the other graphs, the Documents table provides even finer-grained control over what is displayed. The following image shows how the fault value of itemnotfound could be used as a filter, instead of selecting it from the pie chart. More importantly, adhoc fields such as customer or productName could equally be used as the subject of the filter, if that information is recorded with the activity events (and therefore the response time data).


As filters are added to progressively refine the results being viewed, their details are listed in the "FILTERING" section at the top of the dashboard, as shown in the following image:


The first box identifies the initial time range used to display the data, which has been refined by the next box based on interactively selecting a region on the response time graph. The third box applies a filter to only show information related to the InventoryService service type, and finally the fourth box narrows the information further to show the subset of response time information associated with the itemnotfound fault.

Any of these filter criteria can be individually disabled (using the tick symbol) or cancelled (using the cross symbol).
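
The way successive filters narrow the results, and the difference between disabling and cancelling a filter, can be modelled conceptually as follows. This is a sketch over in-memory records, not a description of how Kibana or Elasticsearch evaluate filters; the `Filter` class, the `apply_filters` function and the sample record values are hypothetical, while the field names `serviceType` and `fault` follow the examples above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Filter:
    label: str
    predicate: Callable[[Record], bool]
    enabled: bool = True  # a disabled filter stays listed but is not applied


def apply_filters(records: List[Record], filters: List[Filter]) -> List[Record]:
    """Keep only the records accepted by every enabled filter."""
    active = [f for f in filters if f.enabled]
    return [r for r in records if all(f.predicate(r) for f in active)]


# Hypothetical sample data; timestamps are plain integers for brevity.
records = [
    {"timestamp": 10, "serviceType": "InventoryService", "fault": "itemnotfound"},
    {"timestamp": 20, "serviceType": "InventoryService", "fault": None},
    {"timestamp": 30, "serviceType": "OrderService", "fault": None},
]

filters = [
    Filter("time range", lambda r: 0 <= r["timestamp"] <= 25),
    Filter("service type", lambda r: r["serviceType"] == "InventoryService"),
    Filter("fault", lambda r: r["fault"] == "itemnotfound"),
]

print(len(apply_filters(records, filters)))  # all three filters applied
filters[2].enabled = False                   # disable the fault filter
print(len(apply_filters(records, filters)))  # only the first two applied
```

In this model, disabling a filter corresponds to clearing its `enabled` flag, whereas cancelling it corresponds to removing it from the list altogether.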

Filtering provides a useful way to narrow in on information of interest and view that data in the available graphs. However, it is sometimes more interesting to be able to compare different sets of results.

In the default dashboard all response time information is treated in the same way, and therefore not differentiated. If we want to segment the information based on various groupings, then we need to create what are called pinned queries. At the top of the dashboard, you will need to expand the blue "QUERY" region to find a data entry area. This can be used to enter adhoc queries to filter the results displayed in the dashboard (see following section).

However, for the purpose of comparing different sets of data, we leave the default entry blank and instead create one or more additional query fields by pressing the plus symbol present in the last entry field.

When an entry field has been created, enter an appropriate query. For example,

  • serviceType: "{urn:switchyard-quickstart-demo-orders:0.1.0}OrderService/InventoryService"

This query will identify response times associated with the InventoryService service type.

  • properties.customer: "Fred"

If the customer name has been associated with the reported activity events, then this query will identify the response time information associated with a particular customer.
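
Queries of this `field: "value"` form can also be generated programmatically, for example when preparing a set of segments to paste into the query fields. The helper below is a minimal sketch; the function name `field_query` is hypothetical, and the escaping of backslashes and double quotes assumes Lucene-style quoting of phrase values.

```python
def field_query(field: str, value: str) -> str:
    """Build a `field: "value"` query string in the form shown above."""
    # Escape backslashes and double quotes so the value stays one quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}: "{escaped}"'


print(field_query("properties.customer", "Fred"))
print(field_query(
    "serviceType",
    "{urn:switchyard-quickstart-demo-orders:0.1.0}OrderService/InventoryService",
))
```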

As shown in the following image, the colour coded segmented queries are reflected in the response time graph:


as well as the Distribution over time chart:


To change the label associated with a query, select the query's coloured dot and enter the label in the field, then press the close button:


It is also possible to temporarily disable a particular query, or change its colour, using this popup dialog.