JBoss Community Archive (Read Only)

RHQ 4.9

Design - Storage Node Information

Scope

This document covers the design for storage node management and reporting capabilities from the CLI and Administration UI.

Summary

There are currently several ways to update storage node configuration:

  1. Remote interface via storage node entities

  2. Remote interface via storage node resource configuration 

  3. Admin UI for storage nodes

  4. Regular UI for storage node resources 

  5. Updating the resource directly by editing the YAML or start scripts

  6. rhq-server.properties (not a major factor anymore, see email below)

Having this many update paths requires a set of design principles to keep the information synchronized and the Cassandra cluster in a good state.

Design

  1. The resource configuration will be the primary source of information

    1. Justification:

      1. If an update is done directly to the storage node server (which is not recommended), the update would be detected by the plugin and available via the resource

      2. If an update is done via the storage node entity (UI or remote interface), the update would need to be sent to the actual server via the resource

    2. All the updates to storage nodes will go through the plugin regardless of the origin

    3. No resource configuration should be stored on other entities; dependent entities should read the resource configuration and metrics on demand.

  2. The storage node entity will be just a proxy for the associated storage node resource

    1. A select set of metrics will be included in the storage node load composite; this set is to be expanded as needed

    2. A select set of settings will be available for updates

      1. These settings are important for the administration and health of the cluster (e.g., heap size, timeouts)

      2. The set will be expanded as needed; heap size will be the first

      3. This is a similar concept to storage node load composite but will be designed to handle updates of the underlying properties, rather than just report the values

      4. The update procedure for these settings could be complex:

        1. For example, update the resource configuration, shut down the storage node, and start the storage node

      5. The logic will need to be incorporated in a single operation at the server level

    3. Neither the metric values nor the configuration settings are to be stored on the storage node entity. The values are retrieved on demand when the user accesses the functionality, and configuration updates are passed to the storage node resource.

  3. There are three exceptions to these rules: address, native Cassandra port, and JMX port

    1. These settings are to be stored on the storage node entity because they are essential to establishing connectivity between the RHQ server and storage node entity, prior to managing the resource via the agent

    2. The way these settings get discovered is detailed in the email noted below for reference

    3. How these are updated:

      1. The native and JMX ports will be updated in the same manner as the rest of the proxy settings

      2. The address is updatable but is not updated on the resource

      3. The native port is not updatable individually, only at a cluster level

      4. The address and JMX port are the primary keys for the entity
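The proxy principle and the single server-level update operation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not RHQ code: `StorageNodeResource`, `FakeStorageNodeResource`, `StorageNodeProxy`, `updateHeapSize`, and the `heapSize` property key are all hypothetical names, the port numbers are arbitrary sample values, and the in-memory fake stands in for the resource managed by the plugin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the storage node resource managed by the plugin. */
interface StorageNodeResource {
    Map<String, String> readConfiguration();
    void updateConfiguration(Map<String, String> changes);
    void shutdown();
    void start();
}

/** In-memory fake used only to make the sketch runnable; it records each call. */
class FakeStorageNodeResource implements StorageNodeResource {
    final Map<String, String> config = new HashMap<>();
    final List<String> calls = new ArrayList<>();

    public Map<String, String> readConfiguration() {
        calls.add("read");
        return new HashMap<>(config);
    }

    public void updateConfiguration(Map<String, String> changes) {
        calls.add("update");
        config.putAll(changes);
    }

    public void shutdown() { calls.add("shutdown"); }

    public void start() { calls.add("start"); }
}

/** Storage node entity acting as a proxy for its resource. */
class StorageNodeProxy {
    // The three exceptions: stored on the entity to establish connectivity.
    final String address;
    final int nativePort;
    final int jmxPort;
    private final StorageNodeResource resource;

    StorageNodeProxy(String address, int nativePort, int jmxPort, StorageNodeResource resource) {
        this.address = address;
        this.nativePort = nativePort;
        this.jmxPort = jmxPort;
        this.resource = resource;
    }

    /** Reads the value from the resource on every call; nothing is stored on the entity. */
    String getHeapSize() {
        return resource.readConfiguration().get("heapSize");
    }

    /** One operation that bundles the steps: update configuration, shut down, start. */
    void updateHeapSize(String heapSize) {
        Map<String, String> changes = new HashMap<>();
        changes.put("heapSize", heapSize);
        resource.updateConfiguration(changes);
        resource.shutdown();
        resource.start();
    }
}

class StorageNodeProxyDemo {
    public static void main(String[] args) {
        FakeStorageNodeResource resource = new FakeStorageNodeResource();
        resource.config.put("heapSize", "512M");
        StorageNodeProxy node = new StorageNodeProxy("10.0.0.1", 9142, 7299, resource);

        node.updateHeapSize("1024M");
        System.out.println(node.getHeapSize()); // prints 1024M
        System.out.println(resource.calls);     // prints [update, shutdown, start, read]
    }
}
```

Because the proxy holds no copy of the heap size, a change made directly on the storage node would show up on the next read without any synchronization step, which is the motivation given for making the resource configuration the primary source.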

UI Design

References

Email from John Sanda:

We have been persisting storage nodes for the values specified in rhq.cassandra.seeds at server start up. As I mentioned during our call earlier, I removed the logic from the start up code and the seed nodes are now created and persisted by the installer. I want to explain the motivations for this change as well as what precipitated it. There was logic in StorageNodeManagerBean that compared storage node info from the db against values specified in rhq.cassandra.seeds. If you specified a different JMX port in the seeds property, that would override what was stored in the database. That is bad because it will result in the server failing to make JMX connections to the storage node, and thus the server will go into maintenance mode. Changing those connection parameters should only be done as a managed change (i.e., resource config update or resource operation). This effectively makes them read only after the storage nodes have been created.

What about adding new storage nodes? When new nodes are added, we schedule maintenance that needs to be run on each node in the cluster. We have logic to schedule the maintenance job when a node is committed into inventory. We had similar logic in the start up code to handle the scenario of new nodes being added via rhq.cassandra.seeds. That start up logic isn't needed because we already have it when the node is committed into inventory. There is no reason for the user to add new node info to rhq.cassandra.seeds. He still has to deploy the node with rhqctl install. Updating rhq.cassandra.seeds was an unnecessary, extra step.

What about rhq.cassandra.seeds when the user installs a second server? Since we already have a storage node in the db, the server installer can simply ignore rhq.cassandra.seeds. The installer only persists the seed nodes if the rhq_storage_node table is empty. The installer can get the storage node connection info from the database. Likewise, the storage installer for the second storage node will get settings for stuff like the CQL port from the database and apply those to the new storage node (This is yet to be implemented).

I spent a good bit of time thinking about this before I finally decided to commit. I do not see a good reason to deal with the rhq.cassandra.seeds property after the initial server installation. The change has the added benefit of simplifying server start up by removing some non-trivial code from StorageManagerBean.

- John
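The installer rule described in the email (persist the seed nodes only when the rhq_storage_node table is empty, and otherwise ignore rhq.cassandra.seeds) can be sketched as below. This is an illustrative assumption of how such a check could look, not the installer's actual code: `SeedNodeInstaller` and `persistSeedsIfEmpty` are hypothetical names, seed nodes are reduced to plain address strings, and a list stands in for the database table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the installer rule; a list stands in for rhq_storage_node. */
class SeedNodeInstaller {
    private final List<String> storageNodeTable;

    SeedNodeInstaller(List<String> storageNodeTable) {
        this.storageNodeTable = storageNodeTable;
    }

    /**
     * Persists the seeds only if no storage node has been stored yet.
     * Returns true if the seeds were written, false if they were ignored.
     */
    boolean persistSeedsIfEmpty(List<String> seeds) {
        if (!storageNodeTable.isEmpty()) {
            return false; // existing rows take precedence over the seeds property
        }
        storageNodeTable.addAll(seeds);
        return true;
    }
}

class SeedNodeInstallerDemo {
    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        SeedNodeInstaller installer = new SeedNodeInstaller(table);

        // First server install: the table is empty, so the seeds are persisted.
        System.out.println(installer.persistSeedsIfEmpty(Arrays.asList("10.0.0.1"))); // prints true
        // Second server install: the table already has a node, so the seeds are ignored.
        System.out.println(installer.persistSeedsIfEmpty(Arrays.asList("10.0.0.2"))); // prints false
        System.out.println(table); // prints [10.0.0.1]
    }
}
```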

JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-13 08:09:57 UTC, last content change 2013-09-18 19:41:51 UTC.