This document discusses deploying and undeploying storage nodes. Like the RHQ relational database, at least one storage node must be installed and running before the RHQ server can be installed. Once the server is installed, additional storage nodes can be deployed.
An RHQ Storage Node has to be installed and running prior to installing the RHQ Server. If you run rhqctl install or rhqctl upgrade on a machine that does not already have a storage node, and you do not specify a remote storage node, a new one will be deployed. The storage node will be installed on disk, configured, and started.
Shared cluster settings will be initialized when the first storage node is imported into inventory.
Storage nodes are automatically imported into inventory.
If you deploy additional storage nodes prior to server installation, you must ensure that the shared cluster settings are the same for each node. If the settings do not match for a subsequent node, an exception is thrown when the agent reports it to the server; consequently, that node is not imported into inventory.
TODO - Discuss shared, cluster settings
The deployment process consists of several steps or phases. The names of the phases correspond to a storage node's operation modes and are as follows:
INSTALL
ANNOUNCE
BOOTSTRAP
ADD_MAINTENANCE
The API for deployment is provided by
StorageNodeManagerBean.deployStorageNode(Subject subject, StorageNode storageNode)
deployStorageNode is exposed in both the local and remote APIs. It determines which phase to start the deployment in based on the storage node's mode. Deployment is not finished until each phase completes successfully, and the process is aborted if an error occurs. Each phase should be idempotent, which makes it possible to retry or resume a deployment. The following sections describe each phase in greater detail.
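Because each phase is idempotent, resuming a deployment amounts to re-running the phase the node last reached and then every phase after it. A minimal sketch of that resume logic (the class, method, and enum here are illustrative, not the actual StorageNodeManagerBean code):

```java
import java.util.Arrays;
import java.util.List;

public class DeploymentDriver {

    // The deployment phases, in the order they run.
    public enum Phase { INSTALL, ANNOUNCE, BOOTSTRAP, ADD_MAINTENANCE }

    /**
     * Phases that still need to run, given the phase the node last reached.
     * Each phase is idempotent, so a resumed deployment safely re-runs the
     * current phase and then everything after it.
     */
    public static List<Phase> remainingPhases(Phase current) {
        Phase[] all = Phase.values();
        return Arrays.asList(all).subList(current.ordinal(), all.length);
    }
}
```

A deployment that failed during BOOTSTRAP, for example, would resume with the BOOTSTRAP and ADD_MAINTENANCE phases only.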
The storage node is imported into inventory
The StorageNode entity is created
Its mode is set to INSTALLED
Its status is INSTALLED
The mode is set to ANNOUNCE
The status changes to JOINING
Note that deployment happens automatically. A property will be added to the cluster settings to enable/disable automatic deployment.
It might seem unnecessary to first set the mode to INSTALLED since it is subsequently changed to ANNOUNCE. The INSTALL phase happens as part of the node being imported into inventory, and it is possible for multiple nodes to be imported into inventory simultaneously. Deployment, however, should be serialized. This is why the mode is updated twice.
The mode is set to ANNOUNCE.
The node's status is JOINING.
The announce resource operation is scheduled to run on each cluster node.
This operation updates the internode authentication configuration file so that existing cluster nodes will accept gossip from the new node.
If the operation fails
Set the failedOperation property of the corresponding storage node
Set the errorMessage property of the new node
The node's status changes to DOWN
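This failure bookkeeping, which repeats in the later phases, can be sketched as follows (the Node class and its fields are hypothetical stand-ins for the StorageNode entity's API, not the real entity):

```java
public class FailureHandling {

    public enum Status { INSTALLED, JOINING, NORMAL, LEAVING, DOWN }

    /** Minimal stand-in for the StorageNode entity's error-tracking fields. */
    public static class Node {
        String errorMessage;    // why deployment stalled (set on the node being deployed)
        String failedOperation; // which resource operation failed (set on the node it ran against)
        Status status = Status.JOINING;
    }

    /**
     * Record a failed resource operation that occurred while deploying newNode.
     * The failure is pinned to the node the operation ran against, while the
     * error message and DOWN status surface on the node being deployed.
     */
    public static void operationFailed(Node newNode, Node operationTarget,
                                       String operation, String error) {
        operationTarget.failedOperation = operation; // pinpoints where the failure occurred
        newNode.errorMessage = error;                // surfaced on the new node
        newNode.status = Status.DOWN;                // deployment is stalled until retried
    }
}
```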
Schedule the prepareForBootstrap operation on the new node. This operation performs a few tasks.
Shut down the storage node
Purge the data directories
Update cluster settings in cassandra.yaml
Update internode auth conf settings
Restart the node
If the operation fails,
Set the errorMessage property of the new node
Set the failedOperation property of the new node
The node's status changes to DOWN
The node is reported up (by the cluster)
Set the mode to ADD_MAINTENANCE
The status is still JOINING
When the node starts up, it goes through the bootstrap process, in which it streams data from other nodes. When the bootstrap process finishes, the node starts serving client requests. At this point, the new node is fully operational and part of the cluster as far as Cassandra is concerned. For our purposes, though, the deployment process is not yet complete.
Apply schema updates if necessary
Schedule the addNodeMaintenance operation on each node (including the new node)
If the operation fails
Set the failedOperation property of the corresponding storage node
Set the errorMessage property of the new node
The node's status changes to DOWN
The operation completes successfully on all nodes
Set the mode to NORMAL
The status changes to NORMAL
TODO - Add docs on addNodeMaintenance operation and schema changes
Undeployment is the process of removing the node from the cluster and completely uninstalling it. Like deployment, it consists of several phases, where each phase corresponds to the name of a storage node's operation mode. The phases are:
DECOMMISSION
REMOVE_MAINTENANCE
UNANNOUNCE
UNINSTALL
The API for undeployment is provided by
StorageNodeManagerBean.undeployStorageNode(Subject subject, StorageNode storageNode)
undeployStorageNode is exposed in both the local and remote APIs. It determines which phase to start the undeployment in based on the storage node's mode. Undeployment is not finished until each phase completes successfully, and the process is aborted if an error occurs. Each phase should be idempotent, which makes it possible to retry or resume an undeployment. The following sections describe each phase in greater detail.
Set the mode to DECOMMISSION
Note that this assumes the previous mode was NORMAL
The status changes to LEAVING
Apply schema updates if necessary
Schedule the decommission resource operation on the node
If the operation fails
Set the errorMessage property of the node
Set the failedOperation property of the node
The node's status changes to DOWN
The cluster reports that the node has been removed (from the cluster)
Set the mode to REMOVE_MAINTENANCE
The status is LEAVING
Note that the decommission operation performs the unbootstrap process. The node stops serving client requests and starts streaming the data it owns to other nodes. When the process finishes, the node is no longer part of the cluster as far as Cassandra is concerned. For our purposes though, undeployment is not yet complete.
Run the removeNodeMaintenance resource operation on each node (excluding the node being undeployed)
If the operation fails
Set the errorMessage property of the node
Set the failedOperation property of the node
The status of the node being undeployed changes to DOWN
The operation completes successfully on all cluster nodes
Set the mode to UNANNOUNCE
The status is LEAVING
Run the unannounce resource operation on each node (excluding the node being undeployed)
This removes the undeployed node from the internode authentication conf file
If the operation fails
Set the errorMessage property of the node
Set the failedOperation property of the node
The status of the node being undeployed changes to DOWN
The operation completes successfully on all cluster nodes
Set the mode to UNINSTALL
The status is LEAVING
Run the uninstall operation against the node being undeployed.
The operation will
Shut down the node if it is running
Delete all of its files from disk
If the operation fails
Set the errorMessage property of the node
Set the failedOperation property of the node
The status of the node being undeployed changes to DOWN
Uninventory the storage node resource
Delete the storage node entity
Undeployment has completed successfully
The previous sections assume that the node's mode is NORMAL when undeployment begins. We need to handle undeploying a node in other modes. For example, deploying a storage node may have failed, leaving the node in a mode other than NORMAL. The user might decide that it is simply easier and/or faster to deploy a node on a different machine instead of trying to resolve the issues with the failed deployment. The table below details the starting undeployment phase for a given mode.
storage node mode | starting undeployment phase
INSTALLED | UNINSTALL
ANNOUNCE | UNANNOUNCE
BOOTSTRAP | UNANNOUNCE
ADD_MAINTENANCE | DECOMMISSION
NORMAL | DECOMMISSION
DECOMMISSION | DECOMMISSION
REMOVE_MAINTENANCE | REMOVE_MAINTENANCE
UNANNOUNCE | UNANNOUNCE
UNINSTALL | UNINSTALL
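The table can be expressed directly as a lookup; a sketch (the enum and method names are illustrative, not the production code):

```java
public class UndeployStart {

    public enum Mode { INSTALLED, ANNOUNCE, BOOTSTRAP, ADD_MAINTENANCE, NORMAL,
                       DECOMMISSION, REMOVE_MAINTENANCE, UNANNOUNCE, UNINSTALL }

    /** Starting undeployment phase for a node in the given mode (see table above). */
    public static Mode startingUndeployPhase(Mode mode) {
        switch (mode) {
            case INSTALLED:          return Mode.UNINSTALL;    // never announced; just remove files
            case ANNOUNCE:
            case BOOTSTRAP:          return Mode.UNANNOUNCE;   // announced, but never joined the ring
            case ADD_MAINTENANCE:
            case NORMAL:
            case DECOMMISSION:       return Mode.DECOMMISSION; // joined; must stream its data off first
            case REMOVE_MAINTENANCE: return Mode.REMOVE_MAINTENANCE;
            case UNANNOUNCE:         return Mode.UNANNOUNCE;
            case UNINSTALL:          return Mode.UNINSTALL;
            default: throw new IllegalArgumentException("Unexpected mode: " + mode);
        }
    }
}
```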
While the three (availability, operation mode, and status) are closely related, there are some subtle distinctions that need to be explained.
Availability is a familiar concept within RHQ. It is a special measurement collected by the agent to determine whether a resource is up or down. A storage node is linked to a resource, so it undergoes availability checks like every other resource. The RHQ Storage plugin performs a simple check, verifying that it can make a JMX connection to the storage node. JMX is used only for management; it is not used for internode communication or for client requests. The next table describes what availability does and does not mean for a storage node.
Availability Type | Meaning
UP | The storage node is running and the agent can perform management operations via JMX. This does not necessarily mean that the node is serving client requests or that it is participating in internode communication.
DOWN | The agent cannot make a JMX connection to the storage node. Unless the JMX URL in the connection settings is wrong, this indicates that the storage node is not running.
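The availability check amounts to attempting a JMX connection. A simplified standalone sketch, assuming the standard RMI-based JMX service URL format (the actual plugin builds the URL from the resource's connection settings):

```java
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class AvailabilityCheck {

    /** UP if a JMX connection can be opened, DOWN otherwise. */
    public static boolean isUp(String host, int jmxPort) {
        try {
            // Standard RMI-based JMX URL; an assumption for this sketch.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                connector.getMBeanServerConnection(); // reachable and responding
                return true;
            }
        } catch (Exception e) {
            return false; // cannot connect: report DOWN
        }
    }
}
```

Note that success here only proves the management interface is reachable; as the table above says, it implies nothing about client requests or gossip.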
The mode and status differ from availability in that they report on the node's state relative to the rest of the cluster. The status provides more general information, while the mode is more specific; multiple modes map to the same status. The following table lists the possible combinations of mode and status. The errorMessage and failedOperation properties are included as well, since they too are used to determine the status.
Mode | errorMessage and failedOperation empty? | Status | Description
UP | yes | UP | The node is serving client requests and participating in gossip (i.e., internode communication).
UP | no | UP | A resource operation against this node failed during the deployment of another node. In this scenario the failedOperation property is set on the existing node to show precisely where the failure occurred. The node is, however, still actively participating in cluster operations.
DOWN | yes | DOWN | The node is part of the cluster but not actively participating. The node may actually be shut down, or it may be running but unable to reach other cluster nodes due to a network partition.
ANNOUNCE | yes | JOINING | The node is being deployed.
ANNOUNCE | no | DOWN | Deploying the node failed. It is installed (i.e., linked to a resource in inventory) but not part of the cluster.
BOOTSTRAP | yes | JOINING | The node is being deployed.
BOOTSTRAP | no | DOWN | Deploying the node failed. It is not yet part of the cluster.
ADD_MAINTENANCE | yes | JOINING | The node is being deployed. At this point the node will actually service client requests, but maintenance tasks still need to be performed to ensure that the cluster is in a consistent state.
ADD_MAINTENANCE | no | DOWN | Deployment failed. At this point the node will actually service client requests, but maintenance tasks still need to be performed to ensure that the cluster is in a consistent state.
DECOMMISSION | yes | LEAVING | The node is being undeployed.
DECOMMISSION | no | DOWN | Undeploying the node failed. It is possible that it is still serving client requests and participating in gossip.
REMOVE_MAINTENANCE | yes | LEAVING | The node is being undeployed. It is no longer a participating member of the cluster, but it is still accepting JMX connections.
REMOVE_MAINTENANCE | no | DOWN | Undeployment failed. The node is no longer a participating member of the cluster.
UNANNOUNCE | yes | LEAVING | The node is being undeployed and is no longer a participating member of the cluster.
UNANNOUNCE | no | DOWN | Undeployment failed and the node is no longer a participating member of the cluster.
UNINSTALL | yes | LEAVING | The node is being undeployed.
UNINSTALL | no | LEAVING | Undeployment failed and the node is no longer a participating member of the cluster.
MAINTENANCE | yes | NORMAL | The node is undergoing routine maintenance but is part of the cluster and is operational.
MAINTENANCE | no | NORMAL | An error occurred during routine maintenance. The node is part of the cluster and is operational.
If both the status and the availability are DOWN, then we can reasonably assume that the node is not running. Furthermore, when they are both DOWN, we can look at the mode to know where exactly in the (un)deployment process something failed.
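The mode/status relationship in the table above can be modeled as a pure function of the mode plus whether the error properties are empty (the enum and method names here are illustrative):

```java
public class StatusDerivation {

    public enum Mode { UP, DOWN, ANNOUNCE, BOOTSTRAP, ADD_MAINTENANCE,
                       DECOMMISSION, REMOVE_MAINTENANCE, UNANNOUNCE,
                       UNINSTALL, MAINTENANCE }

    public enum Status { UP, DOWN, JOINING, LEAVING, NORMAL }

    /** Status as a function of mode plus whether errorMessage/failedOperation are empty. */
    public static Status statusOf(Mode mode, boolean errorPropsEmpty) {
        switch (mode) {
            case UP:          return Status.UP;     // a recorded failure elsewhere doesn't take it down
            case DOWN:        return Status.DOWN;
            case MAINTENANCE: return Status.NORMAL; // routine maintenance; node stays operational
            case ANNOUNCE:
            case BOOTSTRAP:
            case ADD_MAINTENANCE:                   // deployment phases
                return errorPropsEmpty ? Status.JOINING : Status.DOWN;
            case UNINSTALL:
                return Status.LEAVING;              // stays LEAVING even after a failure (see table)
            case DECOMMISSION:
            case REMOVE_MAINTENANCE:
            case UNANNOUNCE:                        // undeployment phases
                return errorPropsEmpty ? Status.LEAVING : Status.DOWN;
            default: throw new IllegalArgumentException("Unexpected mode: " + mode);
        }
    }
}
```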