JBoss Community Archive (Read Only)

RHQ

Changing Storage Node Address

Background

This design document covers the work being done for BZ 1103841. The current version of RHQ is 4.12.0.

The StorageNode entity has an address property. The server communicates with the storage node via the DataStax driver using that address. Internode communication, i.e., gossip, is also configured to use that address. The address is configured initially during storage node installation, and then it gets propagated to other nodes through the deploy and undeploy processes. There is no support for otherwise changing the storage node address. This can be problematic, for example, if the user realizes that he specified the wrong address when installing the storage node.

We need to provide a way for the user to change the node's address. We also need to be resilient to manual, unplanned changes. If the user decides to directly edit cassandra.yaml for example, then we need to reconcile the changes as best we can.

The Address

Cassandra uses two different endpoints, one for client requests and one for gossip. The former is configured with the rpc_address property in cassandra.yaml, and the latter with the listen_address property. As already mentioned, the StorageNode entity stores a single address which is used for both client requests and gossip. It is also used in the resource key of the RHQ Storage Node resource type. The next section will talk more about the resource key. StorageNode.address will continue to refer to the rpc_address, and we will add a new property that refers to listen_address.

When the user wants to change the address, we can by default change both rpc_address and listen_address, but we should give the user the option to update each one independently. The user should be able to effect the change through the storage node admin UI or through the CLI.
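By way of illustration, the two endpoints map onto these cassandra.yaml properties (the addresses shown are placeholders):

```yaml
# cassandra.yaml (excerpt) -- example values only

# Endpoint for CQL client requests (the DataStax driver connects here)
rpc_address: 192.168.1.100

# Endpoint for gossip / internode communication
listen_address: 192.168.1.100
```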

Resource Key

The StorageNode address is used as the resource key. Specifically, the listen_address is used for the resource key. This means that when the user changes the listen_address, a new storage node resource would be discovered, which we do not want to happen. Cassandra stores a host ID which is a UUID that does not change. It can be accessed through JMX via the StorageServiceMBean.LocalHostId attribute. The host ID should be used as the resource key. For existing storage nodes we need to implement resource upgrade functionality to convert existing resource keys over to use the host ID.
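A minimal sketch of the upgrade decision, assuming the host ID has already been fetched from the StorageServiceMBean.LocalHostId attribute (the function name is hypothetical, not actual RHQ plugin API):

```python
from typing import Optional

def upgraded_key(current_key: str, host_id: str) -> Optional[str]:
    """Decide whether an inventoried storage node resource needs a key upgrade.

    current_key -- the existing resource key (historically the listen_address)
    host_id     -- the node's stable UUID from StorageServiceMBean.LocalHostId

    Returns the new key (the host ID) when an upgrade is needed, or None when
    the key is already the host ID.
    """
    return None if current_key == host_id else host_id
```

Because the host ID never changes, the upgrade is idempotent: once a key has been converted, subsequent discovery runs leave it alone.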

Changing rpc_address

What should we name this endpoint?

The property in cassandra.yaml is rpc_address, but I do not think we want to refer to it as that for end users. Should we call it the CQL address, or CQL endpoint address, or client address, or client endpoint address or something else?

This is a pretty straightforward change as it does not affect other nodes. The following steps need to be performed:

  • Update StorageNode.address property

  • Update rpc_address in cassandra.yaml

  • Restart the storage node for changes to take effect

Note that the resource operation that does this work should not fail simply because the node is down. If the operation is invoked when the node happens to be down, we can and should still update cassandra.yaml.
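The cassandra.yaml update in particular is just a file edit, which is why it can proceed while the node is down. A sketch of that step (a hypothetical helper, not the actual plugin code; a plain text rewrite avoids needing a YAML parser):

```python
import re
from pathlib import Path

def update_rpc_address(conf_dir: str, new_address: str) -> None:
    """Rewrite rpc_address in cassandra.yaml.

    This is a plain text edit of the config file, so it works whether or not
    the storage node process is currently running.
    """
    yaml_path = Path(conf_dir) / "cassandra.yaml"
    text = yaml_path.read_text()
    # Replace only the line that starts with rpc_address; listen_address and
    # any other occurrences of the old value are left untouched.
    updated = re.sub(r"(?m)^rpc_address:\s*\S+",
                     f"rpc_address: {new_address}", text)
    yaml_path.write_text(updated)
```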

Changing listen_address

What should we name this endpoint?

The property in cassandra.yaml is listen_address, but I do not think we want to refer to it as that for end users. Should we call it the gossip address, or the gossip endpoint address, or something else?

In contrast to changing rpc_address, changing listen_address is more complicated.

  • Update StorageNode entity (assuming we decide to store listen_address directly in the StorageNode entity)

  • Update cassandra-jvm.properties with these system properties

    • -Dcassandra.replace_address=<new_address>

    • -Dcassandra.replace_address_first_boot=true (See CASSANDRA-7356 for more details)

  • In cassandra.yaml, update the following properties:

    • seeds

    • listen_address

  • Update rhq-storage-auth.conf

  • Restart the storage node for changes to take effect

  • For other nodes in the cluster,

    • Update seeds property in cassandra.yaml

    • Update rhq-storage-auth.conf

    • Invoke RhqInternodeAuthenticator.reloadConfiguration JMX operation

Note that we can perform these changes even if nodes are down. We will not be able to invoke the RhqInternodeAuthenticator.reloadConfiguration operation if a node is down, but that is ok because the configuration is loaded at startup as well.
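The per-peer file edits can be sketched as follows (a hypothetical helper under the assumption that rhq-storage-auth.conf holds one address per entry; only the seeds line of cassandra.yaml is touched, since a peer's own listen_address must be left alone):

```python
from pathlib import Path

def update_peer_configs(conf_dir: str, old_address: str, new_address: str) -> None:
    """Apply another node's listen_address change on this node.

    Swaps the address in the seeds list of cassandra.yaml and in
    rhq-storage-auth.conf. Both are plain file edits, so they can be applied
    even while this node is down; the auth file is reloaded at startup anyway.
    """
    yaml_path = Path(conf_dir) / "cassandra.yaml"
    lines = []
    for line in yaml_path.read_text().splitlines(keepends=True):
        # Only touch the seeds entry; other occurrences of the address
        # (e.g. this node's own listen_address) must not change.
        if "seeds:" in line:
            line = line.replace(old_address, new_address)
        lines.append(line)
    yaml_path.write_text("".join(lines))

    auth_path = Path(conf_dir) / "rhq-storage-auth.conf"
    entries = auth_path.read_text().split()
    entries = [new_address if e == old_address else e for e in entries]
    auth_path.write_text("\n".join(entries) + "\n")
```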

Resource Operations

We need a couple of new resource operations, one for updating the node whose address is changing and another for updating the rest of the cluster. For the remainder of this document I will call these operations updateEndpoints and endpointChanged. Note that the actual operation names are subject to change. Both of these operations should be exposed on the RHQ Storage Node resource type.

updateEndpoints

This operation should be capable of updating either or both of the rpc_address and listen_address properties in cassandra.yaml (and make the changes described earlier). We should still be able to execute this operation even if the node is down. The operation's description should state very explicitly that it is not intended for direct usage and that changing a node's address(es) should be done through either the admin UI or the CLI. The operation should provide an option to skip doing the node restart.

endpointChanged

This operation should make the necessary changes previously described to the other cluster nodes. We should still be able to execute this operation even if the node is down. The operation's description should state very explicitly that it is not intended for direct usage. This operation should only be invoked by the server as part of a workflow that is executed when a node's listen_address changes. It also needs to verify that the change was successful. The operation should provide an option to skip doing the node restart.

Installer Changes

When the server installer runs the first time, it creates the initial StorageNode entity. The installer needs to query for the node's host ID and persist it with the StorageNode entity. It can be retrieved from the system.local table.
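The query itself is trivial; run against the node the installer just set up, something like:

```sql
-- host_id is the stable UUID that will double as the resource key
SELECT host_id FROM system.local;
```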

Use Cases

This section includes various use cases to facilitate testing and to help identify any additional design and/or implementation changes that may be necessary.

Scenario: User is running a single storage node and wants to change its address

Assume that we are updating both rpc_address and listen_address.

Steps

  • Update StorageNode entity

  • Shutdown storage client subsystem

  • Schedule updateEndpoints resource operation

  • Restart the storage client subsystem on the server

We have to do things a bit differently when we have a single node cluster. We have to restart the storage client subsystem so that the Cassandra driver can reconnect to the storage node. While the maintenance is being performed, we will not be able to read or write metric data. We do not however want to put the server in maintenance mode because it would not process the resource operation result. If an agent sends a measurement report while the maintenance is underway, the server should throw an exception back to the agent, letting it know to resend the report at some point in the near future.

If there are any read requests, like for graphs, while the maintenance is underway, the server should throw an exception indicating that the backend is temporarily unavailable. This will allow the UI to display an appropriate message.
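A sketch of that guard behavior (names are hypothetical; this is not existing RHQ server code):

```python
class StorageMaintenanceException(Exception):
    """Signals that the storage backend is temporarily unavailable."""

def guard_storage_access(maintenance_in_progress: bool) -> None:
    """Called before metric reads/writes during a single-node address change.

    Agents catch the exception and resend their measurement reports shortly
    afterwards; the UI catches it and shows a 'temporarily unavailable'
    message instead of empty graphs.
    """
    if maintenance_in_progress:
        raise StorageMaintenanceException(
            "storage cluster maintenance underway; retry shortly")
```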

Scenario: User is running multiple storage nodes and wants to change the addresses on a node

Assume that we are updating both rpc_address and listen_address.

Steps

  • Update StorageNode entity

  • Schedule updateEndpoints resource operation

  • Schedule endpointChanged on each of the remaining cluster nodes

The changes are applied as a rolling update, with each resource operation executed serially. This makes it easier to ensure that the cluster stays in a known, consistent state, and it also allows us to avoid restarting the storage client subsystem.
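The rolling update can be sketched like this (operation and parameter names follow this document but are otherwise illustrative; `invoke` is assumed to block until the operation completes):

```python
def rolling_endpoint_update(changed_node, other_nodes, new_rpc, new_listen):
    """Serially apply an address change across the cluster.

    First the changed node is reconfigured and restarted via updateEndpoints,
    then each remaining node learns the new listen_address via
    endpointChanged. Serial execution keeps the cluster in a known state and
    never takes down more than one node at a time.
    """
    changed_node.invoke("updateEndpoints",
                        {"rpcAddress": new_rpc, "listenAddress": new_listen})
    for node in other_nodes:
        # Each invoke blocks until the previous node's update has finished.
        node.invoke("endpointChanged",
                    {"oldAddress": changed_node.address,
                     "newAddress": new_listen})
```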

Scenario: User is running multiple storage nodes and wants to change the rpc_address of a node

Steps

  • Update the StorageNode entity

  • Schedule updateEndpoints resource operation

Since we are not changing the listen_address, there is no need to invoke the endpointChanged operation on the other cluster nodes.

Scenario: User installs storage node with wrong address and re-installs with new address

This is a scenario that users have encountered and is a big impetus for this work. It should also be noted that this would be an incorrect way to go about changing the address; however, there is nothing to prevent the user from doing it, so we need to make sure we handle it adequately.

Steps

  • User performs a default installation, rhqctl install

  • User realizes that he specified the wrong address for the rhq.storage.hostname property in rhq-storage.properties. Let's call it bad.address.com.

  • User deletes the rhq-storage installation directory

  • User corrects the rhq.storage.hostname property. Let's call this address good.address.com.

  • User runs rhqctl install --storage

  • User starts the system up, e.g., rhqctl start

The server installer creates a StorageNode entity for bad.address.com. When the agent discovers the storage node, it will report the good.address.com node to the server. The server detects that both bad.address.com and good.address.com refer to the same node since they share the same host ID. And since good.address.com is what is reported by the agent, we know that is the value configured in cassandra.yaml; however, we do not know that all of the settings have been updated. We go ahead and schedule the updateEndpoints operation to make sure that everything is in a consistent state. We do not need to restart the node though since it is already listening on good.address.com.
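The host-ID-based matching described above might look like this (a hypothetical server-side sketch operating on plain dicts rather than the real StorageNode entity):

```python
def reconcile_by_host_id(existing_nodes, reported_address, reported_host_id):
    """Match a newly reported node against existing StorageNode records.

    If a record shares the reported host ID, both addresses refer to the same
    physical node, so the record's address is corrected (the agent-reported
    value wins because it reflects what is actually in cassandra.yaml) instead
    of a duplicate node being created. Returns the matching record, or None
    if the node is genuinely new.
    """
    for node in existing_nodes:
        if node.get("host_id") == reported_host_id:
            node["address"] = reported_address
            return node
    return None
```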

Scenario: User is running multiple storage nodes and manually changes rpc_address

This is an out of band change that users should avoid, but we need to handle it as best we can.

Steps

  • User manually edits cassandra.yaml changing rpc_address from old.address.com to new.address.com

  • User restarts the storage node

  • Cassandra driver notifies server that old.address.com has been removed from and new.address.com has been added to its connection pools.

We can compare the value of the address found in the live resource configuration against the StorageNode.address property to verify that there was an out of band change. We should then log a warning about it and schedule the updateEndpoints operation on the node to ensure its configuration is in a consistent state. We can skip the node restart with the operation.
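The detection-and-reconcile step can be sketched as a pure decision function (hypothetical names; the action tuples stand in for the real server-side calls). The same pattern applies to the out of band scenarios that follow:

```python
def reconcile_out_of_band(entity_address, live_rpc_address):
    """Return follow-up actions for a possible out of band rpc_address change.

    Compares StorageNode.address against the address the agent read from the
    live resource configuration. On a mismatch: log a warning, update the
    entity, and schedule updateEndpoints with the restart skipped (the node
    already restarted itself on the new address).
    """
    if entity_address == live_rpc_address:
        return []  # no out of band change detected
    return [
        ("log_warning", f"rpc_address changed out of band: "
                        f"{entity_address} -> {live_rpc_address}"),
        ("update_entity", live_rpc_address),
        ("schedule_operation", "updateEndpoints", {"skipRestart": True}),
    ]
```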

Scenario: User is running multiple storage nodes and manually changes rpc_address and listen_address

This is an out of band change that users should avoid, but we need to handle it as best we can.

Steps

  • User manually edits cassandra.yaml changing rpc_address and listen_address from old.address.com to new.address.com

  • User restarts the storage node

  • Cassandra driver notifies the server that old.address.com is down

We can compare the value of the address found in the live resource configuration against the value in the StorageNode entity (note that this will be a new property since the existing address property refers to rpc_address) to verify that there was an out of band change. We should then log a warning about it and schedule the resource operations as if the user had initiated the change.

Scenario: User is running multiple storage nodes and manually changes listen_address

This is an out of band change that users should avoid, but we need to handle it as best we can.

  • User manually edits cassandra.yaml changing listen_address from old.address.com to new.address.com

  • User restarts the storage node

  • Cassandra driver notifies the server that old.address.com is down

  • Cassandra driver notifies the server that old.address.com is up

Remember that the driver talks to the nodes over rpc_address, so it will still be able to connect to old.address.com. Gossip and internode authentication use listen_address, so other cluster nodes will not be able to connect to old.address.com. Once again, we should compare the value of the address found in the live resource configuration against the value in the StorageNode entity (note that this will be a new property since the existing address property refers to rpc_address) to verify that there was an out of band change. We should then log a warning about it and schedule the resource operations as if the user had initiated the change.

Scenario: User is running a single storage node and manually changes rpc_address

This is an out of band change that users should avoid, but we need to handle it as best we can.

Steps

  • User manually edits cassandra.yaml changing rpc_address from old.address.com to new.address.com

  • User restarts the storage node

  • The driver will not be able to connect and subsequent reads/writes will result in NoHostAvailableExceptions

  • The server goes into maintenance mode

We can compare the value of address found in the (live) resource configuration against the StorageNode.address property to verify that there was an out of band change. We should then log a warning about it and schedule the updateEndpoints operation on the node to ensure its configuration is in a consistent state. We can skip the node restart with the operation.

We need to take the server out of maintenance mode so that it can receive the results of the updateEndpoints operation. When the operation completes, we need to restart the storage client subsystem. If an agent sends a measurement report while the maintenance is underway, the server should throw an exception back to the agent, letting it know to resend the report at some point in the near future. If there are any read requests, like for graphs, while the maintenance is underway, the server should throw an exception indicating that the backend is temporarily unavailable. This will allow the UI to display an appropriate message.

JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-11 12:57:10 UTC, last content change 2014-07-22 02:32:18 UTC.