Skip to end of metadata
Go to start of metadata

WildFly 9 includes the ability to use the CLI to cancel management requests that are not proceeding normally.

The cancel-non-progressing-operation operation

The cancel-non-progressing-operation operation instructs the target process to find any operation that isn't proceeding normally and cancel it.

On a standalone server:

The result value is an internal identification number for the operation that was cancelled.

On a managed domain host controller, the equivalent resource is in the host=<hostname> portion of the management resource tree:

An operation can be cancelled on an individual managed domain server as well:

An operation is considered to be not proceeding normally if it has been executing with the exclusive operation lock held for longer than 15 seconds. Read-only operations do not acquire the exclusive operation lock, so this operation will not cancel read-only operations. Operations blocking waiting for another operation to release the exclusive lock will also not be cancelled.

If there isn't any operation that is failing to proceed normally, there will be a failure response:

The find-non-progressing-operation operation

To simply learn the id of an operation that isn't proceeding normally, but not cancel it, use the find-non-progressing-operation operation:

If there is no non-progressing operation, the outcome will still be success but the result will be undefined.

Once the id of the operation is known, the management resource for the operation can be examined to learn more about its status.

Examining the status of an active operation

There is a management resource for any currently executing operation that can be queried:

The response includes the following attributes:

Field Meaning
access-mechanism The mechanism used to submit a request to the server. NATIVE, JMX, HTTP
address The address of the resource targeted by the operation. The value in the final element of the address will be '<hidden>' if the caller is not authorized to address the operation's target resource.
caller-thread The name of the thread that is executing the operation.
cancelled Whether the operation has been cancelled.
exclusive-running-time Amount of time in nanoseconds the operation has been executing with the exclusive operation execution lock held, or -1 if the operation does not hold the exclusive execution lock.
execution-status The current activity of the operation. See below for details.
operation The name of the operation, or '<hidden>' if the caller is not authorized to address the operation's target resource.
running-time Amount of time the operation has been executing, in nanoseconds.

The following are the values for the exclusive-running-time attribute:

Value Meaning
executing The caller thread is actively executing
awaiting-other-operation The caller thread is blocking waiting for another operation to release the exclusive execution lock
awaiting-stability The caller thread has made changes to the service container and is waiting for the service container to stabilize
completing The operation is committed and is completing execution
rolling-back The operation is rolling back

All currently executing operations can be viewed in one request using the read-children-resources operation:

Canceling a specific operation

The cancel-non-progressing-operation operation is a convenience operation for identifying and canceling an operation. However, an administrator can examine the active-operation resources to identify any operation, and then directly cancel it by invoking the cancel operation on the resource for the desired operation.

Controlling operation blocking time

As an operation executes, the execution thread may block at various points, particularly while waiting for the service container to stabilize following any changes. Since an operation may be holding the exclusive execution lock while blocking, in WildFly 9 execution behavior was changed to ensure that blocking will eventually time out, resulting in roll back of the operation.

The default blocking timeout is 300 seconds. This is intentionally long, as the idea is to only trigger a timeout when something has definitely gone wrong with the operation, without any false positives.

An administrator can control the blocking timeout for an individual operation by using the blocking-timeout operation header. For example, if a particular deployment is known to take an extremely long time to deploy, the default 300 second timeout could be increased:

Note the blocking timeout is not a guaranteed maximum execution time for an operation. If it only a timeout that will be enforced at various points during operation execution.

Labels:
None
Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.