WildFly 9 includes the ability to use the CLI to cancel management requests that are not proceeding normally.
The cancel-non-progressing-operation operation instructs the target process to find any operation that isn't proceeding normally and cancel it.
On a standalone server:
The result value is an internal identification number for the operation that was cancelled.
On a managed domain host controller, the equivalent resource is in the host=<hostname> portion of the management resource tree:
An operation can be cancelled on an individual managed domain server as well:
An operation is considered to be not proceeding normally if it has been executing with the exclusive operation lock held for longer than 15 seconds. Read-only operations do not acquire the exclusive operation lock, so this operation will not cancel read-only operations. Operations blocking waiting for another operation to release the exclusive lock will also not be cancelled.
If there isn't any operation that is failing to proceed normally, there will be a failure response:
To simply learn the id of an operation that isn't proceeding normally, but not cancel it, use the find-non-progressing-operation operation:
If there is no non-progressing operation, the outcome will still be success but the result will be undefined.
Once the id of the operation is known, the management resource for the operation can be examined to learn more about its status.
There is a management resource for any currently executing operation that can be queried:
The response includes the following attributes:
|access-mechanism||The mechanism used to submit a request to the server. NATIVE, JMX, HTTP|
|address||The address of the resource targeted by the operation. The value in the final element of the address will be '<hidden>' if the caller is not authorized to address the operation's target resource.|
|caller-thread||The name of the thread that is executing the operation.|
|cancelled||Whether the operation has been cancelled.|
|exclusive-running-time||Amount of time in nanoseconds the operation has been executing with the exclusive operation execution lock held, or -1 if the operation does not hold the exclusive execution lock.|
|execution-status||The current activity of the operation. See below for details.|
|operation||The name of the operation, or '<hidden>' if the caller is not authorized to address the operation's target resource.|
|running-time||Amount of time the operation has been executing, in nanoseconds.|
The following are the values for the exclusive-running-time attribute:
|executing||The caller thread is actively executing|
|awaiting-other-operation||The caller thread is blocking waiting for another operation to release the exclusive execution lock|
|awaiting-stability||The caller thread has made changes to the service container and is waiting for the service container to stabilize|
|completing||The operation is committed and is completing execution|
|rolling-back||The operation is rolling back|
All currently executing operations can be viewed in one request using the read-children-resources operation:
The cancel-non-progressing-operation operation is a convenience operation for identifying and canceling an operation. However, an administrator can examine the active-operation resources to identify any operation, and then directly cancel it by invoking the cancel operation on the resource for the desired operation.
As an operation executes, the execution thread may block at various points, particularly while waiting for the service container to stabilize following any changes. Since an operation may be holding the exclusive execution lock while blocking, in WildFly 9 execution behavior was changed to ensure that blocking will eventually time out, resulting in roll back of the operation.
The default blocking timeout is 300 seconds. This is intentionally long, as the idea is to only trigger a timeout when something has definitely gone wrong with the operation, without any false positives.
An administrator can control the blocking timeout for an individual operation by using the blocking-timeout operation header. For example, if a particular deployment is known to take an extremely long time to deploy, the default 300 second timeout could be increased:
Note the blocking timeout is not a guaranteed maximum execution time for an operation. If it only a timeout that will be enforced at various points during operation execution.