Proper deployment and configuration are crucial for production, and significant effort is spent in pre-production to ensure that managed resources are correctly configured and deployed with specific, verified versions of software. Failure to detect and understand unplanned changes in the configuration or content of a managed resource increases the risk of operational failure and hinders troubleshooting. Such unplanned changes in configuration or content are referred to as drift. Production, staging, and/or recovery configurations are designed to be identical in certain respects so that the environment stays resilient when production fails. When these configurations drift from one another, they leave what is commonly called a configuration gap between them. Configuration drift occurs naturally in enterprise and data center environments because of frequent hardware and software changes. Disaster recovery failures and HA system failures are frequently the result of configuration drift.
You will find the following terminology used within the drift user interface. Understanding the concepts behind these terms is important for understanding what drift monitoring can do for you.
Unplanned changes in monitored files such as configuration files, or content.
In RHQ, the general notion of periodically scanning the file system for changes to files that likely indicate drift.
A definition guiding drift monitoring for a resource. It specifies, using a base directory and includes/excludes filters, which directories should be monitored. Additionally, it sets monitoring properties such as scan interval, enablement, and drift handling options. Also called a Drift Definition. A resource can have zero or more drift definitions.
What is Value Context?
When setting up a drift definition, you need to specify which base directory to check for drift. A base directory is composed of a "value context" and a "value name". The value context provides the semantics of the value name; that is, it tells RHQ how to interpret the value name in order to arrive at a directory.
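The pieces of a drift definition can be pictured as a small data structure. The following is a minimal sketch only, not RHQ's actual API: the class names, field names, the default interval, and the context names `fileSystem` and `pluginConfiguration` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BaseDirectory:
    value_context: str  # tells us how to interpret value_name (names here are assumed)
    value_name: str     # a literal path, or a property name, depending on the context


@dataclass
class DriftDefinition:
    name: str
    base_dir: BaseDirectory
    includes: list = field(default_factory=list)  # filters for paths to monitor
    excludes: list = field(default_factory=list)  # filters for paths to skip
    interval_seconds: int = 1800                  # scan interval (hypothetical default)
    enabled: bool = True


def resolve_base_directory(base_dir, resource_properties):
    """Turn a (value context, value name) pair into a directory path.

    resource_properties is a hypothetical mapping of context name to a
    dict of that resource's properties.
    """
    if base_dir.value_context == "fileSystem":
        # Assumed convention: the value name is already a literal path.
        return base_dir.value_name
    # Otherwise the value name is a key looked up in the named context.
    return resource_properties[base_dir.value_context][base_dir.value_name]
```

For example, a definition whose base directory is `BaseDirectory("pluginConfiguration", "installDir")` would resolve to whatever the resource reports as its `installDir` property, so the same definition can be reused across resources installed in different locations.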
A single execution of drift detection for a resource. In other words a single drift definition applied to a single resource, at a specific time.
In RHQ, a specific occurrence of drift, meaning a file change detected during a drift detection run. In the GUI, this is known simply as Drift.
Note: In general a drift instance reflects an unexpected change. But RHQ does provide a 'planned changes' mode in the drift definition. Although drift detection is always performed the same way, RHQ will handle the drift instance differently in planned changes mode, specifically, by disabling alerting for the drift instance and omitting it from the drift history view.
The file-set (really, file-version-set, as it's not just file names, it's the actual bits) resulting from a drift detection run. In other words, a 'snapshot' of the actual files on disk at a particular time.
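To make the "file-version-set" idea concrete, a snapshot can be modeled as a mapping from each file's relative path to a digest of its content. This is a minimal sketch under that assumption; the function name `take_snapshot` and the choice of SHA-256 are illustrative and do not describe how RHQ stores snapshots.

```python
import hashlib
from pathlib import Path


def take_snapshot(base_dir):
    """Map each file under base_dir (by relative path) to the SHA-256 of its bytes."""
    base = Path(base_dir)
    snapshot = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            relative = path.relative_to(base).as_posix()
            snapshot[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return snapshot
```

Because the digest depends on the bytes rather than the name, two snapshots agree on a file only when its content is identical, which is what lets a later run notice a change.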
The snapshot resulting from the first drift detection run for a drift detection definition. The initial snapshot is marked as version 0. Variations from the initial snapshot will generate drift.
A snapshot always represents a full file-set, as it exists for the resource at the time of the drift detection run. The GUI provides two views of a snapshot. The 'Snapshot View' shows the complete file-set. The 'Snapshot Delta View' shows only the file differences between two snapshots (typically the selected snapshot and the one before it). For every snapshot other than the initial one, the snapshot delta view is the default. The user can toggle between views as desired.
A diff between two snapshots, which can be from the same resource or from different resources. The diff identifies files that are present in one snapshot but not in the other, in either direction. It also identifies files that exist in both snapshots but whose content differs.
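The three categories of a snapshot diff can be sketched as follows, assuming each snapshot is represented as a dict from file path to content digest. That representation and the name `diff_snapshots` are assumptions for illustration, not RHQ's implementation.

```python
def diff_snapshots(left, right):
    """Compare two snapshots given as {path: content_digest} dicts."""
    left_paths, right_paths = set(left), set(right)
    return {
        # Files present in one snapshot but missing from the other.
        "only_in_left": sorted(left_paths - right_paths),
        "only_in_right": sorted(right_paths - left_paths),
        # Files present in both snapshots whose content differs.
        "changed": sorted(p for p in left_paths & right_paths if left[p] != right[p]),
    }
```

Since the function only looks at paths and digests, the two snapshots may come from the same resource at different times or from two different resources.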
By default, a drift detection run looks for changes between the previous snapshot and the current file system state. This rolling snapshot approach ensures that each change to a file results in only one drift instance. For stricter environments, RHQ offers the ability to always detect against a specific, or 'pinned', snapshot. The user can pin a snapshot via the GUI (or CLI). In this situation a drift detection run always compares the current file system to the pinned snapshot. This can result in the same drift being reported on each detection run until the situation is remedied.
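The difference between rolling and pinned detection comes down to which snapshot serves as the reference for each run. The sketch below assumes snapshots are dicts from file path to content digest and simulates a series of runs; `detect` and `run_detection` are hypothetical names, not part of RHQ.

```python
def detect(reference, current):
    """Return the paths that were added, removed, or changed relative to reference."""
    paths = set(reference) | set(current)
    return sorted(p for p in paths if reference.get(p) != current.get(p))


def run_detection(states, pinned=None):
    """Simulate detection runs over successive file system states.

    states[0] is treated as the initial snapshot. With pinned=None the
    reference rolls forward after every run; otherwise every run is
    compared against the pinned snapshot.
    """
    reference = pinned if pinned is not None else states[0]
    reports = []
    for current in states[1:]:
        reports.append(detect(reference, current))
        if pinned is None:
            reference = current  # rolling: the latest state becomes the reference
    return reports
```

With `states = [{"a": "1"}, {"a": "2"}, {"a": "2"}]`, the rolling mode reports the change to `a` once, while pinning `states[0]` reports it on every run until the file is restored.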
A drift definition template is essentially a preset drift definition at the resource type level. It can be used to quickly create and manage resource-level drift definitions. Each type supporting drift detection will define at least one template in its plugin descriptor. Additionally, users can create user-defined templates. Drift definitions are always derived from a template and are, by default, attached to the template. If a definition is attached, changes to the template will be pushed down to the definition.
A drift template can be pinned by pinning a snapshot to it. The snapshot will then be pinned to all attached definitions for the template. In that way many resources can perform drift detection against a single, trusted snapshot.
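The relationship between a template and its attached definitions can be sketched as a simple push-down model. This is an illustration only: the `Template` and `Definition` classes, their fields, and the use of a scan interval as the example setting are assumptions, not RHQ's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Definition:
    resource: str
    interval_seconds: int
    attached: bool = True                  # definitions are attached by default
    pinned_snapshot: Optional[dict] = None


@dataclass
class Template:
    interval_seconds: int
    definitions: list = field(default_factory=list)

    def create_definition(self, resource, attached=True):
        """Derive a resource-level definition from this template."""
        definition = Definition(resource, self.interval_seconds, attached)
        self.definitions.append(definition)
        return definition

    def set_interval(self, seconds):
        """Change the template and push the change to attached definitions only."""
        self.interval_seconds = seconds
        for definition in self.definitions:
            if definition.attached:
                definition.interval_seconds = seconds

    def pin(self, snapshot):
        """Pin one trusted snapshot to every attached definition."""
        for definition in self.definitions:
            if definition.attached:
                definition.pinned_snapshot = snapshot
```

In this model a detached definition keeps its own settings after the template changes, while every attached definition ends up comparing against the same trusted snapshot once the template is pinned.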
A Resource Type is in compliance (with respect to drift monitoring) unless one of its imported Resources is not in compliance.
A Resource in inventory is in compliance (with respect to drift monitoring) unless it:
- Has one or more Drift Definitions and for at least one of those definitions:
  - The file system backing the resource is missing the definition's base directory
  - The definition is pinned and the file system backing the resource does not match the pinned snapshot (meaning there is active drift)
Once the file system has the base directory and, if pinned, matches exactly the pinned snapshot, the resource is said to be in compliance.
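The compliance rules above can be expressed as a pair of small predicates. This is a sketch under stated assumptions: `DefinitionState` is a hypothetical record of what a detection run observed for one definition, and snapshots are represented as dicts from file path to content digest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefinitionState:
    base_dir_exists: bool          # is the definition's base directory present?
    current: dict                  # current file system state, {path: digest}
    pinned: Optional[dict] = None  # pinned snapshot, if the definition is pinned


def definition_in_compliance(state):
    if not state.base_dir_exists:
        return False  # base directory is missing
    if state.pinned is not None and state.current != state.pinned:
        return False  # pinned, and the file system does not match (active drift)
    return True


def resource_in_compliance(definition_states):
    """A resource with no drift definitions is in compliance by default."""
    return all(definition_in_compliance(s) for s in definition_states)


def resource_type_in_compliance(resources):
    """resources: one list of DefinitionState per imported resource."""
    return all(resource_in_compliance(states) for states in resources)
```

Note that an unpinned definition can never put a resource out of compliance through file changes alone; only a missing base directory does that, because there is no trusted snapshot to deviate from.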
The act of resolving drift. This is analogous to resolving a merge conflict in a version control system like Git. Resolving drift can be done in a number of ways including:
- Revert back to a previous state
- Acknowledge and accept the change
- Change to something other than a previous state
For more, see the following blogs which contain descriptions and demos involving the Drift feature.