
Agentless Management

The following are some ideas to help improve our system of remote management, that is, managing systems that do not have an agent installed on them.

Virtual Resource Types

The basic thought on how to implement "agentless management" is to create a model for "virtual" resource types. These are still implemented using the typical resource type domain model (ResourceType) but they are flagged as "virtual" types. By flagging a resource type as "virtual", you are essentially saying it does not belong to any specific agent. This means you will not see any resources of this virtual type as a child of an agent platform - the virtual resource will be a root resource, just as a normal "platform" resource is today.
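The placement rule described above can be sketched as follows. This is only an illustration: `SketchResourceType`, `SketchResource`, and `VirtualInventorySketch` are hypothetical, simplified stand-ins rather than the real domain classes, and the `virtual` flag is the proposal itself, assumed here as a plain boolean.

```java
// Hypothetical, simplified stand-ins for the domain model. The "virtual"
// flag is the idea proposed on this page, modeled here as a boolean.
final class SketchResourceType {
    final String name;
    final boolean virtual;

    SketchResourceType(String name, boolean virtual) {
        this.name = name;
        this.virtual = virtual;
    }
}

final class SketchResource {
    final String name;
    final SketchResourceType type;
    final SketchResource parent; // null means "root resource"

    SketchResource(String name, SketchResourceType type, SketchResource parent) {
        this.name = name;
        this.type = type;
        this.parent = parent;
    }

    boolean isRoot() {
        return parent == null;
    }
}

final class VirtualInventorySketch {
    // Resources of a virtual type become roots; all others are attached
    // under the platform of the agent that discovered them.
    static SketchResource place(String name, SketchResourceType type, SketchResource agentPlatform) {
        SketchResource parent = type.virtual ? null : agentPlatform;
        return new SketchResource(name, type, parent);
    }
}
```

With this rule, an "EC2 Account" resource of a virtual type ends up next to the platforms in the tree, while an ordinary server type is still placed under the agent's platform.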

These virtual resources can be managed by simply running their agent plugin components in server-side embedded agents, in a way that does not tie their inventory to those agents. Again, this means the resources will, for example, show up in the UI left-hand tree as top-level root nodes rather than as children of the agent's platform resource.

Take monitoring of EC2 as an example. I could create an EC2 account inventory item (aka a "resource"), but I don't want it to show up underneath the "embedded agent" in the server that it happens to be running in. I would see just an "EC2 Account" top-level resource (analogous to a top-level platform), and its child resources could then drill down into the account in order to manage remote VM resources.

If the server that the agent component is running in goes down, the remote management would automatically fail over to another embedded agent in another server. This decouples the resource from the place where it is being managed. This failover is, of course, not trivial. However, it is required because the embedded agents will all have the same plugins and the inventory is the same across all servers, and we do not want all the embedded agents to be managing the same resource at the same time.
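One possible way to make sure only one embedded agent manages a virtual resource at a time is a lease that the owning server must keep renewing. The sketch below is a minimal in-memory illustration under that assumption. `ManagementLeases` and its method are hypothetical names, and a real deployment would have to keep the lease in storage shared by all servers (for example the database) rather than in a local map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lease table: at most one server "owns" the management of a
// given virtual resource, and ownership lapses if it is not renewed.
final class ManagementLeases {
    private static final class Lease {
        final String serverId;
        final long expiresAtMillis;

        Lease(String serverId, long expiresAtMillis) {
            this.serverId = serverId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();
    private final long leaseMillis;

    ManagementLeases(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Returns true if serverId owns the resource after this call: either the
    // lease was free or expired (takeover), or serverId already held it (renewal).
    synchronized boolean tryAcquireOrRenew(String resourceKey, String serverId, long nowMillis) {
        Lease current = leases.get(resourceKey);
        boolean free = current == null || current.expiresAtMillis <= nowMillis;
        if (free || current.serverId.equals(serverId)) {
            leases.put(resourceKey, new Lease(serverId, nowMillis + leaseMillis));
            return true;
        }
        return false;
    }
}
```

Each embedded agent would call `tryAcquireOrRenew` periodically and only run its plugin components for resources it currently owns. When a server goes down, its leases stop being renewed and another server's next call takes over. The time is passed in explicitly so the behavior is easy to reason about and test.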

In the inventory model, these virtual resource types could use the new relationship service to set up a sensible resource relationship structure that has less to do with how things are managed and more to do with modeling the physical relationships.

This setup is also useful for remote management of bare-metal hypervisors and for inventorying their guests.

Unmanaged Resources

Related to remote, agentless management are resources that aren't being managed at all. For example, we do a remote discovery (via "morganator") and find a box. There is no agent on it yet, but we put it in inventory as an "unmanaged platform". This lets us track hardware and integrate it with the provisioning subsystem so we can "bare metal provision" it. We can also hook into an "SSH remote deploy" capability to start managing the box by pushing out an agent to it.

At a minimum, we need a way to inject an agentless platform into inventory. With that, we could push an agent distro out to that platform via SSH. So, for example, a "morganator" output file could be handed off to the server and the server could add new agentless platforms. The UI could then provide a "right mouse menu option" to push an agent down to a platform. The open question is how to automate this when tens, hundreds, or more boxes have been discovered and need agents.
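As one hedged answer to the automation question, the per-box push could be wrapped in a bulk operation that runs a bounded number of installs in parallel and reports which boxes failed. In the sketch below, `AgentInstaller` and `BulkAgentPush` are hypothetical names. The installer interface stands in for whatever the SSH remote deploy capability ends up looking like and is not an existing API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical hook for the "push an agent out via SSH" step.
interface AgentInstaller {
    void install(String host) throws Exception;
}

final class BulkAgentPush {
    // Pushes an agent to every host, at most maxParallel at a time.
    // Returns a map of host -> error message for the installs that failed.
    static Map<String, String> pushAll(List<String> hosts, AgentInstaller installer, int maxParallel) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            Map<String, Future<?>> pending = new LinkedHashMap<>();
            for (String host : hosts) {
                pending.put(host, pool.submit(() -> {
                    installer.install(host);
                    return null;
                }));
            }
            Map<String, String> failures = new LinkedHashMap<>();
            for (Map.Entry<String, Future<?>> entry : pending.entrySet()) {
                try {
                    entry.getValue().get();
                } catch (ExecutionException e) {
                    failures.put(entry.getKey(), String.valueOf(e.getCause().getMessage()));
                } catch (InterruptedException e) {
                    // Preserve the interrupt and report the host as not installed.
                    Thread.currentThread().interrupt();
                    failures.put(entry.getKey(), "interrupted");
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }
}
```

The server could feed this with the agentless platforms created from a "morganator" output file and surface the returned failures in the UI, so that only the problem boxes need manual attention.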

We also need to think in terms of setting up a "template platform", that is, a platform with preconfigured/predeployed apps. This "template" platform doesn't exist anywhere on the network but can be used to build bundles that can be pushed out to provision boxes that do exist.

Random Thoughts from Joseph

i wholly concede there are some cases that just make sense to deviate from the conventional topology of deploying an agent to every box you want to monitor. quite honestly, i think our model partially (though poorly) handles these scenarios already (at least in the purest sense of the phrase) because an agent can use whatever protocol / mechanism it wants to connect to resources. the problem i see is that calling out to a remote box (web services, http/s, rmi, etc) will invalidate the current hierarchical inventory convention. any /remotely/ discovered resources will be subjugated beneath the /local/ platform managed by that /local/ agent. but, when you think about it, what we really want is to be able to discover the /remote/ platform on the fly, the one that's managing the /remote/ resources we're trying to inventory. if that happens, then we adhere to our hierarchical model quite nicely. but, to achieve that, we either need to:

1) glean enough information from the connection made to the remote server resource, so that we can first create a remote platform resource that represents its parent, then discover that remote server resource as a child under that remote platform, or...
2) have a more formal paradigm for "agent-less" connections such that we perform some sort of remote platform-level discovery (as opposed to trying to deduce the platform details from the remote server-level connection properties), and then do formal remote server-level discovery under that remote platform

but there are problems with both of these strategies at present. in both cases, we now need to have select agents across the enterprise managing multiple platforms, which requires potentially rewriting portions of the plugin container to properly support that scenario. today, the plugin container logic (for better or worse) presumes that there is one and only one platform that it needs to manage. so, imho, if we want a very elegant solution for an agent-less architecture, we need to put in the effort to redesign the agent to recognize and manage multiple platforms concurrently. if we can do that, then the following model ensues:

a) an agent, by definition, will monitor/manage the platform it is installed on, as well as any servers/services found on that box
b) an agent can monitor/manage an unbounded list of remote platforms via an "agent-less" design, as well as any remote servers/services found on that remote box under that remote platform
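The (a)/(b) model, combined with option 1 above, can be sketched as an inventory that holds one local platform plus any number of remote platforms. This is purely illustrative: `MultiPlatformInventorySketch` is a hypothetical class, not part of the plugin container, and using the host name from the connection URL as the remote platform key is an assumption made for the example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical inventory for one agent that manages its local platform
// and an unbounded set of remote ("agent-less") platforms.
final class MultiPlatformInventorySketch {
    private final String localPlatform;
    private final Map<String, List<String>> serversByPlatform = new LinkedHashMap<>();

    MultiPlatformInventorySketch(String localPlatform) {
        this.localPlatform = localPlatform;
        serversByPlatform.put(localPlatform, new ArrayList<>());
    }

    // (a) servers found on the agent's own box go under the local platform.
    void addLocalServer(String serverName) {
        serversByPlatform.get(localPlatform).add(serverName);
    }

    // (b) remote servers go under a remote platform, created on the fly from
    // the connection's host name (assumption: the host identifies the platform).
    String addRemoteServer(String serverName, URI connection) {
        String host = connection.getHost();
        if (host == null) {
            throw new IllegalArgumentException("no host in connection: " + connection);
        }
        serversByPlatform.computeIfAbsent(host, k -> new ArrayList<>()).add(serverName);
        return host;
    }

    List<String> platforms() {
        return new ArrayList<>(serversByPlatform.keySet());
    }

    List<String> serversOn(String platform) {
        return serversByPlatform.getOrDefault(platform, new ArrayList<>());
    }
}
```

Two remote servers reached through the same host land under the same remote platform, which keeps the hierarchical platform / server convention intact instead of placing remote servers beneath the local platform.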

there has been some discussion of possibly having "Inventory Plugins", which are trimmed down forms of full agent plugins, but which only focus on discovery of resources. my comments on that are as follows...

if the agentless implementation existed today, i'd suggest we leverage a full agent embedded into the server to replace the concept of inventory plugins. instead of having to complicate the server-side plugin mechanism, we wholly reuse what we get from our agent-side plugins. by having a full plugin container embedded in the server, we can support managing / monitoring an arbitrary number of remote platforms which, after platform-discovery is complete, can perform content-discovery (and subsequent management of that content) for that platform...or discovery / management of content for any remote servers found under that remote platform.

in short, keep inventory management going through the current agent-server policies so as to piggyback on the rich inventory synchronization mechanisms we have, which then enables you to leverage /everything/ that you can do through the plugin api and the full set of agent-side facets. basically, i'd like to see us keep management at the edges we have today...though i'm willing to be convinced if i hear more compelling information about why an embedded model with support for "agent-less" connections would not suffice.

Thoughts from Heiko

I think we should introduce the concept of a virtual platform (listed by Joseph as option 1) that serves as the root resource for other detected resources. First, users are now used to having the platform as a root, so having servers as roots would just confuse them. Also, lots of code internally relies on the platform being the root.

Log from an AS7 and RHQ call on May 31st, 2011

16:11:49 <pilhuhn> #topic Pseudo-Platforms
16:12:06 <pilhuhn> http://rhq-project.org/display/RHQ/Needed+-+pseudo+platforms
16:12:50 <pilhuhn> As the domain controller knows all managed as servers, RHQ can detect them even without an agent
16:13:02 <pilhuhn> for the local host with the DC the scenario is easy
16:13:30 <pilhuhn> How do we handle other manged AS (like AS7..9) in the diagram on the above page?
16:13:50 <pilhuhn> Put them the tree of the platform where the DC is running?
16:14:05 <pilhuhn> This is easy, but gives the admin a wrong impression
16:14:30 <pilhuhn> Detection on the other platform is not possible, as no agent is present
16:14:51 <pilhuhn> Letting them fall below the table is no option either , asantos ?
16:17:49 <mazz> this is probably the biggest change being proposed. as I mentioned earlier, it was something we recognized we needed a few years ago (in the context of Virtual Machines) and as jsanda mentioned, its an important concept when discussing how we support cloud.
16:17:50 <mazz> the issue can probably be boiled down to "agentless platforms".. for which we have a wiki page already: http://rhq-project.org/display/RHQ/Design-Agentless+Management
16:19:01 <pilhuhn> Link added to the wiki page
16:20:07 <pilhuhn> #idea Require to have an agent on each platform with a HC for now
16:22:10 <pilhuhn> But then admins are used to run agents on managed nodes
16:22:57 <pilhuhn> #agreed Require to have an agent on each platform with a HC for now
16:23:24 <pilhuhn> #action Investigate more what it takes to support pseudo-platforms and agentless management
16:23:33 <ips> pilhuhn: i for the most part agree w/ joe's thoughts on the bottom of mazz's link
16:24:20 <ips> add ability for an agent to discover one or more agentless platforms (and descendant resources) in addition to its own local platform
16:25:42 <ips> but there would be some challenges such as what if a user later decides to install an agent on an agentless platform
16:25:59 <pilhuhn> yep
16:26:14 <ips> would we seamlessly transform the existing platform resource into a full fledged agent-backed platform?
16:27:13 <pilhuhn> I think that could be doable - probably easier than the other way around
16:27:29 <ips> also would it make sense for the as7 plugin to be in charge of discovering the agentless platforms, or would it be better for the platform plugin (or agent/PC itself) to be in charge of it?
16:28:19 <pilhuhn> I would see it as a base service - not plugin specific. But the plugin would signal the PC that it needs a platform
16:28:19 <ips> the latter so that other plugins besides besides the as7 plugin could also potentially discover servers running on the agentless platform
16:29:12 <pilhuhn> I think we used to have the strong platform concept as once upon a time there was an idea to have licenses with platform count
16:29:15 <ips> eg - postgres plugin could discover a postgres server on the agentless platform which granted might not support all facets
16:29:22 <pilhuhn> yep
16:30:09 <pilhuhn> also as4 via a remote connection. The pseudo-platform would then be a "more natural" place for it, than within the platform of the agent
16:31:10 <ips> it would also be cool to have an operation on a pseudo-platform (or some other means in the gui) to install an agent to that platform and promote it to an agent-backed platform
16:31:38 <lkrejci> hmm... so this basically means duplicating all the resource types with their agentless counterparts (where it makes sense of course) and then have some matching logic in case the agentless becomes "agentful"
16:32:05 <pilhuhn> lkrejci why do you mean duplicating?
16:32:17 <ips> i know we already have the remote agent install page under Administration, but i'm thinking something more integrated with the existing agentless platform Resource
16:32:35 <lkrejci> pilhuhn: because agentless will have fewer features and will use different access "protocol"
16:32:37 <pilhuhn> I think there would be a new Platform "Pseudo" next to Linux, OS X and so on
16:33:08 <ips> right, but the servers and services would be the same restypes used on real platforms
16:33:25 <pilhuhn> ips yes
16:33:43 <lkrejci> but the discovery and resource components would differ
16:34:04 <pilhuhn> and currently you can already manually add an as4 on a platform by giving its JNP url. This can already be remote
16:34:09 <ips> actually it would be pretty cool if we could somehow detect the OS remotely and then use the existing platform restypes
16:34:11 <lkrejci> also their plugin configs could differ because of the different ways we "connect" to them
16:34:11 <pilhuhn> lkrejci only for the Pseudo-P
16:35:07 <lkrejci> pilhuhn: i am able to discover the apache remotely but i will not have info about its server-root and httpd.conf locations, hence i will not be able to use the same discovery component
16:35:08 <ips> lkrejci: yeah, that's true, and it might not even be possible to detect a reskey for certain types, eg - if the key is the install dir
16:35:09 <mazz> side note: we have a lsof plugin
16:35:09 <pilhuhn> #idea detect remote OS and use correct platform resource type (with some isPseudo flag)
16:35:15 <mazz> I think this can be used to detect remote platforms
16:35:37 <lkrejci> ips: yep, that's why i was talking about the "matching" logic - because even the res keys might not be the same
16:35:40 <mazz> I wrote this with greg's proding a year or so ago. I don't know if it has any relevenance. but I bring it up just because I don't think people know about it
16:36:01 <ccrouch> i hate to pour cold water on this topic, but what about saying "if you want to manage an AS7 instance you need to deploy an agent on to its platform"
16:36:27 <lkrejci> ccrouch: that's the only agreement we have so far
16:36:36 <ips> lkrejci: the matching logic scares me though. would be nice to use the same restypes even if it meant some tweaking of plugin configs or even res configs
16:36:39 <pilhuhn> ccrouch we did have that above already
16:36:43 <ips> res configs -> res keys
16:37:10 <ccrouch> pilhuhn: lkrejci: ok we're good then right ? No more pseudo platforms
16:37:36 <pilhuhn> ccrouch I think it is still good to have this stream of thoughts
16:37:47 <ips> lkrejci: another option would be to add an optional remoteKey and remotePluginConfig to the ResourceType entity
16:38:50 <ccrouch> i'm not going to try to dam the stream of consciousness
16:38:56 <pilhuhn> Ok, thanks everyone for your time
