Both the RHQ Server and RHQ Agent use the same underlying communications layer for their incoming and outgoing messages (which is based on JBoss/Remoting v2). It is highly configurable, but the defaults are sensible, so you do not need to modify most of the configuration settings. The RHQ Server and RHQ Agent configuration files and settings look similar because many of their preference settings configure this common communications layer. This section describes these settings so that you can configure your RHQ Servers and RHQ Agents appropriately.
You can secure the communications between the server and agent. To read more on how to set that up, see Securing Communications.
When the RHQ Agent starts for the first time, it enters setup mode. You can also manually enter this setup mode by using the --setup command line option or by using the setup agent prompt command. Once in setup mode, you will be prompted to provide values for a series of preference settings. Some of these settings involve setting up the communications services of the agent. Other settings are only prompted if you enter "advanced" setup mode or "all" setup mode (these "advanced" and "all" settings are marked with (a) below). To enter advanced setup mode and thus be able to set advanced settings, use the --advanced command line option with --setup or use the prompt command setup advanced. To enter the "all" setup mode (which allows you to set every preference available), you must use the prompt command setup all.
- Agent Hostname or IP Address - this is the address that the agent will bind to when listening for incoming messages. This usually is the same address that the agent's remote clients (aka RHQ Servers) will use when trying to connect to the agent. However, in some network setups, this may not always be the case (e.g. a remote client going through a router that forwards requests to a different host). If you want the RHQ Server to connect to this RHQ Agent via a different address, you need to set up some special Transport Parameters to indicate this, as described below.
- Agent Port - this is the port that the agent will actually be listening to. This usually is the same port that the agent's remote clients (RHQ Servers) will use when trying to send messages to the agent. However, in some network setups, this may not always be the case (e.g. a remote client going through a router that forwards requests to a different port). If you want the RHQ Server to connect to this RHQ Agent via a different port, you need to set up some special Transport Parameters to indicate this, as described below.
- Agent Transport Protocol(a) - this is the transport the agent expects incoming messages to adhere to. This is usually "socket" or "sslsocket" (for raw binary socket messages, either unsecured or secured). The RHQ Agent does not host a servlet container, so you cannot use a servlet based transport (servlet, sslservlet) for server-to-agent communications. You can only use servlet based transports for agent-to-server communications (see the "RHQ Server Transport Protocol" preference below).
- Agent Transport Parameters(a) - these are additional parameters to be used when the agent creates its communications services and when its remote clients (RHQ Servers) need to talk to the agent. See the section below, Transport Parameters, for the different types of transport parameters you can set.
- RHQ Server Hostname or IP Address - this is the IP address for the endpoint of the primary RHQ Server this agent will talk to. This RHQ Server IP Address value is dictated by the way the RHQ Server has configured its communications services (using similar settings as the ones being described for the agent). Please refer to your RHQ Server's communications configuration to see what exact value this should be.
- RHQ Server Port - this is the port that the primary RHQ Server is listening to. This RHQ Server Port value is dictated by the way the RHQ Server has configured its communications services (using similar settings such as these being described for the agent). Please refer to your RHQ Server's communications configuration for what exact value this should be.
- RHQ Server Transport Protocol(a) - this is the transport the primary RHQ Server will expect its incoming messages to flow over. This RHQ Server Transport value is dictated by the way the RHQ Server has configured its communications services (using similar settings such as these being described for the agent). Please refer to your RHQ Server's communications configuration for what exact value this should be.
- RHQ Server Transport Parameters(a) - these are additional Transport Parameters that are to be used when the agent connects to the primary RHQ Server. This RHQ Server Transport Parameters value is dictated by the way the RHQ Server has configured its communications services (using similar settings such as these being described for the agent). Please refer to your RHQ Server's communications configuration for what exact value this should be. In particular, you need to know what additional transport parameters the RHQ Server wants its clients to define. This is especially important if the RHQ Agent needs to connect to a different host and/or port than what the RHQ Server actually binds to.
- Command Send Timeout(a) - this is the number of milliseconds the agent will wait before aborting a command (i.e. the time, in milliseconds, that the RHQ Server has to process a command and return its results). Please ensure that this value is the same as the timeout specified in the transport parameters of your RHQ Server URI (if specified), since both timeouts will be enforced. If this Command Send Timeout value is less than or equal to 0, the agent will not time out its messages (note that if a timeout is specified in the RHQ Server URI transport parameters, that timeout will still be enforced).
- Command Send Retry Interval(a) - This is the minimum amount of time, in milliseconds, the agent will wait before trying to resend a guaranteed command that previously failed.
- Command Send Max Retries(a) - If a guaranteed delivery message is sent, but the agent fails to connect to the server and deliver the message, it will always be retried. However, if the error was something other than a 'cannot connect' error, the command will only be retried this number of times before it is dropped (at which time it will be considered lost forever).
- Maximum Commands To Concurrently Send(a) - This is the maximum number of commands the agent can send to the server at any one time. If you defined clientMaxPoolSize in your RHQ Server URI transport parameters, make sure its value is the same as this "Maximum Commands To Concurrently Send" value since you effectively cannot have one higher than the other.
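The two consistency rules above (Command Send Timeout versus the URI timeout, and Maximum Commands To Concurrently Send versus clientMaxPoolSize) can be checked mechanically. The following is only a minimal sketch, not part of RHQ: the function name is hypothetical, and the host and port in the usage note are made up. It assumes the RHQ Server URI carries its transport parameters as a query string.

```python
from urllib.parse import urlsplit, parse_qs


def check_agent_against_server_uri(server_uri, command_send_timeout_ms, max_concurrent_sends):
    """Hypothetical helper: return a list of warnings (empty if the values agree)."""
    # Transport parameters sit in the query-string part of the URI.
    params = parse_qs(urlsplit(server_uri).query)
    warnings = []

    # Both timeouts are enforced, so they should hold the same value.
    if "timeout" in params:
        uri_timeout = int(params["timeout"][0])
        if uri_timeout != command_send_timeout_ms:
            warnings.append(
                f"Command Send Timeout is {command_send_timeout_ms} ms "
                f"but the URI timeout is {uri_timeout} ms"
            )

    # Neither concurrency value can effectively be higher than the other.
    if "clientMaxPoolSize" in params:
        pool_size = int(params["clientMaxPoolSize"][0])
        if pool_size != max_concurrent_sends:
            warnings.append(
                f"Maximum Commands To Concurrently Send is {max_concurrent_sends} "
                f"but clientMaxPoolSize is {pool_size}"
            )

    return warnings
```

For instance, calling it with the made-up URI socket://rhqserver.example.com:7800/?timeout=60000&clientMaxPoolSize=5, a timeout of 30000 and a concurrency of 5 would report only the timeout mismatch.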
The agent can be configured to auto-detect its RHQ Server. It can do this in two different ways:
- Multicast detection : using multicast detection technology, the agent can usually detect the RHQ Server coming online or going offline within a matter of seconds (a time which is configurable). This requires your network to support multicast traffic; if it does not, then you cannot use this method of server auto-detection. The following configuration preferences affect auto-detection using the multicast detector:
- rhq.agent.server-auto-detection must be set to true in order to enable this feature
- rhq.communications.multicast-detector.enabled must be set to true in order to enable this feature
- rhq.communications.multicast-detector.default-time-delay is the number of milliseconds that must pass without hearing from the RHQ Server before it is to be considered "offline". To quickly detect a RHQ Server shutting down or starting up, set this to a short time. To reduce the amount of network traffic, set this value to a longer time. However, ensure that this value is longer than the server's heartbeat-time-delay, otherwise, unnecessary network traffic will result.
- rhq.communications.multicast-detector.heartbeat-time-delay is the number of milliseconds that must pass between the agent's own heartbeat messages. This value must be shorter than the RHQ Server's default-time-delay; otherwise, unnecessary network traffic will result.
- Server polling : this mechanism polls the RHQ Server periodically to determine if it is online or offline. This method of auto-detection does not require multicast traffic but does require the agent to periodically connect to the RHQ Server and send it a simple ping command. The following configuration preference affects this "manual" server detection via polling:
- rhq.agent.client.server-polling-interval-msecs is set to the number of milliseconds that must pass before polling the server. To quickly detect the RHQ Server going down or coming up, set this to a short time; to reduce the amount of network traffic, set this to a longer time. If this value is 0 or less, server polling is disabled.
Typically one or both of these mechanisms are enabled. When the agent can auto-detect the RHQ Server going offline, it has the opportunity to persist commands that are waiting to be sent and to stop its attempts to send commands. When the RHQ Server comes back online (and is auto-detected), the agent can resume. If, however, both auto-detection features are disabled, then the agent, upon startup, will immediately assume the RHQ Server is online and will allow commands to be sent. If, at some point, the RHQ Server is down, the agent will continually attempt to send it commands and will receive "connection refused" errors. If the RHQ Server is down for a long period of time, this will cause the agent log file to grow very large. This is one reason why it is best to have at least one auto-detection mechanism enabled.
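As a rough illustration of the rules above, the sketch below reports which auto-detection mechanisms a set of agent preferences would enable, and checks the heartbeat/default-time-delay relationship. The preference names come from this section; the function names are hypothetical, and treating a missing preference as "false" or "0" is an assumption made for this sketch, not a statement about the agent's real defaults.

```python
def enabled_detection_mechanisms(prefs):
    """Return the auto-detection mechanisms that the given preferences enable."""
    mechanisms = []

    # Multicast detection needs both switches turned on.
    # Assumption for this sketch: a missing preference counts as "false".
    auto_detect = prefs.get("rhq.agent.server-auto-detection", "false").lower() == "true"
    multicast = prefs.get("rhq.communications.multicast-detector.enabled", "false").lower() == "true"
    if auto_detect and multicast:
        mechanisms.append("multicast")

    # Server polling is disabled when the interval is 0 or less.
    # Assumption for this sketch: a missing preference counts as "0".
    interval = int(prefs.get("rhq.agent.client.server-polling-interval-msecs", "0"))
    if interval > 0:
        mechanisms.append("polling")

    return mechanisms


def heartbeat_is_consistent(sender_heartbeat_ms, listener_default_delay_ms):
    """The sender's heartbeat-time-delay must be shorter than the
    listener's default-time-delay."""
    return sender_heartbeat_ms < listener_default_delay_ms
```

An empty result from enabled_detection_mechanisms corresponds to the case described above, in which the agent assumes the RHQ Server is always online.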
Settings analogous to those described above exist on the RHQ Server side as well. In order to set these communications configuration preferences on the RHQ Server, you edit their values in the <server-install-dir>/bin/rhq-server.properties file. The RHQ Server must be restarted for any new values in the properties file to take effect.
|It is highly recommended that the RHQ Server only use the servlet and sslservlet transports for incoming messages. When you configure your agents to talk to the RHQ Server, they should use either servlet or sslservlet. There has been no stress testing of the socket-based transports and thus their performance under high load cannot be guaranteed. There is only one special case where you would ever have to use a socket-based transport - see Securing Communications#Setting Up Server-Side sslsocket Transport for more information. The following example shows the experimental usage of the socket transport just as an illustration of configuring the RHQ Server's communications, but this should not be used in production.|
Note that the rhq.communications.connector.transport-params setting allows you to set custom Transport Parameters just like you could do for the agent. Also note that these RHQ Server endpoint settings:
hold the values that your agent will need to know in order to talk to this server.
For example, if the following settings are used to configure your RHQ Server:
then when you set up your RHQ Agent, its configuration values would be:
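The correspondence between the server-side connector settings and the agent setup prompts can be sketched as a simple lookup. This is an illustration only: the property name rhq.communications.connector.transport is assumed by analogy with the bind-address, bind-port and transport-params properties named in this section, the function name is hypothetical, and the example values are made up (they are not the values from the original example).

```python
# Server connector property -> agent setup prompt that needs the same value.
# "rhq.communications.connector.transport" is an assumed property name.
SERVER_TO_AGENT_PROMPT = {
    "rhq.communications.connector.bind-address": "RHQ Server Hostname or IP Address",
    "rhq.communications.connector.bind-port": "RHQ Server Port",
    "rhq.communications.connector.transport": "RHQ Server Transport Protocol",
    "rhq.communications.connector.transport-params": "RHQ Server Transport Parameters",
}


def agent_setup_values(server_props):
    """Hypothetical helper: answers to give at the agent's setup prompts."""
    return {
        prompt: server_props.get(key, "")
        for key, prompt in SERVER_TO_AGENT_PROMPT.items()
    }


# Made-up server settings, for illustration only.
example_server_props = {
    "rhq.communications.connector.transport": "socket",
    "rhq.communications.connector.bind-address": "192.168.0.10",
    "rhq.communications.connector.bind-port": "7800",
    "rhq.communications.connector.transport-params": "enableTcpNoDelay=true",
}
```

This simple one-to-one mapping only holds when agents reach the server on the same address and port that the server binds to.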
You can set concurrency limits on the RHQ Server to help its communications subsystem avoid getting so flooded with agent messages that it can no longer operate properly. The out-of-box settings are usually the values you want to keep - a lot of performance testing has gone into determining the proper values for the typical RHQ deployment. If you decide you want to change these values, you must understand the relationship between the concurrency limits and the other communication settings. Important factors that you should be familiar with are listed below:
- If you use the default RHQ Server transport of "servlet" (or "sslservlet"), the number of allowed incoming messages is capped at the maximum web connections allowed by the Tomcat connector (rhq.server.startup.web.max-connections). Therefore, this puts an upper limit on the global concurrency limit (rhq.communications.global-concurrency-limit).
- The global concurrency limit (rhq.communications.global-concurrency-limit) is the upper limit for all the other concurrency limits since the global concurrency limit is the maximum number of messages that can be processed, regardless of the message type.
- If you switch the transport that agents use to send messages to the server to something other than servlet or sslservlet (that is, to a transport that does not send messages over the Tomcat connector), then the low-level transport parameter "maxThreads" caps the maximum number of messages that can be received.
- The larger the number of messages allowed to enter the server concurrently, the larger the number of database connections that will potentially hit your database (so your database needs to be configured for a higher number of connections) and the greater the likelihood that the database connections in the RHQ Server's connection pools get exhausted (in either case, problems can occur).
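The first two relationships above can be expressed as a small sanity check. This is a sketch under assumptions: the function name is hypothetical, and the per-message-type limit name used in the usage note is a made-up placeholder rather than a real RHQ property.

```python
def check_concurrency_limits(max_web_connections, global_limit, per_type_limits):
    """Hypothetical helper: return a list of problems (empty if consistent).

    max_web_connections -- value of rhq.server.startup.web.max-connections
    global_limit        -- value of rhq.communications.global-concurrency-limit
    per_type_limits     -- dict of other concurrency limits, keyed by name
    """
    problems = []

    # With the servlet/sslservlet transport, the Tomcat connector caps
    # how many messages can arrive at once.
    if global_limit > max_web_connections:
        problems.append(
            f"global limit {global_limit} exceeds max web connections {max_web_connections}"
        )

    # The global limit bounds every other concurrency limit.
    for name, limit in per_type_limits.items():
        if limit > global_limit:
            problems.append(f"{name}={limit} exceeds global limit {global_limit}")

    return problems
```

For example, check_concurrency_limits(200, 30, {"some-message-type-limit": 50}) would flag the placeholder per-type limit because it is larger than the global limit.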
Both the RHQ Server and RHQ Agent use JBoss/Remoting as their underlying remoting framework. JBoss/Remoting defines remote endpoints (which identify how to connect to a RHQ Server or RHQ Agent) via InvokerLocator strings, which look like simple URLs. An example is socket://myhost.corp.com:16163/?transportParam1=value1&param2=value2. An InvokerLocator consists of a transport protocol (socket:) as well as the host and port of the remote endpoint. An InvokerLocator can also carry transport parameters, which further customize the endpoint. They help define the behavior of the underlying communications connector (which is the component that accepts incoming messages from remote clients).
The RHQ Server and RHQ Agent can each define their own transport parameters via their rhq.communications.connector.transport-params preference setting (the agent's advanced setup mode will prompt you with Agent Transport Parameters when asking for this value). Transport parameters are appended to the end of the InvokerLocator in the same way a query string is appended to a URL. When you define your transport parameters, you must use the same syntax; specifically, multiple transport parameters are separated by ampersand characters. If you set transport parameters inside your XML config file, be sure you use the proper &amp; string to represent the ampersand.
Below you will find listed all the possible transport parameters that can be defined in your RHQ Server or RHQ Agent rhq.communications.connector.transport-params setting. They are split into two groups: the first group contains settings that affect the server-side socket behavior and the second group contains settings that affect the client-side behavior. In this context, "server-side" refers to the connector that accepts incoming messages and "client-side" refers to the remote clients that send outgoing messages. RHQ Servers and RHQ Agents have both a server side and a client side since they both send messages to and receive messages from each other.
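To make the InvokerLocator syntax concrete, here is a minimal sketch that assembles a locator string from its parts and escapes a transport-params value for use inside an XML file. The function name is hypothetical and is not part of RHQ or JBoss/Remoting; the sketch covers only query-style parameters and performs no validation.

```python
from xml.sax.saxutils import escape


def build_invoker_locator(transport, host, port, transport_params=None):
    """Hypothetical helper: join the pieces of an InvokerLocator-style URI."""
    locator = f"{transport}://{host}:{port}/"
    if transport_params:
        # Multiple transport parameters are separated by ampersands,
        # exactly like a URL query string.
        query = "&".join(f"{name}={value}" for name, value in transport_params.items())
        locator += "?" + query
    return locator


locator = build_invoker_locator(
    "socket", "myhost.corp.com", 16163,
    {"transportParam1": "value1", "param2": "value2"},
)
print(locator)

# Inside an XML config file the ampersand must be written as an entity.
print(escape("transportParam1=value1&param2=value2"))
```

The first print reproduces the example locator given above; the second shows the same parameter string with the ampersand replaced by its XML entity.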
|Please refer to the JBoss/Remoting v2 documentation for additional information on these settings and their low-level implementation details.|
- Settings that affect the server-side
- serverBindAddress - The address on which the server socket binds to listen for requests. The default is an empty value which indicates the server socket should be bound to the host provided by the InvokerLocator URL (the host). This would be needed in the case that the client will be going through a router that forwards requests made externally to a different IP address or hostname internally (e.g. NAT setup).
- serverBindPort - The port to listen for requests on. This would be needed in the case that the client will be going through a router that forwards requests made externally to a different port internally.
- timeout - The socket timeout value. The default on the server side is 60000 (one minute). If the timeout parameter is set, its value will also be passed to the client-side (see below).
- backlog - The preferred number of unaccepted incoming connections allowed at a given time. The actual number may be greater than the specified backlog. When the queue is full, further connection requests are rejected. The value must be greater than 0. If the value passed is less than or equal to 0, the default value is used. The default value is 200.
- numAcceptThreads - The number of threads that exist for accepting client connections. The default is 1.
- maxPoolSize - The number of server threads for processing client requests. The default is 300.
- socket.check_connection - indicates if the invoker should try to check the connection before re-using it by sending a single byte ping from the client to the server and then back from the server. This configuration needs to be set on both the client and server to work. The default value is false.
- Settings that affect the client-side
- timeout - The socket timeout value. The default on the client side is 1800000 (or 30 minutes).
- enableTcpNoDelay - can be either true or false and indicates whether the client socket should have TCP_NODELAY turned on or off. TCP_NODELAY has one specific purpose: to disable the Nagle buffering algorithm. It should only be set for applications that send frequent small bursts of information without getting an immediate response. The default is false.
- clientMaxPoolSize - the client-side maximum number of active socket connections. This basically equates to the maximum number of concurrent client calls that can be made from the socket client invoker. The default is 50.
- numberOfRetries - number of retries to get a socket from the pool. This basically equates to the number of seconds the client will wait to get a client socket connection from the pool before timing out. If the max retries is reached, a CannotConnectException will be thrown. The default is 30.
- numberOfCallRetries - number of retries for making the invocation. This is unrelated to numberOfRetries because it only comes into play after the client has already obtained a socket connection from the pool. It is possible that the socket connection timed out while waiting within the pool. Since a connection check is not done by default, such a connection is thrown away and an attempt is made to get a new one. This is repeated up to numberOfCallRetries times (the default is 3). However, when (numberOfCallRetries - 2) is reached, the entire connection pool is flushed, under the assumption that all connections in the pool have timed out and are invalid, and the client starts over by creating a new connection. If this still fails, a MarshalException is thrown.
- socket.check_connection - indicates if the invoker should try to check the connection before re-using it by sending a single byte ping from the client to the server and then back from the server. This configuration needs to be set on both the client and server to work. The default value is false.
The example below illustrates how a single InvokerLocator URL with transport parameters can configure both server-side and client-side behavior. Let's assume we are configuring a RHQ Agent with the following settings:
- transport: socket
- bind-address: 192.168.0.5
- bind-port: 56565
- transport-params: backlog=300&enableTcpNoDelay=false
The full RHQ Agent URI would then look like: socket://192.168.0.5:56565/?backlog=300&enableTcpNoDelay=false. The backlog transport parameter is a server-side configuration and the enableTcpNoDelay transport parameter is a client-side configuration. When the RHQ Agent creates its server socket and begins listening for incoming messages from RHQ Servers, it sets its backlog to 300. The RHQ Agent ignores the enableTcpNoDelay parameter because it is only useful for clients that want to talk to its server socket. When a RHQ Server sends a message to this RHQ Agent, the RHQ Server will ignore the backlog parameter because it is strictly a server-side setting but it will disable its TCP_NODELAY setting. As you can see, a single InvokerLocator can provide useful information for both ends of the communications channel (client and server).
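The split described above can be sketched as follows. The two parameter sets simply mirror the lists in this section (timeout and socket.check_connection appear in both); the function name is hypothetical, and this is an illustration of the idea rather than how JBoss/Remoting itself processes the locator.

```python
from urllib.parse import urlsplit, parse_qsl

# Parameter names taken from the two lists in this section.
SERVER_SIDE = {
    "serverBindAddress", "serverBindPort", "timeout", "backlog",
    "numAcceptThreads", "maxPoolSize", "socket.check_connection",
}
CLIENT_SIDE = {
    "timeout", "enableTcpNoDelay", "clientMaxPoolSize",
    "numberOfRetries", "numberOfCallRetries", "socket.check_connection",
}


def split_transport_params(locator):
    """Hypothetical helper: return (server_side, client_side) parameter dicts."""
    params = dict(parse_qsl(urlsplit(locator).query))
    server_side = {k: v for k, v in params.items() if k in SERVER_SIDE}
    client_side = {k: v for k, v in params.items() if k in CLIENT_SIDE}
    return server_side, client_side


agent_uri = "socket://192.168.0.5:56565/?backlog=300&enableTcpNoDelay=false"
server_side, client_side = split_transport_params(agent_uri)
print(server_side)  # what the listening RHQ Agent would use
print(client_side)  # what a connecting RHQ Server would use
```

With the agent URI from the example, backlog lands on the server side and enableTcpNoDelay lands on the client side.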
One main use of transport parameters is to configure the communications subsystem to work within a NAT environment (i.e. where local bind addresses and ports are different from their public addresses and ports). The following example will show how you can configure your communication settings to support NAT.
Suppose you have an RHQ Agent that you want to bind to a local address "local", listening to local port 11111. But suppose, due to NAT rules, your RHQ Agent will be accessible over a public address "public" over port 22222. Here are the settings that would enable this (note that the given transport params shown are only those that are related to supporting this NAT use-case; you can have other params but they have been omitted here for simplicity's sake):
|Notice that the name of the property "rhq.communications.connector.bind-address" is confusing in this case (because it isn't really the bind address when you override it with transport param "serverBindAddress" - it really is the "public" address now). The same thing goes for the "rhq.communications.connector.bind-port" setting and the "serverBindPort" transport param.|
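The NAT scenario above can be sketched in code. This is a reconstruction based on the descriptions of serverBindAddress and serverBindPort in this section, not a copy of the original settings listing: the function name is hypothetical, and "public" and "local" are the placeholder host names used in the scenario. The assumption is that the bind-address and bind-port preferences hold the publicly advertised endpoint, while the two transport parameters override where the socket actually listens.

```python
from urllib.parse import parse_qsl


def nat_endpoints(bind_address, bind_port, transport_params):
    """Hypothetical helper: return (advertised, listening) endpoints as
    (host, port) tuples."""
    params = dict(parse_qsl(transport_params))

    # What remote clients see in the InvokerLocator.
    advertised = (bind_address, int(bind_port))

    # Where the server socket really binds; falls back to the advertised
    # values when the override parameters are not given.
    listening = (
        params.get("serverBindAddress", bind_address),
        int(params.get("serverBindPort", bind_port)),
    )
    return advertised, listening


advertised, listening = nat_endpoints(
    "public", 22222, "serverBindAddress=local&serverBindPort=11111"
)
print(advertised)
print(listening)
```

Without the two override parameters, both endpoints are identical, which matches the non-NAT case.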