xymond - Master network daemon for a Xymon server
xymond [options]
xymond is the core daemon in the Xymon Monitor. It is designed to handle
monitoring of a large number of hosts, with a strong focus on being a
high-speed, low-overhead implementation of a Big Brother compatible server.
To achieve this, xymond stores all information about the state of the monitored
systems in memory, instead of storing it in the host filesystem. A number of
plug-ins can be enabled to enhance the basic operation; e.g. a set of plugins
are provided to implement persistent storage in a way that is compatible with
the Big Brother daemon. However, even with these plugins enabled, xymond still
performs much faster than the standard bbd daemon.
xymond is normally started and controlled by the
xymonlaunch(8) tool, and
the command used to invoke xymond should therefore be in the tasks.cfg file.
- --hosts=FILENAME
- Specifies the path to the Xymon hosts.cfg file. This is
used to check if incoming status messages refer to known hosts; depending
on the "--ghosts" option, messages for unknown hosts may be
dropped. If this option is omitted, the default path used is set by the
HOSTSCFG environment variable.
- --checkpoint-file=FILENAME
- With regular intervals, xymond will dump all of its
internal state to this check-point file. It is also dumped when xymond
terminates, or when it receives a SIGUSR1 signal.
- --checkpoint-interval=N
- Specifies the interval (in seconds) between dumps to the
check-point file. The default is 900 seconds (15 minutes).
- --restart=FILENAME
- Specifies an existing file containing a previously
generated xymond checkpoint. When starting up, xymond will restore its
internal state from the information in this file. You can use the same
filename for "--checkpoint-file" and "--restart".
- --ghosts={allow|drop|log|match}
- How to handle status messages from unknown hosts. The
"allow" setting accepts all status messages, regardless of
whether the host is known in the hosts.cfg file or not. "drop"
silently ignores reports from unknown hosts. "log" works like
drop, but logs the event in the xymond output file. "match" will
try to match the name of the unknown host reporting with the known names
by ignoring any domain-names - if a match is found, then a temporary
client alias is automatically generated. The default is "log".
- --no-purple
- Prevent status messages from going purple when they are no
longer valid. Unlike the standard bbd daemon, purple-handling is done by
xymond.
- --merge-clientlocal
- The client-local.cfg(5) file contains
client-configuration which can be found matching a client against its
hostname, its classname, or the name of the OS the client is running. By
default xymond will return one entry from the file to the client, looking
for a hostname, classname or OS match (in that order). This option causes
xymond to merge all matching entries together into one and return all of
it to the client.
- --listen=IP[:PORT]
- Specifies the IP-address and port where xymond will listen
for incoming connections. By default, xymond listens on IP 0.0.0.0 (i.e.
all IP- addresses available on the host) and port 1984.
- --lqueue=NUMBER
- Specifies the listen-queue for incoming connections. You
don't need to tune this unless you have a very busy xymond daemon.
- --no-bfq
- Tells xymond to NOT use the local messagequeue interface
for receiving status- updates from xymond_client and xymonnet.
- --daemon
- xymond is normally started by xymonlaunch(8). If you
do not want to use xymonlaunch, you can start xymond with this option; it
will then detach from the terminal and continue running as a background
task.
- --timeout=N
- Set the timeout used for incoming connections. If a status
has not been received more than N seconds after the connection was
accepted, then the connection is dropped and any status message is
discarded. Default: 10 seconds.
- --flap-count=N
- Track the N latest status-changes for flap-detection. See
the --flap-seconds option also. To disable flap-checks globally,
set N to zero. To disable for a specific host, you must use the
"noflap" option in hosts.cfg(5). Default: 5
- --flap-seconds=N
- If a status changes more than flap-count times in N
seconds or less, then it is considered to be flapping. In that case, the
status is locked at the most severe level until the flapping stops. The
history information is not updated after the flapping is detected.
NOTE: If this is set higher than the default value, you should also
use the --flap-count option to ensure that enough status-changes
are stored for flap detection to work. The flap-count setting should be at
least (N/300)-1, e.g. if you set flap-seconds to 3600 (1 hour), then
flap-count should be at least (3600/300)-1, i.e. 11. Default: 1800 seconds
(30 minutes).
- --delay-red=N
- --delay-yellow=N
- Sets the delay before a red/yellow status causes a change
in the web page display. Is usually controlled on a per-host basis via the
delayred and delayyellow settings in hosts.cfg(5) but
these options allow you to set a default value for the delays. The value N
is in minutes. Default: 0 minutes (no delay). Note: Since most tests only
execute once every 5 minutes, it will usually not make sense to set N to
anything but a multiple of 5.
- --env=FILENAME
- Loads the content of FILENAME as environment settings
before starting xymond. This is mostly used when running as a stand-alone
daemon; if xymond is started by xymonlaunch, the environment settings are
controlled by the xymonlaunch tasks.cfg file.
- --pidfile=FILENAME
- xymond writes the process-ID it is running with to this
file. This is for use in automated startup scripts. The default file is
$XYMONSERVERLOGS/xymond.pid.
- --log=FILENAME
- Redirect all output from xymond to FILENAME.
- --store-clientlogs[=[!]COLUMN]
- Determines which status columns can cause a client message
to be broadcast to the CLICHG channel. By default, no client messages are
pushed to the CLICHG channel. If this option is specified with no
parameter list, all status columns that go into an alert state will
trigger the client data to be sent to the CLICHG channel. If a parameter
list is added to this option, only those status columns listed in the list
will cause the client data to be sent to the CLICHG channel. Several
column names can be listed, separated by commas. If all columns are given
as "!COLUMNNAME", then all status columns except those listed
will cause the client data to be sent.
- --status-senders=IP[/MASK][,IP/MASK]
- Controls which hosts may send "status",
"combo", "config" and "query" commands to
xymond.
By default, any host can send status-updates. If this option is used, then
status-updates are accepted only if they are sent by one of the
IP-addresses listed here, or if they are sent from the IP-address of the
host that the updates pertains to (this is to allow Xymon clients to send
in their own status updates, without having to list all clients here). So
typically you will need to list your servers running network tests here.
The format of this option is a list of IP-addresses, optionally with a
network mask in the form of the number of bits. E.g. if you want to accept
status-updates from the host 172.16.10.2, you would use
--status-senders=172.16.10.2
whereas if you want to accept status updates from both 172.16.10.2 and from
all of the hosts on the 10.0.2.* network (a 24-bit IP network), you would
use
--status-senders=172.16.10.2,10.0.2.0/24
- --maint-senders=IP[/MASK][,IP/MASK]
- Controls which hosts may send maintenance commands to
xymond. Maintenance commands are the "enable",
"disable", "ack" and "notes" commands.
Format of this option is as for the --status-senders option. It is
strongly recommended that you use this to restrict access to these
commands, so that monitoring of a host cannot be disabled by a rogue user
- e.g. to hide a system compromise from the monitoring system.
Note: If messages are sent through a proxy, the IP-address
restrictions are of little use, since the messages will appear to
originate from the proxy server address. It is therefore strongly
recommended that you do NOT include the address of a server running
xymonproxy in the list of allowed addresses.
- --www-senders=IP[/MASK][,IP/MASK]
- Controls which hosts may send commands to retrieve the
state of xymond. These are the "xymondlog",
"xymondboard" and "xymondxboard" commands, which are
used by xymongen(1) and combostatus(1) to retrieve the state
of the Xymon system so they can generate the Xymon webpages.
Note: If messages are sent through a proxy, the IP-address
restrictions are of little use, since the messages will appear to
originate from the proxy server address. It is therefore strongly
recommended that you do NOT include the address of a server running
xymonproxy in the list of allowed addresses.
- --admin-senders=IP[/MASK][,IP/MASK]
- Controls which hosts may send administrative commands to
xymond. These commands are the "drop" and "rename"
commands. Access to these should be restricted, since they provide an
un-authenticated means of completely disabling monitoring of a host, and
can be used to remove all traces of e.g. a system compromise from the
Xymon monitor.
Note: If messages are sent through a proxy, the IP-address
restrictions are of little use, since the messages will appear to
originate from the proxy server address. It is therefore strongly
recommended that you do NOT include the address of a server running
xymonproxy in the list of allowed addresses.
- --no-download
- Disable the "download" command which can be used
by clients to pull files from the Xymon server. The use of these may be
seen as a security risk since they allow file downloads.
- --ack-each-color
- By default, sending an ACK for a yellow status stops alerts
from being sent while the status remains yellow or red. A status change
from yellow to red will not re-enable alerts - the ACK covers all
non-green statuses. With this option, an ACK is valid only for the color
of the status when the ACK was sent. So an ACK for a yellow status is
ignored if the status later changes to red, but an ACK for a red status
covers both yellow and red.
Note: An ACK for a red status will clear any existing yellow acks. This
means that a long-lived ack for yellow is lost when you send a short-lived
ack for red. Hence alerts will restart when the red ack expires, even if
the status by then has changed to yellow.
- --ack-log=FILENAME
- Log acknowledgements created on the Critical Systems page
to FILENAME. NB, acknowledgements created by the Acknowledge Alert CGI are
automatically written to acknowledge.log in the Xymon server log
directory. Alerts from the Critical Systems page can be directed to the
same log.
- --debug
- Enable debugging output.
- --dbghost=HOSTNAME
- For troubleshooting problems with a specific host, it may
be useful to track the exact communications from a single host. This
option causes xymond to dump all traffic from a single host to the file
"/tmp/xymond.dbg".
When a status arrives, xymond matches the old and new color against the
"alert" colors (from the "ALERTCOLORS" setting) and the
"OK" colors (from the "OKCOLORS" setting). The old and new
color falls into one of three categories:
OK: The color is one of the "OK" colors (e.g.
"green").
ALERT: The color is one of the "alert" colors (e.g.
"red").
UNDECIDED: The color is neither an "alert" color nor an
"OK" color (e.g. "yellow").
If the new status shows an ALERT state, then a message to the
xymond_alert(8) module is triggered. This may be a repeat of a previous
alert, but
xymond_alert(8) will handle that internally, and only send
alert messages with the interval configured in
alerts.cfg(5).
If the status goes from a not-OK state (ALERT or UNDECIDED) to OK, and there is
a record of having been in a ALERT state previously, then a recovery message
is triggered.
The use of the OK, ALERT and UNDECIDED states make it possible to avoid being
flooded with alerts when a status flip-flops between e.g yellow and red, or
green and yellow.
A lot of functionality in the Xymon server is delegated to "worker
modules" that are fed various events from xymond via a
"channel". Programs access a channel using IPC mechanisms -
specifically, shared memory and semaphores - or by using an instance of the
xymond_channel(8) intermediate program. xymond_channel enables access
to a channel via a simple file I/O interface.
A skeleton program for hooking into a xymond channel is provided as part of
Xymon in the
xymond_sample(8) program.
The following channels are provided by xymond:
status This channel is fed the contents of all incoming
"status" and "summary" messages.
stachg This channel is fed information about tests that change status,
i.e. the color of the status-log changes.
page This channel is fed information about tests where the color changes
between an alert color and a non-alert color. It also receives information
about "ack" messages.
data This channel is fed information about all "data" messages.
notes This channel is fed information about all "notes"
messages.
enadis This channel is fed information about hosts or tests that are
being disabled or enabled.
client This channel is fed the contents of the client messages sent by
Xymon clients installed on the monitored servers.
clichg This channel is fed the contents of a host client messages,
whenever a status for that host goes red, yellow or purple.
Information about the data stream passed on these channels is in the Xymon
source-tree, see the "xymond/new-daemon.txt" file.
- SIGHUP
- Re-read the hosts.cfg configuration file.
- SIGUSR1
- Force an immediate dump of the checkpoint file.
Timeout of incoming connections are not strictly enforced. The check for a
timeout only triggers during the normal network handling loop, so a connection
that should timeout after N seconds may persist until some activity happens on
another (unrelated) connection.
If ghost-handling is enabled via the "--ghosts" option, the hosts.cfg
file is read to determine the names of all known hosts.
xymon(7),
xymonserver.cfg(5).