NOCOL Operations Guide

Version 4.3
Last Updated: Mar 22, 2000

 
Contents

Running NOCOL

  • File locations
  • Selecting the monitors
  • Configuration files
  • noclogd
  • Routine Maintenance

User Interfaces

  • netconsole
  • webNocol
  • tkNocol

Notifications & Reports

  • SMS Paging
  • Email
  • Reports

You must read the Installation document prior to reading this Operations guide.

 
  Running NOCOL
 

File Locations

The main directory where nocol gets installed is specified at compile time (default is set to /usr/local/nocol). Under this directory, the following sub-directories exist:

bin/ All monitors and utility scripts.
data/ The raw data collected by the monitors.
etc/ All configuration files, and the SNMP MIB file.
msgs/ All files in this directory are displayed in the 'netconsole' msgs subwindow.
run/ The PID files for all the monitors (used to ensure only one copy of a monitor runs at a time), and the error files (run/xxxx.error) written when monitors are started from keepalive_monitors.

Running the Monitors

Nocol has a large number of independent monitors. All desired monitors should be listed in the PROGRAMS variable of the keepalive_monitors script. This script is run periodically from crontab and ensures that every listed monitor is running (the crontab.nocol file is installed into cron during the installation steps).
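The check that keepalive_monitors performs can be pictured with a small Python sketch. This is not the shipped script: it assumes one PID file per monitor named run/<monitor>.pid (the naming scheme is an assumption), and the monitor names used with it are hypothetical. It only reports which monitors would need to be restarted.

```python
import os
from pathlib import Path


def pid_is_alive(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def monitors_to_restart(programs, run_dir):
    """List monitors whose PID file is missing, unreadable, or stale."""
    stale = []
    for name in programs:
        pid_file = Path(run_dir) / f"{name}.pid"  # assumed naming scheme
        try:
            pid = int(pid_file.read_text().split()[0])
        except (OSError, ValueError, IndexError):
            stale.append(name)
            continue
        if not pid_is_alive(pid):
            stale.append(name)
    return stale
```

Called as monitors_to_restart(PROGRAMS, "/usr/local/nocol/run"), it returns the subset of the listed monitors that are not currently running.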

Generally the monitors do not need any command line arguments, because the name and location of the configuration file and the data directory are compiled into the monitors. However, you can always specify an alternate config file or output data file using the '-c' or '-o' command line options respectively. All monitors also accept the '-d' flag to indicate debug mode, in which case they write verbose error messages to stderr. If a monitor is started from keepalive_monitors, these error messages are stored in the run/xxxx.error file.
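As an illustration of these options, the following Python sketch builds a monitor command line and starts it with stderr captured in an error file. The '-c', '-o' and '-d' flags are the ones described above; the helper names, the example paths, and the choice of naming the error file after the monitor binary are assumptions made for the example.

```python
import subprocess
from pathlib import Path


def monitor_argv(binary, config=None, output=None, debug=False):
    """Build the argument list for one monitor."""
    argv = [str(binary)]
    if config is not None:
        argv += ["-c", str(config)]  # alternate configuration file
    if output is not None:
        argv += ["-o", str(output)]  # alternate output data file
    if debug:
        argv.append("-d")            # debug mode: verbose messages on stderr
    return argv


def start_with_error_file(argv, run_dir):
    """Start a monitor with stderr appended to <run_dir>/<name>.error."""
    # Assumption: the error file is named after the monitor binary.
    error_path = Path(run_dir) / (Path(argv[0]).name + ".error")
    with open(error_path, "ab") as error_file:
        return subprocess.Popen(argv, stderr=error_file)
```

For example, monitor_argv("/usr/local/nocol/bin/example_monitor", config="/tmp/test.conf", debug=True) yields a command that runs a (hypothetical) monitor against a test configuration in debug mode.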

Configuration Files

The configuration file for each monitor is located in the etc/ directory. Each of these files should be edited for your site. Note that in most monitors, the 'name' of the device is not used by the monitor itself; it is simply an operator-friendly label for the device.

Currently, sending a HUP signal to a monitor does NOT make it re-read its configuration file while preserving the existing state of the variables being monitored.

noclogd - the Logging Daemon

The noclogd daemon listens on port 5354 of the logging host for any events sent by the monitors. The name of the host where noclogd runs is compiled into all the monitors and is not configurable in their config files at this time.

The noclogd process is similar to the Unix 'syslog' daemon, and its configuration file allows piping the logged events to any external process. To prevent arbitrary hosts from sending it messages, the IP addresses that are allowed to log to it are listed in the noclogd configuration file.
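The allow-list idea can be sketched in a few lines of Python. This is not noclogd's code and does not parse its configuration file; the function name is hypothetical and the allowed addresses are simply passed in as a list.

```python
import ipaddress


def is_allowed_sender(sender, allowed):
    """Return True if the sender's IP address appears in the allow-list."""
    sender_ip = ipaddress.ip_address(sender)
    # Comparing parsed addresses avoids mismatches caused by formatting.
    return any(sender_ip == ipaddress.ip_address(entry) for entry in allowed)
```

A logging daemon built this way would drop any event whose source address fails the check before doing any further processing.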

Since noclogd can run external programs, it is used to run the pager notification scripts. In the same way, it can be used to log messages to a database, send emails, and so on.

Note that an 'event' in nocol is generated only when a value crosses a threshold during a polling interval. Hence, you will normally not see any logging activity in noclogd, but when a device variable changes its state, an event is logged. This means that a monitor sends an event to noclogd both when a device goes down (e.g. from info level to warning level) and when it comes back up (e.g. from warning level to info level).
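The "log only on a state change" behaviour can be illustrated with a short Python sketch. It is a simplification, not the monitors' actual logic: it assumes a single warning threshold, only the two levels named above, and a variable that starts at info level. The function names are hypothetical.

```python
def classify(value, warning_threshold):
    """Map a polled value to a severity level (two levels only, for brevity)."""
    return "warning" if value >= warning_threshold else "info"


def events_from_polls(values, warning_threshold):
    """Return (poll_number, old_level, new_level) for each level change."""
    events = []
    previous = "info"  # assumption: the variable starts out healthy
    for poll_number, value in enumerate(values):
        level = classify(value, warning_threshold)
        if level != previous:
            # Only a threshold crossing produces an event.
            events.append((poll_number, previous, level))
            previous = level
    return events
```

With polled values 10, 20, 95, 97, 30, 40 and a warning threshold of 90, only two events result: one when the value first reaches 95 and one when it drops back to 30. The polls in between generate nothing.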

Routine Maintenance

Routine admin tasks in nocol consist of ensuring that all the monitors are running (done by running keepalive_monitors from cron) and rotating all the log files maintained by noclogd (done by running log-maint periodically from crontab). See the sample nocol.crontab for how to schedule these tasks.

 
  User Interfaces
 

Netconsole

There are three different user interfaces for viewing the nocol data. The simplest of them is netconsole, a non-graphical, curses-based tool that displays the raw data being collected by the monitors. Any user on the system where the monitors are running can run this tool.

WebNocol

The Web interface for displaying nocol data consists of two scripts. The first, genweb.pl, runs periodically from crontab and generates four web pages (one for each severity level). The second, webnocol.cgi, is a CGI script that gives the user added functionality such as troubleshooting, adding notes for an event, and hiding a known event. This script has its own built-in, per-user access control; alternatively, the usual .htaccess method can easily be used.

tkNocol

This is a Tcl/Tk based monitor using a client-server design. A simple daemon (called 'ndaemon') runs on the nocol machine, listens on TCP port 5005, and periodically sends the nocol raw data to all connected clients. The clients then parse, format, and display this raw data. ndaemon has no access control at this time, so it is important to use a firewall to block unauthorized access to ndaemon's TCP port.

Note that none of these interfaces displays historical data from 'noclogd'. They all work directly on the data being collected by the monitors, which represents the current state of the network.

 
  Notifications & Reports
 

A very flexible notification script called 'notifier.pl' is provided with nocol. Its configuration file describes each type of event and the required action. Currently the possible actions are mail and page. A minimum and maximum age of the event (in seconds) can be defined, so that the action (paging or email) is taken only if the age of the event lies between these two values. An option allows 'repeat' notification (once every hour) until the maximum age is exceeded.
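The age window and the hourly repeat can be expressed as a small decision function. The Python sketch below is not taken from notifier.pl; the function and parameter names are hypothetical, and treating the bounds as inclusive is an assumption.

```python
def should_notify(age, min_age, max_age, last_sent_age=None,
                  repeat=False, repeat_interval=3600):
    """Decide whether to act on an event of the given age (all in seconds).

    last_sent_age is the event's age when the last notification went out,
    or None if no notification has been sent yet.
    """
    if not (min_age <= age <= max_age):  # assumption: inclusive bounds
        return False
    if last_sent_age is None:
        return True                      # first notification in the window
    # Later notifications only in repeat mode, once per interval.
    return repeat and (age - last_sent_age) >= repeat_interval
```

With min_age=300 and max_age=7200, an event that is 100 seconds old is ignored, one that is 600 seconds old triggers the action, and in repeat mode the action fires again once an hour has passed since the previous notification, as long as the event is still no older than 7200 seconds.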

A more event-driven notification system can be written using noclogd. Any event can be piped to an external script by noclogd, so a page or email can be sent as soon as an event occurs and is logged to noclogd. As an example, look at the 'utility/beep_oncall' script.
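The general shape of such an external script is sketched below in Python. It assumes that events arrive on standard input one per line and makes no assumption about the fields inside each line; the function name and the action callback are hypothetical, and this is not the beep_oncall script.

```python
def handle_events(stream, action):
    """Call action(event) for every non-empty line read from stream.

    Returns the number of events handled.
    """
    handled = 0
    for line in stream:
        event = line.rstrip("\n")
        if not event:
            continue  # skip blank lines
        action(event)
        handled += 1
    return handled
```

A real script would call handle_events(sys.stdin, send_page), where send_page is whatever site-specific function sends the page or email.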

Currently the only reporting tool for historical analysis is 'logstats', which parses the historical noclogd event logs and generates a simple summary report. It is run by the 'log-maint' script, which in turn is run periodically from crontab.


Vikas Aggarwal