Re: [nocol-users] MySQL monitor?

To: "Nathan Clemons [Staff]" <nathan at ici net>
Subject: Re: [nocol-users] MySQL monitor?
From: Vikas Aggarwal <vikas at navya com>
Date: Tue, 18 Jan 2000 17:06:41 -0500
References: <Pine.GSO.4.10.10001181317380.14405-100000@phoenix>

"Nathan Clemons [Staff]" wrote:
> 
> I'd love to write a good Perl DBI monitor, where you could specify
> username, password, port, and DBD type in the config file and have a SQL
> statement to use to test it.
> 
> If anyone feels up to writing a FAQ on how to write a Perl based monitor,
> I'll be happy to contribute it when complete.

Nathan,

A 'sample' perl monitor is located in perlnocol/SAMPLE-perl-monitor.
Ideally (and also in the case of the Perl DBI monitor), it only needs 2
functions:

   sub readconf() = reads the config file and builds the 'item' list
   sub dotest()   = tests one host and calls &calc_status()

   &nocol_main() will then call these routines automatically.
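
As a rough illustration of that two-function shape, here is a minimal
sketch. It is written in Python rather than Perl, and it does not use the
nocol library: monitor_main() is only a hypothetical stand-in for what
&nocol_main() is described as doing, the 'host port' config format is an
assumption, and the probe callable is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One monitored entry (hypothetical fields, not the nocol event layout)."""
    host: str
    port: int
    status: str = "unknown"


def read_conf(text):
    """Parse 'host port' lines (assumed format) into a list of Items."""
    items = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host, port = line.split()
        items.append(Item(host, int(port)))
    return items


def do_test(item, probe):
    """Test a single item; probe(item) returns True when the service is up."""
    item.status = "up" if probe(item) else "down"
    return item.status


def monitor_main(conf_text, probe):
    """Stand-in driver: read the config once, then test every item once."""
    items = read_conf(conf_text)
    for item in items:
        do_test(item, probe)
    return items
```

For a DBI-style monitor, the probe would be the part that connects and
runs the test SQL statement; everything else stays the same.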

Yes, these need to be documented better.

Regarding the issue with handling a "HUP" signal, the problem lies in
the fact that it is difficult to determine the changes in the config
files. Consider the case of 'portmon'... I might edit an existing file
and change just the IP address in an entry, or just the 'return-string'.
On getting a HUP, the monitor would have to go through each of these
parameters, decide what has changed, and then delete the affected 'item'.
That logic is specific to each monitor, hence it cannot be made into a
library function.
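
To make the difficulty concrete, here is a small sketch (in Python, with
hypothetical field names) of what such a comparison would look like. The
key function is the monitor-specific part: which fields identify an
'item' differs from one monitor to the next, and the portmon-like key
below is only an assumption for the example.

```python
def diff_items(old, new, key):
    """Compare two item lists using a monitor-specific identity key."""
    old_by_key = {key(item): item for item in old}
    new_by_key = {key(item): item for item in new}
    removed = [old_by_key[k] for k in old_by_key if k not in new_by_key]
    added = [new_by_key[k] for k in new_by_key if k not in old_by_key]
    kept = [k for k in old_by_key if k in new_by_key]
    return added, removed, kept


def portmon_key(item):
    """Hypothetical identity for a portmon-like entry."""
    return (item["addr"], item["port"], item["return_string"])
```

With this key, editing only the IP address of an entry shows up as one
removed item plus one added item, so the old state for that entry has to
be thrown away anyway.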

One simple way to do this is for a monitor, upon getting a HUP, to:

	- erase the old file (effectively as good as restarting)
	- but on the first pass, don't reinit everything to 'unknown';
	  instead escalate each event directly to the 'highest'
	  severity.

The only downside: if a site went down just before the HUP, the monitors
would NOT step through the severities (info -> warning -> error ->
critical) but would escalate the site directly to 'critical'. This is
the easiest approach to the problem and has the least impact.
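
A minimal sketch of that behaviour, in Python and independent of the
nocol library. The class and method names are hypothetical, and treating
an 'up' item as severity 'info' is an assumption taken from the severity
order above. State is rebuilt from scratch on HUP, and the first pass
after it assigns a failing item 'critical' in one step.

```python
SEVERITIES = ["info", "warning", "error", "critical"]


class MonitorState:
    """Tracks one severity per item (hypothetical, not the nocol data file)."""

    def __init__(self, names):
        self.severity = {name: "unknown" for name in names}
        self.first_pass_after_hup = False

    def on_hup(self, names):
        # Throw away all old state, which is as good as restarting.
        # None is only a placeholder until the first pass fills it in.
        self.severity = dict.fromkeys(names)
        self.first_pass_after_hup = True

    def run_pass(self, is_up):
        for name, current in list(self.severity.items()):
            if is_up(name):
                self.severity[name] = "info"
            elif self.first_pass_after_hup:
                # Skip the stepping and go straight to the highest severity.
                self.severity[name] = "critical"
            else:
                # Normal operation: step one level per failed pass.
                idx = SEVERITIES.index(current) if current in SEVERITIES else 0
                self.severity[name] = SEVERITIES[min(idx + 1, len(SEVERITIES) - 1)]
        self.first_pass_after_hup = False
```

A real monitor would call something like on_hup() from its HUP signal
handler after re-reading the config file.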

On another note... Once we have all the events in a database (courtesy
jonz@netrail.net), we should be able to assign 'nodenames' to each
monitor and refer to each event using 'nodename.event'. A meta database
could collect data from all these different databases, and correlation
between nodeA.eventX and nodeB.eventX can be done (this idea is from the
Velocet folks). Of course, the tool to do any kind of analysis is also
TBD.
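
Purely as an illustration of the 'nodename.event' naming, here is one
way such a lookup could work. This is a Python sketch, not an existing
tool: the function name, the (name, severity) input shape, and the
assumption that node names contain no dots are all invented for the
example.

```python
from collections import defaultdict


def correlate(events):
    """Group 'nodename.event' entries by event name across nodes.

    events is an iterable of (qualified_name, severity) pairs. Only event
    names reported by more than one node are returned.
    """
    by_event = defaultdict(dict)
    for qualified, severity in events:
        # Split at the first dot: assumes the node name itself has no dots.
        node, _, event = qualified.partition(".")
        by_event[event][node] = severity
    return {event: nodes for event, nodes in by_event.items() if len(nodes) > 1}
```

Given entries for nodeA.eventX and nodeB.eventX, the result has a single
'eventX' key mapping each node to the severity it reported.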

	-vikas
