

     Re: [nocol-users] Need to know more

>  > From: Russell Van Tassell
>  > Is this a Solaris machine, by chance?  (ie. I've seen the same thing on
>  > my box(es) and have heard that others have seen the same, but no posted
>  > solutions... yet)  It's a problem somewhere in hostmon, I think (from
>  > memory, so don't shoot me)
> That's probably one of *many* problems with hostmon.
> For me, it's brittle like a potato chip under Sol, AIX
> and Linux.  I can't even do
>   while true; do
>     telnet <hostmon_host> hostmon
>     sleep 5
>   done
> without it crashing!
> In other words, imo, using hostmon as a primary method
> of notification about outages is unwise.
	I'm not saying that. I think that it's doing its job
DETECTING them, but not REPORTING them fully.  Nowhere is the return
code from eventlog ever checked!
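
	To illustrate the kind of check I mean, here is a rough sketch in
shell. It is NOT hostmon's code: report_event is a made-up wrapper name,
and the command passed to it stands in for whatever actually delivers the
event. The only point is that the exit status gets looked at and a failure
is made visible instead of being silently dropped.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of NOCOL): run a reporting command
# and surface a failure instead of ignoring its exit status.
report_event() {
    # "$@" is the command that actually delivers the event.
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Complain on stderr so the failure shows up somewhere.
        echo "event report failed (exit $status): $*" >&2
    fi
    # Pass the status back so the caller can retry or escalate.
    return "$status"
}
```

	A caller could then do something like
"report_event some_command || retry_or_page", where both of those
names are placeholders too.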
> I suspect using hostmon with rsh comm and ssh might be
> more stable.  Can anyone comment about this idea?
	NEVER a problem for us (BSDI/Solaris). I do make a mod
which makes it check for the daemon EVERY loop though, since it'd
go away for a while then come back.  But it's more the eventlog.

	This caused a MAJOR problem and an even more MAJOR embarrassment.