sminfo
asked on
Restart a daemon from inetd or rc.tcpip if it gets shut down.
Hi
I know inittab has respawn, but I'm not sure whether AIX's SRC is able to do this job. Here's an example:
Last week the nfs subsystem went down, so all the other servers with filesystems mounted from it froze.
Question:
How can I monitor the NFS subsystem on the server side in case it gets shut down, like monit does for Linux daemons? I'd like to monitor all daemons, not only NFS.
Thanks.
ASKER
wmp,
I see nfs is a group that can be started/stopped using startsrc/stopsrc -g nfs, but it includes some other daemons:
u: /home/u # lssrc -g nfs
Subsystem         Group            PID          Status
 biod             nfs                           inoperative
 nfsd             nfs                           inoperative
 rpc.mountd       nfs                           inoperative
 nfsrgyd          nfs                           inoperative
 gssd             nfs                           inoperative
 rpc.lockd        nfs                           inoperative
 rpc.statd        nfs                           inoperative
Currently, nfs is started from inittab:
rcnfs:23456789:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
Is it possible to set up a self-fix so that NFS is always running?
Are all of the above daemons necessary to get NFS working?
Can I start the nfsd daemon alone from inittab with respawn?
We had a big problem in production when the NFS daemon on the remote server went down: some scripts somewhere run an ls, and they froze. That's what I want to keep from happening again.
I've worked with monit before and it works perfectly, but I don't want to bring open source code and compilation onto our AIX servers.
I don't know if you understand my point.
Thanks
Yes, I understand (I believe).
For an NFS server to work you'll need everything in the "nfs" group except for "nfsrgyd" (unless you use NFS4) and "gssd" (unless you use GSS security).
You can't use "respawn" for /etc/rc.nfs, because this script starts everything in the background, then disconnects and terminates, so it would be respawned immediately, over and over, which makes no sense.
I never tried it, but I suspect that you can't use "startsrc" with respawn either.
That leaves the individual processes. I think it should be possible to have them all respawned by inittab.
This would (e.g. for nfsd) look like this:
nfsd:2:respawn:/usr/sbin/nfsd 3891
Check with "ps" which options/parameters your processes are using now, and use the same ones in inittab.
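A minimal sketch of how such an entry could be put together and registered. The helper name itab_respawn_entry is made up for this illustration; mkitab, lsitab and rmitab are the standard AIX commands for editing /etc/inittab, and they appear only in comments so that nothing changes until you run them yourself as root. Note that /etc/rc.nfs would presumably have to stop launching the same daemon, otherwise it gets started twice.

```shell
# Build an inittab respawn record: identifier:runlevel:respawn:command
# (hypothetical helper, not part of AIX)
itab_respawn_entry() {
    printf '%s:%s:respawn:%s\n' "$1" "$2" "$3"
}

# First check which arguments the running daemon uses:
#   ps -ef | grep '[n]fsd'
#
# Then, on AIX as root, using the entry from the post above:
#   mkitab "$(itab_respawn_entry nfsd 2 '/usr/sbin/nfsd 3891')"
#   lsitab nfsd     # show the record that was added
#   rmitab nfsd     # remove it again if it misbehaves
```

Going through mkitab rather than editing /etc/inittab by hand keeps the file consistent, and the same helper can be reused for the other daemons in the nfs group.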
But why not wsm/RSCT? If you don't like graphical interfaces (I mostly hate them) you could try "mkcondition", "mkresponse" and "mkcondresp" to define things and "startcondresp"/"stopcondresp" to start/stop monitoring. There are manpages for all of the above, and RSCT is present in AIX by default. Why not give it a try?
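A rough sketch of how these commands might fit together for nfsd. Everything here is an assumption to check against the manpages: the use of the IBM.Program resource class with its Processes.CurPidCount attribute, the selection string, and the condition and response names ("nfsd down", "restart nfsd"), which are invented for this example. The commands sit inside a function so nothing is defined until you call it on the AIX host as root.

```shell
# Sketch only: "nfsd down" and "restart nfsd" are hypothetical names.
define_nfsd_monitor() {
    # Condition: the event fires when no process named nfsd is running,
    # and rearms once at least one is running again.
    mkcondition -r IBM.Program \
        -e "Processes.CurPidCount == 0" \
        -E "Processes.CurPidCount > 0" \
        -s 'ProgramName == "nfsd"' \
        "nfsd down"

    # Response: one action that asks the SRC to start nfsd.
    mkresponse -n "restart nfsd via SRC" \
        -s "/usr/bin/startsrc -s nfsd" \
        "restart nfsd"

    # Link condition and response, then start monitoring.
    mkcondresp "nfsd down" "restart nfsd"
    startcondresp "nfsd down" "restart nfsd"
}
```

To stop monitoring again, stopcondresp "nfsd down" "restart nfsd" should do it.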
wmp
ASKER
I'm back..
I was looking for a checklist for AIX servers because we have some auditors coming in December...
Give me a few minutes to read this again and run some tests.
Thanks, wmp.
As long as it's not inetd itself that dies, there is no need to restart its subservers. inetd will start them as soon as a request arrives on the respective port.
The servers started by /etc/rc.tcpip are a different matter. You're right, the SRC can't monitor and/or restart any services.
Unfortunately, there is (almost, see below) no such feature built into AIX, except for respawn in inittab, but you know that already.
You will have to go with something like nagios, or you could use homemade scripts to monitor your daemons, or you could try the good old wsm.
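As an illustration of the homemade-script route, here is a minimal sketch, assuming the usual four-column lssrc output (Subsystem, Group, PID, Status) where the PID column is empty for stopped subsystems. The function names and the "srcwatch" log tag are made up for this example; lssrc, startsrc and logger are the stock AIX commands.

```shell
# Read `lssrc -g <group>` style output on stdin and print the names of
# all subsystems whose last column is "inoperative".
inoperative_subsystems() {
    awk 'NR > 1 && $NF == "inoperative" { print $1 }'
}

# Restart every inoperative subsystem in group $1, except the names
# listed (space-separated) in the optional second argument.
restart_inoperative() {
    group=$1
    skip=" ${2:-} "
    lssrc -g "$group" | inoperative_subsystems | while read -r name; do
        case "$skip" in
            *" $name "*) continue ;;   # deliberately left stopped
        esac
        logger -t srcwatch "restarting $name in group $group"
        startsrc -s "$name" </dev/null
    done
}
```

Called from cron every few minutes as, say, restart_inoperative nfs "nfsrgyd gssd", this would leave the two optional daemons alone and restart the rest. It only notices subsystems the SRC itself reports as inoperative, so a daemon that is hung but still running would slip through.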
wsm is in sysmgt.websm.rte, a fileset which you most probably have installed already. You need X11.rte on AIX and an X server on your workstation (Xming, Exceed or the like) to run it.
Log in to the AIX server concerned, export your DISPLAY, and issue "wsm".
Choose "Monitoring", then "Conditions" and "Responses". There are predefined ones, but you can create your own.
None of it is that straightforward and it's a bit old-fashioned, but have a look!
wmp