Solved

Restart a daemon from inetd or rc.tcpip if it gets shut down

Posted on 2010-11-22
Last Modified: 2013-11-17
Hi
I know inittab has respawn, but I'm not sure whether AIX's SRC can do this job. Let's see an example:

Last week the NFS subsystem went down, so all the other servers with filesystems mounted from it froze.
Question:

How can I monitor the NFS subsystem on the server side in case it gets shut down, like monit does for Linux daemons? Ideally I'd like to monitor all daemons, not only NFS.

Thanks.
Question by:sminfo
LVL 68

Expert Comment

by:woolmilkporc
ID: 34187096
Hi again,

As long as it's not inetd itself that dies, there is no need to restart its subservers. inetd will start them as soon as a request arrives at the respective port.

The servers started by /etc/rc.tcpip are quite another matter. You're right, the SRC can't monitor and/or restart any services.

Unfortunately, there is (almost, see below) no such feature built into AIX, except for respawn in inittab - but that's what you know already.

You will have to go with something like nagios, or you could use homemade scripts to monitor your daemons --- or you could try the good old wsm.
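
A homemade check could be sketched roughly as follows. This is only a minimal sketch, not a tested production script: the function names are made up for illustration, and it assumes that lssrc prints a header line followed by one line per subsystem with the status in the last column.

```shell
#!/bin/sh
# Sketch of a homemade SRC watchdog (helper names are hypothetical).

# Read `lssrc` output on stdin and print the name of every subsystem
# whose last column is "inoperative". The header line is skipped.
list_inoperative() {
    awk 'NR > 1 && $NF == "inoperative" { print $1 }'
}

# Restart every inoperative subsystem of the SRC group given as $1.
# Relies on the AIX commands lssrc and startsrc; meant to be run from cron.
restart_inoperative() {
    lssrc -g "$1" | list_inoperative | while read -r sub; do
        echo "restarting $sub"
        startsrc -s "$sub"
    done
}
```

Calling "restart_inoperative nfs" from a cron job would then try to start whatever is down in the nfs group. Note that it makes no distinction between subsystems you need and ones you deliberately leave stopped, so a real script would probably want an exclusion list.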

wsm is in sysmgt.websm.rte, a fileset which you have most probably already installed. You need X11.rte on AIX and an Xserver at your workstation (Xming, Exceed or the like) to run it.

Log in to the concerned AIX server, export your DISPLAY, and issue "wsm".

Choose "Monitoring" then "Conditions" and "Responses".  There are predefined ones, but you can create your own.

It's all not that straightforward and a bit old-fashioned, but have a look!

wmp










Author Comment

by:sminfo
ID: 34187341
wmp,

I see nfs is a group that can be started/stopped using startsrc/stopsrc -g nfs, but it includes some other daemons:

u: /home/u # lssrc -g nfs
Subsystem         Group            PID          Status
 biod             nfs                           inoperative
 nfsd             nfs                           inoperative
 rpc.mountd       nfs                           inoperative
 nfsrgyd          nfs                           inoperative
 gssd             nfs                           inoperative
 rpc.lockd        nfs                           inoperative
 rpc.statd        nfs                           inoperative

Now, NFS is started from inittab:
rcnfs:23456789:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons

Is it possible to set up a self-fix so that NFS is always running?
Are all of the above daemons necessary for NFS to work?
Can I start the nfsd daemon alone from inittab with respawn?

We have a big problem in production when the NFS daemon on the remote server goes down, because scripts somewhere run an ls on the mounted filesystems and freeze. That's what I want to keep from happening again.

I've worked with monit before and it works perfectly, but I don't want to bring open source code and compilation onto our AIX servers.

Don't know if you understand my point.

Thanks

LVL 68

Expert Comment

by:woolmilkporc
ID: 34187503
Yes, I understand (I believe).

For an NFS server to work you'll need everything in the "nfs" group except for "nfsrgyd" (unless you use NFS4) and "gssd" (unless you use GSS security).

You can't use "respawn" for /etc/rc.nfs, because this script starts everything in the background, then disconnects and terminates, so it would be respawned immediately, over and over, which is pointless.
 
I never tried it, but I suspect that you can't use "startsrc" with respawn either.

That leaves the individual processes. I think it should be possible to have them all respawned by inittab.
This would (e.g. for nfsd) look like this:
nfsd:2:respawn:/usr/sbin/nfsd 3891

Check with "ps" which options/parameters your processes are using now, and use the same ones in inittab.
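
To illustrate the step above, here is a small sketch that turns a command line, as printed by ps, into a respawn entry. The function name itab_line is hypothetical, and run level 2 is simply taken from the example entry; adjust both to your setup.

```shell
#!/bin/sh
# itab_line ID RUNLEVEL
# Read one command line (as printed by `ps -o args`) on stdin and print
# an inittab respawn entry for it. The function name is hypothetical.
itab_line() {
    id=$1
    level=$2
    read -r cmdline || return 1
    printf '%s:%s:respawn:%s\n' "$id" "$level" "$cmdline"
}

# Possible use on AIX (not executed here):
#   ps -e -o args | grep '^/usr/sbin/nfsd' | itab_line nfsd 2
# The resulting line could then be added with mkitab "<line>"
# and checked with lsitab nfsd.
```

Printing the entry first, instead of writing it straight into /etc/inittab, leaves room to review the options before init starts acting on them.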

But why not wsm/RSCT? If you don't like graphical interfaces (I mostly hate them) you could try "mkcondition",  "mkresponse" and "mkcondresp" to define things and "startcondresp"/"stopcondresp" to start/stop monitoring. There are manpages for all of the above, and RSCT is present in AIX by default. Why not give it a try?

wmp

LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 34188014
OK,

as you might have imagined, I tested startsrc with respawn ... and it works!

The only drawback is that you should not use "startsrc -g nfs", because inittab will always try to start the two daemons for NFS4 and GSS, which terminate immediately if NFS4 or GSS is not in use, which in turn fills errpt with their error messages.

So either use "startsrc -s ..." with respawn for every single startable subsystem or take the two daemons out of the nfs group if you don't need them.
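
The per-subsystem variant might be generated like this. It is a sketch under assumptions: the exact inittab entries that were tested are not shown in this thread, so the entry layout, run level 2, the /usr/bin/startsrc path and the function name are all illustrative guesses.

```shell
#!/bin/sh
# nfs_respawn_entries SUBSYSTEM...
# Print one inittab respawn entry per subsystem, skipping nfsrgyd and
# gssd (only needed for NFS4 and GSS security respectively).
# Entry layout and function name are assumptions, not tested entries.
nfs_respawn_entries() {
    for sub in "$@"; do
        case $sub in
            nfsrgyd|gssd) continue ;;
        esac
        # Drop dots so that e.g. rpc.mountd becomes the identifier rpcmountd.
        id=$(printf '%s' "$sub" | tr -d '.')
        printf '%s:2:respawn:/usr/bin/startsrc -s %s\n' "$id" "$sub"
    done
}
```

For example, "nfs_respawn_entries biod nfsd rpc.mountd nfsrgyd gssd rpc.lockd rpc.statd" prints entries for the five subsystems an NFS3 server without GSS needs. Review them before adding any to inittab.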

Nonetheless: RSCT is useful too, I keep telling you.

wmp

LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 34188161
It's me again.

The above inittab thing will help against someone terminating nfs with "stopsrc".
To protect nfsd against someone who would "kill" nfsd you could tell SRC to restart it. I fear this has not been quite obvious in my first post - I wasn't talking about "kill" there, only about "stopsrc".

To make SRC restart a killed nfsd simply issue

chssys -s nfsd -R
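
If several subsystems should get the same treatment, the command can be wrapped in a small loop. The sketch below only prints the chssys calls by default so they can be reviewed first; the function name and the DRY_RUN variable are made up for this example.

```shell
#!/bin/sh
# set_src_restart SUBSYSTEM...
# Print (default) or run "chssys -s <subsystem> -R" for each argument.
# Set DRY_RUN=0 to actually run chssys (AIX only).
# Function name and DRY_RUN are hypothetical helpers.
set_src_restart() {
    for sub in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "chssys -s $sub -R"
        else
            chssys -s "$sub" -R
        fi
    done
}
```

Running "set_src_restart nfsd biod" shows what would be changed; "DRY_RUN=0 set_src_restart nfsd biod" applies it.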

wmp


Author Comment

by:sminfo
ID: 34189906
I'm back..
I was looking for a checklist for AIX servers because we have some auditors coming in December...

Give me a few minutes to read this again and run some tests..

thanks wmp..