How to do a health check for AIX servers? What information needs to be collected?

Posted on 2009-06-28
Last Modified: 2013-11-17
We have 400 hundred servers in our implementation project and need to do a health check on them. What information do I need to check and collect for AIX servers?
Are there any standard tools available for this? How should I prepare a report on it?
Question by:rammaghenthar
LVL 68

Accepted Solution

woolmilkporc earned 500 total points
ID: 24732467
Hi again,

there is no standard health check script in AIX.

In one of our other questions we talked about 'cfg2html', which is a fine tool for getting
an overview of the vital data of your machines.

Furthermore, health checking cannot be done as a one-off 'snapshot'; it should be a continuous process
using a monitoring tool such as Nagios.

Anyway, to check the most important things you could regularly run a little script
against all machines contained in a server list, covering:

prtconf -> overview
errpt -> hardware error log
df -> filesystems
diag -cs -> hardware diagnostics
lppchk -v -> software packages' consistency

It could look like this (see attachment):

Note that you need ssh access with public key authentication so that you are not prompted for passwords.
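
Before the first real run it may be worth checking which hosts actually accept the key. A minimal sketch, assuming an OpenSSH client on the machine you run the checks from; the function name and the 10 second timeout are my own choices, not anything AIX provides:

```shell
# check_ssh_access LISTFILE
# Prints every host from LISTFILE (one hostname per line) that cannot be
# reached without a password prompt. Returns non-zero if any host failed.
check_ssh_access() {
    failed=0
    while read -r host; do
        [ -n "$host" ] || continue
        # BatchMode=yes makes ssh fail instead of prompting;
        # -n keeps ssh from swallowing the rest of the server list on stdin
        if ! ssh -n -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
            echo "NO KEY ACCESS: $host"
            failed=1
        fi
    done < "$1"
    return "$failed"
}
```

Called as `check_ssh_access $serverlist`, it lists only the hosts that still need their keys distributed.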

And since you're talking about 400 servers, it is practically impossible to read all the output from
any check script, so I'd really suggest using a monitoring tool (see Nagios above)!
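
Short of a real monitoring tool, you can at least filter the output so that only problems show up. A rough sketch for the filesystem part, assuming the usual AIX 'df -g' layout with %Used in the fourth column and the mount point in the seventh; the function name and the threshold argument are made up for illustration:

```shell
# full_filesystems LIMIT
# Reads "df -g" output on stdin and prints "mountpoint %Used" for every
# filesystem whose usage is at or above LIMIT percent.
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {
            used = $4
            sub(/%$/, "", used)
            # pseudo filesystems such as /proc show "-" and are skipped
            if (used ~ /^[0-9]+$/ && used + 0 >= limit + 0)
                print $7, $4
        }'
}
```

For example, `ssh $host /usr/bin/df -g | full_filesystems 90` prints nothing for a host with enough free space. Mount points containing blanks would confuse the column count, so treat it as a starting point only.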




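# $serverlist must name a file containing one hostname per line
for host in $(cat $serverlist)
do
   # run all checks in one ssh session; one dated output file per host
   /usr/bin/ssh $host '
   echo RUNNING prtconf
   /usr/sbin/prtconf
   echo RUNNING errpt
   /usr/bin/errpt
   echo RUNNING df -g
   /usr/bin/df -g
   echo RUNNING diag -cs
   /usr/sbin/diag -cs
   echo RUNNING lppchk -v
   /usr/bin/lppchk -v ' > [/path/to/]$host.$(date +"%Y.%m.%d").custom.check
done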
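
Once the files are collected, a small helper can tell you which hosts have anything in the error log at all. This is only a sketch: it relies on the 'RUNNING ...' marker lines written by the script above and on errpt starting its listing with an IDENTIFIER column header; the function name is hypothetical:

```shell
# errpt_entries FILE
# Prints the number of error log entries found in the "RUNNING errpt"
# section of one collected check file.
errpt_entries() {
    awk '
        /^RUNNING / { in_errpt = ($2 == "errpt"); next }
        # count non-empty lines in the errpt section, skipping the column header
        in_errpt && NF > 0 && $1 != "IDENTIFIER" { n++ }
        END { print n + 0 }
    ' "$1"
}
```

Looping over all *.custom.check files and printing only those where the count is greater than zero gives a short list of machines worth a closer look.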




LVL 61

Expert Comment

ID: 24735041
400000 AIX servers? Are you IBM?
diag has an automated diagnostics facility whose configuration is stored in the ODM.
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 24749166
It depends on what you understand by health check. If you're after monitoring CPU load, disk capacity etc., you need a systematic approach, that is, central periodic monitoring and alerting. This can be done with monitoring tools such as IBM's Tivoli or Nagios.
LVL 61

Expert Comment

ID: 24763543
diag contains a part for scheduled RAM/CPU/DISK/RAID/sysplanar0 diagnostics.
It serves practical and formal policy purposes quite well.


Question has a verified solution.
