HP-UX Superdome: 100% CPU and memory utilization

Posted on 2004-11-17
Last Modified: 2013-12-06

I am running an HP-UX 11.x Superdome (4 CPUs, 16 GB memory) in cluster mode (two Superdomes in a cluster: one running the app server and one running the DB server). 4 GB of RAM is allocated to Oracle. I am running a banking application on an Oracle 9i database, and I also use Apache and Resin.

The number of logged-in users is about 150 at peak. Memory and CPU utilization hit the roof and hover between 85% and 100%. How should I go about diagnosing and resolving this? For this type of load, it looks absurd.

Resin is also a bit confusing: it appears to be slow to start up. Are there standard guidelines for installing and configuring Resin on HP-UX 11.x?


Question by:suryapadma
    LVL 20

    Expert Comment

    To start from basics:
    For CPU utilisation, run `top` to see the heaviest CPU users. Trace the "CPU hog" processes back to their parents, and decide whether the CPU usage is "reasonable".

    Also look at the load averages `top` shows you. If possible, post the `top` output.

    To check process memory usage, do:
    UNIX95= ps -eo uid,pid,ppid,pcpu,state,sz,vsz,time,comm |sort -rnk7 |more

    Of course, the App and DB systems will have very different profiles and will need tuning differently.

    Start `sar` logging so you can build up a picture of resource usage over a few days.
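
As a rough sketch of how the `ps` output above could be summarised, the helper below totals the vsz column per command name so that many small processes of the same program (for example Oracle shadow processes) show up as one figure. The function name `mem_by_command` is made up for illustration, and it assumes two-column `vsz comm` input with vsz reported in kilobytes.

```shell
# Sum the vsz column per command name.
# Reads "vsz comm" lines on stdin and prints "total comm", largest total first.
# Non-numeric lines (such as the ps header) are ignored.
mem_by_command() {
    awk '$1 ~ /^[0-9]+$/ { total[$2] += $1 }
         END { for (c in total) print total[c], c }' | sort -rn
}

# Example use on a live system (HP-UX needs UNIX95 set for the -o option):
#   UNIX95= ps -eo vsz,comm | mem_by_command | head
```

This only adds up virtual sizes, so shared memory segments attached by several processes are counted more than once; treat the totals as a ranking aid, not an exact figure.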
    LVL 38

    Expert Comment


    Follow tfewster's advice and use top or ps to find out which processes are eating up all your
    system resources.

    It is possible for a box to use up all its CPU and RAM; it depends on what it is doing.
    Some of my boxes run neural network simulations, and in most
    cases they use 99.99% of CPU time and 100% of RAM.

    Find out the cause of the problem first and then decide what to do: a hardware upgrade,
    splitting the workload across more boxes, or, if it is a software bug, applying patches.

    LVL 61

    Expert Comment

    Try `vmstat 1 10`
    to identify whether the bottleneck is I/O wait or raw CPU power.
    Depending on the result, there are many ways to improve things before buying expensive hardware.

    Then try `sync ; sync ; vmstat 1 10`.
    If this improves the wait time, your system is simply swapping; teach your apps to use less memory.

    etc. etc.
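
To make the `vmstat` comparison less of an eyeball exercise, a small filter along these lines could average the interesting columns. This is only a sketch: `vmstat_summary` is a hypothetical name, and it assumes an HP-UX style header line that starts with `r` and contains `po` (page outs) and `id` (idle CPU) columns. The first sample is skipped because it reports averages since boot.

```shell
# Summarize vmstat samples read on stdin: average run queue, page-outs, idle CPU.
# Columns are located by header name (r, po, id) rather than fixed position.
vmstat_summary() {
    awk '
        # Header line: remember the position of each column name.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data line: numeric first field, seen after the header.
        ("id" in col) && $1 ~ /^[0-9]+$/ {
            n++
            if (n == 1) next      # first sample is the average since boot
            r += $(col["r"]); po += $(col["po"]); id += $(col["id"])
        }
        END {
            if (n < 2) { print "no samples"; exit 1 }
            s = n - 1
            printf "samples=%d avg_runq=%.1f avg_pageout=%.1f avg_idle=%.1f\n", s, r/s, po/s, id/s
        }'
}

# Example use: vmstat 1 10 | vmstat_summary
```

As a reading aid: sustained non-zero page-outs point towards memory pressure, while a run queue well above the CPU count with near-zero idle points towards a CPU shortage.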
    LVL 20

    Expert Comment

    Oops, just noticed this response from Surya that was posted as feedback instead of a comment:
    (is anyone familiar with configuring/tuning Resin?)

          Author: suryapadma
          Date: 11/17/2004 02:48PM GMT
          I have collected the stats using Glance and also OV, and I have them as a file as well. I need some help drilling further into the problem and getting it sorted out. I also have Resin 2.0.5 running, and it is hogging memory and maybe CPU too. Resin takes a long time to come up, and it fails in a cluster switchover. Do you have any input on installing and tuning Resin on HP-UX? I can send you the log files if you give me an address where I can send them.


    LVL 1

    Accepted Solution

    Hi  suryapadma,

    From reading your posts I see 4 questions:
    1. High CPU utilization in general
    2. High memory utilization in general
    3. The Resin application is using a lot of CPU
    4. Cluster switchover fails with Resin

    Performance issues are not easy and will not be solved simply by using one or two tools.
    First rule: if no one is complaining about the performance, don't try to do or change too much.
    (You complained about Resin, which I assume is running on the second Superdome, but you didn't complain about the performance of the Oracle database server.)

    High CPU utilization is not necessarily a performance issue. Put simply, if a process gets all the resources it demands (enough memory, disk I/O, etc.), it will use up to 100% CPU to run its calculations and finish as soon as possible. If a process starts up, runs at 100% CPU for a few seconds or minutes, and then disappears, that sounds OK to me. But if a single process keeps running at 100% CPU for a long time, it can be caused by a software failure or some other reason.
    High memory utilization can be OK too. Check whether you have enough (or too much) swap, and check swap utilization, page-out frequency, I/O on the swap devices, etc.
    As long as you have enough swap, NO PAGE OUT!!!, and especially when no users are complaining, you don't have to worry! :)

    Third: collect all the relevant information!
    Some ideas:
    Use Glance to check global CPU and memory, but also disk I/O and swap utilization. Check the size of global shared memory, the number of open files (nfile), page outs, system calls, and global wait stats / threads. Check single-process CPU/memory/I/O and activity per CPU (maybe one CPU is broken?!), and check activity in the file systems (LVM), PVs, NFS, and network utilization. All of this can be done and viewed within Glance!
    Using Glance, you can determine whether the CPU time is spent in system mode (kernel) or user mode (outside the kernel).

    If this performance issue is temporary and only happens within a certain time frame, use PerfView (MeasureWare) to collect data and plot it in a graph to find a trend!

    Check syslog for any unusual warnings.
    Check EMS for any hardware warnings.
    Use netstat to check for any hanging open connections.
    Check the kernel setup (kmtune). Check the values of maxusers, maxfiles, maxdsiz, maxssiz, maxtsiz, max_thread_proc, nfile, nproc, nkthread, etc.
    Use lsof (list open files) to track down what a process (the one constantly claiming 100% CPU / memory) is actually doing.
    It's not standard HP software, but you can find one from
    Check HP's latest quality pack. I would advise installing the HW and quality packs from the previous half year. (Don't install the newest one unless you have no choice, for example to solve some specific problem. A quality pack that is at least 6 months old provides enough stability and is almost 100% safe to install with no headaches.)
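
For the netstat check in the list above, one possible sketch is to count TCP connections per state, so that a pile-up in states such as CLOSE_WAIT or FIN_WAIT_2 stands out. The function name `tcp_state_counts` is invented here, and it assumes `netstat -an` style lines where the first field starts with `tcp` and the state is the last field.

```shell
# Count TCP connections per state from `netstat -an` style input on stdin.
# Lines that do not start with "tcp" (headers, UDP, UNIX sockets) are ignored.
tcp_state_counts() {
    awk '$1 ~ /^tcp/ && NF >= 6 { count[$NF]++ }
         END { for (s in count) print count[s], s }' | sort -rn
}

# Example use: netstat -an | tcp_state_counts
```

A large and growing CLOSE_WAIT count usually means the local application is not closing sockets, which would be worth comparing against the Apache and Resin processes.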

    So far, these checks are only useful if the performance issue is related to the HP-UX OS.
    If you have an issue with Oracle, grab a DBA and ask him to troubleshoot together with you, then try the following:
    Check the Oracle setup.
    Does the HP-UX kernel configuration conform to the requirements of your specific Oracle version? Check MetaLink for it.
    Check the global logs for Oracle.
    Check the listener logs.
    Check other things, such as data fragmentation in the datafiles, buffer cache hit rates, the number of concurrent users, etc.

    Do the same for other applications like Resin. Read the install doc for Resin, for example  (good one)
    To begin with, check the base configuration.

    Anyway, there is too much to mention.

    About your specific problems:
    1 and 2. First collect all the needed information as described above to determine whether it's a global issue or a single process / application, and whether it is hardware or only software related. Is it a constant problem, or does it only happen between 9:00 and 17:00? Gather more information and post it here, and we will help you.

    3. Resin is slow / using a lot of CPU.
    Give us the result of checking the configuration, and show us more of the configuration of your HP-UX OS, such as kernel settings.
    Do an lsof on the busy process. Maybe it's related to Apache. Start the application with no user load and see if the problem still exists. (Try to start it without MC/ServiceGuard.)

    4. Cluster switchover failed. You have to show us the entire configuration of MC/ServiceGuard.
    MC/ServiceGuard is very powerful but needs to be configured carefully. One single small mistake and it will fail to start.
    Collect the error log from ServiceGuard along with the configuration of the nodes / packages and post it here, please!

    Just go and try things out a bit, post your questions in detail, and we will see if we can help more.
    If you would like to learn more about HP-UX / Oracle setups, I have a few cookbooks ready for you.
    Post your mail address and I can mail them to you, but they are big, a couple of MB each :)) FAT cookbooks.

    Hope this can guide you to the right direction !!

    Good luck

    LVL 20

    Expert Comment

    Points to sirjb for good generic info on investigating performance problems. Without detailed feedback it would be impossible to diagnose the real problem, but sirjb's suggestions will go a long way towards finding it.

    Incidentally, I'd be interested in seeing JB's "cookbooks"; I have a few links to HP-UX performance troubleshooting docs (some of which require an account on HP's ITRC and, naturally, there is a great deal of overlap...)
    HP-UX Performance Cookbook:
    Basics of Poor System Performance:
    Determining the cause of system performance problems:
    System Performance Tuning Guide:
    LVL 1

    Expert Comment


    The link to the HP-UX Performance Cookbook posted by tfewster is good, BUT that document is outdated.
    The original and latest version is:  by Stephen Ciullo and Doug Grumann, 27th May 2003
    This is the latest one!

    To tfewster:
    I have some e-books on specific subjects like MC/SG and LVM in combination with SAN storage such as XPs, VAs, etc.
    What are you looking for, or in what area?
    PS: thanks for the comment


    LVL 61

    Expert Comment

    The most likely problem is with the JVMs; one has to renice and tune them if they interfere with normal processes (thus accounting etc.)
    LVL 20

    Expert Comment

    JB - Thanks for the link to the updated Performance Tuning pdf - I searched the ITRC for "cookbook" and found an MC/Serviceguard one as well, but if you have anything specific on tuning HP-UX for Oracle I'd be interested. My email address is in my profile (but posting a link would be better, so everyone can see it ;-)

