Automatically starting shell processes on HP-UX

Hi Team,

I have a weird problem: on my HP-UX machines, shell processes start automatically and eat up all the CPU. Usage goes to 100% and everything becomes slow because of it.
I have no clue why this is happening, and I don't even know what information I should give here to help you experts help me out.
When we run ps -eaf | grep sh, I get quite a lot of processes owned by root, something like:
             root 10381  9923  0 15:14:09 pts/8     0:00 -sh
             root 10155 10152  0 14:09:23 pts/7     0:00 -sh
I mean I get at least 10-20 such processes.
Can anyone tell me what this is? Any clues will be really helpful. Please let me know if you need any other info.

One thing I suspect: we have written a cron job that executes a shell script owned by root.

Thanks.

Rgds,
Arpit
arpit080399Asked:
liddlerCommented:
Check the third column, PPID (parent process ID); in your example that is 9923 (and 10152),
i.e.
ps -ef|grep 9923
what process is this?
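
The same PPID check can be repeated up the chain until it reaches init. A minimal sketch, assuming `ps -ef` prints the PID in column 2 and the PPID in column 3 (as in the listing in the question); `ancestry` is a hypothetical helper name, not an HP-UX command:

```shell
# ancestry PID: read `ps -ef` output on stdin and print the process
# with the given PID followed by each of its ancestors.
ancestry() {
    awk -v start="$1" '
        { ppid[$2] = $3; line[$2] = $0 }   # column 2 = PID, column 3 = PPID
        END {
            pid = start
            # Follow PPID links; stop at an unknown PID, at a process that is
            # its own parent, or after 64 steps as a guard against loops.
            for (n = 0; (pid in line) && n < 64; n++) {
                print line[pid]
                if (ppid[pid] == pid) break
                pid = ppid[pid]
            }
        }'
}
```

Usage would be something like `ps -ef | ancestry 10381`; the last line printed is the oldest ancestor that ps knows about.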
yuzhCommented:
Could you please type in:

top -s1 -d1

to find out which process is eating up the CPU and memory first, and then
use "ps" to find out the PPID.

see what we can do about it
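
If the top output is awkward to capture, a ps listing can be sorted by CPU instead. A minimal sketch, assuming your ps accepts `-o pcpu,pid,ppid,args` (on HP-UX this XPG4 syntax generally needs the UNIX95 variable set for the command); `top_cpu` is a hypothetical helper:

```shell
# top_cpu [N]: read `ps -e -o pcpu,pid,ppid,args` output on stdin and
# print the N busiest processes (default 5), highest %CPU first.
top_cpu() {
    sed 1d |                 # drop the header row
    sort -k1,1nr |           # numeric sort on column 1 (%CPU), descending
    head -n "${1:-5}"
}
```

For example: `UNIX95= ps -e -o pcpu,pid,ppid,args | top_cpu 10`. The PPID column in the result tells you which parent to look up next.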
yuzhCommented:
You might need to use "rtsched" or  "rtprio" to control
the process.

man rtsched
man rtprio

To learn more
     
arpit080399Author Commented:
This is a sample extract for one of the sh processes; it is the same for all of them.

    root 11617 11614  0 19:39:34 pts/5     0:00 -sh
    root 11614   617  0 19:39:25 ?         0:00 sshd: root@5
    root   617     1  0  Oct 28  ?         0:00 /opt/ssh/sbin/sshd
    root     1     0  0  Oct 28  ?         0:02 init

    root  9614   617  0 12:13:18 ?         0:02 sshd: root@notty
    root 11614   617  0 19:39:25 ?         0:00 sshd: root@5
    root   617     1  0  Oct 28  ?         0:00 /opt/ssh/sbin/sshd
    root     1     0  0  Oct 28  ?         0:02 init

    root 10152   617  0 14:09:15 ?         0:00 sshd: root@7
    root 11614   617  0 19:39:25 ?         0:00 sshd: root@5
    root   617     1  0  Oct 28  ?         0:00 /opt/ssh/sbin/sshd
    root     1     0  0  Oct 28  ?         0:02 init

Hope it helps
arpit080399Author Commented:
Hi Yuzh,

Yes, I used top and found that sh is eating up the CPU, and after that I sent the above details. Can I please get some more pointers on this?
yuzhCommented:
use:

netstat
who
ps -ef | grep ssh

to check out the connections

crontab -l

to check your crontab

It looks like the secure shell server sshd is handling a lot of ssh connections.

Do you have a lot of remote users?

What happens if you get a chance to reboot the box?

This is a UNIX sys adm question, you can create a link to UNIX TA.  And you
could get more help from the others.
arpit080399Author Commented:
Yes, when I reboot the box, everything works fine...
liddlerCommented:
Does sshd's size in memory keep increasing (i.e. a memory leak)?
arpit080399Author Commented:
liddler, how do I check whether there's a memory leak?
liddlerCommented:
ps -eo pid,rss,comm |grep sshd

Run it after a reboot, then maybe every couple of hours or so to see if the rss (resident set size) increases.  Have a look at ps man page for all the other options.
If HP-UX has truss or something similar, you can use it to monitor the sshd process to see if anything obvious looks wrong, but if you've not used truss before, it may be a little too much information for you to make sense of.
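
To compare readings over time it helps to reduce each run to a single line. A minimal sketch, assuming the three-column `pid rss comm` output of the ps command above; `sshd_mem` is a hypothetical helper. If your ps rejects `rss`, `vsz` may be the nearest substitute, but that is an assumption to check against the ps man page:

```shell
# sshd_mem: read `ps -eo pid,rss,comm` output on stdin and print how many
# sshd processes there are and the sum of their memory column.
sshd_mem() {
    awk '$3 ~ /sshd/ { total += $2; n++ }
         END { printf "count=%d total=%d\n", n, total }'
}
```

Running `ps -eo pid,rss,comm | sshd_mem` after a reboot and again every couple of hours gives numbers that are easy to compare. A total that climbs while the count stays flat would point towards a leak; a climbing count points towards sessions piling up.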
ahoffmannCommented:
> .. the shell processes starts automatically and eats up all the CPU ..
does this happen when you start a new shell/script, or when you login?
yuzhCommented:
For HP-UX you can use " tusc" (not part of the main distribution).

You can download it from: (binary)

http://hpux.cict.fr/hppd/cgi-bin/search

After installing it:

man tusc
arpit080399Author Commented:
This happens when I start a new shell process.
ahoffmannCommented:
stupid question:
  have you checked your .profile, .cshrc, or whatever is appropriate for your shell for cyclic includes (source or . command)?
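
A quick way to see what a startup file pulls in is to list its uncommented "." and "source" lines. A minimal sketch, assuming a grep that supports `-E` and POSIX character classes; `list_sourced` is a hypothetical helper:

```shell
# list_sourced FILE...: print, with file name and line number, every
# uncommented line that sources another file via "." or "source".
list_sourced() {
    # /dev/null is added so grep always prefixes matches with the file name.
    grep -nE '^[[:space:]]*(\.|source)[[:space:]]+[^[:space:]]' "$@" /dev/null
}
```

For example, `list_sourced ~/.profile /etc/profile`. Repeating it on each file that shows up reveals whether the chain ever leads back to a file already visited, which would be the cyclic include.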
arpit080399Author Commented:
ahoffmann, I'm pasting the relevant .profile statements here. Can you tell me whether this is the problem?


stty erase ^?
export CLOG=`date +%d`
alias jboss='cd /var/xsam/jboss-3.2.2RC4/bin'
alias cpsim='cd /tmp/sanjay/simulator/com/mobilgw/sam/test/simulator'
alias run-sam='./var/xsam/jboss-3.2.2RC4/bin/run_sam.sh'
alias checksam='ps -eaf | grep java'
alias jbossmq='cd /var/xsam/jboss-3.2.2RC4/server/resin-all/tmp'
alias samhome='cd /var/xsam/samhome'
alias titanium='./opt/titanium/bin/tiping 0.0'
alias clean='./clean.sh'
alias deploy='cd /var/xsam/jboss-3.2.2RC4/server/resin-all/deploy'
export ORBIX_HOME=/opt/iona
export JAVA_HOME=/opt/java1.4
#export JAVA_OPTS=-Dorg.omg.CORBA.ORBClass=IE.Iona.OrbixWeb.CORBA.ORB
#export JAVA_OPTS=$JAVA_OPTS -Dorg.omg.CORBA.ORBSingletonClass=IE.Iona.OrbixWeb.CORBA.singletonORB
#export JAVA_OPTS=$JAVA_OPTS -Dorg.omg.CORBA.ORBInitialHost=smidva
#export JAVA_OPTS=$JAVA_OPTS -Dorg.omg.CORBA.ORBInitialPort=1570
export JBOSS_HOME=/var/xsam/jboss-3.2.2RC4
export JBOSS_CLASSPATH=$ORBIX_HOME/config:$ORBIX_HOME/bin:$ORBIX_HOME/lib/OrbixNames.jar:$ORBIX_HOME/lib/OrbixWeb.jar
export PATH=$PATH:/opt/java1.4/bin
export ANT_HOME=/opt/ant-1.5.3
export JACORB_HOME=/home/terabyte/JacORB_1_4_1
export PATH=$ANT_HOME/bin:/usr/sbin:/usr/local/bin:/usr/contrib/bin:/opt/nano/bin:$PATH
export IT_CONFIG_PATH=/opt/iona/config
alias stopxsam='./stopxsam.sh'
export TERM=vt100
. $ORBIX_HOME/setenvs.sh  #THIS IS SUSPECT I GUESS
#. ./orbix.profile
umask 022
#export ORACLE_BASE=/u1/app/product
#jexport ORACLE_HOME=/u1/app/product/9021
#export ORACLE_SID=xsamdb
#export PATH=$PATH:$ORACLE_HOME/bin:.

ahoffmannCommented:
Could you please post $ORBIX_HOME/setenvs.sh too?
Which shell are you using? /bin/sh?

BTW, I'd remove the stty command in your .profile, or at least call it only if you're sure that it is in a tty context.
Also, the syntax
  export VAR=value
is not supported by all shells reading .profile, better written as
  VAR=value; export VAR
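
Applied to the first lines of the posted .profile, the two suggestions might look like this. A minimal sketch, assuming a Bourne-compatible login shell; the `[ -t 0 ]` test is one common way to detect a terminal and is my choice here, not something from the original file:

```shell
# Only set the erase character when stdin is attached to a terminal,
# so cron jobs and non-interactive ssh commands skip it.
if [ -t 0 ]; then
    stty erase '^?'
fi

# Bourne-compatible form: assign first, then export by name.
CLOG=`date +%d`; export CLOG
TERM=vt100; export TERM
```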
arpit080399Author Commented:
here goes the $ORBIX_HOME/setenvs.sh

#!/bin/echo This must be sourced by sh
# automatically generated by Orbix installation.
#
IONA_ROOT=/opt/iona ; export IONA_ROOT
ORBIX_ROOT=$IONA_ROOT ; export ORBIX_ROOT
ORBIXWEB_HOME=$ORBIX_ROOT ; export ORBIXWEB_HOME
JAVAHOME=/opt/java1.4  ; export JAVAHOME
IT_CONFIG_PATH=$IONA_ROOT/config ; export IT_CONFIG_PATH
IT_IDLGEN_CONFIG_FILE=$IT_CONFIG_PATH/idlgen.cfg ; export IT_IDLGEN_CONFIG_FILE
ORBIXEVENTS_ROOT=$IONA_ROOT ; export ORBIXEVENTS_ROOT

PATH=$IONA_ROOT/bin:$JAVAHOME/bin:$PATH ; export PATH

# Add other .jar files from the lib directory to the CLASSPATH as needed.
# Respect the same order of JAR files as for IT_DEFAULT_CLASSPATH, in common.cfg in the config directory.
if [ ${CLASSPATH:=x} != x ]
then
        CLASSPATH=$IT_CONFIG_PATH:$IONA_ROOT/demos/classes:$CLASSPATH ; export CLASSPATH
else
        CLASSPATH=$IT_CONFIG_PATH:$IONA_ROOT/demos/classes; export CLASSPATH
fi

if [ ${SHLIB_PATH:=x} != x ]
then
        SHLIB_PATH=$IONA_ROOT/lib:$SHLIB_PATH; export SHLIB_PATH

else
        SHLIB_PATH=$IONA_ROOT/lib; export SHLIB_PATH
fi
ahoffmannCommented:
Neither script calls any other program or script except date, so they should not be the culprit for your problem.
arpit080399Author Commented:
OK ahoffmann, then what else do you suspect? One more thing: many times Ctrl-C doesn't work to kill something, so developers just close the window. Could that be one of the possible causes? Any other clues will be helpful.
ahoffmannCommented:
a bit confused ..
you said that it is for a/each "shell session"
>  This happens when i start a new shell process .

I asked if it happens when a user logs in, or when a (shell) script is started.
According to your last comment I assume that it happens for scripts, randomly.
Could you please confirm?
arpit080399Author Commented:
It happens randomly! I mean, when a user logs in, shell processes are started;
sometimes the shell processes die and sometimes they don't, so the problem is random, I guess.
yuzhCommented:
Why don't you install tusc (HP-UX 11.x) or trace (HP-UX 10.x) to check it out?
ahoffmannCommented:
ok, login shell
and randomly

Does randomly mean random per user (always the same user), or does it also mean random for all users?

I'd first disable all user-private ~/.profile ~/.login ~/.cshrc ~/.bashrc etc.

Are the home directories NFS mounted?
tfewsterCommented:
>>many times to kill something ctrl-C doesnt work so developers just close the window.

That's a likely cause of these "orphan" shell processes: although the terminal window is closed, the HP-UX system thinks the connection is still open and can chew up CPU trying to service it.  This often happens if the users hop from one system to another instead of connecting directly.

The sh processes you showed aren't using much CPU, but it will add up. Can you trace the process tree of an sh process that IS using a lot of CPU (i.e. its parents & children) and show the `ps` output?

You may be able to use "idled" to kill these orphan sessions automatically - http://www.darkwing.com/idled/README.html
or write your own "cleanup" script to kill idle connections.
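
A home-grown cleanup could start by listing idle sessions rather than killing anything. A minimal sketch, assuming `who -u` prints an idle field (`.` for recently active, `old` for more than 24 hours, otherwise `H:MM`) immediately followed by the PID of the login shell; `idle_sessions` and the 60-minute default are hypothetical, and this is not how idled itself works:

```shell
# idle_sessions [MINUTES]: read `who -u` output on stdin and print
# "pid user line idle" for sessions idle at least MINUTES (default 60).
idle_sessions() {
    awk -v limit="${1:-60}" '
        {
            # The date columns differ between systems, so look for the first
            # field that looks like an idle time and is followed by a PID.
            for (i = 3; i < NF; i++) {
                if ($i ~ /^(\.|old|[0-9]+:[0-9][0-9])$/ && $(i+1) ~ /^[0-9]+$/) {
                    if ($i == ".")        mins = 0
                    else if ($i == "old") mins = 24 * 60
                    else { split($i, t, ":"); mins = t[1] * 60 + t[2] }
                    if (mins >= limit) print $(i+1), $1, $2, $i
                    break
                }
            }
        }'
}
```

For example, `who -u | idle_sessions 120` lists candidates only. Review the list before passing any PID to kill, since a long-running job can sit on a terminal that looks idle.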
ahoffmannCommented:
tfewster, nice tip
I also thought about the limit of processes, or limit of users, which can be set in the kernel.
tfewsterCommented:
Good point, ahoffmann - it may be that the kernel is tuned to only allow enough processes for the "normal" workload and all the remote users are blowing that limit;  sar -v and /var/adm/syslog/syslog.log should give a clue if those limits are being reached.  If sar is running all the time, you can see what is "normal" (e.g. shortly after a reboot) and

Of course, as all the remote users appear to log in as root, setting maxuprc will not restrict them ;-)

I'm also curious why "Ctrl-C doesn't work" - It may be a badly behaved program that doesn't respond to interrupts, but it needs to be killed "properly";  Ideally the users should start another session and clean up the first.
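
While sar -v reports the system-wide table usage, a per-user tally of the ps output shows who is holding the entries. A minimal sketch, assuming `ps -ef` output with a header row and the owner in column 1; `procs_per_user` is a hypothetical helper:

```shell
# procs_per_user: read `ps -ef` output on stdin and print the number of
# processes owned by each user, busiest user first.
procs_per_user() {
    awk 'NR > 1 { count[$1]++ }          # skip the header, tally column 1 (UID)
         END { for (u in count) print count[u], u }' |
    sort -k1,1nr
}
```

For example, `ps -ef | procs_per_user`. Comparing the result shortly after a reboot with the result when the box is slow shows how far the leftover sessions have pushed the totals.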
arpit080399Author Commented:
Thank you ahoffmann and tfewster. The problem was resolved by getting users to clean up their sessions properly and by using the "idled" utility to deal with any idle instances.
suryapadmaCommented:
I have a similar issue and my server CPU and memory are getting hogged and in a Cluster environment, the switch over is delayed and even Resin start up is very slow and time consuming.

Regards

Surya