nsome
SRC_RSTRT; IDENTIFIER: BA431EB7; 'srchevn.c'@line:'217';rpc.statd
Every ten minutes or so I'm getting the error below. The rpc.statd service is active. Any suggestions?
LABEL: SRC_RSTRT
IDENTIFIER: BA431EB7
Date/Time: Wed Jun 22 15:08:32 EDT
Sequence Number: 16039
Machine Id: xxxxxxxxxxxxxx
Node Id: hostname
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
VERIFY SUBSYSTEM RESTARTED AUTOMATICALLY
Detail Data
SYMPTOM CODE
256
SOFTWARE ERROR CODE
-9035
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'217'
FAILING MODULE
rpc.statd
---------------
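To see how often the SRC restart is actually being logged, the error report can be filtered on the identifier from the entry above. This is a minimal sketch, assuming the usual errpt summary layout with the identifier in the first column; the function name count_error_id and the sample lines are invented for illustration.

```sh
#!/bin/sh
# Count the summary lines in errpt-style output that carry a given identifier.
# Assumption: the identifier is the first whitespace-separated column.
count_error_id() {
    awk -v id="$1" '$1 == id { n++ } END { print n + 0 }'
}

# Hypothetical sample in the errpt summary layout (timestamps are invented).
sample='IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
BA431EB7   0622150805 P S SRC            SOFTWARE PROGRAM ERROR
BA431EB7   0622145805 P S SRC            SOFTWARE PROGRAM ERROR
1234ABCD   0622120005 I O example        EXAMPLE ONLY'

printf '%s\n' "$sample" | count_error_id BA431EB7

# On the AIX host itself, feed in the real report instead.
if command -v errpt >/dev/null 2>&1; then
    errpt -j BA431EB7 | count_error_id BA431EB7
fi
```

Running errpt -a -j BA431EB7 on the host prints the detailed entries for the same identifier, which is the format pasted above.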
What does oslevel -r report?
ASKER
5200-01
I found a hit similar to your case on AIX 5.2:
http://www-1.ibm.com/support/docview.wss?uid=isg1IY47085
rpc.statd is related to NFS, and this is a software error in AIX. When I searched on the keyword rpc.statd I got no hits. Only IBM employees can access the entire IBM problem database. If you want the solution, you must report the problem to IBM.
Smells like an old, unpatched XDR decoder.
You have to install at least 5200-02 to get rid of the known rpc.* problems, or better, 5200-06.
Patches are here:
<http://www-912.ibm.com/eserver/support/fixes/maintpkg.jsp?system=2&release=52>
ASKER
Downloading 5200-06 level right now...
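Before and after applying the package, the reported level can be compared with the 5200-02 minimum mentioned above. The following is a small sketch; the helper name ml_at_least is made up here, and it assumes both levels use the RRRR-MM form that oslevel -r prints.

```sh
#!/bin/sh
# Succeed if maintenance level $1 is at or above maintenance level $2.
# Both arguments are expected in the "RRRR-MM" form, e.g. 5200-01.
ml_at_least() {
    have_rel=${1%%-*}; have_ml=${1##*-}; have_ml=${have_ml#0}
    want_rel=${2%%-*}; want_ml=${2##*-}; want_ml=${want_ml#0}
    if [ "$have_rel" -ne "$want_rel" ]; then
        [ "$have_rel" -gt "$want_rel" ]
    else
        [ "$have_ml" -ge "$want_ml" ]
    fi
}

current=5200-01                      # level reported earlier in this thread
if command -v oslevel >/dev/null 2>&1; then
    current=$(oslevel -r)            # use the live value when run on AIX
fi

if ml_at_least "$current" 5200-02; then
    echo "$current is at or above 5200-02"
else
    echo "$current is below 5200-02, update still needed"
fi
```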
Be sure to read all instructions and religiously back up *all* before and after ...
(including raw partitions, not included with vg backup)
ASKER
Following the directions at https://techsupport.services.ibm.com/server/mlfixes/52/06/01to06.html I installed volume 1, after which oslevel -r still displayed 5200-01. I dumped the volume 2 contents into /usr/sys/inst.images and ran smit update_all, which displayed the following:
Command: failed stdout: yes stderr: no
Before command completion, additional instructions may appear below.
#----------------------------------------------------------------------
# No filesets on the media could be used to update the currently
# installed software.
#
# Either the software is already at the same level as on the media, or
# the media contains only filesets which are not currently installed.
#----------------------------------------------------------------------
Should I reboot? Should I have dumped the packages elsewhere?
ASKER
What doesn't it like?
ASKER
OK - I dumped the 50006.v2.tar.gz into a tmp directory under /usr/sys/inst.image/tmp and most of the install succeeded, except for the following SDK Java errors (which may not be critical filesets):
Pre-installation Failure/Warning Summary
----------------------------------------
Name                      Level      Pre-installation Failure/Warning
-------------------------------------------------------------------------------
sysmgt.websm.webaccess    5.2.0.60   Requisite failure
sysmgt.websm.rte          5.2.0.50   Requisite failure
sysmgt.websm.icons        5.2.0.60   Requisite failure
sysmgt.websm.framework    5.2.0.60   Requisite failure
sysmgt.websm.diag         5.2.0.50   Requisite failure
sysmgt.websm.apps         5.2.0.50   Requisite failure
sysmgt.sguide.rte         5.2.0.40   Requisite failure
Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...
FAILURES
--------
Filesets listed in this section failed pre-installation verification
and will not be installed.
Requisite Failures
------------------
SELECTED FILESETS: The following is a list of filesets that you asked to
install. They cannot be installed until all of their requisite filesets
are also installed. See subsequent lists for details of requisites.
#1 sysmgt.sguide.rte 5.2.0.40 # TaskGuide Runtime Environment
#2 sysmgt.websm.apps 5.2.0.50 # Web-based System Manager App...
#3 sysmgt.websm.diag 5.2.0.50 # Web-based System Manager Dia...
#4 sysmgt.websm.framework 5.2.0.60 # Web-based System Manager Cli...
#5 sysmgt.websm.icons 5.2.0.60 # Web-based System Manager Icons
#6 sysmgt.websm.rte 5.2.0.50 # Web-based System Manager Run...
#7 sysmgt.websm.webaccess 5.2.0.60 # WebSM Web Access Enablement
MISSING REQUISITES: The following filesets are required by one or more
of the selected filesets listed above. They are not currently installed
and could not be found on the installation media.
(Selected filesets which depend upon these requisites are referenced in
parentheses.)
Java14.sdk 1.4.0.1 # Fileset Update
(dep #s: 1-7)
<< End of Failure Section >>
Install Java14.sdk 1.4.0.1, or find out why this fileset cannot be installed. Some filesets require you to accept new license agreements before they will install.
On the other hand, you can deinstall these web-based administration modules, since you are able to work with the command line just fine.
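If the deinstall route is taken, the fileset names can be pulled straight out of the failure summary rather than retyped. A rough sketch follows, assuming the summary lines look like the ones pasted above; failed_filesets is a made-up helper name, and the loop only prints preview commands so nothing is removed by running it.

```sh
#!/bin/sh
# Print the fileset names flagged "Requisite failure" in an installp summary
# read from stdin (first column of each matching line).
failed_filesets() {
    awk '/Requisite failure/ { print $1 }' | sort -u
}

# A few lines copied from the summary in this thread.
summary='sysmgt.websm.webaccess 5.2.0.60 Requisite failure
sysmgt.websm.rte 5.2.0.50 Requisite failure
sysmgt.sguide.rte 5.2.0.40 Requisite failure'

for fs in $(printf '%s\n' "$summary" | failed_filesets); do
    # -p asks installp for a preview of the -u (deinstall) action only.
    echo "installp -p -u $fs"
done
```

Review the printed list first: a fileset such as sysmgt.sguide.rte may be one you would rather keep, and the preview will also show anything that depends on the filesets being removed.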
ASKER
Would the failure of these filesets impact the overall maintenance level upgrade? I rebooted the box and oslevel -r still lists 5200-01, and the syslog error from above keeps coming in. What do you recommend next?
I recommend applying the full patch (e.g. by deinstalling the non-upgradeable and unnecessary websm modules and upgrading the java14 filesets).
ASKER
I can't seem to find the proper java14 filesets to install.
http://www.ibm.com/developerworks/java/jdk/aix/index.html
Your upgrade is mostly done, though: the kernel and C libraries are updated along with the net services; only web management and the JDK are holding you back.
IMO installing Java and websm will not require a reboot.
oslevel -r simply shows the level of the oldest component.
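One way to confirm which maintenance level is dragging the reported value down is the ML summary from instfix -i. The sketch below assumes the wording instfix normally prints ("All filesets for ... were found." and "Not all filesets for ... were found."); incomplete_mls is a hypothetical helper and the sample text is illustrative, not captured from this system.

```sh
#!/bin/sh
# From `instfix -i | grep ML` style output on stdin, print the maintenance
# levels for which not every fileset is installed.
incomplete_mls() {
    sed -n 's/^ *Not all filesets for \(.*\)_AIX_ML were found\.$/\1/p'
}

# Illustrative sample in the assumed instfix wording.
sample='    All filesets for 5200-01_AIX_ML were found.
    Not all filesets for 5200-06_AIX_ML were found.'

printf '%s\n' "$sample" | incomplete_mls

# On the AIX host, use the live data.
if command -v instfix >/dev/null 2>&1; then
    instfix -i | grep ML | incomplete_mls
fi
```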
ASKER
I removed the web management packages that failed to install. The only thing still complaining is sysmgt.sguide.rte 5.2.0.40 # TaskGuide Runtime Environment - what is this for?
So you're saying that until the rev is officially 5200-06 (even though all 5200-06 filesets loaded except for Java and sysmgt.sguide.rte), rpc.statd could still be having problems?
ASKER
The http://www.ibm.com/developerworks/java/jdk/aix/index.html link requires an IBM ID, and I don't know if I have one...crap.
You have to register, and it will take a few days to process (because Java counts as encryption software).
Or alternatively you can deinstall it ...
ASKER
You mean to deinstall sysmgt.sguide.rte 5.2.0.40 # TaskGuide Runtime Environment? Is this necessary? I want to eliminate the rpc.statd problem asap, and if this thing is unnecessary I can deal with it later on...
rpc.statd is already patched.
You can keep the unpatched TaskGuide. It does not hurt portmapper or rpc.statd; it is the support environment for the config wizards, it will still work, and it is fairly safe, since it is not an always-on network service.
ASKER
So you're saying that because the TaskGuide fileset has not been updated, the level reported by oslevel -r will not display 5200-06 until every fileset is updated (all or nothing)?
I've had the system rebooted and the error reported initially is still happening.
The first job is done: I can no longer tell you that the XDR decoder has a bug that makes each and every rpc service easy to crash :-)
rpc.statd is the twin brother of rpc.lockd.
Try stopsrc -s rpc.lockd ; stopsrc -s rpc.statd ; startsrc -s rpc.statd ; startsrc -s rpc.lockd
then use rpcinfo -p to check how they are running.
If they crash right away: they use files in /etc/sm/ and /etc/state to keep status across restarts. Maybe these were damaged by a forced power-off, a kill -9 or the like, and need to be cleaned up.
Are you using this machine as an NFS server so that multiple clients can write to the same files? (I.e. do you need lockd and statd at all?)
The error only says that the SRC (System Resource Controller) detected that the subsystem crashed and started it again; it does not tell much more.
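The restart-and-check sequence above can be wrapped in a short script. This is a sketch under two assumptions: that statd and lockd show up in the rpcinfo -p listing as "status" and "nlockmgr" in the last column, and that a five second pause is enough for them to register. The helper rpc_registered and the sample listing (including its port numbers) are invented for illustration.

```sh
#!/bin/sh
# Succeed if the named RPC service appears in `rpcinfo -p` style output
# on stdin. Assumption: the service name is the last column.
rpc_registered() {
    awk -v svc="$1" '$NF == svc { found = 1 } END { exit !found }'
}

# Hypothetical listing in the usual portmapper layout (ports are invented).
listing='   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100024    1   udp  32780  status
    100021    1   udp  32781  nlockmgr'

if printf '%s\n' "$listing" | rpc_registered status; then
    echo "sample: status is registered"
fi

# On the AIX host: restart in the order given above, then check registration.
if command -v stopsrc >/dev/null 2>&1; then
    stopsrc -s rpc.lockd
    stopsrc -s rpc.statd
    startsrc -s rpc.statd
    startsrc -s rpc.lockd
    sleep 5
    for svc in status nlockmgr; do
        if rpcinfo -p | rpc_registered "$svc"; then
            echo "$svc registered"
        else
            echo "$svc NOT registered"
        fi
    done
fi
```

lssrc -s rpc.statd gives the SRC view of the same subsystem, which helps tell "registered but restarting" apart from "never came up".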
Do not worry about some unused filesets not being updated: the system kernel, base libraries and net services are updated, and that is all that is needed.
oslevel -r displays the ML of the oldest fileset, so there is nothing to worry about: the filesets not mentioned in the errors are completely updated.
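The filesets holding the level back can be listed with oslevel -r -l 5200-06, which reports everything installed below that maintenance level. The sketch here assumes the three-column layout that command usually prints (fileset, actual level, maintenance level); lagging_filesets is a made-up helper and the levels in the sample are invented.

```sh
#!/bin/sh
# From `oslevel -r -l <ML>` style output on stdin, print just the fileset
# names. Header and separator lines are skipped because their second and
# third columns do not look like dotted version numbers.
lagging_filesets() {
    awk '$2 ~ /^[0-9]+\.[0-9.]+$/ && $3 ~ /^[0-9]+\.[0-9.]+$/ { print $1 }'
}

# Hypothetical sample in the assumed layout (levels are invented).
sample='Fileset                                 Actual Level       Maintenance Level
-----------------------------------------------------------------------------
Java14.sdk                              1.4.0.0            1.4.0.1
sysmgt.sguide.rte                       5.2.0.0            5.2.0.40'

printf '%s\n' "$sample" | lagging_filesets

# On the AIX host, use the live data.
if command -v oslevel >/dev/null 2>&1; then
    oslevel -r -l 5200-06 | lagging_filesets
fi
```

If the only names printed are Java and TaskGuide, that matches the explanation above: the rest of the system is already at the new level.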
> Date: 06/23/2005 06:10AM PDT
And check whether this is installed too (by date it should be, but who knows).
ASKER
If what is installed?
fix recommended by cpc2004 ...
ASKER
gheist - I just noticed that I missed your comment dated 06/26/2005 08:22AM PDT with the steps to investigate rpc.statd stop/startup. I'll go ahead and give that a shot. Sorry.
To your question "are you using nfs server on this machine so multiple clients can write to same files ???", the answer is no. As for the other question (do you need lockd and statd at all?): what else are these daemons for?
...
Do you have anything in the /etc/netsvc.conf? If not, try adding the line hosts=local,bind to the file. That took care of the problem for me.
ASKER
Thanks rhorton2. I just edited the file and restarted both rpc.statd and rpc.lockd. I'm waiting for about 20 minutes or so to confirm if it logs the error again. In your situation, did the error log every 10-15 minutes?
It would happen every 6 minutes. The exact same error using the same os level. After making the change the error has not returned.
ASKER
Did you reboot the box?
You added the following:
hosts=local,bind
right?
I did not reboot. I simply added a line at the bottom of the /etc/netsvc.conf which only had comments in it. The file tells how to do name resolution. The two entries I used were local and bind which are explained below. If you use a different method you may want to change the config a little. http://publib16.boulder.ibm.com/pseries/en_US/files/aixfiles/netsvc.conf.htm explains the file in detail.
# bind Uses BIND/DNS services for resolving names
# local Searches the local /etc/hosts file for resolving names
Also, if I remember right, the error showed up one more time after I made the change and then disappeared. I would suggest making the change and monitoring the system for a while to see if the problem goes away. If you're seeing an error every 10-15 minutes, you should know after a couple of hours if things look good.
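To double-check that the edit took effect, the active hosts= line can be read back while ignoring the comment lines the file ships with. A minimal sketch: hosts_order is a hypothetical helper, and the demo file only mimics /etc/netsvc.conf as described above.

```sh
#!/bin/sh
# Print the effective hosts= ordering from a netsvc.conf-style file,
# ignoring comments. Prints nothing if there is no active hosts line.
hosts_order() {
    sed -n -e 's/[[:space:]]*#.*$//' \
           -e 's/^[[:space:]]*hosts[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Demo file imitating a netsvc.conf with comments plus the added line.
demo="${TMPDIR:-/tmp}/netsvc.conf.demo.$$"
cat > "$demo" <<'EOF'
# bind   Uses BIND/DNS services for resolving names
# local  Searches the local /etc/hosts file for resolving names
hosts=local,bind
EOF
hosts_order "$demo"
rm -f "$demo"

# On the AIX host, check the real file.
if [ -r /etc/netsvc.conf ]; then
    hosts_order /etc/netsvc.conf
fi
```

An empty result on the real file means only comments are present and the default resolution order is still in use.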
ASKER
The error came back...
gheist - neither /etc/sm nor /etc/state exists. Is that a concern?