August 29, 2008 06:02pm pdt
 

RS6000 AIX 4.3.3 Server became inaccessible

Tags: IBM, RS 6000, 32P0731, AIX 4.3.3, 46072733 U1.1-P1.1 On panel display, Oracle 8.1.7 Server inaccessible
AIX 4.3.3
RS 6000 32P0731 6M1
FastT700 with 3 EXP700 Expansion Modules.
Runs Oracle 8.1.7

On Friday 3/21 we lost connectivity to the server. Could not ping, TNS Ping, or connect via telnet or ssh. There were no apparent power outages and no switching/routing outages. I came in Saturday AM and found the server frozen.
The panel on the server showed this error:  46072733  (and) U1.1-P1.1
Drive lights were solid green, not flickering.
I powered the unit off & back on. Which log file should I look at to find out what happened?

I did this:
The error logging process begins when an operating system module detects an
error. The error detecting segment of code then sends error information to either
the errsave and errlast kernel service or the errlog application subroutine where
the information is, in turn, written to the /dev/error special file. This process then
adds a time stamp to the collected data. You can use the errpt command to
retrieve an error record from the error log.

oracle@ifs-server:/var/adm/ras>errpt -a -s 0319093008 | more

*But it only showed me the following, which is reporting on me shutting down and rebooting the server:


LABEL:          SCANOUT
IDENTIFIER:     CF8CADB6

Date/Time:       Sat Mar 22 06:59:41
Sequence Number: 382
Machine Id:      000A8ABA4C00
Node Id:         ifs-server
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:        00-00

Description
SYSTEM FAILURE WITH SCAN DATA

Probable Causes
SYSTEM HARDWARE
SOFTWARE ERROR

Failure Causes
SYSTEM HARDWARE
SOFTWARE SUBSYSTEM

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES
        IF PROBLEM CONTINUES TO OCCUR REPEATEDLY THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
ERROR COUNT
           1
SCAN DATA PATHNAME
/usr/lib/ras/scanoutlog.000A8ABA4C00.A
---------------------------------------------------------------------------
LABEL:          EPOW_SUS_CHRP
IDENTIFIER:     071F4755

Date/Time:       Sat Mar 22 06:59:01
Sequence Number: 381
Machine Id:      000A8ABA4C00
Node Id:         ifs-server
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:        00-00

Description
ENVIRONMENTAL PROBLEM

Probable Causes
POWER OR FAN COMPONENT

        Recommended Actions
        RUN SYSTEM DIAGNOSTICS.
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
POWER STATUS REGISTER
0000 0002
PROBLEM DATA
Standard input

Error code deleted unless you really want it...


LABEL:          SCAN_ERROR_CHRP
IDENTIFIER:     BFE4C025

Date/Time:       Sat Mar 22 06:58:01
Sequence Number: 380
Machine Id:      000A8ABA4C00
Node Id:         ifs-server
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:        00-00

Description
UNDETERMINED ERROR

Failure Causes
UNDETERMINED

        Recommended Actions
        RUN SYSTEM DIAGNOSTICS.

Detail Data
PROBLEM DATA
Error code deleted...



LABEL:          ERRLOG_ON
IDENTIFIER:     9DBCFDEE

Date/Time:       Sat Mar 22 06:59:40
Sequence Number: 379
Machine Id:      000A8ABA4C00
Node Id:         ifs-server
Class:           O
Type:            TEMP
Resource Name:   errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY
User Causes
/USR/LIB/ERRDEMON COMMAND
        Recommended Actions
        NONE


***I also did this:  To determine the path to your system's error log file, run the following command:
# /usr/lib/errdemon -l
Error Log Attributes
--------------------------------------------
Log File /var/adm/ras/errlog
Log Size 1048576 bytes
Memory Buffer Size 32768 bytes

I got:

oracle@ifs-server:/usr/lib>errdemon -l
ksh: errdemon:  not found.
oracle@ifs-server:/usr/lib>
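
The "not found" above is most likely because the command was typed as a bare name: ksh looks it up in PATH, /usr/lib is normally not in PATH, and being in that directory does not help. The documented form uses the full path, /usr/lib/errdemon -l. A minimal sketch follows, assuming the binary sits at that path; the function name and the ERRDEMON override are hypothetical and exist only so the snippet can be tried on a non-AIX machine. The command may also need root authority, which I have not confirmed for 4.3.3.

```shell
# Sketch: run errdemon -l by absolute path instead of relying on PATH.
# show_errlog_attrs and the ERRDEMON variable are illustrative names,
# not part of AIX.
show_errlog_attrs() {
    errdemon_bin=${ERRDEMON:-/usr/lib/errdemon}
    if [ -x "$errdemon_bin" ]; then
        # -l lists the error log attributes (log file path, sizes)
        "$errdemon_bin" -l
    else
        echo "cannot execute $errdemon_bin (not AIX, or insufficient permissions)" >&2
        return 1
    fi
}
```

On the server this would be called as show_errlog_attrs, or simply by typing /usr/lib/errdemon -l at the prompt.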



***And I did this and got:

oracle@ifs-server:/usr/lib>errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
CF8CADB6   0322065908 P H sysplanar0     SYSTEM FAILURE WITH SCAN DATA
071F4755   0322065908 P H sysplanar0     ENVIRONMENTAL PROBLEM
BFE4C025   0322065808 P H sysplanar0     UNDETERMINED ERROR
9DBCFDEE   0322065908 T O errdemon       ERROR LOGGING TURNED ON
5DFED6F1   0205164508 I O SYSPFS         UNABLE TO ALLOCATE SPACE IN FILE SYSTEM

But there is no entry for 3/21. (I rebooted it the next morning when the issue was discovered, which is indicated by the entries on 3/22 @ 7am... but again, nothing for 3/21.) I know it happened on 3/21 around noon because that was the last time an archive log file made it to our standby database.
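
For checking the 3/21 window directly: the TIMESTAMP column in the errpt summary and the -s (start) and -e (end) arguments share the same mmddhhmmyy layout, so 0322065908 reads as Mar 22 06:59 2008. The sketch below decodes such a stamp and wraps a start/end query. Both function names are made up for illustration, and the 20yy century is an assumption that fits this 2008 log.

```shell
# Decode an errpt timestamp (mmddhhmmyy) into a readable date.
# Assumption: the two-digit year means 20yy.
decode_errpt_stamp() {
    stamp=$1
    mm=$(printf '%s' "$stamp" | cut -c1-2)
    dd=$(printf '%s' "$stamp" | cut -c3-4)
    hh=$(printf '%s' "$stamp" | cut -c5-6)
    mi=$(printf '%s' "$stamp" | cut -c7-8)
    yy=$(printf '%s' "$stamp" | cut -c9-10)
    printf '20%s-%s-%s %s:%s\n' "$yy" "$mm" "$dd" "$hh" "$mi"
}

# Show detailed error records between two stamps (both mmddhhmmyy).
# Off AIX, where errpt does not exist, it only prints the command.
errpt_window() {
    if command -v errpt >/dev/null 2>&1; then
        errpt -a -s "$1" -e "$2"
    else
        echo "errpt not found; would run: errpt -a -s $1 -e $2"
    fi
}

# Example for the outage day (all of 3/21/08):
#   errpt_window 0321000008 0322000008
```

If that window comes back empty, the error daemon recorded nothing during it. errpt also accepts -d H to limit output to hardware-class entries and -j with an IDENTIFIER value (for example -j CF8CADB6) to pull one error type.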

Is there any other way to determine what happened? I have googled the error that was on the panel, but have come up empty-handed.
Question Stats
Zone: OS
Question Asked By: khowe34
Solution Provided By: polazarus
Participating Experts: 1
Solution Grade: B
Views: 20