Solved

SCSI Bus Reset - Disk or SCSI adapter issue

Posted on 2013-12-20
Last Modified: 2014-02-11
One of our Solaris servers stopped responding this morning.  The application that runs on the server is vendor supported.  The vendor found no errors reported by iostat or prtdiag, and no amber light was showing on the server either.  However, two disks were disconnected.  They shut the server down, powered it back on, and it booted up.

The only errors reported on the server are the ones in /var/adm/messages, a copy of which I have attached.

Although I understand that multiple hard disks may fail, I am not sure whether the issue is the hard disks, the SCSI controller, or the motherboard.  I was hoping someone could tell by looking at the attached log.  Please let me know if there are other commands I can run that might give you a better idea of the problem.
messages.0.txt
Question by:cartereverett
7 Comments
 
LVL 47

Expert Comment

by:dlethe
ID: 39732518
Check cabling & termination. There are no other entries in the log that reveal other issues.
You could spend some money on diagnostic software that would get to the bottom of things, but it probably isn't worth it.
 
LVL 16

Expert Comment

by:Gerald Connolly
ID: 39733958
As David said, check the cabling and termination.
NB: SCSI is a bus and requires termination at both ends of the bus.
Missing termination, or more than one terminator at either end, will cause problems.
 
LVL 62

Expert Comment

by:gheist
ID: 39734342
It would help if you provided basic system information, e.g. at least whether the disks are built in and whether your server is a PC or SPARC.

ASC 02 -> no seek complete, i.e. the SCSI device did not do anything on command.
Given that the failing command is a "write", you most likely lose 4KB every couple of minutes.

For system info, send the output of prtconf -v.

What do you mean by "vendor"? Was it Oracle saying that continuous disk errors involving data loss are OK to leave as they are?
 
LVL 47

Expert Comment

by:dlethe
ID: 39734372
Gheist - You are misreading this.
It is ASC=29h, ASCQ=02h, not ASC=02h.  This is defined as a SCSI bus reset per the ANSI spec.

A no seek complete would be ASC=02h, ASCQ=00h (which can't happen on a WRITE10 CDB anyway).
P.S. I write SCSI diagnostic code professionally.
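
For readers who want to check an ASC/ASCQ pair themselves, here is a minimal Python sketch of the lookup described above. The table is only a small illustrative subset of the published SCSI additional sense code list (the 29h family), and the function name describe_sense is made up for this example; it is not part of any Solaris tool.

```python
# Illustrative subset of ASC/ASCQ pairs (the 29h "reset" family only).
# A real tool would load the complete published code list instead.
ASC_ASCQ = {
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
    (0x29, 0x01): "POWER ON OCCURRED",
    (0x29, 0x02): "SCSI BUS RESET OCCURRED",
    (0x29, 0x03): "BUS DEVICE RESET FUNCTION OCCURRED",
}


def describe_sense(asc: int, ascq: int) -> str:
    """Return a readable description for an ASC/ASCQ pair (hypothetical helper)."""
    text = ASC_ASCQ.get((asc, ascq), "not in this illustrative table")
    return f"ASC={asc:02X}h ASCQ={ascq:02X}h: {text}"


# The pair reported in this thread's log.
print(describe_sense(0x29, 0x02))
```

The key point is that the ASC and ASCQ must be read together as a pair; looking at either byte on its own gives the wrong meaning.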
 
LVL 16

Accepted Solution

by:Joseph Gan (earned 500 total points)
ID: 39735192
The system had lots of errors:

Dec 18 02:38:42 VRCdata.braishfield.local scsi: [ID 107833 kern.notice]       Requested Block: 114103696                 Error Block: 114103696
Dec 18 02:38:42 VRCdata.braishfield.local scsi: [ID 107833 kern.notice]       Vendor: FUJITSU                            Serial Number: 0745B0PAJU  

I assume this was a Fujitsu internal disk (or disks) with the OS installed on it.

Could you post the output of "iostat -En" here?
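
To see whether the errors in /var/adm/messages all point at one drive, the log can be tallied by serial number. The following is a rough Python sketch that assumes the scsi kern.notice lines look like the two quoted above; the function names are hypothetical, and other Solaris releases or drivers may format these lines differently.

```python
import re
from collections import Counter

# Patterns modelled on the two log lines quoted in this thread (an assumption
# about the format, not a documented specification).
SERIAL_RE = re.compile(r"Vendor:\s+(\S+)\s+Serial Number:\s+(\S+)")
BLOCK_RE = re.compile(r"Requested Block:\s+(\d+)\s+Error Block:\s+(\d+)")

SAMPLE = [
    "Dec 18 02:38:42 VRCdata.braishfield.local scsi: [ID 107833 kern.notice]"
    "       Requested Block: 114103696                 Error Block: 114103696",
    "Dec 18 02:38:42 VRCdata.braishfield.local scsi: [ID 107833 kern.notice]"
    "       Vendor: FUJITSU                            Serial Number: 0745B0PAJU  ",
]


def count_errors_by_disk(lines):
    """Count error reports per (vendor, serial number) pair."""
    counts = Counter()
    for line in lines:
        match = SERIAL_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def error_blocks(lines):
    """Collect (requested block, error block) pairs as integers."""
    pairs = []
    for line in lines:
        match = BLOCK_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


for (vendor, serial), n in count_errors_by_disk(SAMPLE).items():
    print(vendor, serial, n)
print(error_blocks(SAMPLE))
```

To run it against a real log, replace SAMPLE with the lines read from a copy of /var/adm/messages. If one serial number accounts for nearly all the reports, that drive is the first thing to suspect.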
 

Author Comment

by:cartereverett
ID: 39735986
The issue was one of the hard disks in the data mirror.  Replaced the drive, resynced and everything is back to normal.
 
LVL 16

Expert Comment

by:Joseph Gan
ID: 39736986
Yes, that's it!