elliot2002

asked on

SCSI hard drive prob - please help!

Hello there,

This one could have gone into hardware, XP or win 98 – but I am desperate for a resolution so I went with my favourite. Sorry this is a bit long, but hopefully the more info I give the easier it will be for you guys to contribute.

I am experiencing some pretty fundamental (but intermittent) problems with a PC that I have just installed for someone. It is a P4 2 gig, with 1 gig RAM, Geforce 3 and a SCSI Quantum Atlas 10KIII.

The PC was initially pre-installed with windows 98. I was only able to test it for a limited period, but everything seemed to be running smoothly. Then the decision was taken to upgrade to XP – but due to time constraints I had to upgrade over the top of win98 rather than do a fresh install as I would have preferred. I followed the procedure for installing XP (inserting the SCSI driver disk at the appropriate time) and everything seemed to be running perfectly. An intensive period of use followed during which many programs were installed and settings transferred – no problems manifested themselves. The two weeks that followed were relatively trouble free.

However, in addition to sporadic lockups, more serious boot-up problems have started to occur. On several occasions the message “Disk boot failure insert system disk and press enter” has appeared. He got round this by inserting his XP disks and trying various combinations of onscreen options – I sense that the problem spontaneously rectified itself.

Also with increasing frequency, the system has been stalling very early on just when the system is detecting the SCSI drive – it displays the message “SCSI ID 06 Quantum Atlas 10K3_72_wls_” and the system hangs with the cursor blinking. After multiple restarts and patient waiting the boot process typically resumes (this has taken anything from 5-10 mins to a current maximum of 1/2 hour). There have been multiple variations on this same theme. Once into windows the system seems to function normally.

I would be very grateful for some feedback as I have so far been unable to isolate the source of the problem. The PC is in another country so I have had to diagnose it as best I can remotely. If necessary I will ship it back, but I just want to get him up and running properly as soon as possible.

It’s way past bedtime here in the UK so I will check back in the morning (about 6 hours time).

Thanks in advance,


Elliot
swwelsh

What scsi card is the drive attached to? I would check the options for the card (usually a key combo at startup will get you into the card setup) or look for an updated driver for it. Have any other scsi devices been added or removed from the scsi bus?
elliot2002

ASKER

Cheers mate,

As I didn't assemble the computer (just installed and configured it), I unfortunately don't know what the SCSI card is. How can I find this out - will it be displayed at some point in the startup sequence or could it be accessed from within Windows? Bear in mind that until I get the computer shipped back here I will have to talk someone through this over the phone.

Once I have accessed the SCSI card, what options should I look for/adjust - I don't want to leave it more unbootable than it is already!

Any more ideas?

Thanks,

Elliot.
ASKER CERTIFIED SOLUTION
rid
This solution is only available to members.
Thanks rid,

I could use a tiny bit of elaboration as I am not that hot with the physical hardware side of things. When you say check the termination I assume I have to remove it then reinstall it, check the contacts and make sure it's seated securely. If there is anything else to check for, please let me know.

To be completely honest I'm not sure what you mean by POST, I'm assuming it's an acronym.

I am downloading the hard drive diagnostic utility as I will be able to do that remotely.

Keep it coming,

Elliot.
POST is the power-on self-test, where the computer checks its hardware. A single beep from the BIOS tells you the system has passed POST. The BIOS on the SCSI card will then install itself in order to boot the hard drive. Normally it displays a message identifying the card and chipset and giving a key combination (usually Ctrl+A for Adaptec cards) to enter setup for the adapter. The SCSI bus (the cable and all the devices on it) has to be terminated at both ends; this is done with a switch or jumper on the device, or possibly in software for the card itself. I agree with rid that this is likely a problem with the drive, since it worked fine for a while. If no other SCSI devices were added or removed and all cabling is secure, your best bet would be to get a report from the diagnostic. You might also want to check that the boot order in the BIOS is set to SCSI first. I would also change the ID of the hard drive to 0 to see if that helps, as that is the standard SCSI ID assignment for a bootable SCSI drive.
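The ID and boot-order checks suggested above can be written down as a small checklist function. This is only an illustrative sketch: the function name and parameters are hypothetical, and the host adapter ID of 7 is an assumed conventional default rather than something confirmed for this machine.

```python
def boot_drive_warnings(drive_id, host_adapter_id=7, bios_boots_scsi_first=True):
    """Return warnings for a bootable SCSI drive setup.

    host_adapter_id=7 is an assumption (a common default), not a value
    taken from the machine discussed in this thread.
    """
    warnings = []
    if drive_id == host_adapter_id:
        # Two devices on one bus must never share an ID.
        warnings.append(f"drive shares ID {drive_id} with the host adapter")
    elif drive_id != 0:
        # Per the advice above, ID 0 is the usual choice for the boot drive.
        warnings.append(f"drive is at ID {drive_id}, not the conventional boot ID 0")
    if not bios_boots_scsi_first:
        warnings.append("BIOS boot order does not list SCSI first")
    return warnings


# The drive in this thread reports itself at SCSI ID 06.
for line in boot_drive_warnings(drive_id=6):
    print(line)
```

A warning here is a thing to try, not a diagnosis: a drive at a non-zero ID can boot on many adapters.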
Sorry about being non-verbose. You've got most of it in the above comment from swwelsh.

The termination is a way of determining the end points of the SCSI signal chain (the cable(s)). Termination dampens out signal reflection and is needed at both ends of the chain, and in no other positions. If you have only internal units, it should be applied on the SCSI host adapter (the card) and on the unit farthest away from the card. Not setting it up like this can cause total non-functioning or, depending on the units attached, cable length etc., erratic behaviour. If you are connecting external units, the host adapter should not be terminated in most cases; instead, the end point unit on the external part of the chain should be.

There are variations here: some adapters actually have two chains (internal/external), some have automatic termination of the host adapter end etc. The thing is, you need to know...

Regards
/RID
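The termination rule rid describes (terminate both ends of the chain and nothing in between) can be sketched as a simple check. This is a minimal sketch that assumes a single chain listed in physical cable order; the `ScsiDevice` type and the device names are hypothetical, and adapters with separate internal/external chains or automatic termination are not modelled.

```python
from dataclasses import dataclass


@dataclass
class ScsiDevice:
    name: str
    terminated: bool


def termination_problems(chain):
    """Check a chain given in physical order, from one cable end to the other."""
    if len(chain) < 2:
        return ["a chain needs a host adapter and at least one device"]
    problems = []
    last = len(chain) - 1
    for position, device in enumerate(chain):
        is_end = position in (0, last)
        if is_end and not device.terminated:
            problems.append(f"{device.name}: at the end of the chain but not terminated")
        elif not is_end and device.terminated:
            problems.append(f"{device.name}: terminated in the middle of the chain")
    return problems


# Internal units only: the card and the farthest drive are both terminated.
internal_only = [ScsiDevice("host adapter", True), ScsiDevice("hard drive", True)]
print(termination_problems(internal_only))

# External unit added: the card now sits mid-chain and must not be terminated.
with_external = [
    ScsiDevice("external scanner", True),
    ScsiDevice("host adapter", True),
    ScsiDevice("hard drive", True),
]
print(termination_problems(with_external))
```

The second case shows the mistake rid warns about: adding an external unit moves the host adapter into the middle of the chain, so leaving its termination on is flagged.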
Did the problem start after the system was transported? Another candidate is a loose cable.

Ask the person on-site to open the machine up and check that all cables are secure, particularly the drive data cable and power cables to any fans.
Thanks guys that was enlightening,

As I suspected, I will need to get it shipped back over here to properly diagnose it. In the meantime I will send over the Quantum diagnostic utility and get him to email me the results - I'll post these back as soon as I get them.

If anyone can think of anything else I can do remotely while the computer is being prep'd for shipping please post back.

rid and swwelsh: It could be as long as a couple of weeks before I have the computer over here so if we can pick this up when I have the machine that would be great.

In the meantime I will post back with any further developments.

Cheers,

Elliot.
bartsmit - it was fine immediately after transport. I had to install everything and do a load of transferring and everything was running fine. This problem started to appear a couple of weeks later.

Hope that helps.

I wouldn't really have him opening the computer up to be honest - I would have to check that myself when I get it back.

Elliot.
Just so you all know this question is still active - I am waiting for the guy to get back to me with the results of the drive diagnostic.

Cheers,

Elliot.
Also make sure that the BIOS of the SCSI adapter is up to date, especially as Windows update may have grabbed a later driver when run after a few weeks. Some drivers only work properly with later revisions of the firmware.

If the SCSI adapter is on the motherboard then you need to upgrade the motherboard BIOS. Only use the SCSI chipset BIOS upgrade if the SCSI adapter is on a PCI expansion card.
Cheers bartsmit, that's very interesting,

I know he downloaded a raft of windows updates, and that could have been before the problem emerged. Do you think that this could have caused the kind of erratic behaviour we have been observing?

When I get the machine over here (he wants to hold on to it until next week) I will do across-the-board updates; in the meantime I will try to track down an update for the SCSI BIOS.

Thanks for the continued interest.
I have had a similar problem with an IDE RAID controller; the updated driver completely threw the system disk out. Boot disk not found, registry hives not found. Flashing the firmware fixed the problem.

You're likely to face a complete rebuild if that is the case, since the problems stem from failed/incomplete writes. The disk may be in a bit of a pickle by now.
As soon as this started happening I implemented a total and incremental backup system, so there will be no problem with wiping and rebuilding. The system has generally not been running as smoothly as I would have expected given the specification and I suspect that it is due to residual stuff from the win 98 installation - with this in mind a fresh install will be good all round.

I am completely comfortable with all aspects of the software reinstallation, but I have less experience of firmware flashing and updating (this makes me a bit edgy) - I would be grateful for some pointers in this area.

Cheers.
Just a thought here:
As you say in your first post that the problems have come on gradually, I have difficulties with the idea of a BIOS version problem. I always break out in a sweat when it comes to flashing; I have never touched a SCSI host adapter BIOS. I'm not saying the idea is wrong, but I would try it as a very last resort.
1) Check the physical setup thoroughly (cabling, jumpering and termination, as well as cooling)

2) Do a low-level format of the HD, if possible. Many SCSI host adapters incorporate a program for this in their BIOS.

3) Fdisk and format and do a clean reinstall of your O/S, taking care to get all hardware drivers correctly installed.

Regards
/RID
Cheers mate,

I will keep the BIOS stuff on the backburner until I have eliminated the less risky stuff. I have plenty to be going on with and am all ready to kick off when I get the computer back. The bloke goes to Greece on Friday, so the computer should be dispatched directly. Bizarrely it hasn't thrown up a single problem in the last ten days - despite the fact that nothing has been physically changed.

I will post back as soon as I can try all this stuff out myself.

Thanks,

Elliot.
Hello all,
I am Computer101, a moderator from Experts-Exchange and also an expert within this topic area. This question has been open a long time. What I am going to do is allow feedback from the questioner and experts. If it is not resolved, I will delete or accept an answer based on the info I have been given. Experts, feel free to offer input. I will monitor these questions for a period of 5-7 days and come back and evaluate. I will have another moderator (who is also an expert in this topic area) look at the question also to ensure we do the right thing for this question.

Thank you
Computer101
Community Support Moderator
I think the author has got a lot of useful info here. If we get them online again, I think we could solve this. Otherwise a split might perhaps be called for.

/RID
User logged in 10/15.  Please return to your question.  If no action, I will perform a split

Computer101
E-E Admin
Split points between rid, bartsmit, and swwelsh.

bartsmit, points for you here.

https://www.experts-exchange.com/questions/20640910/Points-for-bartsmit.html

swwelsh, points for you here.

https://www.experts-exchange.com/questions/20640912/Points-for-swwelsh.html

YensidMod
EE Moderator
Hello there,

Many apologies for not concluding that. The customer was getting increasingly disgruntled, so in the end I had to just replace the hard drive to smooth things over - as such the issue was never fully resolved. Evidently I completely forgot to close the thread. Thanks for all the feedback, it was greatly appreciated at the time - and again sorry for letting it slide for so long.

Elliot.