oskelton

asked on

Disk errors - 8 bytes lost at each point and 2 bytes repeated

My 80GB Maxtor seemed to have started to produce errors (including CRC errors when moving some files), so I bought a new Samsung 200GB to replace it.
With both on a Promise UDMA 100 controller I used MAXBLAST to copy the disk.

Then I found that there were a great many more errors on the copy than the original.

Comparing the files which were different showed the same pattern at each difference - where the original disk had 8 bytes, the copy has only 2 bytes and they are a duplicate of the preceding 2 bytes and not any of the 8 missing bytes.

I expected to see 512 byte zones of garbage... so I was a bit surprised by this result.

Does this pattern of differences give any clues as to the source of the problem? Is it possible that the controller or motherboard, rather than the disks, is causing the problems?
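
For anyone wanting to check their own files for the difference pattern described above, here is a minimal Python sketch. It assumes the copy resumes with the data that follows the 8 missing bytes, so everything after each fault is shifted by 6 bytes; that is my reading of the comparison, not something confirmed in the thread. The function name and parameters are hypothetical.

```python
def find_dropped_runs(original: bytes, copy: bytes, lost=8, kept=2, sync=16):
    """Return offsets (in the original) where `lost` bytes were replaced by
    `kept` bytes that merely repeat the `kept` bytes just before them.

    Assumption: after each fault the copy continues with the data that
    follows the lost bytes, so a short look-ahead (`sync` bytes) can
    confirm that the two streams line up again.
    """
    hits = []
    i = j = 0  # i indexes the original, j indexes the copy
    while i < len(original) and j < len(copy):
        if original[i] == copy[j]:
            i += 1
            j += 1
            continue
        # The bytes at the fault repeat the bytes immediately before it.
        repeated = j >= kept and copy[j:j + kept] == copy[j - kept:j]
        # Skipping `lost` bytes in the original realigns it with the copy.
        resynced = (original[i + lost:i + lost + sync]
                    == copy[j + kept:j + kept + sync])
        if repeated and resynced:
            hits.append(i)
            i += lost
            j += kept
        else:
            # Some other kind of difference: step over one byte in both.
            i += 1
            j += 1
    return hits


if __name__ == "__main__":
    good = bytes(range(64))
    # Simulate one fault at offset 20: 8 bytes lost, previous 2 repeated.
    bad = good[:20] + good[18:20] + good[28:]
    print(find_dropped_runs(good, bad))
```

If the first lost byte happens to equal the repeated byte, the fault will be detected a byte late or missed, so treat the offsets as approximate.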



ASKER CERTIFIED SOLUTION
rindi
oskelton

ASKER

Thanks, I don't have that bad an impression of Maxtor, but I have come to recognize that their reputation has fallen relative to Seagate's (I had been sticking with Quantum/Maxtor drives for some years after remembering some problems with Seagates - I have been finding drives generally last 2-3 years of continuous use before they start to produce errors). Looking at www.storagereview.com gives me the impression that reliability is also very model specific. The very few respondents for the Samsung 200GB drive are not very encouraging for that model either.

The full problem may be much more complex. Firstly I took the bad copy onto Samsung and re-installed Windows onto it and started to run that (not wanting to risk losing the original copy of data in the Maxtor until I was confident that Samsung was reliable). I put the Samsung on the motherboard ATA66 controller for security and put the Maxtor on the Promise controller (in a removable caddy) with a new cable.

Copying Maxtor to Samsung is fine.
Samsung seems to perform OK, but some bouncing of SMART statistics leaves a question mark over it (Unknown attribute BE, Write error count C8 and, to a lesser extent, Ultra ATA CRC Error Rate C7 and Detected TA count C9).
Re-installed various software components that were broken... including ATI Catalyst drivers.
Then tried to copy newly downloaded files back to Maxtor and got a lot of Delayed Write errors. The event log also shows warnings about the Promise firmware not being the latest version. Some googling on those errors points a finger of suspicion at a lot of things:
Promise controller - the latest firmware available from Promise (which I had installed) is very old, but I think Windows Update has since tried to install something newer on it (perhaps from the similar but later TX2 controller). Something to investigate further.
ATI Catalyst drivers may be implicated - especially as they seem to have re-enabled Fast writes via Smartgart, when I recall it previously concluded that this wasn't safe on my machine.
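
A side note on the SMART attributes quoted above: the IDs there are hexadecimal, while tools such as smartctl list attribute IDs in decimal. A small Python sketch of the conversion, using only the IDs and labels as quoted in this post (the helper name is hypothetical):

```python
# Attribute IDs and labels exactly as quoted in the post (hexadecimal IDs).
reported = {
    "BE": "Unknown attribute",
    "C8": "Write error count",
    "C7": "Ultra ATA CRC Error Rate",
    "C9": "Detected TA count",
}


def to_decimal_ids(attrs):
    """Map hexadecimal SMART attribute IDs to their decimal equivalents."""
    return {int(hex_id, 16): label for hex_id, label in attrs.items()}


if __name__ == "__main__":
    for dec_id, label in sorted(to_decimal_ids(reported).items()):
        print(f"{dec_id:3d}  {label}")
```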

Just to add to the mess - last night there was a loud bang from the vicinity of the PC and loss of power to the apartment. Neither the 2 levels of circuit breakers in the apartment nor the earth leakage detector that the computer is plugged in via tripped. After facility management got the power back I unplugged a USB joystick and powered off the SCSI card - and the PC boots and runs normally. Hmm. (I replaced the power supply 6 months ago after a similar incident where I think the power supply fan failed and then it blew.) This homebuilt PC has given very good service over the last 6 years or so, but maybe it is coming to the end of its lifetime. I guess it is a bit like a cherished old car - the more trouble it gives the more human it seems!

As for the tip about Acronis - you perhaps read my mind - I have been evaluating it at work last week and it is impressive software. What I plan to try next is to cut the PC out of the equation and try to back up my original Maxtor via a USB-IDE adaptor to another USB HDD with Acronis using my work laptop. Once I have a safe copy I will be a lot more comfortable about further experimentation on the home PC, and I might learn something about the reliability of the Maxtor when I try to do this. Hope I have the disk space to accomplish this.

If you don't mind I will leave this question open a while longer to update with anything else I learn and in case there are any more useful comments coming in.
Actually I've had really bad experience with Maxtor. If a drive lasts more than half a year, you are very, very lucky. As for Samsung drives, we've been using them mainly because they are very, very quiet and don't run very hot. I've used plenty of them and just now we had the first Samsung drive returned as bad. There was also a Maxtor drive in that PC which was also broken, so the cause was probably a large jolt or something which would kill any HD. This drive is at least 5 years old, so I'd say that's nothing to worry about. I haven't had much experience with the 200GB drives yet, but they are still relatively new, I think they came out about half a year ago (at least the SATA-II model), and I've been using 3 of them in a RAID 5 array without any problems for some months. To me it looks as if you are having a problem with the Promise controller, which is pretty old and might have problems with large disks. If there is no newer firmware you might have to try a more modern controller.
Sorry for the delayed response. I just haven't had the time to sort the PC out yet.
What I did do was the Acronis Trueimage backup as recommended by rindi.
I backed up all 4 partitions from the Maxtor to the internal HDD of the laptop using a USB2 to IDE adaptor - so this tests just the Maxtor (home PC, cables, Promise controller etc. all out of the equation).
What happened is that 3 of the 4 partitions backed up without problems and one had repeated read errors.
This confirms 2 things:
a) The Maxtor definitely is failing
b) The original copy on the home PC to the new Samsung drive was definitely afflicted by another source of errors (as all partitions had files with the copy problems mentioned in my initial question).
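
For readers wanting to find which files in a copy differ from the original, as I did above, here is a rough Python sketch that compares two directory trees by SHA-256 hash. It is not the tool I used; the function names are hypothetical, and it assumes both trees are mounted and readable as ordinary folders.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source_root, copy_root):
    """Compare every file under source_root with its counterpart in copy_root.

    Returns three sorted lists of relative paths: files whose contents
    differ, files missing from the copy, and files that raised a read error.
    """
    differing, missing, unreadable = [], [], []
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(copy_root, rel)
            if not os.path.isfile(dst):
                missing.append(rel)
                continue
            try:
                same = file_digest(src) == file_digest(dst)
            except OSError:
                # A failing disk may refuse to read some files at all.
                unreadable.append(rel)
                continue
            if not same:
                differing.append(rel)
    return sorted(differing), sorted(missing), sorted(unreadable)
```

Read errors on the source show up in the third list, separately from files that read fine but differ, which is the distinction that mattered in my case.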

I have also continued to run the Home PC using the Samsung HDD (on the motherboard controller). The S.M.A.R.T. stats for the Samsung continue to bounce but there have been no uncorrected read errors that I am aware of and the machine has been working. Basically I do not trust this new drive to be reliable but I doubt it was the cause of the numerous errors found when copying the Maxtor to it with Maxblast.

The next plan is to:
(a) Run SpinRite on the Maxtor to see if I can reduce the read errors on the damaged partition - want to make sure I have minimal data loss.
(b) Run a 'low-level' format of the Samsung and then re-image it with Acronis TrueImage from the backups of the Maxtor that I had created.
(c) Test the Samsung further to gain confidence before starting to create important new data files on it.

So back to the original question. I got good advice from rindi (if not confirmation that this pattern of errors is caused by a particular type of failure).
I will accept rindi's answer shortly. Future readers of this thread may find it helpful to know that this pattern of errors does not seem to come from read errors on the source disk but from the transfer of the data (bad cabling, controller issues, Fast writes, or something I haven't thought of).
Thanks