bpfsr

Trying to determine if a particular program is causing disc damage

For the past several years I have been using a third-party program that aids my bookselling business by remotely repricing inventory and mass-listing books. However, in the last five years I have had no fewer than three - and now on the verge of a fourth - computers die because of disc corruption.

The only common denominator I can think of over that time is this program. Unfortunately, it is a very small market and the competitors offer a far inferior product, so I am really hoping this program is not the cause of the problem.

My suspicions are aroused most by the fact that, while the proprietor has otherwise stellar customer service, anytime I even suggest the disc problem may be caused by his program he gets very defensive and begins pointing out all the other things I am "doing wrong", without apparently being willing to even consider that his program may be at least part of the problem.

So, is there a way to check logs, etc. to see if a particular program is causing damage to a disc? Thank you.
noxcho

Do you mean damage to HDD?
fleamourian

I would think it impossible for software to cause damage to hardware.  I stand to be corrected?
Yikes. Where to start?

First, and you may not like this: only a small number of programs (malware) have ever damaged a hard drive, usually by attacking the boot sector or, in some old-school cases, by performing low-level formats before there were checks in place to prevent it. But that kind of damage is strictly to data. There is no intentionally crafted piece of software I know of that causes physical damage to the drive.
Now, it's conceivable that a poorly coded piece of software could be performing so many I/O (input/output) reads and writes that, over time, it decreases the 'life expectancy' of the hardware. Could. But it'd have to hammer the drive pretty hard and constantly, and truthfully, when you look at a typical database server or high-volume web server, the I/O is off the charts compared to a 'single service' machine. Granted, there is drive redundancy and load sharing, but still. If we're talking about a program or service that is pinning the drive that hard, you'd notice it performance-wise on the machine: program access time and overall computer performance would grind to a halt.
That being said, let's assume that you've gotten ahold of the most poorly coded piece of software in the world: record-breaking, ground-shattering, no-holds-barred the worst piece of code evar. If you wanna test its I/O you can use Task Manager: Start>>Run>>taskmgr>>View>>Select Columns, choose the I/O options, and you'll get those columns in the main window. Check it out; you may be right. But again, you need a baseline, 'cause those numbers will look big no matter what. Likewise, you can use PIO here, a bit of VB script and a pl wrapper.
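To make the baseline point concrete, here is a minimal sketch of the arithmetic, assuming you jot down the cumulative "I/O Write Bytes" figure from Task Manager twice, some seconds apart, for both the suspect program and some ordinary program. The function names and all the numbers are made up for illustration; nothing here reads the counters for you.

```python
def io_rate(before_bytes, after_bytes, seconds):
    """Average bytes per second between two cumulative counter readings."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after_bytes - before_bytes) / seconds


def compare_to_baseline(suspect_rate, baseline_rate):
    """How many times heavier the suspect's I/O is than the baseline's."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return suspect_rate / baseline_rate


if __name__ == "__main__":
    # Hypothetical readings of the "I/O Write Bytes" column, taken 60 s apart.
    suspect = io_rate(1_200_000_000, 1_260_000_000, 60)  # the repricing program
    baseline = io_rate(500_000_000, 530_000_000, 60)     # e.g. a web browser
    print(f"suspect:  {suspect / 1e6:.2f} MB/s")
    print(f"baseline: {baseline / 1e6:.2f} MB/s")
    print(f"ratio:    {compare_to_baseline(suspect, baseline):.1f}x")
```

A ratio of a few times the baseline probably means nothing; it would take a sustained, much larger multiple before "hammering the drive" becomes a plausible story.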

So now let's look at it from the developer's side. I've been a developer and hardware guy for going on 20 years, professionally for ~13. There is an end-user mentality that, mystically, all components in a computer are one, i.e. "you came over and fixed my printer, now my monitor doesn't work". Laugh, but it happens all the time. So to him, you are connecting a highly unlikely bug in his programming with your propensity to blow through hard disk drives. Again, I'm not saying it isn't the cause; I would just need more than my end-user's opinion of the situation. You gotta figure that in the hours/days/weeks/years he spent developing, way-too-high I/O would have been flagged in the debugging process. Again, he may not care.

You said: "The only common denominator I can think of over that time is this program"  

As my father would say, "the only common denominator is you". EVERYTHING is a variable. I have no doubt you are blowing through hard drives, but what you're not taking into account are basic environmental variables. Temperature: wanna blow a drive? Get it hot. Electrical: you didn't say that you've made any other changes to the machine except replacing drives, and low power or bad power is a sure-fire way to wreck a drive. We're getting into A+ stuff here, but if I had to give advice to actually fix the problem, I'd say pick up a new workstation, or better yet, since this is a business-critical application we're talking about, invest in a decent server and get some redundancy (RAID) on those drives. Close that circuit up and make sure there's clean power. Get it up off the floor. Make sure your end is right; then you can go back and tell the guy, "Hey, your software sucks."

Good luck man, I hate hardware problems.
>without apparently being willing to even consider the fact his program may be at least part of the problem.

I'd take his side in this as well.  I'm not a developer.  I'm the guy who has to fix anything remotely resembling a computer, including light bulbs, plumbing, and the occasional garbage disposal.

I've not had a case where software (not malware) does physical damage to a hard drive.  Nor corruption beyond its own data set.

It's unclear how you define "die" and "corruption" and "damage".

If you're losing drives in the same computer, then there are several things that may cause problems in multiple drives:

- OS corruption (has nothing to do with the physical drive)
- cable problem (replace cable)
- controller problem (update firmware)
- power problem (use voltage-regulated UPS, replace PSU)

The first item, OS corruption, is a common reason for people saying that their computer is "dead".  Physically, nothing is wrong.  Wipe the disk and re-install the OS, and it's fine.  Operating systems and applications are fragile.  Move or re-name one file and the whole thing crashes.

If the drives are physically damaged, what did you get back from the drive manufacturer?  Was it a re-build?  New drive?  Did they disallow the return?  If you went to a data recovery vendor, what diagnosis did they have for the damage?

A drive that does nothing can't show any errors.  Heavy I/O may uncover problems, but shouldn't be considered the cause of physical damage.  Data corruption is another story.  Severe fragmentation or a power problem during a write could cause non-physical damage that would make a system unusable.
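One way to tell physical damage from plain data corruption is to look at the drive's own SMART counters. The sketch below is only an illustration and assumes something nobody in this thread has mentioned: that smartmontools is installed and you have saved the text output of `smartctl -A` for the drive. The function names and the sample rows are hypothetical, and the three attribute names are the ones commonly read as signs of media trouble, not an official list.

```python
# Attributes whose raw value is commonly read as a count of damaged sectors.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def watched_raw_values(smartctl_output):
    """Pull the raw values of the watched attributes out of `smartctl -A` text."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            try:
                found[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value not a plain integer; skip it
    return found


def looks_physical(values):
    """True if any watched counter is above zero."""
    return any(v > 0 for v in values.values())


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""
    values = watched_raw_values(sample)
    print(values, "physical damage suspected:", looks_physical(values))
```

If all of these counters sit at zero on a drive that "died", that points back toward OS corruption, cabling, the controller, or power rather than the platters.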

As an anecdotal, non-statistical point of reference, my laptop hard drives are used daily for 10-24 hours, plus bouncing around during commute and travel.  I get about 18 months before they fail.  That's somewhere between 12 and 24 months....the spread is unpredictable.  So, I shop for a replacement at about 12-15 months, before the inevitable.

Reality:  ALL drives die.  Just like light bulbs.  Some last 1 week, some last 10 years.  I have a drive from 2000 that still works.  Consumer-grade IDE drive.  I had a 2TB SATA that crapped out after 3 months in a test rig...almost no use at all.

Temperature and usage don't necessarily correlate with failure rates.  Just ask the engineers at Google.  High temps - small but not significant negative effect.  High-usage - no effect.

They do point out that a failed drive will often work fine in another installation. The problem could have been the controller or cable or connector.

Biggest question is:  What is the actual problem?   A crashed OS can be caused by so many other things, from a bad driver, to a blown capacitor, to a loose cable.  Or malware, underpowered CPU and not enough RAM.
I will assume the question is about HDD media dying because of some program. The "features" of such a program include (but are not limited to):

a) Legitimate software that is not wear-aware, so that it always writes through to the media in the same small region of sectors. This is especially evident with reused media (for example, recovery media deployed multiple times until it reaches its maximum write count). Sometimes this causes system blue screens (commonly known as BSODs)

b) Malware that deletes files (critical DLLs or system files), formats drives (or destroys file systems), or performs virus attacks (I recall one worm called STORM). The drive does not die immediately, but persistent reinfection may render it unusable. It can also be a rootkit that transparently forces persistent writes to specific media sectors, or simply keeps rebooting the system over and over until the media dies

c) Software that produces intense physical fault symptoms when running, such as the media starting to click or make other strange noises. Typically the memory working set is heavy due to intense paging between the media and RAM. System memory can fill up or become overloaded, causing the system to freeze while the media spins intensely: a contention for resources caused by a corrupt file or shared program files. At times, when you see "Drive or Device Not Found", the partition may already be dying, and the application logs may show the last program to read it as the culprit

d) Firmware-type software that inadvertently touches the BIOS, or Windows' Disk Management or disk utilities that mess up file systems such as FAT, NTFS, etc. (with an "Operating System Not Found" message). For example, a bad sector may be caused by cross-linked files, which result from the operating system overwriting a portion of one file with another file. Potential causes of cross-linked files are turning off or restarting your computer without exiting Windows, or not regularly defragmenting your hard disk. Having said that, a bad sector need not be permanently marked and can sometimes be "reused", unless the disk was formatted with the utility that comes from the disk drive maker

This link depict the causes as well and some detection method. I also see that if you utilised those registry cleaner

You can also check out this discussion on detecting anomalies that deserve some attention (even if they can turn out to be false alarms)
btan
