Solved

Hardware diagnostics apps for memory / processor

Posted on 2005-02-27
23
Medium Priority
1,505 Views
Last Modified: 2013-11-10
I am having intermittent bluescreens on my computer and I'm trying to get some apps that I can use to diagnose exactly where the fault is.  I've used the MS memory tester, which showed no errors and I've also used "Hot CPU Tester" which showed an error in the "Complex Matrix" test (Error:CPU 0: Checksums do not match).  Ideally, I want some software which will actually tell me what caused the error (Processor, memory etc) so that I can give this information in as proof for a warranty claim.

I don't mind paying for the software, and I'd prefer it if the software ran outside of windows so that the hardware company can't come back to me and say it's a driver / windows error rather than a hardware problem.

Thanks in advance.
Question by:Psychotext
23 Comments
 
LVL 1

Assisted Solution

by:brokegenius
brokegenius earned 300 total points
ID: 13414572
This is just a partial answer:
http://www.memtest86.com/

memtest86 will OFTEN find errors that other programs won't and, just like you asked, it is NOT Windows based.

for general information, because retailers often sell ram that is not up to advertised specs:
http://www.cpuid.com/

Cpu stuff:
http://www.benchmarkhq.ru/english.html?/be_cpu.html

Good luck!
0
 
LVL 25

Assisted Solution

by:kode99
kode99 earned 200 total points
ID: 13414818
Here's another memory tester,
  http://www.simmtester.com/

Like memtest86 this boots and runs on its own.  I would try both and see if the results are the same.  If the memory fails these tests it is most likely going to be isolated to a memory problem.

Also if you have more than one stick of memory try using the system with one at a time - or even better if you could test them in another machine to isolate your problem.

If the memory does not show any problem, something else you could try would be to back up your system and do a fresh install.  If it is any type of software problem, that would clear it.  If it does not, you then have a pretty compelling claim, because if a clean install does not work there is obviously a problem with some part of the hardware.  This assumes a basic system with no funky stuff going on, so no overclocking or non-typical hardware etc.
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13415056
Just got through two full passes of memtest with zero errors whatsoever.  The reason I think it's hardware related is that the system was running stable for about 18 months and then started bluescreening on all sorts of different things with no apparent reasons.  Windows error reporting has told me that the cpu has reported a hardware error, the memory address for the app was corrupted and that it suspects faulty drivers (All over a number of different bluescreens).  Have clean installed and still getting bluescreens (Even got them just after installing on one of the attempts, before I had even added the driver packs).  System is not overclocked, very well cooled and uses top of the line hardware.

I'm still suspecting a motherboard / cpu problem as I just can't get the memory to fail.
0
LVL 11

Assisted Solution

by:Paul S
Paul S earned 300 total points
ID: 13415136
i use this cd for hardware tests.

http://www.ultimatebootcd.com/

everything you need on 1 disc.

you should list all the stop errors you've got so we can tell you what's wrong.

if the cpu is bad it is hard to test. i would just put it into another system to see what happens. sounds like your mobo or the cpu is bad.
0
 
LVL 20

Expert Comment

by:nedvis
ID: 13415148
Here is a list of the most popular and most frequently used hardware diagnostic utilities.
It's a Google list of "most wanted" apps:
http://directory.google.com/Top/Computers/Software/Shareware/Diagnostics
   
 good luck
nedvis
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13415158
The_Computer_Guru_777: Are these stop errors stored anywhere?  I have a ghosted backup of the system prior to rebuild that I can get them out of.  Nothing in my event log after the rebuild as I cleared it this morning.
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13415680
I'm now 99% sure it's not a memory issue.  Haven't been able to get a single failure on memtest, MS memory test or Docmemory.

Ultimate Boot CD is pretty useful, thanks, although there's not much in the way of cpu related testing on it.  Funny really, you'd think that Intel / AMD would have diagnostic utilities for their chips.  I'm going to have to go through the google / benchmarkhq lists to see if any of those apps are more helpful.
0
 
LVL 1

Expert Comment

by:brokegenius
ID: 13416225
#1, in the future, please always post your Operating System/any upgrades with any hardware questions, just helps us help you.

memtest is THE #1 application for memory testing. While it never hurts to try multiple apps, rest assured that memtest is one tool to keep and bookmark forever.

honestly, one problem could be a BAD CD....a cd that is scratched or defective will sometimes install, but you won't know that it was the installation itself that caused the problem

#2 one thing you could check, but not always helpful is (hoping you're on xp)
1)open up MY COMPUTER
2)right click in the UPPER LEFT corner, directly on the Computer icon
3)choose MANAGE
4)on the left hand side click the PLUS sign next to EVENT VIEWER....there are 3 different categories, you might be able to track it down via those error messages
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13416324
Ok, well I'm running XP SP2 (With all patches as of 27/02/2005).  Hardware is as follows:

Asus A7N8X Rev 1.04 (Tried latest standard and beta bios),
AMD XP 2800+ (Not overclocked),
1GB Corsair Twinx XMS3200 Dual Channel DDR-SDRAM (Cas 2, Ras to Cas Delay 2, Ras Precharge 2, Cycle Time 6),
ATI Radeon 9800+,
Western Digital Raptor 36gb HDD.

Actually considered the bad cd and tried another, but with the same results after the install (I've spent a lot of time on this!).  Get no errors in the event logs, other than when I get the bluescreen (But I don't have any right now as I cleared the logs this morning to make them easier to look through).
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13416329
(That's 27th Feb 2005 for those confused by UK date style!)
0
 
LVL 23

Expert Comment

by:sciwriter
ID: 13417594
<< The reason I think it's hardware related is that the system was running stable for about 18 months and then started bluescreening on all sorts of different things with no apparent reasons. >>

Funny, that would make me suspect that it is a windows problem, typical symptoms of windows going bad.

SP2 is a potential problem, an uninstall is often needed.  Another thing that can totally hang a system is a bad CD/R or disks in the CD/DVD drive that are bad and cannot be "seeked" by windows explorer.  Short of that, you might be looking at some significant windows debugging, not hardware.

Have you checked the CPU temp. with the ASUS monitor?  If it goes over 55 degrees, especially over 60, start getting worried.  Have you checked the PS -- temp swap in a different one?
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13419086
I'd have thought the same thing, but even after the clean rebuilds it would bluescreen, both before / after sp2 and before / after installing the driver packs.

I used the "Mersenne prime" application on the Ultimate Boot CD last night.    After five hours and one minute it failed with "FATAL ERROR: Rounding was 0.499..... expected less than 0.4.  Hardware failure detected, consult stress.txt".  Haven't looked up the exact meaning of that yet.
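
For readers wondering what a "rounding" error of this kind means: Mersenne prime testers multiply very large numbers using floating-point arithmetic, and every intermediate result should land very close to a whole number. A value that lands almost halfway between two integers (such as the 0.499 reported here against a limit of 0.4) cannot be rounded safely, which points to a bad calculation. The following is only a minimal sketch of that kind of check, not the tester's actual code; the function names and sample values are made up for illustration.

```python
def max_rounding_error(values):
    """Largest distance between any value and its nearest integer."""
    return max(abs(v - round(v)) for v in values)


def looks_faulty(values, limit=0.4):
    """True if any value is too far from an integer to be trusted.

    The 0.4 limit mirrors the "expected less than 0.4" text in the
    error message quoted above.
    """
    return max_rounding_error(values) >= limit


if __name__ == "__main__":
    healthy = [12.0000001, 7.9999998, 3.0]  # hypothetical good results
    suspect = [12.0, 5.499, 3.0]            # hypothetical bad result
    print(looks_faulty(healthy))
    print(looks_faulty(suspect))
```

A failure like this on a system that is not overclocked is usually read as a sign of a hardware fault somewhere in the CPU, memory or motherboard path, but the check alone does not say which.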

CPU never gets above 55 Celsius at full load (Case temp never higher than 25 Celsius).  Voltages are as follows (Low / High / Average / Max Percent Outside Target Voltage):

Core 0: 1.60v / 1.68v / 1.65v / 3.1%
Core 1: 1.60v / 1.68v / 1.65v / 3.1%
+3.3: 3.25v / 3.30v / 3.27v / 1.5%
+5.0: 4.78v / 4.81v / 4.79v / 4.6%
+12.00: 11.55v / 11.61v / 11.58v / 3.8%
-12.00: -12.13v / -12.07v / -12.08v / 1.0%
-5.00: -5.09v / -5.06v / -5.07v / 1.8%

All voltages are within 4.6% (or better) of target, but then I'm not sure how tight they should be.

XP 2800+ (Barton) Specs:
Nominal Voltage: 1.65v
Max Die Temp: 85 Celsius
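
The voltage readings above can be checked against a tolerance band with a few lines of code. This is a rough sketch: the tolerance figures (5% on the positive rails, 10% on the negative rails) are the commonly cited ATX limits and are an assumption here, not something stated in this thread, so check them against the power supply's own specification. Percentages computed from the rounded readings may also differ slightly from the ones the monitoring tool reported.

```python
# Readings taken from the post: rail -> (target, low, high) in volts.
# The core target of 1.65 V is the nominal voltage quoted for the XP 2800+.
READINGS = {
    "Core": (1.65, 1.60, 1.68),
    "+3.3": (3.30, 3.25, 3.30),
    "+5": (5.00, 4.78, 4.81),
    "+12": (12.00, 11.55, 11.61),
    "-12": (-12.00, -12.13, -12.07),
    "-5": (-5.00, -5.09, -5.06),
}

# ASSUMPTION: commonly cited ATX tolerances in percent, not from the thread.
# No limit is assumed for the CPU core voltage.
ASSUMED_TOLERANCE_PCT = {
    "+3.3": 5.0,
    "+5": 5.0,
    "+12": 5.0,
    "-12": 10.0,
    "-5": 10.0,
}


def max_deviation_pct(target, low, high):
    """Worst-case deviation from target, as a percentage of target."""
    worst = max(abs(low - target), abs(high - target))
    return worst / abs(target) * 100.0


def report(readings, tolerances):
    """Return (rail, deviation %, limit %, within limit) for each rail."""
    rows = []
    for rail, (target, low, high) in readings.items():
        deviation = max_deviation_pct(target, low, high)
        limit = tolerances.get(rail)
        within = None if limit is None else deviation <= limit
        rows.append((rail, round(deviation, 2), limit, within))
    return rows


if __name__ == "__main__":
    for rail, deviation, limit, within in report(READINGS, ASSUMED_TOLERANCE_PCT):
        print(rail, deviation, limit, within)
```

Under those assumed limits every rail is inside its band, with +5 V the closest to the edge.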
0
 
LVL 20

Expert Comment

by:cpc2004
ID: 13419137
Upload 3 or 4 minidumps to any webspace. I will process the dumps and find which hardware component is faulty. You can find the minidumps in the folder \windows\minidump
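
If there are a lot of dump files, a small script can help pick the most recent ones to upload. The sketch below assumes the files are named in the MiniMMDDYY-NN.dmp style that Windows XP normally uses and that the two-digit year belongs to the 2000s; both are assumptions, and the folder path is only an example.

```python
import re
from datetime import date
from pathlib import Path

# ASSUMPTION: XP-style names such as Mini022605-01.dmp (month, day, year, sequence).
NAME_PATTERN = re.compile(r"^Mini(\d{2})(\d{2})(\d{2})-(\d{2})\.dmp$", re.IGNORECASE)


def parse_minidump_name(name):
    """Return (date, sequence number) for a minidump file name, or None."""
    match = NAME_PATTERN.match(name)
    if not match:
        return None
    month, day, year, sequence = (int(part) for part in match.groups())
    return date(2000 + year, month, day), sequence


def list_minidumps(folder):
    """Return (date, sequence, file name) tuples, oldest first."""
    found = []
    for path in Path(folder).glob("*.dmp"):
        parsed = parse_minidump_name(path.name)
        if parsed is not None:
            found.append((parsed[0], parsed[1], path.name))
    return sorted(found)


if __name__ == "__main__":
    # Example path only; adjust to the actual Windows folder.
    for when, sequence, name in list_minidumps(r"C:\windows\minidump"):
        print(when.isoformat(), sequence, name)
```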
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13419285
cpc2004: www.tacticaladvantage.co.uk/minidump.zip

The dump files in that zip are all the ones from the most recent clean build of XP (The one I'm working on now).  Thanks.
0
 
LVL 20

Accepted Solution

by:cpc2004
cpc2004 earned 1200 total points
ID: 13419718
Three minidumps report memory corruption: two minidumps with 4 errors each and one minidump with 16 errors. The memory corruption is caused by a faulty motherboard.

Mini022605-01.dmp D1 (9ed6162c, 00000002, 00000000, f7b10831)

CHKIMG_EXTENSION: !chkimg -lo 50 -db !usbohci
4 errors : !usbohci (f7b10822-f7b1083a)
f7b10820  c1  f6 *16  02  74  13  f6  40  0c  01 *8b  0d  8b  45  14  80 ....t..@.....E..
f7b10830  4f  02 *0c  83  48  10  01  eb  03  8b *82  14  80  67  02  1f O...H........g..

MODULE_NAME:  memory_corruption
IMAGE_NAME:  memory_corruption
FOLLOWUP_NAME:  memory_corruption
DEBUG_FLR_IMAGE_TIMESTAMP:  0
MEMORY_CORRUPTOR:  STRIDE
STACK_COMMAND:  kb
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
BUCKET_ID:  MEMORY_CORRUPTION_STRIDE

Mini022405-02.dmp D1 (9b84e474, 00000002, 00000000, f7ae8831)
CHKIMG_EXTENSION: !chkimg -lo 50 -db !usbohci
4 errors : !usbohci (f7ae8822-f7ae883a)
f7ae8820  c1  f6 *16  02  74  13  f6  40  0c  01 *8b  0d  8b  45  14  80 ....t..@.....E..
f7ae8830  4f  02 *0c  83  48  10  01  eb  03  8b *82  14  80  67  02  1f O...H........g..

MODULE_NAME:  memory_corruption
IMAGE_NAME:  memory_corruption
FOLLOWUP_NAME:  memory_corruption
DEBUG_FLR_IMAGE_TIMESTAMP:  0
MEMORY_CORRUPTOR:  STRIDE
STACK_COMMAND:  kb
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
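
In the two !usbohci dumps above, the bytes that !chkimg found to differ from the original driver image are the ones marked with an asterisk. The short sketch below (function names are illustrative) works out the address of each marked byte and the gap between them, which shows why the debugger labels this pattern STRIDE: the bad bytes are evenly spaced.

```python
def mismatched_addresses(dump_lines):
    """Return the address of every byte marked '*' in db-style dump lines."""
    addresses = []
    for line in dump_lines:
        tokens = line.split()
        if len(tokens) < 2:
            continue
        base = int(tokens[0], 16)  # address of the first byte on the line
        # Only the first 16 tokens after the address are byte values;
        # anything after that is the ASCII column.
        for offset, token in enumerate(tokens[1:17]):
            if token.startswith("*"):
                addresses.append(base + offset)
    return addresses


def strides(addresses):
    """Distances between consecutive mismatched addresses."""
    return [later - earlier for earlier, later in zip(addresses, addresses[1:])]


if __name__ == "__main__":
    # Lines copied from the Mini022605-01.dmp output above.
    lines = [
        "f7b10820  c1  f6 *16  02  74  13  f6  40  0c  01 *8b  0d  8b  45  14  80 ....t..@.....E..",
        "f7b10830  4f  02 *0c  83  48  10  01  eb  03  8b *82  14  80  67  02  1f O...H........g..",
    ]
    found = mismatched_addresses(lines)
    print([hex(address) for address in found])
    print(strides(found))
```

For this dump the marked bytes sit at f7b10822, f7b1082a, f7b10832 and f7b1083a, 8 bytes apart each time, which matches the range the debugger prints. The second dump shows the same offsets at a different load address.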

Mini022505-03.dmp 0A (f104e9b8, 00000002, 00000001, 804e2b65)

CHKIMG_EXTENSION: !chkimg -lo 50 -d !nt
    804e2d64-804e2d67  4 bytes - nt!KiServiceTable+44
      [ 77 87 56 80:30 4b 3e f1 ]
    804e2df4-804e2df7  4 bytes - nt!KiServiceTable+d4 (+0x90)
      [ 62 f2 57 80:f0 46 3e f1 ]
    804e2ed0-804e2ed3  4 bytes - nt!KiServiceTable+1b0 (+0xdc)
      [ 04 3c 57 80:70 44 3e f1 ]
    804e2f44-804e2f47  4 bytes - nt!KiServiceTable+224 (+0x74)
      [ 4d 49 57 80:50 4c 3e f1 ]
16 errors : !nt (804e2d64-804e2f47)

MODULE_NAME:  memory_corruption
IMAGE_NAME:  memory_corruption
FOLLOWUP_NAME:  memory_corruption
DEBUG_FLR_IMAGE_TIMESTAMP:  0
MEMORY_CORRUPTOR:  LARGE
STACK_COMMAND:  kb
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE
BUCKET_ID:  MEMORY_CORRUPTION_LARGE
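
The bracketed pairs in the !nt dump above list the bytes !chkimg expected from the image file, then a colon, then the bytes actually found in memory. Each group is 4 bytes stored least significant byte first, so it can be read back as a 32-bit value. The sketch below only decodes the pairs quoted in this post; the helper name is made up for illustration.

```python
def decode_pair(text):
    """Decode '[ aa bb cc dd:ee ff gg hh ]' into (expected, actual) integers."""
    expected_hex, actual_hex = text.strip().strip("[]").split(":")
    expected = int.from_bytes(bytes.fromhex(expected_hex.strip()), "little")
    actual = int.from_bytes(bytes.fromhex(actual_hex.strip()), "little")
    return expected, actual


if __name__ == "__main__":
    # Pairs copied from the Mini022505-03.dmp output above.
    pairs = [
        "[ 77 87 56 80:30 4b 3e f1 ]",
        "[ 62 f2 57 80:f0 46 3e f1 ]",
        "[ 04 3c 57 80:70 44 3e f1 ]",
        "[ 4d 49 57 80:50 4c 3e f1 ]",
    ]
    for pair in pairs:
        expected, actual = decode_pair(pair)
        print(f"expected {expected:08x}  found {actual:08x}")
```

Four table entries of 4 bytes each account for the 16 errors reported. What the replacement values mean is not established in this thread, so treat the decoded numbers as raw evidence only.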

0
 
LVL 2

Author Comment

by:Psychotext
ID: 13419887
Interesting, thanks for that.
0
 
LVL 23

Expert Comment

by:sciwriter
ID: 13421321
Still, I would at least install something basic like win98se -- maybe even DOS?? -- and run it for a day.  Wouldn't you be surprised if 98 never froze but XP did?
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13421491
I've tried running the DOS mode (and linux mode) diagnostic apps and so far I haven't made it past 8 hours of testing.
0
 
LVL 20

Expert Comment

by:cpc2004
ID: 13429972
It is a hardware error. There is no shareware that can diagnose which hardware component is faulty. Windows uses special coding to make use of cache memory, which is different from DOS mode programs.  Hence most faulty CPUs and m/bs can pass the memtest utility. Only PCs from manufacturers such as IBM (their notebooks, for example) have a built-in utility to diagnose hardware. Your PC crashes due to a hardware problem, and the minidumps hold a snapshot of the moment the hardware error occurs. It is a cache memory problem in either the m/b or the CPU.  In my experience, 70% of the time it is a faulty m/b and 30% of the time it is a faulty CPU.
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13430143
It would have to happen in the only part that's not in warranty!  Ok, thanks all. I've got to work out a fair way of awarding the points on this one which I'll try to do later today.
0
 
LVL 20

Assisted Solution

by:cpc2004
cpc2004 earned 1200 total points
ID: 13471247
The analysis report from Microsoft WinDbg reports hardware errors such as memory corruption. It is the best tool and you don't have to spend money to buy the software.

Sample hardware messages from WinDbg:
MEMORY_CORRUPTION_STRIDE
MEMORY_CORRUPTION_LARGE
MEMORY_CORRUPTION_ONE_BIT
MEMORY_CORRUPTION_ONE_BYTE
TWO_BIT_CPU_CALL_ERROR
INTEL_CPU_MICROCODE_ZERO
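
When several dumps have been analysed (for example with WinDbg's !analyze -v command), it can help to count how often each failure bucket turns up. The following is a minimal sketch that assumes the analysis output has been saved as plain text with "KEY:  value" lines like the ones posted earlier in this thread; the function name is illustrative.

```python
from collections import Counter


def tally_failure_buckets(report_text):
    """Count FAILURE_BUCKET_ID values in saved WinDbg analysis output."""
    counts = Counter()
    for line in report_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "FAILURE_BUCKET_ID":
            counts[value.strip()] += 1
    return counts


if __name__ == "__main__":
    # Bucket lines as they appear in the three dumps analysed in this thread.
    sample = """
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
BUCKET_ID:  MEMORY_CORRUPTION_STRIDE
FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE
BUCKET_ID:  MEMORY_CORRUPTION_LARGE
"""
    for bucket, count in tally_failure_buckets(sample).most_common():
        print(bucket, count)
```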
0
 
LVL 2

Author Comment

by:Psychotext
ID: 13471409
Ok... I think that's about as fair as I could make it.  Ideally I would have opened up a new 500 point question for CPC for going above and beyond the call of duty; but apparently that's not allowed. Sorry.

Good answers, thank you very much.
0
 
LVL 1

Expert Comment

by:brokegenius
ID: 13472027
no, you can and should open a NEW question when wandering from the original post. This happens all of the time :)

I have low points, but do this too:)
Cheers
0


Question has a verified solution.
