Solved

Computer restarts / reboots / crashes when playing games

Posted on 2004-11-09
Medium Priority
43,335 Views
Last Modified: 2013-11-10
Greetings

I bought a new graphics card about a month and a half ago - a Gigabyte GeForce 6800 128MB DDR card (link: http://www.giga-byte.com/VGA/Products/Products_GV-N68128DH.htm).  A rather nice card, since it comes shipped with passive cooling.  I was a bit sceptical at first (a friend of mine has difficulties playing computer games in the summer due to a passive-cooling system he mounted on his graphics card), but since this passive cooling came with the card, I believed it to be "guaranteed" to run smoothly.

I didn't play any games on it for over a month, as I was busy with my school project.  I tried Thief 3 first, and I was very pleased with the graphics (my old graphics card was a Creative GeForce 3 68mb...).  I then stopped playing it after a while and downloaded the Doom 3 demo, to find out what the card was really capable of.  The graphics looked nice, and the game ran very smoothly, but in some places it really lagged (one place where there was smoke seeping from a pipe, and other places where there were specific objects / lights, I believe; it is hard to tell).  I found that odd, since it is a new, fast card on a newly upgraded computer (computer specs below).  I decided to play on anyway, being curious what the game was all about...

...then the computer rebooted after about 30 - 60 minutes of playing.  At first I believed it to be a one-off event, but then it did it again, even sooner (and again, and again).  I checked the temperature of the graphics card's GPU, and it ran close to 70°C (my processor was running a safe 43°C).  Since that isn't exactly low, I tore the side off the cabinet and moved the front cooling fan right under the graphics card, and changed the air circulation of my processor cooler (it was blowing air onto the graphics card).  I updated the drivers of my graphics card to the latest NVIDIA drivers and updated DirectX to the latest version etc., then ran Doom 3 in windowed mode while having the shipped GPU temperature monitoring software and my ASUS Probe running in the background, to check the temperature during the crash...

GPU Temperature: no more than 55°C
CPU Temperature: no more than 48°C

I am not a guru, but I believe these are very safe temperatures.  To try and isolate the problem, I tested my old graphics card (I let Doom 3 run overnight with my GeForce 3 in - it ran a tad slower, but never lagged at those places where my new card did...).  I concluded that it must be my new graphics card, despite the temperature, since Doom 3 didn't crash with the old graphics card.

The machine works fine in Windows and when on the internet; it's just when I play games which require any graphics that it reboots.  So, to summarize, these are my essential questions:

1. Why is my computer rebooting, and how do I fix it? Any suggestions?
2. I have the impression that the graphics card temperature limit follows the same principle as CPU temperature limits, since it is a GPU which gets hot.  Am I wrong? What is an unsafe temperature? (70°C+ ?)

Some other things worth mentioning:

- The graphics card requires extra power from the same type of cable as the hard drive (and yes, I did not forget to connect it :P ).  Does the card consume such a vast amount of power that it could be a power issue? (check power supply below)
- There exists a BIOS update for the graphics card (didn't know until now that there actually is a BIOS on graphics cards, but I guess it makes sense).  Is there a chance that the updated BIOS would help?
- After disabling "automatic reboot on error", the following errors came forth (not both simultaneously):

  IRQL_NOT_LESS_OR_EQUAL
  PFN_LIST_CORRUPT

  Also, in two cases when the computer crashed, my computer didn't register that my USB keyboard was connected to the computer (Keyboard error or no keyboard present).  I plugged in a PS2 keyboard, started the computer up and then disconnected it and the computer worked fine.  Also, on one occasion, after rebooting on error, instead of showing the memory count, drives & processor speeds, it showed a screen split vertically: half black and half in a blinking rainbow pattern.  It has only appeared once, but mayhaps it has a significance?
- I doubt it is a motherboard overheating problem, since the ASUS probe didn't report a heat problem from the motherboard.

Computer Specs:

What is worth mentioning:
- OS: Windows XP Professional w. Service Pack 2 (+ DirectX 9.0c and the latest NVIDIA drivers)
- PROCESSOR: AMD Athlon XP 3200+ (Barton)
- RAM: Corsair TWINX PC3200 2x512mb DDR
- MOTHERBOARD: ASUS A7N8X-E Deluxe (Gold Edition)
- VGA: Gigabyte GeForce 6800 128MB DDR (model nr: gv-n68128dh)
- POWER SUPPLY: 300W (think it is from CodeGen)

I hope I have mentioned all worth mentioning.  If you need to know anything else, then do say so and I shall see what I can do.

Thanks for reading
Question by:mSchmidt
22 Comments
 
LVL 69

Expert Comment

by:Callandor
ID: 12536090
Your power supply is definitely on the low side for this high-performance setup - go for 400W, and choose a quality manufacturer like Enermax, Zalman, PC Power and Cooling, Antec Truepower, or Thermaltake Purepower.  The 6800 is power hungry, and that may be contributing to the problem.

Crashing on video games points to a bad video driver or overheating, but your power problem may be the real reason, since you have a 6800.  I am surprised it didn't come with a fan, but you can mount a case fan near it to blow on the heatsink.
 
LVL 32

Expert Comment

by:willcomp
ID: 12536121
Could be a GPU overheating problem, but first thing I would do is replace power supply.  You are underpowered with a 300 watt "generic" power supply.  Recommend 400 watt minimum quality power supply (e.g. Antec, Enermax, ThermalTake, PC Power & Cooling).  After PSU changeout, see what happens with temps.

May want to pick up a Slot Cooler (exhaust fan that mounts in card slot) to mount next to graphics card or install a fan on case side panel either blowing in or out determined by effect on GPU temp.

Dalton
 
LVL 2

Expert Comment

by:garak1357
ID: 12537145
This is not a hardware issue in my opinion.  This is purely an issue with Windows.  You probably won't like my advice on how to fix this issue, but I'm going to put in my two cents anyway.

I would gather together all of the drivers you need for your system, and completely reinstall Windows from scratch.  It has been my experience that anytime you change major hardware (CPU, Video Card, et cetera), a fresh install saves you hours of painful troubleshooting.

Having said this, I doubt that you will actually take this advice, since most people consider it to be a radical solution, not to mention time-consuming.  Therefore, I will offer alternative suggestions.

If you have not updated your system's BIOS, go ahead and do so.  Set conservative settings for RAM timings et cetera.  Anything that is auto detected is fine.  Manually set the AGP to 66MHz instead of an auto setting.  Do not update the BIOS for your video card.

Download the chipset drivers for your motherboard, the video card reference drivers (61.77 for Nvidia), and the latest version of DirectX (9.0c).  Uninstall all of these drivers from Windows. Reboot.  Install DirectX 9.0c.  Reboot into safe mode.  Install the chipset drivers and reboot.  Allow the system to boot all the way up.  Do not let it try to install any of the other drivers at this point.  Reboot into safe mode and install the video drivers.  Reboot and let it come all the way up.

In other words, uninstall all three of these, reinstall them in safe mode, and allow it to fully boot up in between installs.  If you have an nforce2 chipset on your motherboard, do not get the latest chipset drivers.  Get the 4.27 version.  The newest version does have some issues.  Likewise, you may need to back off a version from other chipset drivers if none of this works (try the latest first for the others however).

I am convinced that this is not a hardware issue, having seen very similar issues in the past.  Your GPU temp is well within tolerance, and updating the BIOS of the video card is a bad idea unless you're seriously into tweaking and don't mind losing a card now and then.  Good luck!




 
LVL 8

Expert Comment

by:stockhes
ID: 12539544
I am surprised that your setup will even boot!

The 12 volt rail is used for the CPU, the GFX card if Molex-powered, fans and drives.

Rough estimate:

CPU = 6-8 amps (2500+ to 3800+), GFX = 5-7 amps (>9800/5900), fan = 0.5 amps, drive = 0.75 amps

You do the calculation.

You will need at least 16-18 amps on the 12 volt rail.

Does your 300 watt Codegen provide that?
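
The calculation left to the reader above can be sketched roughly as follows. This is only a minimal Python sketch built on the per-component estimates quoted in the post; the function name and the number of fans and drives are assumptions for illustration, not figures from the thread.

```python
# Rough 12 V rail budget from the per-component estimates in the post above.
# The fan and drive counts are hypothetical; adjust them to the actual machine.

CPU_AMPS = (6.0, 8.0)   # low/high estimate for the CPU
GFX_AMPS = (5.0, 7.0)   # low/high estimate for a Molex-powered graphics card
FAN_AMPS = 0.5          # per fan
DRIVE_AMPS = 0.75       # per drive


def rail_12v_estimate(fans, drives):
    """Return (low, high) estimated current draw on the 12 V rail, in amps."""
    extras = fans * FAN_AMPS + drives * DRIVE_AMPS
    return CPU_AMPS[0] + GFX_AMPS[0] + extras, CPU_AMPS[1] + GFX_AMPS[1] + extras


if __name__ == "__main__":
    low, high = rail_12v_estimate(fans=2, drives=2)
    print(f"12 V rail: {low:.1f}-{high:.1f} A ({low * 12:.0f}-{high * 12:.0f} W)")
```

The result is then compared against the 12 V amperage printed on the power supply's label, which is the figure that matters here rather than the total wattage.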


Bear in mind that when gaming, the CPU and GFX are running at full throttle.

I recommend this PSU
Enermax 365 watts with 26 amps on 12 volt rail

http://www.newegg.com/app/ViewProductDesc.asp?description=17-103-455&depa=0

or if you go for silence

http://www.newegg.com/app/ViewProductDesc.asp?description=17-103-431&depa=0



 
 

Author Comment

by:mSchmidt
ID: 12539739
*laughs* well, guess it boots because the graphics card is not sucking all the power when I am not playing games.  I am amazed sometimes myself when it comes to me and technical devices (I am an anti-mobile phone man - they just tend to die permanently after I have had them for a while).

But anyway, I'll be browsing for power supplies first (since if it works, I save more time (and since I have tried practically anything possible software-wise), and if I was "paying myself salary", then saving money as well).

Thanks a lot for the advice so far.  It will take 2-3 work days before the power supply gets here, so if you get other ideas in the meantime, feel free to type them in here.

One more thing: since we are talking possible power problem, could the lack of power be the reason why the game lags when I view some specific graphical detail, whereas it doesn't lag any more than normal when I use my GeForce 3?
 
LVL 8

Expert Comment

by:stockhes
ID: 12539838

Don't rely too much on a demo, because the code might not be optimized yet, nor the GFX driver. Fog and smoke especially are created with different techniques, so you may be comparing apples with pears.
 
LVL 4

Expert Comment

by:BjornEricsson
ID: 12540432
I suggest that you head over and read this post before you buy a new PSU. Personally I don't think it is undervoltage causing your problem, I'd go more for memory lapse due to over-heating.
This is mainly tips from LucF:
http://www.experts-exchange.com/Operating_Systems/WinXP/Q_20999245.html
 
LVL 69

Expert Comment

by:Callandor
ID: 12540993
Actually, I covered the typical "IRQL_NOT_LESS_OR_EQUAL" error as due to bad RAM or corrupt Windows drivers.  The power supply seems a more likely culprit in this case.
 

Author Comment

by:mSchmidt
ID: 12542960
stockhes: You have a point.  However, the computer has crashed in other games as well, albeit not as frequently.  But it still does, and that fact makes my machine unreliable.  That was a detail I failed to mention, so my apologies.

Also, just to clarify, I ran a MemTest on my RAM and let it run overnight without receiving a single memory error.  Of course, I can't say if memory lapse due to overheating is occurring (as hinted by BjornEricsson), or that my memory/CPU is giving false readings, but the test, as well as the fact that my computer didn't crash even once when running a game with my GeForce 3 card for a lengthy period, makes it more likely that the problem is related to my new graphics card, be it driver, power or hardware.

What puzzles me is why the graphics card producer doesn't specify how much power the card consumes, since power is such a crucial factor.

I ordered the "Aspire ATX-AS520W 12V" - a 520W power supply w. plenty of cooling, manual fan adjustment etc. etc.  I may have overdone it a little (being that I only needed perhaps 400W), but power supplies from this producer have gotten excellent grades, both from hardware enthusiasts and from average users (some tests reported a minor over-voltage, but it was well within the +/-5% standard set for computer hardware).  This power supply also only cost 4% more than the 400W power supply from the same producer, so, better to have too many than too few watts I say :P

The power supply can be seen here:
http://www.newegg.com/app/ViewProductDesc.asp?description=17-148-009&depa=0
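
For anyone wanting to check their own voltage readings against the +/-5% figure mentioned above, here is a minimal Python sketch. The function name and the sample readings are made up for illustration; the only thing taken from the post is the 5% tolerance.

```python
def within_tolerance(measured_volts, nominal_volts, tolerance=0.05):
    """Return True if a measured rail voltage lies within +/-tolerance of nominal."""
    return abs(measured_volts - nominal_volts) <= tolerance * nominal_volts


if __name__ == "__main__":
    # Hypothetical readings, e.g. as reported by a monitoring tool.
    for nominal, measured in [(12.0, 12.3), (5.0, 5.1), (12.0, 11.0)]:
        status = "ok" if within_tolerance(measured, nominal) else "out of range"
        print(f"{nominal:>4.1f} V rail reads {measured:.2f} V: {status}")
```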

 
 
LVL 20

Expert Comment

by:cpc2004
ID: 12545090
I am quite sure that it is caused by a faulty nVidia GeForce display card driver. I've resolved 4 blue screen problems and one system hang on the XP platform where the culprit was the nVidia GeForce display card. Upgrading to the latest nVidia display card driver will resolve your problem.
 

Author Comment

by:mSchmidt
ID: 12574062
Okay, a little status update

I got my new power supply and have plugged it into my computer.  There are 3 (!) fans in my power supply: two internal fans which suck air from inside the tower cabinet, and one external, which blows the air out through the back.  I also have a PCI slot cooler seated right under my GeForce 6800 card.  All went fine during startup.  I adjusted the cooling fans to a bearable noise level (the power supply at roughly 2/3 of its maximum adjustable speed, and the PCI slot cooler at 1/3) and tried playing a game.

After about an hour and a half of playing Thief 3, the computer rebooted with the IRQL_NOT_LESS_OR_EQUAL error.  When it crashed, I put a hand behind the power supply, and noted that the air was pretty warm.  I tried looking through the memory dump using the Windows support tools (thanks for the link, BjornEricsson), but the dump didn't point to any exception address.  Also, the memory dump file was set to be a mini-dump.  I went to system-advanced-startup/recovery and set the dump file to store a full memory dump.

I then tried adjusting all fans to maximum, put the computer to "the ultimate stress test" (Doom 3 :P), placed my avatar in a place with constant activity (some machinery moving in and out of a shadow, sparks flying, lights spinning, hatches opening / closing etc.) and went to sleep.  When I woke up the next morning, the game was still running!!!

Still curious to find the culprit of my problems, I set the PCI slot cooler fan to almost the minimum and kept the power supply cooling fans at maximum, put the computer through the same Doom 3 test, and let it run about 7-8 hours.  Again, the machine was still running.  I exited the game and immediately opened ASUS Probe and VTuner (VTuner monitors the temperature of my GPU) (no more than 5 seconds passed between being in game and running the applications).

GPU Temperature: 51°C
CPU Temperature: 47°C

Also, the air coming from the computer through the power supply wasn't nearly as warm as before: like warm summer this time (Can't really measure the temperature of the air exhaust, sorry :P hope my descriptions help somewhat).  As one can see, the power supply helped a lot with cooling the cabinet, and I am beginning to think that my issue is related to overheating.  Exactly what is overheating, however, is hard to say.  I doubt it is the graphics card (since the computer has crashed at a GPU temperature of 55°C), and definitely not the processor (since I have never seen its temperature rise above 52°C with my "ThermalTake Tower112 Heatpipe Cooling" CPU cooler).  I don't believe it is my motherboard, since ASUS Probe warns with an alert sound when either motherboard or CPU is overheating, although I do not deny the possibility.  To me, it leaves two likely candidates:

 - The Power supply
 - The Memory

Now, is it one of these which overheats, or something else? or is the problem completely different?

Now, the memory (Corsair TWINX1024-3200C2) (which can be seen here: http://www.corsairmemory.com/corsair/products/specs/twinx1024-3200c2.pdf) comes with passive cooling plates of metal covering the ICs (Integrated Circuits) on both sides of the RAM block.  Now here's my question: are those metal blocks there for added safety, or are they mandatory, due to the awesome heat generation of the RAMs?

If they are mandatory (meaning the RAM would most likely overheat without them), I believe it is likely that the RAM can still overheat when there is no cooling dedicated to my memory.

It could basically be anything in my computer overheating (since the air venting through the back was pretty hot when the machine crashed), and I guess I won't know until it crashes again and I can take a look at the memory dump (and maybe not even then).  If you have a definite answer, or know of something I can do to find out what exactly is responsible for my computer reboots, then do mention it, and I'll see if it helps.  I'll force my computer to crash and look at the memory dump soon, and write what I can conclude from there.

Thanks for reading
 
LVL 20

Expert Comment

by:cpc2004
ID: 12574412
Bugcheck 0A is very hard to diagnose. The failing instruction address is within ntoskrnl.exe (i.e. the Windows kernel), hence WinDbg does not display the failing module name. You are doing an excellent job. You need a full dump to diagnose a bugcheck 0A problem. Usually it is a memory overlay. You have to decode the failing instruction and work out which address is corrupt, then find the culprit. I believe that it is a display card driver problem.

20% of bugcheck 0A cases are related to hardware (faulty RAM, faulty cache RAM on the motherboard, faulty CPU) and 80% are related to software (i.e. device drivers).
 
Hints for processing a full dump:

1) !analyze -v  (analyze the dump)
2) !process     (display the current process; usually the current process is the culprit)
3) lm           (display the loaded module map)
4) k            (stack trace)
5) dc xxx Lyy   (display the dump in hexadecimal and ASCII in big-endian format, where xxx is the memory address and yy is the number of words)
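
For anyone who would rather script these steps than type them into the debugger, the following is a rough Python sketch. It assumes the Debugging Tools for Windows are installed, that the console debugger cdb is on the PATH, and that symbols are already configured; the helper names are made up for illustration, and only the "!analyze -v", "lm" and "k" commands from the list above are run.

```python
import subprocess


def build_cdb_command(dump_path, commands=("!analyze -v", "lm", "k")):
    """Build an argument list that opens a crash dump in cdb and runs the commands."""
    script = "; ".join(commands) + "; q"  # trailing q quits the debugger when done
    return ["cdb", "-z", dump_path, "-c", script]


def analyze_dump(dump_path):
    """Run cdb against the dump and return its text output (needs cdb on PATH)."""
    result = subprocess.run(build_cdb_command(dump_path), capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command line only; call analyze_dump() on a machine with cdb installed.
    print(build_cdb_command(r"C:\WINDOWS\MEMORY.DMP"))
```

The dump path in the usage line is just the usual default location of a full memory dump on XP; point it at wherever the dump was actually written.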

If you can post the !analyze -v output of the minidump, I can offer some comments.

Hope it can help you.
Albert
 
LVL 32

Expert Comment

by:willcomp
ID: 12575156
In order for a slot cooler to effectively cool a GPU with passive cooling, you need a fairly cool ambient temp inside case.  This means adequate intake and exhaust.  When gaming, run all fans full speed.  There is no need to stress components more than necessary.

Optimum solution would be fan blowing outside air onto GPU.

Dalton

 
LVL 69

Expert Comment

by:Callandor
ID: 12575483
If the RAM were bad, it would fail at lower temperatures also.  Those are heat spreaders on the modules, and I recommend you keep them on.  The improved cooling that the new power supply provided was probably a deciding factor.
 

Author Comment

by:mSchmidt
ID: 12618182
You want to hear a joke?

I did a round not long ago to check for updated drivers.  Turns out that NVIDIA had released a new driver on November the 9th (the day I asked this question).  I downloaded it and tried playing games, and Doom 3 had stopped lagging at the spots it lagged before! Furthermore, it all ran a lot smoother.  I let it run overnight with my coolers set to an acceptable noise level, and it didn't crash!

Before giving a grade, however, I would like to know for sure what the problem was.  It is almost certain that the problem was driver-related, but perhaps the added power from the power supply, or the cooling from it, helped?

Anyway, here's the kernel memory dump (the computer hasn't crashed since, so I can't provide a full memory dump yet):
http://fordaeda.dk/wTroubleShoot/wMemDump1.JPG

If anyone can see from this memory dump what was responsible for my computer rebooting, please mention it :-)

willcomp: unfortunately, I do not have a hole in the side of my cabinet to blow fresh air onto the GPU.  I could place a fan so it sucks in air from the space where there are no PCI cards in the back of my computer... might take a look into that.

Callandor: definitely, I'm keeping those heatspreaders on :-) All I was wondering was whether the modules were meant to work without them, because if they were, then the computer crashing due to a memory overheat despite the heatspreaders seems unlikely.

cpc2004: I am not familiar with all those tools, so all I did was to use dumpchk.exe (one of Windows XP's support tools).  If that is enough, then you can look into the link above.  If not, then I will need some directions on how to carry out the diagnostics you speak of.

Again, thanks everyone for helping
 
LVL 20

Expert Comment

by:cpc2004
ID: 12618694
Your dump was triggered by bugcheck code 0A and the failing address is in ntoskrnl.exe, which is the Windows kernel. I don't think the Windows kernel is the culprit. A minidump is not sufficient to determine the culprit of bugcheck 0A. Usually we need to analyze the stack trace, and the minidump only shows three or four stack frames.

If it is an nVidia display card driver problem, it triggers minidump bugcheck codes 50 and 8E. Analyze the minidump and it shows that the failing module is the nVidia display card driver. From the output I know that your PC is running XP SP2, and I've handled a lot of bugcheck code 0A crashes in the nVidia display card driver on XP SP2 and W2K SP4.
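
As a quick reference for the stop codes that come up in this thread, here is a small Python lookup sketch. The table only covers the four codes mentioned here (it is not a complete list of Windows bugchecks), and the function name is made up for illustration.

```python
# Symbolic names for the stop (bugcheck) codes that appear in this thread.
# Deliberately tiny: any other code is reported as unknown.
BUGCHECK_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x4E: "PFN_LIST_CORRUPT",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x8E: "KERNEL_MODE_EXCEPTION_NOT_HANDLED",
}


def bugcheck_name(code):
    """Map a stop code (int, or hex string such as '0A' or '0x8E') to its name."""
    if isinstance(code, str):
        code = int(code, 16)
    return BUGCHECK_NAMES.get(code, "unknown (not in this small table)")


if __name__ == "__main__":
    for code in ("0A", "50", "8E"):
        print(code, "=", bugcheck_name(code))
```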

 
 
LVL 8

Expert Comment

by:stockhes
ID: 12620116
Hi mSchmidt

There have been some issues with XP Service Pack 2 and drivers; it looks as if you were caught there.

However, it also seems you were on the edge in terms of cooling and power.

Power actually depends on the 300 watt Codegen's 12 volt capability.

Various monstrous fans inside a case don't help much if you do not remove the hot air from the case.
This is even more essential with passively cooled devices.

>willcomp: unfortunately, I do not have a hole in the side of my cabinet to blow fresh air onto the GPU

solution :-)

http://www.toolworld.dk/store.php?ctrl=120216&pno=60121&pointer=0

The proper way is to have a flow through the cabinet, normally cold air in at the bottom front and hot air out at the top rear.

This is normally achieved with decent gaps in the front of the cabinet, sometimes aided by a fan sucking inwards.

At the back, the hot air is sucked from above the CPU (tower) by the PSU and emitted out the back of the case.
Normally a case fan is added at the back, sucking the hot air away from the GFX.
 
 

Author Comment

by:mSchmidt
ID: 12805260
Alright, I've been using my computer for a while now, and it does run a lot more stably.

However, it is not completely stable.  I played a game over a weekend with a friend of mine, and I would say that it crashed on average 1-2 times per day.  The game has some 3D graphics, and on some of the BSOD screens, it directly points out the nv4_disp.dll with a bugcheck stop code 8E.  However, it has also crashed twice on the Windows desktop (not much, but it hasn't done so before).  I photographed some of the BSODs as they occurred, but many of the memory dumps are lost because I didn't have enough disk space for the computer to dump the memory :P I'll make sure it finishes writing the entire memory dump to the disk next time, and post the dumpchk here.

Anyway, here are the screenshots:

http://fordaeda.dk/wTroubleShoot/wScreenShot1.jpg
http://fordaeda.dk/wTroubleShoot/wScreenShot2.jpg
http://fordaeda.dk/wTroubleShoot/wScreenShot3.jpg

Excuse the poor picture quality, these were photographed with my telephone.  All of these BSODs occurred after installing the November drivers.  As "cpc2004" mentioned, these are all "50" and "8E" errors, and they all mention the nv4_disp.dll.  However, it has also crashed without pointing out this driver (will specify them here when they reoccur).

Since it is probably the display driver, then what does that tell me? That I have the wrong driver? That I have some invalid graphical software? That the driver simply isn't compatible with my windows? Or that the hardware is faulty?

cpc2004: I shall post a dumpchk of a memory dump the next time the computer crashes.  Since you have handled numerous problems with NVidia display driver vs. SP2, perhaps you know what the common cause is (if there is one)?

stockhes: Hehe, good one :) I'd prefer not to literally grind up my computer case, but I'll keep the solution in mind, as a last option ;) I do not believe I have a thermal problem, not any more, that is.  Below you can see a mock-up of the heat generation and air-flow in my cabinet, illustrated as if looking into the side of the tower cabinet:
http://fordaeda.dk/wTroubleShoot/wCompMockup.GIF
The air-flow follows the ideal "in through the lower front, out through the upper back" air flow.  It is possible that the size of the VGA obstructs the supply of fresh, cool air to the CPU and RAM, since the VGA and one of my hard drives almost separate the cabinet horizontally, but the air can still get around the side of both hard drive and VGA.
Like cpc2004, you speak of driver issues with SP2.  Is there a solution to this, or would I have to downgrade my Windows to SP1 for the computer to work properly?

Thanks for reading, everyone, and for your patience :)
 
LVL 20

Expert Comment

by:cpc2004
ID: 12805373
Refer to the following 3 cases that I resolved last month; they are all related to the nVidia display card driver:
http://www.experts-exchange.com/Operating_Systems/WinXP/Q_21221550.html#12684667
http://www.experts-exchange.com/Operating_Systems/WinXP/Q_21205016.html
http://www.experts-exchange.com/Operating_Systems/WinXP/Q_21230813.html

Install nVidia Display Card Driver Version: 61.77, Release Date: July 27, 2004. This is a stable version. http://www.nvidia.com/object/winxp_2k_61.77.html

cpc2004
 
LVL 20

Expert Comment

by:cpc2004
ID: 12826317
Can you upload the minidump to any web space? I want to process your dump to find out the culprit.
 
LVL 20

Expert Comment

by:cpc2004
ID: 12972962
Have you resolved your problem? If not, install nVidia Display Card Driver version 61.77.
 

Accepted Solution

by:
modulo earned 0 total points
ID: 13264786
PAQed with no points refunded (of 500)

modulo
Community Support Moderator

Question has a verified solution.