hardware problem, sapphire hd4870 vaporx showing gpu overtemp protection led (not) randomly

hello everybody,
i'm in the long process of building my gaming/work rig, and everything is flawless except for one small problem.
here is my configuration:

-case: corsair obsidian 800d
-MB: asus m4a79t deluxe
-cpu: AMD phenomII x4 955 black
-ram: 4x2gb kingston "standard" low profile
-gpu: tried 2x crossfire sapphire hd4870 vaporx 1gb gddr5, but now running with one
-psu: corsair AX850
-cpu cooling: corsair h80 liquid aio
-raid: IBM Megaraid M5015 with BBU
-sys disk: IBM SAS 10k 300gb
-data array: 3x 250gb caviar blue raid 5 (temporary of course, while prices come back to earth)
-various fans: two 140mm noiseblocker pk3 for intake, two silverstone air penetrator on the radiator in push-pull, a scythe kama flow 120 for extraction on top, the original 120mm corsair for the hd rack, and a 3600rpm 40mm noiseblocker for the raid card's originally passive heatsink.

the first problem i had was indeed the raid card overheating, which i solved by mounting the above fan to push air onto the passive heatsink and by replacing the thermal paste on the card's small processor.

now the problem is that most of the time, when the pc has been off for a long time (not months, maybe 24 hours), the first time i turn it on the fans run at 100% and a red led lights up on the gpu.
the led turned out to mean overtemp protection, which i knew was impossible, because the gpu is cold to the touch!

-i tried swapping the gpu with the identical one i had: same result.
-tried one of the gpus in another system, and it is still working fine after 3 weeks.
-found out that when the problem appears it is enough to switch off the psu, remove the small motherboard battery, wait for the power to drain from the mb, then put the battery back, and it works like magic.
then if i shut it down abruptly after a crash, or i don't use it for a few days, the problem reappears, but it goes away if i do as above.

my main suspect is the raid card..
since it was created for servers, which usually stay up forever (virtually, at least), when it is off the bbu runs out of charge..
when i turn it on again it says that the data in cache memory was lost.

Any idea how to solve the problem?
is it advisable to use a raid card with bbu in a desktop which is not used server-like 24/7?

i would like to keep the current HW (not change the MB, for example) since i'm waiting for the new intel ivy bridge family, if possible!

thanks!
Paolo
oloap88 Asked:
David (President) Commented:
RAID card overheating?  That is rare unless the problem is a cold solder joint, or a chip failure.  What are the symptoms?

One thing is for sure: the Caviar Blues are not suitable for RAID5.  In fact, if you look at the datasheet, it reveals that the warranty is voided for RAID5 use. Those drives simply aren't architected for the error recovery parameters.  You can get away with RAID1 but not RAID5.  So if the problem is data loss or is disk I/O related, then you need to go to RAID1.
Aaron Tomosky (Director of Solutions Consulting) Commented:
That raid card, even without the overheating, sounds like a lot of work just for 500gb of storage. Add in the overheating weirdness and I'd say pull that whole thing out. Use the onboard raid in raid 1 and only have 250gb mirrored for now.
oloap88 (Author) Commented:
the symptoms were system unresponsiveness, freezing and so on.. now it works flawlessly..
it is obvious that the card was overheating, since it was designed for servers, where high-pressure fans push air through the air baffles and across the passive heatsink.
the problem is that down there where it is placed, under the gpu, there is no airflow in the obsidian, which is a case designed for liquid cooling!

i know the point about the caviar blue is correct, but right now i haven't even created the volume on the disk array,
and when i did, it simply worked correctly.

i also have an ibm serveraid m1015, which lacks cache and bbu; do you think it is better?
i'd avoid using onboard since it lacks SAS support, and i spent 100€ on the cables and 100€ on the card!

anyway, the major issue is that overtemp protection light on the gpu, and since the gpu is ok, what can the problem be?

thanks for the answers!
David (President) Commented:
Get rid of the RAID card if you are running Win7.  You'll probably get much better performance anyway.  Windows 7 does read load balancing.  With a HW RAID controller and BBU you will get better write performance, but since this is a gaming system, it is all about reads.  No RAID card = no RAID card problems.  Just use the native RAID capability of the O/S.
oloap88 (Author) Commented:
man, i'd like to, but this is not only a gaming machine.. and sas disks are not supported onboard, and i cannot believe that the solution is to throw away like 400 bucks of hardware.. (controller, disk, and cables!)

the machine is used for many things: i use it for virtualization with vmware and gns3, and my girlfriend uses it as a workstation for graphics jobs like Autocad 3d, 3d studio, cinema 4d, maya... and she is used to working with files larger than 3 gb, so SAS is another world!

Hardware will be changed in 3-4 months, with a xeon based mobo; i don't know yet if it will be a dual or single cpu (waiting for the new evga dual lga2011).
the gpu will be replaced with a 7970, i guess.
the sas hd will also be replaced by a couple of ssds.
but since i'm paying for the rig myself, i'd like to get new-gen hw.
In my opinion the problem is caused by the bbu: when the system has been up for hours, so the bbu is charged, a reboot is no problem, while when it is probably out of power it gives that led error on the gpu..
that said, i have no idea how that is possible!

i'll replace the raid card with the simpler m1015, since the current m5015 was meant to work, together with the sas hd, in the server i have in the basement, a 1u IBM running esxi.

promise i'll let you know!
Aaron Tomosky (Director of Solutions Consulting) Commented:
Or just use the raid card as a sas adapter. Don't raid 5 your 250gb drives.
Callandor Commented:
It sounds like you have a bad setting in your motherboard BIOS, if removing the battery fixes your problem.  Check if it is 3v; it may need replacing.

It is a bad idea to use a RAID card in a silent cooling case, because as you have noticed, those cards can get very hot.  If you still want quiet operation, add a 120mm case fan to blow air over it.

oloap88 (Author) Commented:
hi, what do you mean by 3v? the power supply? i hope it's not that: it is a top corsair AX850, a brand new 160€ psu!

I removed the battery backup unit from the raid card and reinstalled the system, and now the problem seems to happen less often. it seems like the raid card was running out of battery and was draining the board battery.. impossible as far as i can tell, but i can't find any other explanation.

The card anyway now has a 40mm fan running at 3800rpm pushing air onto the heatsink; at full speed i read a temp of 36-45°C on the probe attached to the heatsink.. sounds good...
Callandor Commented:
I meant the CMOS battery should be 3v; they are usually stored for long periods of time and can go bad when first used.

Cooling the RAID card is a good idea, as they can get very hot during normal use.
oloap88 (Author) Commented:
well, bad mobo settings, though... it is broken indeed! i just ordered a brand new asrock 990fx extreme 4 while waiting for ivy bridge.

thanks,
Paolo