Wayne Barron

R710 MEMBIST Failures DIMM B1 and B4

Hello, All.

I've had this server sitting around for over a year. Bought it on eBay for $95.00 total, including shipping.
(I waited until the last 15 seconds and placed a bottom-dollar bid; the total, including shipping, came to $95.00.)

During the setup, this is what I've done.

Updated the BIOS to 6.6.0
Updated the iDRAC Express to 1.8

Have 4 sticks on A and 4 Sticks on B (64GB using 8GB DIMMs)
A1 A4, A2 A5
B1 B4, B2 B5

I've tried three different brands of DIMMs, and I receive the same error on the same slots every time.
MEMBIST Failure - B1 and B4

All DIMMs are:
8GB 2Rx4 PC3-10600R
HYNIX
HP Micron
Samsung

All of this memory is known good and was tested in my other servers, so for the sticks to be at fault, ALL of them would have to be bad, which is not plausible.

Could this possibly be a Motherboard issue?
Here is a picture of the error screen.
Showing 48GB out of the 64GB.

Any info on this would be great.
I can pick up another motherboard for it rather cheap, so I have no issues doing that if that is the case.
I have 3 other R710s, so I will replace this one in the rack with one of them and work on it later on down the road.
I love these R710s, so I am not bitter with the motherboard being bad if that is the case.

Thank you
Wayne

User generated image 
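
The 48GB figure is consistent with the two flagged DIMMs simply being disabled at POST. A minimal sketch of that arithmetic, assuming every populated slot holds an 8GB DIMM as listed in the question; the function name `usable_gb` is hypothetical and only for illustration:

```python
def usable_gb(populated, disabled, dimm_gb=8):
    """Sum the capacity of populated slots that were not disabled."""
    return sum(dimm_gb for slot in populated if slot not in disabled)


# Slots populated in the question: four per CPU, 8GB each.
populated = ["A1", "A4", "A2", "A5", "B1", "B4", "B2", "B5"]

installed = usable_gb(populated, disabled=set())
reported = usable_gb(populated, disabled={"B1", "B4"})
print(installed, reported)
```

If the BIOS had dropped a whole channel or the whole B-side, the reported total would be lower than this, so the number is a quick sanity check on which slots are actually affected.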
kevinhsieh

Possible dirty pins on those DIMM slots.

Not sure how to clean them. Otherwise, replace motherboard.
>Could this possibly be a Motherboard issue?

That's what it looks like.
The fact that you've tried different memory in the same slots with the same error tells me it is the system board.
Your slot config seems fine.

Another test would be to put the B modules in the A slots. That would require removing the second processor, but if it POSTs fine with one processor and those 4 DIMMs for 32GB, you have a more definitive answer.
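
The reasoning behind the swap test can be written down as a small decision helper. This is only a sketch of the logic, not a diagnostic tool: the function name `isolate_fault` and the result strings are hypothetical, and "slot-side" here covers the board, the CPU socket, and the CPU itself.

```python
def isolate_fault(results):
    """results maps a DIMM-set label to the set of slots that failed with it."""
    failing = [frozenset(slots) for slots in results.values()]
    if not any(failing):
        return "no fault observed"
    if len(failing) < 2:
        return "inconclusive: try a second DIMM set"
    # Slots that fail no matter which DIMMs are installed point away from the DIMMs.
    common = frozenset.intersection(*failing)
    if common:
        return "slot-side fault suspected: " + ", ".join(sorted(common))
    return "DIMM-side fault suspected"


# The situation from the question: three brands, same two slots failing.
observed = {
    "Hynix": {"B1", "B4"},
    "HP Micron": {"B1", "B4"},
    "Samsung": {"B1", "B4"},
}
print(isolate_fault(observed))
```

Moving the same DIMMs to the A slots adds one more data point: if they pass there, the DIMMs are cleared and the fault stays with the B-side.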
Wayne Barron

ASKER

Seth
>another test would be to put the B modules in the A slots though it would require removing the second processor if it posts fine with one processor and those 4 for 32gb, you have a more definitive answer

Not sure I am following you.
Are you saying take the modules from the B-side and place them all in the A-Side?
And test with one side, which is the A-side, and see if it POSTs?
If that is what you are saying, yes, it will.

------------
I also have another server that does it to the entire B-side.
The BIOS sees both CPUs, so that is good; just the DIMM slots seem to be shot.
However, I will TRY to do as Kevin stated and blow them out.
I have a can of that computer AIR spray cleaner.
It is worth a shot.
I mean, these systems were downright dusty as heck when I removed the covers from them.
So, hopefully, that is the issue.
If not, then I will pick up a few boards next week when I get paid.
-------------

I grabbed one of the other R710s and installed the CPUs and memory that I had tried in the two servers mentioned above.
And the only issue it had was that the H700 RAID A cable was not seen.
So, I took the known good one from the server (From this main post screenshot) and replaced it, and that server is up and going.
So, that will be put to work later this week while I work on getting the other systems going.

Gotta love buying stuff cheap, but you do take a gamble.
However, I would rather purchase 3 x R710s for $350.00 (all 3) and a couple of motherboards for a total of $75.00 (for 2) than spend $500.00+ on a semi-new R710, or upwards of $1,000.00 for a new one from some of the companies I've found.

Love technology.
It is getting me where I want to be.
ASKER CERTIFIED SOLUTION
Member_2_231077

This solution is only available to members.
Hey, Andy.
This is the reference I went by for populating the RAM with 8GB sticks.
User generated image
I agree that it could be an issue with the CPU pins. Remove the CPU and clean the contacts. It should be a ball and socket type, so alcohol on a clean cotton cloth can be used.
Member_2_231077

Where did you get that from, Wayne? It is used, but mainly for memory mirroring, which puts half of the RAM in reserve. It can also provide Advanced ECC, but generally normal ECC is fine. For normal independent channel mode, an equal amount of RAM per channel is preferable.

Installing and Upgrading DDR3 Memory:
Quick Reference Guide for New Dell PowerEdge Servers

says
"Unbalanced configurations can potentially generate BIOS warnings or error messages. The only exceptions to this statement are a few low DIMM count offerings and specific configurations for RAS features (such as Advanced ECC and Mirroring) which only populate 2 channels and do not use the third channel.
A Balanced Memory Configuration is a memory configuration where both CPU 1 and CPU2 (if installed) both have identical memory populations on all memory channels."
>Where did you get that from Wayne?

looks like the same place where I found it...

https://www.netdevgroup.com/support/dell_r710_configs.html
Well, it will work, but it's better to use all 3 channels; that is what page 132 of the Hardware Owner's Manual says, not that it helps with your problem.

I'm not sure Dell firmware will allow it, but assuming channel 1 has a bent CPU contact or a damaged trace, it is possible to use just channels 2 and 3. You may need a small DIMM in channel 1 just to fool it into starting; it'll disable that DIMM anyway, so a 2GB one will do.
I had a similar problem when I replaced the CPU in my server. There was some thermal grease on the CPU contacts. I was able to clean the CPU and it was fine.
The memory guide is here.
https://www.netdevgroup.com/support/dell_r710_configs.html

Issues are.
Server 1
Bent pins on CPU2

Server 2
Pins all down in a row. CPU2

So, new boards for both of them.
Checking the 4th server now.

2 good,
2 bad

As I stated before, just about $75.00 for two boards, no big deal.
So, as it sits.
1 board from one eBay seller is bad CPU 2
1 board is bad from Facebook Marketplace, out of 3 Servers I bought from him for $350.00.

Still not too bad.
Thanks, guys, for the assistance on this one.
Wayne
I must confess to a few bad CPU2 sockets. I'm not that clumsy, but I've probably built more than 1000 servers, and with a mixed order, if there's socket 2 damage and it's in a hurry, that server gets sent out as a 1P box and another gets the second processor fitted instead.

>The memory guide is here...

I don't want to appear rude but that is not the memory guide, it's not even from Dell but some company I've not heard of. The Dell manual suggests populating in numerical order for best performance, that's why they numbered the slots that way. Both orders work and so do many others. Suboptimal configurations work pretty well so I could get these working with the right amount of RAM on each CPU for about $10 each.
Andy, I populated the memory as you suggested, and it worked; the other layout, for some reason, would not work on this server.
I cannot remember how I have the memory set in the live server. I will take it down this weekend, and once I have it down, I will look at the memory config on that one.

Some work in the way the image shows, and others work in the way you explained.
It is strange, actually.
Either way, I know what the issue was, and two new MB's will be ordered in the coming days.
https://www.dell.com/support/kbdoc/en-us/000138771/poweredge-how-to-configure-the-memory-in-the-bios will explain why some work one way and others the other way. If Optimizer Mode is set, then populate 1,2,3,4...; if Advanced ECC Mode is set, then populate 1,2,4,5, although that obviously doesn't matter with a row of damaged contacts.
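
To make the two population orders concrete, here is a minimal sketch that lists which slots to fill for a given mode and DIMM count per CPU. The slot-to-channel layout below (1/4/7, 2/5/8, 3/6/9) is an assumption about the R710 and should be checked against the Hardware Owner's Manual; the function name `population_order` and the mode strings are hypothetical, not anything from Dell's firmware.

```python
# Assumed layout for one CPU: channel index -> slot numbers on that channel.
CHANNELS = {0: [1, 4, 7], 1: [2, 5, 8], 2: [3, 6, 9]}


def population_order(mode, dimm_count, cpu="A"):
    """Return the slots to populate, in order, for one CPU."""
    if mode == "optimizer":
        # Numerical order spreads DIMMs across all three channels.
        slots = sorted(CHANNELS[0] + CHANNELS[1] + CHANNELS[2])
    elif mode == "advanced_ecc":
        # Only the first two channels are used; the third stays empty.
        slots = sorted(CHANNELS[0] + CHANNELS[1])
    else:
        raise ValueError(f"unknown mode: {mode}")
    if dimm_count > len(slots):
        raise ValueError("more DIMMs than usable slots in this mode")
    return [f"{cpu}{n}" for n in slots[:dimm_count]]


print(population_order("optimizer", 4, cpu="B"))
print(population_order("advanced_ecc", 4, cpu="B"))
```

With four DIMMs per CPU, the Advanced ECC order matches the A1 A4, A2 A5 layout from the original question, while Optimizer Mode would fill slots 1 through 4 instead.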