"Single-Bit ECC Errors in Memory Bank"

Posted on 2006-07-02
Medium Priority
Last Modified: 2007-12-19
At bootup I see the message "Single-Bit ECC Errors in Memory Bank", but I am able to proceed with booting and nothing seems wrong.

I assume that at least one of the banks of ram is bad (I think I have either 4 or 8 banks in there).

Should I pull one out at a time until I find the broken one (and then replace it) or is it safe to just ignore it? If I can ignore it, is the speed of the ram impacted by this at all?

I don't know if my type of memory corrects single bit errors or simply detects them.
Question by: HappyEngineer
LVL 32

Assisted Solution

jhance earned 400 total points
ID: 17026152
If the BIOS screen doesn't identify which RAM module is producing the error, then yes, the one-at-a-time removal method should help you identify the failing one.

ECC = Error CHECK and CORRECT.  This type DOES detect and correct single-bit errors.  The other type, PARITY, can only CHECK; since it has no correction capability, the system is halted when an error is found.
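
To illustrate why parity can only check, here is a minimal Python sketch of a single even-parity bit over one byte. The helper names `parity_bit` and `parity_ok` are made up for this example and do not come from any real BIOS or memory controller.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") % 2


def parity_ok(word: int, stored_parity: int) -> bool:
    """True if the word still matches the parity bit stored alongside it."""
    return parity_bit(word) == stored_parity


data = 0b10110010
p = parity_bit(data)

single = data ^ (1 << 3)              # one bit flipped
double = data ^ (1 << 3) ^ (1 << 6)   # two bits flipped

print(parity_ok(data, p))    # True
print(parity_ok(single, p))  # False: an error is detected, but not which bit
print(parity_ok(double, p))  # True: the two flips cancel and go unnoticed
```

The check only says "something is wrong"; it carries no information about which bit to fix, which is why a parity system can do nothing but halt.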
LVL 14

Accepted Solution

FriarTuk earned 400 total points
ID: 17026154
are these simms or dimms? does the mobo require matched ram in certain banks?

yes, you should try finding which one it is & remove it (one at a time)

but there are memory testers:

Expert Comment

ID: 17027091
ECC has the ability to correct a detected single-bit error in a 64-bit block of memory. When this happens, the computer will continue without a hiccup; it will have no idea that anything even happened. However, if you have a corrected error, it is useful to know this; a pattern of errors can indicate a hardware problem that needs to be addressed. Chipsets allowing ECC normally include a way to report corrected errors to the operating system, but it is up to the operating system to support this.
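
As one example of operating-system support, Linux exposes corrected and uncorrected error counters through its EDAC sysfs tree when an EDAC driver for the chipset is loaded. The sketch below is only an illustration under that assumption: the asker's operating system is not stated, the function name `read_edac_counts` is hypothetical, and the directory may simply not exist on a given machine.

```python
from pathlib import Path


def read_edac_counts(root: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce_count": n, "ue_count": n}} from an EDAC tree.

    ce_count = corrected errors, ue_count = uncorrected errors.
    Returns an empty dict if the directory is missing (no EDAC driver).
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

A corrected-error count that keeps rising between checks is the kind of pattern that points to a hardware problem worth chasing.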

If a minor (one-bit) memory error occurs, the ECC logic will handle it. If a two-bit or larger error occurs in ECC memory, your system will be halted--similar to what happens with parity memory when any error is encountered.
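
The behaviour described above (correct one flipped bit, halt on two) can be sketched with a textbook Hamming code plus one overall parity bit, often called SECDED. A 64-bit word needs 7 Hamming check bits and 1 overall parity bit, 72 bits in total. This is an illustrative sketch only: real memory controllers use their own check-bit layouts, and the names `encode` and `decode` are made up here.

```python
from typing import List, Optional, Tuple


def encode(data: int, k: int = 64) -> List[int]:
    """Encode k data bits. Index 0 holds overall parity; indices 1..n use the
    classic Hamming layout with check bits at power-of-two positions."""
    r = 0
    while (1 << r) < k + r + 1:   # smallest r with 2**r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)
    bit = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):       # not a power of two: data position
            code[pos] = (data >> bit) & 1
            bit += 1
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity          # makes each check group even
    code[0] = sum(code) % 2       # makes the whole codeword even
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos       # XOR of set positions is 0 for a valid word
    overall_odd = sum(code) % 2 == 1
    status = "ok"
    if overall_odd:
        if syndrome > n:
            return "uncorrectable", None
        code = code.copy()
        code[syndrome] ^= 1       # syndrome 0 means the overall bit itself
        status = "corrected"
    elif syndrome != 0:
        return "uncorrectable", None  # even number of flips: detect only
    data, bit = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            data |= code[pos] << bit
            bit += 1
    return status, data


word = 0xDEADBEEFCAFEF00D
cw = encode(word)
one = cw.copy()
one[37] ^= 1                      # single-bit error
two = one.copy()
two[5] ^= 1                       # second error in the same word
print(len(cw), decode(cw)[0], decode(one)[0], decode(two)[0])
```

With one flipped bit the syndrome names the faulty position, so the word is repaired silently; with two, overall parity looks fine but the syndrome is non-zero, so the only safe response is to report an uncorrectable error.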

refer this.

Or you can download third-party memory testing software to find the fault.

Hope you find this helpful,
LVL 44

Assisted Solution

scrathcyboy earned 400 total points
ID: 17028938
In general, it is NOT safe to ignore ANY RAM error under Windows.  Remove the modules and see if they have a CL rating printed on them.  If so, and it is CL3, it means your RAM is too slow for the chipset on the motherboard.  Most fast chipsets like NVidia need CL2.5 or faster.  Only the slowest chipsets can handle CL3, so if you have CL3 RAM (check the net for your product number if the CL is not printed on the modules), then you will get intermittent errors like the ones you are getting.

It is NOT safe to proceed this way. You will get random seizures, failures, data loss, even corruption of the chipsets and eventual loss of the motherboard if you continue.  It is possible you have just one bad RAM chip, but most likely the RAM cannot perform to the latency requirements of the board.  It is not advisable to run with RAM too slow for the MB (slow has nothing to do with the PC xxx rating, BTW).

Assisted Solution

SaxicolousOne earned 400 total points
ID: 17031798
Honestly, I wouldn't make blanket statements about the necessary CAS latency of memory modules without knowing what the system in question calls for. HappyEngineer, just check your system's (or specific motherboard's) recommendations as to the memory module specifications if this is something you're worried about. If you are concerned that your modules may be inappropriate for your system, you can have a look at exactly what memory you've got using CPU-Z...

... and compare that with the motherboard's stated requirements. If you can't find any stated requirements anywhere for your system, just go to a memory manufacturer's site, look for THEIR recommended modules for your system, and have a look at those modules' specs.

If you want, you could also have a look in your BIOS and see what SPD timings (CAS, etc.) are supported by your system.
LVL 44

Expert Comment

ID: 17033020
A good CAS match to the MB chipset is one of THE most important issues for stability.  Single-bit ECC errors in this case are more likely related to the MB chipset not liking that ECC RAM, but the CAS issue is still supremely relevant.  With the wrong CAS, this is what you can get with ECC.
LVL 12

Assisted Solution

GinEric earned 400 total points
ID: 17039892
Error Correction for single-bit errors is between the two ends, the memory and the input/output registers.  It's generated at each end and checked.  Therefore, it is not simply a memory problem.  One Bit Error Correction is designed more to fix a bit that got lost somewhere in the middle, not in memory.

It could just as well be a bad gate on the motherboard as a memory card, in fact, it would be more likely, since the Error Correction Code is stored in the parity bits in memory.

If you don't know what RAS and CAS timing are, you should not play with them.  The Refresh timers will cause all kinds of errors if you tweak them out of specification.

Your memory is not on all the time, that's how modern memory works, to cut down on power consumption.  The RAS and CAS timers determine how long memory is allowed to be powered off and still retain memory of bits until the next refresh cycle.  During the refresh cycle, you cannot access that memory, the request goes into a wait queue for a few milliseconds.  This is masked by how memory is transferred in large blocks, so you hardly ever see any latency.  To see how this works, look up dynamic refresh memory.

A One Bit Error is generally an indication that something outside of memory lost a bit, not the memory itself.  Anything from fluorescent lights to power surges in other devices can cause the loss of a bit here or there.  An air conditioner switching on could do it.  But if it's always the same bit, either some circuit is weak or some cable is loose or dirty, or perhaps some chip has a bad solder joint.  It could even be a badly laser-welded pin leg on a chip, and merely walking by it could cause enough vibration to temporarily disconnect the pin.  I've seen all this in production and in the laboratory, under a scanning electron microscope.  I have even seen bad runs of all sorts of IC chips that had bad laser welds.

If the power supply is getting weak, or the motherboard is getting too hot, it can start to show up as one bit errors.

Often, replacing the memory only delays the eventual catastrophe by masking the real problem.  Then, when the real component fails, it takes the data with it.  So, even if it works after replacing the memory, there is no guarantee that you have actually found the problem, and, it may come back to haunt you.

You should check all of your voltages, connections, clear the dust from the motherboard and fans, hard drives, etc., and visually inspect while doing so, even use your sense of smell to sniff out possible burned components.  Check all cabling, and press on the connectors to make sure they're firmly seated.  Aside from that, you can take out the memory cards and clean the lands, if you have the proper cleaner.

That's how it's done in the field by engineers.

LVL 14

Expert Comment

ID: 17054515

Author Comment

ID: 17055483
This is all useful information, but for some reason the message stopped coming up during booting. I've been waiting to see if it reoccurs, but so far it hasn't. I obviously can't go in and change things in order to fix it if it isn't giving me any messages to indicate that it's still happening, so I guess I'll close the question.

I tried memtest86, but it didn't find any problems and I haven't seen the error during bootup since I ran it.
LVL 12

Expert Comment

ID: 17055581
It may have been heat or any other condition.  One-bit errors are flaky, but it will return one day, count on it.

Question has a verified solution.
