Solved

Write cache and performance hit.

Posted on 2011-09-09
Last Modified: 2012-05-12
The ML 110G4 server has a built-in RAID controller, and RAID 1 is configured using 2x Seagate ST3160812AS SATA drives.
Recently I had issues with a bad spot on the RAID which I assume was the cause of (so far) two BSOD incidents.
I have run a RAID verify and it picked up and fixed 107 errors.
In the process I discovered that I had been running the server with write cache enabled on the RAID controller. The BIOS warning on POST recommends turning write cache off, apparently because the controller can't handle it without risking data integrity issues.
The server has been running with cache write enabled for a long time (two years or more).
After disabling the write cache, server performance has dropped significantly to the point that it is almost unusable/unproductive (takes ages to boot up and CPU goes crazy).
The other thing to note is that for a long time (two years or more), one of the Seagate drives has shown up on POST as 3.0 Gb/s and the other as 1.5 Gb/s.
I have not been able to work out why, or whether this is having a major impact on performance.
Please advise on a possible fix.
How can I improve performance?
Should I just leave the write cache enabled and ignore the warning message on POST?
Should I try to get both drives to run at 3.0 Gb/s, and if so, how?
Question by:stevenvel
 
LVL 14

Accepted Solution

by:
charlestasse earned 300 total points
ID: 36514359
So, not knowing much about this controller, I can tell you how LSI controllers work; there should be similarities here.

Generally you would enable write back cache on a controller that has a connected battery backup. The controller will write to its onboard memory and in the event of an unexpected shutdown, the battery will maintain the contents of the cache until the server comes back up and then write the data to the drives without data loss or corruption.

From what you have written, it appears that your controller does not have (or does not support) a battery, and this would cause the POST message.

Running in write back mode will give you the performance you are looking for, but not the protection that a server should have. It also appears that at some time one of the hard drives may have been replaced with a 3.0 Gb/s drive. Generally most controllers are designed to be forward compatible so that faster and bigger drives can be used to replace smaller/slower ones.
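
To make the trade-off concrete, here is a toy Python model of the three cases discussed above. It is only a sketch: the class and method names are made up for illustration, and it does not describe how this controller (or any real firmware) is implemented.

```python
class ToyController:
    """Toy model of a RAID controller cache; not real firmware behaviour."""

    def __init__(self, write_back, has_battery):
        self.write_back = write_back
        self.has_battery = has_battery
        self.cache = []  # writes acknowledged but not yet on the drives
        self.disk = []   # writes that have reached the drives

    def write(self, block):
        if self.write_back:
            # Fast path: acknowledged as soon as it is in controller memory.
            self.cache.append(block)
        else:
            # Write-through: slower, but the block goes straight to the drives.
            self.disk.append(block)

    def flush(self):
        """Move everything pending in the cache onto the drives."""
        self.disk.extend(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Return the blocks lost when the controller loses power."""
        if self.has_battery:
            # The battery preserves the cache; it is flushed on the next boot.
            self.flush()
            return []
        lost = list(self.cache)
        self.cache.clear()
        return lost


def demo():
    """Write three blocks in each mode, cut the power, report what was lost."""
    results = {}
    for name, write_back, has_battery in [
        ("write-through", False, False),
        ("write-back, no battery", True, False),
        ("write-back, battery", True, True),
    ]:
        controller = ToyController(write_back, has_battery)
        for block in ("a", "b", "c"):
            controller.write(block)
        results[name] = controller.power_loss()
    return results


if __name__ == "__main__":
    for mode, lost in demo().items():
        print(mode, "lost:", lost)
```

Only the "write-back, no battery" case loses anything in this model, which is the situation the POST warning is about.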
 
LVL 14

Expert Comment

by:charlestasse
ID: 36514364
I would highly recommend that you invest in the best controller you can afford that has battery backup for this server.
 

Author Comment

by:stevenvel
ID: 36514492
Yes you are right, the controller does not have a battery backup.
The server is due for an upgrade, but I need to buy time.
In the short term, should I ignore the error and just enable the write cache?
I have a UPS always protecting the server.
Both drives are the same model (but I am not sure about the firmware). How can I make them both run at 3.0 Gb/s?

 
LVL 8

Assisted Solution

by:eager
eager earned 200 total points
ID: 36516485
Battery-backed caching controllers are better than ones without a battery, although I have seen controllers which failed because the battery failed.

Since you have a UPS, there is little likelihood that you will have a power failure which causes data in the cache to be lost.  I would re-enable the write cache.

Check the drives to see if there is a 1.5/3.0 Gb/s jumper and if it is set correctly.
 
LVL 14

Expert Comment

by:charlestasse
ID: 36516568
Even with a UPS there is always the possibility of the server shutting down unexpectedly; many things can cause this, including power loss. It's up to you to determine if this is acceptable.
 
LVL 8

Expert Comment

by:eager
ID: 36516896
A server shutdown, such as one caused by a system crash, will not result in loss of data when using a disk cache unless there is a loss of system power to the controller. This could happen if the system power supply fails, for example, or if someone powers off the system manually.

You should evaluate to what extent you can risk a data error.  Using a UPS and RAID significantly reduces the likelihood of unrecoverable data loss, as do regular backups.  
 

Author Comment

by:stevenvel
ID: 36518167
eager,
I have re-enabled write cache and I think performance is back to normal.
I checked the drives and confirmed that no jumpers are installed.
With no jumpers installed the drives should run at 3.0 Gb/s, but for some reason one drive displays as 1.5 Gb/s on POST.
It would be nice if they both showed up and operated at 3.0 Gb/s for best performance, but I don't know how to achieve this. Even if I could, I am not sure whether this would affect (or break) the RAID 1 setup.
I am not sure if it's possible to upgrade the firmware in the drives.
I may just have to leave this in the too hard basket and work on replacing this server.
 
LVL 14

Expert Comment

by:charlestasse
ID: 36518986
This sounds like the best plan for you at this point Stevenvel.

Cheers
 
LVL 14

Expert Comment

by:charlestasse
ID: 36519000
 
LVL 8

Expert Comment

by:eager
ID: 36519289
You might run smartctl on the drives. I don't know whether the transfer speed setting is one of the available settings for your drives, but the utility should tell you.  
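
As a rough sketch of what to look for: recent smartmontools releases print a "SATA Version is:" line in the `smartctl -i` output that includes the negotiated link speed. The Python below pulls that line apart. The helper names and the `/dev/sda` device name are my own assumptions, older smartctl builds may not print the line at all, and drives sitting behind a RAID controller may not be directly visible to smartctl, so treat this as illustrative only.

```python
import re
import subprocess

# Matches a line such as:
#   SATA Version is:  SATA 2.6, 3.0 Gb/s (current: 1.5 Gb/s)
# The "(current: ...)" part is optional because not every build prints it.
SATA_LINE = re.compile(
    r"SATA Version is:\s*(?P<version>[^,]+),\s*(?P<max>[\d.]+) Gb/s"
    r"(?:\s*\(current:\s*(?P<current>[\d.]+) Gb/s\))?"
)


def parse_link_speed(info_text):
    """Return (max_gbps, current_gbps or None), or None if no SATA line exists."""
    match = SATA_LINE.search(info_text)
    if match is None:
        return None
    current = match.group("current")
    return float(match.group("max")), (float(current) if current else None)


def read_link_speed(device):
    """Run `smartctl -i` on a device (hypothetical name) and parse its output."""
    result = subprocess.run(
        ["smartctl", "-i", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for warnings too
    )
    return parse_link_speed(result.stdout)


if __name__ == "__main__":
    # Illustrative sample text, not output captured from the drives in question.
    sample = "SATA Version is:  SATA 2.6, 3.0 Gb/s (current: 1.5 Gb/s)"
    print(parse_link_speed(sample))
```

On a machine where smartctl can see the disk, `read_link_speed("/dev/sda")` would return the same kind of tuple; a current speed lower than the maximum would confirm what POST is showing.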

Seagate has firmware updates for some of their drives.  Search on their website for your model.  

Different drive speeds will not affect whether the RAID works, but might affect the performance.  
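
To put a rough number on "might affect the performance": SATA uses 8b/10b encoding, so a 1.5 Gb/s link carries about 150 MB/s of data and a 3.0 Gb/s link about 300 MB/s. The short Python sketch below does that arithmetic and compares it with a drive throughput figure. The 70 MB/s value is purely my assumption for a 7200 rpm drive of this generation, not a measured or published number for the ST3160812AS.

```python
def sata_payload_mb_per_s(line_rate_gbit):
    """Usable payload rate of a SATA link in MB/s (decimal megabytes).

    SATA uses 8b/10b encoding: every 8 data bits travel as 10 bits on the wire.
    """
    bits_per_s = line_rate_gbit * 1e9
    payload_bits_per_s = bits_per_s * 8 / 10
    return payload_bits_per_s / 8 / 1e6


def link_is_bottleneck(drive_mb_per_s, line_rate_gbit):
    """True if the drive could outrun the link at the given line rate."""
    return drive_mb_per_s > sata_payload_mb_per_s(line_rate_gbit)


# Assumption for illustration only: sustained throughput of one drive.
ASSUMED_DRIVE_MB_PER_S = 70.0

if __name__ == "__main__":
    for rate in (1.5, 3.0):
        print(
            rate, "Gb/s link:",
            sata_payload_mb_per_s(rate), "MB/s payload,",
            "bottleneck" if link_is_bottleneck(ASSUMED_DRIVE_MB_PER_S, rate)
            else "not the bottleneck",
        )
```

Under that assumption neither link speed limits sustained transfers from the platters, so the 1.5 Gb/s drive would mainly lose out on short bursts served from the drive's own buffer.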

 

Author Closing Comment

by:stevenvel
ID: 36520423
Thank you both for your comments.
