Data and log devices on the separate disks ?

I know, that it's recommended to put data and log devices on separate disks. Some say it's for security reasons, others say it's for performance.

But, is this "old truth" still valid for external redundant disk arrays ?

With proper point-to-point architecture, there should be no reason for decreased performance with all the devices placed on the same RAID group. And for security - I don't see the advantage of separate disks - if one of them fails, you have to use your backups, so there's no difference, whether you still have half of your devices available.

If I have 8 disks, is it better to create 2 separate RAID10 groups, each with 4 disks and put data on one and log on second ? Or is it better to create one big RAID10 with 8 disks and put everything on it. AFAIK RAID10 on 8 disks should be approximately 2 times quicker on both read and write operations than RAID10 on 4 disks.

So, would you still recommend to place data and log devices on separate RAID groups on external redundant disk array (like Dell/EMC CX3-20 or IBM DS4700) ?
LVL 14
Jan FranekAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Another reason for putting the data and log on seperate devices is up to the minute recover.

If your data and log are on separate devices and the device that has you data goes bad, you can still backup the existing transaction log and get the data that is in the log so you can gt up to the minute recovery.
Jan FranekAuthor Commented:
Hi rboyd56,

thanks for input. I haven't experienced disk subsystem crash yet, that would left log devices functional while data devices destroyed. Is it really possible to dump transaction log in such a situation ? I would expect database server to stop working immediately.

However, in my scenario this case is covered by synchronous mirror of all the data on disk array to remote site. So in case of any disk array failure we would switch to backup remote site disk array and we should be able to continue without need for restoring data from backup. So I believe, that "security reasons" are covered.
Joe WoodhousePrincipal ConsultantCommented:
It is totally possible to dump the log even when the data devices are dead. "Dump tran [database] with no_truncate".

The logs are actually more important than the data.
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

I'm a performance kind of guy.  I strongly believe in seperating database and log files for performance reasons.  There's recoverability reasons as well, but that's dependent on your company's backup strategy and disaster recoverability needs.  In corporate environments, recoverability and data integrity is far more important than speed (i.e. I'd rather be slow and recoverable than fast and destroyed).

Anyway, lets try a simple example configuration:

HDS AMS 1000 (or HP EVA, or EMC [expensive metal cabinet], or STK, or 3Ware, or ...)
  Raid Group 0 (RG0) is a RAID5 (7D+1P)
  Raid Group 1 (RG1) is a RAID10 (4D+4P)

Carve out a lun on RG0, say 500GB.
Carve out a lun on RG1, say 100GB.

Database on RG0, logs on RG1.

Logs are 100% sequential writes during normal operations.  Placing them on RAID10 parity disks allows for very fast writes.  There's no parity calculations, just duplicate writes.

Database is a mix a reads and writes, and while a RAID10 parity disk for your database would be great, it's not cost effective for the performance gain due to the read/write I/O mix and random disk I/O.  Yes there's a parity calculation penality, but this is offset by dedicated parity generation hardware in most redundant arrays.  It will never be as fast for writes as RAID10, but read performance will be better, not to mention being far cheaper than the equivalent RAID10 array.

Keep in mind that each database update transaction is a sequence of three dependent writes.  They can't be queued, they must be performed in order, and they must all succeed.  Each update writes to the logs file (identifying the records to update), updates the main database, and updates the logs file again to show a successful update.  So for each database update, you're going to hit the log files twice.  That's why having logs on RAID10 for fast writes is important.  A slow log file/lun/raid group can bring the database to a crawl.

And it gets better...

External redundant arrays are smart controllers.  They're typically cache based controllers, so your disk writes are acknowledged to the host as soon as they enter the cache on the array.  This gives <2ms response times for I/O as long as there's cache space available.  As long as the pending disk writes can be written from cache to disk faster than the write I/Os from the host to the controller's cache, you'll get absolutely fantastic disk write performance.

So you can badly misconfigure an external raid array and not know it until it gets under heavy enough log to slow down incoming host I/O to backend disk I/O speeds due to cache being flooded with disk writes.  In other words, you can stick your database and logs on the same filesystems, LUN, raid group and the raid array's cache will "hide" performance problems until it comes under a real workload.

That said, I can tell you that some people argue for placing logs on RAID5 volumes instead of RAID10.  Until you place the database under a heavy write load, the controller's cache hides the performance advantage of putting logs on RAID10 volumes.  So it may be quite acceptable to use RAID5 for logs if the database isn't heavily utilized on a regular basis.

And finally, we use the ShadowImage (in-array mirroring capability) of the HDS AMS1000 to copy the log luns.  The transaction logs are then rolled forward against the development copy of the database.  It's easier to snapshot a 100GB lun than a 500GB lun because its smaller, it's a log file so we can take a snapshot at any time, it copies faster (underlying source/target LUN is on RAID10), and we can mirror the log luns to keep a rolling backup up to date.
Joe WoodhousePrincipal ConsultantCommented:

Transaction logs on RAID 5 is usually a bad idea, I think. Logs get heavy writes, usually sequential, and my experience has tended to show RAID 5 imposes a write performance penalty of anywhere from 20-50%.

For the same reasons tempdb should also not be on RAID 5. Tempdb is usually the busiest database!
Jan FranekAuthor Commented:
OK, in fact, I have 10 disks for my database server. And I see this possibilities:

1. 5D+5P RAID10 and I'll have 5D capacity and cca 5x the speed of single disk
2. 8D+2P RAID50 (strip over two 4D+1P RAID5) and I'll have 8D capacity and cca 4xthe speed of single disk
3. 2D+2P RAID10 for log and 5D+1P RAID5 for data and I'll have 2D capacity and 2x speed for log and 5D capacity and 2.5x speed for data.

I have done some tests on our internal RAID and it seems, that write speed for RAID5 N+1 is approximately N/2 x write speed of one disk. It's pretty in line with my understanding how RAID5 write works (read blocks from all disks, update, calucate parity, write blocks back to disks).

So I wonder, whether it's better to go for solution with higher write speed but data and log stored on the same RAID group or to separate data and log but on slower RAID groups.

BTW - I have always believed, that RAID1 has higher write speed than RAID5, so on our current production server logs are stored on RAID1 (1D+1P) and data on RAID5 (4D+1P). Lately I did some disk performance tests (using xdd) and I was quite surprised, that our RAID5 is 2x quicker for both reads and writes that our RAID1 (in cases when cache can't handle whole transfer and there are really physical disk reads and writes).
Jan FranekAuthor Commented:
Joe, so you are saying, that if I destroy my data disks, ASE will still keep running so I can use DUMP TRANSACTION command ? Of course, I believe you, I'm just surprised :-)
Joe WoodhousePrincipal ConsultantCommented:

I have been in a genuine real-world situation where some raw partitions were lost, which included a number of data devices. Of course ASE wasn't happy about it (many 605 errors), but so long as master and tempdb are still up, you do still have a functional dataserver.

That was enough to do the dump tran with no_truncate for the up-to-the-minute transaction dump of all user databases. We then rebuilt all disks, rebuilt the databases, loaded all scheduled backups and then the post-crash ones. The system was down for half a day but we recovered up to the second (well, all committed transactions up to the second) of the crash.

I'm now a believer. 8-)

Since some failures can be at the LUN level, it's good to have more than one LUN. Ditto volume groups. And a few of us have now said, the performance vs availability characteristics of the RAID levels are qute different. I would never put tempdb on anything other than RAID 0, 0+1 or 10. RAID 5 is asking for trouble.

I'm also not surprised that RAID 5 outperformed RAID 1, which after all still has a single disk-head performing the write on each side of the mirror. Even with the redundant parity, RAID 5 is distributing the write. I do expect a properly configured RAID 5 to outperform everything except 0+1 or 10. 10 is unquestionably the best; 5 is a good compromise when you can't afford the number of disks to put everything on 10.

Perfect scenario: 3 x RAID 10 volume groups, one each for data, log and tempdb.
Very achieveable and still good: 1 x RAID 10 for tempdb, 1 x RAID 10 for log, 1 x RAID 5 for data.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Jan FranekAuthor Commented:
Hi Joe, it's very interesting. And good to know, that I should move master database out of data disk :-)

However, in my case synchronous mirror to backup disk array should deal with any primary disk array failure, so I hope, I'll be back online in few minutes without need for restoring data from backups.

So, it seems to me, that I can optimize for speed and put everything on the same RAID group.
Joe WoodhousePrincipal ConsultantCommented:
From ASE's point of view, yes, in your scenario I don't think you gain anything by having multiple RAID groups or not.

I'd want someone who knows more about the O/S and the storage hardware to confirm there are no issues at those levels. I don't know enough about those areas to competently advise.

Remember, too, a backup array is less helpful than you think unless you have a fully redundant path there - a second controller and a second channel. I *have* seen both of these fail in production incidents.
Jan FranekAuthor Commented:
I think, I'll have fully redundant solution - two HBA in all servers, two optical switches, two controllers in disk array, each disk connected to both controllers and the same setup in backup site.
Yesterday I have read some white paper from EMC about best practices and there was one point - they say it's better not to mix sequential and random I/O. AFAIK log writes are mostly sequential and data reads would be mostly random. But with several ASE servers connected to disk array these log writes will get probably "randomized".

It's interesting that most papers compare RAID types of the same capacity. I believe, that in real world you have some budget, so you know how many drives you can afford, so comparison based on the same number of disks would probably be more useful.
Joe WoodhousePrincipal ConsultantCommented:
Agree comparison on number of disks is more useful. Those pesky disk vendors! 8-)
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.