Solved

Data and log devices on separate disks?

Posted on 2007-04-02
1,286 Views
Last Modified: 2013-11-14
I know it's recommended to put data and log devices on separate disks. Some say it's for security reasons, others say it's for performance.

But is this "old truth" still valid for external redundant disk arrays?

With a proper point-to-point architecture, there should be no reason for decreased performance with all the devices placed on the same RAID group. And as for security, I don't see the advantage of separate disks: if one of them fails, you have to use your backups, so it makes no difference whether you still have half of your devices available.

If I have 8 disks, is it better to create 2 separate RAID10 groups of 4 disks each and put data on one and the log on the other? Or is it better to create one big RAID10 with all 8 disks and put everything on it? AFAIK RAID10 on 8 disks should be approximately twice as fast for both read and write operations as RAID10 on 4 disks.

So, would you still recommend placing data and log devices on separate RAID groups on an external redundant disk array (like a Dell/EMC CX3-20 or IBM DS4700)?
Question by:Jan_Franek
12 Comments
 
LVL 16

Assisted Solution

by:rboyd56
rboyd56 earned 100 total points
ID: 18837022
Another reason for putting the data and log on separate devices is up-to-the-minute recovery.

If your data and log are on separate devices and the device that holds your data goes bad, you can still back up the existing transaction log and recover the transactions it contains, so you get up-to-the-minute recovery.
 
LVL 14

Author Comment

by:Jan_Franek
ID: 18842441
Hi rboyd56,

thanks for the input. I haven't yet experienced a disk subsystem crash that left the log devices functional while the data devices were destroyed. Is it really possible to dump the transaction log in such a situation? I would expect the database server to stop working immediately.

However, in my scenario this case is covered by a synchronous mirror of all the data on the disk array to a remote site. In case of any disk array failure we would switch to the backup disk array at the remote site and should be able to continue without restoring data from backup. So I believe the "security reasons" are covered.
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 18847796
It is totally possible to dump the log even when the data devices are dead. "Dump tran [database] with no_truncate".

The logs are actually more important than the data.
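
A minimal sketch of that, with placeholder database and file names:

  -- data devices are gone, but master and the log device are still readable
  dump transaction mydb to "/dumps/mydb_tran_final.dmp" with no_truncate
  go

As far as I know, with no_truncate ASE locates the log via master rather than the database's data pages, which is why the dump can still work when the data devices are unreadable.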
 
LVL 3

Assisted Solution

by:VXDguy
VXDguy earned 150 total points
ID: 18848005
I'm a performance kind of guy.  I strongly believe in separating database and log files for performance reasons.  There are recoverability reasons as well, but those depend on your company's backup strategy and disaster recoverability needs.  In corporate environments, recoverability and data integrity are far more important than speed (i.e. I'd rather be slow and recoverable than fast and destroyed).

Anyway, let's try a simple example configuration:

HDS AMS 1000 (or HP EVA, or EMC [expensive metal cabinet], or STK, or 3Ware, or ...)
  Raid Group 0 (RG0) is a RAID5 (7D+1P)
  Raid Group 1 (RG1) is a RAID10 (4D+4P)

Carve out a lun on RG0, say 500GB.
Carve out a lun on RG1, say 100GB.

Database on RG0, logs on RG1.

Logs are 100% sequential writes during normal operations.  Placing them on a RAID10 group allows for very fast writes.  There are no parity calculations, just duplicate writes.

The database is a mix of reads and writes, and while RAID10 for your database would be great, it's not cost effective for the performance gain, given the read/write I/O mix and the random disk I/O.  Yes, there's a parity calculation penalty on RAID5, but this is offset by dedicated parity generation hardware in most redundant arrays.  It will never be as fast for writes as RAID10, but read performance will still be good, not to mention being far cheaper than the equivalent RAID10 array.

Keep in mind that each database update transaction is a sequence of three dependent writes.  They can't be queued; they must be performed in order, and they must all succeed.  Each update writes to the log file (identifying the records to update), updates the main database, and writes to the log file again to record the successful update.  So for each database update, you're going to hit the log file twice.  That's why having the logs on RAID10 for fast writes is important.  A slow log file/LUN/RAID group can bring the database to a crawl.

And it gets better...

External redundant arrays are smart controllers.  They're typically cache based controllers, so your disk writes are acknowledged to the host as soon as they enter the cache on the array.  This gives <2ms response times for I/O as long as there's cache space available.  As long as the pending disk writes can be written from cache to disk faster than the write I/Os from the host to the controller's cache, you'll get absolutely fantastic disk write performance.

So you can badly misconfigure an external RAID array and not know it until it comes under a heavy enough load that incoming host I/O slows to back-end disk I/O speeds because the cache is flooded with pending writes.  In other words, you can stick your database and logs on the same filesystem, LUN and RAID group, and the array's cache will "hide" the performance problems until it comes under a real workload.

That said, I can tell you that some people argue for placing logs on RAID5 volumes instead of RAID10.  Until you place the database under a heavy write load, the controller's cache hides the performance advantage of putting logs on RAID10 volumes.  So it may be quite acceptable to use RAID5 for logs if the database isn't heavily utilized on a regular basis.

And finally, we use the ShadowImage (in-array mirroring) capability of the HDS AMS1000 to copy the log LUNs.  The transaction logs are then rolled forward against the development copy of the database.  It's easier to snapshot a 100GB LUN than a 500GB LUN because it's smaller, it's a log so we can take a snapshot at any time, it copies faster (the underlying source/target LUN is on RAID10), and we can mirror the log LUNs to keep a rolling backup up to date.
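
If you did the equivalent with ASE transaction dumps instead of array-level copies, the roll-forward would look roughly like this (placeholder names; it assumes the development copy was loaded from a database dump and hasn't been brought online yet):

  -- apply a transaction dump to the development copy
  load transaction devdb from "/dumps/proddb_tran_0930.dmp"
  go
  -- once the last log has been applied, bring the copy online
  online database devdb
  go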
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 18848288
VXDguy,

Transaction logs on RAID 5 are usually a bad idea, I think. Logs get heavy writes, usually sequential, and my experience has tended to show RAID 5 imposes a write performance penalty of anywhere from 20-50%.

For the same reasons tempdb should also not be on RAID 5. Tempdb is usually the busiest database!
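
If you have a RAID10 group to spare, extending tempdb onto a device carved from it is straightforward. A rough sketch with placeholder names and sizes (size units and whether you need vdevno depend on your ASE version):

  -- create an ASE device on the RAID10 LUN and extend tempdb onto it
  disk init name = "tempdb_dev1", physname = "/dev/rdsk/tempdb_lun", size = "2G"
  go
  -- size in MB; tempdb's original fragment on the master device stays put
  alter database tempdb on tempdb_dev1 = 2048
  go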
 
LVL 14

Author Comment

by:Jan_Franek
ID: 18848923
OK, in fact I have 10 disks for my database server, and I see these possibilities:

1. 5D+5P RAID10 - I'll have 5D capacity and approx. 5x the speed of a single disk
2. 8D+2P RAID50 (a stripe over two 4D+1P RAID5 groups) - I'll have 8D capacity and approx. 4x the speed of a single disk
3. 2D+2P RAID10 for log and 5D+1P RAID5 for data - I'll have 2D capacity and 2x speed for the log, and 5D capacity and 2.5x speed for the data

I have done some tests on our internal RAID and it seems that the write speed of a RAID5 N+1 group is approximately N/2 times the write speed of one disk (so with N=4, about 2x a single disk). That's pretty much in line with my understanding of how a RAID5 write works (read blocks from the disks, update, calculate parity, write the blocks back).

So I wonder whether it's better to go for the solution with higher write speed but data and log on the same RAID group, or to separate data and log onto slower RAID groups.

BTW - I have always believed that RAID1 has a higher write speed than RAID5, so on our current production server the logs are stored on RAID1 (1D+1P) and the data on RAID5 (4D+1P). Lately I did some disk performance tests (using xdd) and was quite surprised that our RAID5 is 2x quicker for both reads and writes than our RAID1 (in cases where the cache can't absorb the whole transfer and there are real physical disk reads and writes).
 
LVL 14

Author Comment

by:Jan_Franek
ID: 18848929
Joe, so you are saying that if I destroy my data disks, ASE will still keep running so I can use the DUMP TRANSACTION command? Of course I believe you, I'm just surprised :-)
 
LVL 24

Accepted Solution

by:
Joe Woodhouse earned 250 total points
ID: 18857592
Jan,

I have been in a genuine real-world situation where some raw partitions were lost, which included a number of data devices. Of course ASE wasn't happy about it (many 605 errors), but so long as master and tempdb are still up, you do still have a functional dataserver.

That was enough to do the dump tran with no_truncate for the up-to-the-minute transaction dump of all user databases. We then rebuilt all disks, rebuilt the databases, loaded all scheduled backups and then the post-crash ones. The system was down for half a day but we recovered up to the second (well, all committed transactions up to the second) of the crash.
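
Per database, the load sequence looks roughly like this (placeholder names; in practice there is one such sequence per user database and usually more log dumps):

  -- after the devices are rebuilt and the database re-created for load
  load database mydb from "/dumps/mydb_full.dmp"
  go
  load transaction mydb from "/dumps/mydb_tran_1.dmp"
  go
  -- the post-crash dump taken with no_truncate goes last
  load transaction mydb from "/dumps/mydb_tran_final.dmp"
  go
  online database mydb
  go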

I'm now a believer. 8-)

Since some failures can be at the LUN level, it's good to have more than one LUN. Ditto volume groups. And as a few of us have now said, the performance vs availability characteristics of the RAID levels are quite different. I would never put tempdb on anything other than RAID 0, 0+1 or 10. RAID 5 is asking for trouble.

I'm also not surprised that RAID 5 outperformed RAID 1, which after all still has a single disk-head performing the write on each side of the mirror. Even with the redundant parity, RAID 5 is distributing the write. I do expect a properly configured RAID 5 to outperform everything except 0+1 or 10. 10 is unquestionably the best; 5 is a good compromise when you can't afford the number of disks to put everything on 10.

Perfect scenario: 3 x RAID 10 volume groups, one each for data, log and tempdb.
Very achievable and still good: 1 x RAID 10 for tempdb, 1 x RAID 10 for log, 1 x RAID 5 for data.
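
Whichever layout you pick, the separation only helps if the ASE devices actually sit on the different LUNs, along these lines (device names, paths and sizes are placeholders; exact disk init syntax varies a little between ASE versions):

  -- one device on the data LUN, one on the log LUN
  disk init name = "data_dev1", physname = "/dev/rdsk/data_lun1", size = "4G"
  go
  disk init name = "log_dev1", physname = "/dev/rdsk/log_lun1", size = "1G"
  go
  -- sizes in MB; the log gets its own device via the "log on" clause
  create database mydb on data_dev1 = 4096 log on log_dev1 = 1024
  go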
 
LVL 14

Author Comment

by:Jan_Franek
ID: 18858015
Hi Joe, that's very interesting. And good to know that I should move the master database off the data disks :-)

However, in my case the synchronous mirror to the backup disk array should deal with any primary disk array failure, so I hope I'll be back online in a few minutes without needing to restore data from backups.

So it seems to me that I can optimize for speed and put everything on the same RAID group.
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 18861760
From ASE's point of view, yes - in your scenario I don't think you gain anything by having multiple RAID groups.

I'd want someone who knows more about the O/S and the storage hardware to confirm there are no issues at those levels. I don't know enough about those areas to competently advise.

Remember, too, a backup array is less helpful than you think unless you have a fully redundant path there - a second controller and a second channel. I *have* seen both of these fail in production incidents.
 
LVL 14

Author Comment

by:Jan_Franek
ID: 18862709
I think I'll have a fully redundant solution - two HBAs in each server, two optical switches, two controllers in the disk array, each disk connected to both controllers, and the same setup at the backup site.
Yesterday I read a white paper from EMC about best practices, and one point was that it's better not to mix sequential and random I/O. AFAIK log writes are mostly sequential and data reads are mostly random. But with several ASE servers connected to the disk array, these log writes will probably get "randomized" anyway.

It's interesting that most papers compare RAID types of the same capacity. I believe that in the real world you have a budget, so you know how many drives you can afford, and a comparison based on the same number of disks would probably be more useful.
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 18888486
Agreed, a comparison based on the number of disks is more useful. Those pesky disk vendors! 8-)