asked on

How to Size and Maintain a Transaction log (hands off)?

Little monitoring of the growth of any of the files on this SQL Server will occur. My focus right now is the transaction log of one database.

So here is a report from SSMS that shows Disk Usage on the database. And then log auto-grow events

	Total Space Reserved			324,946.94		MB						
	Data Files Space Reserved		269,809.81		MB						
	Transaction Log Space Reserved		55,137.13		MB						
											
											
											
	Data/Log Files Autogrow/Autoshrink Events										
											
	Event			Logical File Name	Start Time				Duration (ms.)	Change In Size (MB)		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:49:10 PM				6250		1,216.69		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:42:43 PM				25656		4,901.88		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:39:34 PM				24040		4,456.25		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:36:45 PM				21270		4,051.13		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:34:10 PM				18896		3,682.88		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:27:03 PM				17536		3,348.06		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:25:08 PM				15723		3,043.69		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:23:18 PM				14380		2,767.00		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:20:37 PM				13503		2,515.44		
	Log File Auto Growth	mylivedb_log	9/28/2014 10:17:40 PM				12396		2,286.75		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:26:44 PM				9683		2,078.88		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:25:04 PM				9906		1,889.88		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:22:16 PM				8606		1,718.06		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:21:05 PM				8386		1,561.88		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:18:10 PM				6800		1,419.88		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:15:36 PM				6450		1,290.81		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:14:33 PM				5900		1,173.50		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:12:44 PM				5130		1,066.81		
	Log File Auto Growth	mylivedb_log	9/7/2014 10:09:27 PM				4890		969.81		
											
	Disk Space Used by Data Files

Open in new window

The last event ran the disk out of space. Both events can be directly attributed to an index reorg/rebuild job.

Question 1 - The log file needs to be shrunk immediately. I can get further information but I need to know where to set the initial size.

Question 2 - What is the best auto-grow setting? This one is 10% which looks to be too much for the given resources.

Question 3 - The server has Ola Hallengren's scripts for Index and Database Integrity. I would like to keep them running. One thing is clear to me 5% is way to low. I am thinking something like 50% and 75% for med and high. This is a very low transaction rate database. It mostly does inserts, updates and deletes. The occasional Ad Hoc queries tax the systems resources but it is what it is.

Goal - Keep from running out of disk space, failing, data corruption for a short period of time before I can get the processes in place to perform proper maintenance.

Thanks

jogos

Q1
You can set the initial size by the alter database-command. See E. at http://msdn.microsoft.com/en-us/library/bb522469.aspx
But don't shrink your logfile before seeing the whole picture.

Q2
When a (log)file must grow then it takes time. Time that may let timeout an operation that is supposed to be superfast. That growing comes on a unpredictable moment so maybe just on the time you don't have the place or support-people availlable.
It is better to assign the maximum space you think is necessary and have the autogrow just as safety. Autogrow can also be set in MB.

Elaborating on that : http://support2.microsoft.com/kb/315512

Q3:
If you set 50% then you allow your index to permanently be 50% fragmentet. Thats a lot of loss in performance. Indexes are the key to good performance, keeping then in the best chape possible. The 5/30 meets the microsoft-suggestions. http://msdn.microsoft.com/en-us/library/ms189858.aspx

An index maintenance can take time and so is shedulled in low usage hours, but maybe it concurs with other large operations on that time for the usage of the log-space. Loocking at the timing of things can be a good idea.
Alternative something like a higher of parameters for defragmenting very high fragmented indexes in between the jobs of the normal maintenance (every night emergency-maintenance and normal every weekend). But monitoring the indexes that regularly get high fragmentation can let you learn if something can be done to prevent that index from fragmenting. Clustering on a bad value as a guid for example.

Mark Wills

You can also use SSMS to adjust size and growth.

For size, right click on the database, go into properties, then files. You can change both size and growth.
For space, right click on the database, go into tasks, then shrink, then files. You can choose either main DB or Log and it shows current use. Logs are quick, DB take time. Note there is also an option to reorganise pages.

But you shouldn't do that without first considering a few important steps first.

By the looks of the size/growth increments, it is not really a huge database (in terms of growth increments), so accuracy will be critical in determining the right sizes. More accurate to say that monitoring and housekeeping is critical (autogrowth exists for a good reason). But the frequency of growth is more telling, and given the dates, seems that maybe some kind of peak activity on a periodic basis that might need to be considered differently.

With a Full Recovery model you do need checkpoint events and BOTH Full backups and separate Transaction Log backups at regular intervals.

With fragmentation scripts, you only need to run (say) weekly with reporting running more frequently. A highly random access database, fragmentation is not quite as critical a problem and over use of defragmentation techniques can exasperate the problem.

I encourage you to read two articles here in EE about Structure (and not just for the accidental DBA) :
https://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/A_657-Managing-the-Transaction-Log-for-the-Accidental-DBA.html
https://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_691-Managing-Fragmentation-for-the-Accidental-DBA.html

If space is always critical, then you might need to relocate some files (maybe backups are there too) and/or add disk to house (say) the transaction logs as a longer term solution.

If critical, then there are a few strategies to relieve space. First step (and always first), is open a query window and explicitly run the 'checkpoint' command and do your full backup and transaction log backup. Then you can use the SSMS tasks to shrink files with reorganise pages (mainly for the transaction log). If and only if absolutely needed, (having done the first step) then change to simple recovery, and run checkpoint and full backup. If you do that, then change back to full and use SSMS to change properties for size and autogrowth - you can use a set size rather than percent (preferred for larger databases).

But first step would be to read those two articles...

phil916

ASKER

Mark,

Thank you for the very informative article. I have a greater level of confidence using the GUI to do the shrink the first time. I scripted out the command from the GUI so I could compare it to other information I have reviewed.

As for fragmentation: This a low transaction rate data warehouse type database. It is updated near real-time from the main database which just uses a different database model. The auto grow events you see are related to the exact time of the rebuild/reorganize job I have running. So it is causing log file growth right?

Mark Wills

Yes, it will be.

There are now quite a lot of activities that can impact Log files now.

You say it is a low transaction rate / DW type database, but then every insert / update / delete will add to transaction volumes...

As a thought, because it is a data warehouse type database, does that mean source data is also contained in other areas ? If so, maybe Full recovery mode isn't really needed ? If you don't have a strong case to be able to recover from transaction logs, then I would really be considering changing to Simple recovery mode...

As for fragmentation, you are really talking about the main DB and not so worried about Log (apart from what it does to the log when restructuring). So, you may need to consider row lengths and the fill factor. If there are lots of inserts, and a clustered index then it will have a lot of pointers to the next row in sequence. That can create "internal" fragmentation as opposed to external (physical) fragmentation.

How often are you attempting to defragment, and what is your basis (or goals) from defragmenting ?

Do you have clustered indexes ? Are they on a synthetic unique value (ie guid or identifier) and if so, is that the basis for access (ie indexed or sequential on that key) or just for uniqueness and have other natural indexes for access ?

We get to that point where more intimate knowledge of structure and (business) use is needed. So, maybe take stock for a moment, gather your thoughts, gather your questions (no matter what they might be), and we will see what we can do to help you with your concerns.

phil916

ASKER

Recovery mode has always been simple. This is part of the concern over the log file growth it shouldn't be so big?
A lot of what I have read about the log file is in relation to full recovery as if in simple there is not much to be concerned with.

The source system retains the data for a long time and if this SQL server database were to fail not only is there a daily backup but there is also a way to replay the transactions into the database. That is not to say that a loss event would be insignificant as the business may move more toward this database for reporting. Currently it is mainly doing one type of reporting but it is high value reporting. Meaning if the reporting is wrong or unable to be performed when it is due there is direct financial penalties.

So for fragmentation.

1. Should I worry about it? The storage is a shared SAN and I hear that since data reads are random anyway fragmentation is not much of a concern.

2. Initially I was doing weekly defragmentation with 5% triggering a reorg and 30% a rebuild. The goal is query performance improvement.

3. There are clustered indexes on all tables this is delivered like this from the vendor. No synthetic keys as you describe they better fit the definition of a natural key.

Mark Wills

Well, even with Simple there will always be transaction logging. It pretty much works the same way really (regardless of recovery mode), the only difference is how long those transactions are held in that log file.

"Simple" just allows the transaction log to mark areas as reusable when transactions complete and the truncate operation releases space from the end of the log.

There are a few operations that can 'hold' an entry (anywhere in the log) so the amount of free space at the end is somewhat arbitrary. That is where the reorganise pages is handy using SSMS.

Prior to 2008 R2, you could run a backup with 'truncate only' but this no longer works in more recent versions. Instead, you need to make sure transactions are "completed" and use the "checkpoint" command. A checkpoint is automatically issued with a backup.

So, what might cause a transaction log to keep growing (or prevent truncation) ? With "simple" the reasons are relatively few and sometimes not at all obvious (well, being apparently inexplicable, normally means you haven't found the reason). One of the biggest offenders is Replication. Another is a 'Hanging" transaction (ie did not complete), or a long running transaction (conflicting locks and/or waiting on resources).

Now, it is important to make the differentiation between truncation and releasing disk space. Truncation releases space at the end of the transaction log (within that transaction log file). Releasing disk space is releasing physical disk consumption above the allocated space.

An autogrow event actually locks up the resources until the space can be found and allocated. For that reason, even with simple recovery you want to minimise the frequency of autogrow events by having a reasonably allocated (physical) size for your transaction log. Interestingly, a backup is a shared event and doesn't have as much impact. Having said that, you still might be competing for resources if the machine is already suffering from lack of resources.

We know that even Simple recovery needs Checkpoints to help it keep its transaction log "clean".

It will automatically issue a checkpoint when approaching 70% full (transaction log), but you may still be adding transactions at the end. So, might be some reusable space, but for all intents will ultimately keep growing over time.

Best is to increase the frequency of those checkpoint events.

One method is to do more backups (database backups - cannot do transaction log backups - it throws an error for Simple Recovery).
Another is to backup to a NULL device : BACKUP DATABASE <database name> TO DISK = 'NUL:'
And yet another is to change the 'auto' checkpoint frequency : http://msdn.microsoft.com/en-us/library/ms189573.aspx

As for fragmentation, if you rebuild the clustered indexes, then the secondary indexes, those secondary indexes use the row pointers generated by the clustered index - in that sense, will be quicker to drop secondary indexes before rebuilding clustered indexes. And of course, a checkpoint issued between each iteration of table name.

Given your description, I probably would not bother with reorg at 5%, just the rebuild. Especially if adding data across the range of 'natural key' entries. Also given your description, you might not want to first drop secondary indexes, but that is your call (and maybe check with the vendor)...

Have a read of : http://msdn.microsoft.com/en-us/library/ms189858.aspx before doing anything with fragmentation (other that suspending those jobs until you have a better understanding).

phil916

ASKER

Well then that pretty much sums up the reasons for the log growth, my fault but kind of not.

The little server was chugging along with 4gb of memory and a decent disk set up but hardly best practice and certainly opportunity to do it better. Lurking were jobs running in the night grabbing large (relatively) amounts of data and pulling them over to another server.

Along comes some reporting that needs to be done pronto or money will be lost. Look the vendor even supplied queries ready to go, some configuration tweaking and you are all set. Then the first query runs, almost an hour later cancel the query and try some others, same results. You start investigating and you see all kinds of blocking and many different types of waits. OS is not happy either. Memory pressure maybe? Give it another 4gb (VM) no help. On and on same story it really sounds like memory pressure. Are you sure that the memory is not restricted on the vm side? Sure enough there is a restriction. When that is removed and some fiddling with max memory things are really good SQL server is really happy. The queries run in a few minutes an index here and an index there and wow these things are running in seconds.
Overhead not that bad.

Then there comes round 2. The vendor that made the queries has some kind of "partition" scheme sort of but not really. Wow really long query to fill up some staging tables and then you get to query those tables. Very front loaded but easier on the back end except cannot get the first part to load. Taking forever 7+ hours. Breakdown the large sproc one section at a time tune tweak. One is absolutely stubborn. Finally a query hint (had to do it) and some more tuning and we have it all soup to nuts 20 minutes.

I imagine all this time the transaction log is filling and with a 10% growth rate the exponential part kicks in. Throw in some fragmentation jobs (never was being done before) a checkdb job (had not ever been run, since installation, 3 years) and this is an accident waiting to happen.

That will teach me.

ASKER CERTIFIED SOLUTION

Mark Wills

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

phil916

ASKER

You were great! Thank you