I'm Confused on I/O and logs issue-Real experts answer required

Hello Experts,

First of i apologies to all my doubts in one question. I'm putting all in one b'cos it all relate to performances .As you all know , performance needs to monitor from all kind of metrics.That's reason I'm clubbing my question.
Kindly bare with me.

[i]The following commands can be used to determine the number of non-replicated vs. replicated transactions.  A transaction log with a high percentage of non-replicated transaction will cause latency as the Log Reader scans over transaction to be ignored.  These can be database maintenance transactions such as an online reinidex or data modifications to table not being replicated.  A 25gig transaction log may take up to 15 minutes to scan depending on the IO subsystem.
http://blogs.msdn.com/b/chrissk/archive/2009/05/25/transactional-replication-conversations.aspx
Referncing above articles my questin are:
[/i]

Open in new window


1.

1. fn_dblog ->When i run it's returns me approx 2500 record.
a.What does it mean, is log under stress?
b.is there any standard baseline value?
C:How intrepret and read those values?
d:Purpose of this function


[i]SELECT DB_NAME(database_id) AS [Database Name] , file_id , io_stall_read_ms , num_of_reads , CAST(io_stall_read_ms / ( 1.0 + num_of_reads ) AS NUMERIC(10, 1)) AS 

[avg_read_stall_ms] , io_stall_write_ms , num_of_writes ,
CAST(io_stall_write_ms / ( 1.0 + num_of_writes ) AS NUMERIC(10, 1)) AS [avg_write_stall_ms] , io_stall_read_ms + io_stall_write_ms AS [io_stalls] , num_of_reads + 

num_of_writes AS [total_io] , CAST(( io_stall_read_ms + io_stall_write_ms ) / ( 1.0 + num_of_reads
+ num_of_writes) AS NUMERIC(10,1)) AS [avg_io_stall_ms]FROM sys.dm_io_virtual_file_stats(NULL, NULL)ORDER BY avg_io_stall_ms DESC ;
[/i]

Open in new window

IN TEST ENV- When i run above query avg.write for some file shows 1500 ms and read 25 ms
and for some file it shows avg read as 300ms and write 25 ms.

In production -avg.write for some file shows 300 ms and read 25 ms
and for some file it shows avg read as 30ms and write 250 ms,
My doubt is,as per microsoft suggest avg I/O shouldn't be morethan 20 ms.But in the production all components are working fine.But in the test env there is performance lag.

Am i investigating in the right way?What value for avg io stall is acceptable for each corresponding file?



[i]My top waits 
[/i]wait_type	wait_time_s	Pert%
WRITELOG	163710.2	29.63
PAGELATCH_EX	78702.41	14.25
SOS_SCHEDULER_YIELD	56319.84	10.19
CXPACKET	55898.42	10.12
BACKUPBUFFER	38336.53	6.94
ASYNC_NETWORK_IO	31898.53	5.77
PAGEIOLATCH_SH	18993.78	3.44
PAGEIOLATCH_EX	14538.09	2.63
OLEDB	13280.5	2.4
LCK_M_IX	12883.34	2.33
LATCH_EX	8873.14	1.61
LCK_M_X	8385.48	1.52
IO_COMPLETION	8056.89	1.46
LCK_M_U	7628.39	1.38
PAGEIOLATCH_SH	7099.39	1.29
ASYNC_IO_COMPLETION	6899.53	1.25

Open in new window

30% waits are on writelog.What should I do to bring better?or reduce this wait.
My recovery internal time is 0.Should need to increase it?
LVL 5
VIVEKANANDHAN_PERIASAMYAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Marten RuneSQL Expert/Infrastructure ArchitectCommented:
1.
a.What does it mean, is log under stress?
This we/and you can't say without a baseline. Though 2500 records doesn't sound like its a lot.

b.is there any standard baseline value?
No, thats the thing about a baseline. It's for your system. And you create measurements, from there you can determine whats "normal" for your system, on your hardware, set up in good practices on your company etc

C:How intrepret and read those values?
Same answer as b

d:Purpose of this function
This link is Very good:
http://www.sqlskills.com/blogs/paul/using-fn_dblog-fn_dump_dblog-and-restoring-with-stopbeforemark-to-an-lsn/

--------------------------------
IN TEST ENV- When i run above query avg.write for some file shows 1500 ms and read 25 ms
and for some file it shows avg read as 300ms and write 25 ms.

In production -avg.write for some file shows 300 ms and read 25 ms
and for some file it shows avg read as 30ms and write 250 ms,
My doubt is,as per microsoft suggest avg I/O shouldn't be morethan 20 ms.But in the production all components are working fine.But in the test env there is performance lag.

Am i investigating in the right way?What value for avg io stall is acceptable for each corresponding file?

Well, you cant just stare at one metric, and from there determine the cause. Those values is ALWAYS a rule of thumb, and differ in real life. a good link:
http://sqlmonitormetrics.red-gate.com/average-io-stalls/

-----------------------------------------------------
30% waits are on writelog.What should I do to bring better?or reduce this wait.
My recovery internal time is 0.Should need to increase it?

I recon youre trying to reduce the waits to 0
Not possible, historically you have always been waiting for something.
And this query will sum up them to 100% and show you ordered by percentage wait time.
Now what waits you have is the key, look them up at for example http://sqlpedia.com/
and read up on what they really mean.
writelog for example is logflushes, checkpoints and so forth.
So if your system is performing good, well take this values into a baseline, the CHANGE when something goes bad is your key. But remember, this is the aggregated times since last SQL Server Service start. So if it's been running for a year, and you get problems with a category, it might not show, since the aggregated data is so massive.

The above are steps in order to understanding theis systems load. Depending on the application, the quality on the programmers, etc etc. It's going to behave in one way. A different system behaves a different way. Baseline is the way to understand youre respective systems. good luck

Regards Marten
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2005

From novice to tech pro — start learning today.