Hello Experts,

We doing the stress test where replication hits about 22 minutes of latency. The threshold limits is 15 mins. How to analysis and find the Root cause.

Steps taken.

Ran below query and find the output details in the attachment. I personally feel there is I/O issue. Avg.disk queue lenght is about 40. But when we escalated to SAN team, they said this is normal. But in internet blogs says ,avg.disk queue length shouldn't be morethan 2 .
Please help me out of trouble.

SELECT  r.session_id ,
        r.[status] ,
        r.wait_type ,
        r.scheduler_id ,
        SUBSTRING(qt.[text], r.statement_start_offset / 2,
            ( CASE WHEN r.statement_end_offset = -1
                   THEN LEN(CONVERT(NVARCHAR(MAX), qt.[text])) * 2
                   ELSE r.statement_end_offset
              END - r.statement_start_offset ) / 2) AS [statement_executing] ,
        DB_NAME(qt.[dbid]) AS [DatabaseName] ,
        OBJECT_NAME(qt.objectid) AS [ObjectName] ,
        r.cpu_time ,
        r.total_elapsed_time ,
        r.reads ,
        r.writes ,
        r.logical_reads ,
FROM    sys.dm_exec_requests AS r
        CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS qt
WHERE   r.session_id > 50
ORDER BY r.scheduler_id

Open in new window

My question :
Database has around 20 filegroups and each has 1/2 files in it.
How find which files is causing this issue.

Most of time the time writelog thread goes into suspended state.What does it mean?
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Ryan McCauleyEnterprise Analytics ManagerCommented:
For starters, a disk queue length of 40 is really concerning - If I saw anything over about 5, I'd start to get concerned. Can you tell what the translates into as far as disk wait time (in ms)? A queue of 40 is really long, and your SAN should be keeping things moving much more quickly than that, but if you're seeing really low latency and high throughput anyways, that may not be the root cause of the lag you're seeing, though I suspect it is.

Have you run a non-replication stress test on the storage? Is the storage performing well under normal usage, but just lagging excessively under replication load, or is it just that the replication lag actually puts a number (in minutes of lag) to slowness you've been seeing all along?

I use a tool called SQLIO to stress test the storage, as it simulates SQL-esque disk activity and gives you real throughput, latency, and Iops numbers that you can use to compare things. You can read about it and download it here:


or you can use Powershell to import/analyze the output if you want:


Once you download and install that tool, you can use the script I've attached, which is what I use when testing (change the extension to BAT from TXT) - it runs for about 45 minutes and performs a pretty wide array of tests (each for 15 seconds), and should give you a good overall feel for your performance. If you're seeing excessive latency, iops in the hundreds, or MB/sec in the single-digits, I think it's time to turn it back over to your storage team as that's definitely not normal.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
How many replication are setup? Is the distributor server different or the publisher itself is acting as a distributor? What kind of replication is it and are all the subscription agents of push type? How many different disks are there?
How many replication are setup? 2 transactional replication
 Is the distributor server different or the publisher itself is acting as a distributor? -both lies on the same server
What kind of replication is it and are all the subscription agents of push type? push
How many different disks are there? 12 disk
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2005

From novice to tech pro — start learning today.