Solved

Getting Wait type: Pageiolatch_sh and CXPACKET when I attempt to UPDATE a table with 200 million records. (500 Points)

Posted on 2004-09-01
5
1,953 Views
Last Modified: 2012-06-21
I have a table that has about 200 million records in a database that is about 300 GB. The database is on a server with 4 CPUs.
I also have a Stored Procedure that is to clean the data in one of the columns of the table. When I run the SP shown below via Query Analyzer I get four processes with the same spid in current activity with waits as shown below and the processes go into the "sleeping" mode:

Wait time: 235
Wait type: Pageiolatch_sh
Wait resource: 8:4:2693304
CPU: 189938
Physical I/O: 682059
Memory Usage: 0

I also get the CXPACKET wait type with a wait time of 317016.
*********************************************************************************

Below is my update Stored Procedure:

CREATE PROCEDURE dbo.usp_CleanMyLog
AS
BEGIN
SET NOCOUNT ON
DECLARE @MinCount int,@MinLogID int, @MaxCount int, @MaxLogID int

SELECT @MinLogID = MIN(tblMy_Log_ID), @MaxLogID = MAX(tblMy_Log_ID) FROM dbo.tblMy_Log

--SELECT @MinLogID As MinLogID, @MaxLogID As MaxLogID

SELECT @MinCount = @MinLogID , @MaxCount = @MinLogID + 100000

WHILE @MaxCount < @MaxLogID
BEGIN

UPDATE tblMy_Log
SET Description = CASE
WHEN PATINDEX('Name Changed From%',Description)<> 0 THEN 'Name Changed'
WHEN PATINDEX('Social Security Number Changed From%',Description)<> 0 THEN 'Social Security Number Changed'
WHEN PATINDEX('Date of Birth Changed From%',Description)<> 0 THEN 'Date of Birth Changed'
WHEN PATINDEX('Address Changed From%',Description)<> 0 THEN 'Address Changed'
WHEN PATINDEX('Home Phone Number Changed From%',Description)<> 0 THEN 'Home Phone Number Changed'
WHEN PATINDEX('Work Phone Number Changed From%',Description)<> 0 THEN 'Work Phone Number Changed'
WHEN PATINDEX('Fax Phone Number Changed From%',Description)<> 0 THEN 'Fax Phone Number Changed'
WHEN PATINDEX('Email Address Changed From%',Description)<> 0 THEN 'Email Address Changed'
ELSE Description
END
WHERE Activity_Type = 'UPDATE: Customer' AND tblMy_Log_ID BETWEEN @MinCount AND @MaxCount

PRINT 'Activity with Log_ID BETWEEN ' +convert(varchar(30),@MinCount)+ '  AND ' +convert(varchar(30),@MaxCount)

      SELECT @MinCount = @MaxCount + 1 , @MaxCount = @MaxCount + 100000

      IF @MaxCount >= @MaxLogID
            SELECT @MaxCount = @MaxLogID

END
SET NOCOUNT OFF
END
GO

********************************************************************
I have done the following:

sp_configure 'show advanced option', 1

sp_configure 'max degree of parallelism', 1 to disable parallelism but this did not help

What can I do to prevent the waits? Recommendations to the improvement of my SP is welcome.
0
Comment
Question by:DeMyu
  • 2
5 Comments
 
LVL 18

Accepted Solution

by:
ShogunWade earned 250 total points
ID: 11951835
Not a lot but vastly scaling up your kit.   It is waiting for a light weight lock.
depending on other tasks, you could TABLOCK the table which reduces the numer of writes and reads required from syslock table.     often vastly improving performance.    but of course you will then block everything else wanting acess to this table.
0
 
LVL 34

Assisted Solution

by:arbert
arbert earned 250 total points
ID: 11955374
CXpacket usually comes from parallel plans--one part of your task is waiting on the results from another thread in the task.....I think the above waits are pretty normal for a large update with parallelism.

You might make sure you have the LOGS and the data files on different DISKS/RAID Groups and I would check windows Perfmon and see what your IO waits look like--IO waits are probably your biggest issue.....

Is your update querying using an index on tblmy_log_id?
0
 
LVL 18

Expert Comment

by:ShogunWade
ID: 12158479
agree
0

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Slow Connectivity over ODBC 8 35
Webservices in T-SQL 3 31
Sql server, import complete table, using vb.net 9 34
Show Results for Latest DateTime in a View 27 25
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
I have a large data set and a SSIS package. How can I load this file in multi threading?
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question