asked on

Execute Task & Deadlock

Hi Experts,

I have one SSIS package for 7 clients. The package needs to run for all 7 at once. They all have an execute task that updates the same table. The update has a where clause with a CompanyID variable. So the task only updates the client it is being executed for. I actually update a few fields, but they are all based on CompanyID so I don't mix anything up.

This works fine when I run them one at a time. When I run all 7 at once I get frequent deadlocks. I've tried creating indexes based on the particular fields I'm updating. I'm not sure if that was the right choice. I've also tried changing the IsolationLevel for the entire package and the specific execute task that is causing the deadlock. I'm unable to find the right combination that will stop the deadlocks. Is this not possible?

How do I get this SSIS to complete without deadlocks?

Using: SQL Server 2008 R2

Scott Pletcher

You need to make sure you get the right clustered index.

How does the data in the table relate to CompanyID?
How many distinct values of CompanyID are in the table? How many total rows?

If you want to allow me to get more detailed, please replace the values in bold with your info, run the code below, and post the results. All indexes should always be considered, not one index at a time. Finally, reviewing missing index and index usage stats is the minimum to do before making index modifications.

USE [<your_db_name_here>]

SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case

DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname

--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = '<your_table_name_here>'
--SET @table_name_pattern = '%' --'%'=all tables.

PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)

IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
dps.row_count,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
user_seeks, user_scans, ca1.max_days_active, unique_compiles,
last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
CROSS APPLY (
SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
) AS ca1
LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
migs.group_handle = mig.index_group_handle
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = mid.object_id AND
dps.index_id IN (0, 1)
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID() --only current db
AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
--AND mid.object_id IN (OBJECT_ID('<table_name_1>'), OBJECT_ID('<table_name_2>'))
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF

PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name,
CASE WHEN i.name LIKE ca2.table_name + '%'
THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1 +
CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 1, 1) = '_' THEN
CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 2, 1) = '_' THEN 2 ELSE 1 END
ELSE 0 END, 200)
ELSE i.name END AS index_name,
CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
ca2.table_name,
i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
dps.row_count,
SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
fk.Reference_Count AS fk_ref_count,
DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
CROSS JOIN (
SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
SELECT OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name
) AS ca2
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal > 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
ic.key_ordinal
FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal = 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
COL_NAME(object_id, ic.column_id)
FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
WHERE
database_id = DB_ID()
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
--o.name NOT LIKE 'queue[_]%' AND
o.name NOT LIKE 'sys%'
)
--AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
--row_count DESC,
--ius.user_scans DESC,
--ius2.row_num, --user_scans&|user_seeks
db_name, table_name,
-- list clustered index first, if any, then other index(es)
CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
key_cols

PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)

Alpesh Patel

Hey,

you can use UPDLock table hint to prevent deadlocks

see below link
http://www.sqlnotes.info/2012/10/10/update-with-updlock/

jay-are

ASKER

Scott,

I created the index on my temp table based on what I have read. It didn't make any difference, I think it actually hurt performance. I definitely did not create a clustered index. I'm basically throwing changes at this package to see what works. This is my first time trying to execute a package with multiple clients simultaneously. I executed your query.

capture_date	Db_Name	row_count	Table_Name	equality_columns	inequality_columns	included_columns	user_seeks	user_scans	max_days_active	unique_compiles	last_user_seek	last_user_scan	avg_total_user_cost	avg_user_impact	system_seeks	system_scans	last_system_seek	last_system_scan	avg_total_system_cost	avg_system_impact	statement	object_id	index_handle
2014-07-30 10:38:21.693	CLOC	2127506	gldetailload	NULL	[TransactionAmount]	NULL	1	0	1	1	2014-07-30 02:45:53.213	NULL	25.5610632031296	95.04	0	0	NULL	NULL	0	0	[CLOC].[dbo].[gldetailload]	821577965	85731
2014-07-30 10:38:21.693	CLOC	2127506	gldetailload	[LastModified]	NULL	NULL	2858	0	1	5	2014-07-30 02:45:00.073	NULL	24.9434831542073	99.73	0	0	NULL	NULL	0	0	[CLOC].[dbo].[gldetailload]	821577965	22002
2014-07-30 10:38:21.693	CLOC	2127506	gldetailload	[MADate]	NULL	NULL	2858	0	1	2652	2014-07-30 02:44:59.673	NULL	27.7363369582885	98.47	0	0	NULL	NULL	0	0	[CLOC].[dbo].[gldetailload]	821577965	21998
2014-07-30 10:38:21.693	CLOC	2127506	gldetailload	[uid]	NULL	NULL	2859	0	1	32	2014-07-30 02:45:01.473	NULL	31.0946669895849	98.68	0	0	NULL	NULL	0	0	[CLOC].[dbo].[gldetailload]	821577965	22000

Open in new window

row_num	db_name	index_name	uniq?	table_name	index_id	row_count	key_cols	nonkey_cols	user_seeks	user_scans	user_lookups	user_updates	last_user_seek	last_user_scan	last_user_lookup	last_user_update	fk_ref_count	max_days_active	filegroup_name	system_seeks	system_scans	system_lookups	system_updates	last_system_seek	last_system_scan	last_system_lookup	last_system_update
15	CLOC	NULL	N.N	gldetailload	0	2127506	NULL	NULL	0	11951	3191	34599	NULL	2014-07-30 02:46:42.590	2014-07-30 02:44:59.833	2014-07-30 02:45:53.547	NULL	1	PRIMARY	0	211	0	0	NULL	2014-07-30 02:45:53.363	NULL	NULL
15	CLOC	IX_gldetailload	N.N	gldetailload	13	2127506	CompanyNumber, id, aid, uid, MADate, LastModified, TransactionAmount, TransNumber, TranSeqNumber	NULL	2874	11439	0	17641	2014-07-30 02:45:02.383	2014-07-30 02:45:53.213	NULL	2014-07-30 02:45:53.213	NULL	1	PRIMARY	0	42	0	0	NULL	2014-07-30 02:42:15.017	NULL	NULL

Open in new window

I doubt that will format correctly. Might be easier to copy/paste into excel. This loading table can have as little as 0 rows and as much as 100k rows. Depends on the client. CompanyID is just a way to identify that specific row with a client. There are 7 distinct clientIDs.

Just out of curiousity. The execute task that is causing the problem contains an expression for a few different update statements. Here:

"
update gldetailload set transactionamount = convert(money,transactionamount) where uid is null and companynumber = '" + @[User::CoID] + "';
go

update gldetailload set madate = '" + @[User::MADate]  + "' where madate is null and companynumber = '" + @[User::CoID] + "';
go

declare @EC varchar(10)
set @EC = '" + @[User::EnterpriseCode] + "'
declare @CO varchar(10)
set @CO = '" + @[User::CoID] + "'

update gldetailload set uid =  + @EC  + ':' + @CO + ':' + [accountnumber] + ':' + convert(varchar,[transactionamount]) + ':' + [transnumber] + ':' + [transeqnumber] where uid is null and companynumber = '" + @[User::CoID] + "';
go
update gldetailload set lastmodified = getdate() where lastmodified is null and companynumber = '" + @[User::CoID] + "';
go
"

Open in new window

Should I separate these into individual tasks?

ASKER CERTIFIED SOLUTION

Scott Pletcher

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

jay-are

ASKER

Scott,

I hadn't realized there was that much data in that table already. It is accounting history.

The "id" column shouldn't be in that index actually. I don't use it. I created the index in a hurry and threw in a few columns just in case. This table does not contain an identity field. The field "uid" is what I use as a unique identifier. I use the update statements to build "uid".

I truncated the table and dropped the existing index. I used your code and simply replaced "id" with "uid". I did not create a nonclustered index for "uid".

The ssis package is faster today leading me to another question. I went through around 25 of these temp/load tables yesterday creating nonclustered indexes. Then I started testing the SSIS packages. The server became a little sluggish and I was unable to right click/properties the [cloc] db. Did the indexes need time to build after I created them? Is there some way to query their current progress before I test my package out after creating the indexes you suggested?

Scott Pletcher

>> I used your code and simply replaced "id" with "uid". <<

Whoa. You do not want to do that if the value for "uid" is updated / assigned later. Simply remove the column from the index:

CREATE CLUSTERED INDEX gldetailload__CL ON gldetailload ( CompanyNumber ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON ) ON [PRIMARY]

>> Did the indexes need time to build after I created them? <<
Yes.

>> Is there some way to query their current progress before I test my package out after creating the indexes you suggested? <<
Unless you built them using "ONLINE = ON", either explicitly or implicitly, the entire table will be locked until the index build is completed.

The first index created on any table (other than, perhaps, a purely staging table) should be a clustered index. Then, if needed, add nonclustered index(es).

jay-are

ASKER

Scott,

Thank you for your help. I have made the changes and removed uid from the clustered index (drop & recreate). I also created a nonclustered index for uid.

I will run the package for all clients now to test.

Scott Pletcher

Great; let me know how it goes.

jay-are

ASKER

Scott,

The package fails on the DFT now. The error:

"Cannot insert duplicate key row in object 'dbo.gldetailload' with unique index 'gldetailload__CL'.". This occurs before the Execute task. Is this because the "CompanyNumber" field isn't unique?

jay-are

ASKER

Scott,

I removed the "Unique" check for that index. Package is running now.

jay-are

ASKER

Two of the seven clients have completed. No locks. The other are larger clients and will take another hour or so to complete.

While I wait to make sure they don't lock I will begin making these index modifications to my other load tables and testing those packages. I will report back and award points later. I appreciate the help!

Scott Pletcher

You're welcome!

jay-are

ASKER

Resolved the deadlocks!

Scott Pletcher

Fantastic!

Just to be clear, because you'll hear otherwise from many SQL "experts":

The best clustered index is the single most important factor in overall table performance.

Their preferred method instead is to automatically cluster on identity and then build (gazillion) nonclustered "covering indexes" for (virtually) every query after the fact; that is just wrong. Instead, get the clustered index right, then build the limited number of nonclustered indexes that are actually required.

jay-are

ASKER

I appreciate you taking the time to explain this to those less-informed (me)!

I need to re-evaluate my other existing indexes I suppose. :)

Execute Task &amp; Deadlock

Execute Task & Deadlock