Solved

updating one column on 18 million record table

Posted on 2011-09-23
9
209 Views
Last Modified: 2012-05-12
there is an update on a column and the inner join is between two table. 18M records are being affected.the query is running for 1 hour and log has grown from 1.5 GB to 16 GB.. there is 20 GB more available on the disk.. so it is OK for now..

theh recovery is simple

is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
0
Comment
Question by:25112
9 Comments
 
LVL 75

Accepted Solution

by:
Aneesh Retnakaran earned 125 total points
ID: 36590678
space seems to be fine
you can try shrinking the logfile from another query window, this may slow down your update

DBCC SHRINKFILE(logName)

>is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
no, you need to do these kinds of perations as small batches. like updating 10K records at a time
0
 
LVL 5

Author Comment

by:25112
ID: 36590829
>>no, you need to do these kinds of perations as small batches. like updating 10K records at a time
10K records - you mean manually? it will be troublesome with 18M records.. how do we track which ones have been updated yet or not etc

the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 250 total points
ID: 36590874
This will update 10000 rows that havn't been updated yet.  Repeat until no rows are updated.


update a set col1 = b.col1
from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
inner join
(
select top 10000 from tablea a inner join tableb b on a.id = b.id  and a.col2='value' and b.col2='value'
  where a.col2='value' <>  b.col2='value'
) as c on c.id = a.id  






  where b.col1 <> b.cola
0
 
LVL 42

Expert Comment

by:dqmq
ID: 36590875
Pls ignore last line.  
0
Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 125 total points
ID: 36590906
>>the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value' <<
Are the appropriate columns indexed.  If they are not It could take you several days to complete.  Once you have it indexed correctly consider re-writing it as follows:
UPDATE  a
SET     col1 = b.col1
FROM    tablea a
        JOIN tableb b ON a.id = b.id
WHERE   a.col2 = 'value'
        AND b.col2 = 'value'
0
 
LVL 5

Author Comment

by:25112
ID: 36601461
thanks I followed that logic and also indexed.. here is another similar query to update 23M records. Indexed NC on the join columns... but there is no other condition other than the join itself.. how could it be done to iterate 10K each time?

UPDATE A
SET A.Col1 = B.Col1
FROM A JOIN B ON A.ID = B.ID
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 250 total points
ID: 36616838
Update A
Set A.col1 = B.col1
from A join B on A.id = b.id
join
(Select top 10000 A.* from A  join B on A.id = B.id
 where a.col1 <> b.col1 ) as c  on c.id = a.id
0
 
LVL 5

Author Comment

by:25112
ID: 36639850
that is neat..

if a.col1 or b.col1 is null, will it need some tweaking?
0
 
LVL 5

Author Comment

by:25112
ID: 36709395
dqmq, would you then put the update within a while loop?
0

Featured Post

Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

Join & Write a Comment

Suggested Solutions

How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now