Solved

updating one column on 18 million record table

Posted on 2011-09-23
9
212 Views
Last Modified: 2012-05-12
there is an update on a column and the inner join is between two table. 18M records are being affected.the query is running for 1 hour and log has grown from 1.5 GB to 16 GB.. there is 20 GB more available on the disk.. so it is OK for now..

theh recovery is simple

is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
0
Comment
Question by:25112
9 Comments
 
LVL 75

Accepted Solution

by:
Aneesh Retnakaran earned 125 total points
ID: 36590678
space seems to be fine
you can try shrinking the logfile from another query window, this may slow down your update

DBCC SHRINKFILE(logName)

>is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
no, you need to do these kinds of perations as small batches. like updating 10K records at a time
0
 
LVL 5

Author Comment

by:25112
ID: 36590829
>>no, you need to do these kinds of perations as small batches. like updating 10K records at a time
10K records - you mean manually? it will be troublesome with 18M records.. how do we track which ones have been updated yet or not etc

the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 250 total points
ID: 36590874
This will update 10000 rows that havn't been updated yet.  Repeat until no rows are updated.


update a set col1 = b.col1
from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
inner join
(
select top 10000 from tablea a inner join tableb b on a.id = b.id  and a.col2='value' and b.col2='value'
  where a.col2='value' <>  b.col2='value'
) as c on c.id = a.id  






  where b.col1 <> b.cola
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 42

Expert Comment

by:dqmq
ID: 36590875
Pls ignore last line.  
0
 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 125 total points
ID: 36590906
>>the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value' <<
Are the appropriate columns indexed.  If they are not It could take you several days to complete.  Once you have it indexed correctly consider re-writing it as follows:
UPDATE  a
SET     col1 = b.col1
FROM    tablea a
        JOIN tableb b ON a.id = b.id
WHERE   a.col2 = 'value'
        AND b.col2 = 'value'
0
 
LVL 5

Author Comment

by:25112
ID: 36601461
thanks I followed that logic and also indexed.. here is another similar query to update 23M records. Indexed NC on the join columns... but there is no other condition other than the join itself.. how could it be done to iterate 10K each time?

UPDATE A
SET A.Col1 = B.Col1
FROM A JOIN B ON A.ID = B.ID
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 250 total points
ID: 36616838
Update A
Set A.col1 = B.col1
from A join B on A.id = b.id
join
(Select top 10000 A.* from A  join B on A.id = B.id
 where a.col1 <> b.col1 ) as c  on c.id = a.id
0
 
LVL 5

Author Comment

by:25112
ID: 36639850
that is neat..

if a.col1 or b.col1 is null, will it need some tweaking?
0
 
LVL 5

Author Comment

by:25112
ID: 36709395
dqmq, would you then put the update within a while loop?
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
performance query 4 24
PROPERCASE SCRIPT IN SQL 3 8
Merge two rows in SQL 4 18
SQL Server group with two values 4 18
Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

825 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question