?
Solved

updating one column on 18 million record table

Posted on 2011-09-23
9
Medium Priority
?
215 Views
Last Modified: 2012-05-12
there is an update on a column and the inner join is between two table. 18M records are being affected.the query is running for 1 hour and log has grown from 1.5 GB to 16 GB.. there is 20 GB more available on the disk.. so it is OK for now..

theh recovery is simple

is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
0
Comment
Question by:25112
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
9 Comments
 
LVL 75

Accepted Solution

by:
Aneesh Retnakaran earned 500 total points
ID: 36590678
space seems to be fine
you can try shrinking the logfile from another query window, this may slow down your update

DBCC SHRINKFILE(logName)

>is it possible to ask it to commit each update and not do it as a transaction, so that the log is not impacted like this?
no, you need to do these kinds of perations as small batches. like updating 10K records at a time
0
 
LVL 5

Author Comment

by:25112
ID: 36590829
>>no, you need to do these kinds of perations as small batches. like updating 10K records at a time
10K records - you mean manually? it will be troublesome with 18M records.. how do we track which ones have been updated yet or not etc

the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 1000 total points
ID: 36590874
This will update 10000 rows that havn't been updated yet.  Repeat until no rows are updated.


update a set col1 = b.col1
from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value'
inner join
(
select top 10000 from tablea a inner join tableb b on a.id = b.id  and a.col2='value' and b.col2='value'
  where a.col2='value' <>  b.col2='value'
) as c on c.id = a.id  






  where b.col1 <> b.cola
0
Percona Live Europe 2017 | Sep 25 - 27, 2017

The Percona Live Open Source Database Conference Europe 2017 is the premier event for the diverse and active European open source database community, as well as businesses that develop and use open source database software.

 
LVL 42

Expert Comment

by:dqmq
ID: 36590875
Pls ignore last line.  
0
 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 500 total points
ID: 36590906
>>the join is as simple as
update a set col1 = b.col1 from tablea a join tableb b on a.id = b.id and a.col2='value' and b.col2='value' <<
Are the appropriate columns indexed.  If they are not It could take you several days to complete.  Once you have it indexed correctly consider re-writing it as follows:
UPDATE  a
SET     col1 = b.col1
FROM    tablea a
        JOIN tableb b ON a.id = b.id
WHERE   a.col2 = 'value'
        AND b.col2 = 'value'
0
 
LVL 5

Author Comment

by:25112
ID: 36601461
thanks I followed that logic and also indexed.. here is another similar query to update 23M records. Indexed NC on the join columns... but there is no other condition other than the join itself.. how could it be done to iterate 10K each time?

UPDATE A
SET A.Col1 = B.Col1
FROM A JOIN B ON A.ID = B.ID
0
 
LVL 42

Assisted Solution

by:dqmq
dqmq earned 1000 total points
ID: 36616838
Update A
Set A.col1 = B.col1
from A join B on A.id = b.id
join
(Select top 10000 A.* from A  join B on A.id = B.id
 where a.col1 <> b.col1 ) as c  on c.id = a.id
0
 
LVL 5

Author Comment

by:25112
ID: 36639850
that is neat..

if a.col1 or b.col1 is null, will it need some tweaking?
0
 
LVL 5

Author Comment

by:25112
ID: 36709395
dqmq, would you then put the update within a while loop?
0

Featured Post

Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differe…
Recently we ran in to an issue while running some SQL jobs where we were trying to process the cubes.  We got an error saying failure stating 'NT SERVICE\SQLSERVERAGENT does not have access to Analysis Services. So this is a way to automate that wit…
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Suggested Courses

765 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question