Solved

Mass updating a large set of data with less overhead

Posted on 2013-12-03
8
233 Views
Last Modified: 2013-12-10
I have a large table with about 30 million rows.

There is one column in the table that I have to update to a new value (currently 0, needs to be null).

When I run the normal update table set column = null where column = 0 the query takes for ever, blows up the size of tempdb, etc, etc.

Is there a better way to do this with less overhead?
0
Comment
Question by:ccleebelt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
8 Comments
 
LVL 25

Expert Comment

by:chaau
ID: 39694272
I think you need to update in batches, like this:
set noCount on
set rowCount 10000
While 1=1 begin
  Begin Transaction
  update Table1 Set Column1 = NULL WHERE Column1 = 0
  COMMIT
  if @@rowCount = 0 break
end
set rowCount 0
set noCount off

Open in new window

0
 

Author Comment

by:ccleebelt
ID: 39710261
Chaau - this works great.  

How would I adjust the query above to just keep looping through chunks of 10,000 so I don't have to run the query over and over again on a large data set?

Also, would it kill performance to put more than on update clause in the transaction?  I need to update several columns in the table to null if they are 0.

ex. update table1 set column2 = null where column2 = 0

Thanks again
0
 
LVL 25

Expert Comment

by:chaau
ID: 39710277
I expected that one to work. Now I see where I've made a mistake. Please use the following script. That will run in a loop until all records updated:
set noCount on
set rowCount 10000
declare @LastUpdated int
While 1=1 begin
  Begin Transaction
  update Table1 Set Column1 = NULL WHERE Column1 = 0
  SET @LastUpdated = @@rowCount
  COMMIT
  if @LastUpdated = 0 break
end
set rowCount 0
set noCount off

Open in new window

As for your other question. You need to estimate how many records are updated in your other tables. If you think that the number of records is not big, you can add more updates as well. Otherwise you need to adjust the ROWCOUNT value to a smaller number
0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 

Author Comment

by:ccleebelt
ID: 39710290
Thanks.  This is exactly what I needed.

Last follow up.  If I put indexes on the columns to be updated first would it go faster?
0
 
LVL 25

Expert Comment

by:chaau
ID: 39710306
It will help if you put indexes on the columns appearing in your where clause. In your case it is the same column, but imagine if you had a statement like this:
Update Table1 Set Column1 = 1 Where Column2 = 1

Open in new window

In this case the index on column1 will do nothing (in fact it will decrease the update speed, the the index will be required to be re-indexed by the RDBMS)
However, if you create an index for Column2 the update will be faster
0
 

Author Comment

by:ccleebelt
ID: 39710317
yeah - the query is update table1 set column1 = null where column1 = 0

So I am updating the same column that is in the where.   Would that slow it down?
0
 
LVL 25

Accepted Solution

by:
chaau earned 500 total points
ID: 39710328
It is hard to say. My best bet is that it will improve the performance. As you see, it is a catch22 situation. The index will help to locate the required records in your where clause. At the same time, the actual update will be a bit slower, because of the re-indexing overhead.

You really need to measure the performance yourself using your own data. Most likely, the index will help, as the modern RDBMSes have "smartness" in how the indexes are re-generated during updates. In your case perhaps the re-indexing overhead will be minimal.
0
 

Author Closing Comment

by:ccleebelt
ID: 39710613
Thanks for your help.  Very useful.
0

Featured Post

Instantly Create Instructional Tutorials

Contextual Guidance at the moment of need helps your employees adopt to new software or processes instantly. Boost knowledge retention and employee engagement step-by-step with one easy solution.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

617 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question