CHANGING LARGE UPDATE STATEMENT TO RUN IN BATCHES OF 500

I have an update statement that is running on a table. I have given a small example of it below.

The problem I have is that there are 448653 records.

Basically I want to add some sort of loop that will commit the update every 500 records rather than doing them all at the same time.

Can someone give me some advice.

Thanks
UPDATE 
	Customer
SET
	ModifiedDate = GETDATE(),
	ModifiedBy = 808  
FROM 
	Customer c (NOLOCK)
INNER JOIN Prospect p (NOLOCK)
	ON c.CustomerId = p.CustomerId

Open in new window

scm0smlAsked:
Who is Participating?
 
chapmandewCommented:
this shoudl work:


declare @rowcount int
set @rowcount = 1

while @rowcount > 0
begin
SET ROWCOUNT 500

UPDATE Customer
SET
        ModifiedDate = GETDATE(),
        ModifiedBy = 808  
FROM
        Customer c (NOLOCK)
INNER JOIN Prospect p (NOLOCK)
        ON c.CustomerId = p.CustomerId
where modifieddate < getdate() or modifieddate is null

set @rowcount = @@rowcount

end
0
 
chapmandewCommented:
something like this (if you're on 2005)

declare @rowcount int
set @rowcount = 1

while @rowcount > 0
begin
UPDATE top(500) Customer
SET
        ModifiedDate = GETDATE(),
        ModifiedBy = 808  
FROM
        Customer c (NOLOCK)
INNER JOIN Prospect p (NOLOCK)
        ON c.CustomerId = p.CustomerId

set @rowcount = @@rowcount

end
0
 
chapmandewCommented:
whoops...forgot something:

something like this (if you're on 2005)

declare @rowcount int
set @rowcount = 1

while @rowcount > 0
begin
UPDATE top(500) Customer
SET
        ModifiedDate = GETDATE(),
        ModifiedBy = 808  
FROM
        Customer c (NOLOCK)
INNER JOIN Prospect p (NOLOCK)
        ON c.CustomerId = p.CustomerId
where modifieddate < getdate() or modifieddate is null

set @rowcount = @@rowcount

end
0
Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

 
scm0smlAuthor Commented:
i dont really need to worry about:
where modifieddate < getdate() or modifieddate is null


can you just explain how that loop then knows to do the next 500 etc, so the 2nd time arounf do rows 500-1000 etc?
0
 
chapmandewCommented:
I included the criteria so you're not constantly updating the same 500 records everytime.

so, it starts the loop and updates 500, and will continue updating in 500 record increments so long as there are records to update.  when you finally run an update statement that does not update any records, the loop will exit.
0
 
scm0smlAuthor Commented:
ahhh im with u :)

im getting:
Incorrect syntax near the keyword 'TOP'.

I can't see anything wrong with that can you?
0
 
chapmandewCommented:
are you on sql 2005?  is your database compatability set to 90?
0
 
scm0smlAuthor Commented:
im on 2005 yes.

not sure how to check the compatibility?
0
 
scm0smlAuthor Commented:
ITS 80.
0
 
scm0smlAuthor Commented:
used sp_dbcmptlevel to find it
0
 
chapmandewCommented:
you need to change it to 90 if you can.  

use yourdb
exec sp_dbcmptlevel 90
0
 
ZberteocCommented:
In my opinion this is bad practice. There is no justification to beak an update statement in chunks when it can be done in one batch. This will not necessarily help performance. Overall it will take at least the same amount of time to do the update and maybe more as the query is "prepared" for each chunk. And less then 500 thousand rows are a joke for SQL server.


0
 
scm0smlAuthor Commented:
DAMN!!!

ARRGGGHHHHHHHH

I've been an idiot. This is a 2000 database!!

Is it just the top that wont work? is there a work around?
0
 
chapmandewCommented:
Z - the idea is to not leave one really big transaction open for any large amount of time.  Breaking it into chunks can prevent that
0
 
ZberteocCommented:
but for this size of table that statement should not take too long unless you have a lots of indexes on it.
0
 
chapmandewCommented:
I agree...but your idea of "not that long" and the authors might be different...
0
 
ZberteocCommented:
on the other hand placing the update in a loop will not make any difference as it will run the update again one iteration after another and the transaction will still be open.
0
 
scm0smlAuthor Commented:
There are a lot more joins than i have shown, each of the tables have around half a million records in.

Sorry I'm prob missing something again but can't see what:
SET ROWCOUNT 500

is doing?
0
 
scm0smlAuthor Commented:
"on the other hand placing the update in a loop will not make any difference as it will run the update again one iteration after another and the transaction will still be open." - doesnt the transaction commit each time?
0
 
chapmandewCommented:
they will be different transactions unless you specify a BEGIN TRANSACTION statement.
0
 
chapmandewCommented:
SET ROWCOUNT is the 2000 version of the UPDATE TOP(500) statement.  I am using it to set the number of records you're updating to 500 each time...make sense.
0
 
chapmandewCommented:
meant that to say

...make sense?
0
 
scm0smlAuthor Commented:
ah right ok im with you.
0
 
scm0smlAuthor Commented:
Thanks for your help!!

We got there in the end.
0
 
ZberteocCommented:
OK, try this:
declare
	@upd_date datetime
select 
	@upd_date = GETDATE()
 
 
 
	UPDATE top 500
		Customer
	SET
		ModifiedDate = @upd_date,
		ModifiedBy = 808  
	FROM 
		Customer c (NOLOCK)
		INNER JOIN Prospect p (NOLOCK)
				ON c.CustomerId = p.CustomerId
	where 
		c.ModifiedDate<>@upd_date
-- ### --
while  @@rowcount<>0
begin 
	select 
		@upd_date = GETDATE()
 
	UPDATE top 500
		Customer
	SET
		ModifiedDate = @upd_date,
		ModifiedBy = 808  
	FROM 
		Customer c (NOLOCK)
		INNER JOIN Prospect p (NOLOCK)
				ON c.CustomerId = p.CustomerId
	where 
		c.ModifiedDate<>@upd_date
end 

Open in new window

0
 
ZberteocCommented:
oops, won't work
0
 
ZberteocCommented:
it will run forever. the problem here is that you want to know what rows were already updated and what were not. at the same time you want to update only the rows that were not yet updated. The oly way is to use a timestamp with ModifiedDate column, which I tried but the problem is that it will never stop as already ModifiedDate values will be always <> @upd_date.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.