robrodp
asked on
update a large ms sql table aginst a smaller one.
I have an ms sql table with 1,000,000 records. Anotehr one with 100 records.
The query:
update largetable set largetable.ok=1 from smalltable where largetable.name=smalltable .name
Even if smalltable is indexed in name it has to go through the 1,000,000 to check each one if it is in smalltablle.
Is it possible to scan through the 100 records and update the largetable?
The query:
update largetable set largetable.ok=1 from smalltable where largetable.name=smalltable
Even if smalltable is indexed in name it has to go through the 1,000,000 to check each one if it is in smalltablle.
Is it possible to scan through the 100 records and update the largetable?
Do you have an index on largetable.name? That should reduce time.
ASKER
Yes I do... but it has to go through all the records? Regardless of the index
How about updating in batches?
This site may be useful for you, there's a specific example of a looping batch update that processes 20,000 records at a time. May be worth a look.
http://www.sqlservergeeks. com/blogs/ AhmadOsama /personal/ 450/sql-se rver-optim izing-upda te-queries -for-large -data-volu mes
This site may be useful for you, there's a specific example of a looping batch update that processes 20,000 records at a time. May be worth a look.
http://www.sqlservergeeks.
ASKER CERTIFIED SOLUTION
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
Cluster the small table on name.
How many of the 1M rows need UPDATEd? If it's just a few, an index could be useful. If it's a more than a low %, SQL will just scan the whole table anyway. (If the large table is clustered on name, the processing will be faster, but I wouldn't change the clustering on that large a table just to match one UPDATE).
Also, check the "ok" column and don't UPDATE if it's already = 1: NOTE: since this will force a table scan of the largetable, only do this if SQL is going to scan that table anyway.
UPDATE lt
SET lt.ok = 1
FROM dbo.largetable lt
WHERE
(lt.ok IS NULL OR lt.ok <> 1) AND
EXISTS(
SELECT 1
FROM dbo.smalltable st
WHERE
st.name = lt.name
)
How many of the 1M rows need UPDATEd? If it's just a few, an index could be useful. If it's a more than a low %, SQL will just scan the whole table anyway. (If the large table is clustered on name, the processing will be faster, but I wouldn't change the clustering on that large a table just to match one UPDATE).
Also, check the "ok" column and don't UPDATE if it's already = 1: NOTE: since this will force a table scan of the largetable, only do this if SQL is going to scan that table anyway.
UPDATE lt
SET lt.ok = 1
FROM dbo.largetable lt
WHERE
(lt.ok IS NULL OR lt.ok <> 1) AND
EXISTS(
SELECT 1
FROM dbo.smalltable st
WHERE
st.name = lt.name
)
do you have an index on largetable.name or a multicolumn index in which largetable.name is the first one ? if yes, it definitively should not read the whole table
ASKER
The question is if I can scan only the 100 records to update the large table.
If fox pro I loop through the 100 records find the records in large table (which is indexed) and it is done very fast
If fox pro I loop through the 100 records find the records in large table (which is indexed) and it is done very fast
tell us which indexes you have.
Always Take Small table first in join.
It' should be indexed.
It' should be indexed.