?
Solved

SELECT TOP x DELETE TOP x for batch move?

Posted on 2011-04-20
8
Medium Priority
?
633 Views
Last Modified: 2012-05-11
I have a table with millions of rows and I need to run stored procedures to transform the data.  200K rows is the most SQL can reasonably handle (with the way the sp's are currently written.)    I do not have any ID field in my table (DataTable below.)  I want to MOVE batches of 200K records into another table, process them, and then go get another 200K.

My question is:

If I do ...

INSERT INTO WorkingTable (col1, col2, col3)
SELECT TOP 200000 val1, val2, val3 FROM DataTable

and then

DELETE TOP 200000 FROM DataTable

Will this always delete the SAME rows I have just inserted into my WorkingTable?   Or will TOP sometimes return a different set of rows.  There is no ORDER BY statement, and the data coming in may not have a column that I can reliably sort on... not to mention the performance hit for the ORDER BY.

0
Comment
Question by:richxyz
  • 5
  • 2
8 Comments
 
LVL 58

Accepted Solution

by:
cyberkiwi earned 1000 total points
ID: 35436610
Because you have no ORDER BY, any 200000 records are game on for either statement - and there will be no guarantees.  None at all.

Use an ORDER BY to guarantee order, it's the only way.
0
 
LVL 25

Expert Comment

by:Ron Malmstead
ID: 35436662
I do the below all the time and i've never had an issue... although not knowing your table designs , I can't guarantee it will work for you every time.

INSERT INTO WorkingTable (col1, col2, col3)
SELECT TOP 200000 val1, val2, val3 FROM DataTable

DELETE TOP 200000 FROM DataTable WHERE RecordID IN(SELECT TOP 200000 RecordID FROM DataTable)


0
 
LVL 25

Expert Comment

by:Ron Malmstead
ID: 35436680
... the better way would be to have a column in the "workingtable" that you can store an ID from the "datatable" when you insert.  Then you know you are deleting the exact records by doing the following...

DELETE TOP 200000 FROM DataTable WHERE RecordID IN(SELECT DTRecordID FROM WorkingTable)

0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 
LVL 58

Expert Comment

by:cyberkiwi
ID: 35436687
You could instead in your stored proc use something like

Declare @Identifier uniqueidentifier
set @Identifier = newID()
INSERT INTO WorkingTable (batchid, col1, col2, col3)
SELECT TOP 200000 @Identifier, val1, val2, val3 FROM DataTable

DELETE WorkingTable where batchid = @Identifier
0
 
LVL 25

Expert Comment

by:Ron Malmstead
ID: 35436689
Sorry I meant...

DELETE FROM DataTable WHERE RecordID IN(SELECT DTRecordID FROM WorkingTable)

0
 

Author Comment

by:richxyz
ID: 35436692
Thanks xuserx2000, but I do not have an ID column.  I know I could create one and do that - but my question was more about what is SQL's behavior when using "TOP".  Thanks anyway!
0
 
LVL 25

Expert Comment

by:Ron Malmstead
ID: 35436706
Then if you go back to my first post, it "should" work, because you are deleting from the exact same SELECT statement that you are doing the insert.

INSERT INTO WorkingTable (col1, col2, col3)
SELECT TOP 200000 val1, val2, val3 FROM DataTable

DELETE TOP 200000 FROM DataTable WHERE RecordID IN(SELECT TOP 200000 RecordID FROM DataTable)

0
 
LVL 25

Expert Comment

by:Ron Malmstead
ID: 35436710
oh...right...no id...gotcha...lol..more coffee
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have a large data set and a SSIS package. How can I load this file in multi threading?
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Suggested Courses

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question