Solved

MySQL (MariaDB) Update in Batches and loop until done?

Posted on 2016-07-14
Last Modified: 2016-08-02
I have a couple of massive tables (T1 = 132,343,644 rows and T2 = 87,537,041 rows), and I need to update fields in T1 from field values in T2. Both tables have a common 'recordID' field that is indexed on each table.

Is there a way to run an update query in batches, so that it takes something like 10,000 or 100,000 records at a time, updates them, and then loops to grab the next batch until done? I have a feeling that this would be more efficient than my current update query:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL

Server is CENTOS 7.2.1511
CPU : 8 vCPU
Memory : 32,768
Question by:FirstDirect
 
LVL 24

Expert Comment

by:mankowitz
ID: 41711372
The query you have seems efficient, assuming that there are indexes on both recordID columns. If you want to work in batches, you could either create partitions or simply add a range condition to the WHERE clause, like this:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL
AND T1.RecordID BETWEEN 0 and 1000000
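
To choose the ranges, it may help to look at the actual span of recordID first. The following is only a sketch, assuming recordID is a numeric column; the boundaries shown for the second batch are illustrative and simply continue from the range above.

```sql
-- Find the lowest and highest recordID so the batch boundaries can be planned.
-- With an index on recordID, both values can be read from the index.
SELECT MIN(recordID) AS first_id, MAX(recordID) AS last_id
FROM MASTER;

-- Then run one range per statement, for example the second batch of one million:
UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL
AND T1.recordID BETWEEN 1000001 AND 2000000;
```

With autocommit enabled, each statement commits on its own, so an interrupted run only loses the batch that was in progress.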
 

Author Comment

by:FirstDirect
ID: 41711380
Is there a way to have it work in a loop, so that it processes the first million, then moves on to the next million, and so on until done?
 
LVL 24

Expert Comment

by:mankowitz
ID: 41711388
There really aren't efficient loops in SQL. Why do you want a loop when you can have it all done in one swoop? If you are worried about server activity, you can add LOW_PRIORITY to the UPDATE.
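
For reference, the modifier goes directly after the UPDATE keyword. This is a sketch of the same query from the question with the modifier added. Note that the modifier only has an effect on storage engines that use table-level locking (MyISAM, MEMORY, MERGE), so it is worth checking the engine first.

```sql
-- The Engine column of the result shows which storage engine MASTER uses.
SHOW TABLE STATUS LIKE 'MASTER';

-- LOW_PRIORITY delays the UPDATE until no other clients are reading the table.
-- It is only honored by engines with table-level locking.
UPDATE LOW_PRIORITY MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL;
```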
 
LVL 24

Expert Comment

by:mankowitz
ID: 41711391
Or, if you really want to divide the job into batches, you could use WHERE or partitions.
 

Author Comment

by:FirstDirect
ID: 41711450
Hmm, not really what I was looking for.
 
LVL 24

Accepted Solution

by:
mankowitz earned 500 total points
ID: 41711459
I suppose another option would be to make a stored procedure, but it adds unnecessary complexity.

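DELIMITER //

CREATE PROCEDURE `update_all` ()
BEGIN

 -- BIGINT in case recordID values exceed the INT range
 DECLARE batch_size BIGINT DEFAULT 1000000;
 DECLARE x BIGINT;
 DECLARE max_id BIGINT;

 -- Take the loop bounds from the table instead of hard-coding them
 SELECT MIN(recordID), MAX(recordID) INTO x, max_id FROM MASTER;

 -- Each pass updates one range of recordID values
 WHILE x <= max_id DO
  UPDATE MASTER AS T1
  INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
  SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
  WHERE T1.NEW_USED IS NULL
  AND T1.recordID BETWEEN x AND x + batch_size - 1;
  SET x = x + batch_size;
 END WHILE;

END //

DELIMITER ;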
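
Once the procedure has been created, running it and checking what is left could look roughly like this. It is a sketch only; the count is a rough progress check and will itself take a while on a table of this size.

```sql
-- Run the batch loop defined in the procedure.
CALL update_all();

-- Count the rows that still have no value. Rows with no match in
-- `DATE-TYPE_07042016` will remain NULL and show up here.
SELECT COUNT(*) AS remaining
FROM MASTER
WHERE NEW_USED IS NULL;
```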

 

Author Comment

by:FirstDirect
ID: 41716308
Mankowitz, I think you might be on to something. Another option, since we have zip codes on the file, would be to search and update records one zip code at a time, using a separate ZIP table as a reference table.

I have an index on ZIP5 (5 Digit Zip Code)
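
Driving the update from a ZIP reference table could be sketched with a cursor, roughly as follows. The table name ZIP_REF and the procedure name update_by_zip_table are hypothetical, and the sketch assumes ZIP5 is stored as CHAR(5) in both tables. DELIMITER is a command of the mysql command-line client, not part of SQL itself.

```sql
DELIMITER //

CREATE PROCEDURE update_by_zip_table()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE current_zip CHAR(5);
  -- ZIP_REF is a hypothetical reference table with one row per ZIP code.
  DECLARE zip_cursor CURSOR FOR SELECT ZIP5 FROM ZIP_REF ORDER BY ZIP5;
  -- Set the flag when FETCH runs past the last row of the cursor.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  OPEN zip_cursor;
  zip_loop: LOOP
    FETCH zip_cursor INTO current_zip;
    IF done = 1 THEN
      LEAVE zip_loop;
    END IF;

    -- Update only the rows in the current ZIP code.
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL AND T1.ZIP5 = current_zip;
  END LOOP;
  CLOSE zip_cursor;
END //

DELIMITER ;
```

Compared with counting through every number from 1 to 99999, the reference table means only ZIP codes that actually exist are visited.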
 
LVL 24

Expert Comment

by:mankowitz
ID: 41717657
Again, this is an option, but still not the best way to do it.

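DELIMITER //

-- Replaces any earlier procedure with the same name
DROP PROCEDURE IF EXISTS `update_all` //

CREATE PROCEDURE `update_all` ()
BEGIN

 DECLARE x INT;

 SET x = 1;

 -- One pass per possible 5-digit ZIP code
 WHILE x <= 99999 DO
  UPDATE MASTER AS T1
  INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
  SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
  -- Zero-padded string, so a CHAR zip5 column keeps leading zeros and can use its index
  WHERE T1.NEW_USED IS NULL AND T1.zip5 = LPAD(x, 5, '0');
  SET x = x + 1;
 END WHILE;

END //

DELIMITER ;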
