Solved

MySQL (MariaDB) Update in Batches and loop until done?

Posted on 2016-07-14
Last Modified: 2016-08-02
I have a couple of massive tables (T1 = 132,343,644 rows and T2 = 87,537,041 rows). I need to update fields in T1 from field values in T2. Both tables have a common 'recordID' field that is indexed on each table.

Is there a way to run an update query in batches, so that it takes 10,000 or 100,000 records at a time, updates them, and then loops to grab the next batch until done? I have a feeling that this would be more efficient than my current update query:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL


Server is CENTOS 7.2.1511
CPU : 8 vCPU
Memory : 32,768
Question by:FirstDirect
8 Comments
 
LVL 24

Expert Comment

by:mankowitz
ID: 41711372
Your current query looks efficient, assuming recordID is indexed on both tables. If you want to work in batches, you could either partition the table or simply add a range condition to the WHERE clause, like this:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL
AND T1.RecordID BETWEEN 0 and 1000000
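
To plan ranges like the one above, it may help to look first at the span of recordID values that still need work. This is only a sketch, assuming recordID is an integer column; note that the first query may scan a large part of MASTER if NEW_USED is not indexed. The second statement shows what the next batch could look like.

```sql
-- Span and count of rows that still need updating.
SELECT MIN(recordID) AS first_id,
       MAX(recordID) AS last_id,
       COUNT(*)      AS remaining
FROM MASTER
WHERE NEW_USED IS NULL;

-- Next batch. BETWEEN is inclusive on both ends, so this range starts
-- one past the previous upper bound (1000000) to avoid overlap.
UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL
  AND T1.recordID BETWEEN 1000001 AND 2000000;
```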
 

Author Comment

by:FirstDirect
ID: 41711380
Is there a way to make it work in a loop, so that it processes the first million, then moves on to the next million, and so on until done?
 
LVL 24

Expert Comment

by:mankowitz
ID: 41711388
SQL doesn't really have efficient loops. Why do you want a loop when the whole job can be done in one statement? If you are worried about server load, you can add the LOW_PRIORITY modifier to the UPDATE.
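
As a rough illustration of that modifier (a sketch, not tested against this schema): LOW_PRIORITY goes directly after the UPDATE keyword. Per the MySQL documentation it only affects storage engines that use table-level locking (MyISAM, MEMORY, MERGE), so it would have no effect if MASTER is an InnoDB table. The question does not say which engine is in use.

```sql
-- Same update as in the question, with the LOW_PRIORITY modifier added.
-- Only meaningful for table-level-locking engines such as MyISAM.
UPDATE LOW_PRIORITY MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL;
```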
LVL 24

Expert Comment

by:mankowitz
ID: 41711391
Or, if you really want to divide the job into batches, you could use WHERE or partitions.
 

Author Comment

by:FirstDirect
ID: 41711450
Hmm, not really what I was looking for.
 
LVL 24

Accepted Solution

by:
mankowitz earned 2000 total points
ID: 41711459
I suppose another option would be to make a stored procedure, but it adds unnecessary complexity.

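DELIMITER //

CREATE PROCEDURE `update_all` ()
BEGIN
  -- Walk recordID in fixed-size ranges so each UPDATE touches one batch.
  -- Assumes recordID is a non-negative integer.
  DECLARE x BIGINT DEFAULT 0;
  DECLARE batch_size INT DEFAULT 1000000;
  DECLARE max_id BIGINT;

  -- Loop up to the highest recordID instead of a hard-coded limit.
  SELECT MAX(recordID) INTO max_id FROM MASTER;

  WHILE x <= max_id DO
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL
      AND T1.recordID >= x AND T1.recordID < x + batch_size;
    SET x = x + batch_size;
  END WHILE;
END //

DELIMITER ;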
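
Once the procedure exists, running it and checking progress could look roughly like this. It is a sketch: with autocommit enabled and no surrounding transaction, each UPDATE inside the loop should commit on its own, which is what keeps the individual batches small.

```sql
-- Run the batched update.
CALL update_all();

-- Rows still left with no NEW_USED value. Rows without a match in
-- `DATE-TYPE_07042016` (or with a NULL NEW_USED_CODE there) will remain.
SELECT COUNT(*) AS remaining
FROM MASTER
WHERE NEW_USED IS NULL;
```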

 

Author Comment

by:FirstDirect
ID: 41716308
Mankowitz, I think you might be on to something. Since we have ZIP codes in the file, another option would be to search and update records one ZIP code at a time, using a separate ZIP table as a reference table.

I have an index on ZIP5 (5 Digit Zip Code)
 
LVL 24

Expert Comment

by:mankowitz
ID: 41717657
Again, this is an option, but still not the best way to do it.

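DELIMITER //

CREATE PROCEDURE `update_all` ()
BEGIN
  -- Step through every possible 5-digit ZIP code, one UPDATE per value.
  DECLARE x INT DEFAULT 1;

  WHILE x <= 99999 DO
    -- Assumes zip5 is numeric. If it is a CHAR(5) column, compare against
    -- LPAD(x, 5, '0') instead so the index on zip5 can still be used.
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL AND T1.zip5 = x;
    SET x = x + 1;
  END WHILE;
END //

DELIMITER ;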
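
If a separate ZIP reference table is available, as suggested earlier in the thread, a cursor could drive the loop so that only ZIP codes that actually exist are visited. This is a sketch under assumptions: `ZIP_REF` and the procedure name `update_by_zip` are hypothetical, and zip5 is assumed to be a CHAR(5) column with the same collation in both tables.

```sql
DELIMITER //

CREATE PROCEDURE `update_by_zip` ()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE z CHAR(5);
  -- ZIP_REF(zip5) is a hypothetical table holding one row per ZIP code.
  DECLARE zip_cur CURSOR FOR SELECT zip5 FROM ZIP_REF;
  -- Raised by FETCH when the cursor runs out of rows.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  OPEN zip_cur;
  zip_loop: LOOP
    FETCH zip_cur INTO z;
    IF done THEN
      LEAVE zip_loop;
    END IF;

    -- One batch per ZIP code.
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL AND T1.zip5 = z;
  END LOOP;
  CLOSE zip_cur;
END //

DELIMITER ;
```

Batch sizes will vary with how many records each ZIP code has, so this gives less even batches than a recordID range.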


Question has a verified solution.