
MySQL (MariaDB) Update in Batches and loop until done?

I have a couple of massive tables (T1 = 132,343,644 rows and T2 = 87,537,041 rows), and I need to update fields in T1 from field values in T2. Both tables have a common 'recordID' field that is indexed on each table.

Is there a way to run the update in batches, so that it takes 10,000 or 100,000 records at a time, updates them, and then loops to grab the next batch until done? I have a feeling this would be more efficient than my current update query:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL




Server is CENTOS 7.2.1511
CPU : 8 vCPU
Memory : 32,768
Asked by FirstDirect
1 Solution
 
mankowitz commented:
Your current query looks efficient, assuming both recordID columns are indexed. If you want to do batches, you could either create partitions or simply add a range condition to the WHERE clause, like this:

UPDATE MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL
AND T1.RecordID BETWEEN 0 and 1000000
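
Since the answer above hinges on the recordID indexes being used, here is a minimal sketch of how that assumption could be checked before running a batch. It uses EXPLAIN on a SELECT with the same join and filter rather than on the UPDATE itself, because older MariaDB/MySQL versions cannot EXPLAIN an UPDATE. Also note that BETWEEN is inclusive at both ends, so a following batch would start at 1000001.

```sql
-- List the indexes on each table; recordID should appear in both
SHOW INDEX FROM MASTER;
SHOW INDEX FROM `DATE-TYPE_07042016`;

-- Plan for the SELECT equivalent of one batch of the UPDATE.
-- The "key" column of the output shows which index, if any, is chosen.
EXPLAIN
SELECT T1.recordID
FROM MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
WHERE T1.NEW_USED IS NULL
  AND T1.recordID BETWEEN 0 AND 1000000;
```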

FirstDirect (author) commented:
Is there a way to have it work in a loop, so that it processes the first million, then moves on to the next million, and so on until done?

mankowitz commented:
SQL doesn't really have efficient loops. Why do you want a loop when you can have it all done in one swoop? If you are worried about server activity, you can add LOW_PRIORITY to the UPDATE.
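
As a sketch of where the modifier goes: the keyword is spelled LOW_PRIORITY and sits directly after UPDATE. It only has an effect on storage engines that use table-level locking (such as MyISAM), so on InnoDB tables it is not expected to change anything. The question does not say which engine is in use.

```sql
-- Same update as in the question, with the LOW_PRIORITY modifier added
UPDATE LOW_PRIORITY MASTER AS T1
INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
WHERE T1.NEW_USED IS NULL;
```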
 
mankowitz commented:
Or, if you really want to divide the job into batches, you could use WHERE or partitions.

FirstDirect (author) commented:
Hmm, not really what I was looking for.

mankowitz commented:
I suppose another option would be to make a stored procedure, but it adds unnecessary complexity.

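DELIMITER //

CREATE PROCEDURE `update_all` ()
BEGIN
  DECLARE x BIGINT DEFAULT 0;              -- lower bound of the current batch
  DECLARE max_id BIGINT;                   -- highest recordID in MASTER
  DECLARE batch_size INT DEFAULT 1000000;  -- width of each recordID range

  SELECT MAX(recordID) INTO max_id FROM MASTER;

  WHILE x <= max_id DO
    -- Update one recordID range per iteration: x <= recordID < x + batch_size
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL
      AND T1.recordID >= x AND T1.recordID < x + batch_size;
    SET x = x + batch_size;
  END WHILE;
END //

DELIMITER ;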

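
If the recordID values are sparse or unevenly distributed, fixed-width ranges produce batches of very different sizes. A variation, sketched here under assumptions, picks each batch boundary from the data so that every pass covers at most 100,000 MASTER rows. The procedure name update_in_batches and the batch size are hypothetical, and the sketch assumes recordID is a non-negative integer column.

```sql
DELIMITER //

CREATE PROCEDURE update_in_batches ()
BEGIN
  DECLARE last_id BIGINT DEFAULT -1;  -- highest recordID already processed (assumes IDs >= 0)
  DECLARE next_id BIGINT;             -- upper bound of the current batch

  batch_loop: LOOP
    -- Upper bound of the next batch: the largest of the next 100,000 recordIDs
    SELECT MAX(recordID) INTO next_id
    FROM (SELECT recordID
          FROM MASTER
          WHERE recordID > last_id
          ORDER BY recordID
          LIMIT 100000) AS batch;

    IF next_id IS NULL THEN
      LEAVE batch_loop;               -- no rows left beyond last_id
    END IF;

    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL
      AND T1.recordID > last_id AND T1.recordID <= next_id;

    SET last_id = next_id;
  END LOOP batch_loop;
END //

DELIMITER ;

CALL update_in_batches();
```

With autocommit enabled, each UPDATE inside the loop commits on its own, so an interrupted run can simply be called again; rows already filled in are skipped by the NEW_USED IS NULL condition.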


FirstDirect (author) commented:
Mankowitz, I think you might be on to something. Another option, since we have ZIP codes in the file, would be to search for and update the records in one ZIP code at a time, using a separate ZIP table as a reference table.

I have an index on ZIP5 (5 Digit Zip Code)

mankowitz commented:
Again, this is an option, but still not the best way to do it.

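DELIMITER //

CREATE PROCEDURE `update_all` ()
BEGIN
  DECLARE x INT DEFAULT 1;   -- ZIP code as a number, 1 through 99999

  WHILE x <= 99999 DO
    -- LPAD restores leading zeros (501 -> '00501'); this assumes zip5 holds
    -- five-character strings. It also compares correctly if zip5 is numeric.
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL AND T1.zip5 = LPAD(x, 5, '0');
    SET x = x + 1;
  END WHILE;
END //

DELIMITER ;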

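
The reference-table idea from the question could look roughly like the sketch below, which walks a cursor over the ZIP table so that only ZIP codes that actually exist are visited, instead of counting through all 99,999 values. The table name ZIP_REF, its zip5 column, and the procedure name update_by_zip are hypothetical, since the thread does not show the reference table.

```sql
DELIMITER //

CREATE PROCEDURE update_by_zip ()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE cur_zip CHAR(5);
  -- ZIP_REF is an assumed reference table with one row per ZIP code
  DECLARE zip_cur CURSOR FOR SELECT zip5 FROM ZIP_REF ORDER BY zip5;
  -- Set the flag when FETCH runs past the last row
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  OPEN zip_cur;
  zip_loop: LOOP
    FETCH zip_cur INTO cur_zip;
    IF done = 1 THEN
      LEAVE zip_loop;
    END IF;

    -- One batch per ZIP code, using the index on MASTER.zip5
    UPDATE MASTER AS T1
    INNER JOIN `DATE-TYPE_07042016` AS T2 ON T1.recordID = T2.recordID
    SET T1.DATE = T2.DATE, T1.NEW_USED = T2.NEW_USED_CODE
    WHERE T1.NEW_USED IS NULL AND T1.zip5 = cur_zip;
  END LOOP zip_loop;
  CLOSE zip_cur;
END //

DELIMITER ;

CALL update_by_zip();
```

Rows in MASTER whose zip5 is NULL or missing from the reference table are never visited by this loop, so a final pass without the ZIP condition would still be needed to catch them.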
