Solved

Most efficient way to alter a 4 billion row InnoDB table?

Posted on 2006-04-05
Medium Priority
747 Views
Last Modified: 2008-02-01
Hi All - I am using MySQL 4.1.10 on Red Hat. I have one HUGE InnoDB table; it has 4.2 billion rows and is almost 500G in size. We are in a situation where we would like to add two columns to the table, e.g., something like:

ALTER TABLE big ADD (
        test_id INTEGER UNSIGNED NOT NULL,
        test_id2 INTEGER UNSIGNED NOT NULL
);
However, in testing with a smaller table (23 million rows), this operation not only takes a long time, but the table almost doubles in size. The size growth seems unusual given that we aren't actually adding data yet. I am wondering if this is expected, whether there is a better way to add the columns, or whether migrating up to MySQL 5 would help?

Thanks!

Jennifer
Question by:jennfir

9 Comments
 
LVL 30
Expert Comment by:todd_farmer
ID: 16386745
But you are adding data: you've defined the columns as NOT NULL, so MySQL stores the implicit default INTEGER value (0) in every row.
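
A quick way to see this on a hypothetical throwaway table (demo is just an illustration, not your schema):

CREATE TABLE demo (id INT UNSIGNED NOT NULL PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO demo (id) VALUES (1);
ALTER TABLE demo ADD new_col INT UNSIGNED NOT NULL;
-- The pre-existing row now carries the implicit default:
SELECT id, new_col FROM demo;   -- returns (1, 0)
DROP TABLE demo;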
 
LVL 30
Expert Comment by:todd_farmer
ID: 16386821
In terms of raw storage, an INT takes up 4 bytes. You have two columns, so you're effectively adding 8 bytes of storage per row. That calculates out to about 30 GB of raw storage. InnoDB has a different storage architecture that I don't know the details of, but it definitely requires more disk space than MyISAM (likely because of multi-versioning for ACID transaction support).
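
As a rough sanity check (the row count is taken from the question), the arithmetic can be done in any MySQL client:

SELECT 4200000000        -- rows in big
     * 8                 -- two new 4-byte INT columns per row
     / POW(1024, 3) AS approx_raw_gb;
-- Yields roughly 31 GB of raw column data, before InnoDB page overhead.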
 
LVL 22
Expert Comment by:NovaDenizen
ID: 16391940
ALTER TABLE works by copying the existing table completely into a new table, then replacing the original with the new one. So, yes, the disk space consumed will more than double during the operation, because the old and new tables must exist simultaneously.

You could do a manual process that would not increase the overall size by much.

CREATE TABLE newbig LIKE big;
ALTER TABLE newbig ADD (
        test_id INTEGER UNSIGNED NOT NULL,
        test_id2 INTEGER UNSIGNED NOT NULL
);

-- Lock both tables to make sure nobody is using them but us.
LOCK TABLES big WRITE, newbig WRITE;

-- Figure out a WHERE condition that matches roughly one Nth of the rows
-- in big; call it $WHERE1.
INSERT INTO newbig (<list fields from big>) SELECT <list fields from big> FROM big WHERE $WHERE1;
DELETE FROM big WHERE $WHERE1;

INSERT INTO newbig (<list fields from big>) SELECT <list fields from big> FROM big WHERE $WHERE2;
DELETE FROM big WHERE $WHERE2;

-- ...and so on, until all the rows have been copied into newbig and
-- deleted from big.
DROP TABLE big;
ALTER TABLE newbig RENAME TO big;
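
For example, if big happened to have an integer primary key id (an assumption; substitute whatever condition splits your data evenly), each slice could be a range on that key:

-- Hypothetical first slice; payload stands in for big's real column list.
INSERT INTO newbig (id, payload) SELECT id, payload FROM big
        WHERE id BETWEEN 1 AND 100000000;
DELETE FROM big WHERE id BETWEEN 1 AND 100000000;
-- Next slice: WHERE id BETWEEN 100000001 AND 200000000, and so on.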


Author Comment by:jennfir
ID: 16394668
Thanks! - I am currently copying the database to another server where I can test; it's taking a while since it's so large. I will try the solution above once it is done and let you know; it will probably be a day or two. I am hoping that with this change the DB will only grow by 30G or so, as Todd suggests.
 
LVL 22
Expert Comment by:NovaDenizen
ID: 16394872
Now I'm not so sure my answer will work well.

The main problem is that the deleted rows could be in the middle of blocks, and these partially used blocks will still be occupied and not recycled for use by newbig. You would wind up in the same situation as with ALTER TABLE: the database files would grow to nearly double their size.

It would be ideal if the $WHERE conditions took the records in the order they are actually recorded on disk, so that blocks could be recycled for use by newbig as they are freed from big. OPTIMIZE TABLE won't help, because it just copies the table the same way ALTER TABLE does.

 
LVL 22
Expert Comment by:NovaDenizen
ID: 16395018
The partial-page thing is a problem. See the last paragraph of http://dev.mysql.com/doc/refman/5.0/en/innodb-file-space.html:

When you delete data from a table, InnoDB contracts the corresponding B-tree indexes. Whether the freed space becomes available for other users depends on whether the pattern of deletes frees individual pages or extents to the tablespace. Dropping a table or deleting all rows from it is guaranteed to release the space to other users, but remember that deleted rows are physically removed only in an (automatic) purge operation after they are no longer needed for transaction rollbacks or consistent reads. (See Section 14.2.12, "Implementation of Multi-Versioning".)
 

Author Comment by:jennfir
ID: 16429997
Thanks NovaDenizen - if I am going to drop table 'big' once the inserts into 'newbig' are done, then it seems as if the space would be released once I restart mysql? I realize I will need double the space until the table is dropped, but I should be able to accommodate that temporarily. Unfortunately, I am having difficulties mounting my NAS device to the server (see http://www.experts-exchange.com/Networking/Linux_Networking/Q_21806263.html), so I am not in any position to test right now. Theoretically it seems like your above solution would work once the table is dropped; am I missing something?

 
LVL 22
Accepted Solution by:NovaDenizen (earned 750 total points)
ID: 16430609
Another problem is that your InnoDB data files will remain at the "high-water mark" once the process is complete. I don't think there's any way to decommission InnoDB data files once they are configured into a mysqld server.

I think I have come up with a better procedure. I'm assuming your table has one primary key field, called id. Another field is necessary to outsmart the query optimizer; use a small field in big, which I'll call other_small_field.

There is no strict guarantee that mysql will return rows in the order in which they occur on disk, but this procedure assumes that the "INSERT INTO batchids SELECT..." query does just that by "outsmarting" the query optimizer. You should try this with your reduced dataset to make sure the process doesn't push the 'high-water mark' up too high.

The procedure:

First, back your data up!  
Drop all indexes on big except the primary key index.

Make sure that "EXPLAIN SELECT id, other_small_field FROM big LIMIT 1000" has possible_keys=NULL, key=NULL, and type=ALL.  I believe this means that this particular query will return records in the order in which they appear on disk, not in some arbitrary index order.

Create the newbig table with the extra two fields and _no_ indexes, not even a primary key. You also need to create a table to hold the ids in each batch:
CREATE TABLE batchids (id <same type as in big> PRIMARY KEY, other_small_field <same as in big>) ENGINE=MEMORY;

These next steps have to be repeated many times until the entire table is copied over. You can change the LIMIT 1000 to whatever value you find most convenient, as long as you think your batchids table can take that many records.
TRUNCATE batchids;
INSERT INTO batchids (id, other_small_field) SELECT id, other_small_field FROM big LIMIT 1000;
INSERT INTO newbig ( <all fields in big> ) SELECT big.<all fields in big> FROM batchids
                  STRAIGHT_JOIN big ON (batchids.id = big.id);
DELETE big FROM batchids, big WHERE batchids.id = big.id;

After all records have been moved out of big:
DROP TABLE batchids;
DROP TABLE big;
ALTER TABLE newbig RENAME TO big, CHANGE COLUMN id id <type of id> PRIMARY KEY, ADD INDEX <add any other indexes that you dropped before>;
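
To make one pass of that loop concrete, here is a sketch under assumed types (INT UNSIGNED id, TINYINT other_small_field) and an assumed column payload standing in for the rest of big's fields; adjust all of these to your actual schema:

CREATE TABLE batchids (
        id INT UNSIGNED NOT NULL PRIMARY KEY,
        other_small_field TINYINT NOT NULL
) ENGINE=MEMORY;

-- One batch; repeat until big is empty.
TRUNCATE batchids;
INSERT INTO batchids (id, other_small_field)
        SELECT id, other_small_field FROM big LIMIT 1000;
INSERT INTO newbig (id, other_small_field, payload)
        SELECT big.id, big.other_small_field, big.payload
        FROM batchids STRAIGHT_JOIN big ON (batchids.id = big.id);
DELETE big FROM batchids, big WHERE batchids.id = big.id;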

 

Author Comment by:jennfir
ID: 16564867
I apologize for not responding recently; this project got tabled due to something else that came up, and I do not know when I will be able to get back to it. But when I do, I will try NovaDenizen's response, so I am going to take that as the accepted answer. Thanks All!