Best way to chunk data from MySQL for import into shared-hosting MySQL database

Posted on 2011-04-23
Last Modified: 2013-12-14
I have a MySQL table with c.1,850 rows and two columns - `ID` (int - not auto-incrementing) and `data` (mediumblob). The table is c.400MiB, with many individual entries exceeding 1MiB and some as large as 4MiB. I must upload it to a typical Linux shared-hosting installation.

So far, I have run into a variety of size restrictions. Bigdump, which effortlessly imported the rest of the database, cannot handle this table, stopping at a different place with each method I have tried (various attempts using SQL and CSV). Direct import using phpMyAdmin has also failed.

I now accept that I have to split the table's content in some way, if the import is ever to be successful. But as (for example) the last CSV displayed 1.6m rows in GVIM (when there are only 1,850 rows in the table), I don't even know where to start with this.

What is the best method? And what settings must I use at export to make the method work?
Question by:hnmcc
    LVL 38

    Expert Comment

    by:Aaron Tomosky
    I'd start by using FTP to get the file to the server, either in chunks or all at once.
    Then you could write a tiny PHP script that uses fopen and fgets to read a line at a time and insert it into the table. Make sure to echo something every hundred rows or so, so that PHP won't time out.
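    That line-at-a-time importer could be sketched roughly like this (a sketch only; the credentials, table name, and dump file name are placeholders, and it assumes a dump with one complete statement per line, e.g. from mysqldump --skip-extended-insert):

```php
<?php
// Sketch: read a SQL dump one line at a time and execute each line.
// Host, credentials, and file name below are assumed values.
$db = new mysqli('localhost', 'user', 'password', 'dbname');
if ($db->connect_error) {
    die('Connect failed: ' . $db->connect_error);
}

$fh = fopen('table_dump.sql', 'r');
if (!$fh) {
    die('Cannot open dump file');
}

$count = 0;
while (($line = fgets($fh)) !== false) {
    $line = trim($line);
    if ($line === '' || substr($line, 0, 2) === '--') {
        continue;                      // skip blank lines and SQL comments
    }
    if (!$db->query($line)) {          // each line is one full statement
        die("Query failed after $count rows: " . $db->error);
    }
    if (++$count % 100 === 0) {
        echo "$count rows\n";          // keep output flowing so PHP won't time out
        flush();
    }
}
fclose($fh);
echo "Done: $count rows\n";
```

    Note that rows of 4MiB will only go in if the target server's max_allowed_packet is at least that large; that limit is worth checking with the host first.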
    LVL 24

    Expert Comment

    Hmmm, I guess your shared-hosting subscription doesn't allow shell access, which is why you are trying these different methods?  In that case I would upload the output of mysqldump to the server, then just ask/tell/demand the hosting provider to import it for me.  If that's not possible, I would try it as Aaron suggested, or maybe I would just switch hosts =).

    LVL 107

    Accepted Solution

    This is an interesting question and it might be best to redesign the application.  First the "why" then the "how"...

    Databases with large blobs of data are terribly inefficient and slow, sometimes to the point of failure (as you have observed).  Consider what happens when someone writes a query that says "SELECT *" -- you get a table scan.  In database terms, a table scan is sort of like dog poop: it's out of place no matter where you find it.  When that table scan involves transferring many megabytes of unnecessary information... well, you get the picture.

    Here is what I might try.  Use ALTER TABLE to add a URL column to the existing database table.  Then go through the table with a query that retrieves each row.  Write each blob out into the file system using a file name based on the key.  Store each URL in the new column of the table.  Then use ALTER TABLE to remove the blob column.  Once you have done that, you have a clean-and-lean database table with two narrow columns, and you have the large files in the file system, where they belong.

    Use FTP to upload your 1,850 files.  Use phpMyAdmin to import the new, lean table.  HTH, ~Ray
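    The migration above could be sketched along these lines (the `ID` and `data` columns come from the question; the table name, credentials, new `url` column, and `files/` directory are assumptions):

```php
<?php
// Sketch: move each blob out to the file system, record its path
// in a new column, then drop the blob column. Names are placeholders.
$db = new mysqli('localhost', 'user', 'password', 'dbname');

$db->query("ALTER TABLE mytable ADD COLUMN url VARCHAR(255)");

$stmt = $db->prepare("UPDATE mytable SET url = ? WHERE ID = ?");

// MYSQLI_USE_RESULT streams rows instead of buffering all 400MiB at once
$res = $db->query("SELECT ID, data FROM mytable", MYSQLI_USE_RESULT);
while ($row = $res->fetch_assoc()) {
    $path = 'files/' . $row['ID'] . '.bin';
    file_put_contents($path, $row['data']);    // blob goes to disk
    $stmt->bind_param('si', $path, $row['ID']);
    $stmt->execute();                          // path goes into the table
}
$res->close();

// Only after verifying every file landed on disk:
$db->query("ALTER TABLE mytable DROP COLUMN data");
```

    After this, the table itself is tiny and imports anywhere; the 1,850 files travel by plain FTP.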
    LVL 25

    Expert Comment

    Forget CSV if you transfer blobs; that is definitely asking for trouble.

    - mysqldump has options that let you control how many rows (or bytes) go into each generated INSERT statement.
    Just cat the resulting file into a mysql client if you can.
    If you can only use phpMyAdmin's import, just copy-paste the stuff, breaking it every 40 megs or so.
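    For instance (a sketch; the database, table, and file names are placeholders):

```shell
# One INSERT per row (--skip-extended-insert) so a chunk boundary can
# never fall mid-statement; --hex-blob keeps binary data import-safe.
mysqldump --skip-extended-insert --hex-blob dbname mytable > table.sql

# Cut the dump into ~40 MB pieces on line boundaries; the resulting
# table_part_00, table_part_01, ... can be imported one at a time.
split -C 40m -d table.sql table_part_
```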

    - You may also write a PHP script that remotely performs an SQL query on your server and imports the received data into the target MySQL. The script can easily do this chunk by chunk (roughly 10-20 lines of code required).
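    That chunk-by-chunk copier might look roughly like this (a sketch; host names, credentials, table name, and chunk size are placeholders, and it assumes the target table exists and is empty):

```php
<?php
// Sketch: pull rows from the source server in small chunks and
// insert them into the target. All connection details are assumed.
$src = new mysqli('source.example.com', 'user', 'pass', 'dbname');
$dst = new mysqli('localhost', 'user', 'pass', 'dbname');

$stmt  = $dst->prepare("INSERT INTO mytable (ID, data) VALUES (?, ?)");
$chunk = 25;    // rows per round trip; keep chunk * blob size modest

for ($offset = 0; ; $offset += $chunk) {
    $res = $src->query("SELECT ID, data FROM mytable ORDER BY ID LIMIT $offset, $chunk");
    if ($res->num_rows === 0) {
        break;                         // no rows left: done
    }
    while ($row = $res->fetch_assoc()) {
        $stmt->bind_param('is', $row['ID'], $row['data']);
        $stmt->execute();
    }
    echo "copied through offset $offset\n";   // keeps PHP from timing out
}
```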

    - You may also send your dump to the target web server, and have a PHP script perform each import query one by one.

    - If your web and MySQL servers are on the same host, you may also consider using "SELECT ... INTO OUTFILE", moving the file to the remote host, and using "LOAD DATA INFILE" to reimport. This is the least painful and most efficient method, if it applies.
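    The outfile round trip could be sketched as follows (paths and the table name are placeholders; note that INTO OUTFILE needs the FILE privilege, which shared hosts often withhold):

```sql
-- On the source server: write the table to a tab-delimited file.
SELECT ID, data INTO OUTFILE '/tmp/mytable.txt'
  FROM mytable;

-- Move /tmp/mytable.txt to the target host (FTP/scp), then on the
-- target server, using the same default delimiters:
LOAD DATA INFILE '/tmp/mytable.txt'
  INTO TABLE mytable (ID, data);
```

    Because both statements use the same default escaping, blob bytes such as newlines and tabs survive the round trip intact.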
