birwin (Canada) asked:

Upload data from a URL to PHP / MySQL

I need to load data from a URL and dump it into a MySQL database using PHP. I will be running this as a cron job, so it needs to happen with no intervention. The data is in .csv format, and the file is large, approximately 75 MB.

I tried a test with this code, which I know works with smaller files.
<?php
set_time_limit(600);
$file = file_get_contents('http://www.xxxxxx.com/download_inventory/somefile.csv', FILE_USE_INCLUDE_PATH);

// some PHP to process $file
?>
Even with the set_time_limit, it appears to time out. I assume the $file variable can't hold 75 MB, so I need some way to write the data to my server's drive without holding the entire file in a variable.

My intention was then to use PHP to process the data and dump it into a database, but there may be a better way. Should I be writing it to the drive and then using LOAD DATA LOCAL INFILE to load it into the database? If so, what is the code to write the data from the remote URL to my server's drive? (Note: I have a dedicated server, so I have full access.) Or is there a way to use MySQL to dump it directly into the database?

So I am looking for the best way to automate getting the data from the URL daily and updating my database with it.

I'd appreciate your help.
f_o_o_k_y (Poland) replied:

Hello,
The execution time limit is not the only limit you must consider. Another is memory utilization; check whether your PHP script is allowed to consume 75 MB.

I think you can download the file in a loop: for example, read 4 KB and write it to a file on the server, then read the next 4 KB.

example: https://www.experts-exchange.com/questions/21526090/file-get-contents-limiting-file-size.html

and then fwrite each part to the local file, as in the sketch below.
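
For example, something like this; it is only a rough, untested sketch, and the URL, local path and chunk size are placeholders you will need to adjust:

<?php
// Rough sketch: read the remote CSV a few KB at a time and append each piece
// to a local file, so the whole 75 MB never has to sit in memory at once.
// Requires allow_url_fopen = On; the URL and local path are placeholders.
$remote = fopen('http://www.xxxxxx.com/download_inventory/somefile.csv', 'rb');
$local  = fopen('/some/tmp/dir/somefile.csv', 'wb');

if ($remote === false || $local === false) {
    die("Could not open the remote URL or the local file\n");
}

while (!feof($remote)) {
    $chunk = fread($remote, 4096);   // read 4 KB ...
    fwrite($local, $chunk);          // ... and write it straight to disk
}

fclose($remote);
fclose($local);
?>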

I hope this will help you.
Best Regards
FooKy
ASKER CERTIFIED SOLUTION
Because of the size of the data you expect, it is probably better, as a first step, to redirect everything directly into a temporary file instead of holding it in a variable with file_get_contents. For this you can use the ob_start function, which accepts a callback function as a parameter; in that callback you write the output buffer to a file. You should also use readfile, which writes the data it reads directly into the output buffer.

To open URLs with readfile you must set allow_url_fopen = On in your php.ini.

And if you have safe_mode on, be aware that the script's owner and the owner of the target directory where the temporary file is written must be the same.

I do not believe timeouts are the problem, because you will be running the PHP script from the shell (via cron) and not inside a web server. Using the output buffer callback should also take care of memory limit problems.
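
For the daily automation, a crontab entry could look something like the line below; the PHP binary path, the script name and the log file are placeholders for whatever you actually use:

# Run the import every night at 03:15; paths are placeholders.
15 3 * * * /usr/bin/php /path/to/import_inventory.php >> /var/log/import_inventory.log 2>&1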

Hope this solves the first part of your problem. The second part, updating the database, should be easy now; see the LOAD DATA sketch after the code below.

Regards,

Oli
// Open the temporary file that will receive the downloaded CSV.
$ob_file = fopen('/some/tmp/dir/some_tmp_file.csv', 'w');

// Callback for the output buffer: append each buffered chunk to the file
// and return an empty string so nothing is echoed to stdout.
function write_ob($buffer)
{
  global $ob_file;
  fwrite($ob_file, $buffer);
  return '';
}

// Flush the buffer to the callback every 4 KB so the whole file is never
// held in memory, then let readfile stream the URL into the buffer.
ob_start('write_ob', 4096);
readfile('http://www.xxxxxx.com/download_inventory/somefile.csv');
ob_end_flush();
fclose($ob_file);

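For the second part, the LOAD DATA step could look roughly like the sketch below. The table name "inventory", the column list, the delimiters, the header-row IGNORE and the connection details are all assumptions; adjust them to your actual schema and to how the CSV file is formatted. Note that LOCAL also has to be permitted on both the MySQL client and the server.

<?php
// Sketch only: load the downloaded CSV straight into MySQL with LOAD DATA.
// Table name, columns, delimiters and credentials below are placeholders.
$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true);   // LOCAL must be allowed on client and server
$db->real_connect('localhost', 'db_user', 'db_pass', 'db_name');

$sql = "LOAD DATA LOCAL INFILE '/some/tmp/dir/some_tmp_file.csv'
        INTO TABLE inventory
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES
        (sku, description, qty, price)";   // IGNORE 1 LINES assumes a header row

if (!$db->query($sql)) {
    die('LOAD DATA failed: ' . $db->error . "\n");
}
$db->close();
?>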

birwin (ASKER):

Thanks FooKy:

I had to raise the memory limit to 100M with ini_set('memory_limit', '100M'), I assume due to processing overhead, and it works.
I am still trying to find a way to dump it directly to the database, but that will be another question. You gave me the answer to get my code to work.

I appreciate your prompt help.