Solved

How can I make this more efficient?

Posted on 2014-09-23
128 Views
Last Modified: 2014-09-26
I'm parsing out a JSON file using the following code:

// This input should come from somewhere else; it is hard-coded in this example
$file_name = '00_8ptcd6jgjn201311060000_day.json.gz';
// Raising this value may increase performance
$buffer_size = 4096; // read 4 KB at a time
$out_file_name = str_replace('.gz', '', $file_name);
// Open our files (in binary mode)
$file = gzopen($file_name, 'rb');
$out_file = fopen($out_file_name, 'wb');
// Keep repeating until the end of the input file
while (!gzeof($file)) {
    // Read buffer-size bytes
    // Both fwrite and gzread are binary-safe
    fwrite($out_file, gzread($file, $buffer_size));
}
// Files are done, close them
fclose($out_file);
gzclose($file);

$jsondata = file_get_contents("00_8ptcd6jgjn201311060000_day.json");
$json = json_decode($jsondata, true);
//echo $json
$output = "<ul>";
foreach ($json['id'] as $id) {
    $output .= "<h4>" . $id . "</h4>";
    $output .= "<li>" . $id['actor/id'] . "</li>";
    $output .= "<li>" . $id['actor/displayName'] . "</li>";
    $output .= "<li>" . $id['actor/postedTime'] . "</li>";
    $output .= "<li>" . $id['generator/displayName'] . "</li>";
    $output .= "<li>" . $id['geo/type'] . "</li>";
    $output .= "<li>" . $id['geo/coordinates/0'] . "</li>";
    $output .= "<li>" . $id['geo/coordinates/1'] . "</li>";
}
$output .= "</ul>";
echo $output;

The first part, decompressing the file, works fine. The problem comes when I print the output. I get this:

( ! ) Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1086054108 bytes) in C:\wamp\www\json\uncompress.php on line 30

How can I process things incrementally so I don't exhaust the memory limit?
Question by:brucegust
6 Comments
 

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 40340426
What is line 30?  Compare these numbers.  I think we will have to find a way to process this data incrementally.

   134,217,728 - memory limit
1,086,054,108 - requirement
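
One way to do that, assuming the decompressed file is newline-delimited JSON (one complete object per line; if it is instead a single giant JSON document, you would need an event/streaming parser), is to read straight from the .gz and decode one record at a time, so the whole file is never in memory at once. A sketch:

$gz = gzopen('00_8ptcd6jgjn201311060000_day.json.gz', 'rb');
if (!$gz) {
    die('Could not open the gzip file');
}
while (!gzeof($gz)) {
    $line = gzgets($gz, 8192);          // read one line, up to 8 KB
    if (trim($line) === '') {
        continue;                       // skip blank lines
    }
    $record = json_decode($line, true); // decode just this one record
    if ($record !== null) {
        // process $record here -- only one record is in memory at a time
    }
}
gzclose($gz);

This also skips writing the intermediate uncompressed file to disk.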
 

Author Comment

by:brucegust
ID: 40341512
Morning, Ray!

I agree. The size of the file is 1,060,592 KB, so doing things in stages is going to be essential.

The file, by the way, is a decompressed JSON file that I need to parse and then insert into a database. I've got 365 such files to process that way. After it's all done, I plan on writing a script that exports the parsed results from the database to a CSV file.

That's the goal for today.

What do you think?
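
For the 365-file batch and the CSV export, a rough outer skeleton (the glob() pattern and process_file() here are placeholders, not real names from the project):

// Hypothetical: loop over the 365 daily files, one at a time
foreach (glob('*_day.json.gz') as $file_name) {
    process_file($file_name);  // decompress + parse + insert for one file
}

// Later, the export can stream row by row from the database query with
// fputcsv(), so the CSV step never holds the whole result set in memory:
$out = fopen('export.csv', 'w');
while ($row = $stmt->fetch(PDO::FETCH_NUM)) {  // $stmt: a previously executed SELECT
    fputcsv($out, $row);
}
fclose($out);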
 

Expert Comment

by:Ray Paseur
ID: 40341760
PHP may not be the right tool for this, or you may need to get a very, very large server and increase the memory limit to the stratosphere.  What instruction is on line 30?
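
(For completeness: the ceiling can be raised at runtime, though with a file this size that only postpones the problem. A sketch:

ini_set('memory_limit', '2048M');  // or '-1' to remove the limit entirely

json_decode() on a string this large can need several times the file size in RAM, since PHP arrays carry significant per-element overhead, so incremental processing is still the better route.)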
 

Author Comment

by:brucegust
ID: 40341875
Hey, Ray!

The "instruction" at line 30 is $jsondata=file_get_contents("00_8ptcd6jgjn201311060000_day.json");

Since getting into work this a.m., I've been trying to figure out how to break the elephant down into bite-sized pieces, and I've yet to figure it out.

Here's what I've got thus far:

$jsondata = file_get_contents("00_8ptcd6jgjn201311060000_day.json");
// breaking the elephant down into byte sized pieces
$json_size = 4096;
$buffer = fgets($jsondata, $json_size);  // fgets() expects a file handle, not a string
$json = json_decode(($buffer), true);
//echo $json
while (!feof($json))                     // feof() also expects a file handle
{

Problem is, I can't get to "$buffer" because I'm getting hung up on $jsondata, in light of the file being over 1 GB.

Is there a way to do something like $jsondata=file_get_contents($file_name, $json_size)?

I see how it works with fgets, but how about on the "get_contents" side?
 

Expert Comment

by:Ray Paseur
ID: 40341923
Where did that file come from?  Is there a URL that I can read?

PHP functions are all documented in the online man pages.  Example:
http://php.net/manual/en/function.file-get-contents.php
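
That man page answers the question above directly: file_get_contents() takes optional offset and length parameters, so a slice can be read without loading the rest of the file:

// Read the first 4 KB only; the arguments are
// (filename, use_include_path, context, offset, length)
$chunk = file_get_contents('00_8ptcd6jgjn201311060000_day.json', false, null, 0, 4096);

The catch is that an arbitrary 4 KB slice will almost never end on a JSON record boundary, which is why an fopen()/fgets() loop over line-delimited data is usually the cleaner approach.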
 

Author Comment

by:brucegust
ID: 40346141
Ray, here's what I came up with:

$chunk_size = 4096;
$url = '00_8ptcd6jgjn201311060000_day.json';
$handle = fopen($url, 'r');
if (!$handle) {
    die("failed to open JSON file");  // bail out; looping on a bad handle would spin forever
}
while (!feof($handle)) {
    $buffer = fgets($handle, $chunk_size);
    if (trim($buffer) !== '') {
        $obj = json_decode($buffer, true);
        //the rest of my code
    }
}
fclose($handle);

It works!

Thanks for your help!
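
For the database step mentioned earlier in the thread, a hypothetical sketch of the insert inside that loop, using a PDO prepared statement (the DSN, credentials, table, and column names are all invented, and the nested keys are a guess at the decoded structure, not the author's actual code):

// Set up once, before the read loop (hypothetical connection details):
$pdo = new PDO('mysql:host=localhost;dbname=json_archive;charset=utf8', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $pdo->prepare('INSERT INTO activities (actor_id, actor_name, posted_time) VALUES (?, ?, ?)');

// Inside the fgets() loop, after $obj = json_decode($buffer, true):
$stmt->execute(array(
    isset($obj['actor']['id']) ? $obj['actor']['id'] : null,
    isset($obj['actor']['displayName']) ? $obj['actor']['displayName'] : null,
    isset($obj['postedTime']) ? $obj['postedTime'] : null,
));

Preparing the statement once and executing it per record keeps the per-row overhead low across a 1 GB file.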
