Solved

How would I decompress and parse 365 JSON files, each over a gig?

Posted on 2014-09-25
Last Modified: 2014-09-26
I've got the script I need to decompress and parse the files, but what I need now is something that can "look" into the directory and automatically grab each file, process it and then move on to the next.

I can't imagine how I could do that, unless I had the name of each file loaded in a database somewhere.

Perhaps some clever spin on fopen?

How could I set it up so that I can start the process as I'm heading out the door, have it continue automatically through the night, and find it done in the morning?
Question by:brucegust
6 Comments
 
LVL 9

Assisted Solution

by:Brian Tao
Brian Tao earned 125 total points
ID: 40345391
This skeleton may be what you need:
// $dir_name is assumed to hold the path of the directory to scan
if ($dh = opendir($dir_name)) {
  while (($file = readdir($dh)) !== false) {
    // code for processing each individual file
    // e.g. print the file name
    echo "$file <br>\n";
  }
  closedir($dh);
}

 
LVL 109

Assisted Solution

by:Ray Paseur
Ray Paseur earned 250 total points
ID: 40345909
Here is what I would do.

Get a list of the files; scandir() will handle that part.  Then, for each file name, start a process to do whatever you want with the file.  You can use fsockopen() or cURL to start the process: give the processing script the name of the file and let it run.  Start the process with a POST-method request, so you can disconnect and let it run asynchronously.

You probably want to sleep() a few moments between starting the processes.  You probably want to keep a log of the file names and a timestamp when the process was started.  You probably want to keep a log of the times when each process ended, so you know what succeeded and what failed.
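
The approach described above might be sketched roughly as follows. This is only an illustration under assumptions: the function names, the worker script path, the log file, and the pause length are all hypothetical, and the request is written by hand over fsockopen() rather than with cURL.

```php
<?php
// Hypothetical names throughout: build_post_request(), start_worker(),
// start_all(), the worker path, and the log file are illustrative only.

/** Build a raw HTTP/1.1 POST request that carries the file name. */
function build_post_request($host, $path, $file)
{
    $body = http_build_query(array('file' => $file));
    return "POST $path HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($body) . "\r\n"
         . "Connection: close\r\n\r\n"
         . $body;
}

/** Send the request, then disconnect without reading the response. */
function start_worker($host, $port, $path, $file)
{
    $fp = @fsockopen($host, $port, $errno, $errstr, 5);
    if ($fp === false) {
        return false; // could not connect
    }
    fwrite($fp, build_post_request($host, $path, $file));
    fclose($fp);
    return true;
}

/** Start one worker per regular file in $dir and log each start time. */
function start_all($dir, $host, $port, $path, $log, $pause = 2)
{
    foreach (scandir($dir) as $file) {
        // is_file() is false for "." and "..", so they are skipped here
        if (!is_file($dir . DIRECTORY_SEPARATOR . $file)) {
            continue;
        }
        $ok   = start_worker($host, $port, $path, $file);
        $line = date('c') . ' ' . ($ok ? 'started' : 'FAILED') . " $file\n";
        file_put_contents($log, $line, FILE_APPEND);
        sleep($pause); // brief gap between process starts
    }
}
```

A call such as start_all('/data/json', 'localhost', 80, '/worker.php', '/tmp/starts.log') would be one way to kick it off; every one of those values is a placeholder. The worker script itself would presumably need ignore_user_abort(true) and set_time_limit(0) so that it keeps running after the caller disconnects, and it could append its own line to a log when it finishes.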
 

Author Comment

by:brucegust
ID: 40346075
Gentlemen!

Thanks so much for your willingness to share your expertise!

Question: In both your examples, the output includes two rows of "blank" values. By that I mean, in my current directory, I have one file. Rather than that file being listed by itself, with taoyipai's suggestion I get:

.
 ..
 00_8ptcd6jgjn201311060000_day.json

Ray, with your scenario I get:

Array ( [0] => . [1] => .. [2] => 00_8ptcd6jgjn201311060000_day.json ) Array ( [0] => 00_8ptcd6jgjn201311060000_day.json [1] => .. [2] => . )

Again, you're getting those "dots" and I'm wondering, first of all, what they represent and, secondly, how I can remove them from the list of files that I want to perform some code on. In other words, how do I ensure that the list of files in the directory does not include "." and ".."?

 
LVL 10

Assisted Solution

by:Chris_Gralike
Chris_Gralike earned 125 total points
ID: 40346142
This will remove the dots from the array.

$d = scandir($path);
$files = array();

foreach ($d as $v) {
        // keep everything except the "." and ".." directory entries
        if ($v !== '.' && $v !== '..') {
                $files[] = $v;
        }
}

print_r($files);

 
LVL 109

Accepted Solution

by:Ray Paseur
Ray Paseur earned 250 total points
ID: 40346153
Re: "getting those 'dots' and I'm wondering, first of all, what they represent"

They are directory entries: "." refers to the current directory and ".." to its parent.  Both are irrelevant to your application.  You would only be looking for files that end in ".json", right?  Skip everything else as you process the array.
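
As a minimal sketch of that filtering, assuming the files of interest are exactly those with a ".json" extension (the helper name json_files() is hypothetical, not something from this thread):

```php
<?php
/**
 * Return the names of the ".json" files in $dir, in scandir() order.
 * json_files() is a hypothetical helper name used for illustration.
 */
function json_files($dir)
{
    $files = array();
    foreach (scandir($dir) as $name) {
        // "." and ".." have no "json" extension, so they drop out as well
        $isJson = strtolower(pathinfo($name, PATHINFO_EXTENSION)) === 'json';
        if ($isJson && is_file($dir . DIRECTORY_SEPARATOR . $name)) {
            $files[] = $name;
        }
    }
    return $files;
}
```

Filtering on the extension rather than on the two dot names also skips any other stray files or subdirectories that happen to sit alongside the data. The comparison here is case-insensitive, which is a design choice rather than a requirement.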
 

Author Comment

by:brucegust
ID: 40346172
Got it!

Thank you!

Also, feel free to head out to http://www.experts-exchange.com/Programming/Languages/Scripting/PHP/Q_28526319.html for a question that pertains to the next piece of scaffolding for this project...