Kim Walker asked:

Problem reading .csv file from zip archive using PHP

I'm having a problem reading the contents of a .csv file from a zip archive in PHP. The zip archive contains several .csv files and is approximately 5K in size. My script works when it only loops through the archive and echoes the file names, but as soon as I add the line to read the contents of a file, the page hangs and eventually results in a "Network Error (tcp_error)" "Operation timed out."

Here is my code. It's pretty basic and practically copied from the documentation.
if (is_file($rpt) ) {
	if ($zip = zip_open($rpt) ) {
		while ($zip_entry = zip_read($zip) ) {
			if (zip_entry_name($zip_entry) == 'config.csv') {
				if (zip_entry_open($zip,$zip_entry) ) {
					if ($buff =  zip_entry_read($zip_entry) ) {
						echo $buff;
					}
				}
			} else {
				echo zip_entry_name($zip_entry)."\n";
			}
		}
		zip_close($zip);
	}
}


Here are the contents of the zip file I'm trying to read:
Key,Value
AccountID,494
AccountName,XXXXXXXXXXXXXXXXX
ScheduleID,45
EndPoint,http://xxxxxxxxxxxxx.xxx/REST/programservice/GetOutboundReportData
Email,xxxxxxxxxx@xxxxxxxxxxx.com
FtpUrl,ftp://XXXXXXXXXXXXX.com
FtpUsername,XXXXXXXXXX
FtpPassword,XXXXXXXXX


PHP

Last comment: Kim Walker, 8/22/2022 - Mon
SOLUTION
rinfo

Ray Paseur

Please post a link to your test data, and I'll see if I can give you a tested and working code sample.  Thanks, ~Ray
ASKER
Kim Walker

rinfo, I copied your solution, removed lines 1, 2, 24-27, and added a closing parenthesis to line 3, and inserted in place of the 16 lines I posted in my question. It produced the same results.

Thanks to Ray_Paseur, however, I've discovered the problem is in the archive I'm trying to unzip. I redacted the files in the archive and re-uploaded them to the server and both our scripts generated the appropriate output from the redacted files.

The archive is produced by a service provider and uploaded automatically by them every thirty minutes. I need to write a script to access those files and append the contents to a database that is used to generate a dynamic report. The data is incremental, so if I miss one archive, the dynamic report is inaccurate.

Can you suggest a reason that PHP might not be able to unzip the archive but I can unzip it on my local computer? I can even decompress the archive and re-compress it without modification on my local computer, upload it to the server and execute the script without error. But of course, I can't do that every thirty minutes.

Ray_Paseur, is there a way to upload the original archive and delete it after you've looked at it? It contains personally identifiable information, so I'm reluctant to upload it permanently.
Ray Paseur

I am pretty sure I can delete the file for you, but I hope you can give us test data instead of live data -- well-crafted test data is a requirement for successful programming.
ASKER
Kim Walker

This is data from test subjects all of whom are adults and none of the data is confidential. But I would appreciate it if we can delete it when the question is closed.
ASKER
Kim Walker

It appears that I can't upload the file to EE. I gave up when it hadn't finished after several minutes. So I've posted it where I can delete it myself when the time comes. I should have thought of that earlier.
SOLUTION
Ray Paseur

ASKER
Kim Walker

Thanks, Ray_Paseur. Miraculously, the script is working now -- though it does push the 60-second timeout limit and the re-zipped archive processes almost instantaneously. I'm going to increase the time limit and proceed as is. But I will contact our service provider to see if they have any alternatives to offer. I'll post an update if anything changes in the next couple of days. But I expect to close the question with my own resolution and, since it's a new month, award some points for effort.

Thanks,
XMediaMan
Ray Paseur

Glad you've got a working solution, but I still sense a disconnect here.  You can make PHP scripts survive and run as long as you want with set_time_limit(). I wonder why there is a difference in time between two ZIP archives of essentially the same data.  This sounds like a bug in the ZIP extension!
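
For reference, a minimal sketch of the set_time_limit() call Ray mentions. The 300-second value is an assumption chosen for illustration, not something taken from this thread's script.

```php
<?php
// Raise PHP's own execution time limit for the current request only.
// 300 seconds is an assumed value; passing 0 removes the limit entirely.
$raised = set_time_limit(300);

if ($raised === false) {
    // The call can be refused, depending on how the server is configured.
    error_log('set_time_limit() was refused; check the server configuration');
}
```

This only changes PHP's internal timer. A web server or FastCGI layer in front of PHP may enforce its own, separate timeout that this call does not affect.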
ASKER
Kim Walker

You're absolutely right, Ray_Paseur. I felt the same way. Now it appears that the server has been serving the last rendered page instead of the error. It seemed that no matter what changes I made to my script, I was getting the exact same results. I just looked at my error logs and this is what I found repeated over and over.
[Sun Sep 01 16:41:41 2013] [warn] [client 67.213.33.1] mod_fcgid: read data timeout in 45 seconds
[Sun Sep 01 16:41:41 2013] [error] [client 67.213.33.1] Premature end of script headers: process_report.php


Only after stopping and restarting Apache am I again getting the timeout errors in my browser even after increasing the timeout to 5 minutes.

This would not have been a viable solution anyway when I start to get these reports every 30 minutes for 10-15 different clients.

Now it's time for my service provider to start providing better service!
ASKER CERTIFIED SOLUTION
ASKER
Kim Walker

Good information, Slick812. Unfortunately, my 7z program doesn't give me any details of this nature. Can you suggest a free app that would?
Member_2_248744

I have the 7z program and I know it handles many ZIP file configurations, but I do not see any file analysis in it. Sorry, I cannot recommend any programs at this time and cannot take the time to look right now. However, you seem like a capable developer, and web searches are my greatest programming resource and "savior" in doing code work, so a search for zip file analysis may turn up something.
The producer of this file may have some info about it. I have seen some Bzip2 files with a .zip extension, and I know that PHP has a separate decompression extension for Bzip2, so that may be worth a shot.
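
One way to test the "wrong extension" theory without a separate analysis tool is to read the first few bytes of the file and compare them with well-known format signatures. The sketch below is only an illustration: detectArchiveType() is a hypothetical helper name, and it identifies the container format only, not the compression method used inside a real zip archive.

```php
<?php
// Identify a file's container format from its leading "magic" bytes.
// detectArchiveType() is a hypothetical helper, not part of any PHP extension.
function detectArchiveType(string $path): string
{
    if (!is_readable($path)) {
        return 'unreadable';
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return 'unreadable';
    }
    $head = (string) fread($handle, 6);   // the longest signature below is 6 bytes
    fclose($handle);

    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';                      // zip local file header
    }
    if (strncmp($head, "BZh", 3) === 0) {
        return 'bzip2';
    }
    if (strncmp($head, "7z\xBC\xAF\x27\x1C", 6) === 0) {
        return '7z';
    }
    if (strncmp($head, "\x1F\x8B", 2) === 0) {
        return 'gzip';
    }
    return 'unknown';
}
```

Calling detectArchiveType() on one of the uploaded report archives would show whether the .zip extension matches the actual container before any extraction is attempted.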
ASKER
Kim Walker

My service provider has responded that they use "7za" and included a file named 7za in their response. They also included a php file they described as "code that we have used to unzip." The php file contains nearly 750 lines of code. The only comments appear to be lines of code they've disabled. On line 713 is this function definition:
function unZipOnLinux($domain,$sourceFileName,$destinationPath){
  $destinationPath = $destinationPath.'/';
  $directoryPos = strrpos($sourceFileName,'/');
  $directory = substr($sourceFileName,0,$directoryPos+1);
  $dir = opendir( $directory );
  $info = pathinfo($sourceFileName);
  if ( strtolower($info['extension']) == 'zip' ) {
    echo '7za e '.$sourceFileName .'  -o'. $destinationPath.'<br>';
    //system('unzip -q '.$sourceFileName .'  -d '. $destinationPath);
    system('/var/www/vhosts/'.$domain.'/httpdocs/dashboard/upload/7za e '.$sourceFileName .'  -o'. $destinationPath);
  }
  closedir( $dir );
}


Do I upload the 7za file to my working folder?

They appear to extract a path from the $sourceFileName variable (see lines 3 and 4 in the code above). So do I submit the entire path to the zipped file or just the relative path from the 7za file?

I've searched for examples or commentary of using 7za in php on linux and have come up empty. What little I have found uses shell_exec.

Any advice would be helpful.
Member_2_248744

I read your last post and to me the "Important" line of code is this -
7za e '.$sourceFileName .'  -o'. $destinationPath

which seems to correspond to -
echo '7za e '.$sourceFileName .'  -o'. $destinationPath.'<br>';

To me this looks like it calls a Linux executable (7za) through system(), with the e command and the -o switch, and it was substituted for this line:
system('unzip -q '.$sourceFileName .'  -d '. $destinationPath);
which uses the standard unzip method.

I would think that 7za is the 7z archive program for Linux and that, instead of using the .7z file extension, they use the .zip extension.
I do know that the 7z archive file format is NOT compatible with the standard .zip archive file format. You can set 7z to produce the standard zip file format by default, but apparently they did not do this.

For you to use this line:
system('/var/www/vhosts/'.$domain.'/httpdocs/dashboard/upload/7za e '.$sourceFileName .'  -o'. $destinationPath);

on your server, you would need to have the 7za archive program installed, and on a file path that your PHP can reach through one of the PHP system function calls, like:
shell_exec('7za e '.$sourceFileName .'  -o'. $destinationPath);
exec('7za e '.$sourceFileName .'  -o'. $destinationPath);
system('7za e '.$sourceFileName .'  -o'. $destinationPath);
Not all of these system functions are available (maybe none) in various PHP/Linux setups, and they vary somewhat in what they return.
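
As a hedged illustration of those differences, the sketch below wraps exec() so the caller gets both the exit status and the output lines, and it shell-quotes the paths before they reach the command line. buildExtractCommand() and runCommand() are hypothetical helper names, and the quoting is an addition that the provider's code does not have.

```php
<?php
// Build the 7za command line with every path shell-quoted.
// buildExtractCommand() is a hypothetical helper name.
function buildExtractCommand(string $archive, string $destination): string
{
    // "e" extracts files; -o names the output directory and takes its
    // value with no space after the switch, as in the provider's code.
    return '7za e ' . escapeshellarg($archive) . ' -o' . escapeshellarg($destination);
}

// Run a command through exec() and return its exit status and output lines.
// runCommand() is a hypothetical helper name.
function runCommand(string $command): array
{
    // function_exists() reports false when exec() is missing or disabled.
    if (!function_exists('exec')) {
        throw new RuntimeException('exec() is not available in this PHP setup');
    }
    $lines = [];
    $status = -1;
    exec($command . ' 2>&1', $lines, $status);   // fold stderr into the captured output
    return ['status' => $status, 'output' => $lines];
}
```

A call such as runCommand(buildExtractCommand($sourceFileName, $destinationPath)) returns a non-zero 'status' when the command fails or 7za cannot be found, which is easier to act on than the echoed text that system() produces.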

In this system call it looks like the 7za executable is installed in this directory:
/var/www/vhosts/thisDomain/httpdocs/dashboard/upload/

I could be wrong about that, since this seems like a highly unusual place to have an archive program on Linux.

Anyhow, if you are not so familiar with the PHP system functions on Linux, you might start out with the easy:
$output = shell_exec('ls -lart');
echo "<pre>$output</pre>"; // from the manual; runs "ls" through the shell to list the files in a directory

just to see if shell_exec() works. You can use the various Linux install methods to get and install 7z, but you may find better help for this than mine in the EE Linux section.

I just found these, which may shed some light:
https://www.ibm.com/developerworks/community/blogs/6e6f6d1b-95c3-46df-8a26-b7efd8ee4b57/entry/how_to_use_7zip_on_linux_command_line144?lang=en
http://www.dotnetperls.com/7-zip-examples
ASKER
Kim Walker

I finally found a site with instructions for installing the p7zip package properly on my CentOS Linux installation. With proper installation, I don't need to include a path to the 7za binary. I can expand the archive with the following PHP command:
system('7za e /var/www/vhosts/domain.com/reports/report-*.zip');


This expands the files to the same folder as the archive where I can process them and delete them.
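
The processing step itself is not shown in this thread. A minimal sketch of reading one extracted file, assuming it has the two-column Key,Value layout of the config.csv posted in the question, might look like this. readKeyValueCsv() is a hypothetical helper name.

```php
<?php
// Read a two-column "Key,Value" CSV file into an associative array.
// readKeyValueCsv() is a hypothetical helper name.
function readKeyValueCsv(string $path): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $pairs = [];
    $isHeader = true;
    // Separator, enclosure and escape are passed explicitly for clarity.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($isHeader) {           // skip the "Key,Value" header row
            $isHeader = false;
            continue;
        }
        if (count($row) < 2) {     // skip blank or malformed lines
            continue;
        }
        $pairs[$row[0]] = $row[1];
    }
    fclose($handle);
    return $pairs;
}
```

After the values have been stored, unlink() on the extracted file removes it, so the next archive extracts into a clean folder.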
ASKER
Kim Walker

I've split the points according to how much your comment contributed to my own resolution. I doubt if I'd have solved this without your comments. Thanks.