  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 86

Download Problem

This PHP program is used to download files from a web server.

It works perfectly unless the file's extension is pptx (Microsoft PowerPoint).

<?php
date_default_timezone_set("America/Los_Angeles");
// download_docs.php
require( 'wp-load.php' );
global $wpdb;
$nf = 0;
$files = array();
$did = array();
$title = array();
foreach( $_POST as $var => $val ) {
	if (substr($var,0,2) == "ck") {
		if ($val == "on") {
			$ln = strlen($var);
			$docid = intval(substr($var,2,$ln-2));
			$qry = "SELECT * from documents where docid = " . $docid;
			$doc = $wpdb->get_row($qry, ARRAY_A);
			$files[$nf] = "docs/" . $doc['file'];
			$did[$nf] = $doc['docid'];
			$title[$nf] = $doc['title'];
			$nf++;
		}
	}
}	
echo "nf = " . $nf . "<br>";
session_start();
// insert into log
$tdate = date('Y-m-d H:i:s');
for ($i = 0; $i < $nf; $i++) {
	$qryi = "INSERT into dnld_docs VALUES('" . $_SESSION['uid'] . "', '" . $tdate . "', " . $did[$i] . ", '" . $title[$i] . "')";
	$rs = $wpdb->get_results($qryi);
}	
// zip em up
if ($nf > 1) {
	// get the part of the user email BEFORE the @
	$emp = explode("@", $_SESSION['uid']);
	$id = $emp[0];
	$zip = new ZipArchive();
	$zipname = $id . "zip.zip";
	$path = "zips/" . $zipname;
	//echo "path = " . $path . "<br>";
	$zip->open($path, ZipArchive::CREATE);
	$dirname = $id . "files";
	if (file_exists($dirname)) {
		array_map('unlink', glob("$dirname/*.*"));
		rmdir($dirname);
	}
	mkdir($dirname);	
	for ($j = 0; $j < $nf; $j++) {
		// get filename AFTER "/"
		$ln = strlen($files[$j]);
		$slash = strpos($files[$j], "/");
		$thefile = substr($files[$j], $slash+1, $ln-1);
		$indir = $dirname . "/" . $thefile;
		copy($files[$j], $indir);
	}
	$dfiles = scandir($dirname);
	foreach ($dfiles as $file) {		
		$filename = $dirname . "/" . $file;
		//echo "file = " . $file . "<br>";
		if ($file == "." || $file == ".." ) continue;
		if ( !is_file($filename) )        continue;
    // ACTIVATE THIS FOR A PROGRESS REPORT: echo PHP_EOL . "$filename: $file";
		$zip->addFile($filename, $file 	);
	}	
	$zip->close();
} else {
	$path = $files[0];
	//echo "path on file " . $path;
}	
// download
	header('Content-type: application/force-download'); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);

?>



Note that if there is a SINGLE file being downloaded, it just downloads that file. If there are two or more, it creates a zip file.

As it functions now, it creates a meaningless zip file from a single pptx file. The zip contains other stuff, NOT the pptx file.

Does anyone know of issues with downloading pptx files & how to get around it?

Thanks
Asked by: Richard Korts
2 Solutions
 
Julian HansenCommented:
What happens when it is a PowerPoint file?
0
 
gr8gonzoConsultantCommented:
A .pptx file IS a ZIP file. All of the newer Office formats that end in "x" are actually ZIP files. Just rename it back to .pptx and open it with PowerPoint.
1
 
Richard KortsAuthor Commented:
Julian,

Now it does the attached (image1 & image2).

This is BRAND new (the 2nd part). I selected "Open with" and Windows Explorer in image1.

Before it created a zip file I could download that had meaningless content.

Now it cannot download a pdf that it could before. On the pdf, I manually downloaded it with ftp, it was fine, opened it in Acrobat Reader just fine.

There HAS to be something wrong with this:
header('Content-type: application/force-download'); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);



But I've used this without issue in other places.
0

 
Richard KortsAuthor Commented:
Julian, I forgot the images
Image1.jpg
Image2.jpg
0
 
gr8gonzoConsultantCommented:
You should also remove the echo on line 24:
echo "nf = " . $nf . "<br>";

Not only is it in the wrong position, where it can cause warnings/errors, but if you're trying to stream a file, it's going to get added to the file contents for download.
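
As a rough illustration of the point above: if output buffering happens to be active (an assumption; it depends on ob_start() or the output_buffering ini setting), stray output can be thrown away before the headers are sent. The helper name discardBufferedOutput() is hypothetical and not from the posted script. If buffering is off, the echoed text has already gone to the browser, and removing the echo is the only fix.

```php
<?php
// Hypothetical helper (not part of the original script): empties every
// active output buffer and returns the text that was discarded, so stray
// echo output does not end up inside the downloaded file.
function discardBufferedOutput(): string
{
    $discarded = '';
    while (ob_get_level() > 0) {
        $chunk = ob_get_clean();
        if ($chunk === false) {
            break; // buffer could not be removed; stop rather than loop forever
        }
        $discarded .= $chunk;
    }
    return $discarded;
}
```

Calling this just before the header() lines, and logging the returned string, would also show exactly what was about to be prepended to the file.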
0
 
Richard KortsAuthor Commented:
To All,

What gr8gonzo said is DEFINITELY true. I took out the echo, and now I can download PDF and Excel files; there is also a Word file I have not tried yet.

But the original problem remains for pptx.

It tries to download it as a zip, which would be fine if the pptx was in the zip.

See attached images. In expanding the zip, this is what I get, which is of course unacceptable to the customer.

So the original question remains, how do I download a pptx while not screwing up the other doc types?

I'm thinking of maybe this? http://tutsnare.com/how-to-download-files-in-php/
Image3.jpg
Image4.jpg
0
 
gr8gonzoConsultantCommented:
What you're seeing is the contents of the Powerpoint file itself.

To see what I mean, open up a relatively recent version of Powerpoint, Word, or Excel, and save a file with the "x" format / extension (e.g. file.pptx, file.docx, or file.xlsx). Then close the program and rename that file to file.zip and open it in Windows Explorer. You'll see the same structure you're seeing in your results.

The reason is that the newer Office format documents are really just a ton of individual XML files that have been zipped up and then renamed with a new file extension. The file itself is still a normal ZIP file.

So really, the main issue seems to be that you're somehow picking the ".zip" extension for your Powerpoint/Office documents. Not sure exactly why that is, but my -guess- is that some part of the system is looking at the content to try and auto-determine the content type and setting the file extension according to what it THINKS the file is. If it looks at the file like that, then it'll always assume that newer Office documents are ZIP files.

Since the browser is really going to get the file extension from the server, that usually means the server is providing that file extension, so I would suggest logging/tracing the filename from start to finish to see how it's picking up the .zip extension.
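
The claim that an Office "x" file is a ZIP archive can be checked from PHP by looking at the first four bytes of the file. This is only a sketch; looksLikeZip() is a hypothetical name, and it tests for the usual ZIP local-file-header signature, so an empty archive (which starts with a different signature) would not match.

```php
<?php
// Hypothetical helper (not from the thread): reports whether a file starts
// with the ZIP local-file-header signature "PK\x03\x04".
function looksLikeZip(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: treat as "not a ZIP"
    }
    $signature = fread($handle, 4);
    fclose($handle);
    return $signature === "PK\x03\x04";
}
```

A .pptx, .docx, or .xlsx saved by Office should pass this check just like a real .zip, which is why anything that sniffs content instead of trusting the file name can mistake one for the other.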
1
 
Richard KortsAuthor Commented:
So what about using what they suggest in that link?
0
 
Julian HansenCommented:
So what about using what they suggest in that link?
Very similar to what you are doing already.

Any particular reason you are choosing not to ZIP if there is only one file - would it not be easier for the code to just zip regardless?

I am confused though - in the screen shot we are looking at a file called Fabricated.zip but according to the code you posted - only if there is more than 1 file are they zipped and then they get a name that ends in zip.zip (line 38)?
0
 
Richard KortsAuthor Commented:
Julian,

We decided to not zip if there is only one file because our user base is generally not expected to be real computer literate; some might not know what "zip" is and normally, they will only download one file per visit.

I just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip".

I think what gr8gonzo is saying about the Microsoft docs with extension ending in "x" forcing a zip may be correct.

On the link I was specifically thinking of this:

header('Content-Type: application/'.$extension);

where $extension is the file extension.

I am using:

header('Content-type: application/force-download');

which seems more generic. It may be that the pptx overrides this and makes a so-called zip file.
0
 
Richard KortsAuthor Commented:
I changed the bottom of my code to this:

	$extension = "zip";
} else {
	$path = $files[0];
	//echo "path on file " . $path;
	$extension = explode('.',$files[0]);
	$extension = $extension[count($extension)-1];
}	
// download
	header('Content-type: application/' . $extension); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);



It now recognizes it as ppt (see attached) but I cannot open it as pptx because if I save it, there is no file extension.

I tried it with pdf & xlsx, both worked fine.

There is something about pptx that it stumbles on.
Image5.jpg
0
 
Richard KortsAuthor Commented:
Hello,

I changed my code to this:

	if ($extension == "pptx") {
		$ctype = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
	}	
}	
// download
	if ($extension == "pptx") {
		header("Content-Type: ".$ctype);
	} else {	
		header('Content-type: application/' . $extension); 
	}	
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);



It works in Firefox; not sure about Chrome or Safari. I'll have someone else test.
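
If per-format content types are kept, a lookup table avoids a separate if-branch for each extension. This is a sketch only: the names DOWNLOAD_MIME_TYPES and contentTypeFor() are hypothetical, the table lists just the formats mentioned in this thread, and anything else falls back to the generic application/octet-stream.

```php
<?php
// Hypothetical lookup table (not from the original script):
// file extension => registered MIME type.
const DOWNLOAD_MIME_TYPES = [
    'pdf'  => 'application/pdf',
    'zip'  => 'application/zip',
    'docx' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    'xlsx' => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    'pptx' => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
];

// Returns the content type for a path, based only on its extension.
function contentTypeFor(string $path): string
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return DOWNLOAD_MIME_TYPES[$extension] ?? 'application/octet-stream';
}
```

With this, the download section could send header('Content-Type: ' . contentTypeFor($path)); for every file, with no special case for PowerPoint.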
0
 
gr8gonzoConsultantCommented:
So if you just want the file to download, you should be able to use the generic "application/octet-stream" content type. I would recommend you follow the generic sample from the PHP documentation:

http://php.net/manual/en/function.readfile.php

    header('Content-Description: File Transfer');
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="'.basename($file).'"');
    header('Expires: 0');
    header('Cache-Control: must-revalidate');
    header('Pragma: public');
    header('Content-Length: ' . filesize($file));
    readfile($file);
    exit;
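
One possible way to apply the sample above is to build the header lines in a helper first, so they can be logged or inspected before anything is sent. This is a sketch, not the PHP manual's code; downloadHeaders() is a hypothetical name.

```php
<?php
// Hypothetical helper: builds the header lines for a forced download
// without sending them, so they can be logged or inspected first.
function downloadHeaders(string $path): array
{
    return [
        'Content-Description: File Transfer',
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . basename($path) . '"',
        'Expires: 0',
        'Cache-Control: must-revalidate',
        'Pragma: public',
        'Content-Length: ' . filesize($path),
    ];
}

// Usage sketch:
//   foreach (downloadHeaders($path) as $line) { header($line); }
//   readfile($path);
//   exit;
```

Note the quotes around the file name in Content-Disposition, which the original script omits.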



I'd have to look at network / Fiddler captures to see exactly what was happening in regards to the PPTX, but what you've presented before looks like it -should- work, which makes me think there's something else going on before the download even begins. It's just really hard to say with the information we see at this point.

If you want to really track it down, my recommendation is to use Fiddler to capture the raw server responses and then add in file-based logging throughout the download script to capture the source data and what's happening to the different variables as the script progresses.
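
The file-based logging suggested above could look roughly like this. It is a minimal sketch: traceDownload() and the log file location are hypothetical, and it relies on PHP's error_log() with message type 3, which appends the message to the named file.

```php
<?php
// Hypothetical helper: appends one timestamped "label = value" line to a
// log file each time it is called.
function traceDownload(string $logFile, string $label, $value): void
{
    $line = date('Y-m-d H:i:s') . ' ' . $label . ' = '
          . var_export($value, true) . PHP_EOL;
    error_log($line, 3, $logFile); // message type 3 appends to the given file
}
```

Calling it after each assignment to $path, $zipname, and $files in the download script would show where a ".zip" name first appears.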
0
 
Julian HansenCommented:
just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip".
Still confused: in this image https://filedb.experts-exchange.com/incoming/2016/09_w38/1116463/Image3.jpg it is fabricated.zip, not fabricatedzip.zip. Was that generated with the same code?
0
 
Julian HansenCommented:
Here is a sample using Gr8gonzo's code from his post above.

Does this file download and open on your side?

Tested in FF and Chrome - seems to be fine.
0
 
Richard KortsAuthor Commented:
What is wrong with the code I posted?

It's now been tested in FF & Chrome; works in both.
0
 
gr8gonzoConsultantCommented:
It might work in both browsers but if you have to try and "bandaid" the problem for specific formats, then there's a higher likelihood that it will also fail on a different format or a different browser/platform. Bottom line, you shouldn't need to add specific code for specific formats, at least not for this scenario.

It's up to you whether you want to keep it that way, but bear in mind that you or someone else might have to come back to this code months or years later and wonder why there's a specific call-out for Powerpoint files. It's a good practice to try and keep code as clean and generic as possible.
1
Question has a verified solution.
