Richard Korts

Download Problem

This PHP program is used to download files from a web server.

It works perfectly unless the file's extension is pptx (Microsoft PowerPoint).

<?php
date_default_timezone_set("America/Los_Angeles");
// download_docs.php
require( 'wp-load.php' );
global $wpdb;
$nf = 0;
$files = array();
$did = array();
$title = array();
foreach( $_POST as $var => $val ) {
	if (substr($var,0,2) == "ck") {
		if ($val == "on") {
			$ln = strlen($var);
			$docid = intval(substr($var,2,$ln-2));
			$qry = "SELECT * from documents where docid = " . $docid;
			$doc = $wpdb->get_row($qry, ARRAY_A);
			$files[$nf] = "docs/" . $doc['file'];
			$did[$nf] = $doc['docid'];
			$title[$nf] = $doc['title'];
			$nf++;
		}
	}
}	
echo "nf = " . $nf . "<br>";
session_start();
// insert into log
$tdate = date('Y-m-d H:i:s');
for ($i = 0; $i < $nf; $i++) {
	$qryi = "INSERT into dnld_docs VALUES('" . $_SESSION['uid'] . "', '" . $tdate . "', " . $did[$i] . ", '" . $title[$i] . "')";
	$rs = $wpdb->get_results($qryi);
}	
// zip em up
if ($nf > 1) {
	// get the part of the user email BEFORE the @
	$emp = explode("@", $_SESSION['uid']);
	$id = $emp[0];
	$zip = new ZipArchive();
	$zipname = $id . "zip.zip";
	$path = "zips/" . $zipname;
	//echo "path = " . $path . "<br>";
	$zip->open($path, ZipArchive::CREATE);
	$dirname = $id . "files";
	if (file_exists($dirname)) {
		array_map('unlink', glob("$dirname/*.*"));
		rmdir($dirname);
	}
	mkdir($dirname);	
	for ($j = 0; $j < $nf; $j++) {
		// get filename AFTER "/"
		$ln = strlen($files[$j]);
		$slash = strpos($files[$j], "/");
		$thefile = substr($files[$j], $slash+1, $ln-1);
		$indir = $dirname . "/" . $thefile;
		copy($files[$j], $indir);
	}
	$dfiles = scandir($dirname);
	foreach ($dfiles as $file) {		
		$filename = $dirname . "/" . $file;
		//echo "file = " . $file . "<br>";
		if ($file == "." || $file == ".." ) continue;
		if ( !is_file($filename) )        continue;
    // ACTIVATE THIS FOR A PROGRESS REPORT: echo PHP_EOL . "$filename: $file";
		$zip->addFile($filename, $file 	);
	}	
	$zip->close();
} else {
	$path = $files[0];
	//echo "path on file " . $path;
}	
// download
	header('Content-type: application/force-download'); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);

?>


Note that if there is a SINGLE file being downloaded, it just downloads that file. If there are two or more, it creates a zip file.

As it functions now, it creates a meaningless zip file from a single pptx file. The zip contains other stuff, NOT the pptx file.

Does anyone know of issues with downloading pptx files & how to get around it?

Thanks
Julian Hansen

What happens when it is a PowerPoint file?
A .pptx file IS a ZIP file. All of the newer Office formats that end in "x" are actually ZIP files. Just rename it back to .pptx and open it with PowerPoint.
Richard Korts

ASKER

Julian,

Now it does the attached (image1 & image2).

This behavior is BRAND new (the 2nd part). I selected "Open with" and Windows Explorer in image1.

Before, it created a zip file that I could download, but it had meaningless content.

Now it cannot download a PDF that it could before. I manually downloaded that PDF with FTP, and it opened in Acrobat Reader just fine.

There HAS to be something wrong with this:
    header('Content-type: application/force-download');
    header('Content-Transfer-Encoding: Binary');
    header('Content-length: ' . filesize($path));
    header('Content-disposition: attachment; filename=' . basename($path));
    readfile($path);


But I've used this without issue in other places.
Julian, I forgot the images.
Image1.jpg
Image2.jpg
You should also remove the echo on line 24:
echo "nf = " . $nf . "<br>";

It is in the wrong position: it produces output before session_start() and the header() calls, which can cause warnings/errors. And if you're trying to stream a file, that output is going to get added to the file contents for download.
To All,

What gr8gonzo said is DEFINITELY true. I took out the echo, and now I can download PDF and Excel files; there is also a Word file I have not tried yet.

But the original problem remains for pptx.

It tries to download it as a zip, which would be fine if the pptx was in the zip.

See attached images. In expanding the zip, this is what I get, which is of course unacceptable to the customer.

So the original question remains, how do I download a pptx while not screwing up the other doc types?

I'm thinking of maybe this? http://tutsnare.com/how-to-download-files-in-php/
Image3.jpg
Image4.jpg
What you're seeing is the contents of the PowerPoint file itself.

To see what I mean, open up a relatively recent version of PowerPoint, Word, or Excel, and save a file with the "x" format / extension (e.g. file.pptx, file.docx, or file.xlsx). Then close the program, rename that file to file.zip, and open it in Windows Explorer. You'll see the same structure you're seeing in your results.

The reason is that the newer Office format documents are really just a ton of individual XML files that have been zipped up and then renamed with a new file extension. The file itself is still a normal ZIP file.

So really, the main issue seems to be that you're somehow picking up the ".zip" extension for your PowerPoint/Office documents. Not sure exactly why that is, but my -guess- is that some part of the system is looking at the content to try to auto-determine the content type and setting the file extension according to what it THINKS the file is. If it looks at the file like that, then it'll always assume that newer Office documents are ZIP files.

Since the browser is really going to get the file extension from the server, that usually means the server is providing that file extension, so I would suggest logging/tracing the filename from start to finish to see how it's picking up the .zip extension.
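To illustrate why anything that sniffs content would label a .pptx as a ZIP, here is a minimal sketch that checks whether a file begins with the ZIP signature bytes "PK\x03\x04". The function name startsWithZipSignature is made up for this example and is not part of the script in the question.

```php
<?php
// Hypothetical helper (not from the original script): reports whether a file
// begins with the 4-byte ZIP local-file-header signature "PK\x03\x04".
function startsWithZipSignature(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // missing or unreadable file
    }
    $magic = fread($handle, 4);
    fclose($handle);

    return $magic === "PK\x03\x04";
}
```

Calling it on a .pptx, .docx, or .xlsx saved by a recent Office version should return true, just as it would for an ordinary .zip archive, while a PDF (which starts with "%PDF") returns false.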
So what about using what they suggest in that link?
Very similar to what you are doing already.

Any particular reason you are choosing not to ZIP if there is only one file - would it not be easier for the code to just zip regardless?

I am confused though: in the screenshot we are looking at a file called Fabricated.zip, but according to the code you posted, files are zipped only if there is more than one, and then the archive gets a name that ends in zip.zip (line 38).
Julian,

We decided to not zip if there is only one file because our user base is generally not expected to be real computer literate; some might not know what "zip" is and normally, they will only download one file per visit.

I just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip".

I think what gr8gonzo is saying about the Microsoft docs with extension ending in "x" forcing a zip may be correct.

On the link I was specifically thinking of this:

header('Content-Type: application/'.$extension);

where $extension is the file extension.

I am using:

header('Content-type: application/force-download');

which seems more generic; it may be that the pptx overrides this and produces a so-called zip file.
I changed the bottom of my code to this:

	$extension = "zip";
} else {
	$path = $files[0];
	//echo "path on file " . $path;
	$extension = explode('.',$files[0]);
	$extension = $extension[count($extension)-1];
}	
// download
	header('Content-type: application/' . $extension); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);


It now recognizes it as ppt (see attached) but I cannot open it as pptx because if I save it, there is no file extension.

I tried it with pdf & xlsx, both worked fine.

There is something about pptx that it stumbles on.
Image5.jpg
Hello,

I changed my code to this:

	if ($extension == "pptx") {
		$ctype = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
	}	
}	
// download
	if ($extension == "pptx") {
		header("Content-Type: ".$ctype);
	} else {	
		header('Content-type: application/' . $extension); 
	}	
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);


It works in Firefox; not sure about Chrome or Safari. I'll have someone else test.
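The same idea can be extended to the other Office formats with a lookup table instead of a special case for pptx. This is only a sketch: the constant DOWNLOAD_TYPES and the function contentTypeFor are names invented for this example, and unknown extensions fall back to the generic application/octet-stream.

```php
<?php
// Hypothetical lookup table (not from the original script); extend it with
// whatever file types the site actually serves.
const DOWNLOAD_TYPES = [
    'pdf'  => 'application/pdf',
    'zip'  => 'application/zip',
    'docx' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    'xlsx' => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    'pptx' => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
];

// Returns the MIME type for a path based on its extension (case-insensitive),
// or the generic binary type when the extension is missing or unknown.
function contentTypeFor(string $path): string
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return DOWNLOAD_TYPES[$extension] ?? 'application/octet-stream';
}
```

In the download section this would replace the if/else with a single call such as header('Content-Type: ' . contentTypeFor($path));.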
So if you just want the file to download, you should be able to use the generic "application/octet-stream" content type. I would recommend you follow the generic sample from the PHP documentation:

http://php.net/manual/en/function.readfile.php

    header('Content-Description: File Transfer');
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="'.basename($file).'"');
    header('Expires: 0');
    header('Cache-Control: must-revalidate');
    header('Pragma: public');
    header('Content-Length: ' . filesize($file));
    readfile($file);
    exit;


I'd have to look at network / Fiddler captures to see exactly what was happening in regards to the PPTX, but what you've presented before looks like it -should- work, which makes me think there's something else going on before the download even begins. It's just really hard to say with the information we see at this point.

If you want to really track it down, my recommendation is to use Fiddler to capture the raw server responses and then add in file-based logging throughout the download script to capture the source data and what's happening to the different variables as the script progresses.
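For the file-based logging part, something along these lines may be enough. It is a sketch only: traceDownload and the log file name are hypothetical, and the log should live somewhere the web server can write to but visitors cannot read.

```php
<?php
// Hypothetical tracing helper (not from the original script): appends one
// timestamped "label = value" line to a log file on each call.
function traceDownload(string $logFile, string $label, $value): void
{
    $line = sprintf(
        "[%s] %s = %s\n",
        date('Y-m-d H:i:s'),
        $label,
        var_export($value, true) // shows strings quoted, so stray spaces are visible
    );
    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes.
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}
```

Calls such as traceDownload('download_trace.log', 'path', $path); placed after $path is set and again just before the header() lines would show whether the name already lacks its extension on the server side. Unlike echo, this writes nothing to the response, so it does not corrupt the download.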
"I just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip"."
Still confused: in this image https://filedb.experts-exchange.com/incoming/2016/09_w38/1116463/Image3.jpg it is fabricated.zip, not fabricatedzip.zip. Was that generated with the same code?
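One possible explanation for a name like fabricated.zip, offered as an assumption rather than a confirmed diagnosis: the posted script sends filename= without quotes, so if the stored file name contains a space, a browser may keep only the part before the space, lose the .pptx extension, and then pick an extension from the content. A minimal sketch of a quoted header follows; attachmentHeader is a name invented for this example, and "Fabricated Steel.pptx" below is a made-up file name.

```php
<?php
// Hypothetical helper (not from the original script): builds a
// Content-Disposition header whose filename is wrapped in double quotes.
function attachmentHeader(string $path): string
{
    // Remove characters that would end the quoted string or the header line.
    $name = str_replace(['"', '\\', "\r", "\n"], '', basename($path));

    return 'Content-Disposition: attachment; filename="' . $name . '"';
}
```

With this, header(attachmentHeader('docs/Fabricated Steel.pptx')); sends the whole name, space included. Names with non-ASCII characters need more care than this sketch provides.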
SOLUTION
Julian Hansen
This solution is only available to members.
What is wrong with the code I posted?

It's now been tested in FF & Chrome; works in both.
ASKER CERTIFIED SOLUTION
This solution is only available to members.