Solved

Download Problem

Posted on 2016-09-13
Last Modified: 2016-09-15
This PHP program is used to download files from a web server.

It works perfectly unless the file's extension is pptx (Microsoft PowerPoint).

<?php
date_default_timezone_set("America/Los_Angeles");
// download_docs.php
require( 'wp-load.php' );
global $wpdb;
$nf = 0;
$files = array();
$did = array();
$title = array();
foreach( $_POST as $var => $val ) {
	if (substr($var,0,2) == "ck") {
		if ($val == "on") {
			$ln = strlen($var);
			$docid = intval(substr($var,2,$ln-2));
			$qry = "SELECT * from documents where docid = " . $docid;
			$doc = $wpdb->get_row($qry, ARRAY_A);
			$files[$nf] = "docs/" . $doc['file'];
			$did[$nf] = $doc['docid'];
			$title[$nf] = $doc['title'];
			$nf++;
		}
	}
}	
echo "nf = " . $nf . "<br>";
session_start();
// insert into log
$tdate = date('Y-m-d H:i:s');
for ($i = 0; $i < $nf; $i++) {
	$qryi = "INSERT into dnld_docs VALUES('" . $_SESSION['uid'] . "', '" . $tdate . "', " . $did[$i] . ", '" . $title[$i] . "')";
	$rs = $wpdb->get_results($qryi);
}	
// zip em up
if ($nf > 1) {
	// get the part of the user email BEFORE the @
	$emp = explode("@", $_SESSION['uid']);
	$id = $emp[0];
	$zip = new ZipArchive();
	$zipname = $id . "zip.zip";
	$path = "zips/" . $zipname;
	//echo "path = " . $path . "<br>";
	$zip->open($path, ZipArchive::CREATE);
	$dirname = $id . "files";
	if (file_exists($dirname)) {
		array_map('unlink', glob("$dirname/*.*"));
		rmdir($dirname);
	}
	mkdir($dirname);	
	for ($j = 0; $j < $nf; $j++) {
		// get filename AFTER "/"
		$ln = strlen($files[$j]);
		$slash = strpos($files[$j], "/");
		$thefile = substr($files[$j], $slash+1, $ln-1);
		$indir = $dirname . "/" . $thefile;
		copy($files[$j], $indir);
	}
	$dfiles = scandir($dirname);
	foreach ($dfiles as $file) {		
		$filename = $dirname . "/" . $file;
		//echo "file = " . $file . "<br>";
		if ($file == "." || $file == ".." ) continue;
		if ( !is_file($filename) )        continue;
    // ACTIVATE THIS FOR A PROGRESS REPORT: echo PHP_EOL . "$filename: $file";
		$zip->addFile($filename, $file 	);
	}	
	$zip->close();
} else {
	$path = $files[0];
	//echo "path on file " . $path;
}	
// download
	header('Content-type: application/force-download'); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);

?>


Note that if there is a SINGLE file being downloaded, it just downloads that file. If there are two or more, it creates a zip file.

As it functions now, it creates a meaningless zip file from a single pptx file. The zip contains other stuff, NOT the pptx file.

Does anyone know of issues with downloading pptx files and how to get around them?

Thanks
Question by:Richard Korts
17 Comments
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 41796564
What happens when it is a PowerPoint file
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 41796600
A .pptx file IS a ZIP file. All of the newer Office formats that end in "x" are actually ZIP files. Just rename it back to .pptx and open it with PowerPoint.
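
One way to check this claim on the server is to look at the first four bytes of the file. The sketch below is only an illustration: `looksLikeZip` is a hypothetical helper name, not part of the original script, and it only tests for the ZIP local-file-header signature that a non-empty archive starts with.

```php
<?php
// Hypothetical helper: returns true when the file starts with the ZIP
// local-file-header signature "PK\x03\x04". Files saved as .pptx, .docx
// and .xlsx begin with the same bytes as an ordinary .zip archive.
function looksLikeZip(string $path): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    $signature = fread($handle, 4); // first four bytes of the file
    fclose($handle);
    return $signature === "PK\x03\x04";
}
```

Calling `looksLikeZip()` on a .pptx and on a .pdf from the docs/ folder should show the difference: any tool that guesses the type from content alone has no way to tell a PowerPoint file from a plain ZIP by these bytes.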
 

Author Comment

by:Richard Korts
ID: 41796670
Julian,

Now it does the attached (image1 & image2).

This is BRAND new (the 2nd part). I selected "Open with" and Windows Explorer in image1.

Before it created a zip file I could download that had meaningless content.

Now it cannot download a pdf that it could before. On the pdf, I manually downloaded it with ftp, it was fine, opened it in Acrobat Reader just fine.

There HAS to be something wrong with this:
header('Content-type: application/force-download'); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);


But I've used this without issue in other places.
 

Author Comment

by:Richard Korts
ID: 41796678
Julian, I forgot the images
Image1.jpg
Image2.jpg
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 41796710
You should also remove the echo on line 24:
echo "nf = " . $nf . "<br>";

Not only is it in the wrong position, where it can cause warnings/errors, but if you're trying to stream a file, it's going to get added to the file contents for download.
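
The effect can be reproduced without a browser by capturing the output in a buffer. This is a rough sketch, not the original script: `captureResponseBody` is a hypothetical name, and the debug string simply imitates the `echo "nf = ..."` line.

```php
<?php
// Hypothetical demonstration: collect everything the script would send as
// the response body, with or without a stray debug echo before readfile().
function captureResponseBody(string $path, bool $withDebugEcho): string
{
    ob_start();
    if ($withDebugEcho) {
        echo "nf = 1<br>";   // imitates the debug line in the download script
    }
    readfile($path);          // streams the file into the same output
    return ob_get_clean();    // body exactly as the client would receive it
}
```

With the echo enabled, the returned body starts with the debug text and is longer than `filesize($path)`, so the Content-length header no longer matches and the saved file is corrupted.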
 

Author Comment

by:Richard Korts
ID: 41796757
To All,

What gr8gonzo said is DEFINITELY true. I took out the echo; now I can download PDFs and Excel files. There is also a Word file I have not tried yet.

But the original problem remains for pptx.

It tries to download it as a zip, which would be fine if the pptx was in the zip.

See attached images. In expanding the zip, this is what I get, which is of course unacceptable to the customer.

So the original question remains, how do I download a pptx while not screwing up the other doc types?

I'm thinking of maybe this? http://tutsnare.com/how-to-download-files-in-php/
Image3.jpg
Image4.jpg
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 41796783
What you're seeing is the contents of the Powerpoint file itself.

To see what I mean, open up a relatively recent version of PowerPoint, Word, or Excel, and save a file with the "x" format / extension (e.g. file.pptx, file.docx, or file.xlsx). Then close the program, rename that file to file.zip, and open it in Windows Explorer. You'll see the same structure you're seeing in your results.

The reason is that the newer Office format documents are really just a ton of individual XML files that have been zipped up and then renamed with a new file extension. The file itself is still a normal ZIP file.

So really, the main issue seems to be that you're somehow picking up the ".zip" extension for your PowerPoint/Office documents. Not sure exactly why that is, but my -guess- is that some part of the system is looking at the content to try to auto-determine the content type and setting the file extension according to what it THINKS the file is. If it looks at the file like that, then it'll always assume that newer Office documents are ZIP files.

Since the browser is really going to get the file extension from the server, that usually means the server is providing that file extension, so I would suggest logging/tracing the filename from start to finish to see how it's picking up the .zip extension.
 

Author Comment

by:Richard Korts
ID: 41796833
So what about using what they suggest in that link?
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 41796882
You asked: "So what about using what they suggest in that link?"
Very similar to what you are doing already.

Any particular reason you are choosing not to ZIP if there is only one file - would it not be easier for the code to just zip regardless?

I am confused though - in the screen shot we are looking at a file called Fabricated.zip but according to the code you posted - only if there is more than 1 file are they zipped and then they get a name that ends in zip.zip (line 38)?
 

Author Comment

by:Richard Korts
ID: 41796926
Julian,

We decided to not zip if there is only one file because our user base is generally not expected to be real computer literate; some might not know what "zip" is and normally, they will only download one file per visit.

I just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip".

I think what gr8gonzo is saying about the Microsoft docs with extension ending in "x" forcing a zip may be correct.

On the link I was specifically thinking of this:

header('Content-Type: application/'.$extension);

where $extension is the file extension.

I am using:

header('Content-type: application/force-download');

which seems more generic. It may be that the pptx overrides this and makes a so-called zip file.
 

Author Comment

by:Richard Korts
ID: 41796966
I changed the bottom of my code to this:

	$extension = "zip";
} else {
	$path = $files[0];
	//echo "path on file " . $path;
	$extension = explode('.',$files[0]);
	$extension = $extension[count($extension)-1];
}	
// download
	header('Content-type: application/' . $extension); 
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);


It now recognizes it as ppt (see attached) but I cannot open it as pptx because if I save it, there is no file extension.

I tried it with pdf & xlsx, both worked fine.

There is something about pptx that it stumbles on.
Image5.jpg
 

Author Comment

by:Richard Korts
ID: 41796985
Hello,

I changed my code to this:

	if ($extension == "pptx") {
		$ctype = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
	}	
}	
// download
	if ($extension == "pptx") {
		header("Content-Type: ".$ctype);
	} else {	
		header('Content-type: application/' . $extension); 
	}	
    header('Content-Transfer-Encoding: Binary'); 
    header('Content-length: ' . filesize($path)); 
    header('Content-disposition: attachment; filename=' . basename($path)); 
    readfile($path);


It works in Firefox; not sure about Chrome or Safari. I'll have someone else test.
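
A per-format branch like this can also be written as a lookup table, which avoids sending values such as `application/xlsx` that are not registered MIME types. The following is a minimal sketch under assumptions: `contentTypeFor` is a hypothetical name, the table covers only a few common formats, and anything unknown falls back to `application/octet-stream`.

```php
<?php
// Hypothetical helper: map a file extension to its registered MIME type,
// falling back to a generic binary type for anything not in the table.
function contentTypeFor(string $path): string
{
    $types = [
        'pdf'  => 'application/pdf',
        'zip'  => 'application/zip',
        'docx' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        'xlsx' => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        'pptx' => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
    ];
    // pathinfo() returns '' when the name has no extension
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $types[$extension] ?? 'application/octet-stream';
}
```

In the download section this would replace the if/else with a single call such as `header('Content-Type: ' . contentTypeFor($path));`.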
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 41797267
So if you just want the file to download, you should be able to use the generic "application/octet-stream" content type. I would recommend you follow the generic sample from the PHP documentation:

http://php.net/manual/en/function.readfile.php

    header('Content-Description: File Transfer');
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="'.basename($file).'"');
    header('Expires: 0');
    header('Cache-Control: must-revalidate');
    header('Pragma: public');
    header('Content-Length: ' . filesize($file));
    readfile($file);
    exit;


I'd have to look at network / Fiddler captures to see exactly what was happening in regards to the PPTX, but what you've presented before looks like it -should- work, which makes me think there's something else going on before the download even begins. It's just really hard to say with the information we see at this point.

If you want to really track it down, my recommendation is to use Fiddler to capture the raw server responses and then add in file-based logging throughout the download script to capture the source data and what's happening to the different variables as the script progresses.
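
File-based logging of this kind could look roughly like the sketch below. The function name `logStep` and the log file location are assumptions made for illustration; writing to a file rather than echoing keeps the debug text out of the download stream.

```php
<?php
// Hypothetical helper: append one timestamped "label = value" line to a
// log file. Nothing is echoed, so the response body stays untouched.
function logStep(string $logFile, string $label, $value): void
{
    $line = sprintf(
        "[%s] %s = %s\n",
        date('Y-m-d H:i:s'),
        $label,
        var_export($value, true)   // readable form of strings, ints, arrays
    );
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}
```

Calls such as `logStep('download.log', 'path', $path);` just before the headers are sent, and again with `basename($path)` and `filesize($path)`, would show where the filename or extension changes.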
 
LVL 51

Expert Comment

by:Julian Hansen
ID: 41797326
You wrote: I just arbitrarily picked a name for the zip file from the first word of the document "title" plus the characters. So the zip file is like "Fabricatedzip.zip".
I'm still confused: in this image https://filedb.experts-exchange.com/incoming/2016/09_w38/1116463/Image3.jpg it is fabricated.zip, not fabricatedzip.zip. Was that generated with the same code?
 
LVL 51

Assisted Solution

by:Julian Hansen
Julian Hansen earned 100 total points
ID: 41797561
Here is a sample using Gr8gonzo's code from his post above.

Does this file download and open on your side?

Tested in FF and Chrome - seems to be fine.
 

Author Comment

by:Richard Korts
ID: 41798889
What is wrong with the code I posted?

It's now been tested in FF & Chrome; works in both.
 
LVL 34

Accepted Solution

by:
gr8gonzo earned 400 total points
ID: 41798920
It might work in both browsers but if you have to try and "bandaid" the problem for specific formats, then there's a higher likelihood that it will also fail on a different format or a different browser/platform. Bottom line, you shouldn't need to add specific code for specific formats, at least not for this scenario.

It's up to you whether you want to keep it that way, but bear in mind that you or someone else might have to come back to this code months or years later and wonder why there's a specific call-out for Powerpoint files. It's a good practice to try and keep code as clean and generic as possible.
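
As an illustration of the generic approach, the headers can be built in one place with no per-format branches. This is a sketch, assuming a hypothetical function name `buildDownloadHeaders`; it returns the header lines instead of sending them so the result can be inspected or logged first.

```php
<?php
// Hypothetical helper: build format-agnostic download headers for a file.
// The same three lines are produced for pdf, xlsx, pptx or zip.
function buildDownloadHeaders(string $path): array
{
    // Strip characters that would break the quoted filename parameter
    $name = str_replace(['"', "\r", "\n"], '', basename($path));
    return [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . $name . '"',
        'Content-Length: ' . filesize($path),
    ];
}
```

In the script, each returned line would be passed to `header()` in a loop, followed by `readfile($path)` and `exit`, with no output sent beforehand.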