Solved

File Download in PHP fails for Office Documents on some servers...

Posted on 2016-11-09
Last Modified: 2016-11-10
Hello,

I am using the following PHP code to fetch a file by FTP and then deliver it to the client.  This works perfectly on one Windows Server 2012 R2 IIS build (PHP 5.4), but not on another server of the same specification.  I am using the same browser (a current version of Chrome) to view each server.

When served out from the VM that fails, PDFs do download and display correctly, but DOC, XLS and PPT files fail, as do DOCX, XLSX and PPTX, although the latter appear to be recoverable when opened in the relevant Microsoft Office application.

When I put up a simple TXT file, this works fine from the failing server, although if I uncomment the header('Content-Length: ' . filesize($file)); line, the file is truncated by the last three characters.  Now, it would be understandable if the DOCXs etc. were failing due to this, but when this header is not delivered, the TXT file serves in its entirety, whereas the DOCXs etc. still fail.

One possibility is that the issue lies within the IIS configuration, as this is one of the few things that differ between the builds.  I've even copied across the exact same version of PHP, in case this was causing the issue.

It is also worth noting that the issue does not lie with the FTP connection, as the file can be echoed to the screen without error (including when the Content-Length header is sent).  It only fails when I attempt to stream the contents with the header data as a download.

Thanks for reading.  If you've got any pointers, I'd be grateful to hear them.

$conn_id = ftp_connect($user_access_result['details']['server']);
$login_result = ftp_login($conn_id, $user_access_result['details']['username'], $user_access_result['details']['password']);

if (!ftp_get($conn_id, $tmp_path."\\".$tmp_filename, $_GET['path']."\\".$_GET['name'], FTP_BINARY)) {
    echo "There was a problem\n";
}

// close the connection
ftp_close($conn_id);

$file = $tmp_path."\\".$tmp_filename;
$file_contents = file_get_contents ($file);


header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="'.basename($_GET['name']).'"');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
//header('Content-Length: ' . filesize($file));

echo $file_contents;

unlink ($tmp_path."\\".$tmp_filename);

Question by:bottishamvc
15 Comments
 
LVL 56

Expert Comment

by:Julian Hansen
ID: 41881149
"...but DOC, XLS and PPT files fail, as do DOCX, XLSX and PPTX, although the latter appear to be recoverable when opened in the relevant Microsoft Office application."
Does fail mean corrupted / short length?

I take "fail" to mean that it does not work at all, but reading between the lines it seems the files do download and are just incomplete, or do not open by default, or both?

Have you tried using readfile() instead of file_get_contents() / echo?
 
LVL 1

Author Comment

by:bottishamvc
ID: 41881155
Indeed, the document fails to load in the Office Application, suggesting it is corrupt, or incomplete.

I have tried the readfile() function, but the same happens, on this server.  As mentioned, it works on the other of the same build!
 
LVL 56

Expert Comment

by:Julian Hansen
ID: 41881173
Is it always 3 bytes short or does it vary?
 
LVL 1

Author Comment

by:bottishamvc
ID: 41881177
Always three bytes with the single text file I've been testing with.
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 250 total points
ID: 41881185
Any chance that the first three bytes of the file are a Byte-Order Mark?  Can you please post the first dozen or so bytes of the file you're testing?  Use "view source" to get them.  Thanks.
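
One way to get those leading bytes without depending on how the browser renders them is to dump them as hex from PHP. This is only a minimal sketch: the helper name first_bytes_hex and the throwaway temp file are illustrative and not part of the original post.

```php
<?php
// Hypothetical helper: return the first $count bytes of a file as
// space-separated uppercase hex pairs.
function first_bytes_hex($path, $count = 16)
{
    $handle = fopen($path, 'rb');          // binary-safe read
    if ($handle === false) {
        throw new RuntimeException("Unable to open $path");
    }
    $bytes = (string) fread($handle, $count);
    fclose($handle);

    if ($bytes === '') {
        return '';
    }
    // bin2hex() gives one long string; split it into byte pairs.
    return strtoupper(implode(' ', str_split(bin2hex($bytes), 2)));
}

// Demo with a throwaway file that starts with a UTF-8 BOM.
$demo = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($demo, "\xEF\xBB\xBFTesting12345");
echo first_bytes_hex($demo, 6), "\n";      // prints "EF BB BF 54 65 73"
unlink($demo);
```

Running the same helper against both the file on the server and the downloaded copy makes any extra leading bytes easy to spot.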
 
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 41881197
You must use a binary transfer, not an ASCII transfer, for everything except text files.
 
LVL 56

Expert Comment

by:Julian Hansen
ID: 41881198
Last 3 bytes?
What about the DOCX - also 3 bytes?
 
LVL 1

Author Comment

by:bottishamvc
ID: 41881268
It's the last three bytes that are being removed when the Content-Length header is used.  The text file I've been testing with simply contains the following:

Testing12345

This gets truncated to:

Testing12


The Excel document I've been trying with is attached, both the version from the server and the version that results once it's downloaded and doesn't open properly in the application.
test-excel---on-server.xlsx
test-excel---downloaded.xlsx
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41881320
Not seeing a Byte-Order mark.  The file appears to be binary, not subject to rules of any character encoding.

Have you tried using strlen($file_contents) instead of filesize($file) ?
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41881327
This worked correctly in Chrome.  It downloaded the file, and I could start Excel to use the spreadsheet.  However, Chrome did not launch Excel automatically.
<?php // demo/temp_bottishav.php
/**
 * https://www.experts-exchange.com/questions/28982110/File-Download-in-PHP-fails-for-Office-Documents-on-some-servers.html#a41881268
 */
error_reporting(E_ALL);

// FUNCTION TO FORCE A DOWNLOAD FROM A FILE
function force_download($url, $filename=NULL)
{
    // GET THE DOWNLOAD FILE NAME
    if (empty($filename)) $filename = basename($url);

    // GET LENGTH AND FILE RESOURCE POINTER
    $hdr = get_headers($url, TRUE);
    $len = trim($hdr['Content-Length']);
    $fpr = fopen($url,'rb');

    // ON SUCCESS
    if ($fpr)
    {
        // THESE HEADERS ARE USED ON ALL BROWSERS
        header("Content-Type: application-x/force-download");
        header("Content-Disposition: attachment; filename=$filename");
        header("Content-length: $len");
        header("Expires: ".gmdate("D, d M Y H:i:s", mktime(date("H")+2, date("i"), date("s"), date("m"), date("d"), date("Y")))." GMT");
        header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");

        // THIS HEADER MUST BE OMITTED FOR IE 6+
        if (FALSE === strpos($_SERVER["HTTP_USER_AGENT"], 'MSIE '))
        {
            header("Cache-Control: no-cache, must-revalidate");
        }

        // THIS IS THE LAST HEADER
        header("Pragma: no-cache");

        // FLUSH THE HEADERS TO THE BROWSER
        flush();

        // WRITE THE FILE
        fpassthru($fpr);
    }

    // ERROR
    else
    {
        trigger_error("Unable to open $url", E_USER_ERROR);
    }
}

// THE TEST FILE
$url = 'https://filedb.experts-exchange.com/incoming/2016/11_w46/1126866/test-excel---on-server.xlsx';
force_download($url);

 
LVL 1

Author Comment

by:bottishamvc
ID: 41881360
Thank you, everyone, for your contributions.  The above script produces the same outcome, so I'm now fairly convinced it must be something to do with the IIS configuration, or otherwise something outside of the scripting.  As originally stated, my script does work on a different server with the same web server, version of PHP and Chrome client.  Very frustrating.

Any IIS-related ideas?
 
LVL 56

Assisted Solution

by:Julian Hansen
Julian Hansen earned 250 total points
ID: 41881668
"Not seeing a Byte-Order mark."
It is there, Ray - I think your guess is correct here.

First 16 bytes of test-excel---on-server.xlsx
50 4B 03 04 14 00 06 00 08 00 00 00 21 00 A4 53


The first 16 bytes of test-excel---downloaded.xlsx is as follows
EF BB BF 50 4B 03 04 14 00 06 00 08 00 00 00 21


The three extra bytes at the start of the downloaded file, EF BB BF, are the UTF-8 Byte-Order Mark.
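
A three-byte prefix like this would also account for the text-file symptom reported earlier. If three stray bytes go out ahead of the file while Content-Length still states the file's own size, a client that stops reading after Content-Length bytes loses the last three bytes of the file. The sketch below only simulates that arithmetic (no HTTP is involved), and simulate_client_read is an illustrative name.

```php
<?php
// Simulate a client that reads exactly $contentLength bytes of a response
// body in which $strayPrefix was emitted ahead of the real file contents.
function simulate_client_read($strayPrefix, $fileContents, $contentLength)
{
    $body = $strayPrefix . $fileContents;
    return substr($body, 0, $contentLength);
}

$bom      = "\xEF\xBB\xBF";          // UTF-8 Byte-Order Mark (3 bytes)
$contents = "Testing12345";          // the 12-byte test file from the thread

// Content-Length is the size of the file alone, as filesize() would report.
$received = simulate_client_read($bom, $contents, strlen($contents));

// Skip the 3 BOM bytes to see what a text viewer would display.
echo substr($received, 3), "\n";     // prints "Testing12"
```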
 
LVL 56

Assisted Solution

by:Julian Hansen
Julian Hansen earned 250 total points
ID: 41881697
Suggestion - open your script file with a hex editor and make sure there is not a BOM at the front of it.
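
As an alternative to a hex editor, the check can be scripted. The sketch below assumes it is run from the affected page (or any script that pulls in the same includes) and relies on get_included_files() to list the candidates; files_with_bom is an illustrative name.

```php
<?php
// Return the subset of $paths whose first three bytes are the UTF-8 BOM.
function files_with_bom(array $paths)
{
    $found = array();
    foreach ($paths as $path) {
        // Read only the first three bytes; unreadable paths yield false.
        $head = @file_get_contents($path, false, null, 0, 3);
        if ($head === "\xEF\xBB\xBF") {
            $found[] = $path;
        }
    }
    return $found;
}

// Check the current script and everything it has included so far.
// error_log() is used so the check itself adds nothing to the response.
foreach (files_with_bom(get_included_files()) as $path) {
    error_log("UTF-8 BOM found at start of: $path");
}
```

Any file reported this way can then be re-saved as "UTF-8 without BOM" in the editor that produced it.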
 
LVL 1

Author Comment

by:bottishamvc
ID: 41881838
It turns out that an include further up the page was introducing a Byte Order Mark, as suggested.  This has now been dealt with and the script works fine.  Thanks for the help - much appreciated.
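
For anyone who hits the same symptom, a defensive check just before the download headers are sent may help catch a recurrence. This is a sketch rather than the fix applied in this thread: it assumes output buffering is active (otherwise the stray bytes would already have left PHP), and discard_buffered_output is an illustrative name.

```php
<?php
// Discard anything sitting in PHP's output buffers and return it, so that
// stray bytes (such as a BOM from an include) are not sent ahead of the
// file. It cannot recall output that has already been sent to the client.
function discard_buffered_output()
{
    $discarded = '';
    while (ob_get_level() > 0) {
        $chunk = ob_get_clean();
        if ($chunk === false) {
            break;                         // buffer could not be removed
        }
        // Outer buffers hold earlier output, so prepend each one.
        $discarded = $chunk . $discarded;
    }
    return $discarded;
}

// Demo: something emits a BOM while buffering is on.
ob_start();
echo "\xEF\xBB\xBF";                       // stands in for the bad include
$stray = discard_buffered_output();

if (headers_sent() || $stray !== '') {
    error_log('Unexpected output before download: ' . bin2hex($stray));
}
// ...the header() calls and readfile($file) would follow here...
```

Logging the discarded bytes instead of dropping them silently keeps the underlying problem visible, so the offending file can still be found and re-saved.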
 
LVL 56

Expert Comment

by:Julian Hansen
ID: 41881902
You are welcome.
