mod_deflate and pre-compressed HTML files

Posted on 2008-06-16
Last Modified: 2013-11-05
We are using Apache mod_deflate to dynamically compress output for clients that can handle it. It works well; however, most of our large archives are plain text files, and we are running out of disk space.

While keeping the same file names and extensions (i.e., .html), we would like to serve both pre-compressed and uncompressed HTML pages (different pages, NOT the same page twice) at the same time, with the web server able to a) determine whether the file is compressed and b) determine whether the client can accept compressed files, and then adjust as necessary (uncompressing for clients that can't accept compressed files, compressing uncompressed files for clients that can, and so on).

Ideally, Apache would serve:
 /index.html -> not compressed on disk. mod_deflate compresses it for clients that can accept compression and sends it plain to clients that can't.

and also serve

 /archiveda.html -> compressed on disk. mod_deflate skips compression and serves the pre-compressed version to clients that can handle it, OR uncompresses it for clients that cannot.

The pre-compressed files can be intermixed with uncompressed files, so directory-based solutions won't work. Also, we cannot change the names of any files.

The older mod_gunzip would seem like the ideal solution, but it is only for Apache 1.1. According to the mod_deflate docs, it is supposed to skip compressing already-compressed files. We tried pre-compressing using gzip, compress, and pack, with no luck: the client gets gibberish ASCII text. I am also not sure that, even if we get it to skip compressing pre-compressed files, it would automatically decompress those files for clients that can't accept compressed documents.

Below is our current mod_deflate configuration in httpd.conf.
<Location />
# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html

# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
</Location>


Question by: surfsideinternet

Accepted Solution

caterham_www earned 500 total points
ID: 21947282
> We tried pre-compressing using gzip, compress, and pack, with no luck: the client gets gibberish ASCII text.

Wrong or missing encoding headers? See the AddEncoding directive. However, mixing encodings under one file extension (.html sometimes encoded, sometimes not) won't work, I think, unless the Content-Encoding header is set accordingly for each file. Alternatively, the files may now be compressed twice, which can happen when the Content-Encoding response header is missing.

> Also, we cannot change the names of any files.

But you'll need to send a proper Content-Encoding header, and AddEncoding works by file extension.
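
For illustration, this is the kind of extension-based mapping AddEncoding is built for. It is a minimal sketch that assumes a hypothetical file stored as archiveda.html.gz, which is exactly the rename the question rules out, so it shows the limitation rather than a fix.

```apache
# mod_mime: map the .gz extension to a content encoding.
AddEncoding x-gzip .gz

# A file named archiveda.html.gz (hypothetical name) is then served as
# text/html with a gzip content encoding, because mod_mime reads both
# extensions of the file name.
AddType text/html .html
```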

> Mod_deflate bypassing compressing

mod_deflate does this if a header like 'Content-Encoding: gzip' is already present on the response.
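
One way to get that header onto a file whose name cannot change might be a per-file section. This is a minimal sketch, assuming mod_env and mod_headers are loaded and that archiveda.html (the name from the question) is stored gzip-compressed. It has not been tested against the configuration above, and it does nothing for clients that cannot accept gzip.

```apache
# Assumption: archiveda.html is stored gzip-compressed on disk.
<Files "archiveda.html">
    # Ask mod_deflate to leave this response alone (mod_env).
    SetEnv no-gzip 1
    # Label the stored bytes as gzip so the client decodes them (mod_headers).
    Header set Content-Encoding gzip
</Files>
```

Every pre-compressed file would need its own section or a pattern in a FilesMatch section, so this only scales if the compressed files can be listed or matched by name.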

> and serves the pre-compressed version to clients that can handle it OR uncompresses for clients that can not.

mod_deflate can uncompress a compressed response body, too.

    SetOutputFilter INFLATE

But unfortunately the 'SetOutputFilter' directive does not accept an environment-variable condition such as
   SetEnvIf ....... uncompress
   SetEnvIf ....... uncompress
   SetOutputFilter INFLATE env=uncompress

To accept such a syntax, you would need to modify httpd's source code or create something like an AddOutputFilterByEnv directive. AddOutputFilterByType accepts only MIME types.

mod_ext_filter may be another option, too.
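
As a rough idea of how mod_ext_filter could fill the gap: its ExtFilterDefine directive takes an enableenv parameter, so the filter only runs when a given environment variable is set. The sketch below is untested; the filter name gunzip, the gzip path, and the file name are assumptions.

```apache
# Requires mod_ext_filter. "gunzip" is a hypothetical filter name and
# /bin/gzip is an assumption about the host system.
ExtFilterDefine gunzip mode=output cmd="/bin/gzip -dc" enableenv=uncompress

# Mark the pre-compressed file (hypothetical name), then clear the mark
# again for clients that advertise gzip support.
SetEnvIf Request_URI "^/archiveda\.html$" uncompress
SetEnvIf Accept-Encoding "gzip" !uncompress
```

The filter would still have to be attached to the response, for example with SetOutputFilter in a matching section, and its interaction with the DEFLATE filter set in <Location /> would need checking.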
