novanation

Expire response header not honored by Internet Explorer

We are working on an issue where IE seems to ignore our Cache-Control headers. Some years back we began adding a max-age response header to speed up initial page hits on a new browser launch. This stops the browser from doing a content update check (304 return code) until the content expires, and it significantly improved the performance of the web sites that used it. While researching a performance problem we have observed that max-age no longer has any effect. We have tested internally on IE7, 8 and 9, on both Windows XP and Windows 7.

It could be a problem with Internet Explorer not handling cached content the way we expect it to.  The effect in the US is minimal, but in our international offices, it accounts for a noticeable slowdown.

The previous (and desired) behavior is for the conditional request (the one answered with a 304) not to be made at all, saving the round trip to the server. The content check should not occur until the content has expired (in this case, May of this year). Nothing we are doing sets the Pragma header. We don't think it is really in the request; we believe it's internal to IE and is the reason it's bypassing the expiration date on the content. There is no JavaScript involved.

When many 304 requests happen on initial startup, as is our case, the performance of the web site is badly impacted.

So, bottom line: any content with a max-age set on the initial fetch should never trigger a request to the web server until it has expired, and no 304s should be observed.
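To make the expected behavior concrete, here is a minimal sketch of the freshness check a cache performs before deciding whether to go to the network. It is an illustration only, not IE's actual logic: the function names are made up, and it deliberately ignores Expires, the Age header and heuristic freshness.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def max_age_seconds(cache_control):
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip('"')
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(response_headers, now=None):
    """True if the cached response may be reused without any request to the server."""
    now = now or datetime.now(timezone.utc)
    max_age = max_age_seconds(response_headers.get("Cache-Control", ""))
    if max_age is None:
        return False  # simplification: no max-age means "go ask the server"
    stored_at = parsedate_to_datetime(response_headers["Date"])
    age = (now - stored_at).total_seconds()
    return age < max_age
```

While `is_fresh` returns True, the behavior described above is that nothing goes out on the wire at all, so neither a 200 nor a 304 should show up in a trace.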
Dave Baldwin

A lot of info about cache control here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html  You might want to make sure that your server is not sending "max-age=0" in its response header.
novanation

ASKER

Based on what I can see, it does not appear to be sending max-age=0 in its response header.
How are you checking the Request and Response headers?  Are you familiar with Wireshark or Fiddler?
Yes. By collecting a Fiddler trace I am able to replay the traffic and observe that the cached file gets reused if you go to the URL by entering it manually or by navigating from another page. However, if you refresh the page or press F5, the Pragma header and the If-Modified-Since header get sent in the request, which triggers the 304 response from the server.
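For anyone following along, this is roughly how a server turns an If-Modified-Since request into a 304. It is a simplified sketch, not the code of any particular web server; the function name and the example timestamp are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical last-modified time of the requested file.
LAST_MODIFIED = datetime(2012, 1, 2, 10, 0, tzinfo=timezone.utc)


def status_for_request(request_headers, last_modified=LAST_MODIFIED):
    """Return 304 if the client's copy is still current, otherwise 200."""
    since = request_headers.get("If-Modified-Since")
    if since is None:
        return 200  # unconditional request: send the full body
    try:
        client_copy_time = parsedate_to_datetime(since)
    except (TypeError, ValueError):
        return 200  # unparseable date: treat the request as unconditional
    # File unchanged since the client cached it: headers only, no body.
    return 304 if last_modified <= client_copy_time else 200
```

The 304 saves the body transfer but not the round trip, which is why it still hurts on high-latency links.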

Attached is the trace file.  This site blocked me from uploading a .SAZ file, so I renamed it to .ZIP.  Once you download it, you will need to change it back to a .SAZ file extension.

(Edit: File replaced - Modulus_Twelve)
Trace-CacheProblemIE.zip
Ctrl-F5 (or Ctrl-R) is always supposed to do that.  But the browser can also be set up to Always do that.  I don't have a page set up to test that at the moment.  In the 'General' tab in Internet Options, there is Browsing History, and when I click on the Settings button, my IE is set to 'Automatically' check for page changes.  I don't actually know what that means.
I know this is tricky to understand. "No request" is the previously observed and, we feel, correct behavior: the content should come immediately from the local cache.  The content check should not even occur until the content has reached its expiration.  Non-internal sites appear to function OK so far (caching correctly).  Could it be something related to "Intranet Zone" settings in Internet Explorer?  As far as I am aware, Internet Explorer has not changed the way it handles max-age headers.  Externally, could something be stripping headers off the response?  If you need any additional traces just let me know.

It is also important to mention that we do not see this on alternative browsers, such as Firefox or Chrome.
I understand what you're saying and I was wondering about the other browsers.  Do all of your IE installs do the same thing?  Are there addons being used in IE?

I just looked at your Fiddler file.  A lot of the image and javascript files are being requested with "Pragma: no-cache" in the Request header with a 200 response.  That is followed immediately by a request for the same file with a 304 response.  I don't know why the "Pragma: no-cache" is in the Request header.  I also don't know why a second request for the same file is following.
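The pattern I'm describing could be picked out of a trace with something like the sketch below. The session dictionaries are a made-up structure for illustration, not Fiddler's actual export format, and `find_double_fetches` is a hypothetical helper.

```python
def find_double_fetches(sessions):
    """Return URLs fetched with 'Pragma: no-cache' (200) and then immediately re-requested (304)."""
    hits = []
    # Walk adjacent pairs of sessions in capture order.
    for prev, cur in zip(sessions, sessions[1:]):
        same_url = prev["url"] == cur["url"]
        forced = prev["request_headers"].get("Pragma", "").lower() == "no-cache"
        if same_url and forced and prev["status"] == 200 and cur["status"] == 304:
            hits.append(cur["url"])
    return hits
```

Each session is assumed to be a dict with `url`, `status` and `request_headers` keys, listed in the order the requests were captured.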
In our international offices, yes all instances of IE are exhibiting this behavior.  We tried disabling browser addons to rule out any potential conflict; however, there was no change in the response time.  

I am not sure why image and javascript files are being requested with "Pragma: no-cache". This code was written by another team so I will have to verify why it was done that way.  I believe it goes back to why they began adding a max-age response header, which now seems to be ignored by the browser.

My main concern, though, is why we are only observing this in IE.  If that is the reason for the latency, why wouldn't the same behavior be exhibited in alternative browsers?
Something involving IE and your setup is causing those files to be requested Twice, both times ignoring what you think the cache settings are.  Note that if the cache settings were being observed, you would not see Any requests on the network for files that are in the cache.
Would you have any thoughts or theories as to what that might be?  I am sort of at a loss at this point as to the next direction to take.  This issue just surfaced a few weeks back and there have been no changes to our environment.
You mean no 'intentional' changes because if it was working and now it isn't... Something has changed.  I guess I would check to see if any GPO settings have been sent out.  They are much more likely to affect IE than the other browsers.
I just checked with our development team and they have fixed the "Pragma: no-cache" in the Request header.  I guess the main issue right now is that IE will not honor the max-age header on a page after a meta-refresh, so we have removed that from the main page. The Pragma no-cache is coming from IE itself, like a command to skip its cache; it doesn't go out over the wire to the server.

We don't know when this behavior in IE changed... only that we believe it used to work.
"it doesn't go out over the wire to the server"

I don't know who told you that but I don't believe it's true.  It would not be in the Wireshark capture of network traffic if it didn't get sent to the server.  That also does not explain why IE is requesting the same file twice.  Give it a day and do another Wireshark capture and we'll see what it's doing then.
I attached another capture taken after the fix.  A lot of the 304s have been eliminated, but not all.  Again, the capture will need to be renamed as a .SAZ file.

We have determined that when a web page does a 'meta-refresh' to another page (a form of redirect), IE will ignore unexpired content and make conditional requests for updated content (304 return code). When a sufficient volume of resources is involved, as with this site, this impacts performance substantially.  We corrected this by removing the meta-refresh from the page and replacing it with JavaScript.

Trace-CacheProblemIEAfterFix.zip
(Edit: File replaced - Modulus_Twelve)
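If other pages might still carry a meta-refresh, a quick scan along these lines could help find them. This is only a sketch using Python's standard `html.parser`; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the content attribute of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lowercased
        if (attributes.get("http-equiv") or "").lower() == "refresh":
            self.refreshes.append(attributes.get("content") or "")


def find_meta_refreshes(html):
    """Return the content values of all meta-refresh tags found in the HTML text."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.refreshes
```

An empty list means the page has no meta-refresh; anything else is a candidate for the same workaround.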
This time I do not see any duplicate file requests.  Looking at the response headers in the bottom half of the right-hand section in Fiddler, I do not see any files that are setting 'max-age', and the cache Date, where it appears, is today.  Try two different things: #1. Do a capture when pressing Ctrl-F5 to reload all the files so we can see what the expiration dates are in the response headers.  #2. Do the same thing in Firefox so we can compare them.  Ctrl-F5 should make an unconditional request for the files that results in a 200 response code instead of 304.
Thanks for your continued support on this issue.  I am in the process of doing the additional captures now.  

We are still trying to get a better understanding of the meta-refresh function in IE and why it does not seem to honor the max-age header.  Our web team believes this to be a bug in IE, but at a minimum, it appears to be a change in behavior in the last year. As previously mentioned, Firefox works properly.  Now that we have a relatively straightforward workaround, we have to decide how much energy to spend on this with Microsoft; our history teaches us that is probably a dead end.  Do you have any theories on this?
This is a somewhat old article but it says that IE6 and IE7 have this issue and supposedly IE8 doesn't.  I have to ask Why are you doing a "meta-refresh" in the first place?  If you have access to the server, redirects should be done there.
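By a server-side redirect I mean something like the sketch below: the server answers with a 3xx status and a Location header instead of serving a page that contains a meta-refresh. This uses Python's built-in `http.server` purely for illustration; the paths in `REDIRECTS` and the helper `make_server` are hypothetical, and on a real site you would configure the equivalent in your web server.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of old paths to their redirect targets.
REDIRECTS = {"/": "/main.html"}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = REDIRECTS.get(self.path)
        if target is None:
            self.send_error(404)
            return
        # Tell the browser where to go; no HTML page is involved.
        self.send_response(302)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Build (but do not start) a local server; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

To try it locally, call `make_server(8080).serve_forever()` and request `/` with any browser or with Fiddler running.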
I totally agree, and as I mentioned earlier, we actually corrected this by removing the meta-refresh on the page in question and replacing it with JavaScript.  This did make a significant difference in the response time.  At this point we are just trying to understand whether this is expected behavior in IE.
ASKER CERTIFIED SOLUTION
Dave Baldwin