brian-barnett

IE 10 not displaying my html

There are strange characters being inserted into the response html sent back to the client in my JSP pages. I cannot determine what is causing it. Here is a sample using this web site - http://www.rexswain.com/httpview.html
2000(CR)(LF)
<!DOCTYPE·html·PUBLIC·"-//W3C//DTD·HTML·4.01//EN"·"http://www.w3.org/TR/html4/strict.dtd">(LF)
(LF)
<html>(LF)
<head>(LF)


The 2000(CR)(LF) is getting inserted in there somehow. The inserted content varies from page to page.

It is causing IE10 to display a blank page.

Using Tomcat 6.0.29, Java 1.7, JSP 2.1, JSTL 1.2.

You can see it for yourself using this URL:
http://test1.calcxml.com/calculators/home-affordability?skn=504 

If you paste the above URL into http://www.rexswain.com/httpview.html, you will see this:

5a6(CR)(LF)
<!DOCTYPE·html·PUBLIC·"-//W3C//DTD·HTML·4.01//EN"·"http://www.w3.org/TR/html4/strict.dtd">(CR)(LF)
<html>(CR)(LF)


All other browsers seem to strip out the characters before the doctype declaration, but not IE10.

Anybody have any ideas on how to prevent this from being inserted in there?
Dave Baldwin

In that viewer, (CR)(LF) is the 'carriage return / linefeed' pair that you would find in any text or HTML file for normal formatting.  They are ignored in web pages but are included in the viewer for your information.  They are not showing up in the "View Source" for your page in Firefox or IE8.

When I put one of my own web pages in that viewer, I don't see anything before the DOCTYPE.
brian-barnett

ASKER

Right, the (CR)(LF) do not show up in "View Source", nor does the number/hex code preceding the (CR)(LF). But why is "5a6(CR)(LF)" inserted before my DOCTYPE declaration? Hoping that someone has experienced this before with JSP, Tomcat, etc., and has figured out what causes this to be inserted. My JSP page begins with the DOCTYPE declaration, so I don't know where it is coming from.
I'm wondering if you might have some Apache prepend setting like this:

http://stackoverflow.com/questions/5038692/how-to-tell-apache-to-prepend-to-each-html-page-login-php
Apache is not part of it. This is a Tomcat-only web app.
Try creating the simplest possible JSP page and start from there.  I tried a few other JSP pages I found in that viewer and none of them had the problem.  Maybe Tomcat is putting it in there in your installation.
I created a very simple JSP page. I am using Struts and Struts Tiles.

Template JSP page
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<%@ taglib prefix="tiles" uri="http://struts.apache.org/tags-tiles-el"%>
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
<html>
<head>
	<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
	<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
	<meta http-equiv="Pragma" content="no-cache" />
	<meta http-equiv="Expires" content="0" />
</head>
<body>
	<tiles:insert attribute="primaryPanel" ignore="true" />
</body>
</html>


JSP page which is the primaryPanel for the tiles:insert above
<div id="formcontent">
<p>Hello World!</p>
</div>


If you run this through http://www.rexswain.com/httpview.html using this link - http://test1.calcxml.com/do/mytest, you will see three separate pieces of data that have "magically" been inserted into the response html, bolded below:

179(CR)(LF)
<!DOCTYPE·html·PUBLIC·"-//W3C//DTD·HTML·4.01//EN"·"http://www.w3.org/TR/html4/strict.dtd">(CR)(LF)

47(CR)(LF)
<div·id="formcontent">(CR)(LF)

</html>(CR)(LF)
(CR)(LF)
0(CR)(LF)
(CR)(LF)
you will see three separate pieces of data that have "magically" been inserted into the response html, bolded below:
The viewer is telling you that the page contains CRLF pairs
By saving it from IE8, I was able to see that your page includes a BOM - Byte Order Mark - which Firefox understands but I don't know if IE does.  http://en.wikipedia.org/wiki/Byte_order_mark  Also, W3C says it's not needed for UTF-8.  http://www.w3.org/International/questions/qa-byte-order-mark.en.php

See if you can eliminate that and see if the other problems still exist.  An easy test would be to change 'UTF-8' to 'ISO-8859-1' in that page above.
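For anyone who wants to check for a UTF-8 BOM without saving the page from a browser, here is a minimal Java sketch. It only tests whether a byte array starts with the three bytes EF BB BF. The class and method names are hypothetical, and how you obtain the raw response bytes is left out on purpose.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat, Struts, or Tiles.
final class BomCheck {

    // Returns true when the data begins with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'h', 't', 'm', 'l', '>'};
        byte[] withoutBom = "<html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWithUtf8Bom(withBom));    // true
        System.out.println(startsWithUtf8Bom(withoutBom)); // false
    }
}
```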
Changed to ISO-8859-1, but still getting the inserted characters. You can see it with the change to ISO-8859-1 at same link - http://test1.calcxml.com/do/mytest
The encoding doesn't have anything to do with it. You'll find the CRLF pair have the same encoding in nearly all charsets. I'm not sure why you're worried about it - it won't show in any normal user agent
I'm not sure why you're worried about it

1. IE10 will not render the page
2. Curious why Struts, Tiles, or Tomcat is injecting the BOM, or whatever they are, into my html prior to returning to client.
3. w3c's validator doesn't validate it
http://test1.calcxml.com/calculators/home-affordability?skn=504 doesn't have a BOM ...

"This document was successfully checked as HTML 4.01 Strict!
Result:       Passed "
With the ISO-8859-1 code, there is no BOM showing in IE or Firefox.  I don't know why the viewer is showing something.

The only reason your current test page (with ISO-8859-1) does not validate is because HTML 4.01 doesn't use the 'self-closing' tags with '/>' at the end, just '>'.

Does IE10 render the current test page?
Interesting... on the viewer page, if I select HTTP/1.0, the extra data does not show.
If I paste http://test1.calcxml.com/calculators/home-affordability?skn=502 into the Direct Upload version of the W3C validator, it does validate.  The error you get from a link that says there was no content indicates that the validator and your server do not agree on how to communicate.
The problem with that is in the area of validity - nothing to do with boms and the like
I modified the test page so the meta tags did not self-close and also changed it back to UTF-8. I ran it through w3c validator successfully and then tested it on IE10. It rendered on IE10.

The viewer still shows inserted content though??
inserted content
What kind of 'inserted content'?
Why would HTTP/1.0 not cause the extra characters to get inserted??
Not if you select HTTP/1.0 instead of HTTP/1.1.  I suspect that you are seeing some code that a browser will ignore or at the least not show to you.  Since it is working in IE10 for you, you now have something to compare against.
What kind of 'inserted content'?

Go here http://www.rexswain.com/httpview.html and use this link - http://test1.calcxml.com/do/mytest. Also choose HTTP/1.1

Here is the content as reported by the viewer:

Content (Length = 479):

186(CR)(LF) <-- Why is this line here? I did not put it there.
<!DOCTYPE·html·PUBLIC·"-//W3C//DTD·HTML·4.01//EN"·"http://www.w3.org/TR/html4/strict.dtd">(CR)(LF)
(CR)(LF)
(CR)(LF)
<html>(CR)(LF)
<head>(CR)(LF)
(HT)<meta·http-equiv="Content-type"·content="text/html;·charset=UTF-8">(CR)(LF)
(HT)<meta·http-equiv="Cache-Control"·content="no-cache,·no-store,·must-revalidate">(CR)(LF)
(HT)<meta·http-equiv="Pragma"·content="no-cache">(CR)(LF)
(HT)<meta·http-equiv="Expires"·content="0">(CR)(LF)
(HT)<title></title>(CR)(LF)
</head>(CR)(LF)
<body>(CR)(LF)
(HT)(CR)(LF)
47(CR)(LF) <-- Why is this line here? I did not put it there.
<div·id="formcontent">(CR)(LF)
<p>Hello·World!</p>(CR)(LF)
</div>(CR)(LF)
</body>(CR)(LF)
</html>(CR)(LF)
(CR)(LF)  <-- Why is this line here? I did not put it there.
0(CR)(LF)  <-- Why is this line here? I did not put it there.
(CR)(LF)  <-- Why is this line here? I did not put it there.
I'm not sure why but I think you are seeing the 'chunk' encoding.  http://en.wikipedia.org/wiki/Chunked_transfer_encoding  Yes, that's what it is.  The only other page I could find quickly that used it is Amazon.com and it shows those extra numbers also.  And the reason it doesn't show with HTTP/1.0 is because it's only available with HTTP/1.1.

I guess it's obvious that what you do is often not the only thing being delivered with your web pages.  The viewer page also shows the headers for request and response that are generated.  Those exist for every request but they are not something that you write.  They are from the interaction of the browser and the server.
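To make the chunk framing concrete: in HTTP/1.1 chunked transfer encoding, each chunk is a hexadecimal size line, CRLF, that many bytes of data, then CRLF, and a chunk of size 0 ends the body. Read that way, the "186" and "47" in the viewer output would be 0x186 = 390 and 0x47 = 71 bytes of page data, and together with 18 bytes of framing that seems to account for the 479 bytes the viewer reports. Below is a minimal Java sketch of a decoder. The class name is hypothetical, it ignores trailer headers, and it is meant only to illustrate what a browser does with those numbers, not how Tomcat produces them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of removing HTTP/1.1 chunk framing from a raw body.
final class ChunkedBodyDecoder {

    // Strips the chunk-size lines and CRLF separators, returning only the payload.
    static byte[] decode(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int lineEnd = indexOfCrlf(raw, pos);
            if (lineEnd < 0) {
                throw new IllegalArgumentException("missing chunk-size line");
            }
            String sizeLine = new String(raw, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';'); // drop optional chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // sizes are hexadecimal
            pos = lineEnd + 2;
            if (size == 0) {
                break; // last chunk; any trailers are ignored in this sketch
            }
            if (pos + size > raw.length) {
                throw new IllegalArgumentException("truncated chunk");
            }
            out.write(raw, pos, size);
            pos += size + 2; // skip the data and the CRLF that follows it
        }
        return out.toByteArray();
    }

    private static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String wire = "5\r\nHello\r\n7\r\n World!\r\n0\r\n\r\n";
        byte[] body = decode(wire.getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(body, StandardCharsets.US_ASCII)); // Hello World!
    }
}
```

A browser or HTTP client library removes this framing before the HTML parser ever sees the page, which is why the numbers do not appear in "View Source".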
<-- Why is this line here? I did not put it there.
As we've said several times - that's *not* content, it's just line separators, non-printing characters
If those line separators/non-printing characters are "normal" or "okay", then not sure why w3c validator does not validate http://test1.calcxml.com/calculators/home-affordability?skn=502.

Also not sure why IE10 does not render this same page (except if you select "compatibility mode").

Problem is that we have clients complaining that our pages will not render for them and they are running IE10. Maybe I was chasing the wrong thing, but I thought it was the "extra content" I had mentioned.
As we've said several times - that's *not* content, it's just line separators, non-printing characters

I realize the (CR)(LF) piece is line separators/non-printing, but what about the numbers which precede them?
I realize the (CR)(LF) piece is line separators/non-printing, but what about the numbers which precede them?
Offset of their appearance?
SOLUTION
CEHJ

This solution is only available to members.
My guess was IE10 failed to render for the same reason the w3c validator failed. Unfortunately, I don't have the ability to "either upload the html or input it directly" for our IE10 users.

It seems that every page which succeeds in the w3c validator succeeds to render in IE10.

Fails in both (with no helpful indication as to why it fails):
http://test1.calcxml.com/calculators/home-affordability?skn=502

Succeeds in both:
http://test1.calcxml.com/calculators/home-affordability?skn=504
The '502' version fails because it isn't being either sent or loaded, one of the two.  In the validator, the content is 0.  It does load in IE8.

How are you sending those files?  Obviously it's not a direct link.
They are JSP pages, built with Struts Tiles, JSTL tags, etc., sent from Tomcat 6 servlet container. As you noted, it loads in IE8, and all other browsers I am aware of, except IE10.

There must be something about how Tomcat is passing the response?? or something about the html itself??

I'm at a loss right now...
I guess it's time for Wireshark to see what's being passed between the two.  http://www.wireshark.org/
Can you locate an xhtml test page please - one that renders correctly in IE10? We can then start examining the source
I'm surprised that more people haven't seen this.  I think it's just IE10 applying more security rules which is making it a problem for the asker.

For example, here's a hex dump of the header of a saved html file generated by a tomcat 6 jsp page:
od -xc fragment.html

0000000 683c 6d74 206c 6d78 6e6c 3d73 6822 7474
          <   h   t   m   l       x   m   l   n   s   =   "   h   t   t
0000020 3a70 2f2f 7777 2e77 3377 6f2e 6772 312f
          p   :   /   /   w   w   w   .   w   3   .   o   r   g   /   1
0000040 3939 2f39 6878 6d74 226c e23e 8b80 80e2
          9   9   9   /   x   h   t   m   l   "   > 342 200 213 342 200
0000060 e28b 8b80 80e2 e28b 8b80 683c 6165 3e64
        213 342 200 213 342 200 213 342 200 213   <   h   e   a   d   >
0000100 0a0a 0a0a 0a0a 0a0a 0a0a
         \n  \n  \n  \n  \n  \n  \n  \n  \n  \n

Notice the unprintable characters between xhtml"> and <head> .  I've spent some time tracking this down as well, and I think it's generated by Tomcat as it renders the html from the jsp.  Which is why there isn't any code to find in the jsp itself to cause this.
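The repeated bytes in that dump (octal 342 200 213, i.e. hex E2 80 8B) are the UTF-8 encoding of U+200B, the zero-width space. A minimal Java sketch for locating such characters in a saved page follows. The class and method names are hypothetical, and it assumes the bytes are UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for finding invisible U+200B characters in UTF-8 markup.
final class InvisibleCharScanner {

    // Returns the character offsets of every zero-width space (U+200B) in the decoded text.
    static List<Integer> findZeroWidthSpaces(byte[] utf8) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        List<Integer> hits = new ArrayList<Integer>();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\u200B') {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "<html>" followed by two zero-width spaces and "<head>"
        byte[] sample = {
            '<', 'h', 't', 'm', 'l', '>',
            (byte) 0xE2, (byte) 0x80, (byte) 0x8B,
            (byte) 0xE2, (byte) 0x80, (byte) 0x8B,
            '<', 'h', 'e', 'a', 'd', '>'
        };
        System.out.println(findZeroWidthSpaces(sample)); // [6, 7]
    }
}
```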
That could be.  But a direct download of those two files shows no sign of unexpected non-printing characters in a hex viewer, just a lot of extra white space.
DaveBaldwin - you're right -- no unprintable chars in the link to the "always fails" html given above.

Either that's not the problem, or the asker didn't get a good source of the generated page.
SOLUTION
This solution is only available to members.
Notice the unprintable characters between xhtml"> and <head> .
I find that illegible, I'm afraid - can you please use xxd instead of od? ;)
Made some progress today, but need to go to bed. Found a handful of skn=XXX numbers that all failed. Tried to figure out what was similar among them and found that they all have a setting in our database to execute our mobile device detection code. We use the WURFL 2.3.3 file and API to do this. I found that, with the WURFL code we are using, the Validator and IE10 both are identified as wireless devices, i.e., is_wireless_device = true in the WURFL code.

So, this has shown me that I need to check out our "mobile-friendly" html code that gets returned and make sure it is compliant. It also shows me that I need to try to figure out why Validator and IE10 are being identified as wireless devices by WURFL.
That is some serious progress.  Let us know what you find.
the Validator and IE10 both are identified as wireless devices
That would probably break it, yes ;) Sounds like you need to look at the User-Agent database
Anybody know how I can change the title of this question? I'd like it to be more descriptive of the actual problem we finally identified. Maybe something like "IE10 not displaying my html" or something like that, and maybe modify the tags a bit too.
ASKER CERTIFIED SOLUTION
This solution is only available to members.
If WURFL is designed properly, it should allow extension of its 'database' plus, with luck, overriding of its heuristics. You ought to be able to correct those glitches. It probably hasn't been touched since the appearance of IE10 ;)
I identified the cause of the problem myself, but the experts' suggestions along the way led me there.