weird html stuff going on

cycledude
Hi

I have a website that is being converted from English to another language.

Because of all the unusual characters, I am converting each file to UTF-8 so that the page will display them correctly.

When I view the pages in Firefox and open Firebug, it shows a weird HTML entity, &#65279;, that does not appear on the rendered page, only in the HTML.

In Firebug I see:


+<head>
-<body>
  &#65279;
  <h1>Hello</h1>
  &#65279;
</body>




Here's a link:

http://79.170.44.87/yourwebsite.co.uk/mdc/po/

The odd characters seem to be related to the helper file and the footer: when they are removed, the entities disappear. Any ideas what's going on?

I have an

index.php

All it does is include 3 files: header, home, and footer.

In the header there is an include of another file called helpers.

Before I converted everything to UTF-8 it was working as expected.

index.php
<?php 
	
		include 'header.html.php';
		include 'views/home.html.php';
		include 'footer.html.php';
?>



header.html.php

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
	<head>
		
	
		<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
		
		<?php include "inc/helpers.php"; ?>

	</head>
	<body>
	



home.html.php
			
			
	<h1>Hello</h1>
			
	



footer.html.php
	
                </body>

</html>



helpers.php
<?php

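// Returns 0 for the "Home" page, or null when no case matches.
function get_active($pagename)
{
	$a = null; // default, so $a is always defined before the return

	switch (substr($pagename, 0, 5))
	{
		case "Home":
			$a = 0;
			break;
	}

	return $a;
}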

?>


Most Valuable Expert 2011
Top Expert 2016

Commented:
Here is the view source from that URL.  No funny stuff anywhere in sight!
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
	<head>
		
	
		<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
		
		¿
	</head>
	<body>
				
			
	<h1>Hello</h1>
			

¿	</body>

</html>


Author

Commented:
You have to use Firebug to see the entities. They are just above and below the h1.

They are throwing off the layout, and I need to know why they are there so I can stop them from appearing.

Author

Commented:
I just tested the exact same code in files that were ANSI encoded, and there are no problems whatsoever.

This is so weird. I have to find a resolution though, because I need UTF-8 for the special characters in the Portuguese language!

Anuradha Goli, Systems Development / Support Specialist
Commented:
The character in question, &#65279;, is the Unicode character 'ZERO WIDTH NO-BREAK SPACE' (U+FEFF). You may have copied it into your code via copy/paste without realizing it. Because it is invisible, it is hard to spot in an editor that renders Unicode characters normally.

One option is to open the file in a very basic text editor (Notepad) that doesn't understand Unicode, or in one that understands it but can display non-ASCII characters using their actual codes.

Once you locate it, you can delete the small block of text around it and retype that text manually.
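
If the editor will not show the character, another way to locate it is to search the raw file bytes. This is only a minimal sketch, assuming the files are saved as UTF-8, where U+FEFF is stored as the three bytes EF BB BF. find_feff_offsets() is a hypothetical helper name, and the file paths are the ones from the question.

```php
<?php
// Sketch: report the byte offsets at which U+FEFF occurs in a UTF-8 string.
// find_feff_offsets() is a hypothetical helper, not a PHP built-in.
function find_feff_offsets(string $bytes): array
{
    $needle = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
    $offsets = [];
    $pos = 0;

    // Walk through the string, recording every position where the needle starts.
    while (($pos = strpos($bytes, $needle, $pos)) !== false) {
        $offsets[] = $pos;
        $pos += strlen($needle);
    }

    return $offsets;
}

// Usage: scan the files from the question (paths assumed relative to the site root).
$files = ['header.html.php', 'views/home.html.php', 'footer.html.php', 'inc/helpers.php'];
foreach ($files as $file) {
    if (is_readable($file)) {
        $offsets = find_feff_offsets(file_get_contents($file));
        echo $file, ': ', $offsets ? implode(', ', $offsets) : 'none', PHP_EOL;
    }
}
```

An offset of 0 means the character is the very first thing in the file.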

Author

Commented:
Same code, ANSI encoding:

http://79.170.44.87/yourwebsite.co.uk/mdc/po2/

You should be able to see the different position of the <h1>: it is higher in mdc/po2 than it is in mdc/po.

Author

Commented:
@anuradhay

I am using Notepad++ to develop.

Author

Commented:
I have just been doing a little testing with my simple example.

With all pages set to ANSI, if I change 'home.html.php' to UTF-8, I get the strange HTML entity above the <h1> again. I assumed that if I also changed the footer to UTF-8 I would get the entity below as well, and yes, it does. So the 2 files I have problems with are the home and footer.

I entered the code by typing and didn't copy/paste.

Driving me crazy!
Yay, I have the solution:


http://www.w3.org/International/questions/qa-utf8-bom.en.php

I am able to set the encoding to UTF-8 without BOM, and it works!


It seems there are problems when you use PHP to generate the HTML and include files saved with a BOM, as I was doing. Hurrah!
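
For anyone who cannot re-save the files from their editor: PHP sends everything that comes before the opening <?php tag (or everything in a plain HTML include) straight to the output, so a BOM at the start of an included file ends up in the middle of the page. A rough sketch of stripping it in bulk is below. strip_utf8_bom() is a hypothetical helper name, the file list is just the one from my example, and the loop rewrites files in place, so back them up first.

```php
<?php
// Sketch: remove a leading UTF-8 BOM (bytes EF BB BF) from a string.
// strip_utf8_bom() is a hypothetical helper, not a PHP built-in.
function strip_utf8_bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF";

    // Only a BOM at the very start of the content is removed.
    if (strncmp($bytes, $bom, strlen($bom)) === 0) {
        return substr($bytes, strlen($bom));
    }

    return $bytes;
}

// Usage: rewrite each file only if it actually starts with a BOM.
// The paths are assumed to be relative to the site root.
$files = ['header.html.php', 'views/home.html.php', 'footer.html.php', 'inc/helpers.php'];
foreach ($files as $file) {
    if (is_file($file) && is_writable($file)) {
        $original = file_get_contents($file);
        $clean = strip_utf8_bom($original);
        if ($clean !== $original) {
            file_put_contents($file, $clean);
            echo "Removed BOM from $file", PHP_EOL;
        }
    }
}
```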

Author

Commented:
Thanks for the help, fellas.

Author

Commented:
Drove me mad, but I got it sorted. Won't forget about the BOM again, that's for sure!
