Including UTF-8 content in UTF-8 output

Posted on 2008-11-06
Last Modified: 2010-08-05
I have some UTF-8 text that I want to include in my PHP. The PHP itself outputs UTF-8, so I thought it should be a no-brainer. The PHP has <?php header('Content-Type: text/html; charset=UTF-8'); ?> and <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.

I was expecting to be able to include the UTF-8 text simply by using include(), but I find that I have to convert it to ISO-8859-1 for it to get converted back again to UTF-8 (see the code snippet). This seems silly. Is there a portable way to make the internal encoding UTF-8 rather than ISO-8859-1, to avoid the to-and-fro conversion?


if (is_file('frag/movie/review/'.$id_path.'.txt'))




	echo("<p>No movie review available</p>");



Question by:rstaveley
    LVL 2

    Assisted Solution

    You can simply change the character set of the content to UTF-8 with an editor. I recommend Notepad++, but you can google it; there should be plenty of documents explaining how to do it.

    If you can't get it to work, let me know.
    LVL 17

    Author Comment

    You didn't understand my question.

    Here is the background:

    1. The text file is valid UTF-8 text (with no BOM). This has been verified.
    2. My PHP outputs UTF-8, using Content-Type: text/html; charset=UTF-8.
    3. I expected to be able to use include() to include the UTF-8 file, but I was wrong.
    4. If I convert my included UTF-8 to ISO-8859-1, using utf8_decode(), it works.
    Here is my problem:

    • It seems inefficient for the UTF-8 text to be converted by the PHP script to ISO-8859-1 so that PHP can convert it back again to UTF-8. This must be making it slow, and it must mean that it can only handle characters which can be converted to ISO-8859-1.
    Here is my question:

    • How do I make my PHP work internally in UTF-8 rather than ISO-8859-1?
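The concern about characters that cannot be converted to ISO-8859-1 can be illustrated with a minimal sketch. The sample strings are hypothetical and written as explicit UTF-8 bytes. utf8_decode() and utf8_encode() are deprecated as of PHP 8.2, so the sketch silences that notice.

```php
<?php
// utf8_decode()/utf8_encode() are deprecated as of PHP 8.2; hide the notice for this demo.
error_reporting(E_ALL & ~E_DEPRECATED);

// Hypothetical sample strings, spelled out as UTF-8 bytes.
$latin = "caf\xC3\xA9";     // "café": U+00E9 exists in ISO-8859-1
$euro  = "5 \xE2\x82\xAC";  // "5 €":  U+20AC does not exist in ISO-8859-1

// Round trip: UTF-8 -> ISO-8859-1 -> UTF-8.
$latinRoundTrip = utf8_encode(utf8_decode($latin));
$euroRoundTrip  = utf8_encode(utf8_decode($euro));

var_dump($latinRoundTrip === $latin); // bool(true): the text survives
var_dump($euroRoundTrip === $euro);   // bool(false): the euro sign became "?"
echo bin2hex(utf8_decode($euro)), "\n"; // 35203f, i.e. "5 ?"
```

Anything outside ISO-8859-1 is replaced by "?" on the way through, so a round trip of this kind is only safe for text limited to that repertoire.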
    LVL 17

    Accepted Solution

    I was wrong about this.

    > The text file is valid UTF-8 text (with no BOM). This has been verified.

    This is what was going on: http:Q_23900962.html. My verification was plain wrong. I have since checked again, and what I'm generating now really is UTF-8.
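For anyone repeating this kind of verification, here is a minimal sketch of three checks: well-formed UTF-8, absence of a BOM, and a heuristic for data that was UTF-8 encoded twice. The helper names are hypothetical and not from this thread. The double-encoding test is only a pattern match for the typical byte sequence left behind when UTF-8 is re-encoded as if it were ISO-8859-1. Note that doubly encoded text is still valid UTF-8, so a validity check alone does not catch it.

```php
<?php
// Hypothetical helper names; none of them come from the thread.

/** True if $bytes is well-formed UTF-8 (PCRE's /u flag rejects malformed input). */
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

/** True if $bytes starts with the UTF-8 byte order mark EF BB BF. */
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

/**
 * Heuristic only: a UTF-8 lead byte (C2..F4) re-encoded from ISO-8859-1
 * becomes C3 82..C3 B4, and a continuation byte (80..BF) becomes C2 80..C2 BF.
 * That pair appearing together is the usual trace of double encoding.
 */
function looks_double_encoded(string $bytes): bool
{
    return preg_match('/\xC3[\x82-\xB4]\xC2[\x80-\xBF]/', $bytes) === 1;
}

$single = "caf\xC3\xA9";          // "café" encoded once
$double = "caf\xC3\x83\xC2\xA9";  // "café" encoded twice

var_dump(is_valid_utf8($single), looks_double_encoded($single));
var_dump(is_valid_utf8($double), looks_double_encoded($double));
```

The heuristic can give false positives on text that legitimately contains sequences such as "Ã" followed by "©", so it is a hint to look at the bytes rather than proof.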

    I now find that PHP "does the right thing" with include(). Like a server-side include, it assumes that the included content has the character set specified by

    This completely makes sense now.

    On that basis, it makes sense in my application either to AddCharset a special extension for UTF-8 text or to echo(file_get_contents($filename)), which is what I've wound up doing.

    I expect that this is more efficient than using include() anyhow.
    if (is_file('frag/movie/review/'.$id_path.'.txt'))
    	# Naive include would only work if the .txt was ISO-8859-1
    	# or if .txt was in an AddCharset for UTF-8 in Apache's
    	# directives. The include is high level, going via Apache
    	# and the Content-Type reported by Apache is respected and
    	# it is converted from that Content-Type.
    	# Bad publishes from MySQL "doubly UTF-8'ed" the data and the following
    	# bodge designed to convert UTF-8 to ISO-8859-1 was needed to
    	# get UTF-8.
    	# This is the right way to put raw UTF-8 data into the output
    	# buffer. Our .txt file goes through no conversion.
    	echo("<p>No movie review available</p>");
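The statements between the comments in the snippet above did not survive. The final approach described in the post, echo(file_get_contents($filename)) with a fallback message, might look like the following minimal sketch. The function name emit_review() and the temporary file are hypothetical; the thread builds its path from $id_path instead.

```php
<?php
/**
 * Sketch: copy a UTF-8 text fragment into the output unchanged,
 * or print a fallback message if the fragment does not exist.
 * $path is a hypothetical parameter standing in for the thread's
 * 'frag/movie/review/'.$id_path.'.txt'.
 */
function emit_review(string $path): void
{
    if (is_file($path)) {
        // file_get_contents() returns the raw bytes; nothing is parsed or converted.
        echo file_get_contents($path);
    } else {
        echo "<p>No movie review available</p>";
    }
}

// Usage with a throwaway fragment containing non-ASCII UTF-8 bytes.
$tmp = tempnam(sys_get_temp_dir(), 'review');
file_put_contents($tmp, "<p>Am\xC3\xA9lie: tr\xC3\xA8s bien</p>");

ob_start();                 // capture the output so it can be compared byte for byte
emit_review($tmp);
$out = ob_get_clean();
unlink($tmp);

var_dump($out === "<p>Am\xC3\xA9lie: tr\xC3\xA8s bien</p>"); // bool(true)
```

One practical difference from include(): a file pulled in with include() is run through the PHP parser, so a stray <?php in a fragment would be executed, whereas file_get_contents() treats the fragment purely as data.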



    Expert Comment

    It is far more efficient to store these reviews in a database and edit them via an admin panel than to use an include. Your current method is also extremely bad practice from a security standpoint.
    LVL 17

    Author Comment

    Thanks, but movie reviews do not really have security concerns. A database isn't appropriate for that environment locally, though it is used to manage the reviews offline. There are half a million of these reviews published offline as text fragments, which get pushed into production when modified. It is a quirky set-up, I know.
