QueryPath UTF-8 Encoding

Posted on 2012-08-22
Last Modified: 2012-08-29
I have a page that I specifically converted to UTF-8 to eliminate unwanted characters. I have verified the encoding, and the page renders fine locally in all browsers. When I parse the page with QueryPath (htmlqp), I am left with a phantom character:

U+00E2      â      c3 a2      LATIN SMALL LETTER A WITH CIRCUMFLEX

in place of

U+0027      '      27      APOSTROPHE

I've tried adding the options convert_from_encoding => utf-8 and strip_low_ascii but I'm still left with this character. Any ideas how to fix this?
Question by: kjenney

    Expert Comment

    I don't know this library, but from looking around, you could try the option convert_to_encoding => utf-8, perhaps combined with convert_from_encoding.

    Accepted Solution

    I ended up just substituting the unwanted character. None of the options worked to remove it.
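
The thread does not show the substitution itself. A minimal sketch of the idea follows, written in Python purely for illustration (the original page was processed in PHP). The helper name `replace_phantom` and the sample string are hypothetical.

```python
# Hypothetical helper: the thread does not show the actual substitution code.
def replace_phantom(text: str, phantom: str = "\u00e2", replacement: str = "'") -> str:
    """Replace every occurrence of the phantom character with an apostrophe."""
    return text.replace(phantom, replacement)


# Made-up sample containing U+00E2 where an apostrophe was expected.
sample = "It\u00e2s a test"
fixed = replace_phantom(sample)
print(fixed)
```

Note that a blanket replacement like this also rewrites any legitimate U+00E2 in the text, so it is only safe when the page is known not to contain that letter.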

    Expert Comment

    Although removing the unwanted character works, you may still get others when the page is modified.

    This is clearly an encoding issue. Maybe there is a problem in the library you use (though I hope it has been tested), or maybe it is being misused. Either way, this should be fixable with encoding conversion.

    I hope you won't have to modify this page soon if you stay with your fix.
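
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the page actually contained a curly apostrophe (U+2019) instead of U+0027, its UTF-8 bytes are e2 80 99, and reading those bytes as Latin-1 yields U+00E2 followed by two invisible control characters. The sketch below, in Python for illustration only, reproduces that effect and shows that it can be reversed when no bytes have been lost. All variable names and the sample string are hypothetical.

```python
# Assumption: the source text used a curly apostrophe (U+2019), not U+0027.
original = "It\u2019s"

# UTF-8 encodes U+2019 as the three bytes e2 80 99.
utf8_bytes = original.encode("utf-8")

# Misreading those bytes as Latin-1 turns each byte into its own character.
misread = utf8_bytes.decode("latin-1")

# U+0080 and U+0099 are C1 control characters and usually render as nothing,
# so only the leading U+00E2 remains visible.
visible = "".join(ch for ch in misread if not 0x80 <= ord(ch) <= 0x9F)

# The misreading is reversible as long as no bytes were stripped.
repaired = misread.encode("latin-1").decode("utf-8")

print(repr(misread), repr(visible), repr(repaired))
```

If this is what happened, the repair has to run before anything strips the control characters; once they are gone, only substitution remains.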

    Author Closing Comment

    No solution given to filter out the character. Substitution worked for me.
