Foreign language conversion

Posted on 2006-04-27
Last Modified: 2008-03-17
My database contains some letters that don't display reliably on websites, so I need to convert them to the following HTML entities:
ø // &oslash;
å // &aring;
æ // &aelig;
and their uppercase versions (Ø, Å, Æ).

I have a rough idea, but I don't know exactly how to do it.
My thought was a function that takes a string as its parameter, searches it for one or more occurrences of these letters, replaces them, and returns the fixed string.

Appreciate every comment,

Question by: Gaute Rønningen

    Expert Comment

    ... there are other options, such as serving the web pages with the correct character encoding, or directly in UTF-8.

    Making this decision depends heavily on your problem.
    - What is YOUR language? (From your profile, I would assume Norwegian.)
    - Which languages are expected to be used on the site, i.e. which languages do visitors expect or appreciate to find?

    Now some technical questions:
    - Which version of MySQL are you using?
    - Is the "mbstring" (multibyte string) library available? (Check with phpinfo().)

    If Norwegian or other non-English languages are used significantly, I would recommend that you switch to UTF-8 as soon as possible. It might be a pain, but it will be a lot more painful if you wait for the site to grow.
    I have built a site that runs on MySQL 4.0 and PHP and handles this character problem in UTF-8, even though I'm not using mbstring.
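As a rough illustration of the UTF-8 route, the sketch below assumes the text coming out of the database is already UTF-8. The helper name render_utf8_page() is made up for this example; header() and htmlspecialchars() are standard PHP functions.

```php
<?php
// Hypothetical helper: wraps text that is already UTF-8 in a minimal HTML page.
function render_utf8_page($text)
{
    // htmlspecialchars() escapes only &, <, > and quotes,
    // so ø, å and æ are sent as-is and need no conversion.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return '<html><head><meta http-equiv="Content-Type" '
         . 'content="text/html; charset=utf-8"></head>'
         . '<body><p>' . $safe . '</p></body></html>';
}

// Declare the encoding in the HTTP header as well, before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// "\xC3\xB8" and "\xC3\xA5" are the UTF-8 bytes for ø and å.
$html = render_utf8_page("S\xC3\xB8me str\xC3\xA5nge <text>.");
echo $html;
```

With the encoding declared both in the header and in the page, the browser decodes the letters itself and no entity conversion is needed.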

    I would suggest you take this route rather than using the str_replace() function, which is the one you seem to be looking for.
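For comparison, the str_replace() approach the question describes could look roughly like this. The function name convert_letters() and the exact entity list are assumptions made for the example, and the source file must be saved in the same encoding as the database text for the literal letters to match.

```php
<?php
// Hypothetical helper implementing the idea from the question:
// take a string, replace the special letters, return the fixed string.
function convert_letters($text)
{
    $letters  = array('ø', 'å', 'æ', 'Ø', 'Å', 'Æ');
    $entities = array('&oslash;', '&aring;', '&aelig;',
                      '&Oslash;', '&Aring;', '&AElig;');

    // str_replace() accepts parallel arrays of search and replacement strings.
    return str_replace($letters, $entities, $text);
}

echo convert_letters('Søme strånge text.');
```

This only covers the six letters listed; any other non-ASCII character passes through unchanged.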

    Accepted Solution

    This will work perfectly for you:

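    $text = 'Søme strånge text.';

    // Turn one single-byte character into a numeric character reference, e.g. &#248;
    function htmlchar($char)
    {
          return '&#' . ord($char) . ';';
    }
    // Note: the /e modifier was removed in PHP 7.0, so this line needs PHP 5.
    $text = preg_replace('/[^\x09\x0A\x0D\x20-\x7F]/e', 'htmlchar("$0")', $text);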

    To explain what this does:
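    First of all, you can reference 'ø' not only as '&oslash;' but also as '&#248;', because 248 is the code of 'ø' in ISO-8859-1 (Latin-1), which is also its Unicode code point. (Strictly speaking this is not an ASCII code, since ASCII stops at 127.) The built-in function ord() returns the value of the first byte of a string, so this only works when the text is stored in a single-byte encoding such as ISO-8859-1; in UTF-8, 'ø' is two bytes and each byte would be converted separately. The function we have declared, htmlchar(), changes a character into its numeric HTML character reference. However, this function does not care whether you give it unusual characters like 'ø' or basic characters like A through Z. The next line does, though, when we call preg_replace(). The regex matches any single character other than tab, line feed, carriage return, and the bytes \x20 through \x7F (A through Z, 0 through 9, !, @, #, etc.), and the /e modifier calls our function to replace it with &#(num); This will output exactly what you want!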
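The /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0. On current PHP versions the same idea can be sketched with preg_replace_callback(). Like the original, this assumes the text is in a single-byte encoding such as ISO-8859-1; the wrapper name to_numeric_entities() is invented for the example.

```php
<?php
// Turn one single-byte character into a numeric character reference.
function htmlchar($char)
{
    return '&#' . ord($char) . ';';
}

// Hypothetical wrapper: replaces every byte outside tab, LF, CR and
// \x20-\x7F, assuming ISO-8859-1 (one byte per character) input.
function to_numeric_entities($text)
{
    return preg_replace_callback(
        '/[^\x09\x0A\x0D\x20-\x7F]/',
        function ($match) {
            return htmlchar($match[0]);
        },
        $text
    );
}

// "\xF8" and "\xE5" are the ISO-8859-1 bytes for ø and å.
echo to_numeric_entities("S\xF8me str\xE5nge text.");
// prints: S&#248;me str&#229;nge text.
```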

    Expert Comment

    So, in the example I just gave, it would change 'Søme strånge text.' into 'S&#248;me str&#229;nge text.', which is a lot more browser-friendly.

    Expert Comment

    In fact, rather than creating an additional function, you could translate the whole thing in one fell swoop like this:

    $text = preg_replace('/[^\x09\x0A\x0D\x20-\x7F]/e', '"&#" . ord("$0") . ";"', $text);
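If the database text is UTF-8 rather than a single-byte encoding, ord() on its own would encode each byte of a multi-byte letter separately. A possible variant, offered only as a sketch: the /u modifier makes the regex match whole UTF-8 characters, and the helper utf8_codepoint() (a name made up here, as is utf8_to_numeric_entities()) decodes the bytes by hand so that mbstring is not required.

```php
<?php
// Hypothetical helper: decode one UTF-8 encoded character to its code point.
function utf8_codepoint($char)
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];               // plain single-byte character
    }
    // Drop the length-marker bits from the leading byte...
    $code = $bytes[0] & (0xFF >> ($count + 1));
    // ...then append the low 6 bits of each continuation byte.
    for ($i = 1; $i < $count; $i++) {
        $code = ($code << 6) | ($bytes[$i] & 0x3F);
    }
    return $code;
}

// Hypothetical wrapper: same character class as the original, but with /u
// so that each match is one complete UTF-8 character.
function utf8_to_numeric_entities($text)
{
    return preg_replace_callback(
        '/[^\x09\x0A\x0D\x20-\x7F]/u',
        function ($match) {
            return '&#' . utf8_codepoint($match[0]) . ';';
        },
        $text
    );
}

// "\xC3\xB8" and "\xC3\xA5" are the UTF-8 bytes for ø and å.
echo utf8_to_numeric_entities("S\xC3\xB8me str\xC3\xA5nge text.");
// prints: S&#248;me str&#229;nge text.
```

Note that with /u, preg_replace_callback() returns NULL when the input is not valid UTF-8, so the result is worth checking before it is printed.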
