Solved

PHP mail - Special Characters

Posted on 2006-07-12
Last Modified: 2013-12-03
Hi!

I'm wondering how to handle special characters like for example ä,ö,ü, etc. in plain-text e-mails...

I have a contact form with a text-area - the text submitted from this is included in a mail sent to the website operators. This text can contain special characters. I simply use a content-type header with text/plain and UTF-8 encoding.
This seems to work fine for me (Mozilla Thunderbird) but it may be that it's not displayed correctly in other clients (Pegasus for example) - I have to confirm this though.

When the enquiry mail is sent, another e-mail is sent to the sender basically like a personalised auto-responder.
The text for this mail is specified in the PHP script itself and contains some 'Ü's. I also use text/plain with UTF-8 for this mail and run the text through the utf8_encode() function. Using this method the special characters seem to display correctly in both Thunderbird and Outlook Express, but my client told me that it doesn't display properly in her mail client (Pegasus).

How can I ensure that the mail is displayed properly in all mail clients?
Again, both mails are plain-text, no HTML...
Question by:Julian Matz
11 Comments
 
LVL 15

Expert Comment

by:bpmurray
ID: 17102225
How I hate that catch-all "doesn't display properly"! Do you have a description of how the mail displays at your client? If the accented characters are shown as little squares, it's more than likely that the issue is the use of an incorrect font, i.e. one which only supports ASCII. It seems odd that this should be the case - I can't imagine any Windows font not having support for the full CP 1252 characters.

Most likely, either the header of the mail claims it's something else, e.g. US-ASCII (the default for Pegasus) or maybe 1252: are you sure you've set it to UTF-8? Otherwise, the problem is at the receiver's mail settings. If you go to Advanced Settings, you can specify UTF-8 as the default character set, but I think that was first enabled in V4. Do you know which version she has?
 
LVL 15

Assisted Solution

by:bpmurray
bpmurray earned 2000 total points
ID: 17102243
I just checked - from version 4.3 there's support for UTF-8, but I think it's only for the message body, not the headers.
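
A minimal sketch of the body-versus-header distinction, written in Python rather than PHP only so that it runs standalone. It assumes Python's standard `email` package; the function name `build_plain_utf8_mail` and the example.com addresses are made up for illustration. The body is declared as UTF-8 through the Content-Type header, while a non-ASCII Subject has to be wrapped as an RFC 2047 encoded word so that the header block itself stays pure ASCII. In PHP the same idea would mean passing a `Content-Type: text/plain; charset=UTF-8` header to mail() and encoding non-ASCII header values, for example with `mb_encode_mimeheader()`.

```python
from email.message import EmailMessage


def build_plain_utf8_mail(sender, recipient, subject, body):
    """Build a plain-text message whose body is declared as UTF-8.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Non-ASCII header text is turned into an RFC 2047 encoded word
    # when the message is serialised.
    msg["Subject"] = subject
    # Sets "Content-Type: text/plain; charset=utf-8" and picks a
    # transfer encoding for the body.
    msg.set_content(body, subtype="plain", charset="utf-8")
    return msg


msg = build_plain_utf8_mail(
    "webmaster@example.com",   # assumed address
    "client@example.com",      # assumed address
    "Übersicht",
    "Grüße aus Köln\n",
)

wire = msg.as_bytes()                    # what would go out on the wire
header_block = wire.split(b"\n\n", 1)[0]
print(header_block.decode("ascii"))      # headers contain ASCII only
```

A client that understands UTF-8 only in the body should still cope with this body; whether it decodes the encoded Subject depends on its RFC 2047 support.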
 
LVL 44

Expert Comment

by:scrathcyboy
ID: 17103290
You are right, it definitely will not work in Pegasus, or in many big corporate email clients that strip all characters to basic ASCII.  I get emails all the time where the apostrophe is a ? -- like jan?s pet?s name isn?t easy, it?s hard.  This gets extremely tiring.  You even see this on many websites.  Sure, you can ask people to change their encoding set, but they won't be bothered doing it.

So consider stripping them all out at the beginning using the PHP strip functions --

www.php.net/function.mail -- ** note, this mail function is designed just for this job **

also --  http://www.experts-exchange.com/Web/Web_Languages/PHP/Q_21810069.html

You can do it in client side javascript --

http://www.experts-exchange.com/Web/Web_Languages/JavaScript/Q_20160578.html

Also, here is a guide to doing it right at the keycode level, as a person types --

http://www.experts-exchange.com/Web/Web_Languages/JavaScript/Q_20999435.html
http://www.experts-exchange.com/Web/Web_Languages/JavaScript/Q_21729341.html
 
LVL 15

Expert Comment

by:bpmurray
ID: 17103719
Very naughty: if you strip out accented characters, you remove part of the language, e.g. in Danish you have Båd and Bad, Boat and Bath; or øst and ost, east and cheese. If you strip accents, you change the meaning. Anyway, what do you do with Japanese or one of the Indic scripts? You can't strip the accent off ideographs.

Apostrophes are actually inside the ASCII range, so you're using another character instead. Many sites attempt to translate the characters to the current codepage, i.e. 1252 for Windows in the US, and if that other apostrophe isn't in the character set of the codepage, it will be translated to a fallback, here a "?". I have very extensive experience in this area, and the reality is that most corporates do not strip characters to 7-bit. 10 years ago there were many gateways that could only handle 7-bit, but not any more. The clients and servers have been able to handle accented characters for many years.
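
The fallback described above can be reproduced with a short sketch, given here in Python purely for illustration. The helper name `transcode_with_fallback` is hypothetical, and the typographic apostrophe U+2019 is an assumption about which "other apostrophe" is involved. Characters missing from the target code page are replaced with "?", which is what the `errors="replace"` handler does on encoding.

```python
def transcode_with_fallback(text, codepage):
    """Force text into a legacy code page, replacing whatever does not fit.

    Hypothetical helper: mimics a gateway or site that converts text
    to its local code page and substitutes "?" for missing characters.
    """
    return text.encode(codepage, errors="replace").decode(codepage)


# U+2019 (right single quotation mark), not the ASCII apostrophe U+0027.
curly = "Jan\u2019s pet\u2019s name isn\u2019t easy"

print(transcode_with_fallback(curly, "ascii"))       # Jan?s pet?s name isn?t easy
print(transcode_with_fallback(curly, "iso-8859-1"))  # Jan?s pet?s name isn?t easy
print(transcode_with_fallback(curly, "cp1252") == curly)  # True: 1252 has U+2019
```

The plain ASCII apostrophe survives every one of these conversions, which is why only the typographic variant turns into a question mark.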

Simply put, it is WRONG to strip accents. Fix the problem instead - get the client to upgrade to 4.3. After all, it's not like Pegasus is expensive!
 
LVL 21

Author Comment

by:Julian Matz
ID: 17103806
I've spoken to my client, and she agrees that it's probably her mail client (v4.01).

She said that characters were being replaced by these:
Ã, ÿ, ¼

For example:
Ü = ü
ß = Ãÿ

Where "ÿ" is actually a capital umlaut Y.

This doesn't really make sense to me but I think these characters might be from before I changed to UTF-8 and encoding the text string...
 
LVL 21

Author Comment

by:Julian Matz
ID: 17103816
("encoded" not "encoding")
 
LVL 15

Accepted Solution

by:bpmurray
bpmurray earned 2000 total points
ID: 17103842
OK - I know the problem: as you probably know, characters in this range take up 2 bytes in UTF-8, and a client that reads the mail as a single-byte character set typically displays each of them as 2 consecutive odd characters. The fix is for your client to enable UTF-8 on her side.
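
A small sketch of that effect, in Python for illustration only. It assumes the receiving client falls back to Windows code page 1252; the helper name `misread_utf8_as` is hypothetical. Each accented character is encoded to its two UTF-8 bytes, and those bytes are then decoded one by one as if they were single-byte characters.

```python
def misread_utf8_as(text, wrong_charset="cp1252"):
    """Show what a client displays when it reads UTF-8 bytes as a
    single-byte character set (hypothetical helper)."""
    return text.encode("utf-8").decode(wrong_charset)


print(len("ß".encode("utf-8")))   # 2  (bytes 0xC3 0x9F)
print(misread_utf8_as("ü"))       # Ã¼
print(misread_utf8_as("ß"))       # ÃŸ

# The damage is reversible as long as no bytes were lost on the way:
print("ÃŸ".encode("cp1252").decode("utf-8"))   # ß
```

The Ã, Ÿ and ¼ reported by the client are exactly the characters this misreading produces for ü and ß.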
 
LVL 44

Expert Comment

by:scrathcyboy
ID: 17103893
bpmurray, I wasn't trying to demote the accented languages, I thought julianmartz WANTED this stuff out of the input fields.  If not, then the correct answer is to CHANGE the code page each person is using, so they are all "Talking on the same page".
 
LVL 15

Assisted Solution

by:bpmurray
bpmurray earned 2000 total points
ID: 17104019
Well, UTF-8 is probably as close to the perfect choice as you can get: it covers all Unicode characters. However, if the data are exclusively in Western European languages, Codepage 1252 is a good choice since that's what Windows uses in those locales. Pegasus V4.01 will probably support that, unless it's on Unix/Linux. In that case use ISO 8859-1 (ISO Latin-1).
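
One way to act on that advice is to pick the narrowest character set that can represent a given message and fall back to UTF-8 otherwise. The following is only a sketch of that idea in Python, not something proposed in the thread: the function name `pick_charset` and the candidate order are assumptions.

```python
def pick_charset(text, candidates=("us-ascii", "iso-8859-1",
                                   "windows-1252", "utf-8")):
    """Return the first candidate charset that can encode the whole text.

    Hypothetical helper; the candidate order is an assumption
    (narrowest and most widely supported first).
    """
    for charset in candidates:
        try:
            text.encode(charset)
            return charset
        except UnicodeEncodeError:
            continue
    return "utf-8"  # safe default: UTF-8 covers all of Unicode


print(pick_charset("Gruesse"))   # us-ascii
print(pick_charset("Grüße"))     # iso-8859-1
print(pick_charset("50 €"))      # windows-1252 (the euro sign is not in ISO 8859-1)
print(pick_charset("日本語"))     # utf-8
```

The chosen name would then go into the `charset` parameter of the Content-Type header, and the body must actually be encoded in that same character set.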
 
LVL 21

Author Comment

by:Julian Matz
ID: 17129729
Thank you! I'm glad the problem wasn't on my end :)
I didn't think it was but had to be sure...
 
LVL 15

Expert Comment

by:bpmurray
ID: 17129778
Glad to help. Thx for the pts.