Bernard Thouin asked:

Outlook 2013 wrongly renders HTML special characters such as the Brazilian "ú"

Hi

I have an application that generates an HTML file and then sends it as the body of an email via a command-line email utility. Everything works fine, EXCEPT that Portuguese characters like "ú" are not rendered correctly by Outlook (2013) when opening the mail, although the HTML declares the content as "text/html; charset=UTF-8". The character set is also passed as a parameter to the command-line utility, but to no avail: Outlook still displays that "ú" as "= FA"! Yet even Internet Explorer shows these characters properly when I open the body file directly in the browser!

I'm at a loss for a solution. How can I get Outlook to understand such special characters, short of encoding them with % and hex values?

Thanks for your help
Bernard
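
The "= FA" in the symptom looks like quoted-printable text: in that mail transfer encoding, a byte is written as "=" followed by its hex value, and 0xFA is "ú" in ISO-8859-1/Windows-1252, not in UTF-8. The minimal Python sketch below only illustrates that observation. It is an assumption that this is what happens inside the mail utility; the thread does not confirm it, and the helper name `quoted_printable` is made up for the example.

```python
import quopri


def quoted_printable(text: str, charset: str) -> str:
    """Encode text in the given charset, then as quoted-printable."""
    raw = text.encode(charset)
    return quopri.encodestring(raw).decode("ascii")


# The same character produces different bytes depending on the charset,
# and therefore different quoted-printable sequences.
latin1_form = quoted_printable("ú", "iso-8859-1")
utf8_form = quoted_printable("ú", "utf-8")

print("ISO-8859-1:", latin1_form)
print("UTF-8:     ", utf8_form)
```

If the raw body really contains the single byte 0xFA, the file is probably still stored in a Windows code page even though its meta tag says UTF-8.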

ste5an

The questions are:

1) What exactly does your file look like?
2) You said the file is properly encoded, so is it saved as UTF-8 with BOM?
3) Does your command line utility work correctly with such files?

IMHO it's a problem with your file and how it's handled by your utility.

btw. My default template:
<html>
<head>
	<meta charset="utf-8">
</head>
<body>
</body>
</html>

works with most readers. Where does your "text/html; charset=UTF-8" come into play?

Bernard Thouin (ASKER)

Hi

Thanks for your prompt answer!

>>What exactly does your file look like?<<
My file was originally created by users using Word :(, but the head tag looks like this (I changed the charset from "windows-something" to UTF-8):

<head>
<meta http-equiv=Content-Type content="text/html; charset=UTF-8">

>>2) You said the file is properly encoded, so is it saved as UTF-8 with BOM?<<
What is a BOM? I could not find anything about the encoding of the file; it's just a text file full of HTML. I got it by opening in Outlook an example message from my users and having Outlook save the message as HTML. I use Notepad to edit the file.

>>3) Does your command line utility work correctly with such files?<<
The utility is called febootimail, which is used a lot at my client. It supports HTML files as the body and has a -charset switch, which I set to -charset UTF8, but that didn't change anything in the result. The file looks perfect in IE before being sent as the body of the mail, but looks wrong in the mail message... :(

Okay, grab yourself a good text editor, definitely not Word or WordPad: Notepad++, VS Code, Sublime, to name some (the first two are free).

They will show you which encoding is used, e.g. Notepad++ displays it in the status bar.

It must be at least UTF-8. But many applications are happier when you use a BOM (Byte Order Mark, a magic byte sequence at the start of the file that indicates a Unicode text encoding).
And I would recommend recreating your HTML file to get clean HTML.

When your file is OK (check it by opening it in different browsers), play with your command line utility. Try UTF-8 as well as Unicode.
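
If no such editor is at hand, the encoding check can also be scripted. The following is a rough Python sketch, not a full encoding detector: it only looks for a BOM and tests whether the bytes are valid UTF-8. The function name `describe_encoding` and the file name `body.html` in the usage comment are hypothetical.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Give a coarse description of how a byte string is encoded."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    # What Windows tools call "Unicode" is UTF-16.
    # UTF-32 is not distinguished from UTF-16 in this sketch.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 with BOM"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8 (possibly windows-1252)"
    if data.isascii():
        return "ascii (valid utf-8)"
    return "utf-8 without BOM"


# Usage, assuming the mail body is stored in a file called body.html:
#     with open("body.html", "rb") as f:
#         print(describe_encoding(f.read()))
```

Reading the file in binary mode matters here, because opening it in text mode would already apply a decoding and hide the problem.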

Hi again

I found out about the BOM in the meantime and saved the HTML file with UTF-8 encoding; even Notepad shows the BOM at the bottom :).
But it did NOT change anything, the same wrong display problem remains in Outlook :(.

But the weird thing is that I have now noticed that the famous "ú" character appears twice in the HTML file, and the 1st one is shown correctly in the mail, but NOT the 2nd one! I did not notice the first occurrence because it was correct... What can change the display of the same character within the same file?

I cannot regenerate the HTML, as I don't have the original Word document that was used to create the "template" message that the users rely on for manually writing their mails. I'm supposed to automate these mails, which is why I can only use the HTML file extracted from the "template" Outlook message. I update that file manually to satisfy the requirements (the file needs placeholders that I then replace programmatically with data). And the HTML is horribly bloated by Word: about 80% of it seems to be totally useless tags. However, I don't dare change anything because it's full of complex formatting information.

And as mentioned before, the HTML file, which is sent to the Exchange server as the body of the mail, is perfect when opened in IE on the machine that generates the mail, but it displays wrongly in Outlook :(
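
One possible explanation for the same character rendering differently within one file is that its two occurrences are stored as different bytes, for example one as UTF-8 and one in a Windows code page. That is only a guess, since the thread never shows the raw bytes. A small Python sketch for checking it follows; the generator name `find_non_ascii_runs` and the sample bytes are invented for illustration.

```python
def find_non_ascii_runs(data: bytes):
    """Yield (offset, raw_bytes, is_valid_utf8) for each run of non-ASCII bytes."""
    i = 0
    n = len(data)
    while i < n:
        if data[i] < 0x80:
            i += 1
            continue
        start = i
        while i < n and data[i] >= 0x80:
            i += 1
        run = data[start:i]
        try:
            run.decode("utf-8")
            valid = True
        except UnicodeDecodeError:
            valid = False
        yield start, run, valid


# Invented sample: the first "ú" is UTF-8 (C3 BA), the second is a lone FA byte.
sample = b"N\xc3\xbamero and N\xfamero"
results = list(find_non_ascii_runs(sample))
for offset, run, valid in results:
    print(offset, run.hex(), "valid UTF-8" if valid else "NOT valid UTF-8")
```

Running this over the real body file would show whether both occurrences of "ú" use the same byte sequence.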

The solution is to replace all such letters with their HTML entity names, e.g. the e with an acute accent (é), which exists in Portuguese and in French, has to be replaced with "&eacute;" in the HTML, and it is then shown properly everywhere :)

These HTML entity names are easy to find on the internet.

Regards
Bernard
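
The entity replacement described above can also be automated instead of looking up each name by hand. Here is a minimal Python sketch using the standard library's `html.entities.codepoint2name` table; the function name `to_html_entities` is made up for this example. ASCII characters are left untouched so that the existing markup stays intact, and characters without a named entity fall back to a numeric reference.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Keep ASCII as-is so tags and attributes are not escaped.
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No named entity known: use a numeric character reference.
            out.append(f"&#{cp};")
    return "".join(out)


print(to_html_entities("<p>Número é</p>"))
```

Because the result is pure ASCII, it no longer depends on which charset the mail utility or the mail client assumes.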