Strange characters.

pucko asked
Last Modified: 2008-02-26
Hello!

I have a little problem. I have a form-mail script, and some of the mail I get contains a lot of strange characters instead of the Swedish å, ä, ö. Can someone tell me how to replace those characters with others, for example ä with ae, ö with oe, and å with aa? Or is it possible to always get Swedish characters when someone enters something in the form?

# Decode each URL-encoded "name=value" pair into %FORM.
foreach $pair (@pairs)
{
   ($name, $value) = split(/=/, $pair);

   # "+" encodes a space; "%XX" encodes the byte with hex value XX.
   $value =~ tr/+/ /;
   $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
   $name =~ tr/+/ /;
   $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

   $FORM{$name} = $value;
}

Author

Commented:
Edited text of question

Commented:
Swedish (and all other national) characters can have
different encodings. On the WWW, the encoding normally used is
ISO-8859-1 (also known as ISO Latin-1).

Just check what you get when you enter e.g. %C4. If you
get Ä (&Auml;), then the script decodes correctly and the
problem is in the characters that are entered.
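
One way to run that check, as a minimal sketch. The helper names decode_percent and hex_bytes are made up for illustration, and the two sample inputs assume the browser sends Ä either as ISO-8859-1 (%C4) or as UTF-8 (%C3%84). Printing the decoded bytes in hex shows which encoding actually arrived.

```perl
use strict;
use warnings;

# Hypothetical helper: the same decoding steps as the form-mail loop.
sub decode_percent {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $s;
}

# Hypothetical helper: show each decoded byte as two hex digits.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $s;
}

print hex_bytes(decode_percent('%C4')), "\n";      # one byte: ISO-8859-1
print hex_bytes(decode_percent('%C3%84')), "\n";   # two bytes: UTF-8
```

If a single letter typed into the form comes out as two bytes, the data is most likely UTF-8 and a one-byte pattern will not match it.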

Replacing some characters is very easy in Perl.
Just use a substitution:
  s/ä/ae/g;

Author

Commented:
Can I change the encoding in any way?
How do I check %C4?

Author

Commented:
I've found out by myself how to do it, but how can I remove this question? It seems that all I can do is add a comment and edit the question.

Commented:
To remove a question, you need to accept an answer and award the points.

Mike
Commented:
Using the substitution function (s///) is a very easy way to do this as suggested by keegi. Another way to try this is to use the translation operator (tr/// or y///) which is used the same way as in sed.

eg. s/PATTERN/REPLACEMENT/switches
   tr/SEARCHLIST/REPLACEMENTLIST/switches
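
A small sketch of the difference between the two, assuming the input is ISO-8859-1 (the \xE5, \xE4, \xF6 escapes stand for å, ä, ö so the example does not depend on how the script file itself is saved). tr/// maps one character to exactly one character, while s/// can insert a longer replacement.

```perl
use strict;
use warnings;

my $text = "sm\xF6rg\xE5s";          # "smörgås" in ISO-8859-1

# tr///: one character in, one character out.
(my $plain = $text) =~ tr/\xE5\xE4\xF6/aao/;

# s///: the replacement may be longer than the match.
(my $long = $text) =~ s/\xF6/oe/g;
$long =~ s/\xE5/aa/g;
$long =~ s/\xE4/ae/g;

print "$plain\n";   # smorgas
print "$long\n";    # smoergaas
```

So tr/// is enough for ä to a, but ä to ae needs s///.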

Author

Commented:
tr doesn't work because å, ä, ö seem to be more than one char. But I have already solved the problem by myself. Thanks anyway!
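
If å, ä, ö each arrive as more than one byte, the form data is probably UTF-8 rather than ISO-8859-1. That is an assumption, since the thread does not show the raw bytes. Under that assumption, a sketch using the core Encode module: decode the bytes into characters first, then do the replacement. The name to_ascii_swedish is made up for illustration, and only the three lowercase letters are handled.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Replacement table keyed by code point, so the script's own file
# encoding does not matter.
my %tr = (
    "\x{E5}" => 'aa',   # å
    "\x{E4}" => 'ae',   # ä
    "\x{F6}" => 'oe',   # ö
);

# Hypothetical helper: takes UTF-8 bytes and returns text with the
# three Swedish letters replaced by ASCII digraphs.
sub to_ascii_swedish {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);   # bytes -> characters
    $chars =~ s/([\x{E5}\x{E4}\x{F6}])/$tr{$1}/g;
    return $chars;
}

# "Malmö" as UTF-8 bytes: ö is the two bytes C3 B6.
print to_ascii_swedish("Malm\xC3\xB6"), "\n";   # Malmoe
```

After decoding, each letter is a single character again, so tr/// would also work for one-to-one mappings.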
ozo

Commented:
%tr = ('å'=>'aa','ä'=>'ae','ö'=>'oe');
s/([åäö])/$tr{$1}/g;


I have the same problem, because in Portuguese we use characters with umlaut, tilde, grave, acute, cedilla and circumflex.
Maybe translating all these characters to an HTML &whatever; or &#decimal; would work. I could not find a way to do this translation, either. Some characters have more than one HTML notation, like these:
ä = &auml; = &#228;
å = &aring; = &#229;
From there on I am not sure if making an array with these character notations for translation would work...
ozo

Commented:
%tr = (
'&auml;' => 'ä',
'&#228;' => 'ä',
'&aring;' => 'å',
'&#229;' => 'å',
);
# leave entities that are not in the table untouched
s/(&[#\w]+;)/exists($tr{$1}) ? $tr{$1} : $1/ge;

s/&#(\d+);/chr $1/eg; #for just the &# numbers
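
For the opposite direction asked about above (characters to &#decimal; notation), a minimal sketch. It assumes the input is ISO-8859-1, one byte per letter; the helper names to_entities and from_entities are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, assuming ISO-8859-1 input.
sub to_entities {
    my ($s) = @_;
    # Replace each printable non-ASCII Latin-1 character with &#NNN;
    $s =~ s/([\xA0-\xFF])/'&#' . ord($1) . ';'/ge;
    return $s;
}

# Hypothetical helper for the reverse direction; references outside
# the Latin-1 range are left as they are.
sub from_entities {
    my ($s) = @_;
    $s =~ s/&#(\d+);/$1 <= 255 ? chr($1) : "&#$1;"/ge;
    return $s;
}

my $encoded = to_entities("S\xE3o Paulo");   # ã is byte E3 (decimal 227)
print "$encoded\n";                          # S&#227;o Paulo
print from_entities($encoded) eq "S\xE3o Paulo" ? "round trip ok\n" : "mismatch\n";
```

Numeric references avoid the need for a table of entity names, since the number is just the character's code.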