lakshmisubram
ord function in perl
The ord function returns the ASCII value of a character.
$str = "o";
print ord $str; # prints 111
What will ord return for a non-ASCII character?
$str = "ó";
print ord $str; # prints 198
The decimal code I expect for this non-ASCII character is 243. Why does ord return 198? How can I get the correct decimal code for non-ASCII characters in Perl?
Thanks for the help.
Lakshmi
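A minimal sketch of what ord reports, assuming plain byte strings (no "use utf8", no locale). ord returns the numeric code of the first character it is given, so the result depends on which byte is actually stored in the string, not on how that byte is displayed. The helper name describe_byte is made up for this illustration, and escapes are used instead of literal accented characters so the result does not depend on how the script itself is saved.

```perl
use strict;
use warnings;

# Hypothetical helper: report the code of the first byte of a string
# in decimal, hex and octal.
sub describe_byte {
    my ($byte) = @_;
    my $code = ord $byte;
    return sprintf "dec=%d hex=%02X oct=%03o", $code, $code, $code;
}

print describe_byte("o"),    "\n";   # dec=111 hex=6F oct=157
print describe_byte("\xF3"), "\n";   # dec=243 hex=F3 oct=363
print describe_byte("\xC6"), "\n";   # dec=198 hex=C6 oct=306
```

If ord returns 198 for a character that looks like ó, the string holds the byte 0xC6, which suggests the text was saved in an encoding other than the one where ó is 0xF3.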
ASKER
My query is 'impresión'. Before sending it to my search engine, I have to encode the query string manually. When the query comes through a form submission, it is encoded automatically and arrives as 'impresi%F3n'. But when I encode it manually, I get 'impresi%C6n'. %C6 is 198 in decimal and %F3 is 243.
The ord function returns the correct decimal value for ASCII characters, but the values it returns for non-ASCII characters are wrong. Hence the query sent to my search engine is wrong.
I would appreciate your immediate help.
Thanks.
My suggestion about the character "in real life" applies not only to your file but also to any browser. Same dragon there.
In your CGI you must rely on what you get; you cannot identify, only guess, what the user sitting in front of the browser wants to send you. You have to believe it.
Either you tell the users which font, which keyboard mapping, etc. they have to use, or you need to use what you get from them.
Probably some experts using KOI8 and ISO8859-* fonts are listening here too; they might tell you more details about the font dragons (I'm just a pure Latin charset user ;-)
ASKER
My question has nothing to do with the browser or what the user types in. I am taking my query, 'impresión', from a file. Before sending it to my search engine, I want to encode it as 'impresi%F3n'. But I get 'impresi%C6n' using the function below.
sub URLEncode {
    my $theURL = $_[0];
    # Replace every non-word character with %XX (uppercase hex of its ord value)
    $theURL =~ s/([\W])/"%" . uc(sprintf("%2.2x",ord($1) ))/eg;
    return $theURL;
}
Please correct me if my encoding subroutine is missing anything.
This is urgent. Please reply.
Thanks for your response.
Lakshmi
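A minimal sketch of the same percent-encoding step, assuming the input is a plain byte string. The name url_encode is made up for this illustration, and the \W class from the post is spelled out as an explicit ASCII class here so that the behaviour does not depend on locale or Unicode settings. It shows that the substitution itself is sound: the output is decided entirely by the byte that is in the input.

```perl
use strict;
use warnings;

# Hypothetical variant of the URLEncode sub above: every character that is
# not an ASCII letter, digit or underscore becomes %XX, where XX is the
# uppercase hex code that ord() reports for it.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_])/sprintf("%%%02X", ord($1))/eg;
    return $text;
}

print url_encode("impresi\xF3n"), "\n";   # impresi%F3n
print url_encode("impresi\xC6n"), "\n";   # impresi%C6n
```

With the byte 0xF3 in the string, the result is the encoding the form submission produces; with 0xC6, it is the unwanted one. That points at the data in the file rather than at the subroutine.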
ASKER CERTIFIED SOLUTION
ASKER
As you said, the problem is with the string in the file.
If I put 'impresiµn' in my file, it is escaped correctly as impresi%F3n.
Thank you so much for your quick answers!
One more request, if possible.
How would I know that 'impresión' has to be entered as 'impresiµn'? (There are many accented characters in my file.) How can I find the right character for each of them?
Thanks again.
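One possible way to avoid retyping the file by hand is to translate the bytes before encoding them. The following is only a sketch under assumptions: the names %byte_map and remap_bytes are made up for this illustration, and the table holds just the single pair this thread confirms (0xC6 in the file where 0xF3 is expected for ó). Every other accented character would need its own entry, found by inspecting the bytes in the file.

```perl
use strict;
use warnings;

# Hypothetical mapping table: byte found in the file => byte the search
# engine expects. Only the 0xC6 => 0xF3 pair comes from this thread.
my %byte_map = ( "\xC6" => "\xF3" );

# Replace each high byte that has an entry in the table; leave all
# other bytes untouched.
sub remap_bytes {
    my ($text) = @_;
    $text =~ s/([\x80-\xFF])/exists $byte_map{$1} ? $byte_map{$1} : $1/eg;
    return $text;
}

printf "%vd\n", remap_bytes("impresi\xC6n");   # byte codes, dot-separated
```

If the encoding the file was saved in can be identified, a conversion with the core Encode module would probably be a better choice than a hand-written table.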
echo 'impresión' | perl -pe '$_=~tr/[a-zA-Z0-9,._]//d; '
ASKER
Executing the above just gives back the same ó character, doesn't it?
Or am I doing something wrong?
Please use
od -c file-containing-above-script
to see which octal value it is.
If you use
$str="\306";
you get what you want.
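The same inspection can be sketched in Perl itself, which may be handy when od is not available. The helper name high_bytes is made up for this illustration; it assumes a plain byte string and lists every byte above 0x7F in the octal notation od -c uses, together with its decimal and hex values.

```perl
use strict;
use warnings;

# Hypothetical helper: describe each non-ASCII byte (0x80-0xFF) of a string
# as an octal escape plus its decimal and hex values.
sub high_bytes {
    my ($text) = @_;
    return map { sprintf "\\%03o (dec %d, hex %02X)", ord($_), ord($_), ord($_) }
           $text =~ /([\x80-\xFF])/g;
}

print "$_\n" for high_bytes("impresi\xC6n");   # \306 (dec 198, hex C6)
```

Running this over each line read from the file shows which byte values the accented characters really have, independent of how the terminal or editor displays them.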