Solved

Processing a Unicode form submission

Posted on 2004-09-17
Last Modified: 2010-03-05
I am using the following form to submit Unicode data, with 2 Russian characters in the "rus_lang" field.

<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=unicode">
</HEAD><BODY>
<FORM action=http://209.45.33.8/cgi-bin/cgiwrap/absolute/WRITEENV.CGI method=post>
<INPUT type=text name=rus_lang value="Ce">
<INPUT type=submit value="Submit">
</FORM>
</BODY></HTML>


When I process this form submission with a Perl script, I receive the following string as input:
rus_lang=%D0%A1%D0%B5

How can I decode the received string back to the original value (which was: rus_lang="\x21\x04\x35\x04")?

P.S. The form example above can't show Russian characters on this web site :-(
So I replaced the 2 Russian characters with 2 English characters to give an idea of how it looks.
Question by:serg111
LVL 18

Expert Comment

by:kandura
ID: 12089400
Does your script "use CGI;"?
Is "unicode" an appropriate value for charset?
Does your webserver serve this document with that same charset?
 
LVL 2

Author Comment

by:serg111
ID: 12089650
1) No, it uses plain Perl.
2) Yes, it is used by Microsoft and other providers.
3) Yes, the web page above is served from the same webserver.
 
LVL 18

Accepted Solution

by:kandura (earned 250 total points)
ID: 12091119
1) Then do use CGI, since it will do the decoding for you.
2) "unicode" is not a registered value for charset. See http://www.iana.org/assignments/character-sets for a complete list.
3) What I mean is, does your webserver also emit an HTTP header stating
    Content-type: text/html; charset=unicode

It probably doesn't. You should consider using an accepted encoding such as utf-8 or ISO-8859-1.
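
For illustration only (this is not part of the original reply): the submitted value %D0%A1%D0%B5 is what the UTF-8 bytes of U+0421 and U+0435 look like when percent-encoded, while "\x21\x04\x35\x04" is the same two characters written as UTF-16LE. Below is a minimal plain-Perl sketch of that conversion, assuming the browser really did send UTF-8. The helper names percent_decode and form_value_to_utf16le are made up for this example; only the core Encode module is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Turn an application/x-www-form-urlencoded value into raw bytes.
# (Hypothetical helper, named for this example only.)
sub percent_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                              # '+' encodes a space
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX -> one byte
    return $value;
}

# Assumption: the submitted bytes are UTF-8, which is what
# %D0%A1%D0%B5 looks like. (Hypothetical helper.)
sub form_value_to_utf16le {
    my ($value) = @_;
    my $chars = decode('UTF-8', percent_decode($value));  # bytes -> characters
    return encode('UTF-16LE', $chars);                    # characters -> UTF-16LE bytes
}

# The input string from the question, split into field name and value.
my ($name, $value) = split /=/, 'rus_lang=%D0%A1%D0%B5', 2;
my $utf16 = form_value_to_utf16le($value);

# Show the resulting bytes as hex escapes.
print join('', map { sprintf('\\x%02X', $_) } unpack('C*', $utf16)), "\n";
# prints \x21\x04\x35\x04
```

With CGI.pm, param('rus_lang') would take care of the percent-decoding step; the Encode calls would still be needed if the script wants UTF-16LE bytes rather than the UTF-8 bytes the browser sent.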

Just for fun I tried running the W3C Validator on a script that sets the charset to "unicode". The result is enlightening:
    http://validator.w3.org/check?uri=http%3A%2F%2Fwww.spiritofamerica.net%3A8080%2Fcgi-bin%2Fsoa%2Ftest.pl&charset=%28detect+automatically%29&doctype=%28detect+automatically%29&verbose=1

Other recommended reading: http://www.cs.tut.fi/~jkorpela/chars.html