Solved

encoding a query string

Posted on 2003-06-24
Last Modified: 2007-12-19
I looked at http://www.experts-exchange.com/Web/Web_Languages/JavaScript/Q_20335762.html when I was trying to find a way to encode a query string with #'s and +'s in it.  

I used the hexcode function posted there by dirge, and it worked great until I tried to get it to work with Korean characters.  Some Korean characters use 2 bytes, and the hexcode function only converts to 1 byte in hex.  I tried, just to see, changing the hexcode function to:

function hexnib(d) {
  if(d<10) return d; else return String.fromCharCode(65+d-10);
}

function hexcode(url) {
     var result="";
     for(var i=0;i<url.length;i++) {
        var cc=url.charCodeAt(i);
        var hex= "00" + hexnib((cc&240)>>4)+""+hexnib(cc&15);
        result+="%"+hex;
     }
     return result;
}

The only change I made was adding the "00" + in the line: var hex= "00" + hexnib((cc&240)>>4)+""+hexnib(cc&15);  to make it 2 bytes.

I used this to see if it would work for English characters (all of which would have zeros for the first two digits in a 4-digit hex number), but it didn't work.  When it gets to the server, it is not decoded correctly.  It gets converted on the server to empty string (presumably it was only seeing the 0's?)  Does this mean query strings cannot be encoded to the form %A492%B61A%AE53 etc. ?
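One way to see why the "00" prefix fails: a percent escape is always "%" followed by exactly two hex digits, so "%0041" reads as the byte %00 followed by the literal text "41". The sketch below uses the standard decodeURIComponent as a stand-in for the server's decoder, which is an assumption, since the server in this thread is not specified.

```javascript
// "%0041" is what the modified hexcode produces for "A" (char code 65 = 0x41).
var encoded = "%00" + "41";

// A percent escape covers exactly two hex digits, so this decodes to
// a NUL character followed by the literal characters "4" and "1".
var decoded = decodeURIComponent(encoded);

console.log(decoded.length);        // 3
console.log(decoded.charCodeAt(0)); // 0
console.log(decoded.slice(1));      // "41"
```

Many server-side string routines treat NUL as the end of a string, which may be why the value arrived empty, though that part is a guess.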

If not, then how can Korean characters be passed in a query string?

thanks for the help!
Question by:maltomeal8
9 Comments
 
LVL 18

Expert Comment

by:SquareHead
ID: 8794393
I had the same problem with double-byte chars and encoding HTML entities for the querystring... I was not able to find a solution and ended up replacing the '#' char with something else before adding it to the qs, then doing another replace on the receiving end... not an elegant solution by any means but it worked for me... :p
 
LVL 14

Expert Comment

by:avner
ID: 8794646
Have you tried using the escape() method ?


 

Author Comment

by:maltomeal8
ID: 8799729
The fact that escape() does not handle + correctly was why I used HexCode in the first place.
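A small sketch of the "+" problem, assuming a runtime that still ships the legacy escape() along with encodeURIComponent and URLSearchParams (current browsers and Node do). The sample string "a+b#c" is made up for illustration.

```javascript
var raw = "a+b#c";

// escape() leaves "+" untouched, so a form decoder would read it as a space.
var viaEscape = escape(raw);                // "a+b%23c"

// encodeURIComponent() encodes both "+" and "#".
var viaComponent = encodeURIComponent(raw); // "a%2Bb%23c"

// URLSearchParams decodes the way a typical form handler does.
var lossy = new URLSearchParams("q=" + viaEscape).get("q");    // "a b#c"
var exact = new URLSearchParams("q=" + viaComponent).get("q"); // "a+b#c"

console.log(viaEscape, viaComponent, lossy, exact);
```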

I just noticed something interesting.  On Google, they seem to take what the user types in and put it into a query string.  So, I tried searching for the word français and I noticed it puts this string in the address bar:

http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=fran%C3%A7ais&btnG=Google+Search

It looks like the ç was converted to %C3%A7, but how is that possible?  When I use JavaScript's charCodeAt function on ç, it gives me 231, which is %00%E7 in hex.
Also, they are passing ie=UTF-8, which looks like a flag that says to decode Unicode characters?
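The %C3%A7 is the UTF-8 encoding of code 231, not the code itself. For codes from 128 to 2047, UTF-8 splits the value across two bytes of the form 110xxxxx 10xxxxxx. A minimal sketch of that arithmetic (the variable names are illustrative):

```javascript
var cc = "ç".charCodeAt(0);      // 231 = 0xE7 = binary 11100111

// High byte: marker 110xxxxx plus the top bits of the code.
var byte1 = 0xC0 | (cc >> 6);    // 0xC0 | 3  = 0xC3
// Low byte: marker 10xxxxxx plus the bottom six bits.
var byte2 = 0x80 | (cc & 0x3F);  // 0x80 | 39 = 0xA7

var manual = "%" + byte1.toString(16).toUpperCase() +
             "%" + byte2.toString(16).toUpperCase();

console.log(manual);                         // %C3%A7
console.log(encodeURIComponent("français")); // fran%C3%A7ais
```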

Author Comment

by:maltomeal8
ID: 8799998
I think I have answered my own question (so I guess I'll keep my points).  Apparently a query string can only handle single-byte characters.

I found on http://www.w3.org/TR/html4/interact/forms.html#h-17.13.1

that:
Note. The "get" method restricts form data set values to ASCII characters. Only the "post" method (with enctype="multipart/form-data") is specified to cover the entire [ISO10646] character set.
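The ASCII restriction applies to the characters that appear in the URL itself. Percent-encoded UTF-8 bytes are plain ASCII, so non-ASCII data can still travel in a query string as long as the receiving side decodes it as UTF-8 (an assumption about the server). A short sketch, using "한국" purely as an illustrative value:

```javascript
// Build a query string from a Korean word.
var query = "q=" + encodeURIComponent("한국");
console.log(query); // q=%ED%95%9C%EA%B5%AD

// Every character in the encoded query is 7-bit ASCII.
var isAscii = /^[\x00-\x7F]*$/.test(query);

// URLSearchParams stands in for a server that decodes as UTF-8.
var roundTrip = new URLSearchParams(query).get("q");

console.log(isAscii, roundTrip);
```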
 
LVL 1

Accepted Solution

by:
dirge earned 125 total points
ID: 8800229
The following is an update on my old script. It works fine with ??? for instance, when compared to what Google generates.

You may want to check out http://www1.tip.nl/~t876506/utf8tbl.html and http://selfaktuell.teamone.de/artikel/javascript/utf8b64/utf8.htm (German)

<html>
<head>
<script language="javascript">
<!--

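// Convert a 4-bit value (0-15) to its uppercase hex digit.
function hexnib(d) {
   if(d<10) return String(d); else return String.fromCharCode(65+d-10);
}

// Format one byte (0-255) as "%XX".
function hexbyte(d) {
   return "%"+hexnib((d&240)>>4)+hexnib(d&15);
}

// Percent-encode every character of url as its UTF-8 bytes.
// Note: charCodeAt returns UTF-16 code units, so characters above
// U+FFFF (surrogate pairs) are not encoded correctly by this version.
function hexcode(url) {
   var result="";
   for(var i=0;i<url.length;i++) {
      var cc=url.charCodeAt(i);
      if(cc<128) {
         // 1 byte: 0xxxxxxx
         result+=hexbyte(cc);
      } else if(cc<2048) {
         // 2 bytes: 110xxxxx 10xxxxxx
         result+=hexbyte((cc>>6)|192)
               + hexbyte((cc&63)|128);
      } else {
         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         result+=hexbyte((cc>>12)|224)
               + hexbyte(((cc>>6)&63)|128)
               + hexbyte((cc&63)|128);
      }
   }
   return result;
}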

function encoder() {
   document.forms.test.r.value=hexcode(document.forms.test.s.value);
}

// -->
</script>
</head>
<body>
   <form name="test">
      URL (without http://) <input type="text" name="s"><br>
      Result: <input type="text" name="r"><br>
      <input type="button" value="Encode" onClick="encoder()">
      <input type="reset" value="Clear">
   </form>
</body>
</html>
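For newer runtimes, the same idea can be extended to characters above U+FFFF by iterating over code points instead of UTF-16 code units. The following is only a sketch: utf8PercentEncode is a hypothetical name, not part of the script above, and it assumes a runtime with for...of and codePointAt.

```javascript
// Hypothetical helper: percent-encode every code point of text as UTF-8.
// Lone surrogates are not treated specially in this sketch.
function utf8PercentEncode(text) {
  var out = "";
  // Format one byte (0-255) as "%XX".
  function pct(b) {
    return "%" + (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
  }
  for (var ch of text) {            // for...of iterates by code point
    var cp = ch.codePointAt(0);
    if (cp < 0x80) {                // 1 byte
      out += pct(cp);
    } else if (cp < 0x800) {        // 2 bytes
      out += pct(0xC0 | (cp >> 6)) + pct(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {      // 3 bytes
      out += pct(0xE0 | (cp >> 12)) + pct(0x80 | ((cp >> 6) & 0x3F)) +
             pct(0x80 | (cp & 0x3F));
    } else {                        // 4 bytes
      out += pct(0xF0 | (cp >> 18)) + pct(0x80 | ((cp >> 12) & 0x3F)) +
             pct(0x80 | ((cp >> 6) & 0x3F)) + pct(0x80 | (cp & 0x3F));
    }
  }
  return out;
}

console.log(utf8PercentEncode("a+#"));  // %61%2B%23
console.log(utf8PercentEncode("한국")); // %ED%95%9C%EA%B5%AD
```

Decoding the result with decodeURIComponent should return the original text, which is a convenient way to check the output against the browser's own decoder.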

 
LVL 1

Expert Comment

by:dirge
ID: 8800250
That's 'fine with "Korea" (in Korean)' -- not sure if you see it in your browser, but I don't. I just copied the characters from http://kr.yahoo.com/ 
 
LVL 1

Expert Comment

by:dirge
ID: 8800294
And..... ;-D it's not Google which generates the codes -- it's the browser, once you press Submit.

'Nuff said. Good luck.

 

Author Comment

by:maltomeal8
ID: 8802269
Thank you dirge!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 

Expert Comment

by:justkeys
ID: 9189412
(un)escape is NOT the same as url-encode/decode in IE and Opera.

url-encode/decode = each character is translated into 1 to 4 "%xx" strings, which represent its UTF-8 bytes.
The algorithm of url-encoding works like this:

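                // Assumes the_char is a String holding one character and
                // buffer is a StringBuffer that collects the output.
                byte[] bytes = the_char.getBytes("UTF-8");
                for (int j = 0; j < bytes.length; j++)
                {
                    buffer.append("%");
                    // Mask with 255 so negative byte values map to 0..255.
                    String hex = Integer.toHexString(255 & bytes[j]);
                    // Left-pad to two hex digits.
                    buffer.append("00".substring(hex.length()));
                    buffer.append(hex);
                }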

In JavaScript, I don't know how to do this (I don't know how to find the Unicode index for a char in JavaScript), but for sure the browser does it when you submit a form that contains "international" input (like Chinese). That's what happens when you look for the euro sign in Google.
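For what it's worth, a rough JavaScript counterpart of the Java loop can be sketched with TextEncoder, assuming a runtime that provides it (current browsers and Node do; browsers from 2003 did not). The name urlEncodeChar is made up for this example. The Unicode index of a character is available through charCodeAt.

```javascript
// Hypothetical helper mirroring the Java loop: UTF-8 bytes, each as "%xx".
function urlEncodeChar(theChar) {
  var bytes = new TextEncoder().encode(theChar); // Uint8Array of UTF-8 bytes
  var buffer = "";
  for (var j = 0; j < bytes.length; j++) {
    var hex = bytes[j].toString(16);             // lowercase, like toHexString
    buffer += "%" + "00".substring(hex.length) + hex; // left-pad to 2 digits
  }
  return buffer;
}

console.log("€".charCodeAt(0));  // 8364, the Unicode index (0x20AC)
console.log(urlEncodeChar("€")); // %e2%82%ac
```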

Netscape's (un)escape IS url-encode/decode, while IE's and Opera's (un)escape is NOT: in those browsers, escape translates "simple accented chars" to one single "%xx" expression, where xx is the character's code in hex (ç, code 231, becomes %E7) rather than its UTF-8 bytes. For more complex characters, escape returns a "%uxxxx", where xxxx = the hex Unicode value of the character.
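The difference is easy to reproduce in a current engine, assuming it still exposes the legacy escape() function. This only shows present-day behavior and cannot confirm what each 2003 browser did.

```javascript
// Legacy escape(): one %XX for codes below 256, %uXXXX above that.
var escAccent = escape("ç");   // "%E7"
var escKorean = escape("한");  // "%uD55C"

// encodeURIComponent(): always the UTF-8 bytes, one %XX per byte.
var uriAccent = encodeURIComponent("ç");   // "%C3%A7"
var uriKorean = encodeURIComponent("한");  // "%ED%95%9C"

console.log(escAccent, escKorean, uriAccent, uriKorean);
```

Only the second form is what a server expecting UTF-8 url-encoding will decode correctly.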