ASCII characters -> URL-encoded form

I have an unusual problem.  Like most unusual problems, this one requires an unusual solution.

I need someone to provide me with a two column list.  In the left column are the numbers 1 through 255.  These numbers represent all possible ASCII characters (minus the first, but that's okay).

On the right, I need the escape sequence that the ASCII character would be converted to if it were submitted over the internet in a form submission.  For example, the space character (I'm not sure offhand what number it is associated with) would be converted to a '+' sign or a %20.  A # would be %23, etc.

If you are unsure of my question, please ask for further clarification.  What I need is very clear to me; all I need is someone to do it.  For 300 points, I hope someone out there will come through.  If this problem is more difficult than I anticipate, I will increase the points further.
cyberuserAsked:
vendrigCommented:
http://www.netspace.org/users/dwb/url-guide.html#appendixA
"These escape sequences are of the format "%+US-ASCII-character-hexadecimal value"."
So your columns will look like this:
1 %01
2 %02
....
255 %FF

You just need to know the hexadecimal system, but that's not very hard. I can give you the full list if this is what you mean.
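
For anyone who would rather generate the list than type it out, here is a minimal standard C++ sketch of the table described above. The names escape_for_code and escape_table are made up for illustration; the code simply formats each number as a percent sign followed by two uppercase hex digits.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: format a character code (0-255) as a percent escape.
std::string escape_for_code(int code)
{
    char buf[4];  // '%' + two hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(code & 0xFF));
    return buf;
}

// Hypothetical helper: build the two-column list, one "number escape" row per code.
std::vector<std::string> escape_table()
{
    std::vector<std::string> rows;
    for (int code = 1; code <= 255; ++code)
        rows.push_back(std::to_string(code) + " " + escape_for_code(code));
    return rows;
}
```

Printing each row of escape_table() gives a list that starts with "1 %01" and ends with "255 %FF".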
vendrigCommented:
Actually, you just have to add the percent signs, and you get the ASCII characters as a bonus:
http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm
cyberuserAuthor Commented:
Hello Vendrig,

The web site you provided me with was of immense help.

However, I'm still a bit confused.  Perhaps explaining my problem will help you explain things to me.

I need to encode a large amount of text to send to a form on the internet.  The text could include any letters/numbers/special characters.

If I set up a computer program that converts all the characters mentioned on the web site you provided to their escape sequence counterparts, will that allow me to transmit the text to the internet form?

The characters I'm referring to are these:

     SPACE      %20
     <          %3C
     >          %3E
     #          %23
     %          %25
     {          %7B
     }          %7D
     |          %7C
     \          %5C
     ^          %5E
     ~          %7E
     [          %5B
     ]          %5D
     `          %60

     ;          %3B
     /          %2F
     ?          %3F
     :          %3A
     @          %40
     =          %3D
     &          %26

I am using HTTP (obviously).  If you can help me with this problem, the points are yours.  Thanks!
ozoCommented:
Do you want to encode those characters in a URL?  or in an HTML page?
vendrigCommented:
Hi cyberuser,
I'm afraid you have to be more specific about "send to a form on the internet".
If you fill in a form, your browser will automatically convert the specific characters before submitting the text, so don't worry about that.
If you want to bypass your browser with your own program, you can use these codes.
cyberuserAuthor Commented:
I want to encode these characters in a URL.

Yes, I want to bypass the browser altogether.

Let's say I have a form at:
http://www.somewhere.com/someform.asp

and I want to submit some data to it.  The url might look like:

http://www.somewhere.com/someform.asp?SomeVariable=Some%20Data1234%23Some%20More%20Data

Are the characters listed in my previous comment the only ones I need to encode?
cyberuserAuthor Commented:
Alternatively, does the browser do anything besides encoding the above characters to their escape sequence counterparts?

You see, I've written the program so that it encodes the above characters.  Unfortunately, submitting the text from my program doesn't work, while submitting the same text through the browser does.  The ASP seems to treat my encoded text differently from the browser's.

If you would like, I could post the encode routine so you can look at it (it uses C++ and a touch of MFC but is not that long).
vendrigCommented:
May I ask why you didn't use the default Server.URLEncode in ASP?
vendrigCommented:
The official document on this can be found at http://www.ietf.org/rfc/rfc1738.txt
You missed the double quote ", which encodes as %22.

Do you have full control over the ASP server? If so, I'd rather see the URLs as the ASP receives them, from both the browser and your app, than the code.
Did you make sure there are no hidden form fields?
cyberuserAuthor Commented:
Let me start from square one :-)  I'm new to this, which may be why we're having problems communicating.

I want to write a program that submits form data to an ASP on the internet automatically.  The idea is to automate the process so I don't have to do it myself.

The ASP is not under my control.

The ASP takes just one form variable.  This variable can contain a lot of bytes, however.  If you were to submit the data by going to a URL, it would look like this:

http://www.somewhere.com/someasp.asp?variable=junk

Where "junk" is almost anything and goes on for quite some time (about 20000 bytes worth).  By almost anything, I mean it includes new line characters, spaces, <, >, !@#$%^&*(), letters, numbers etc.

The problem I am having is encoding these bytes properly.

I know the ASP recognizes small amounts of text submitted by my program like "abc".  It will acknowledge that it received the data.  So, I know I am not missing any hidden form variables and my program is indeed submitting SOMETHING to the form.

Thus, the problem I am having is how to encode these letters and numbers in my computer program.

After looking at:
http://www.ietf.org/rfc/rfc1738.txt

I saw that I could rewrite the encoding routine in my program so that it encodes every character except for "a" to "z", "A" to "Z", "0" to "9", and the special characters "$-_.+!*'(),".

Unfortunately, this doesn't seem to work either (which makes me wonder if my program's encoding routine is at fault).  The ASP still complains that the text it is receiving is "not in the proper format".  It does not complain if I submit the text manually via a browser.  This indicates to me that the original version of the data is okay, but my encoding screws it up.

So, my question is: what ASCII characters should I encode and what should I encode them to?

Thanks for your help!
vendrigCommented:
Did you encode the new line characters?
cyberuserAuthor Commented:
The newline characters are being encoded.

Here is the code:

CString Encode (CString data)
{ // declare variables
  CString temp, escape;
  char c;
  int i;

  // go through the data that needs to be encoded
  for (i = 0; i < data.GetLength (); i++)
  { // get the current byte
    c = data.GetAt (i);

    // check if it needs to be encoded
    if (
         isalnum (c) != 0 ||
         c == '$' ||
         c == '-' ||
         c == '_' ||
         c == '.' ||
         c == '+' ||
         c == '!' ||
         c == '*' ||
         c == '\'' ||
         c == '(' ||
         c == ')' ||
         c == ','
       )
    {
      temp += c;
    }
    else
    {
      // if it needs encoding, convert it to a hex escape sequence.
      escape.Format ("%%%X", c);
      temp += escape;
    }
  }

  // return the encoded form:
  return temp;
}

If you would like, I could write a test program, give the function some bytes, encode them, and post the result here.
vendrigCommented:
Could you give two strings with the same input: one encoded by a browser (desired output) and one encoded by your program (fail output)? I'd like to know what the difference between them is.
The only thing I can see is that a newline is encoded as %A rather than %0A. Maybe it's picky about that.
vendrigCommented:
Plus, I would encode the + character as well, even though it doesn't say so in the specification. Maybe you should encode all those special characters, just to be on the safe side...
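
As a rough sketch of that safe-side approach in standard C++ (not the MFC routine above), the function below keeps only ASCII letters and digits, writes a space as '+' as browsers do for application/x-www-form-urlencoded data, and turns every other byte, including a literal '+', into a two-digit escape. The name form_urlencode is made up for this example, and the server in question may of course expect something different.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: encode a value for an application/x-www-form-urlencoded body.
std::string form_urlencode(const std::string& data)
{
    std::string out;
    for (char ch : data) {
        // Work on the unsigned value so bytes above 0x7F are not sign-extended.
        unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80 && std::isalnum(c)) {
            out += ch;   // ASCII letters and digits pass through unchanged
        } else if (c == ' ') {
            out += '+';  // form encoding writes a space as '+'
        } else {
            char buf[4]; // '%' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;  // e.g. '+' becomes %2B and a newline becomes %0A
        }
    }
    return out;
}
```

With this, form_urlencode("a b+c") gives a+b%2Bc. One thing to check when comparing against a browser: HTML 4.01 says line breaks in submitted form data are sent as CR LF pairs (%0D%0A), so a lone %0A from a program may not match what the browser sends.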
cyberuserAuthor Commented:
Can you explain to me how to get encoded information from the browser?  That is, can you explain how I could encode some bytes with the browser?  If we could get that, it would be easy to see where the problem is.

I will adjust the program by making all the single digit hex numbers output as double digits.  Additionally, I will make it encode all the other special characters anyway (leaving only alphanumerics untouched).
cyberuserAuthor Commented:
I've re-written the function like so.  It now encodes everything except the alphanumerics.  It also adds a 0 if necessary (%F is now %0F).

Unfortunately, it still doesn't work.  It tells me I gave it bad data.  I tried submitting the text via a form in my browser and it worked.

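CString Encode (CString data)
{
  CString temp, escape;
  char c;
  unsigned char uc;
  int i;

  for (i = 0; i < data.GetLength (); i++)
  {
    c = data.GetAt (i);

    // use the unsigned value so bytes above 0x7F are not sign-extended
    uc = (unsigned char) c;

    if (isalnum (uc) != 0)
    {
      // alphanumerics pass through unchanged
      temp += c;
    }
    else
    {
      // everything else becomes a hex escape, zero-padded to two digits
      if (uc <= 0xf)
      {
        escape.Format ("%%0%X", uc);
      }
      else
      {
        escape.Format ("%%%X", uc);
      }

      temp += escape;
    }
  }

  return temp;
}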

If we could just get a copy of the encoded text a browser spits out, we could probably solve this problem.
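
Until a browser's output can be captured, one partial check is to decode the program's output and confirm that it matches the original text. The sketch below is a standard C++ decoder for that purpose. The name form_urldecode is hypothetical, and the code assumes the input uses '+' for spaces and %XX escapes; malformed escapes are passed through unchanged.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: reverse form encoding ('+' to space, %XX to the byte value).
std::string form_urldecode(const std::string& data)
{
    std::string out;
    for (std::size_t i = 0; i < data.size(); ++i) {
        char ch = data[i];
        if (ch == '+') {
            out += ' ';
        } else if (ch == '%' && i + 2 < data.size()
                   && std::isxdigit(static_cast<unsigned char>(data[i + 1]))
                   && std::isxdigit(static_cast<unsigned char>(data[i + 2]))) {
            // Convert the two hex digits after '%' back into a single byte.
            std::string hex = data.substr(i + 1, 2);
            out += static_cast<char>(std::strtol(hex.c_str(), nullptr, 16));
            i += 2;  // skip the two digits just consumed
        } else {
            out += ch;  // ordinary character, or a '%' without two hex digits
        }
    }
    return out;
}
```

If decoding the encoded text does not reproduce the input exactly, the encoder is at fault. If it does, the mismatch more likely lies in what the server expects rather than in the escapes themselves.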
vendrigCommented:
Can you give the URL you test on?
cyberuserAuthor Commented:
Unfortunately it's on a secure web site.  You need the password. :(
vendrigCommented:
???
Then have you tested your application with other forms? Because if it's on a secure Web-site, how can your application transfer the info?
cyberuserAuthor Commented:
The program logs on to the web site for me.  I have also tested submitting information to other forms on other web sites.  It does work.  I am 95%+ sure the problem lies in the encoding of it (the part I've pasted above).

I don't know what else the problem could be.
vendrigCommented:
But if the encoding is okay for the other forms, that's an indication the security is the problem.
cyberuserAuthor Commented:
Oh, my last comment was not very clear (it's early here).

When I said "It does work." I was referring to the fact that I know my program submits the data.  I know it logs me on to the web site properly.

I will try and find a website that can reproduce this problem.  Then I will set up a test program for you to try it with and send it to you.

Before that, though, is there any way to find out how the browser encodes data?
cyberuserAuthor Commented:
One thing that just crossed my mind, which makes me think that knowing how the browser encodes it is important, is that the ASP has to parse the data.  The errors it has been giving me are errors it is encountering while parsing through the data.  This makes me think that while the data is being encoded properly as far as the standard is concerned, it may not be exactly what the form is looking for.

My browser (IE 5.0) may be encoding the data one way, which is right according to the standard.  The program, however, may be encoding it slightly differently.

I hope that makes sense.
BigRatCommented:
The encoding routine seems to be OK. Could we look at the bit of code which appends the translated data onto the URL just before it is sent?
Also, have you been testing this routine with a SMALL data string? I never GET with strings longer than a few bytes. I always POST.
cyberuserAuthor Commented:
Certainly.

  CString temp;
  temp.Format ("Action=Update&ReportData=%s", m_EncodedData);

Then it calls (MFC again):

  CHtmlViewObject->Navigate2 (G_ADDDATA_ASP,
           navNoWriteToCache,
           NULL,
           "Content-Type: application/x-www-form-urlencoded",
           (char *) temp.operator LPCTSTR (),
           temp.GetLength ());

Also, this uses POST, not GET.
cyberuserAuthor Commented:
The problem was on my end.  The encoding routine was definitely correct.  Since Vendrig provided by far the most assistance, I'll give him the points.

Thanks for the help!
vendrigCommented:
You're welcome, good luck with it.