C# - Convert URL to plain text

Hi,

I am trying to convert a URL-encoded string to plain text like this:

string fileTxt = File.ReadAllText("test1.txt");

MessageBox.Show(Uri.UnescapeDataString(fileTxt)); // --> It doesn't work, it shows: oJ\x2BkTRbYHjeME72JTd\x2F\x2Fnb\x2BKt0RWeAmk8Gzw7dOc\x2FAM\x3D:1438850410112:hEj6agVToFMpYj7MufF9Qg\x3D\x3D

The content of "test1.txt": oJ\x2BkTRbYHjeME72JTd\x2F\x2Fnb\x2BKt0RWeAmk8Gzw7dOc\x2FAM\x3D:1438850410112:hEj6agVToFMpYj7MufF9Qg\x3D\x3D

If I use the string directly: MessageBox.Show(Uri.UnescapeDataString("oJ\x2BkTRbYHjeME72JTd\x2F\x2Fnb\x2BKt0RWeAmk8Gzw7dOc\x2FAM\x3D:1438850410112:hEj6agVToFMpYj7MufF9Qg\x3D\x3D")); // ---> It works.


Could you let me know how to make string fileTxt = File.ReadAllText("test1.txt"); work?

Thank you!
Bach Nguyen Asked:

Robert Schutt, Software Engineer, Commented:
When you use a string literal like that in C#, the compiler has already processed the escape sequences before any function is called, so the unescape does nothing in that case. If you leave it out you get the same result:
MessageBox.Show("oJ\x2BkTRbYHjeME72JTd\x2F\x2Fnb\x2BKt0RWeAmk8Gzw7dOc\x2FAM\x3D:1438850410112:hEj6agVToFMpYj7MufF9Qg\x3D\x3D"); // ---> It "works".
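To illustrate the difference, here is a minimal sketch comparing a regular literal with a verbatim literal. The verbatim literal is used here as a stand-in for what File.ReadAllText would return; the short sample string is made up for the example and is not taken from the file.

```csharp
using System;

static class LiteralVsFile
{
    static void Main()
    {
        // Regular literal: the compiler turns \x2B into '+' at compile time.
        string literal = "a\x2Bk";

        // Verbatim literal: backslash, 'x', '2', 'B' stay as four separate
        // characters, like text read from a file that contains a\x2Bk.
        string raw = @"a\x2Bk";

        Console.WriteLine(literal);        // a+k
        Console.WriteLine(raw.Length);     // 6

        // There is no %xx token in the raw text, so nothing is unescaped.
        Console.WriteLine(Uri.UnescapeDataString(raw) == raw); // True
    }
}
```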

However, I suspect this is actually not the correct string. Look at it carefully below at *1*: there is a strange character that I don't "trust" (just before M=:).

How is the file generated? It does not look like it's generated with Uri.EscapeDataString() because then you would see %xx tokens in the string. In fact you can use the unescape if you replace "\x" with "%" in fileTxt:
MessageBox.Show(Uri.UnescapeDataString(fileTxt.Replace(@"\x", "%")));

This gives a slightly different result (see *2* below) because of \x2FA. In a C# string literal, \x consumes up to four hex digits, so \x2FA is read as the single character U+02FA, but it probably needs to be processed as \x2F (a slash) followed by "AM" etc.

The two outputs:
*1* oJ+kTRbYHjeME72JTd//nb+Kt0RWeAmk8Gzw7dOc˺M=:1438850410112:hEj6agVToFMpYj7MufF9Qg==
*2* oJ+kTRbYHjeME72JTd//nb+Kt0RWeAmk8Gzw7dOc/AM=:1438850410112:hEj6agVToFMpYj7MufF9Qg==
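As an alternative to the Replace call, the \xHH tokens could be decoded directly with a regular expression that always takes exactly two hex digits. This is only a sketch: the HexEscapes class and Decode method are hypothetical names, and it assumes every escape in the file is a single byte in the ASCII range, which is not confirmed for this file.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class HexEscapes
{
    // Matches a backslash, 'x', and exactly two hex digits.
    private static readonly Regex TwoDigitEscape =
        new Regex(@"\\x([0-9A-Fa-f]{2})");

    // Replaces each \xHH token with the character that has that code.
    // Assumption: escapes are single ASCII bytes, not multi-byte sequences.
    public static string Decode(string text)
    {
        return TwoDigitEscape.Replace(text, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }

    static void Main()
    {
        // \x2F is decoded as '/', and "AM" is left alone.
        Console.WriteLine(Decode(@"Oc\x2FAM\x3D")); // Oc/AM=
    }
}
```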

Be careful, because there seems to be more going on: the file may contain 3 fields separated by colons. If it were a single string, I would expect the colons to have been encoded as \x3A. You may need to split the text on the colons first and decode each field afterwards, because if any field contains an encoded colon, splitting after decoding would give you more than 3 fields.
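A minimal sketch of splitting before decoding follows. The FieldDecoder class, the DecodeFields method and the sample line are made up for illustration, and the idea that the file holds colon-separated fields is still only a guess.

```csharp
using System;
using System.Linq;

static class FieldDecoder
{
    // Splits on the literal colons first, then decodes each field on its own,
    // so an encoded colon (\x3A) inside a field cannot create an extra field.
    public static string[] DecodeFields(string line)
    {
        return line.Trim()
                   .Split(':')
                   .Select(f => Uri.UnescapeDataString(f.Replace(@"\x", "%")))
                   .ToArray();
    }

    static void Main()
    {
        // Hypothetical sample: the first field contains an encoded colon.
        string[] fields = DecodeFields(@"a\x3Ab:123:c\x3D\x3D");
        Console.WriteLine(fields.Length); // 3
        Console.WriteLine(fields[0]);     // a:b
        Console.WriteLine(fields[2]);     // c==
    }
}
```

In the original code, the input line would come from File.ReadAllText("test1.txt") instead of the sample string.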

Bach Nguyen, Author, Commented:
Thank you very much!

I got it.