Solved

Get remote page (url) with foreign characters

Posted on 2004-08-09
Last Modified: 2012-06-27
Hi,
I want to get the contents of a remote page (a URL like http://www.google.com) into a string variable using ASP.
I know I can use the MSXML2.ServerXMLHTTP object to "get" a remote file, using the .ResponseText property to read the returned text into a string.
However, this method causes severe problems when the resulting text contains foreign characters like ë or é.
Does anyone know a solution for this?
(I've read something about reading the response as binary and converting it to ASCII. How does this work?)

Thanks in advance!
Steffest
Question by:Steffest
5 Comments
 
LVL 4

Expert Comment

by:Tasneem
ID: 11752285
<%@ CodePage=65001 %>
<%
' Tell the browser that this page is sent as UTF-8
Response.CharSet = "utf-8"
%>
Put the above code in the calling page, i.e. the page where you make the XMLHTTP request. It should work. If not, we can think of alternatives.
 
LVL 1

Author Comment

by:Steffest
ID: 11761573
Hi Tasneem,

Nope, I tried setting it both in the calling page and in the page that is called.
No results.

In the meantime I've found a solution that works (more or less):

reading the response as binary data using the .ResponseBody property and converting it to text using the function at http://www.motobit.com/tips/detpg_binarytostring.htm

But it's ridiculously slow.
There's got to be a better solution for this.
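
A commonly used alternative to a byte-by-byte conversion loop is to let ADODB.Stream do the decoding. The following is only a minimal sketch, not something confirmed in this thread: the helper name BytesToText is hypothetical, and the "iso-8859-1" charset is an assumption about what the remote page actually uses.

```vbscript
<%
' Hypothetical helper: decode a byte array into a string using a named charset.
' Note: Write raises an error if the byte array is empty.
Function BytesToText(bytes, charsetName)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = 1                 ' 1 = adTypeBinary
    stream.Open
    stream.Write bytes
    stream.Position = 0             ' Type can only be changed at position 0
    stream.Type = 2                 ' 2 = adTypeText
    stream.Charset = charsetName    ' charset used to interpret the bytes
    BytesToText = stream.ReadText
    stream.Close
    Set stream = Nothing
End Function

Dim http, html
Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")
http.open "GET", "http://www.google.com", False   ' synchronous request
http.send
' Use the raw bytes (.responseBody) rather than .responseText
html = BytesToText(http.responseBody, "iso-8859-1")  ' assumed charset
Set http = Nothing
%>
```

Because the stream converts the whole buffer in one call, this avoids looping over every byte in script code, which is where the slowness of a pure VBScript conversion usually comes from.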
 
LVL 4

Expert Comment

by:Tasneem
ID: 11781804
 
LVL 4

Accepted Solution

by:
Tasneem earned 250 total points
ID: 11781813
The link I posted earlier is for PHP, but you may be able to adapt that solution for your page.
For general reading:
http://www.mezzoblue.com/archives/2003/07/29/html_and_for/
 
LVL 1

Author Comment

by:Steffest
ID: 11782036
Thanks Tasneem

I found some clarifying reading there.
The problem is described very well:

quote
"Oh, I can just pretend this is UTF-8. This sometimes works, but unfortunately there's not that much pure ASCII left in the world
If there's even one é or smart quotation mark (“ instead of ") in your text, it's probably encoded in ISO-8859 or some Microsoft code page, and will seriously confuse software that thinks it's reading UTF-8, including most XML software."
/quote

It seems that most of the requested URLs are not UTF-8 at all, which is what trips up the MSXML2 text decoding behind .ResponseText.
Problem solved.
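
Since the remote pages vary in encoding, one way to avoid hard-coding a charset is to read it from the response's Content-Type header. This is a rough sketch under assumptions: the helper name CharsetFromContentType is hypothetical, the "iso-8859-1" fallback is a guess, and pages that declare their charset only in a meta tag are not handled.

```vbscript
<%
' Hypothetical helper: extract the charset from a Content-Type header value,
' e.g. "text/html; charset=ISO-8859-1". Returns defaultCharset if none is declared.
Function CharsetFromContentType(contentType, defaultCharset)
    Dim pos, value, endPos
    CharsetFromContentType = defaultCharset
    pos = InStr(1, contentType, "charset=", vbTextCompare)
    If pos = 0 Then Exit Function
    value = Mid(contentType, pos + Len("charset="))
    endPos = InStr(value, ";")                 ' cut off any trailing parameters
    If endPos > 0 Then value = Left(value, endPos - 1)
    value = Trim(Replace(value, """", ""))     ' strip optional quotes
    If Len(value) > 0 Then CharsetFromContentType = value
End Function

Dim http, charsetName
Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")
http.open "GET", "http://www.google.com", False
http.send
' "" & ... coerces a missing (Null) header to an empty string
charsetName = CharsetFromContentType("" & http.getResponseHeader("Content-Type"), "iso-8859-1")
Set http = Nothing
%>
```

The resulting name can then be handed to whatever byte-to-text conversion is applied to .ResponseBody, for example the Charset property of an ADODB.Stream.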