Solved

Get remote page (url) with foreign characters

Posted on 2004-08-09
Last Modified: 2012-06-27
Hi,
I want to get the contents of a remote page (a URL like http://www.google.com) into a string variable using ASP.
I know I can use the MSXML2.ServerXMLHTTP object to "get" a remote file, using the .ResponseText property to read the returned text into a string.
However, this method causes serious problems when the resulting text contains non-ASCII characters like ë or é.
Does anyone know a solution for this?
(I've read something about reading the response as binary and converting it to a string... how does this work?)

Thanks in advance!
Steffest
Question by:Steffest
 
LVL 4

Expert Comment

by:Tasneem
ID: 11752285
<%@ CodePage=65001 %>
<%
' Send the response to the browser as UTF-8
Response.CharSet = "utf-8"
%>
Put the above code in the calling page, i.e. the page where you make the XMLHTTP request. It should work. If not, we can think of alternatives.
 
LVL 1

Author Comment

by:Steffest
ID: 11761573
Hi Tasneem,

Nope, I tried setting it both in the calling page and in the page that is called...
no results.

In the meantime I've found a solution that works (more or less):

reading the response as binary data using the .ResponseBody property and converting it to a string using the function at http://www.motobit.com/tips/detpg_binarytostring.htm

But it's ridiculously slow.
There's got to be a better solution for this.
 
LVL 4

Expert Comment

by:Tasneem
ID: 11781804
 
LVL 4

Accepted Solution

by:
Tasneem earned 250 total points
ID: 11781813
The link I posted above is for PHP, but you can perhaps adapt that solution for your page.
For general reading:
http://www.mezzoblue.com/archives/2003/07/29/html_and_for/
 
LVL 1

Author Comment

by:Steffest
ID: 11782036
Thanks Tasneem

That was clarifying reading.
The problem is described very well there:

quote
"Oh, I can just pretend this is UTF-8. This sometimes works, but unfortunately there's not that much pure ASCII left in the world
If there's even one é or smart quotation mark (“ instead of ") in your text, it's probably encoded in ISO-8859 or some Microsoft code page, and will seriously confuse software that thinks it's reading UTF-8, including most XML software."
/quote

It seems that most of the requested URLs are not UTF-8 at all, which is what messes up the MSXML2 text parser.
Problem solved.
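
For anyone landing on this thread later: one possible way to avoid the slow byte-by-byte conversion mentioned earlier is to let ADODB.Stream decode the .ResponseBody bytes with an explicit charset. This is only a sketch and was not tested in this thread. The helper name BytesToText, the example URL, and the hard-coded "iso-8859-1" charset are all assumptions made for illustration.

```vbscript
<%
' Hypothetical helper (not from the thread): decode a byte array
' into a string using the given charset name via ADODB.Stream.
Function BytesToText(body, charsetName)
    Const adTypeBinary = 1
    Const adTypeText = 2
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    ' Write the raw bytes into the stream in binary mode
    stream.Type = adTypeBinary
    stream.Open
    stream.Write body
    ' Rewind, then switch to text mode and read with the chosen charset
    stream.Position = 0
    stream.Type = adTypeText
    stream.Charset = charsetName
    BytesToText = stream.ReadText
    stream.Close
    Set stream = Nothing
End Function

Dim http, html
Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")
' Example URL only; replace with the page you want to fetch
http.open "GET", "http://www.example.com/", False
http.send
' Assumption: the remote page is encoded as ISO-8859-1
html = BytesToText(http.responseBody, "iso-8859-1")
Set http = Nothing
%>
```

The charset is hard-coded here for simplicity. In practice it would have to come from the remote page's Content-Type header or meta tag, since, as noted above, many pages are not UTF-8.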