
Solved

Get remote page (url) with foreign characters

Posted on 2004-08-09
Last Modified: 2012-06-27
Hi,
I want to get the contents of a remote page (a URL like http://www.google.com) into a string variable using ASP.
I know I can use the MSXML2.ServerXMLHTTP object to "get" a remote file, using the .ResponseText property to read the returned text into a string.
However, this method causes severe problems when the resulting text contains non-ASCII characters such as ë or é.
Does anyone know a solution for this?
(I've read something about reading the return value as binary and converting it to ASCII ... how does this work?)

Thanks in advance!
Steffest
Question by:Steffest
 
LVL 4

Expert Comment

by:Tasneem
ID: 11752285
<%@ CodePage=65001 %>
<%
Response.CharSet = "utf-8"
%>
Put the above code in the calling page, i.e. the page where you make the XMLHTTP request. Ideally it should work. If not, we can think of alternatives.
 
LVL 1

Author Comment

by:Steffest
ID: 11761573
Hi Tasneem,

Nope, I tried setting it both in the calling page and in the page that is called ...
No results ...

In the meantime I've found a solution that works (more or less):

reading the response as binary data using the .ResponseBody property and converting it to ASCII using the function at http://www.motobit.com/tips/detpg_binarytostring.htm

But it's ridiculously slow ...
There's got to be a better solution for this ...
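
Editor's note: the idea of reading the raw bytes and converting them with the page's own character set can be sketched as follows. This is a language-neutral illustration in Python, not the ASP code used in this thread. The helper names `charset_from_content_type` and `decode_body` are hypothetical, and the ISO-8859-1 fallback is an assumption about what a server most likely meant when it declares no charset.

```python
def charset_from_content_type(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` (an assumption, not a rule) when the
    header is missing or carries no charset parameter.
    """
    for part in (content_type or "").split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset" and value:
            return value.strip("\"'").lower()
    return default


def decode_body(raw, content_type):
    """Decode raw response bytes using the declared charset."""
    charset = charset_from_content_type(content_type)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: ISO-8859-1 maps every byte
        # to a character, so this decode cannot fail.
        return raw.decode("iso-8859-1")


# The bytes of an ISO-8859-1 page decode correctly once the charset is honoured.
text = decode_body(b"caf\xe9", "text/html; charset=ISO-8859-1")
```

Decoding the whole body in one call, rather than converting byte by byte in a loop, is also what tends to make this approach fast.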
 
LVL 4

Accepted Solution

by:
Tasneem earned 1000 total points
ID: 11781813
The link posted earlier is for PHP, but perhaps you can adapt that solution for your page.
For general reading:
http://www.mezzoblue.com/archives/2003/07/29/html_and_for/
 
LVL 1

Author Comment

by:Steffest
ID: 11782036
Thanks Tasneem,

I found some clarifying reading there.
The problem was described very well:

quote
"Oh, I can just pretend this is UTF-8. This sometimes works, but unfortunately there's not that much pure ASCII left in the world
If there's even one é or smart quotation mark (“ instead of ") in your text, it's probably encoded in ISO-8859 or some Microsoft code page, and will seriously confuse software that thinks it's reading UTF-8, including most XML software."
/quote

It seems that most of the requested URLs are not UTF-8 at all, which messes up the MSXML2 text parser ...
Problem solved.
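
Editor's note: the mismatch described in the quote can be reproduced in a few lines. This is a minimal sketch in Python rather than ASP, and `try_decode` is a hypothetical helper introduced only for the illustration. It shows that a single é encoded as ISO-8859-1 is not valid UTF-8, so a reader that assumes UTF-8 either fails or substitutes a replacement character.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# In ISO-8859-1 the character é is the single byte 0xE9.
latin1_bytes = "café".encode("iso-8859-1")

as_utf8 = try_decode(latin1_bytes, "utf-8")         # wrong assumption about the page
as_latin1 = try_decode(latin1_bytes, "iso-8859-1")  # matches the actual encoding

# A lenient decoder does not fail, but it loses the character instead.
lossy = latin1_bytes.decode("utf-8", errors="replace")
```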