Insert HTML code with special characters into Access DB or Add as a Text file

I have an application that reads the HTML code from a web page and inserts the whole HTML code into an Access database.
The problem is that some special characters like ÛÛÛ are converted to ??? when inserted into the DB.
But when I copy/paste a web page with special characters directly into the DB, there is no problem.

Is there another way to solve this? Perhaps by inserting the .txt or .htm file into the DB, if Access supports it?

Almost complete code:
Dim Httpobj As MSXML2.XMLHTTP
Set Httpobj = New MSXML2.XMLHTTP
Dim oConn As ADODB.Connection
Dim oRs As ADODB.Recordset
Dim sSql As String
Dim sSql2 As String
Set oConn = New ADODB.Connection
oConn.Open ("DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=" & "d:/documents.mdb")
sSql = "SELECT * FROM doc"
Set oRs = oConn.Execute(sSql)

Do While Not oRs.EOF
    link = oRs("link")
    id = oRs("id")

    With Httpobj
        .Open "GET", link, False
        .send
        niz = .responseText
    End With

    sSql2 = "UPDATE doc SET nfo = '" & niz & "' WHERE ID = " & id & ""
    Set oRs2 = oConn.Execute(sSql2)
    oRs.MoveNext
Loop
VampirAsked:
ASPGuruCommented:
try to open the recordset and then do this:
rs("nfo") = Httpobj.responseText

ASPGuru
0
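A minimal sketch of the recordset approach suggested above, assuming the project references ADO and MSXML as in the question and that nfo is a Memo field (an assumption). The Sub name UpdateViaRecordset is hypothetical. Assigning the field through an updatable recordset avoids building the UPDATE string by hand, so apostrophes in the page text cannot break the SQL. It does not change how the response is decoded.

```vb
' Sketch: store the fetched text through an updatable recordset
' instead of concatenating it into an UPDATE statement.
Sub UpdateViaRecordset()
    Dim oConn As ADODB.Connection
    Dim oRs As ADODB.Recordset
    Dim Httpobj As MSXML2.XMLHTTP

    Set Httpobj = New MSXML2.XMLHTTP
    Set oConn = New ADODB.Connection
    oConn.Open "DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=d:/documents.mdb"

    Set oRs = New ADODB.Recordset
    ' A keyset cursor with optimistic locking makes the rows editable
    oRs.Open "SELECT id, link, nfo FROM doc", oConn, adOpenKeyset, adLockOptimistic

    Do While Not oRs.EOF
        Httpobj.Open "GET", oRs("link").Value, False
        Httpobj.send
        oRs("nfo").Value = Httpobj.responseText
        oRs.Update      ' write the change for the current row
        oRs.MoveNext
    Loop

    oRs.Close
    oConn.Close
End Sub
```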
jitgangulyCommented:
Have you tried with server.htmlencode ?

Like

sSql2 = "UPDATE doc SET nfo = '" & server.htmlencode(niz) & "' WHERE ID = " & id & ""

0
VampirAuthor Commented:
Neither works; the problem is with niz = .responseText.
The problem is that .responseText converts the text, and I need to keep it "binary".
But I don't have a clue about encoding.

I found some info at http://dbforums.com/arch/195/2003/2/676676

There is another link on Experts Exchange, but I can't seem to find it again. It's listed in the XML section; I should probably ask my question there.
0
ASPGuruCommented:
well... have a look at the xmlhttp object... there's not only responseText but also a binary property... don't know the exact name right now...

ASPGuru
0
AlfaNoMoreCommented:
It's got to be an encoding issue? You're viewing HTML pages (or HTML output), so it MUST be ASCII. So you need to know what the source encoding is, and use this when you write into your database. Trouble is, I have no idea how you'd do this?? :-)
0
ASPGuruCommented:
no... it doesn't need to be ascii...

ASPGuru
0
AlfaNoMoreCommented:
really? Shows what I know. html is text though? yes.
0
BreedjCommented:
The characters are mapped against the wrong character map.

Before calling Httpobj.open, set the code page to UTF-8 with SetRequestHeader, like: Httpobj.SetRequestHeader "Content-Type", "text/html;Charset=UTF-8"

Also set the code page to UTF-8 in the ASP page that executes this code, using Session.CodePage=65001 and Session.Charset = "UTF-8"
0
VampirAuthor Commented:
First off, I must admit that I am currently testing the application as a Visual Basic application.

niz = .responseStream produces an error.

ASPGuru: I think the binary property is .responseBody (see the solution I came up with below), but it doesn't work with niz = .responseBody.

Breedj (your solution seems to be the closest, but it still gives an error):
Unspecified error in:
Httpobj.SetRequestHeader "Content-Type", "text/html;Charset=UTF-8"

Now here is what I got from the link in my post above. It works, but it first writes to a text file and then reads from that file. It also includes a link to the site I'm testing with. Writing directly to the Access DB as below doesn't work either.

Sub test()
    sFile = "c:\test.txt"
    sURL = "http://www.nforce.nl/nfos/clear_txt.php?id=27651"

    Set objXMLHTTP = CreateObject("MSXML2.serverXMLHTTP.4.0")

    objXMLHTTP.Open "GET", sURL, False
    objXMLHTTP.send

    ' Save the raw response bytes to a file
    Set strm1 = CreateObject("adodb.stream")
    With strm1
        .Type = 1 ' adTypeBinary
        .Open
        .Write objXMLHTTP.responseBody
        .SaveToFile sFile, 2 ' adSaveCreateOverWrite
        .Close
    End With

    ' Read the file back as text in the chosen charset
    Set strm2 = CreateObject("adodb.stream")
    With strm2
        .Type = 2 ' adTypeText
        .Charset = "euc-kr" 'Use any proper charset
        .Open
        .LoadFromFile sFile
        sText = .ReadText ' read once; a second ReadText would start at the end of the stream
        .Close
    End With
    MsgBox sText
    Text1.Text = sText
End Sub
0
VampirAuthor Commented:
Sorry for the bad spelling above. I need a better solution that doesn't involve saving to a text file.
0
ASPGuruCommented:
requestheaders have nothing to do with the response...

you don't need to write to a file... you can load the text into the stream directly... then go to the beginning of the stream, set the right charset and read it out again...


ASPGuru
0
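A minimal sketch of the in-memory conversion described above, assuming ADODB.Stream is available. The function name BytesToText is hypothetical, and the charset passed in has to match the actual encoding of the downloaded page. The bytes are written in binary mode, the stream is rewound, and only then is it switched to text mode so the charset can be applied on reading.

```vb
' Hypothetical helper: decode a byte array (e.g. responseBody) into a
' VB string with the given charset, without going through a file.
Function BytesToText(ByVal vBytes As Variant, ByVal sCharset As String) As String
    Dim strm As Object
    Set strm = CreateObject("ADODB.Stream")
    With strm
        .Type = 1               ' adTypeBinary
        .Open
        .Write vBytes           ' raw bytes, no conversion yet
        .Position = 0           ' rewind; Type and Charset can only change at position 0
        .Type = 2               ' adTypeText
        .Charset = sCharset     ' encoding used to interpret the bytes
        BytesToText = .ReadText ' read the whole stream as text
        .Close
    End With
    Set strm = Nothing
End Function
```

It could be called as niz = BytesToText(objXMLHTTP.responseBody, "ibm437"). The name "ibm437" is an assumption for DOS-style NFO text; the charset names ADODB.Stream accepts are the ones registered on the machine under HKEY_CLASSES_ROOT\MIME\Database\Charset.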
VampirAuthor Commented:
Could you please post working code?
0
ASPGuruCommented:
try something like this:


Set strm2 = CreateObject("adodb.stream")
With strm2
    .Type = 2
    .Charset = "euc-kr" 'Use any proper charset
    .Open
    objXMLHTTP.responseStream.copyTo(strm2)
    MsgBox .ReadText
    Text1.Text = .ReadText
    .Close
End With
End Sub


ASPGuru
0
VampirAuthor Commented:
Refresh...

Sorry ASPGuru, but your code doesn't seem to work.
The error is "Method Required" at: objXMLHTTP.responseStream.CopyTo (strm2)
There doesn't seem to be a .CopyTo method.

Here is the code I used (I changed the MSXML declaration a little so it gives better error reporting):

Sub test()
    sFile = "c:\test3.txt"
    sURL = "http://www.nforce.nl/nfos/clear_txt.php?id=27651"

    Dim objXMLHTTP As MSXML2.ServerXMLHTTP40
    Set objXMLHTTP = New MSXML2.ServerXMLHTTP40

    objXMLHTTP.Open "GET", sURL, False
    objXMLHTTP.send

    Set strm2 = CreateObject("adodb.stream")
    With strm2
        .Type = 2
        .Charset = "euc-kr" 'Use any proper charset
        .Open
        objXMLHTTP.responseStream.CopyTo (strm2)
        MsgBox .ReadText
        Text1.Text = .ReadText
        .Close
    End With
End Sub
0
VampirAuthor Commented:
Btw, I've also decided to just use text files instead of inserting the text into the Access DB (too much trouble anyway).
Not to mention that still not every character is recognized. Is there a charset table that covers all characters? I've tried iso-8859-1.
I'll still give points for a working example for my previous question.
0
ASPGuruCommented:
ok... then i won't bother to find the db solution..

you need to use the encoding of the html file...
this can be different for any file...

ASPGuru
0
VampirAuthor Commented:
All files are plain text similar to http://www.nforce.nl/nfos/clear_txt.php?id=27651.
What encoding should I use?
0
VampirAuthor Commented:
Remove . at the end of the link
0
ASPGuruCommented:
well... it depends... i can't tell you...
the characters with codes 0 - 127 are the same in all ASCII-based encodings...
just for characters above it depends...

oh... i just had a look at the text...
with which characters do you have problems?
NFOs normally use the DOS charset, which also includes these frame graphics...

ASPGuru
0
VampirAuthor Commented:
When creating a text file using the following method:
.Type = 1 ' binary; does not allow a charset
.Open
.Write objXMLHTTP.responseBody ' also binary
.SaveToFile sFile, 2 ' adSaveCreateOverWrite

the ampersand character (&) is converted to &amp;
The same goes for > (&gt;), < (&lt;) and " (&quot;).
Possibly others too.

The problem can be solved by running all files through an application that searches for those strings and replaces them with the actual characters.
0
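If the downloaded text contains HTML entity escapes (for example &amp; in place of &), the search-and-replace step could look like the sketch below. The function name UnescapeHtml is hypothetical, and only the four entities mentioned in this thread are handled; any others would need their own Replace call.

```vb
' Hypothetical helper: undo the four common HTML entity escapes.
Function UnescapeHtml(ByVal s As String) As String
    s = Replace(s, "&lt;", "<")
    s = Replace(s, "&gt;", ">")
    s = Replace(s, "&quot;", """")
    ' &amp; goes last so that "&amp;lt;" becomes "&lt;" and not "<"
    s = Replace(s, "&amp;", "&")
    UnescapeHtml = s
End Function
```

Replacing &amp; last matters: doing it first would turn a literal "&amp;lt;" in the source into "<" instead of "&lt;".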
moduloCommented:
Dear expert(s),

A request has been made to close this Q in CS:
http://www.experts-exchange.com/Community_Support/Q_20562615.html

Without a response in 72 hrs, a moderator will finalize this question by:

 - Saving this Q as a PAQ and refunding the points to the questioner

When you agree or disagree, please add a comment here.

Thank you.

modulo

Community Support Moderator
Experts Exchange
0
moduloCommented:
Saving this Q as a PAQ and refunding the points to the questioner

modulo

Community Support Moderator
Experts Exchange
0
