Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 580
  • Last Modified:

Insert HTML code with special characters into Access DB or Add as a Text file

I have an application that reads the html code from a web page and adds the whole html code into Access Database.
The problem is that some special characters like ÛÛÛ are converted to ??? when inserted into DB.
But when I copy/paste a webpage with special characters directly into DB there is no problem.

Is there another way to solve this? Perhaps by insterting the .txt or .htm file into DB, if Access supports it?

Almost complete code:
Dim Httpobj As MSXML2.XMLHTTP
Set Httpobj = New MSXML2.XMLHTTP
Dim oConn As ADODB.Connection
Dim oRs As ADODB.Recordset
Dim sSql As String
Dim sSql2 As String
Set oConn = New ADODB.Connection
oConn.Open ("DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=" & "d:/documents.mdb")
sSql = "SELECT * FROM doc"
Set oRs = oConn.Execute(sSql)

Do While Not oRs.EOF
link = oRs("link")
id = oRs("id")

With Httpobj
.Open "GET", link, False
.send
niz = .responseText
End With

sSql2 = "UPDATE doc SET nfo = '" & niz & "' WHERE ID = " & id & ""
Set oRs2 = oConn.Execute(sSql2)
oRs.MoveNext
Loop
0
Vampir
Asked:
Vampir
  • 11
  • 7
  • 2
  • +3
1 Solution
 
ASPGuruCommented:
try to open the recordset and then do this:
rs("nfo") = Httpobj.responseText

ASPGuru
0
 
jitgangulyCommented:
Have you tried with server.htmlencode ?

Like

sSql2 = "UPDATE doc SET nfo = '" & server.htmlencode(niz) & "' WHERE ID = " & id & ""

0
 
VampirAuthor Commented:
Neither works, the problem is with niz = .responseText.
The problem is that .responseText converts the text and I need to keep it "binary".
But since I dont have a clue about encoding.

I found some info on http://dbforums.com/arch/195/2003/2/676676

And there is another link on expert-excange, but cant seem to find it again. Its listed in the XML section, I should probably ask my question there.
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
jitgangulyCommented:
0
 
ASPGuruCommented:
well... have a look at the xmlhttp object... there's not only responseText but also a binary property... don't know the exact name right now...

ASPGuru
0
 
AlfaNoMoreCommented:
It's got to be an encoding issue? You're viewing html pages (or html output), so it MUST be ASCII. So you need to know what the source encoding is, and use this when you right into your database. Trouble is, I have no idea how you'd do this?? :-)
0
 
ASPGuruCommented:
no... it doesn't need to be ascii...

ASPGuru
0
 
AlfaNoMoreCommented:
really? Shows what I know. html is text though? yes.
0
 
BreedjCommented:
The characters are mapped against the wrong character map.

before calling Httpobj.open set the code page to UTF-8 with SetRequestHeader. like: Httpobj.SetRequestHeader "Content-Type", "text/html;Charset=UTF-8"



Also set the codepage to utf-8 in the asp page you are executing this code using Session.CodePage=65001 and Session.Charset = "UTF-8"

Httpobj.SetRequestHeader "Content-Type", "text/html;Charset=utf-8"
0
 
VampirAuthor Commented:
First of I must admit that I currently testing the application as Visual Basic application.

niz = .responseStream produces an error

ASPGuru : i think that the binary method is .responseBody, see the solution i came with below, but it doesnt work with niz = .responseBody

Breedj(your solution seems to be the closest, but still an error):
Unspecified error in:
Httpobj.SetRequestHeader "Content-Type", "text/html;Charset=UTF-8"


Now here is what I got from the link above (my post) that works, but it first writes to txt file then reads from txt. There is also a link on what site I'm testing. Writing directy to Access DB as below doesnt work either.

Sub test()
sFile = "c:\test.txt"
sURL = "http://www.nforce.nl/nfos/clear_txt.php?id=27651"

Set objXMLHTTP = CreateObject("MSXML2.serverXMLHTTP.4.0")

objXMLHTTP.Open "GET", sURL, False
objXMLHTTP.send

Set strm1 = CreateObject("adodb.stream")
With strm1
.Type = 1
.Open
.Write objXMLHTTP.responseBody
.SaveToFile sFile, 2 ' adSaveCreateOverWrite
.Close
End With

Set strm2 = CreateObject("adodb.stream")
With strm2
.Type = 2
.Charset = "euc-kr" 'Use any proper charset
.Open
.LoadFromFile "c:\test.txt"
MsgBox .ReadText
Text1.Text = .ReadText
.Close
End With
End Sub
0
 
VampirAuthor Commented:
Sorry for bad spelling above. I need a better solution without the use of saving to Text file.
0
 
ASPGuruCommented:
requestheaders have nothing to do with the response...

you don't need to write to a file... you can load the text into the stream directly... then go to the beginnig of the stream,  set the right charset and read it out again...


ASPGuru
0
 
VampirAuthor Commented:
Sorry for bad spelling above. I need a better solution without the use of saving to Text file.
0
 
VampirAuthor Commented:
Could you please post a working code?
0
 
ASPGuruCommented:
try something like this:


Set strm2 = CreateObject("adodb.stream")
With strm2
.Type = 2
.Charset = "euc-kr" 'Use any proper charset
.Open
objXMLHTTP.responseStream.copyTo(strm2)
MsgBox .ReadText
Text1.Text = .ReadText
.Close
End With
End Sub


ASPGuru
0
 
VampirAuthor Commented:
Could you please post a working code?
0
 
VampirAuthor Commented:
Refresh...

Sorry ASPGuru but your code doesnt seem to work:
Error is Method Required at: objXMLHTTP.responseStream.CopyTo (strm2)
there doesnt seem to be sub method .CopyTo.

Here is the code is used (I changed MSXML a little, so it gives better error reporting)

Sub test()
sFile = "c:\test3.txt"
sURL = "http://www.nforce.nl/nfos/clear_txt.php?id=27651"

Dim objXMLHTTP As MSXML2.ServerXMLHTTP40
Set objXMLHTTP = New MSXML2.ServerXMLHTTP40

objXMLHTTP.Open "GET", sURL, False
objXMLHTTP.send

Set strm2 = CreateObject("adodb.stream")
With strm2
.Type = 2
.Charset = "euc-kr" 'Use any proper charset
.Open
objXMLHTTP.responseStream.CopyTo (strm2)
MsgBox .ReadText
Text1.Text = .ReadText
.Close
End With
End Sub
0
 
VampirAuthor Commented:
Btw I've also decided to just use text files instead inserting text in Access DB (too much trouble anyway).
Not to mention that still not every character is recognized, is there a Charset table for all characters? I've tried iso-8859-1.
I'll still give points for a working example for my previous question.
0
 
ASPGuruCommented:
ok... then i won't bother to find the db solution..

you need to use the encoding of the html file...
this can be different for any file...

ASPGuru
0
 
VampirAuthor Commented:
All files are plain text similar to http://www.nforce.nl/nfos/clear_txt.php?id=27651.
What encoding should I use?
0
 
VampirAuthor Commented:
Remove . at the end of the link
0
 
ASPGuruCommented:
well... it depends... i can't tell you...
the characters with code 0 - 127 are equal for all ascci encodings...
just for characters above it depends...

oh... i just had a look at the text...
with which characters do you have problems?
NFOs normally use the DOS-charset, which also includes this frame graphics...

ASPGuru
0
 
VampirAuthor Commented:
When creating a text file using the following method:
.Type = 1 'means binary and doesnt allow charset'
.Open
.Write objXMLHTTP.responseBody 'also binary'
.SaveToFile sFile, 2 ' adSaveCreateOverWrite

character (AND) & is converted to &
There is also >, <, "
Possible others too.

Problem can be solved by editing all files and searching for those strings and replacing them with characters (with an application of course).
0
 
moduloCommented:
Dear expert(s),

A request has been made to close this Q in CS:
http://www.experts-exchange.com/Community_Support/Q_20562615.html

Without a response in 72 hrs, a moderator will finalize this question by:

 - Saving this Q as a PAQ and refunding the points to the questionner

When you agree or disagree, please add a comment here.

Thank you.

modulo

Community Support Moderator
Experts Exchange
0
 
moduloCommented:
Saving this Q as a PAQ and refunding the points to the questionner

modulo

Community Support Moderator
Experts Exchange
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 11
  • 7
  • 2
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now