?
Solved

Can I use the Unicode variants of WININET API from VB6?

Posted on 2005-03-16
7
Medium Priority
?
1,134 Views
Last Modified: 2010-08-05
I have succesfully used the ?????A variants of the WININET API to scrape HTML page content from the web.
Here's the relevant DLL function declarations from my BAS file - the code to use these calls is widely available everywhere.

Private Declare Function InternetOpen Lib "wininet" Alias "InternetOpenA" ( _
   ByVal lpszAgent$, _
   ByVal dwAccessType&, _
   ByVal lpszProxyName$, _
   ByVal lpszProxyBypass$, _
   ByVal dwFlags&) As Long

Private Declare Function InternetConnect Lib "wininet" Alias "InternetConnectA" ( _
   ByVal internet_handle&, _
   ByVal lpszServerName$, _
   ByVal nServerPort&, _
   ByVal lpszUserName$, _
   ByVal lpszPassword$, _
   ByVal dwService&, _
   ByVal dwFlags&, _
   ByVal context_word&) As Long

Private Declare Function HttpOpenRequest Lib "wininet" Alias "HttpOpenRequestA" ( _
    ByVal hHttpSession&, _
    ByVal lpszVerb$, _
    ByVal lpszObjectName$, _
    ByVal lpszVersion$, _
    ByVal lpszReferer$, _
    ByVal lpszAcceptTypes$, _
    ByVal dwFlags&, _
    ByVal dwContext&) As Long

Private Declare Function HttpSendRequest Lib "wininet" Alias "HttpSendRequestA" ( _
    ByVal hHttpRequest&, _
    ByVal lpszHeaders$, _
    ByVal dwHeadersLength&, _
    ByVal lpOptional$, _
    ByVal dwOptionalLength&) As Boolean

Private Declare Function InternetReadFile Lib "wininet" ( _
    ByVal hFile&, _
    ByVal lpBuffer$, _
    ByVal dwNumberOfBytesToRead&, _
    ByRef lpNumberOfBytesRead&) As Boolean

The trouble is that when I acquire HTML source that is encoded in UTF-8 Unicode, it does not return the correct content.
I assume that the variants of the api type ????????W could be used as an alternative, but I can't get any instance I have seen to work.
Can anyone supply source examples?


0
Comment
Question by:PECollingwood
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
7 Comments
 
LVL 38

Expert Comment

by:PaulHews
ID: 13558090
Can you post an example URL where the content is encoded and you are not returning the desired results?
0
 

Author Comment

by:PECollingwood
ID: 13564359
Hi,
this URL is a prime example:

http://babelfish.altavista.com/babelfish/tr?tt=urltext&url=x&lp=en_pt&doit=done&urltext=dog

This is enquiring for a Portuguese translation of th word 'dog' using AltaVista's BabelFish portal.
If you paste this URL into your IE who'll see the translation is 'cao', with the letter a capped with a tilde (~).

If you attempt to acquire the HTML content using a call like this:

DownloadURLToString "babelfish.altavista.com", "/babelfish/tr?tt=urltext&url=x&lp=en_pt&doit=done&urltext=dog"

to the code below, the text content representing the translated word is different.
This can be proved by saving the string to a .html file and then browsing this using IE.

What's going on?

Thanks,
Paul

---------------------------------------------------------------------------------------------------------------------------------------------

Option Explicit

Private Const INTERNET_OPEN_TYPE_DIRECT As Long = 1

Private Const INTERNET_SERVICE_HTTP  As Long = 3

Private Const INTERNET_FLAG_NO_COOKIES As Long = &H80000
Private Const INTERNET_FLAG_NO_CACHE_WRITE As Long = &H4000000
Private Const INTERNET_FLAG_RELOAD = &H80000000

Private Declare Function InternetOpen Lib "wininet" Alias "InternetOpenA" ( _
   ByVal lpszAgent$, _
   ByVal dwAccessType&, _
   ByVal lpszProxyName$, _
   ByVal lpszProxyBypass$, _
   ByVal dwFlags&) As Long

Private Declare Function InternetCloseHandle Lib "wininet" ( _
   ByVal hEnumHandle&) As Long
     
Private Declare Function InternetConnect Lib "wininet" Alias "InternetConnectA" ( _
   ByVal internet_handle&, _
   ByVal lpszServerName$, _
   ByVal nServerPort&, _
   ByVal lpszUserName$, _
   ByVal lpszPassword$, _
   ByVal dwService&, _
   ByVal dwFlags&, _
   ByVal context_word&) As Long

Private Declare Function HttpOpenRequest Lib "wininet" Alias "HttpOpenRequestA" ( _
    ByVal hHttpSession&, _
    ByVal lpszVerb$, _
    ByVal lpszObjectName$, _
    ByVal lpszVersion$, _
    ByVal lpszReferer$, _
    ByVal lpszAcceptTypes$, _
    ByVal dwFlags&, _
    ByVal dwContext&) As Long

Private Declare Function HttpSendRequest Lib "wininet" Alias "HttpSendRequestA" ( _
    ByVal hHttpRequest&, _
    ByVal lpszHeaders$, _
    ByVal dwHeadersLength&, _
    ByVal lpOptional$, _
    ByVal dwOptionalLength&) As Boolean

Private Declare Function InternetQueryDataAvailable Lib "wininet" ( _
    ByVal hFile&, _
    ByRef lpdwNumberOfBytesAvailable&, _
    ByVal dwFlags&, _
    ByVal dwContext&) As Boolean

Private Declare Function InternetReadFile Lib "wininet" ( _
    ByVal hFile&, _
    ByVal lpBuffer$, _
    ByVal dwNumberOfBytesToRead&, _
    ByRef lpNumberOfBytesRead&) As Boolean


Public Function DownloadURLToString(ByVal server_url$, ByVal path_url$, ByRef return_string$) As Boolean
   Dim open_handle&, connect_handle&, request_handle&, result As Boolean, bytes_read&, total_string$
   Dim string_buffer As String * 1024
   
   On Error Resume Next
   
   DownloadURLToString = False
   
   open_handle = InternetOpen(App.Title, _
                              INTERNET_OPEN_TYPE_DIRECT, _
                              vbNullString, vbNullString, 0)
   
   If open_handle Then
   
      connect_handle = InternetConnect(open_handle, _
                                       server_url, _
                                       80, _
                                       "", "", _
                                       INTERNET_SERVICE_HTTP, _
                                       0, 0)
      If connect_handle Then
     
         request_handle = HttpOpenRequest(connect_handle, _
                                          "GET", _
                                          path_url, _
                                          "HTTP/1.0", vbNullString, vbNullString, _
                                          INTERNET_FLAG_NO_COOKIES Or INTERNET_FLAG_NO_CACHE_WRITE Or INTERNET_FLAG_RELOAD, _
                                          0)

         If request_handle Then

            result = HttpSendRequest(request_handle, vbNullString, 0, vbNullString, 0)
   
            If result Then
               
               Do
                  DoEvents
                  result = InternetReadFile(request_handle, string_buffer, Len(string_buffer), bytes_read)
                 
                  If result Then
                     If bytes_read > 0 Then
                        total_string = total_string & Left$(string_buffer, bytes_read)
                     End If
                  Else
                     Exit Do
                  End If
                 
               Loop While bytes_read > 0
   
               If result Then
                  return_string = total_string
               
                  DownloadURLToString = True
               End If
               
            End If
   
            InternetCloseHandle request_handle
         
         End If
   
         InternetCloseHandle connect_handle
         
      End If

      InternetCloseHandle open_handle

   End If
   
End Function


0
 
LVL 38

Expert Comment

by:PaulHews
ID: 13565480
You can encode the string correctly as UTF8 before writing it to disk:

Option Explicit
Private Declare Function WideCharToMultiByte Lib "kernel32" _
    (ByVal CodePage As Long, _
     ByVal dwFlags As Long, _
     ByVal lpWideCharStr As Long, _
     ByVal cchWideChar As Long, _
     ByRef lpMultiByteStr As Any, _
     ByVal cchMultiByte As Long, _
     ByVal lpDefaultChar As String, _
     ByVal lpUsedDefaultChar As Long) As Long
Const CP_UTF8 = 65001



Private Sub Command1_Click()
    Dim strText As String
    Dim bytBuffer() As Byte
    Dim lngLen As Long
   
    'strText = OpenURL("http://babelfish.altavista.com/babelfish/tr?tt=urltext&url=x&lp=en_pt&doit=done&urltext=dog")
    DownloadURLToString "babelfish.altavista.com", "/babelfish/tr?tt=urltext&url=x&lp=en_pt&doit=done&urltext=dog", strText
   
    bytBuffer = Utf8Encode(strText)
    If UBound(bytBuffer) > -1 Then
        Open "C:\test\test.html" For Binary As #1
        Put #1, , bytBuffer
        Close
    End If
   
   
End Sub

Public Function Utf8Encode(ByRef strUnicode As String) As Byte()
    Dim lngRes As Long
    Dim b() As Byte
    Dim lngUnicodeLen As Long
    Dim lngBufferSize As Long
    ReDim b(-1 To -1)
   
    lngUnicodeLen = Len(strUnicode)
    If lngUnicodeLen > 0 Then
        lngBufferSize = lngUnicodeLen * 3 + 1  'ensure adequate space for UTF wide char
        ReDim b(lngBufferSize - 1)
        lngRes = WideCharToMultiByte(CP_UTF8, 0, StrPtr(strUnicode), lngUnicodeLen, b(0), lngBufferSize, vbNullString, 0)
        If lngRes Then
            'trim to correct length
            ReDim Preserve b(lngRes - 1)
        End If
        Utf8Encode = b
    End If
   
End Function


0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 38

Expert Comment

by:PaulHews
ID: 13565517
Sorry, change this:

Public Function Utf8Encode(ByRef strUnicode As String) As Byte()
    Dim lngRes As Long
    Dim b() As Byte
    Dim lngUnicodeLen As Long
    Dim lngBufferSize As Long
    ReDim b(-1 To -1)
   
    lngUnicodeLen = Len(strUnicode)
    If lngUnicodeLen > 0 Then
        lngBufferSize = lngUnicodeLen * 3 + 1  'ensure adequate space for UTF wide char
        ReDim b(lngBufferSize - 1)
        lngRes = WideCharToMultiByte(CP_UTF8, 0, StrPtr(strUnicode), lngUnicodeLen, b(0), lngBufferSize, vbNullString, 0)
        If lngRes Then
            'trim to correct length
            ReDim Preserve b(lngRes - 1)
        End If
    End If
    Utf8Encode = b
End Function
0
 

Author Comment

by:PECollingwood
ID: 13566953
PaulHews,
Ok that's sorted out the Portuguese - can you pinpoint the problem here?

This URL:

http://babelfish.altavista.com/babelfish/tr?tt=urltext&url=x&lp=en_ko&doit=done&urltext=dog

This is enquiring for a Korean translation of th word 'dog' using AltaVista's BabelFish portal.

I'm getting garbage for this - even with your suggested solution. IE of course, shows Korean alphabet symbols correctly.

Can you confirm that you are as well?
Any ideas how I fix this?
0
 
LVL 38

Accepted Solution

by:
PaulHews earned 500 total points
ID: 13567531
I've noticed that the WinInet functions often return different results compared to what I see in IE.  I've never tried the "w" version of the API functions, as I've never seen them documented anywhere.

However, when I need to match exactly what I get in IE, I use the webbrowser control to download the page.

Add "Microsoft Internet Controls" to your toolbox, and drag a webbrowser control onto the form and name it wb.  Then you can use this code to download the same page:

Private Sub Command2_Click()
    wb.Navigate "http://babelfish.altavista.com/babelfish/tr?tt=urltext&url=x&lp=en_ko&doit=done&urltext=dog"
End Sub


Private Sub wb_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    If pDisp Is wb.Object Then
        Open "C:\Documents and Settings\Administrator.PAULHOME\Desktop\test.html" For Output As #1
        Print #1, ""
        Print #1, "<HTML><HEAD>"  'can only retrieve body, so we need to reconstruct a minimal version of HTML head section.
        Print #1, "<TITLE>AltaVista - Babel Fish Translation - Translated Text</TITLE>"
        Print #1, "<meta http-equiv=""content-type"" content=""text/html; charset=UTF-8""></HEAD><BODY>"
        Print #1, wb.Document.body.innerHTML
        Print #1, "</BODY></HTML>"
        Close
    End If
End Sub

0
 

Author Comment

by:PECollingwood
ID: 13600416
PaulHews,
thanks - solution found using your help!
0

Featured Post

On Demand Webinar: Networking for the Cloud Era

Did you know SD-WANs can improve network connectivity? Check out this webinar to learn how an SD-WAN simplified, one-click tool can help you migrate and manage data in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

There are many ways to remove duplicate entries in an SQL or Access database. Most make you temporarily insert an ID field, make a temp table and copy data back and forth, and/or are slow. Here is an easy way in VB6 using ADO to remove duplicate row…
Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question