Solved

Mid$ anomaly in VB5

Posted on 2014-03-11
13
223 Views
Last Modified: 2014-03-14
In parsing a long string of HTML code,  the Mid$ function does something odd when seeing a comma.  

x$= ">SMITH       , JOHN            <"

Mid$(x$,2,30) returns "SMITHJOHN           "

If I replace the comma with a period it properly returns,

 "SMITH       . JOHN            "

This intial position in the document is 40,000 characters from beginning.  When I test on just the small snippet above,  it works properly with comma in field.

Actual coding is:
        pos1 = InStr(pos, HTMLtext, "<table ")
        pos2 = InStr(pos1, HTMLtext, "</table>")
        pos2 = pos2 + 8
        CrewInfoHTML = Mid(HTMLtext, pos1, pos2 - pos1)
0
Comment
Question by:drleewood
  • 5
  • 4
  • 2
  • +2
13 Comments
 
LVL 24

Expert Comment

by:chaau
Comment Utility
I think you need to use this code:
        pos1 = InStr(pos, HTMLtext, "<table ")
        pos1 = pos1 + 8
        pos2 = InStr(pos1, HTMLtext, "</table>")
        CrewInfoHTML = Mid(HTMLtext, pos1, pos2 - pos1) 

Open in new window

0
 
LVL 45

Expert Comment

by:aikimark
Comment Utility
please post an instance of the HTML you are parsing and post the VB statements you are using to get the HTML
0
 
LVL 45

Expert Comment

by:Martin Liss
Comment Utility
There's no problem with your code in VB6 but if the comma is a problem in VB5 then you can do something like this.

x$ = ">SMITH       , JOHN            <"
x$ = Replace(x$, ",", "|")
x$ = Replace(x$, "|", ",")

Open in new window


I just remembered that VB5 doesn't have VB6's Replace function so you can add this clone that I found on DevX.

' A clone of the VB6's Replace function for use under VB5

Function Replace(Source As String, Find As String, ReplaceStr As String, _
    Optional ByVal Start As Long = 1, Optional Count As Long = -1, _
    Optional Compare As VbCompareMethod = vbBinaryCompare) As String

    Dim findLen As Long
    Dim replaceLen As Long
    Dim index As Long
    Dim counter As Long
    
    findLen = Len(Find)
    replaceLen = Len(ReplaceStr)
    ' this prevents an endless loop
    If findLen = 0 Then Err.Raise 5
    
    If Start < 1 Then Start = 1
    index = Start
    
    ' let's start by assigning the source to the result
    Replace = Source
    
    ' if Find and ReplaceStr strings have same length, it is possible to
    ' use an optimized algorithm, based on the Mid$ command
    Do
        index = InStr(index, Replace, Find, Compare)
        If index = 0 Then Exit Do
        If findLen = replaceLen Then
            ' if the find and replace strings have same length
            ' we can use the faster Mid$ command
            Mid$(Replace, index, findLen) = ReplaceStr
        Else
            ' else we must use concatenation
            Replace = Left$(Replace, index - 1) & ReplaceStr & Mid$(Replace, _
                index + findLen)
        End If
        ' skip over the string just added
        index = index + replaceLen
        ' increment the replacement counter
        counter = counter + 1
        ' Note that the Loop Until test will always fail if Count = -1
    Loop Until counter = Count
    
    ' The next operation serves to keep complete compatibility with
    ' VB6's Replace function. You can delete it if you prefer.
    If Start > 1 Then Replace = Mid$(Replace, Start)

End Function

Open in new window

0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
I fear that the issue is with the length of 40K ...
in VB5, the default data type for integer was +-32000, so you may try to use a bigger data type than integer for the start/length ...
0
 

Author Comment

by:drleewood
Comment Utility
1. pos, pos1 and po2 are all declared as Long
2. I cut the document to a much smaller size with same result.
3. Would prefer not to find and replace " , " if it can be avoided.
4. HTML document code is attached.  The position of "<Table " is 46,407.
Pairing-Information.htm
0
 
LVL 45

Accepted Solution

by:
aikimark earned 250 total points
Comment Utility
I think the problem is in the file format, rather than the Mid() or Instr() functions.
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 45

Expert Comment

by:aikimark
Comment Utility
Since this file is unicode, you need to use InStrB() for your string searching.  Likewise, you will probably need to use MidB().
0
 

Author Comment

by:drleewood
Comment Utility
I believe the problem is getting all the characters from the document into the variable.  I am using this code:

Open "c:\Pairing Information.htm" For Input As #f
Do While Not EOF(f)
    Input #f, A$
    HTMLdoc = HTMLdoc & A$
Loop
Close #f

I have found the resulting HTMLdoc does not have the those commas.  How do I input the data from the document file so it is accurate?
0
 

Author Comment

by:drleewood
Comment Utility
I have found using:

Line Input #f, A$

instead of:

Input #f, A$

preserves the commas.  Any better ideas as it may have some other effects?
0
 
LVL 45

Expert Comment

by:aikimark
Comment Utility
Here is an example of code that reads the unicode HTML file you posted and parses using the InstrB() and MidB() functions.
Option Explicit

Public Sub Q_28386246()
    Dim oFS As Object
    Dim oTS As Object
    Dim HTMLtext As String, CrewInfoHTML As String
    Dim pos1 As Long, pos2 As Long
    Set oFS = CreateObject("scripting.filesystemobject")
    Set oTS = oFS.opentextfile("C:\Users\Mark\Downloads\Pairing-Information.htm", 1, 1)
    HTMLtext = oTS.readall
    oTS.Close
    HTMLtext = StrConv(HTMLtext, vbFromUnicode)
    pos1 = 1
    pos1 = InStrB(pos1, HTMLtext, "<table ")
    pos2 = InStrB(pos1, HTMLtext, "</table>")
    pos2 = pos2 + 8
    CrewInfoHTML = MidB(HTMLtext, pos1, pos2 - pos1)
    Debug.Print Len(HTMLtext)
    Debug.Print CrewInfoHTML
End Sub

Open in new window

0
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 250 total points
Comment Utility
Line Input:
http://msdn.microsoft.com/en-us/library/aa243392%28v=vs.60%29.aspx
vs Input:
http://msdn.microsoft.com/en-us/library/aa243386%28v=vs.60%29.aspx

visibly, the input statement does things which I didn't know (remember), about "parsing" the data, compared to the way the data was written, and indeed it may "skip" the comma.
so, please switch to the Line Input, the only important point to note there is that the carriage return / line feed characters are NOT put to the variable. if you need them, you need to do:
 HTMLdoc = HTMLdoc & A$ & vbCrlf
0
 

Author Comment

by:drleewood
Comment Utility
Thank you all for your input.  I am afraid the problem was not in the mid$ function but rather in the reading of the HTML code after it was saved to a file.  The project actually reads the HTML source code directly from the browser control and may eliminate anomaly of the comma.  The reading of the file was for testing purposes only.  The Line Input seems to have solved that problem.  I am going to close this issue for now.  Thank you.
0
 
LVL 45

Expert Comment

by:aikimark
Comment Utility
What about my code?
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

This article describes some techniques which will make your VBA or Visual Basic Classic code easier to understand and maintain, whether by you, your replacement, or another Experts-Exchange expert.
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
Get people started with the process of using Access VBA to control Excel using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Excel. Using automation, an Access application can laun…
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now