Solved

Mid$ anomaly in VB5

Posted on 2014-03-11
13
232 Views
Last Modified: 2014-03-14
In parsing a long string of HTML code,  the Mid$ function does something odd when seeing a comma.  

x$= ">SMITH       , JOHN            <"

Mid$(x$,2,30) returns "SMITHJOHN           "

If I replace the comma with a period it properly returns,

 "SMITH       . JOHN            "

This intial position in the document is 40,000 characters from beginning.  When I test on just the small snippet above,  it works properly with comma in field.

Actual coding is:
        pos1 = InStr(pos, HTMLtext, "<table ")
        pos2 = InStr(pos1, HTMLtext, "</table>")
        pos2 = pos2 + 8
        CrewInfoHTML = Mid(HTMLtext, pos1, pos2 - pos1)
0
Comment
Question by:drleewood
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 4
  • 2
  • +2
13 Comments
 
LVL 25

Expert Comment

by:chaau
ID: 39922611
I think you need to use this code:
        pos1 = InStr(pos, HTMLtext, "<table ")
        pos1 = pos1 + 8
        pos2 = InStr(pos1, HTMLtext, "</table>")
        CrewInfoHTML = Mid(HTMLtext, pos1, pos2 - pos1) 

Open in new window

0
 
LVL 45

Expert Comment

by:aikimark
ID: 39922647
please post an instance of the HTML you are parsing and post the VB statements you are using to get the HTML
0
 
LVL 47

Expert Comment

by:Martin Liss
ID: 39922666
There's no problem with your code in VB6 but if the comma is a problem in VB5 then you can do something like this.

x$ = ">SMITH       , JOHN            <"
x$ = Replace(x$, ",", "|")
x$ = Replace(x$, "|", ",")

Open in new window


I just remembered that VB5 doesn't have VB6's Replace function so you can add this clone that I found on DevX.

' A clone of the VB6's Replace function for use under VB5

Function Replace(Source As String, Find As String, ReplaceStr As String, _
    Optional ByVal Start As Long = 1, Optional Count As Long = -1, _
    Optional Compare As VbCompareMethod = vbBinaryCompare) As String

    Dim findLen As Long
    Dim replaceLen As Long
    Dim index As Long
    Dim counter As Long
    
    findLen = Len(Find)
    replaceLen = Len(ReplaceStr)
    ' this prevents an endless loop
    If findLen = 0 Then Err.Raise 5
    
    If Start < 1 Then Start = 1
    index = Start
    
    ' let's start by assigning the source to the result
    Replace = Source
    
    ' if Find and ReplaceStr strings have same length, it is possible to
    ' use an optimized algorithm, based on the Mid$ command
    Do
        index = InStr(index, Replace, Find, Compare)
        If index = 0 Then Exit Do
        If findLen = replaceLen Then
            ' if the find and replace strings have same length
            ' we can use the faster Mid$ command
            Mid$(Replace, index, findLen) = ReplaceStr
        Else
            ' else we must use concatenation
            Replace = Left$(Replace, index - 1) & ReplaceStr & Mid$(Replace, _
                index + findLen)
        End If
        ' skip over the string just added
        index = index + replaceLen
        ' increment the replacement counter
        counter = counter + 1
        ' Note that the Loop Until test will always fail if Count = -1
    Loop Until counter = Count
    
    ' The next operation serves to keep complete compatibility with
    ' VB6's Replace function. You can delete it if you prefer.
    If Start > 1 Then Replace = Mid$(Replace, Start)

End Function

Open in new window

0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39923350
I fear that the issue is with the length of 40K ...
in VB5, the default data type for integer was +-32000, so you may try to use a bigger data type than integer for the start/length ...
0
 

Author Comment

by:drleewood
ID: 39923772
1. pos, pos1 and po2 are all declared as Long
2. I cut the document to a much smaller size with same result.
3. Would prefer not to find and replace " , " if it can be avoided.
4. HTML document code is attached.  The position of "<Table " is 46,407.
Pairing-Information.htm
0
 
LVL 45

Accepted Solution

by:
aikimark earned 250 total points
ID: 39924059
I think the problem is in the file format, rather than the Mid() or Instr() functions.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39924240
Since this file is unicode, you need to use InStrB() for your string searching.  Likewise, you will probably need to use MidB().
0
 

Author Comment

by:drleewood
ID: 39924356
I believe the problem is getting all the characters from the document into the variable.  I am using this code:

Open "c:\Pairing Information.htm" For Input As #f
Do While Not EOF(f)
    Input #f, A$
    HTMLdoc = HTMLdoc & A$
Loop
Close #f

I have found the resulting HTMLdoc does not have the those commas.  How do I input the data from the document file so it is accurate?
0
 

Author Comment

by:drleewood
ID: 39924429
I have found using:

Line Input #f, A$

instead of:

Input #f, A$

preserves the commas.  Any better ideas as it may have some other effects?
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39924533
Here is an example of code that reads the unicode HTML file you posted and parses using the InstrB() and MidB() functions.
Option Explicit

Public Sub Q_28386246()
    Dim oFS As Object
    Dim oTS As Object
    Dim HTMLtext As String, CrewInfoHTML As String
    Dim pos1 As Long, pos2 As Long
    Set oFS = CreateObject("scripting.filesystemobject")
    Set oTS = oFS.opentextfile("C:\Users\Mark\Downloads\Pairing-Information.htm", 1, 1)
    HTMLtext = oTS.readall
    oTS.Close
    HTMLtext = StrConv(HTMLtext, vbFromUnicode)
    pos1 = 1
    pos1 = InStrB(pos1, HTMLtext, "<table ")
    pos2 = InStrB(pos1, HTMLtext, "</table>")
    pos2 = pos2 + 8
    CrewInfoHTML = MidB(HTMLtext, pos1, pos2 - pos1)
    Debug.Print Len(HTMLtext)
    Debug.Print CrewInfoHTML
End Sub

Open in new window

0
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 250 total points
ID: 39925724
Line Input:
http://msdn.microsoft.com/en-us/library/aa243392%28v=vs.60%29.aspx
vs Input:
http://msdn.microsoft.com/en-us/library/aa243386%28v=vs.60%29.aspx

visibly, the input statement does things which I didn't know (remember), about "parsing" the data, compared to the way the data was written, and indeed it may "skip" the comma.
so, please switch to the Line Input, the only important point to note there is that the carriage return / line feed characters are NOT put to the variable. if you need them, you need to do:
 HTMLdoc = HTMLdoc & A$ & vbCrlf
0
 

Author Comment

by:drleewood
ID: 39928275
Thank you all for your input.  I am afraid the problem was not in the mid$ function but rather in the reading of the HTML code after it was saved to a file.  The project actually reads the HTML source code directly from the browser control and may eliminate anomaly of the comma.  The reading of the file was for testing purposes only.  The Line Input seems to have solved that problem.  I am going to close this issue for now.  Thank you.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39929218
What about my code?
0

Featured Post

Salesforce Has Never Been Easier

Improve and reinforce salesforce training & adoption using WalkMe's digital adoption platform. Start saving on costly employee training by creating fast intuitive Walk-Thrus for Salesforce. Claim your Free Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Introduction While answering a recent question about filtering a custom class collection, I realized that this could be accomplished with very little code by using the ScriptControl (SC) library.  This article will introduce you to the SC library a…
Have you ever wanted to restrict the users input in a textbox to numbers, and while doing that make sure that they can't 'cheat' by pasting in non-numeric text? Of course you can do that with code you write yourself but it's tedious and error-prone …
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

734 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question