We help IT Professionals succeed at work.

Splitting a variant string

DarrinE
DarrinE asked
on
I need to split a variant string down to each part that it consists without knowing exactly how pieces there are to the puzzle

The following is the code I presently use which limited the breakdown of the string to 5 pieces:


What am I doing wrong ??

Many thanks in advance

DarrinE

Dim iVnt as variant
Dim strParse() As String

        If Len(OPData(2).Text) > 0 Then
         iVnt = Split(OPData(2).Text, vbCrLf)
         
         ReDim strParse(0)
             
              If Not IsMissing(iVnt) Then
                  If VarType(iVnt) = 8200 Or VarType(iVnt) = 8204 Then
                      'there are multiple entries
                      'if it is an array of strings vbString + VbArray or 82000
                      'or it is an array of variant type vbVariant + vbArray or 8204
                      ReDim strParse(UBound(iVnt) + 1)
                      For i = LBound(iVnt) To UBound(iVnt)
                          If iVnt(i) <> "" Then strParse(i) = iVnt(i)
'                          Debug.Print strParse(i)
                      Next
                  Else
                      'It should just be one string
                      If VarType(iVnt) = vbString Then strParse(0) = iVnt(0)
                  End If
              End If
Comment
Watch Question

Darrin,
   1) I see nothing that would inherently " limit the breakdown of the string to 5 pieces".
   2) the Split function by itself, would AUTOMATICALLY create the necesary array, without requiring you to do the additional coding:

Dim strParse() As String

   If Len(OPData(2).Text) > 0 Then
        strParse = Split(OPData(2).Text, vbCrLf)
   end if


in the code that you are using, VarType() = 8204 SHOULD not be possible, and Split WILL ALWAYS create an array of Strings. (VarType = 8200)

from the VB help system, on the Function Split:

Returns a zero-based, one-dimensional array containing a specified number of substrings.

Note that what is returned is ALWAYS SubStrings, never VARIANTS.

Commented:
Arthur Wood's comments are all correct. However, if you are trying to filter out "empty lines" from the array then you will require a second array. This is an example of such (based on your code):

    Dim iVnt As Variant
    Dim strParse() As String
    Dim iCnt As Integer

    If Len(OPData(2).Text) > 0 Then
       
        iVnt = Split(OPData(2).Text, vbCrLf)
       
        For i = 0 To UBound(iVnt)
            If iVnt(i) <> "" Then
               ReDim Preserve strParse(iCnt)
               strParse(iCnt) = iVnt(i)
               'Debug.Print strParse(iCnt)
               iCnt = iCnt + 1
           End If
        Next
       
    End If

Commented:
It seems that these lines are incorrect:

        iVnt = Split(OPData(2).Text, vbCrLf)
        ReDim strParse(0)
        If Not IsMissing(iVnt) Then

and should be

    strParse = Split(OPData(2).Text, vbCrLf)
    if UBound(strParse) >= 0 then
       iVnt = UBound(strParse)

Author

Commented:
Hi Guys sorry it has taken so long for me to get back to you ... busy day

What I am trying to do is create a function similar to that which outlook uses to get the address information or the name information ... outlook always seems to work out which is the city state and postcode info (can you tell I'm from Australia) - as well as differentiate between there being one or two lines in the street name

for instance

123 MyStreet Court
Brisbane 4001
Australia

and

Level 1
123 MyStreet Court
Brisbane Australia
4001

are sorted properly by outlook into street, city postcode and state - the differences being of course the location of the postcode and the number of lines in the street

jqv comment seems to be getting towards what I am looking at ... but I am still not sure

I have increased the points ..

DarrinE
jgv

Commented:
>> "outlook always seems to work out which is the city state and postcode info"

Actually, this information is entered into seperate textboxes in Outlook. These fields are then concatenated for "quick viewing" (such as <city>, <state>). If you are collecting this information from a VB form then you should also be collecting it in seperate textboxes NOT in one textbox. It would be a nightmare to try and sort everything out....

Author

Commented:
no ... sorry thats not right ... if you enter a new contact in Outlook (not outlook express) you enter the address in one textbox, if outlook is uncertain on the formatting then a window opens with sevral textboxes to enter the information correctly - it then concatenates them into one textbox

Try the name - its entered into one textbox and if you click on the "Full Name" button - it almost always gets the breakdown right including prefix and suffix

For the moment I am entering info into several textboxes - because you're right it is a nightmare to sort out ... but how does MS do it ??

DarrinE

jgv

Commented:
When I enter a name in Outlook (not express) there is only basic checks made. If you enter "Mr. Bob Smith" then, yes, it will format it correctly. If you enter "Smith Mr. Bob" then Outlook sees "Smith Mr." as the Title and "Bob" as the last name. There is an assumption that you will enter a name in the proper order.

It would seem that the determining factor is the spaces in the full name entry. If you enter "Billy Joe Smith" then it is broken down into First/Middle/Last name. If you enter "Billy-Joe Smith" then it is First/Last name.

A person's title should always be entered first so you can check the first block of text (before a space) for the common titles (Mr, Ms, Mrs, etc.). The same logic applies for the suffix except that it is assumed to be at the end.

This example should demonstrate some of this logic. It needs testing and does not have any commenting. If you find problems or have any questions, let me know.

Start a new project and place a command button and a textbox on the form. Copy/Paste this code and run.

Private Sub Command1_Click()
    Dim vName As Variant
    Dim Title As String
    Dim First As String
    Dim Middle As String
    Dim Last As String
    Dim Suffix As String
    Dim iLower As Integer
    Dim iUpper As Integer
       
    If Len(Trim(Text1.Text)) > 0 Then
       
        vName = Split(Text1.Text, " ")
       
        iLower = LBound(vName)
        iUpper = UBound(vName)
       
        If IsTitle(CStr(vName(0))) Then
            Title = vName(0)
            iLower = 1
        End If
       
        If IsSuffix(CStr(vName(UBound(vName)))) Then
            Suffix = vName(UBound(vName))
            iUpper = UBound(vName) - 1
        End If
       
        Select Case iUpper - iLower
                Case 0
                    First = vName(iLower)
                Case 1
                    First = vName(iLower)
                    Last = vName(iLower + 1)
                Case 2
                    First = vName(iLower)
                    Middle = vName(iLower + 1)
                    Last = vName(iLower + 2)
                Case 3
                    First = vName(iLower) & " " & vName(iLower + 1)
                    Middle = vName(iLower + 2)
                    Last = vName(iLower + 3)
                Case 4
                    First = vName(iLower) & " " & vName(iLower + 1) & " " & vName(iLower + 2)
                    Middle = vName(iLower + 3)
                    Last = vName(iLower + 4)
                Case 5
                    First = vName(iLower) & " " & vName(iLower + 1) & " " & vName(iLower + 2) & " " & vName(iLower + 3)
                    Middle = vName(iLower + 4)
                    Last = vName(iLower + 5)
        End Select
                   
    End If
   
    MsgBox "Title = " & Title & vbCrLf & _
           "First = " & First & vbCrLf & _
           "Middle = " & Middle & vbCrLf & _
           "Last = " & Last & vbCrLf & _
           "Suffix = " & Suffix
End Sub

Private Function IsTitle(strString As String) As Boolean
    Select Case strString
        Case "Dr.", "Miss", "Mr.", "Mrs.", "Ms.", "Prof."
            IsTitle = True
    End Select
End Function

Private Function IsSuffix(strString As String) As Boolean
    Select Case strString
        Case "I", "II", "III", "Jr.", "Sr."
            IsSuffix = True
    End Select
End Function

Author

Commented:
I've increased the points because, if you dont mind, we'll stay with this .. until I have found what I need .. I'll test you code ... needless to say which ever way it goes you get the points when we're finished

If you promise to stay with me on this I will give the points now ... but my experience is that once the points are awarded interest is lost

what do you think ??

DarrinE

Author

Commented:
If I write a class to do what we want .. are you interested in continuing this by private email ??

If you look at my expert points you'll see I do bugger all in VB mostly low level windows stuff - mainly C - which not many people do these days

DarrinE
jgv

Commented:
DarrinE,

At the moment I have 3 large projects that I am involved with and as much as I'd like to, I cannot commit any serious time to another side project. If you have questions or need help on certain code issues I'd be willing to follow up and give you a hand. In the meantime, I think that you need to step back and put together a working spec on what you are trying to accomplish. This will give you a reference point as to the logic required and the technique you will use to implement it. Writing a large code component without a working document can be difficult and quickly becomes spaghetti code as you "patch" in additional sections.

In this case you need to decide how robust the name parsing is going to be. The example I gave you reproduces some of Outlook's behavior but not all. If you want to completely emulate Outook's style then the example would obviously have to be altered and possibly have to be rebuilt using different logic flow.

So....the first thing you should do is open Outlook and start typing in different name formats and recording how Outlook splits them apart. Make sure you include situations where Outlook does not evaluate it correctly.

Raw Input               Title     First     Middle          Last     Suffix
John Smith                    John               Smith
John Adam Smith               John     Adam          Smith
Mr. John Smith               Mr.     John               Smith
Mr. John Adam Smith Jr.     Mr.     John     Adam          Smith     Jr.
Smith Jr., John                    John               Smith     Jr.
etc, etc...

Once you have this information you can begin working out some pseudo code on paper to help you determine the best method for evaluating the string.

If you find that you need help with the code and want someone to collaborate with then you at least have a detailed document that outlines what you are trying to do.

Jim

Author

Commented:
I am grateful for the additonal information - I'll keep posting here from time to time like I said and we can keep it going ... hope all goes well

DarrinE
jgv

Commented:
no sweat :)

Commented:
Just as an alternate idea...

I recently had a similar project where I had to use A.I. to determine what various pieces of an address file meant.  Since I was working with SQL*Server, I added some new tables to support all current cities, states, zipcodes, phone number area codes, etc.   I then validated each address piece against the tables.

Although I didn't have to deal with city validation, you could probably do something like compare each word against the city table, and the following word against a state/region/country table.

Actually, as jgv suggests, you first need to work out a gameplan, but having some ideas in place can help you decide which gameplan to use.

With my variation, you might want to match each word against the country table first, then work your way backwards.  Obviously, if you don't want to use table verification, then you'll need to build your A.I. in a different way.

Author

Commented:
the game plan is of course the main thing .. jgv has been immensely helpful ... your idea is good because I have already written a class to verify the city ... I'll have a crack at drawing a plan to get it underway properly .. interestes in staying in the discussion too ??

DarrinE