nellster

Extract and split string

Hi all. I'm running IIS 6 with the .NET Framework, working in ASP.NET using Visual Basic .NET.

I have some "contacts" data imported from a CSV into an Access database. The troublesome field is the "Phone" field, as it contains both text and numbers, for example "Office phone: 0234 132432Cell phone: 07865 99999". What I need to do is extract the "Office phone" number into a separate "Office Phone" field, the same for "Cell Phone", and I think there are some "Office Fax" numbers as well.

I have performed string manipulation before but am a bit stumped by this.

Any help would be gratefully received!

Nellster
SOLUTION
Mike Tomlinson
Or try this:
    ' Add the following import at the top of the file:
    '
    ' Imports System.Text.RegularExpressions
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ExtractPhoneNos("Office phone: 0234 132432Cell phone: 07865 99999", "Office", "Phone")
        ExtractPhoneNos("Office phone: 0234 132432Cell phone: 07865 99999", "Cell", "Phone")
        ExtractPhoneNos("Office fax: 0234 132432Cell phone: 07865 99999", "Office", "Fax")
    End Sub
    Private Sub ExtractPhoneNos(ByVal Source As String, ByVal FirstWord As String, ByVal SecondWord As String)
        ' Match "<FirstWord> <SecondWord>:" followed by a run of digits and spaces (capture group 1).
        ' Regex.Escape guards against regex metacharacters in the label words.
        Dim oReg As New Regex(Regex.Escape(FirstWord) & "[ \t]+" & Regex.Escape(SecondWord) & "[ \t]*:([0-9 ]+)", RegexOptions.IgnoreCase)
        For Each oMat As Match In oReg.Matches(Source)
            ' Group 1 holds the number itself; Trim removes the surrounding spaces.
            Dim sNumber As String = oMat.Groups(1).Value.Trim()
            MsgBox(FirstWord & " " & SecondWord & " - " & sNumber)
            Debug.WriteLine(FirstWord & " " & SecondWord & " - " & sNumber)
        Next
    End Sub
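Since the question is about writing the numbers to database fields from ASP.NET, a variant that returns the value instead of showing it may be easier to reuse. The following is only a sketch under that assumption; the module name `PhoneExtractor` and the function `GetPhoneNo` are hypothetical names, not anything from the posts above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PhoneExtractor
    ' Hypothetical helper: returns the number that follows "<firstWord> <secondWord>:",
    ' or Nothing when that label does not occur in the source string.
    Function GetPhoneNo(ByVal source As String, ByVal firstWord As String, ByVal secondWord As String) As String
        Dim pattern As String = Regex.Escape(firstWord) & "[ \t]+" & Regex.Escape(secondWord) & "[ \t]*:([0-9 ]+)"
        Dim m As Match = Regex.Match(source, pattern, RegexOptions.IgnoreCase)
        If m.Success Then
            ' Group 1 is the run of digits and spaces after the colon.
            Return m.Groups(1).Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        Dim raw As String = "Office phone: 0234 132432Cell phone: 07865 99999"
        Console.WriteLine(GetPhoneNo(raw, "Office", "Phone")) ' prints 0234 132432
        Console.WriteLine(GetPhoneNo(raw, "Cell", "Phone"))   ' prints 07865 99999
    End Sub
End Module
```

A return value of Nothing lets the caller decide whether to leave the target field empty, which matters here because not every record has a fax number.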
nellster

ASKER

Thanks for all the replies. I've been away, so apologies for the late reply. I have been reading up on regular expressions, as they seem to be the way to go in terms of efficiency since I have a lot of strings to split (not that the other suggestions weren't useful, by the by, and I will be splitting some points!). I have managed to change the data itself to have a unique identifier; however, I am unsure of the pattern I need to apply to extract the various sections of the string. I'll lay it out below:

The string I have now is "~1~020 786 8989~2~020 6767 6767~3~673 7878 6766~4~098 7654 1234"
~1~ = the office phone
~2~ = fax
~3~ = Mobile
~4~ = Home phone

What I need to do is split the string up so I can write these new values to my database, but I am unsure of the pattern/logic to use.

Many thanks for all help!

Nellster
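One possible way to split the tagged string described above is a single pattern that captures each "~n~" marker together with the text that follows it. This is a minimal sketch, assuming the numbers themselves never contain a tilde; the module name `TaggedPhoneSplitter` and the function `SplitTagged` are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TaggedPhoneSplitter
    ' Hypothetical helper: maps each tag number (1 = office phone, 2 = fax,
    ' 3 = mobile, 4 = home phone) to the text that follows its ~n~ marker.
    Function SplitTagged(ByVal source As String) As Dictionary(Of Integer, String)
        Dim result As New Dictionary(Of Integer, String)
        ' ~(\d+)~ captures the tag number; ([^~]*) captures everything up to the next tilde.
        For Each m As Match In Regex.Matches(source, "~(\d+)~([^~]*)")
            result(Integer.Parse(m.Groups(1).Value)) = m.Groups(2).Value.Trim()
        Next
        Return result
    End Function

    Sub Main()
        Dim parts As Dictionary(Of Integer, String) = _
            SplitTagged("~1~020 786 8989~2~020 6767 6767~3~673 7878 6766~4~098 7654 1234")
        Console.WriteLine(parts(1)) ' prints 020 786 8989
        Console.WriteLine(parts(4)) ' prints 098 7654 1234
    End Sub
End Module
```

Keying the result by tag number means a record that lacks, say, a fax entry simply has no key 2, which can be checked with ContainsKey before writing to the database.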
ASKER CERTIFIED SOLUTION
Thanks for the points!
No probs, thanks for the input! Sorry for the delay!