Solved

VBSCRIPT - EXTRACT EMAIL ADDRESS FROM TEXT FILE - FIX

Posted on 2011-03-15
Last Modified: 2012-05-11
Hi there,

Referring to the following thread:
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_26886544.html

The following email could not be extracted from that text file:

<tr>
<td width="150" valign="top">Moyen(s) de communication&nbsp;:</td>
<td>courriel (courrier ‚lectronique) : email@domain.com</td>
</tr>
</tabl


Could you please fix the VBScript below so that it also extracts email addresses like this one?

Thanks and cheers,
Rene
Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, arrAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

arrAllMatches = fGetMatches(strPattern, strTestString)

If UBound(arrAllMatches) <> 0 Then
    WScript.Echo Join(arrAllMatches, vbCrlf)
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = Array("")
    Else
        '# Convert Collection to Array
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Split(temp, "¶")
    End If

    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Question by:ReneGe
 
LVL 12

Expert Comment

by:Daz_1234
ID: 35143017
Hi Rene,

There was a bug in the script I originally wrote for you.

I have fixed it with a little bit of juggling, and it should now be OK.

I don't think that you should give any more points, since the original solution I gave you was obviously flawed (  sorry about that :o)  )

Regards,
Daz.
Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, strAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

strAllMatches = fGetMatches(strPattern, strTestString)

If strAllMatches <> "" Then
    WScript.Echo strAllMatches
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = ""
    Else
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Replace(temp, "¶", vbCrlf)
    End If
    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub
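
The bug in the first script appears to be the `UBound(arrAllMatches) <> 0` test. `Split` returns a zero-based array, so a file containing exactly one address yields an upper bound of 0 and is reported as "-- None Found --", which would explain the failure on the sample in the question. A minimal sketch of that behaviour, meant to be run with cscript (the sample strings are made up for illustration):

```vbscript
Option Explicit

Dim arrNone, arrOne, arrTwo

' Mirror what the first version of fGetMatches returned in each case.
arrNone = Array("")                                      ' no match
arrOne  = Split("email@domain.com", "¶")                 ' one match
arrTwo  = Split("a.b@domain.com¶c.d@domain.com", "¶")    ' two matches

' Array("") and a single-element Split result both have UBound = 0,
' so a "UBound <> 0" test cannot tell "no match" from "one match".
WScript.Echo "None: UBound = " & UBound(arrNone)
WScript.Echo "One:  UBound = " & UBound(arrOne)
WScript.Echo "Two:  UBound = " & UBound(arrTwo)
```

Returning a plain string and comparing it against "", as the revised script above does, sidesteps that ambiguity.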

 
LVL 10

Author Comment

by:ReneGe
ID: 35143044
Don't worry, you absolutely deserve the points.

I'll try it now, and let you know how it goes.

It has to go through more than 500k files, so it will take a little while, but I'll know by tonight.

Cheers,
Rene
 
LVL 10

Author Comment

by:ReneGe
ID: 35144157
Your new script worked a lot better.

However, it was not able to extract the email address from:

<td width="150" valign="top">Moyen(s) de communication&nbsp;:</td>
<td>t?l?copieur&nbsp;: 905-555-1212<br />
courriel (courrier ?lectronique)&nbsp;: test.email@domain-name.com<br />
</td>


Thanks and cheers,
Rene

 
LVL 12

Accepted Solution

by:
Daz_1234 earned 500 total points
ID: 35146144
You are a brilliant test of my regular expressions!

Below is the same script with an amended regular expression on line 17.

Regards,
Daz.


Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, strAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@[\w-]{2,}(\.[\w-]{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

strAllMatches = fGetMatches(strPattern, strTestString)

If strAllMatches <> "" Then
    WScript.Echo strAllMatches
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = ""
    Else
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Replace(temp, "¶", vbCrlf)
    End If
    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub
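
To see what the amended expression changes: in the earlier pattern the domain part was `\w{2,}`, and `\w` does not match a hyphen, so `domain-name.com` could not be matched. The new pattern uses `[\w-]{2,}` for each domain label. The sketch below compares the two patterns on a line modelled on the failing sample; `FirstMatch` is a hypothetical helper written for this illustration, not part of the script above.

```vbscript
Option Explicit

Dim strOld, strNew, strSample

strOld = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"
strNew = "([\w-\.]+)@[\w-]{2,}(\.[\w-]{2,}){1,5}"
strSample = "courriel : test.email@domain-name.com<br />"

WScript.Echo "Old pattern: " & FirstMatch(strOld, strSample)
WScript.Echo "New pattern: " & FirstMatch(strNew, strSample)

' Hypothetical helper: returns the first match, or a marker if there is none.
Function FirstMatch(sPattern, sStr)
    Dim regEx, colMatches
    Set regEx = New RegExp
    regEx.Pattern = sPattern
    regEx.IgnoreCase = True
    regEx.Global = False       ' Only the first match is needed here.
    Set colMatches = regEx.Execute(sStr)
    If colMatches.Count = 0 Then
        FirstMatch = "(no match)"
    Else
        FirstMatch = colMatches(0).Value
    End If
End Function
```

The old pattern should report no match, because after `domain` it expects a dot and finds a hyphen, while the new pattern should return the full address.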

 
LVL 10

Author Comment

by:ReneGe
ID: 35146561
Thanks Daz, testing it now!

Cheers,
Rene
 
LVL 10

Author Closing Comment

by:ReneGe
ID: 35154603
Hey Daz,

It's working like a charm.

I need to add a function to it, so I'll create a new thread and let you know.

Thanks and cheers,
Rene
 
LVL 10

Author Comment

by:ReneGe
ID: 35154639