Solved

VBSCRIPT - EXTRACT EMAIL ADDRESS FROM TEXT FILE

Posted on 2011-03-14
20
2,772 Views
Last Modified: 2012-05-11
Hi there,

I need a VBScript (nothing else) that will extract the email addresses from a text file by searching for "@", then parsing the email address around it.  A text file may contain several email addresses, so all emails must be extracted.

Extracted emails will require to simply be displayed to the screen (the DOS window), not exported to a file.

The VSScript will be run by the batch file command line:
cscript ExtractEmail.vbs "TextFile.txt

Thanks for your help,
Rene
0
Comment
Question by:ReneGe
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 7
  • 3
  • +2
20 Comments
 
LVL 9

Assisted Solution

by:sshah254
sshah254 earned 20 total points
ID: 35133753
Can you upload a sample of the input text file?

Ss
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35133791
It will be complicated for me to upload one cause it contains confidential infos. However, email addresses may be anywhere in it, and a line may contain more than 256 characters.

For example:
 envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com
 vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv

Thanks and cheers,
Rene
0
 
LVL 22

Assisted Solution

by:Paka
Paka earned 20 total points
ID: 35134702
Do you have standard characters that are before and after the email address (such as space and > as in your example)?  Without standard delimiters, extracting email addresses from this type of file will be extremely difficult.  For example, the first email from your example could be email@domain.com or email@domain.com.sd (which could be an email address in Sudan).
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 10

Author Comment

by:ReneGe
ID: 35134926
Paka,

Interessting question/comment.

For the sake of this thread, the moment an email caracter that is not "a-z,0-9,.,_,-" etc, should be it's dellimiter.

I hope that helps.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137069
Hi Rene,

The script below uses a regular expression to attempt to extract all legal email addresses from a string.

You can see that I have set the strTestString variable to the sample you gave earlier.  But there is a problem.  Currently the script returns the following valid format email addresses from your sample:

email@domain.com.sdvzsd
test@domain.com
anothermail@domain.com


As you can  see the first one, to a human, is clearly not a real email address (although it could be), but it is in a correct email address format.

It would help to refine the regular expression if you knew that all email addresses were to end in .com or .co.uk or a finite list.

Regards,
Daz.


Dim strPattern, arrAllMatches, strMatch, strMatches

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}(;([\w-\.]+)@\w{2,}(\.\w{2,}){1,5})*?"

strTestString = " envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com" & vbCrlf _
              & " vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv "

colAllMatches = fGetMatches(strPattern, strTestString)

For Each strMatch In colAllMatches
    strMatches = strMatches & strMatch & vbCrlf
Next

MsgBox strMatches,, "Email Addresses found:"

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.
    '# Convert Collection to Array
    For Each sMatch In colMatches
        temp = temp & sMatch & vbCrlf
    Next
    temp = Split(temp, vbCrlf)
    fGetMatches = temp

End Function

Open in new window

0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137146
Slight change, the regular expression I copy/pasted above was to match a delimited list of email addresses from another of my scripts, so is unnecessarily complicated (although still works)

Updated script below, same as the previous one, but with changed line 3 (and no change in functionality).

Regards,
Daz


Dim strPattern, arrAllMatches, strMatch, strMatches

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = " envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com" & vbCrlf _
              & " vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv "

colAllMatches = fGetMatches(strPattern, strTestString)

For Each strMatch In colAllMatches
    strMatches = strMatches & strMatch & vbCrlf
Next

MsgBox strMatches,, "Email Addresses found:"

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.
    '# Convert Collection to Array
    For Each sMatch In colMatches
        temp = temp & sMatch & vbCrlf
    Next
    temp = Split(temp, vbCrlf)
    fGetMatches = temp

End Function

Open in new window

0
 
LVL 10

Author Comment

by:ReneGe
ID: 35137619
Daz_1234,

Nicely done!

Knowing that it will be called from a batch file with: cscript ExtractEmail.vbs "TextFile.txt"

Now, what seems to be left to be done is:
1-Insted og the "strTestString", it must read the text file called from the cscript (see up).
2-Have the output in the DOS window, not in a Windows box.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137847
Here you go, complete:

Regards,
Daz.


Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, arrAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

arrAllMatches = fGetMatches(strPattern, strTestString)

If UBound(arrAllMatches) <> 0 Then
    WScript.Echo Join(arrAllMatches, vbCrlf)
Else
    WScript.Echo "-- None Found --"
End If

WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = Array("")
    Else
        '# Convert Collection to Array
        For Each sMatch In colMatches
            temp = temp & sMatch & vbCrlf
        Next
        temp = Split(temp, vbCrlf)
    End If

    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137862
By the way, I added 'friendly' messages at lines 15 and 29.  Feel free to remove/comment those lines if not required.

If there are no matches it will return the string:
-- None Found --


Regards,
Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138288
I rem'ed "WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf
"

Now in the output, I get:
test@domain.com

End of FindEmail.vbs


Can the extra CRLF and "End of FindEmail.vbs" be removed?

Thanks and cheers,
Rene
0
 
LVL 13

Accepted Solution

by:
Daz_1234 earned 440 total points
ID: 35138338
As requested:

Regards,
Daz.

Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, arrAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

arrAllMatches = fGetMatches(strPattern, strTestString)

If UBound(arrAllMatches) <> 0 Then
    WScript.Echo Join(arrAllMatches, vbCrlf)
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = Array("")
    Else
        '# Convert Collection to Array
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Split(temp, "¶")
    End If

    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0
 
LVL 26

Assisted Solution

by:pony10us
pony10us earned 20 total points
ID: 35138350
Rene,

Remove line 29:  WScript.Echo vbCrlf & "End of " & WScript.ScriptName

as Daz stated should take care of that.

0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138393
Thanks DAZ for your excellent work. You'r a hero !!

Thanks ss and Paka for your contribution.

Cheers,
Rene
0
 
LVL 26

Expert Comment

by:pony10us
ID: 35138768
Rene

I am glad you got this working. However I don't really think I deserved any of the points as I merely pointed out what Daz had already said.  I really didn't contribute anything to your solution.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138814
Your little contribution made me realized what he did without me looking for it. So yes, you deserve some points. Don't worry, to respect Daz efforts, I only gave you 20.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35138843
As far as I'm concerned that is fair enough, since although the contribution was small, it was indeed a positive one in the spirit of helping out  ;o)

Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138869
Well said Daz!

Rene

0
 
LVL 26

Expert Comment

by:pony10us
ID: 35138954
Well thank you both then and I will accept them graciously.    :)

I actually was fascinated by the topic title so stopped in to read it and just answered the question.  

I like what you did Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35141399
FYI,

After using the script, I found that in a certain situation, emails could not be extracted.

So I created a new thread.
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_26888745.html

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35143025
Fixed script below.

Daz.


Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, strAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

strAllMatches = fGetMatches(strPattern, strTestString)

If strAllMatches <> "" Then
    WScript.Echo strAllMatches
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = ""
    Else
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Replace(temp, "¶", vbCrlf)
    End If
    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Help to debug powershell script 5 57
JVM error from eclipse 1 27
Add Log to powershell Script 14 30
dropdownlist in asp.net vb. 3 23
Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
This article will inform Clients about common and important expectations from the freelancers (Experts) who are looking at your Gig.
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.
Video by: Mark
This lesson goes over how to construct ordered and unordered lists and how to create hyperlinks.

739 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question