Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

VBSCRIPT - EXTRACT EMAIL ADDRESS FROM TEXT FILE

Posted on 2011-03-14
20
Medium Priority
?
2,972 Views
Last Modified: 2012-05-11
Hi there,

I need a VBScript (nothing else) that will extract the email addresses from a text file by searching for "@", then parsing the email address around it.  A text file may contain several email addresses, so all emails must be extracted.

Extracted emails will require to simply be displayed to the screen (the DOS window), not exported to a file.

The VSScript will be run by the batch file command line:
cscript ExtractEmail.vbs "TextFile.txt

Thanks for your help,
Rene
0
Comment
Question by:ReneGe
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 7
  • 3
  • +2
20 Comments
 
LVL 9

Assisted Solution

by:sshah254
sshah254 earned 80 total points
ID: 35133753
Can you upload a sample of the input text file?

Ss
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35133791
It will be complicated for me to upload one cause it contains confidential infos. However, email addresses may be anywhere in it, and a line may contain more than 256 characters.

For example:
 envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com
 vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv

Thanks and cheers,
Rene
0
 
LVL 22

Assisted Solution

by:Paka
Paka earned 80 total points
ID: 35134702
Do you have standard characters that are before and after the email address (such as space and > as in your example)?  Without standard delimiters, extracting email addresses from this type of file will be extremely difficult.  For example, the first email from your example could be email@domain.com or email@domain.com.sd (which could be an email address in Sudan).
0
The top UI technologies you need to be aware of

An important part of the job as a front-end developer is to stay up to date and in contact with new tools, trends and workflows. That’s why you cannot miss this upcoming webinar to explore the latest trends in UI technologies!

 
LVL 10

Author Comment

by:ReneGe
ID: 35134926
Paka,

Interessting question/comment.

For the sake of this thread, the moment an email caracter that is not "a-z,0-9,.,_,-" etc, should be it's dellimiter.

I hope that helps.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137069
Hi Rene,

The script below uses a regular expression to attempt to extract all legal email addresses from a string.

You can see that I have set the strTestString variable to the sample you gave earlier.  But there is a problem.  Currently the script returns the following valid format email addresses from your sample:

email@domain.com.sdvzsd
test@domain.com
anothermail@domain.com


As you can  see the first one, to a human, is clearly not a real email address (although it could be), but it is in a correct email address format.

It would help to refine the regular expression if you knew that all email addresses were to end in .com or .co.uk or a finite list.

Regards,
Daz.


Dim strPattern, arrAllMatches, strMatch, strMatches

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}(;([\w-\.]+)@\w{2,}(\.\w{2,}){1,5})*?"

strTestString = " envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com" & vbCrlf _
              & " vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv "

colAllMatches = fGetMatches(strPattern, strTestString)

For Each strMatch In colAllMatches
    strMatches = strMatches & strMatch & vbCrlf
Next

MsgBox strMatches,, "Email Addresses found:"

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.
    '# Convert Collection to Array
    For Each sMatch In colMatches
        temp = temp & sMatch & vbCrlf
    Next
    temp = Split(temp, vbCrlf)
    fGetMatches = temp

End Function

Open in new window

0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137146
Slight change, the regular expression I copy/pasted above was to match a delimited list of email addresses from another of my scripts, so is unnecessarily complicated (although still works)

Updated script below, same as the previous one, but with changed line 3 (and no change in functionality).

Regards,
Daz


Dim strPattern, arrAllMatches, strMatch, strMatches

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = " envonfv ofnvfv fkdjfnv;djfn;dn email@domain.com.sdvzsd vj>¨^c zcln test@domain.com" & vbCrlf _
              & " vsifdvjzs0v8djzs0fivdj zsofvn>anothermail@domain.com >| xidisdvnsv "

colAllMatches = fGetMatches(strPattern, strTestString)

For Each strMatch In colAllMatches
    strMatches = strMatches & strMatch & vbCrlf
Next

MsgBox strMatches,, "Email Addresses found:"

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.
    '# Convert Collection to Array
    For Each sMatch In colMatches
        temp = temp & sMatch & vbCrlf
    Next
    temp = Split(temp, vbCrlf)
    fGetMatches = temp

End Function

Open in new window

0
 
LVL 10

Author Comment

by:ReneGe
ID: 35137619
Daz_1234,

Nicely done!

Knowing that it will be called from a batch file with: cscript ExtractEmail.vbs "TextFile.txt"

Now, what seems to be left to be done is:
1-Insted og the "strTestString", it must read the text file called from the cscript (see up).
2-Have the output in the DOS window, not in a Windows box.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137847
Here you go, complete:

Regards,
Daz.


Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, arrAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

arrAllMatches = fGetMatches(strPattern, strTestString)

If UBound(arrAllMatches) <> 0 Then
    WScript.Echo Join(arrAllMatches, vbCrlf)
Else
    WScript.Echo "-- None Found --"
End If

WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = Array("")
    Else
        '# Convert Collection to Array
        For Each sMatch In colMatches
            temp = temp & sMatch & vbCrlf
        Next
        temp = Split(temp, vbCrlf)
    End If

    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35137862
By the way, I added 'friendly' messages at lines 15 and 29.  Feel free to remove/comment those lines if not required.

If there are no matches it will return the string:
-- None Found --


Regards,
Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138288
I rem'ed "WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf
"

Now in the output, I get:
test@domain.com

End of FindEmail.vbs


Can the extra CRLF and "End of FindEmail.vbs" be removed?

Thanks and cheers,
Rene
0
 
LVL 13

Accepted Solution

by:
Daz_1234 earned 1760 total points
ID: 35138338
As requested:

Regards,
Daz.

Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, arrAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

arrAllMatches = fGetMatches(strPattern, strTestString)

If UBound(arrAllMatches) <> 0 Then
    WScript.Echo Join(arrAllMatches, vbCrlf)
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = Array("")
    Else
        '# Convert Collection to Array
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Split(temp, "¶")
    End If

    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0
 
LVL 26

Assisted Solution

by:pony10us
pony10us earned 80 total points
ID: 35138350
Rene,

Remove line 29:  WScript.Echo vbCrlf & "End of " & WScript.ScriptName

as Daz stated should take care of that.

0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138393
Thanks DAZ for your excellent work. You'r a hero !!

Thanks ss and Paka for your contribution.

Cheers,
Rene
0
 
LVL 26

Expert Comment

by:pony10us
ID: 35138768
Rene

I am glad you got this working. However I don't really think I deserved any of the points as I merely pointed out what Daz had already said.  I really didn't contribute anything to your solution.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138814
Your little contribution made me realized what he did without me looking for it. So yes, you deserve some points. Don't worry, to respect Daz efforts, I only gave you 20.

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35138843
As far as I'm concerned that is fair enough, since although the contribution was small, it was indeed a positive one in the spirit of helping out  ;o)

Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35138869
Well said Daz!

Rene

0
 
LVL 26

Expert Comment

by:pony10us
ID: 35138954
Well thank you both then and I will accept them graciously.    :)

I actually was fascinated by the topic title so stopped in to read it and just answered the question.  

I like what you did Daz.
0
 
LVL 10

Author Comment

by:ReneGe
ID: 35141399
FYI,

After using the script, I found that in a certain situation, emails could not be extracted.

So I created a new thread.
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_26888745.html

Thanks and cheers,
Rene
0
 
LVL 13

Expert Comment

by:Daz_1234
ID: 35143025
Fixed script below.

Daz.


Option Explicit

Dim objArgs, fso, strFile, strTestString
Dim strPattern, strAllMatches

Set objArgs = WScript.Arguments

ErrCheck objArgs.Count < 1, 1, "No argument specified."

strFile = objArgs(0)

Set fso = CreateObject("Scripting.FileSystemObject")
ErrCheck Not fso.FileExists(strFile), 1, "File supplied as argument cannot be found: '" & strFile & "'"

'# WScript.Echo "Checking file contents for email addresses: '" & strFile & "'" & vbCrlf

strPattern = "([\w-\.]+)@\w{2,}(\.\w{2,}){1,5}"

strTestString = fso.OpenTextFile(strFile, 1).ReadAll

strAllMatches = fGetMatches(strPattern, strTestString)

If strAllMatches <> "" Then
    WScript.Echo strAllMatches
Else
    WScript.Echo "-- None Found --"
End If

'# WScript.Echo vbCrlf & "End of " & WScript.ScriptName

Function fGetMatches(sPattern, sStr)
    Dim regEx, retVal, sMatch, colMatches, temp
    Set regEx = New RegExp     ' Create a regular expression.
    regEx.Pattern = sPattern   ' Set pattern.
    regEx.IgnoreCase = True   ' Set case insensitivity.
    regEx.Global = True        ' Set global applicability.

    Set colMatches = regEx.Execute(sStr)   ' Execute search.

    If colMatches.Count = 0 Then
        temp = ""
    Else
        For Each sMatch In colMatches
            temp = temp & sMatch & "¶"
        Next
        temp = Left(temp, Len(temp) - 1)
        temp = Replace(temp, "¶", vbCrlf)
    End If
    fGetMatches = temp
End Function

Sub ErrCheck(blTest, iErrNum, sTxt)
    Dim sErrText
    If Not blTest Then Exit Sub
    sErrText = "Error: " & sTxt
    MsgBox sErrText, vbSystemModal + vbCritical, "Error in: " & WScript.ScriptName
    WScript.Quit iErrNum
End Sub

Open in new window

0

Featured Post

On Demand Webinar - Networking for the Cloud Era

This webinar discusses:
-Common barriers companies experience when moving to the cloud
-How SD-WAN changes the way we look at networks
-Best practices customers should employ moving forward with cloud migration
-What happens behind the scenes of SteelConnect’s one-click button

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this post we will learn how to connect and configure Android Device (Smartphone etc.) with Android Studio. After that we will run a simple Hello World Program.
In this post we will learn how to make Android Gesture Tutorial and give different functionality whenever a user Touch or Scroll android screen.
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …

721 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question