Searching a Text File

Posted on 2002-05-29
Last Modified: 2010-05-02
Hey Everyone,

What I am trying to do here is search a text file for certain keywords and values. One thing I need to search for is email addresses. What is the best way to scan a text file for these values? Any ideas?

Question by:dsplice
8 Comments
 
LVL 22

Expert Comment

by:rspahitz
ID: 7042176
' Read the whole file into one string
Open "myfile.txt" For Binary As #1
strFileContents = Input$(LOF(1), #1)
Close #1

' Search contents for e-mail addresses
iEMailPosit = 0
Do
  iEMailPosit = InStr(iEMailPosit + 1, strFileContents, "@")
  If iEMailPosit = 0 Then
    Exit Do
  End If
  ' add extra code to determine start and end of e-mail address
  ' InStrRev takes the string first and the start position last;
  ' both results are the positions of the surrounding spaces (0 if none is found)
  iEMailStart = InStrRev(strFileContents, " ", iEMailPosit)
  iEMailEnd = InStr(iEMailPosit + 1, strFileContents, " ")
Loop

' Note that the above logic will have to be expanded to accommodate other e-mail delimiters besides space characters.
 
LVL 18

Accepted Solution

by:
bobbit31 earned 200 total points
ID: 7042316
You could also use the Microsoft Script Control to run JavaScript regular expressions.

For example:

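' Requires a project reference to Microsoft Script Control 1.0
Dim ff As Integer
Dim strLine As String
Dim scr As New ScriptControl
Dim funcRegExpr As String
Dim strFile As String

scr.Language = "javascript"

' exec returns an array; res[0] is the matched text
funcRegExpr = "function findExpression(str, pattern) {" & _
              "   var regEmailCheck = /[A-Za-z0-9\_\-]+\@[A-Za-z0-9\_\-]+.*\.\w{2,3}/g;" & _
              "   var res = regEmailCheck.exec(str);" & _
              "   return (res == null) ? '' : res[0];" & _
              "}"
scr.AddCode funcRegExpr

ff = FreeFile

Open "C:\my documents\test.txt" For Input As #ff

Do While Not EOF(ff)

    Line Input #ff, strLine
    ' keep a space between lines so text on adjacent lines does not run together
    strFile = strFile & strLine & " "

Loop

Close #ff

'' get the match out (exec returns a single match; the greedy .* lets it run
'' on to the last dot followed by two or more word characters in the text)
Dim strEmails As String
' Run passes the text as an argument, so quotes or backslashes in the file cannot break the script
strEmails = scr.Run("findExpression", strFile)

If strEmails = "" Then
    MsgBox "No Emails Found"
Else
    MsgBox strEmails
End If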

You might have to tweak the regular expression shown above. See the link below for some help with regular expressions:
http://www.marzie.com/devtools/misc/regexp.asp
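
A single exec call hands back one match, even with the g flag. Below is a minimal sketch of collecting every address in one pass, written in the same plain JavaScript that the script control accepts. The pattern and the name findAllExpressions are illustrative assumptions, not taken from the code above, and the pattern is deliberately loose.

```javascript
// Illustrative pattern only (an assumption, not the pattern from the post):
// one or more name characters, "@", then a domain with at least one dot.
var emailPattern = /[\w.-]+@[\w-]+(?:\.[\w-]+)+/g;

// Collect every match in str. With the g flag, each exec call resumes
// from emailPattern.lastIndex, so the loop walks the text match by match.
function findAllExpressions(str) {
    var found = [];
    var res;
    emailPattern.lastIndex = 0; // restart in case an earlier scan stopped early
    while ((res = emailPattern.exec(str)) !== null) {
        found.push(res[0]); // element 0 is the full matched text
    }
    return found.join("\n"); // plain string, '' when nothing matched
}
```

Assuming the function is loaded with scr.AddCode in the same way as findExpression, the result could presumably be fetched from VB with scr.Run("findAllExpressions", strFile). That wiring is an assumption on my part rather than something shown in this thread.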
 

Author Comment

by:dsplice
ID: 7042808
Thanks for the great comments. How would I go about capturing the entire email address? I guess I'm a little unclear on the logic behind searching through the file.

LVL 22

Expert Comment

by:rspahitz
ID: 7042846
There's no easy answer because e-mail addresses are like postal addresses and are not necessarily in any common format.

Here are the restrictions as I understand them:

1) Must contain "@"
2) Must not contain any spaces or non-printable characters
3) "@" must be preceded by at least one valid character
4) "@" must be followed by at least one valid character
5) Somewhere following the "@" must be a "." which will be followed by a domain category (com, edu, uk, fi, etc.)
6) Among the list of *possibly* invalid characters: @, *, ?, =, +, ", <, >, |, /, \.  Some of these may be valid, but not likely; other invalid characters probably exist.
7) Among the list of *probably* valid characters: A through Z, a through z, 0 through 9, -, _, .

Other than that, some servers may have additional limitations.

Based on this, your parsing routine must search for "@" symbols, then work backwards until it finds an invalid character, then work forward until it finds an invalid character.  The e-mail address is that which is located between the invalid characters.

Further clouding the issue is that carriage return/line feed combinations may get embedded in the e-mail address but are not part of the address.

Then, of course, there may be "@" symbols embedded within other contexts, such as "apples: 2@$0.29" or "my company is named Fan@ix."

And don't forget that when you extract all of these e-mails and start spamming people that your ISP can cancel your account and legal action could be taken against you.
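
The parsing routine described above (find "@", walk backwards, then walk forwards) can be sketched roughly as follows. This is only an illustration in JavaScript under the restrictions listed in this comment: the names isValidChar and extractEmails are made up for the example, the set of valid characters is the "probably valid" list from point 7, and embedded line breaks are not handled.

```javascript
// "Probably valid" characters from point 7: letters, digits, "-", "_", ".".
function isValidChar(ch) {
    return /^[A-Za-z0-9_.-]$/.test(ch);
}

// Scan text for "@", then grow the candidate left and right until an
// invalid character (or the start or end of the text) is reached.
function extractEmails(text) {
    var found = [];
    var at = text.indexOf("@");
    while (at !== -1) {
        var start = at;
        while (start > 0 && isValidChar(text.charAt(start - 1))) {
            start--;
        }
        var end = at + 1;
        while (end < text.length && isValidChar(text.charAt(end))) {
            end++;
        }
        // Drop dots left at the end by sentence punctuation.
        while (end > at + 1 && text.charAt(end - 1) === ".") {
            end--;
        }
        var local = text.substring(start, at);
        var domain = text.substring(at + 1, end);
        // Points 3 to 5: something before "@", something after it, and a
        // "." inside the domain with at least one character on each side.
        var dot = domain.indexOf(".");
        if (local.length > 0 && dot > 0 && dot < domain.length - 1) {
            found.push(local + "@" + domain);
        }
        at = text.indexOf("@", at + 1);
    }
    return found;
}
```

With these rules, "apples: 2@$0.29" is rejected because nothing valid follows the "@", and "Fan@ix." is rejected because no "." remains inside the domain once the trailing period is dropped.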
 
LVL 18

Expert Comment

by:bobbit31
ID: 7042851
Adjustment to my code above:

' Requires a project reference to Microsoft Script Control 1.0
Dim ff As Integer
Dim strLine As String
Dim scr As New ScriptControl
Dim funcRegExpr As String
Dim strFile As String

scr.Language = "javascript"

' exec returns an array; res[0] is the full match, later elements are the bracketed groups
funcRegExpr = "function findExpression(str, pattern) {" & _
              "   var regEmailCheck = /\w+[\w-\.]*\@\w+((-\w+)|(\w*))\.[a-z]{2,3}/;" & _
              "   var res = regEmailCheck.exec(str);" & _
              "   return (res == null) ? '' : res[0];" & _
              "}"
scr.AddCode funcRegExpr

ff = FreeFile

Open "C:\my documents\test.txt" For Input As #ff

Do While Not EOF(ff)

    Line Input #ff, strLine

    '' check for email addresses (first match on each line)
    Dim strEmails As String
    ' Run passes the line as an argument, so quotes or backslashes in it cannot break the script
    strEmails = scr.Run("findExpression", strLine)

    If strEmails <> "" Then
        MsgBox strEmails
    End If

Loop

Close #ff


See what happens when you run this (strEmails will hold the email address if one was found on the line).
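
To see what the adjusted pattern actually returns, it can be tried on its own outside VB. The sketch below copies the pattern unchanged into plain JavaScript. The one-argument findExpression wrapper, which returns only element 0 of the exec result, is an assumption made for the example.

```javascript
// The adjusted pattern from the post, unchanged.
var regEmailCheck = /\w+[\w-\.]*\@\w+((-\w+)|(\w*))\.[a-z]{2,3}/;

// exec returns an array: element 0 is the full match and the remaining
// elements hold whatever the three bracketed groups captured.
// Returning res[0] gives just the address, or '' when nothing matched.
function findExpression(str) {
    var res = regEmailCheck.exec(str);
    return (res === null) ? "" : res[0];
}
```

Two things show up when trying it. Converting the whole exec result to a string for "bob.smith@example.com" gives "bob.smith@example.com,,," because the group slots are included, so taking element 0 is the cleaner choice. Also, the pattern allows only one dot after the "@", so "user@mail.example.com" comes back as "user@mail.exa". If addresses with subdomains matter, the domain part of the pattern would need loosening.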
 
LVL 18

Expert Comment

by:bobbit31
ID: 7042856
Also, you can go to http://www.regexlib.com/Default.aspx and search for other helpful regular expressions.
 
LVL 49

Expert Comment

by:DanRollins
ID: 7851251
Hi dsplice,
It appears that you have forgotten this question. I will ask Community Support to close it unless you finalize it within 7 days. I will ask a Community Support Moderator to:

    Accept bobbit31's comment(s) as an answer.

dsplice, if you think your question was not answered at all or if you need help, just post a new comment here; Community Support will help you.  DO NOT accept this comment as an answer.

EXPERTS: If you disagree with that recommendation, please post an explanatory comment.
==========
DanRollins -- EE database cleanup volunteer
 

Expert Comment

by:SpideyMod
ID: 7912850
per recommendation

SpideyMod
Community Support Moderator @Experts Exchange