Solved

Searching  A Text File

Posted on 2002-05-29
Last Modified: 2010-05-02
Hey Everyone,

What I am trying to do here is search a text file for certain keywords and values. One thing I need to search for is email addresses. What is the best way to scan through a text file for these values? Any ideas?

Question by:dsplice
8 Comments
 
LVL 22

Expert Comment

by:rspahitz
ID: 7042176
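Open "myfile.txt" For Binary As #1
strFileContents = Input$(LOF(1), #1)
Close #1

' Search contents for e-mail addresses
iEMailPosit = 0
Do
  iEMailPosit = InStr(iEMailPosit + 1, strFileContents, "@")
  If iEMailPosit = 0 Then
    Exit Do
  End If
  ' Find the spaces on either side of the "@".
  ' InStrRev takes the string first and the start position last.
  iEMailStart = InStrRev(strFileContents, " ", iEMailPosit)
  iEMailEnd = InStr(iEMailPosit + 1, strFileContents, " ")
  ' No trailing space means the address runs to the end of the file
  If iEMailEnd = 0 Then iEMailEnd = Len(strFileContents) + 1
  Debug.Print Mid$(strFileContents, iEMailStart + 1, iEMailEnd - iEMailStart - 1)
Loop

' Note that the above logic will have to be expanded to accommodate other e-mail delimiters besides space characters (line breaks, tabs, commas and so on).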
 
LVL 18

Accepted Solution

by:
bobbit31 earned 200 total points
ID: 7042316
You could also use the Microsoft Script Control to run JavaScript regular expressions, e.g.:

' Requires a project reference to "Microsoft Script Control 1.0"
Dim ff As Integer
Dim strLine As String
Dim scr As New ScriptControl
Dim funcRegExpr As String
Dim strFile As String
Dim strEmails As String

scr.Language = "javascript"

' match() with the g flag returns every address; they are joined one per line
funcRegExpr = "function findExpression(str) {" & _
              "   var regEmailCheck = /[A-Za-z0-9_.\-]+@[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)*\.[A-Za-z]{2,}/g;" & _
              "   var res = str.match(regEmailCheck);" & _
              "   return (res == null) ? '' : res.join('\n');" & _
              "}"
scr.AddCode funcRegExpr

ff = FreeFile

Open "C:\my documents\test.txt" For Input As #ff

Do While Not EOF(ff)

    Line Input #ff, strLine
    ' Keep a line break so text on adjacent lines does not run together
    strFile = strFile & strLine & vbLf

Loop

Close #ff

'' get all emails out
' Run passes the text as an argument, so quotes in the file cannot break the script
strEmails = scr.Run("findExpression", strFile)

If strEmails = "" Then
    MsgBox "No Emails Found"
Else
    MsgBox strEmails
End If

You might have to tweak the regular expression shown above... see the link below for some help with regular expressions:
http://www.marzie.com/devtools/misc/regexp.asp
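
When tweaking the pattern, it can help to try the JavaScript on its own before embedding it in a VB string. A minimal sketch, assuming a hypothetical helper named findAllEmails and a deliberately simple pattern (an illustration, not a full address validator):

```javascript
// Hypothetical helper: returns every e-mail-like substring found in text.
// The pattern is an assumption for illustration, not a complete validator.
function findAllEmails(text) {
  // local part, "@", one or more dot-separated labels, then a 2+ letter ending
  var pattern = /[A-Za-z0-9_.\-]+@[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)*\.[A-Za-z]{2,}/g;
  // With the g flag, match() returns all matches, or null when there are none
  var matches = text.match(pattern);
  return matches === null ? [] : matches;
}
```

Because the g flag is set, match() collects all addresses in one call, whereas a single exec() call only reports the first one.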
 

Author Comment

by:dsplice
ID: 7042808
Thanks for the great comments... How would I go about capturing the entire email address? I guess I'm a little unclear on the logic behind searching through the file.

LVL 22

Expert Comment

by:rspahitz
ID: 7042846
There's no easy answer because e-mail addresses are like postal addresses and are not necessarily in any common format.

Here are the restrictions as I understand them:

1) Must contain "@"
2) Must not contain any spaces or non-printable characters
3) "@" must be preceded by at least one valid character
4) "@" must be followed by at least one valid character
5) Somewhere following the "@" must be a "." which will be followed by a domain category (com, edu, uk, fi, etc.)
6) Among the list of *possibly* invalid characters: @, *, ?, =, +, ", <, >, |, /, \.  Some of these may be valid, but not likely; other invalid characters probably exist.
7) Among the list of *probably* valid characters: A through Z, a through z, 0 through 9, -, _, .

Other than that, some servers may have additional limitations.

Based on this, your parsing routine must search for "@" symbols, then work backwards until it finds an invalid character, then work forward until it finds an invalid character.  The e-mail address is that which is located between the invalid characters.

Further clouding the issue is that carriage return/line feed combinations may get embedded in the e-mail address but are not part of the address.

Then, of course, there may be "@" symbols embedded within other contexts, such as "apples: 2@$0.29" or "my company is named Fan@ix."
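
The scan described above (find each "@", walk backwards and forwards until an invalid character) can be sketched as follows. It is written in JavaScript, the language the script control in this thread evaluates. The function name extractAroundAt and the exact set of valid characters are assumptions taken from the list in item 7, not a definitive rule set:

```javascript
// Characters assumed valid inside an address (item 7 above): letters, digits, - _ .
var VALID = /[A-Za-z0-9_.\-]/;

// Hypothetical helper: scans outward from every "@" in text.
function extractAroundAt(text) {
  var found = [];
  var at = text.indexOf("@");
  while (at !== -1) {
    // Work backwards until an invalid character (or the start of the text)
    var start = at;
    while (start > 0 && VALID.test(text.charAt(start - 1))) start--;
    // Work forwards until an invalid character (or the end of the text)
    var end = at + 1;
    while (end < text.length && VALID.test(text.charAt(end))) end++;
    var local = text.slice(start, at);
    // Drop trailing dots, which usually belong to the sentence, not the address
    var domain = text.slice(at + 1, end).replace(/\.+$/, "");
    // Items 3 to 5: something before "@", and a "." inside the domain
    if (local.length > 0 && domain.indexOf(".") > 0) {
      found.push(local + "@" + domain);
    }
    at = text.indexOf("@", at + 1);
  }
  return found;
}
```

With these rules, both "apples: 2@$0.29" and "my company is named Fan@ix." are skipped, because neither has a "." inside the part after the "@". Line breaks are treated as invalid characters here, so an address split across two lines would be cut short.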

And don't forget that when you extract all of these e-mails and start spamming people that your ISP can cancel your account and legal action could be taken against you.
 
LVL 18

Expert Comment

by:bobbit31
ID: 7042851
adjustment to my above code:

' Requires a project reference to "Microsoft Script Control 1.0"
Dim ff As Integer
Dim strLine As String
Dim strEmails As String
Dim scr As New ScriptControl
Dim funcRegExpr As String

scr.Language = "javascript"

' res[0] is the matched text itself (exec also returns the captured groups).
' The domain part allows several dot-separated labels, e.g. mail.example.com.
funcRegExpr = "function findExpression(str) {" & _
              "   var regEmailCheck = /\w+[\w\-.]*@\w[\w\-]*(\.[\w\-]+)*\.[A-Za-z]{2,}/;" & _
              "   var res = regEmailCheck.exec(str);" & _
              "   return (res == null) ? '' : res[0];" & _
              "}"
scr.AddCode funcRegExpr

ff = FreeFile

Open "C:\my documents\test.txt" For Input As #ff

Do While Not EOF(ff)

    Line Input #ff, strLine

    '' check for email addresses
    ' Run passes the line as an argument, so quotes in it cannot break the script
    strEmails = scr.Run("findExpression", strLine)

    If strEmails <> "" Then
        MsgBox strEmails
    End If

Loop

Close #ff


See what happens when you run this (strEmails will be your email address if there was one found)
 
LVL 18

Expert Comment

by:bobbit31
ID: 7042856
also, you can go to: http://www.regexlib.com/Default.aspx and search for other helpful regular expressions
 
LVL 49

Expert Comment

by:DanRollins
ID: 7851251
Hi dsplice,
It appears that you have forgotten this question. I will ask Community Support to close it unless you finalize it within 7 days. I will ask a Community Support Moderator to:

    Accept bobbit31's comment(s) as an answer.

dsplice, if you think your question was not answered at all or if you need help, just post a new comment here; Community Support will help you.  DO NOT accept this comment as an answer.

EXPERTS: If you disagree with that recommendation, please post an explanatory comment.
==========
DanRollins -- EE database cleanup volunteer
 

Expert Comment

by:SpideyMod
ID: 7912850
per recommendation

SpideyMod
Community Support Moderator @Experts Exchange