Regular Expression to extract data from a text file

I have several text files of varying formats.

Sample data from text file:

NAME: JOHNSON  DATE: 6-29-98  NUMBER: 1120
ADDRESS: 123 WISNER ST  CITY: SCRANTON
PHONE: 555-1212  ACCOUNT #: NEW ACCOUNT
METER #: NA  DATE OF INSTALLATION: 6-29-98
COMPLETE YES: Y  NO:
COMMENT: FLAT RATE-(103)

I would like something that will run through the text file and output something like this for each instance of city it finds:

SCRANTON
WEST PITTSTON
WILKES-BARRE
SCRANTON
DALLAS
DALLAS
WILKES-BARRE

Basically, the regular expression needs to find each instance of "CITY: CITY NAME" followed by CR/LF.

I'm not sure how to do this. If there is an online resource or page where I can paste data from the files and get output like the above, that could work. I am also open to using PHP.

I'm looking for something quick and easy.
Asked by: wfninpa

HonorGod (Software Engineer) commented:
What operating system are you using, and what tools, utilities, or languages do you have installed?

If you have Python, you could do something like:

'''Command: %(cmdName)s\n
Purpose: Locate instances of "City" in the specified input file, and display
         the associated value\n
  Usage: python %(cmdName)s.py inputFile\n
Example: python %(cmdName)s.py %(cmdName)s.txt'''

import os, os.path
import re
import sys

def main( filename ) :
  # Capture everything after "City: " up to the end of the line
  cityRE = re.compile( 'City: (.*)$', re.IGNORECASE )
  if not os.path.exists( filename ) :
    print( 'Error: File not found: %s\n' % filename )
    Usage()
  # "with" closes the file even if reading fails part way through
  with open( filename ) as fh :
    for line in fh :
      mo = cityRE.search( line )
      if mo :
        # strip() drops trailing spaces or a stray carriage return
        print( mo.group( 1 ).strip() )

def Usage( cmdName = None ) :
  if not cmdName :
    cmdName = os.path.basename( sys.argv[ 0 ] )
  if cmdName[ -3: ] == '.py' :
    cmdName = cmdName[ :-3 ]
  # Fill the %(cmdName)s placeholders in the module docstring
  print( __doc__ % locals() )
  sys.exit()


if __name__ == '__main__' :
  argc = len( sys.argv )
  if argc != 2 :
    print( 'Error: Unexpected number of command line arguments: %d\n' % argc )
    Usage()
  main( sys.argv[ 1 ] )
else :
  print( 'Error: script should be executed, not imported.\n' )
  Usage( __name__ )

Test output:
SCRANTON

point_pleasant commented:
Quick shell script:

grep "CITY: " city.txt > /tmp/junk
while read -r inputline
do
        # $6 is the city only when "CITY:" is the fifth field, as in the sample line
        echo "$inputline" | awk '{print $6}'
done < /tmp/junk
rm /tmp/junk

point_pleasant commented:
Oops, this version handles city names up to three words long:

grep "CITY: " city.txt > /tmp/junk
while read -r inputline
do
        echo "$inputline" | awk '{print $6" "$7" "$8}'
done < /tmp/junk
rm /tmp/junk
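
The awk field numbers above ($6 through $8) line up with the city only when the address is exactly three words, as in the sample line. A minimal Python sketch that keys on the "CITY:" label instead, assuming the city is always the last field on its line (as in the sample data). The function name extract_cities and the second address in the sample string are made up for illustration:

```python
import re

# Assumption: the city is the last field on its line, as in the sample data.
# [ \t]* skips spaces after the label; (.*\S) captures up to the last
# non-whitespace character, so trailing spaces and a CR are left out.
CITY_RE = re.compile(r"CITY:[ \t]*(.*\S)")

def extract_cities(text):
    """Return the value after each "CITY:" label, in order of appearance."""
    return [m.group(1) for m in CITY_RE.finditer(text)]

# The second address is invented to show a longer street name.
sample = (
    "ADDRESS: 123 WISNER ST  CITY: SCRANTON\n"
    "ADDRESS: 9 LONG WINDING HILL RD  CITY: WEST PITTSTON\n"
)
print(extract_cities(sample))  # ['SCRANTON', 'WEST PITTSTON']
```

Because the match starts at the label rather than at a field position, the number of words in the address or the city does not matter.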

käµfm³d 👽 commented:
"If there is an online resource or page"

"I am open to using PHP to do it as well."

preg_match_all('/(?<=CITY:\s)\s*.*/', $source, $results);
var_dump($results);
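
For comparison, a rough Python equivalent of the same lookbehind pattern, offered as a sketch rather than a drop-in replacement. Python's re module accepts this lookbehind because `CITY:\s` has a fixed width. The sample string (with CR/LF line endings) and the strip() call are my additions:

```python
import re

# Part of the sample data from the question, with CR/LF line endings.
source = (
    "NAME: JOHNSON  DATE: 6-29-98  NUMBER: 1120\r\n"
    "ADDRESS: 123 WISNER ST  CITY: SCRANTON\r\n"
    "PHONE: 555-1212  ACCOUNT #: NEW ACCOUNT\r\n"
)

# Same pattern as the PHP call: match whatever follows "CITY:" plus one
# whitespace character, up to the end of the line.
matches = re.findall(r"(?<=CITY:\s)\s*.*", source)

# "." also matches the carriage return, so strip it from each result.
cities = [m.strip() for m in matches]
print(cities)  # ['SCRANTON']
```

The strip() step matters for files with Windows line endings, where each raw match would otherwise end in a carriage return.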

wfninpa (Author) commented:
Thank you.  This was exactly what I needed.  The three sites were helpful for testing and modifying the code to my needs.