chrismdoyle

asked on

Dump email addresses out of flat files using Regex and sed

I've got a folder full of CSV and text files, and I need to extract all of the email addresses from them. The files are stored on a Linux server, so I'd like to use the command line or a script to do the following:
For each file in a folder (whether it's comma-delimited, tab-delimited, or | delimited)
   Search each line for the email address, and write every address found to a new file

Example:
File1.csv:
chris@gmail.com, Chris, Last name, phone
1234, chris@gmail.com, Chris, Last Name, phone

Output to File1-emails.csv:
chris@gmail.com
chris@gmail.com

File2.csv:
"chris@gmail.com", Chris, Last name, phone

Output to File2-emails.csv:
chris@gmail.com

File3.txt:
chris@gmail.com|chris|last  name| phone

Output to File3-emails.txt:
chris@gmail.com

The point is that the script should work on files in any of these formats.
ozo

sed 's/\(.*[", ]\)*\([^", ]*@[-A-Za-z0-9_]*\).*/\2/p
d' File1.csv > File1-emails.csv
chrismdoyle

ASKER

Very close. What about the output for a file like this?

KIFFANY|VANZANT|38122|_lil_sexy131@hotmail.com|
ASKER CERTIFIED SOLUTION
ozo
This solution is only available to members.
Just an update: the above solution is not complete yet. It turns out that it stops at the first dot in the domain, so chris@gmail.com became chris@gmail.

Any help would be appreciated. Could you post the regular expression both inside the sed command and on its own?
I thought I had a . inside the bracket expression, as in [-A-Za-z0-9_.], but I don't see it now.
How does
/\(.*["|, ]\)*\([^"|, ]*@[-A-Za-z0-9_.]*\).*/
work for you?
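
For reference, here is a minimal sketch of how the expression above might be placed in a full sed command and run over every file in a folder, as the original question asked. The function name extract_emails and the NAME-emails.EXT output naming are assumptions made for illustration; the thread does not show a loop. It also uses sed -n with the p flag, which should behave the same as the p/d pair in the first reply.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): extract_emails DIR
# Writes NAME-emails.EXT next to every NAME.csv / NAME.txt in DIR.
extract_emails() {
    dir=$1
    for f in "$dir"/*.csv "$dir"/*.txt; do
        # Skip unmatched glob patterns and anything that is not a regular file.
        [ -f "$f" ] || continue
        # Skip output files left over from an earlier run.
        case $f in
            *-emails.csv|*-emails.txt) continue ;;
        esac
        # File1.csv -> File1-emails.csv
        out="${f%.*}-emails.${f##*.}"
        # -n suppresses default output; the p flag prints only lines where
        # the substitution succeeded, reduced to the captured address (\2).
        sed -n 's/\(.*["|, ]\)*\([^"|, ]*@[-A-Za-z0-9_.]*\).*/\2/p' "$f" > "$out"
    done
}
```

Calling extract_emails /path/to/folder (the path is a placeholder) would process the folder in place. Two limitations of this sketch: it prints at most one address per line, and a tab is not in the delimiter set ["|, ], so tab-delimited files would presumably need a literal tab added to both bracket expressions.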