Solved

SED Regular Expression Problems

Posted on 2004-07-30
Last Modified: 2008-03-04
I am trying to use SED to change the headers of an e-mail.

A typical e-mail might be:

Received: subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST
From: "Someone" <someone@someone.com>
To: "You" <you@someone.com>

etc etc.

I am trying to change the "Received: " part of the header to "Received: from ", which is not difficult in itself. The problem is that some of the e-mails already contain "Received: from ", and those should not get another "from" added, i.e. this must not happen:

Received: from from subnet.server.com

The regular expression I am using at the moment (which does not work) is :

sed 's/Received:\s+(from\s+)/Received: from/' filename

Could you possibly tell me how to do this please?
Question by:garry_m
8 Comments
 

Assisted Solution

by:Mysidia
Mysidia earned 25 total points
ID: 11677216
Sed uses POSIX regular expressions, not Perl regular expressions.

Try
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

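Note that POSIX regular expressions have no \s; use [[:space:]], which matches any whitespace character
   see  man -S 7 regex

Some special characters need to be escaped in a sed script to get their special
meaning, so an expression group is written \(   \) instead of ( ). Likewise \+ and \?
(both GNU extensions) are used instead of + and ?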
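
A minimal sketch for trying the expression on both kinds of header, assuming GNU sed (\+ and \? are GNU extensions to basic regular expressions). The function names add_from and add_from_ere are made up for this illustration, and the sample headers are the ones from the question. The second function is the same expression in extended syntax via -E, where the group and the repetition operators are written without backslashes.

```shell
# Basic regular expression form, as in the answer above.
# Assumes GNU sed: \+ and \? are GNU extensions.
add_from() {
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
}

# The same expression in extended syntax: ( ) + ? need no backslashes.
add_from_ere() {
  sed -E 's/Received:[[:space:]]+(from[[:space:]]+)?/Received: from /'
}

# One header without "from" and one that already has it.
printf '%s\n' \
  'Received: subnet.server.com' \
  'Received: from subnet.server.com' | add_from
# Both lines come out as: Received: from subnet.server.com
```

The optional group swallows an existing "from " so that the replacement always writes exactly one.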

Expert Comment

by:Mysidia
ID: 11677258
To manipulate mail headers like that, you should also be using anchored matches,
so you don't wind up rewriting the Subject line if someone puts 'Received:' in it.
So consider instead:

sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

The ^ matches the start of the line
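
To see the difference the anchor makes, here is a small sketch, again assuming GNU sed. The function names and the sample Subject line are invented for the illustration.

```shell
# Sample input: the Subject line happens to contain the text "Received:".
sample() {
  printf '%s\n' \
    'Received: subnet.server.com' \
    'Subject: Received: your order'
}

# Unanchored: matches "Received:" anywhere on a line.
unanchored() {
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
}

# Anchored: matches only when the line starts with "Received:".
anchored() {
  sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
}

sample | unanchored
# Received: from subnet.server.com
# Subject: Received: from your order

sample | anchored
# Received: from subnet.server.com
# Subject: Received: your order
```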

Author Comment

by:garry_m
ID: 11677307
Thanks for that, testing now..

As to your second comment, that shouldn't be a problem, should it? Since the expression doesn't have a global operator, it should only replace the first instance, shouldn't it?

Accepted Solution

by:Gns
Gns earned 25 total points
ID: 11677392
This might work out OK:
sed 's/Received: \([^ .]\+\.[^ .]\+\.[^ .]\+\) /Received: from \1 /' filename

In the above we find any lines containing "Received: abc123.abc123.abc123 " and replace that with "Received: from abc123.abc123.abc123 " (note that the pattern expects a space after the host name). Not all domain names match that, so you might need to add lines for abc123.abc123, abc123.abc123.abc123.abc123, etc.
Do read the info page for sed, especially the section on regular expressions, since these differ a bit from perlre.
Speaking of Perl, this might be easier in Perl :-).
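
One way to avoid matching the host name at all is to skip the lines that already have "from" and only touch the rest. This is a different technique from the expression above, not a variant of it: a sketch using a negated address (the ! after the address), with the made-up function name add_from_once. It uses only basic POSIX sed syntax.

```shell
# For every line that does NOT already start with "Received: from ",
# rewrite a leading "Received:" (plus any whitespace) to "Received: from ".
add_from_once() {
  sed '/^Received:[[:space:]]*from[[:space:]]/!s/^Received:[[:space:]]*/Received: from /'
}

printf '%s\n' \
  'Received: subnet.server.com' \
  'Received: from subnet.server.com' \
  'Date: Tue, 22 Apr 2004 15:33:15 BST' | add_from_once
# Received: from subnet.server.com
# Received: from subnet.server.com
# Date: Tue, 22 Apr 2004 15:33:15 BST
```

Because nothing in it depends on how many dots the host name has, no extra lines are needed for shorter or longer domain names.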

-- Glenn

Expert Comment

by:Gns
ID: 11677405
Sorry for the late post, Mysidia; clearly I'm a slow typist today (also :-).

-- Glenn

Expert Comment

by:Mysidia
ID: 11677552
re: Garry

The global operator pertains to multiple substitutions within one line.  If it's not used, the substitution is
applied at most once per line; with the g option sed allows the same pattern to match many times on a single line.

However, the substitution of an s// command will be re-attempted for every line of input, even after a successful
match on an earlier line.

There doesn't have to be a 'g' for sed to do that.
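
A tiny illustration of that difference, with throwaway input and made-up function names:

```shell
# Without g: only the first match on each line is replaced,
# but every line of input is still processed.
first_only() {
  sed 's/a/b/'
}

# With g: every match on each line is replaced.
all_matches() {
  sed 's/a/b/g'
}

printf '%s\n' 'a a a' 'a a a' | first_only
# b a a
# b a a

printf '%s\n' 'a a a' 'a a a' | all_matches
# b b b
# b b b
```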

Expert Comment

by:Gns
ID: 11677639
About the info page... the GNU sed manpage is (as noted on it) a joke, more or less. The true docs are readable with
info sed
If you are unfamiliar with info you can start with
info info
and follow the onscreen instructions.

-- Glenn

Author Comment

by:garry_m
ID: 11971333
Thanks guys, I split the points between both of you because both solutions helped.

Thanks again