SED Regular Expression Problems

I am trying to use SED to change the headers of an e-mail.

A typical e-mail might be:

Received: subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST
From: "Someone" <someone@someone.com>
To: "You" <you@someone.com>

etc etc.

I am trying to change the "Received: " part of the header to "Received: from " (which is not difficult at all). The problem is that some of the e-mails will already contain "Received: from ", and those should not get another "from" added, i.e. they should not end up as:

Received: from from subnet.server.com

The regular expression I am using at the moment (which does not work) is :

sed 's/Received:\s+(from\s+)/Received: from/' filename

Could you possibly tell me how to do this please?
Asked by: garry_m
 
Mysidia Commented:
Sed uses POSIX regular expressions, not Perl regular expressions.

Try
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

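Note that POSIX regular expressions do not define \s (GNU sed accepts it as an extension, but you cannot rely on it elsewhere). [[:space:]] matches any whitespace character;
   see  man -S 7 regex

In a sed script using basic regular expressions, some special characters need to be escaped to enable their special
behavior, so \( \) instead of ( ) forms an expression group, and likewise \+ and \? instead of + and ? (the latter two are GNU extensions).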
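A quick way to check the command above is to feed it both forms of the header. This is only a minimal sketch, assuming GNU sed (because of \+ and \?); the sample headers are the ones from the question, and the variable names are just for illustration.

```shell
# Sample headers: one without "from", one that already has it,
# plus an unrelated header that must pass through untouched.
headers='Received: subnet.server.com
Received: from subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST'

# \( \) groups, \+ means "one or more", \? means "zero or one" (GNU sed BRE).
# The optional group swallows an existing "from ", so it is never doubled.
result=$(printf '%s\n' "$headers" |
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

printf '%s\n' "$result"
```

Both Received: lines should come out as "Received: from subnet.server.com", and the Date: line should be unchanged.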
 
Mysidia Commented:
To manipulate mail headers like that, you should also be using anchored matches,
so you don't wind up rewriting the Subject line if someone puts 'Received:' in it.
So consider instead:

sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

The ^ matches the start of the line
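To see what the anchor buys you, compare both versions on a message whose Subject line happens to contain "Received:". This is a sketch assuming GNU sed; the Subject text is made up for the illustration.

```shell
# Hypothetical headers: the Subject line mentions "Received:" in its text.
headers='Received: subnet.server.com
Subject: Re: Received: your parcel'

# Without ^ the pattern may match anywhere in a line, including the Subject.
unanchored=$(printf '%s\n' "$headers" |
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

# With ^ only lines that actually start with "Received:" are rewritten.
anchored=$(printf '%s\n' "$headers" |
  sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```

The unanchored run turns the Subject into "Subject: Re: Received: from your parcel", while the anchored run leaves it alone.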
 
garry_m (Author) Commented:
Thanks for that, testing now..

As to your second comment, that shouldn't be a problem, should it? Since the expression doesn't have a global operator, it should only replace the first instance, shouldn't it?
 
Gns Commented:
This might work out OK:
sed 's/Received: \([^ .]\+\.[^ .]\+\.[^ .]\+\) /Received: from \1 /' filename

In the above we find any lines containing "Received: abc123.acb123.abc123 " and replace that part with "Received: from abc123.acb123.abc123 " (note that the pattern expects a space after the host name). Not all domain names match that, so you might need to add lines for abc123.abc123, abc123.abc123.abc123.abc123, etc.
Do read the info page for sed, especially the section on regular expressions, since these differ a bit from perlre.
Speaking of perl, this might be easier in perl :-).
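If the Perl-style syntax is what feels natural, extended regular expressions in sed are a bit closer to it. A minimal sketch, assuming a sed that supports the -E option (GNU sed and the BSD seds do): under -E, +, ? and ( ) act as operators without backslashes. [[:space:]] is kept here because \s is still not something to rely on.

```shell
# The two header forms from the question.
headers='Received: subnet.server.com
Received: from subnet.server.com'

# Extended regex: no backslashes needed before +, ? or the grouping parens.
result=$(printf '%s\n' "$headers" |
  sed -E 's/^Received:[[:space:]]+(from[[:space:]]+)?/Received: from /')

printf '%s\n' "$result"
```

Both lines should end up as "Received: from subnet.server.com".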

-- Glenn
 
Gns Commented:
Sorry for the late post Mysidia, clearly I'm a slow typer today (also:-).

-- Glenn
 
Mysidia Commented:
re: Garry

The global operator pertains to multiple substitutions within one line. If it's not used, the substitution will be
applied at most once per line; with the g option, sed will allow the same pattern to match many times on a single line.

However, the substitution of an s// command will be re-attempted for every line of input, even after a successful
match on an earlier line.

There doesn't have to be a 'g' for sed to do that.
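A tiny experiment makes the difference visible. This is just an illustrative sketch with made-up input, two identical lines of three letters each:

```shell
# Two input lines, each with three possible matches.
input='aaa
aaa'

# No g flag: at most one substitution per line, but every line is processed.
first_only=$(printf '%s\n' "$input" | sed 's/a/X/')

# With g: every match on every line is replaced.
every_match=$(printf '%s\n' "$input" | sed 's/a/X/g')

printf 'without g:\n%s\n\nwith g:\n%s\n' "$first_only" "$every_match"
```

Without g each line becomes "Xaa"; with g each line becomes "XXX". Either way, the second line is still processed after the first one matched.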
 
Gns Commented:
About the info page... the gnu sed manpage is (as noted on it) a joke, more or less. The true docs are readable with
info sed
If you are unfamiliar with info you can start with
info info
and follow the onscreen instructions.

-- Glenn
 
garry_m (Author) Commented:
Thanks guys, split points between both of you cause both solutions helped.

Thanks again