Solved

SED Regular Expression Problems

Posted on 2004-07-30
Last Modified: 2008-03-04
I am trying to use SED to change the headers of an e-mail.

A typical e-mail might be:

Received: subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST
From: "Someone" <someone@someone.com>
To: "You" <you@someone.com>

etc etc.

I am trying to change the "Received: " part of the header to "Received: from ", which is not difficult in itself. The problem is that some of the e-mails already contain "Received: from ", and those should not get a second "from" added, i.e. they should not end up as:

Received: from from subnet.server.com

The regular expression I am using at the moment (which does not work) is :

sed 's/Received:\s+(from\s+)/Received: from/' filename

Could you possibly tell me how to do this please?
Question by:garry_m
8 Comments
 
LVL 23

Assisted Solution

by:Mysidia
Mysidia earned 100 total points
ID: 11677216
Sed uses POSIX regular expressions, not Perl regular expressions.

Try
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

Note that POSIX regular expressions do not define \s.     [[:space:]]  matches any whitespace character (space, tab, and so on)
   see  man -S 7 regex

In sed's default (basic) regular expressions, some operators must be escaped with a backslash to get their
special meaning: \(   \) forms an expression group, whereas a bare ( ) matches literal parentheses. GNU sed treats \+ and \? the same way.
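
As a rough illustration of the command above, here is a small sketch run against sample headers like the ones in the question. The function names add_from and add_from_ere are made up for this example. The first form relies on \+ and \? which are GNU sed extensions; the second assumes a sed that accepts -E (GNU and BSD sed do), where +, ? and ( ) work unescaped.

```shell
# Hypothetical helper: the basic-regex command suggested above (GNU sed).
add_from() {
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
}

# Hypothetical helper: the same expression in extended syntax (sed -E).
add_from_ere() {
  sed -E 's/Received:[[:space:]]+(from[[:space:]]+)?/Received: from /'
}

printf '%s\n' \
  'Received: subnet.server.com' \
  'Received: from subnet.server.com' \
  'Date: Tue, 22 Apr 2004 15:33:15 BST' | add_from
# -> Received: from subnet.server.com
# -> Received: from subnet.server.com
# -> Date: Tue, 22 Apr 2004 15:33:15 BST
```

The optional group swallows an existing "from " so it is replaced rather than duplicated.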
 
LVL 23

Expert Comment

by:Mysidia
ID: 11677258
When manipulating mail headers like that, you should also be using anchored matches,
so you don't wind up rewriting the subject line if someone puts 'Received:' in it.
So rather, consider

sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

The ^ matches the start of the line
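
To see the difference the anchor makes, here is a minimal sketch using a made-up Subject line (the subject text is an assumption for illustration, not taken from a real message). It assumes GNU sed, as the \+ and \? operators require.

```shell
# Hypothetical header line that merely mentions "Received:".
subject='Subject: Re: Received: your order'

# Unanchored: the Subject line is rewritten by mistake.
printf '%s\n' "$subject" |
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
# -> Subject: Re: Received: from your order

# Anchored with ^: only lines that start with "Received:" are touched.
printf '%s\n' "$subject" |
  sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'
# -> Subject: Re: Received: your order
```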
 

Author Comment

by:garry_m
ID: 11677307
Thanks for that, testing now..

As to your second comment, that shouldn't be a problem, should it? Since the expression doesn't use the global flag, it should only replace the first instance, shouldn't it?
 
LVL 20

Accepted Solution

by:Gns
Gns earned 100 total points
ID: 11677392
This might work out OK:
sed 's/Received: \([^ .]\+\.[^ .]\+\.[^ .]\+\) /Received: from \1 /' filename

The above finds any line containing "Received: abc123.abc123.abc123 " (note the trailing space) and replaces it with "Received: from abc123.abc123.abc123 ". Not all domain names match that three-part pattern, so you might need to add expressions for abc123.abc123, abc123.abc123.abc123.abc123, etc.
Do read the info page for sed, especially the section on regular expressions, since these differ a bit from perlre.
Speaking of perl, this might be easier in perl :-).

-- Glenn
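
One possible way to avoid adding a separate expression per domain depth is to repeat the ".label" part. This is only a sketch, not the command posted above: the helper name add_from_any_depth is made up, the trailing-space requirement is dropped, a ^ anchor is added, and GNU sed is assumed for \+.

```shell
# Hypothetical helper. The group matches one label followed by one or
# more ".label" parts, so a bare word such as "from" (no dot) does not
# match and lines that already say "Received: from ..." are left alone.
add_from_any_depth() {
  sed 's/^Received: \([^ .]\+\(\.[^ .]\+\)\+\)/Received: from \1/'
}

printf '%s\n' \
  'Received: server.com' \
  'Received: subnet.server.com' \
  'Received: from subnet.server.com' | add_from_any_depth
# -> Received: from server.com
# -> Received: from subnet.server.com
# -> Received: from subnet.server.com
```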
 
LVL 20

Expert Comment

by:Gns
ID: 11677405
Sorry for the late post Mysidia, clearly I'm a slow typer today (also:-).

-- Glenn
 
LVL 23

Expert Comment

by:Mysidia
ID: 11677552
re: Garry

The global flag pertains to multiple substitutions within one line.  If it's not used, the substitution is
applied at most once per line; with the g flag, sed lets the same pattern match many times on a single line.

However, the s// command is re-attempted for every line of input, even after a successful
match on an earlier line.

There doesn't have to be a 'g' for sed to do that.
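
A quick way to see this for yourself, as a sketch with invented header lines (the host names and the RCVD: replacement text are placeholders for illustration only):

```shell
# Hypothetical sample input: two Received lines plus a line that
# contains the pattern twice.
headers() {
  printf '%s\n' \
    'Received: mx1.server.com' \
    'Received: mx2.server.com' \
    'Subject: Received: Received: twice'
}

# Without g: the first match on EACH line is replaced.
headers | sed 's/Received:/RCVD:/'
# -> RCVD: mx1.server.com
# -> RCVD: mx2.server.com
# -> Subject: RCVD: Received: twice

# With g: every match on each line is replaced.
headers | sed 's/Received:/RCVD:/g'
# -> RCVD: mx1.server.com
# -> RCVD: mx2.server.com
# -> Subject: RCVD: RCVD: twice
```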
 
LVL 20

Expert Comment

by:Gns
ID: 11677639
About the info page... the gnu sed manpage is (as noted on it) a joke, more or less. The true docs are readable with
info sed
If you are unfamiliar with info you can start with
info info
and follow the onscreen instructions.

-- Glenn
 

Author Comment

by:garry_m
ID: 11971333
Thanks guys, split points between both of you cause both solutions helped.

Thanks again