SED Regular Expression Problems

Posted on 2004-07-30
Last Modified: 2008-03-04
I am trying to use SED to change the headers of an e-mail.

A typical e-mail might be:

Received: subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST
From: "Someone" <someone@someone.com>
To: "You" <you@someone.com>

etc etc.

I am trying to change the "Received: " part of the header to "Received: from ", which is not difficult at all. The problem is that some of the e-mails already contain "Received: from ", and those should not get another "from" put in there, i.e.

Received: from from subnet.server.com

The regular expression I am using at the moment (which does not work) is :

sed 's/Received:\s+(from\s+)/Received: from/' filename

Could you possibly tell me how to do this please?
Question by: garry_m
LVL 23

Assisted Solution

Mysidia earned 100 total points
ID: 11677216
Sed uses POSIX regular expressions, not Perl regular expressions.

  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

Note that POSIX regular expressions have no \s.  [[:space:]] matches any whitespace character
   see  man -S 7 regex

Some special expression characters need to be escaped in a sed script to enable their special
behavior, so  \(   \) rather than ( ) forms an expression group
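
As a rough illustration of the escaping rule, here is the same substitution written both ways. This is a sketch that assumes GNU sed: \+ and \? in basic syntax are GNU extensions, and -E switches to extended syntax where +, ? and ( ) are written unescaped. The second header line and its host name are made up for the test.

```shell
# Two sample headers: one without "from", one that already has it.
headers='Received: subnet.server.com
Received: from other.server.com'

# Basic syntax (GNU sed): operators and groups are backslash-escaped.
bre=$(printf '%s\n' "$headers" |
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

# Extended syntax (-E): the same expression without the backslashes.
ere=$(printf '%s\n' "$headers" |
  sed -E 's/Received:[[:space:]]+(from[[:space:]]+)?/Received: from /')

printf '%s\n' "$bre"
```

Both forms should leave exactly one "from" on each line, because the optional group swallows an existing "from " before the replacement puts it back.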
LVL 23

Expert Comment

ID: 11677258
To manipulate mail headers like that, you should also be using anchored matches,
so you don't wind up rewriting the subject line if someone puts 'Received:' in it.
So rather, consider

sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

The ^ matches the start of the line
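
A small test of the difference, assuming GNU sed and a made-up subject line that happens to contain 'Received:':

```shell
# Hypothetical header line with "Received:" in the middle of it.
line='Subject: Received: your order'

# Unanchored: the pattern matches inside the subject and rewrites it.
unanchored=$(printf '%s\n' "$line" |
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

# Anchored with ^: only lines that start with "Received:" can match.
anchored=$(printf '%s\n' "$line" |
  sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /')

printf '%s\n' "$unanchored" "$anchored"
```

The unanchored command turns the subject into "Subject: Received: from your order", while the anchored one passes the line through unchanged.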

Author Comment

ID: 11677307
Thanks for that, testing now..

As to your second comment, that shouldn't be a problem should it? As the expression doesn't have a global operator, then it should only replace the first instance shouldn't it?

LVL 20

Accepted Solution

Gns earned 100 total points
ID: 11677392
This might work out OK:
sed 's/Received: \([^ .]\+\.[^ .]\+\.[^ .]\+\) /Received: from \1 /' filename

The above finds any line containing "Received: abc123.abc123.abc123 " and replaces that part with "Received: from abc123.abc123.abc123 ". Not all domain names match that, so you might need to add lines for abc123.abc123, abc123.abc123.abc123.abc123 etc.
Do read the info page for sed, especially the section on regular expressions, since these differ a bit from perlre.
Speaking of perl, this might be easier in perl :-).

-- Glenn
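
One possible way to avoid adding a line per label count is to repeat the ".label" part with a nested group. This is only a sketch, assuming GNU sed (for \+) and assuming every host name to be rewritten contains at least one dot. It also drops the trailing space from the pattern above, since a line that ends right after the host name, like the sample in the question, would otherwise not match. The extra host names are invented for the test.

```shell
# Sample headers: three labels, two labels, and one already carrying "from".
headers='Received: subnet.server.com
Received: mail.example.org
Received: from other.server.com'

# First label, then one or more ".label" repetitions, captured as \1.
# "from" is followed by a space rather than a dot, so that line cannot match.
result=$(printf '%s\n' "$headers" |
  sed 's/^Received: \([^ .]\+\(\.[^ .]\+\)\+\)/Received: from \1/')

printf '%s\n' "$result"
```

A host name without any dot (for example a bare "localhost") is deliberately left alone by this pattern, which may or may not be what you want.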
LVL 20

Expert Comment

ID: 11677405
Sorry for the late post Mysidia, clearly I'm a slow typer today (also:-).

-- Glenn
LVL 23

Expert Comment

ID: 11677552
re: Garry

The global operator pertains to multiple substitutions within one line.  If it's not used, the substitution will be
applied at most once per line; with the g option, sed will allow the same pattern to match many times on a single line.

However, the substitution of an s// command will be re-attempted for every line of input, even after a successful
substitution on an earlier line.

There doesn't have to be a 'g' for sed to do that.
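
A quick way to see this for yourself, using a throwaway two-line input (any sed should behave the same here):

```shell
# Two identical input lines, each with three occurrences of "a".
input='a a a
a a a'

# Without g: the first match on EVERY line is replaced.
once=$(printf '%s\n' "$input" | sed 's/a/b/')

# With g: every match on every line is replaced.
all=$(printf '%s\n' "$input" | sed 's/a/b/g')

printf '%s\n' "$once" "$all"
```

Without g both lines come out as "b a a"; with g both come out as "b b b". Either way, the second line is processed just like the first.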
LVL 20

Expert Comment

ID: 11677639
About the info page: the GNU sed man page is (as noted on it) a joke, more or less. The true docs are readable with
info sed
If you are unfamiliar with info you can start with
info info
and follow the onscreen instructions.

-- Glenn

Author Comment

ID: 11971333
Thanks guys, I split the points between both of you because both solutions helped.

Thanks again
