Solved

SED Regular Expression Problems

Posted on 2004-07-30
Last Modified: 2008-03-04
I am trying to use SED to change the headers of an e-mail.

A typical e-mail might be:

Received: subnet.server.com
Date: Tue, 22 Apr 2004 15:33:15 BST
From: "Someone" <someone@someone.com>
To: "You" <you@someone.com>

etc etc.

I am trying to change the "Received: " part of the header to "Received: from " (which is not difficult at all). The problem comes when some of the e-mails already contain "Received: from ", in which case another "from" should not be added, i.e. I do not want to end up with:

Received: from from subnet.server.com

The regular expression I am using at the moment (which does not work) is :

sed 's/Received:\s+(from\s+)/Received: from/' filename

Could you possibly tell me how to do this please?
Question by:garry_m
 
LVL 23

Assisted Solution

by:Mysidia
Mysidia earned 25 total points
ID: 11677216
Sed uses POSIX regular expressions, not Perl regular expressions.

Try
  sed 's/Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

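Note that POSIX does not define \s (some sed implementations, such as GNU sed, accept it as an extension).
[[:space:]]  matches any whitespace character, including spaces and tabs;
   see  man -S 7 regex

In a sed script some special expression characters need to be escaped to enable their special
behavior, so  \(   \) instead of ( ) is an expression group. In your expression the unescaped + and ( ) are matched as literal characters.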
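
For what it's worth, here is a minimal sketch of the same substitution run against the two kinds of header from the question. It assumes a sed that accepts -E (extended regular expressions, as GNU and BSD sed do), so the group and repetition operators are written without backslashes; the function name add_from is made up for the illustration.

```shell
#!/bin/sh
# add_from is a hypothetical helper name; it reads header lines on stdin.
# -E selects extended regular expressions, so +, ? and ( ) need no backslashes.
add_from() {
    sed -E 's/Received:[[:space:]]+(from[[:space:]]+)?/Received: from /'
}

# One header without "from", one that already has it, and an unrelated line.
printf '%s\n' \
    'Received: subnet.server.com' \
    'Received: from subnet.server.com' \
    'Date: Tue, 22 Apr 2004 15:33:15 BST' | add_from
```

Both Received lines should come out as "Received: from subnet.server.com", and the Date line passes through untouched. The optional group swallows an existing "from " so that the replacement never doubles it.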
 
LVL 23

Expert Comment

by:Mysidia
ID: 11677258
To manipulate mail headers like that, you should also be using anchored matches,
so you don't wind up rewriting the subject line if someone puts 'Received:' in it.
So rather, consider

sed 's/^Received:[[:space:]]\+\(from[[:space:]]\+\)\?/Received: from /'

The ^ matches the start of the line
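
A quick way to see the difference is to feed both versions a subject line that happens to contain 'Received:'. This is only a sketch: the sample subject and the function names unanchored and anchored are invented for the demonstration, and -E is assumed to be available (GNU and BSD sed).

```shell
#!/bin/sh
# Hypothetical helper names; the only difference between them is the leading ^.
unanchored() { sed -E 's/Received:[[:space:]]+(from[[:space:]]+)?/Received: from /'; }
anchored()   { sed -E 's/^Received:[[:space:]]+(from[[:space:]]+)?/Received: from /'; }

# Made-up subject line that contains the text "Received:" in the middle.
sample='Subject: Re: Received: your parcel'

printf '%s\n' "$sample" | unanchored   # the subject gets rewritten
printf '%s\n' "$sample" | anchored     # the subject is left alone
```

The unanchored version turns the subject into "Subject: Re: Received: from your parcel", while the anchored one prints it unchanged because the line does not start with "Received:".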
 

Author Comment

by:garry_m
ID: 11677307
Thanks for that, testing now..

As to your second comment, that shouldn't be a problem, should it? Since the expression doesn't have a global operator, it should only replace the first instance, shouldn't it?
 
LVL 20

Accepted Solution

by:Gns
Gns earned 25 total points
ID: 11677392
This might work out OK:
sed 's/Received: \([^ .]\+\.[^ .]\+\.[^ .]\+\) /Received: from \1 /' filename

The above finds any line of the form "Received: abc123.acb123.abc123 " and replaces it with "Received: from abc123.acb123.abc123 ". Not all domain names match that three-part pattern, so you might need to add expressions for abc123.abc123, abc123.abc123.abc123.abc123, etc.
Do read the info page for sed, especially the section on regular expressions, since these differ a bit from perlre.
Speaking of perl, this might be easier in perl :-).
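
As a possible variation on the idea above (not the expression as posted), a repeated group can cover host names with any number of dot-separated parts in a single expression. The sketch assumes sed -E is available, anchors the match at the start of the line, expects exactly one space after "Received:", and drops the trailing-space requirement; add_from_host is a hypothetical name.

```shell
#!/bin/sh
# add_from_host is a hypothetical helper name.
# [^ .]+         one label: characters that are neither a space nor a dot
# (\.[^ .]+)+    one or more further labels, each introduced by a dot
# "from" on its own contains no dot, so lines already holding
# "Received: from ..." do not match and pass through unchanged.
add_from_host() {
    sed -E 's/^Received: ([^ .]+(\.[^ .]+)+)/Received: from \1/'
}

printf '%s\n' \
    'Received: server.com' \
    'Received: subnet.server.com' \
    'Received: mail.example.co.uk' \
    'Received: from subnet.server.com' | add_from_host
```

The first three lines gain a "from"; the last one is printed as it came in.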

-- Glenn
 
LVL 20

Expert Comment

by:Gns
ID: 11677405
Sorry for the late post Mysidia, clearly I'm a slow typer today (also:-).

-- Glenn
 
LVL 23

Expert Comment

by:Mysidia
ID: 11677552
re: Garry

The global operator pertains to multiple substitutions within one line. If it's not used, the substitution is
applied at most once per line; with the g option, sed will allow the same pattern to match many times on a single line.

However, the substitution of an s// command will be re-attempted for every line of input, even after a successful
match on an earlier line.

There doesn't have to be a 'g' for sed to do that.
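
A small experiment makes the distinction visible. The input text and the function names first_only and every_match are invented for the demonstration; the s command and its g flag are standard sed.

```shell
#!/bin/sh
# Hypothetical helper names; the only difference is the trailing g flag.
first_only()  { sed 's/from/FROM/'; }    # at most one substitution per line
every_match() { sed 's/from/FROM/g'; }   # every match on each line

# Two input lines, each containing the pattern twice.
input='from a from b
from c from d'

printf '%s\n' "$input" | first_only
printf '%s\n' "$input" | every_match
```

Without g, the output is "FROM a from b" and "FROM c from d": only the first match on each line changes, but both lines are processed. With g, every "from" on every line becomes "FROM".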
 
LVL 20

Expert Comment

by:Gns
ID: 11677639
About the info page... the gnu sed manpage is (as noted on it) a joke, more or less. The true docs are readable with
info sed
If you are unfamiliar with info you can start with
info info
and follow the onscreen instructions.

-- Glenn
 

Author Comment

by:garry_m
ID: 11971333
Thanks guys, split points between both of you cause both solutions helped.

Thanks again