Solved

bash sed space replacement

Posted on 2006-06-14
Last Modified: 2012-05-05
I have a bunch of files in a folder.
For all the files, here is what I want to do:

Insert "<br>" followed by the same number of "&nbsp;" entities into every line that starts with one or more spaces (the spaces in the input are plain blank characters, not in the &nbsp; format). Ideally I would like to do this only for lines that are not blank, but I don't care too much if blank lines end up being converted as well.

Any ideas? Thanks.
jculkincys



Question by:jculkincys
19 Comments
 
LVL 27

Expert Comment

by:Nopius
ID: 16908632
replace.sed:
--[cut here]--
/^ *[^ ]/{
i\
<br>
s/ /\&nbsp;/g
}
--[cut here]--
sed -f replace.sed yourfile.txt
0
 
LVL 2

Author Comment

by:jculkincys
ID: 16908678
Thanks a lot for the reply, Nopius.

I will try it out.

Would you mind explaining it a bit, as I am still no expert at sed and regular expressions?
0
 
LVL 27

Assisted Solution

by:Nopius
Nopius earned 200 total points
ID: 16908761
Oops, incorrect solution :-)
--[cut here]--
/^ /{
i\
<br>
: spc
/^ /i\
&nbsp;
/^ /s///
/^ /b spc
}
--[cut here]--

This script will produce the correct result, but each &nbsp; will be on a separate line (I don't know whether that is OK for you).
0
 
LVL 27

Expert Comment

by:Nopius
ID: 16909366
Version 3:
Usage: sed -E -f replace.sed file.in > file.out

replace.sed
--[cut here]--
# Select only non empty lines, starting with space
/^ +[^ ]/{
# Always insert <br> for lines starting with a space
i\
<br>
# do while we have leading spaces
:loop
# Change one space to one &nbsp; entity (\1 keeps every &nbsp; matched so far; \2 would keep only the last one)
/^((&nbsp;)*) /s//\1\&nbsp;/
# If we have more leading spaces, loop
/^((&nbsp;)*) /b loop
}
--[cut here]--

Now it works as intended. It's still a challenge to do the same with basic regular expressions (this version uses extended REs).
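
For what it's worth, the same loop looks doable with basic REs if the <br> is written onto the line itself first and then used as the anchor. The following is only a minimal sketch under that assumption, not a tested drop-in for the script above; the function name indent_to_nbsp is made up for illustration. Unlike the i\ approach, it keeps <br> on the same line as the text.

```shell
#!/bin/sh
# Hypothetical helper (name not from the thread): filters stdin to stdout.
# Only lines with leading spaces followed by a non-space character are changed;
# empty and space-only lines pass through untouched.
indent_to_nbsp() {
    sed '
/^ \{1,\}[^ ]/{
s/^/<br>/
:loop
s/^\(<br>\(&nbsp;\)*\) /\1\&nbsp;/
t loop
}'
}
```

Usage would be something like `indent_to_nbsp < file.in > file.out`. Each pass of the loop converts exactly one leading space, and `t loop` repeats until no space is left directly after the run of &nbsp; entities.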
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 16909638
> Insert  "<br>" and the same number of spaces (in this format "&nbsp;") into all the lines that start with one or more space

Are you trying to add "visible" spaces to an HTML page? Then this approach is somewhat useless. You can simply omit the leading &nbsp; before a <br> because they are not visible, for obvious reasons.

If you want to preserve the original text simply write it inside a <pre> </pre> and use a fixed width font.
0
 
LVL 16

Expert Comment

by:xDamox
ID: 16910562
Hi,

You could try this:

cat target.txt | sed -e 's/ /\&nbsp;/g' | sed -e 's/$/<br>/g'

The above will place a &nbsp; wherever there is a space and will insert a <br> at the end of each line.
0
 
LVL 2

Author Comment

by:jculkincys
ID: 16910970
ahoffmann, the original text is in pre tags and I am trying to take it out of them while retaining each line's indentation.

Nopius and xDamox, I will experiment with your solutions and get back to you.
0
 
LVL 2

Author Comment

by:jculkincys
ID: 16911158
Nopius
I tested yours first and it works pretty well.
However, it produces output like this:
<example>
<br>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
</example>
Is there an easy way to convert that output to something like this (ie all on one line)
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;


xDamox
I just need to replace only the leading space with &nbsp;
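
One way the multi-line output shown above might be folded back onto a single line is a second sed pass that keeps appending the next line for as long as the current one consists only of <br> and &nbsp; entities. This is a sketch under the assumption that the output looks exactly like the example (a bare <br> line, then one &nbsp; per line, then the text); the function name join_nbsp_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: joins "<br>", "&nbsp;", "&nbsp;", ..., "text" into one line.
# The $!{...} guard avoids looping forever if the file ends on such a line.
join_nbsp_lines() {
    sed '
:join
/^<br>\(&nbsp;\)*$/{
$!{
N
s/\n//
b join
}
}'
}
```

It could be chained after the earlier script, for example `sed -f replace.sed yourfile.txt | join_nbsp_lines`.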
0
 
LVL 16

Expert Comment

by:xDamox
ID: 16911641
Hi,

Could you give me an example of what you are working with and what you would like it to
look like after it's been altered with sed?

0
 
LVL 2

Author Comment

by:jculkincys
ID: 16911981
sure xDamox - good idea

The input is formatted with <pre>, but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines. I also need to place a <br> at the beginning of each line that starts with one or more spaces.
<input>
Thursday, August 11
    Morning - JW - Verify the daily backups on all production related machines
            - LTM - Verify the production export  - good
            - LTM - Database check
            - LTM/SYSTEMS- Verify the production backup - good
</input>

you might want to copy this into notepad or something because the line wraps might make it confusing.
<desired output>
Thursday, August 11
<br>&nbsp;&nbsp;&nbsp;&nbsp;Morning - JW - Verify the daily backups on all production related machines
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Verify the production export  - good
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Database check
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM/SYSTEMS- Verify the production backup - good
</desired output>




0
 
LVL 16

Assisted Solution

by:xDamox
xDamox earned 600 total points
ID: 16912092
Hi,

The best I got was:

cat target.txt | sed -e 's/^ /<br>/g' | sed -e 's/ /\&nbsp;/g'

The one pitfall you might notice is that it inserts a &nbsp; for every space on the line, not just the leading ones.
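
To make that pitfall concrete, here is the same pipeline run on a single made-up line. The wrapper name first_space_to_br and the sample input are assumptions for illustration only. Note that the first leading space is consumed by the <br>, so the result also has one &nbsp; fewer than the original indentation.

```shell
#!/bin/sh
# The two-stage pipeline from the comment above, reading stdin instead of target.txt.
first_space_to_br() {
    sed -e 's/^ /<br>/g' | sed -e 's/ /\&nbsp;/g'
}

# Two leading spaces and one inner space:
printf '  a b\n' | first_space_to_br
# prints: <br>&nbsp;a&nbsp;b
```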
0
 
LVL 8

Expert Comment

by:Autogard
ID: 16912806
I was finding it hard to do this with a simple SED command so here is a bash script -- sure, you could put this all on one line instead of running it in a script, but I'll just post the script.

-------------------------------------------
#!/bin/bash

# Replace the first leading space with a <br> and a &nbsp;
sed -i 's/^ /<br>\&nbsp;/' "$1"

counter=0
while [ "$counter" -lt 1000 ]; do
    # Each pass turns one "&nbsp;<space>" per line into "&nbsp;&nbsp;"
    sed -i 's/\&nbsp; /\&nbsp;\&nbsp;/' "$1"
    counter=$((counter + 1))
done

# Print result
cat "$1"
----------------------------------------------

Call it using "scriptname.sh <filenametodoreplaceon>"

The only pitfalls that I see:
1. this will replace lines that contain only spaces with a <br>&nbsp;&nbsp; etc.
2. this will only handle files that have 1000 or fewer spaces at the front of a line (if you want more, just increase the "1000")
3. it will also replace any existing "&nbsp; " sequences that aren't at the beginning of a line
4. it will overwrite the original file (because of the "-i" option)

Kind of clunky -- I know, but it should work.  :)  Maybe someone else can use this to make an easier solution.
0
 
LVL 8

Accepted Solution

by:
Autogard earned 800 total points
ID: 16912862
To eliminate pitfall #1, add this before the final "cat" line:

# Blank out all lines that are now just a <br> followed by some &nbsp; (lines that were only spaces before this script was run)
sed -i 's/^<br>\&nbsp;\(\&nbsp;\)*$//' "$1"
0
 
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 400 total points
ID: 16914946
> input is formatted with <pre> but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines.
Why would you do that? It's more or less useless, since the result depends on the font used in the browser, which you cannot control. <pre> is the way to go; anything else is unreliable and not worth thinking about (except for academic philosophy ;-)

Anyway, such an academic solution with gawk:
gawk '{x=match($0,"[^ ]");if(x<2){print}else{s=substr($0,1,x-1);t=substr($0,x);gsub(" ","\\&nbsp;",s);print "<br>"s""t;}}' file

# to be improved in many ways, best with perl ...
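
For readers less familiar with awk, the one-liner above can be unpacked roughly as follows. This is a sketch assuming any POSIX awk rather than gawk specifically; the function name nbsp_indent_awk and the variable names lead and rest are made up here.

```shell
#!/bin/sh
# Hypothetical helper: filters stdin to stdout with plain awk.
nbsp_indent_awk() {
    awk '{
        x = match($0, /[^ ]/)        # position of the first non-space character, 0 if none
        if (x < 2) { print; next }   # no leading spaces, or an empty or space-only line
        lead = substr($0, 1, x - 1)  # the run of leading spaces
        rest = substr($0, x)         # everything from the first non-space character on
        gsub(/ /, "\\&nbsp;", lead)  # the backslash makes the ampersand literal in the replacement
        print "<br>" lead rest
    }'
}
```

Because only the leading run is passed to gsub, spaces inside the rest of the line stay as they are.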
0
 
LVL 2

Author Comment

by:jculkincys
ID: 16915077
ahoffmann -
I am doing this because we are moving these documents into a wiki.
The wiki does support <pre>, but then we lose all the other pretty formatting of the wiki.
Perl will work. How would you do it with Perl?

Autogard
that looks promising - let me give it a whirl

0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 16915146
did you test my gawk suggestion?
0
 
LVL 2

Author Comment

by:jculkincys
ID: 16915974
You all did great, thanks a lot.

ahoffmann, I accepted Autogard's as the "answer" only because he answered first and I really don't understand gawk that much.

here is the next challenge for this project
http://www.experts-exchange.com/Operating_Systems/Linux/Q_21888235.html
0
 
LVL 8

Expert Comment

by:Autogard
ID: 16916007
Thanks jculkincys!

From what I've heard, "awk" can be a powerful tool to use, as well as "sed", but I really just haven't found the time (or much of a need) to learn it.

Maybe you can point us all to a good tutorial, ahoffmann.  :)
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 16918077
0

Question has a verified solution.
