bash sed space replacement

I have a bunch of files in a folder.
For every file, here is what I want to do:

On each line that starts with one or more spaces, insert "<br>" followed by the same number of spaces in "&nbsp;" form. (The existing leading spaces are plain blank characters, not "&nbsp;".) Ideally this would apply only to lines that are not blank, but I don't mind much if blank lines end up converted too.

Any ideas? Thanks.
jculkincys



 
Nopius commented:
replace.sed:
--[cut here]--
/^ *[^ ]/{
i\
<br>
s/ /\&nbsp;/g
}
--[cut here]--
sed -f replace.sed yourfile.txt
 
jculkincys (author) commented:
Thanks a lot for the reply, Nopius.

I will try it out.

Would you mind explaining it a bit? I am still no expert at sed and regular expressions.
 
Nopius commented:
Oops, incorrect solution :-)
--[cut here]--
/^ /{
i\
<br>
: spc
/^ /i\
&nbsp;
/^ /s///
/^ /b spc
}
--[cut here]--

This script produces the correct result, but each &nbsp; ends up on a separate line (I don't know whether that is OK for you).
 
Nopius commented:
Version 3:
Usage: sed -E -f replace.sed file.in > file.out

replace.sed
--[cut here]--
# Select only non-empty lines that start with a space
/^ +[^ ]/{
# Always insert <br> before lines starting with a space
i\
<br>
# Loop while there are leading spaces left
:loop
# Change one space into one &nbsp; entity, keeping the &nbsp; run already built (\1)
/^((&nbsp;)*) /s//\1\&nbsp;/
# If there are more leading spaces, loop
/^((&nbsp;)*) /b loop
}
--[cut here]--

Now it works as intended. It is still a challenge to do the same with basic regular expressions (this version uses extended REs).
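
On the basic-RE challenge: a minimal sketch, assuming a POSIX-style sed. The function name nbsp_bre is made up for illustration. It swaps " +" for "  *", escapes the group parentheses, and uses "t" to repeat the substitution until no leading space is left. Unlike the script above, it puts <br> on the same line with s/^/<br>/ instead of "i\".

```shell
#!/bin/sh
# nbsp_bre: hypothetical helper name, not one of the thread's scripts.
# Reads stdin. For non-blank lines that start with spaces, turns each
# leading space into &nbsp; and prefixes <br>. Basic REs only (no -E).
nbsp_bre() {
    sed -e '/^  *[^ ]/{' \
        -e ':loop' \
        -e 's/^\(\(&nbsp;\)*\) /\1\&nbsp;/' \
        -e 't loop' \
        -e 's/^/<br>/' \
        -e '}'
}

# Small demo on two lines of sample text.
printf '%s\n' 'Thursday, August 11' '    Morning - JW' | nbsp_bre
```

The "t loop" branch is taken only if the preceding substitution changed something, so the loop ends as soon as the line no longer starts with a run of &nbsp; followed by a space.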
 
ahoffmann commented:
> Insert  "<br>" and the same number of spaces (in this format "&nbsp;") into all the lines that start with one or more space

Are you trying to add "visible" spaces to an HTML page? Then this approach is somewhat useless. You can simply omit the leading &nbsp; before a <br>, because they are not visible, for obvious reasons.

If you want to preserve the original text, simply put it inside <pre> </pre> and use a fixed-width font.
 
xDamox commented:
Hi,

You could try this:

cat target.txt | sed -e 's/ /\&nbsp;/g' | sed -e 's/$/<br>/g'

The above will place a &nbsp; wherever there is a space and insert a <br> at the end of each line.
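
The same effect can be had from a single sed process by passing both expressions with -e. This is only a sketch, and the function name all_spaces_to_nbsp is made up. Note that it converts every space, including the ones between words, and appends <br> to every line.

```shell
#!/bin/sh
# all_spaces_to_nbsp: hypothetical name. Same effect as the pipeline above,
# but in one sed process reading stdin (no cat, no second sed).
all_spaces_to_nbsp() {
    sed -e 's/ /\&nbsp;/g' -e 's/$/<br>/'
}

# Demo: both the indent and the space between the words are converted.
printf '  indented line\n' | all_spaces_to_nbsp
```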
 
jculkincys (author) commented:
ahoffmann, the original text is in pre tags and I am trying to take it out of them while retaining each line's indentation.

Nopius and xDamox, I will experiment with your solutions and get back to you.
 
jculkincys (author) commented:
Nopius,
I tested yours first and it works pretty well.
However, it produces output like this:
<example>
<br>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
</example>
Is there an easy way to convert that output to something like this (i.e. all on one line)?
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

xDamox,
I need to replace only the leading spaces with &nbsp;.
 
xDamox commented:
Hi,

Could you give me an example of what you are working with and what you would like it to
look like after it's been altered with sed?
 
jculkincys (author) commented:
Sure xDamox, good idea.

The input is formatted with <pre>, but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines. I also need to place a <br> at the beginning of each line that starts with one or more spaces.
<input>
Thursday, August 11
    Morning - JW - Verify the daily backups on all production related machines
            - LTM - Verify the production export  - good
            - LTM - Database check
            - LTM/SYSTEMS- Verify the production backup - good
</input>

you might want to copy this into notepad or something because the line wraps might make it confusing.
<desired output>
Thursday, August 11
<br>&nbsp;&nbsp;&nbsp;&nbsp;Morning - JW - Verify the daily backups on all production related machines
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Verify the production export  - good
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Database check
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM/SYSTEMS- Verify the production backup - good
</desired output>
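
For reference, the transformation specified above can be sketched in plain POSIX shell, without sed. This is a minimal sketch, not a tuned tool: the function name indent_to_nbsp is made up, it assumes the indentation consists of spaces only (no tabs), and a line-by-line shell loop will be slow on large files.

```shell
#!/bin/sh
# indent_to_nbsp: hypothetical helper, plain POSIX shell (no sed).
# Reads stdin and writes the converted text to stdout.
indent_to_nbsp() {
    while IFS= read -r line || [ -n "$line" ]; do
        lead=${line%%[! ]*}        # run of leading spaces (may be empty)
        rest=${line#"$lead"}       # everything after that run
        if [ -n "$lead" ] && [ -n "$rest" ]; then
            pad=''
            n=${#lead}
            while [ "$n" -gt 0 ]; do   # one &nbsp; per leading space
                pad="${pad}&nbsp;"
                n=$((n - 1))
            done
            printf '<br>%s%s\n' "$pad" "$rest"
        else
            printf '%s\n' "$line"  # no indent, or blank/space-only line
        fi
    done
}

# Demo on a few lines taken from the sample input.
indent_to_nbsp <<'EOF'
Thursday, August 11
    Morning - JW - Verify the daily backups on all production related machines
            - LTM - Database check
EOF
```

Lines without an indent, and lines that are empty or contain only spaces, pass through unchanged.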




 
xDamox commented:
Hi,

The best I got was:

cat target.txt | sed -e 's/^ /<br>/g' | sed -e 's/ /\&nbsp;/g'

The one pitfall you might notice is that it inserts a &nbsp; for every space, not only the leading ones.
 
Autogard commented:
I was finding it hard to do this with a simple sed command, so here is a bash script. Sure, you could put this all on one line instead of running it as a script, but I'll just post the script.

-------------------------------------------
#!/bin/bash

# Replace the first leading space with <br> and a &nbsp;
sed -i 's/^ /<br>\&nbsp;/' "$1"

counter=0
while [ "$counter" -lt 1000 ]; do
    # On each pass, replace one "&nbsp;<space>" per line with "&nbsp;&nbsp;"
    sed -i 's/\&nbsp; /\&nbsp;\&nbsp;/' "$1"
    counter=$((counter + 1))
done

# Print result
cat "$1"
----------------------------------------------

Call it using "scriptname.sh <filenametodoreplaceon>"

Only pitfalls that I see:
1. This will replace lines that contain only spaces with <br>&nbsp;&nbsp; etc.
2. This will only handle lines that have 1000 or fewer spaces at the front (if you want more, just increase the "1000").
3. It will also replace any existing "&nbsp; " sequences that are not at the beginning of a line.
4. It will overwrite the original file (because of the "-i" option).

Kind of clunky, I know, but it should work.  :)  Maybe someone else can use this to make an easier solution.
 
Autogard commented:
To eliminate pitfall #1, add this before the final "cat" command:

# Blank out lines that are now just a <br> followed by some &nbsp; (lines that contained only spaces before this script was run)
sed -i 's/^<br>\&nbsp;\(\&nbsp;\)*$//' "$1"
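
Since the question was about every file in a folder, here is a hedged sketch of a wrapper that converts a whole directory in one go and writes the results to a separate directory, which also sidesteps pitfall #4. The function name convert_dir and the directory names in the example are placeholders, and it assumes a sed that accepts -E.

```shell
#!/bin/sh
# convert_dir: hypothetical helper. Converts every regular file in directory
# $1 and writes the result under directory $2, leaving the originals untouched.
convert_dir() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue        # skip subdirectories and an empty glob
        # One pass per file: loop inside sed until no leading space is left,
        # then prefix <br>. Space-only and unindented lines are not touched.
        sed -E -e '/^ +[^ ]/{' \
            -e ':a' \
            -e 's/^((&nbsp;)*) /\1\&nbsp;/' \
            -e 't a' \
            -e 's/^/<br>/' \
            -e '}' "$f" > "$dst/$(basename "$f")"
    done
}

# Example call (paths are placeholders):
#   convert_dir ./reports ./reports-html
```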
 
ahoffmann commented:
> input is formatted with <pre> but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines.
Why would you do that? It is more or less useless, because the result depends on the font used in the browser, which you cannot control. <pre> is the way to go; anything else is unreliable and not worth thinking about (except for academic philosophy ;-)

Anyway, here is such an academic solution with gawk:
gawk '{x=match($0,"[^ ]");if(x<2){print}else{s=substr($0,1,x-1);t=substr($0,x);gsub(" ","\\&nbsp;",s);print "<br>"s""t;}}' file

# to be improved in many ways, best with perl ...
 
jculkincys (author) commented:
ahoffmann,
I am doing this because we are moving these documents into a wiki.
The wiki does support <pre>, but then we lose all the other pretty formatting of the wiki.
Perl will work. How would you do it with Perl?

Autogard,
that looks promising. Let me give it a whirl.

 
ahoffmann commented:
Did you test my gawk suggestion?
 
jculkincys (author) commented:
You all did great, thanks a lot.

ahoffmann, I accepted his as the "answer" only because he answered first and I really don't understand gawk that much.

Here is the next challenge for this project:
http://www.experts-exchange.com/Operating_Systems/Linux/Q_21888235.html
 
Autogard commented:
Thanks jculkincys!

From what I've heard, "awk" can be a powerful tool to use, just like "sed", but I really just haven't found the time (or much of a need) to learn it.

Maybe ahoffmann can point us all to a good tutorial.  :)
 
ahoffmann commented: