Solved

bash sed space replacement

Posted on 2006-06-14
Last Modified: 2012-05-05
I have a bunch of files in a folder.
For all of them, here is what I want to do:

Insert "<br>" into every line that starts with one or more spaces, and replace those leading spaces with the same number of "&nbsp;" entities (the leading spaces in the files are plain blanks, not already in the &nbsp; format). Ideally I would like to do this only for lines that are not blank, but I don't mind too much if blank lines end up being changed as well.

Any ideas? Thanks.
jculkincys



Question by:jculkincys
 
LVL 27

Expert Comment

by:Nopius
ID: 16908632
replace.sed:
--[cut here]--
/^ *[^ ]/{
i\
<br>
s/ /\&nbsp;/g
}
--[cut here]--
sed -f replace.sed yourfile.txt
 
LVL 2

Author Comment

by:jculkincys
ID: 16908678
Thanks a lot for the reply, Nopius.

I will try it out.

Would you mind explaining it a bit, as I am still no expert at sed and regular expressions?
 
LVL 27

Assisted Solution

by:Nopius
Nopius earned 50 total points
ID: 16908761
Oops, my first solution was incorrect :-) Try this instead:
--[cut here]--
/^ /{
i\
<br>
: spc
/^ /i\
&nbsp;
/^ /s///
/^ /b spc
}
--[cut here]--

This script will produce the correct result, but each &nbsp; will be on a separate line (I don't know whether that is OK for you).
 
LVL 27

Expert Comment

by:Nopius
ID: 16909366
Version 3:
Usage: sed -E -f replace.sed file.in > file.out

replace.sed
--[cut here]--
# Select only non empty lines, starting with space
/^ +[^ ]/{
# Always insert <br> for lines starting with a space
i\
<br>
# do while we have leading spaces
:loop
# Change one space to one &nbsp; symbol
/^((&nbsp;)*) /s//\1\&nbsp;/g
# If we have more leading spaces, loop
/^((&nbsp;)*) /b loop
}
--[cut here]--

Now it works as intended. It's still a challenge to do the same with basic regular expressions (this version uses extended REs).
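
On the basic-RE challenge: the same loop seems doable without extended REs if the <br> is written onto the line itself and then used as a fixed prefix to count from, which also keeps the <br> and the &nbsp; run on one line. A minimal sketch, assuming a POSIX-style sed; the function name nbsp_indent is made up for this example and is not part of the thread:

```shell
# Hypothetical helper: reads text on stdin, writes converted text to stdout.
nbsp_indent() {
    sed '
# Only lines that start with a space and contain a non-space character
/^ \{1,\}[^ ]/{
# Put <br> on the line itself rather than inserting it as a separate line
s/^/<br>/
:loop
# Turn the first remaining leading space into one &nbsp;
s/^\(<br>\(&nbsp;\)*\) /\1\&nbsp;/
# Repeat for as long as the substitution changed something
t loop
}
'
}
```

Usage would be along the lines of: nbsp_indent < file.in > file.out. Blank and space-only lines are left untouched because the address requires a non-space character.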
 
LVL 51

Expert Comment

by:ahoffmann
ID: 16909638
> Insert  "<br>" and the same number of spaces (in this format "&nbsp;") into all the lines that start with one or more space

Are you trying to add "visible" spaces to an HTML page? Then this approach is somewhat useless. You can simply omit the leading &nbsp; before a <br> because they are not visible, for obvious reasons.

If you want to preserve the original text, simply write it inside <pre> </pre> and use a fixed-width font.
 
LVL 16

Expert Comment

by:xDamox
ID: 16910562
Hi,

You could try this:

cat targt.txt | sed -e 's/ /\&nbsp;/g' | sed -e 's/$/<br>/g'

The above will place a &nbsp; wherever there is a space and insert a <br> at the end of each line.
 
LVL 2

Author Comment

by:jculkincys
ID: 16910970
ahoffmann, the original text is in pre tags and I am trying to take it out of them while retaining each line's indentation.

Nopius and xDamox, I will experiment with your solutions and get back to you.
 
LVL 2

Author Comment

by:jculkincys
ID: 16911158
Nopius,
I tested yours first and it works pretty well.
However, it produces output like this:
<example>
<br>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
</example>
Is there an easy way to convert that output to something like this (i.e. all on one line)?
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;


xDamox,
I need to replace only the leading spaces with &nbsp;
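
One possible way to fold that multi-line output back onto a single line is a second sed pass that keeps appending the next line while the pattern space is just a <br> followed by &nbsp; entities. This is only a sketch: the function name join_nbsp_lines is made up here, and it assumes the text contains no genuine line consisting solely of <br> plus &nbsp; entities.

```shell
# Hypothetical helper: joins a lone "<br>" line and the "&nbsp;" lines that
# follow it onto the next content line. Reads stdin, writes stdout.
join_nbsp_lines() {
    sed -e ':a' \
        -e '/^<br>\(&nbsp;\)*$/{' \
        -e '$!N' \
        -e 's/\n//' \
        -e 'ta' \
        -e '}'
}
```

It would be used as a filter after the earlier script, e.g. sed -f replace.sed yourfile.txt | join_nbsp_lines.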
 
LVL 16

Expert Comment

by:xDamox
ID: 16911641
Hi,

Could you give me an example of what you are working with and what you would like it to look like after it's been altered with sed?

 
LVL 2

Author Comment

by:jculkincys
ID: 16911981
Sure, xDamox, good idea.

The input is formatted with <pre>, but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines. Also, I need to place a <br> at the beginning of each line that starts with one or more spaces.
<input>
Thursday, August 11
    Morning - JW - Verify the daily backups on all production related machines
            - LTM - Verify the production export  - good
            - LTM - Database check
            - LTM/SYSTEMS- Verify the production backup - good
</input>

You might want to copy this into Notepad or something, because the line wraps might make it confusing.
<desired output>
Thursday, August 11
<br>&nbsp;&nbsp;&nbsp;&nbsp;Morning - JW - Verify the daily backups on all production related machines
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Verify the production export  - good
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM - Database check
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- LTM/SYSTEMS- Verify the production backup - good
</desired output>
 
LVL 16

Assisted Solution

by:xDamox
xDamox earned 150 total points
ID: 16912092
Hi,

The best I got was:

cat target.txt | sed -e 's/^ /<br>/g' | sed -e 's/ /\&nbsp;/g'

The one pitfall you might notice is that it inserts a &nbsp; for every space, not just the leading ones.
 
LVL 8

Expert Comment

by:Autogard
ID: 16912806
I was finding it hard to do this with a simple sed command, so here is a bash script. Sure, you could put this all on one line instead of running it in a script, but I'll just post the script.

-------------------------------------------
#!/bin/bash

# Replace the first leading space with a <br> and a &nbsp;
sed -i 's/^ /<br>\&nbsp;/' "$1"

counter=0
while [ $counter -lt 1000 ]; do
    # Each pass turns the first "&nbsp;<space>" on every line into "&nbsp;&nbsp;"
    sed -i 's/\&nbsp; /\&nbsp;\&nbsp;/' "$1"
    let counter=counter+1
done

# Print result
cat "$1"
----------------------------------------------

Call it using "scriptname.sh <filenametodoreplaceon>"

The only pitfalls that I see:
1. This will replace lines that contain only spaces with <br>&nbsp;&nbsp; etc.
2. This will only handle files that have 1000 or fewer spaces at the front of a line (if you want more, just increase the "1000").
3. It will also replace any existing "&nbsp; " sequences that aren't at the beginning of a line.
4. It will overwrite the original file (because of the "-i" option).

Kind of clunky, I know, but it should work.  :)  Maybe someone else can use this to make an easier solution.
 
LVL 8

Accepted Solution

by:Autogard
Autogard earned 200 total points
ID: 16912862
To eliminate pitfall #1, add this before the final "cat" line:

# Get rid of all lines that are now just a <br> followed by some &nbsp; (lines that were only spaces before this script was run)
sed -i 's/^<br>\&nbsp;\(\&nbsp;\)*$//' "$1"
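
For comparison, the fixed-count loop can probably be replaced by a single sed pass per file that loops internally with "t", which would sidestep pitfalls 1 to 3. The sketch below also keeps a backup copy to soften pitfall 4. The function name convert_in_place and the .bak suffix are choices made for this example, and it assumes a sed that accepts -i with an attached suffix (GNU and BSD sed both do).

```shell
#!/bin/bash
# Hypothetical variant: one sed pass per file, keeping the original as
# FILE.bak instead of overwriting it outright.
convert_in_place() {
    for f in "$@"; do
        # Address: lines starting with a space that also hold a non-space.
        # The loop converts one leading space per iteration until none remain.
        sed -i.bak \
            -e '/^ \{1,\}[^ ]/{' \
            -e 's/^/<br>/' \
            -e ':a' \
            -e 's/^\(<br>\(&nbsp;\)*\) /\1\&nbsp;/' \
            -e 'ta' \
            -e '}' "$f" || return 1
    done
}
```

To run it over every file in a folder, something like convert_in_place /path/to/folder/* should do, assuming the folder holds only the text files to convert.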
 
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 100 total points
ID: 16914946
> input is formatted with <pre> but I want to remove the pre tags and still maintain the multiple leading spaces that indent the lines.
Why would you do that? It's more or less useless, since it depends on the font used in the browser, which you cannot control. <pre> is the way to go; anything else is unreliable and not worth thinking about (except for academic philosophy ;-)

Anyway, here is such an academic solution with gawk:
gawk '{x=match($0,"[^ ]");if(x<2){print}else{s=substr($0,1,x-1);t=substr($0,x);gsub(" ","\\&nbsp;",s);print "<br>"s""t;}}' file

# to be improved in many ways, best with perl ...
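
Spelled out over several lines, the same awk idea might look like the sketch below. It assumes a POSIX awk (nothing gawk-specific is used), and the function name nbsp_indent_awk is invented for this example.

```shell
# Hypothetical helper: POSIX awk variant, reads stdin, writes stdout.
nbsp_indent_awk() {
    awk '
    {
        first = match($0, /[^ ]/)        # position of first non-space, 0 if none
        if (first < 2) {                 # not indented, or blank/space-only line
            print
            next
        }
        lead = substr($0, 1, first - 1)  # the leading spaces
        rest = substr($0, first)         # everything after them
        gsub(/ /, "\\&nbsp;", lead)      # "\\&" yields a literal ampersand
        print "<br>" lead rest
    }'
}
```

Called as nbsp_indent_awk < file, it leaves unindented, blank, and space-only lines as they are and converts only the leading spaces of the rest.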
 
LVL 2

Author Comment

by:jculkincys
ID: 16915077
ahoffmann,
I am doing this because we are moving these documents into a wiki.
The wiki does support <pre>, but then we lose all the other pretty formatting of the wiki.
Perl will work. How would you do it with perl?

Autogard,
that looks promising. Let me give it a whirl.

 
LVL 51

Expert Comment

by:ahoffmann
ID: 16915146
Did you test my gawk suggestion?
 
LVL 2

Author Comment

by:jculkincys
ID: 16915974
You all did great, thanks a lot.

ahoffmann, I accepted his as the "answer" only because he answered first and I really don't understand gawk that much.

Here is the next challenge for this project:
http://www.experts-exchange.com/Operating_Systems/Linux/Q_21888235.html
 
LVL 8

Expert Comment

by:Autogard
ID: 16916007
Thanks jculkincys!

From what I've heard "awk" can be a powerful tool to use as well as "sed", but I really just haven't found the time (or much of a need) to learn it.

Maybe you can point us all to a good tutorial, ahoffmann.  :)
 
LVL 51

Expert Comment

by:ahoffmann
ID: 16918077