Solved

I need a shell script to find html files, identify a word in the file and replace dummy text with that word

Posted on 2007-04-04
Last Modified: 2012-05-05
I have several parallel directories, each with a file named mes.html.
In each of these files is one of several standard words, like pure, planet or factor.  After each standard word is a number, like 12 or 08.
Based on which word is in the mes.html file, I need to replace thumb.jpg with the standard word and its number from that file, like pure 08.jpg.  Even better would be no space in the replacement text.

Question by:kailee
16 Comments
 
LVL 14

Expert Comment

by:ygoutham
ID: 18855789
some more clarity is required.  

you have mes.html in too many directories.  this mes.html has words pure08, planet03, factor04 etc happening inside them.

you want pure08 to be changed to pure08.jpg and the following

pure08  ->  pure08.jpg
planet03 -> planet03.gif

and so on...

is this right??

if so, yes it is doable
 

Author Comment

by:kailee
ID: 18856016
not exactly.
mes.html is a standard file in each directory.
in mes.html, are words like 'pure' followed by a number, like '08'
so you might find pure 08, planet 03 or factor 04.
also, in each mes.html is the word 'thumb'.
depending on which word (not number) is found, thumb needs to be replaced by that word and its number.

So, I think case - select is needed as part of the script; one for each word.

 
LVL 84

Accepted Solution

by:
ozo earned 250 total points
ID: 18856097
find . -name mes.html | xargs perl -i -pe '($w,$n)=($1,$2) if /\b(pure|planet|factor)\s*(\d+)/; s/(thumb)/$w$n/g if $w'
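
For readers who prefer plain shell over the Perl one-liner, here is a minimal sketch of the same idea. It is not from the thread: the function name replace_thumb is hypothetical, it assumes a grep that supports -o (GNU and BSD grep both do), and it assumes the word and its number sit on the same line. Unlike the one-liner, it replaces every thumb in the file, including any that appear before the standard word.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): replace "thumb" in one file
# with the standard word and number found in that same file.
replace_thumb() {
    file=$1
    # First occurrence of a standard word followed by optional blanks and digits.
    match=$(grep -owE '(pure|planet|factor)[[:space:]]*[0-9]+' "$file" | head -n 1)
    # No standard word found: leave the file untouched.
    [ -n "$match" ] || return 0
    # Drop the whitespace so "pure 08" becomes "pure08".
    repl=$(printf '%s' "$match" | tr -d '[:space:]')
    # Write to a temporary file and move it back, avoiding non-portable sed -i.
    sed "s/thumb/$repl/g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Possible usage, run from the parent of the parallel directories:
#   find . -name mes.html | while IFS= read -r f; do replace_thumb "$f"; done
```

Because the function handles one file at a time, a word found in one mes.html cannot leak into the next file. It is worth trying on a copy of the directories first.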
 
LVL 6

Assisted Solution

by:_iskywalker_
_iskywalker_ earned 250 total points
ID: 18856493
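a less cryptic version:

list="pure planet factor"
list1=`find . -name mes.html`

for i in $list; do
for j in $list1; do
# check if the word is in this file; assumes the word and its number are on the same line
k=`grep -w "$i" "$j" | head -n 1`

pass=0
l=""
n=""

for m in $k; do
# the token right after the matched word is taken as its number
if [ $pass -eq 1 ]; then
n=$m
break
fi
# assumes the word stands alone as a whitespace-separated token
if [ "$m" = "$i" ]; then
pass=1
l=$m
fi
done
# edit only when a word and number were found; double quotes let the shell expand $l and $n
# -i edits the file in place (GNU sed syntax)
if [ -n "$n" ]; then
sed -i -e "s/thumb/$l$n/g" "$j"
fi

done
done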
 

Author Comment

by:kailee
ID: 18861324
can I use something other than a space in the list?  There are cases where some of the standard words are actually two words, like 'flower power'.
 
LVL 84

Expert Comment

by:ozo
ID: 18861338
in that case, would you want to replace 'thumb' with 'flower power'?
 

Author Comment

by:kailee
ID: 18861649
well.....
I was going to do a search and replace after this process, to keep this one simple.
so, actually, each standard search word (or combo) is replaced by a pair of characters.
So, flower power 12 will become fp12.  mutual ecology 05 will become me05.
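
The abbreviation rule described here (first letter of each word, number kept whole) could be sketched in shell as below. The function name abbreviate is hypothetical and not from the thread; it assumes the phrase is whitespace-separated and that the number is the only all-digit token.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): shorten a standard phrase to
# the first letter of each word, followed by its number.
abbreviate() {
    out=""
    # $1 is left unquoted on purpose so the shell splits it into tokens.
    for tok in $1; do
        case $tok in
            # Token contains a non-digit: treat it as a word, keep its first character.
            *[!0-9]*) out="$out$(printf '%s' "$tok" | cut -c1)" ;;
            # Token is all digits: keep the number as it is.
            *)        out="$out$tok" ;;
        esac
    done
    printf '%s\n' "$out"
}

abbreviate "flower power 12"    # prints fp12
abbreviate "mutual ecology 05"  # prints me05
abbreviate "pure 08"            # prints p08
```

With this rule a single-word entry needs no dummy second word, since it simply contributes one letter.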

 
LVL 84

Expert Comment

by:ozo
ID: 18861682
then would pure 08 become p08?
 

Author Comment

by:kailee
ID: 18862153
yes.  However, I could make it easier by adding a dummy word after the single word ones.  like pure dummy 08, world dummy 13, etc.
 
LVL 84

Expert Comment

by:ozo
ID: 18862429
can there be more than one standard word(s) in the file?
would they always appear before the corresponding thumbs?
 
LVL 6

Expert Comment

by:_iskywalker_
ID: 18862984
you could replace power with "" and then replace flower with flower power.
 

Author Comment

by:kailee
ID: 18864778
Ozo.  No, there is only one entry per file.  Shall I set them to be one word or two words?
 
LVL 84

Expert Comment

by:ozo
ID: 18961934
perl -i -pe '($w,$x,$n)=($1,$2,$3) if /\b(?=(.).*?\b(\w))(?:flower power|word dummy)\s*(\d+)/; s/(thumb)/$w$x$n/g if $w'
 

Author Comment

by:kailee
ID: 19139096
Sorry, I was out of town for awhile.  I'm back at looking at this and will try these solutions
 
LVL 16

Expert Comment

by:Hanno Schröder
ID: 21170286
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I will leave the following recommendation for this question in the Cleanup Zone:
SPLIT POINTS - between _iskywalker_ {http:#18856493} and ozo{http:#18856097}

Any objections should be posted here in the next 4 days. After that time, the question will be closed.
JustUNIX, Experts Exchange Cleanup Volunteer
 
LVL 1

Expert Comment

by:Computer101
ID: 21198277
Forced accept.

Computer101
EE Admin
