Solved

I need a shell script to find HTML files, identify a word in each file, and replace dummy text with that word

Posted on 2007-04-04
Last Modified: 2012-05-05
I have several parallel directories, each with a file named mes.html.
In each of these files is one of several standard words, like pure, planet or factor.  After each standard word is a number, like 12 or 08.
Based on which word is in the mes.html file, I need to replace the word thumb in thumb.jpg with the standard word and number found in the file, giving something like pure 08.jpg.  Even better would be no space in the replacement text (pure08.jpg).

Question by:kailee
16 Comments
 
LVL 14

Expert Comment

by:ygoutham
ID: 18855789
Some more clarity is required.

You have mes.html in many directories. Each mes.html contains words such as pure08, planet03, factor04, etc.

You want each of those words changed to an image file name, like the following:

pure08  ->  pure08.jpg
planet03 -> planet03.gif

and so on...

Is this right?

If so, yes, it is doable.

Author Comment

by:kailee
ID: 18856016
Not exactly.
mes.html is a standard file in each directory.
In mes.html are words like 'pure' followed by a number, like '08',
so you might find pure 08, planet 03 or factor 04.
Also, in each mes.html is the word 'thumb'.
Depending on which word (not number) is found, thumb needs to be replaced by that word and its number.

So I think a case/select construct is needed as part of the script, with one branch for each word.
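
A minimal sketch of that case-based idea in plain sh, not a tested production script. The function name replace_thumb is made up for illustration. It assumes the word and its number are on the same line, that grep supports -o (GNU and BSD grep do), and that mktemp is available. Because it scans the whole file before editing, it does not matter whether thumb appears before or after the standard word.

```shell
#!/bin/sh
# Sketch: replace 'thumb' in one file with the standard word and number found in it.
# replace_thumb is a hypothetical helper name, not part of any existing tool.

replace_thumb() {
    file=$1

    # First "word number" match in the file, e.g. "pure 08" or "pure08".
    match=$(grep -owE '(pure|planet|factor) *[0-9]+' "$file" | head -n 1)
    [ -n "$match" ] || return 1        # no standard word found: leave the file alone

    # One branch per standard word, as suggested in the question.
    case $match in
        pure*)   word=pure ;;
        planet*) word=planet ;;
        factor*) word=factor ;;
        *)       return 1 ;;
    esac

    # Strip everything up to the last non-digit, keeping only the trailing number.
    number=${match##*[!0-9]}

    # Write the result to a temporary file, then copy it back over the original.
    tmp=$(mktemp) || return 1
    if sed "s/thumb/$word$number/g" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Possible usage (commented out so that sourcing this file changes nothing):
# find . -name mes.html | while IFS= read -r f; do replace_thumb "$f"; done
```

With a file containing pure 08 and thumb.jpg, this should leave pure08.jpg in place of thumb.jpg, with no space in the result.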

LVL 84

Accepted Solution

by:
ozo earned 250 total points
ID: 18856097
find . -name mes.html | xargs perl -i -pe '($w,$n)=($1,$2) if /\b(pure|planet|factor)\s*(\d+)/; s/(thumb)/$w$n/g if $w'
LVL 6

Assisted Solution

by:_iskywalker_
_iskywalker_ earned 250 total points
ID: 18856493
A less cryptic version:

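list="pure planet factor"
# Word-splitting the find output assumes no whitespace in the paths.
list1=`find . -name mes.html`

for i in $list; do
for j in $list1; do
# Check whether the word is in this file; assumes the word and its number
# are on the same line and separated from other text by whitespace.
k=`grep "$i" "$j" | head -n 1`

pass=0
l=
n=

for m in $k; do
if [ $pass -eq 1 ]; then
# The token right after the word is taken as its number.
n=$m
break
fi
if [ "$m" = "$i" ]; then
pass=1
l=$m
fi
done
# Edit the file only if both the word and its number were found.
# -i edits in place (GNU sed; BSD sed needs -i ''). No space is put between word and number.
if [ -n "$l" ] && [ -n "$n" ]; then
sed -i -e "s/thumb/$l$n/g" "$j"
fi

done
done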

Author Comment

by:kailee
ID: 18861324
Can I use something other than a space as the separator in the list?  There are cases where some of the standard words are actually two words, like 'flower power'.
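
One possible way to do that, as a sketch: put one entry per line and read the list with read instead of splitting on spaces, so an entry like 'flower power' stays whole. The function name find_entry and the four entries are only placeholders for illustration. It assumes each entry appears in the file exactly as written in the list.

```shell
#!/bin/sh
# Sketch: a newline-separated word list, so entries may contain spaces.
# find_entry is a hypothetical helper; it prints the first list entry found in the file.

find_entry() {
    file=$1
    # IFS= and -r keep each line exactly as written, including inner spaces.
    while IFS= read -r entry; do
        # -F treats the entry as a fixed string, -q suppresses output.
        if grep -qF -- "$entry" "$file"; then
            printf '%s\n' "$entry"
            return 0
        fi
    done <<EOF
pure
planet
factor
flower power
EOF
    return 1    # none of the entries occur in the file
}
```

Calling find_entry on a mes.html that contains flower power 12 should print flower power as a single entry.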
LVL 84

Expert Comment

by:ozo
ID: 18861338
in that case, would you want to replace 'thumb' with 'flower power'?

Author Comment

by:kailee
ID: 18861649
Well...
I was going to do a search and replace after this process, to keep this one simple.
So, actually, each standard search word (or combination) is replaced by a pair of characters.
So, flower power 12 will become fp12, and mutual ecology 05 will become me05.
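
The abbreviation rule described here (first letter of each word, followed by the number) could be sketched in sh like this. The function name abbreviate is made up for illustration, and the sketch assumes the input consists only of whitespace-separated words and one number.

```shell
#!/bin/sh
# Sketch: turn "flower power 12" into "fp12".
# abbreviate is a hypothetical helper name.

abbreviate() {
    out=
    # $1 is left unquoted on purpose so the shell splits it on whitespace.
    for token in $1; do
        case $token in
            # Token contains a non-digit: treat it as a word and keep its first letter.
            *[!0-9]*) out=$out$(printf '%s' "$token" | cut -c1) ;;
            # Token is all digits: keep the whole number, leading zero included.
            *)        out=$out$token ;;
        esac
    done
    printf '%s\n' "$out"
}
```

Under this rule a single word such as pure 08 comes out as p08, with one letter instead of two.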

LVL 84

Expert Comment

by:ozo
ID: 18861682
then would pure 08 become p08?

Author Comment

by:kailee
ID: 18862153
Yes.  However, I could make it easier by adding a dummy word after the single-word ones, like pure dummy 08, world dummy 13, etc.
LVL 84

Expert Comment

by:ozo
ID: 18862429
can there be more than one standard word(s) in the file?
would they always appear before the corresponding thumbs?
LVL 6

Expert Comment

by:_iskywalker_
ID: 18862984
You could replace power with "" and then replace flower with flower power.

Author Comment

by:kailee
ID: 18864778
Ozo, no, there is only one entry per file.  Shall I set them to be one word or two words?
LVL 84

Expert Comment

by:ozo
ID: 18961934
perl -i -pe '($w,$x,$n)=($1,$2,$3) if /\b(?=(.).*?\b(\w))(?:flower power|word dummy)\s*(\d+)/; s/(thumb)/$w$x$n/g if $w'

Author Comment

by:kailee
ID: 19139096
Sorry, I was out of town for a while.  I'm back to looking at this and will try these solutions.
LVL 16

Expert Comment

by:Hanno Schröder
ID: 21170286
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I will leave the following recommendation for this question in the Cleanup Zone:
SPLIT POINTS - between _iskywalker_ {http:#18856493} and ozo{http:#18856097}

Any objections should be posted here in the next 4 days. After that time, the question will be closed.
JustUNIX, Experts Exchange Cleanup Volunteer
LVL 1

Expert Comment

by:Computer101
ID: 21198277
Forced accept.

Computer101
EE Admin