Solved

stoopid pattern matching...

Posted on 2000-04-26
Last Modified: 2010-03-05
I've got the following text in a UNIX text file that I need to do a pattern match on:

<SCRIPT LANGUAGE="JavaScript">
    pagetitle="GLCC Home Page";
</SCRIPT>

I'm trying to do a find-and-replace through the command line using this syntax:
    perl -pi.bak -e 's#...#...#oi'  *.html
But I can only match over one line. That is, I can match 'Script">\n' and I can match '\s+pagetitle', but I can't match 'Script">\n\s+pagetitle' for some obnoxious reason. My Camel book hasn't been enlightening. Help?
Question by:mblase
 
LVL 16

Expert Comment

by:maneshr
ID: 2753009
try this..

perl -pi.bak -e 'BEGIN{$/="";} s#Script">\s+pagetitl#SCR#oi' *.html
 
LVL 2

Author Comment

by:mblase
ID: 2753127
Thanks. But since this is a 100-pointer, I won't award points until you tell me just why the heck that works. :-)
 
LVL 16

Accepted Solution

by:
maneshr earned 100 total points
ID: 2753163
The problem is the -p switch.

This switch causes Perl to assume the following loop around your script, which makes it iterate over filename arguments somewhat like sed does:

     LINE:
     while (<>) {
         ...                # your script goes here
     } continue {
         print;
     }

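So your script only ever sees one line at a time, but you want the pattern to see several lines at once.
To get that, you have to change the $/ (INPUT RECORD SEPARATOR) variable so that a single \n no longer ends a record.
We do that using $/="";, which puts Perl into paragraph mode: each read returns everything up to the next blank line, so your whole <SCRIPT> block arrives as one record. Note that this does not read the entire file at once; for that you would undefine $/ instead.

Unfortunately, putting the $/=""; within the while(<>) loop does not do the job, because the first record has already been read with the old separator by the time the statement runs. That is why it goes into a BEGIN block.

A BEGIN block is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file is parsed.

So the separator is already changed when the while loop does its first read, and \s+ can match across the newlines inside that record.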

Let me know if anything is still unclear and I will simplify it further.
 
LVL 2

Author Comment

by:mblase
ID: 2753188
ahhhhhhhh.... thanks for the explanation of the -p switch. I was wondering why all the //s and //m switches wouldn't work, and the BEGIN{} was necessary. This'll come in way handy down the road, too. Many thanks!
 
LVL 84

Expert Comment

by:ozo
ID: 2753407
perl -0777 -pi.bak -e 's#...#...#oi'  *.html
perldoc perlrun