Regular Expression

How can I pull out the lines between two patterns that are themselves on different lines?
For example, I want to extract the lines of code between <form> and </form>
from an HTML file.

This is urgent. Please reply.
padmamvs Asked:
 
guadalupe Commented:
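open (FILE, "<", $file) or die "Cannot open $file: $!";

@Info=<FILE>;

close(FILE);

# Join with "" so no extra spaces are added; each line keeps its own newline.
$file_lines = join("", @Info);

# /s lets . match newlines; .*? stops at the first </form> instead of the last.
if ($file_lines =~ /(<form>.*?<\/form>)/s) {
    $lines = $1;
}

$lines will contain everything between the form tags, including the tags themselves.  If you don't want the tags, move the parens so they surround only the .*? like this:  <form>(.*?)<\/form>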
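If the file holds more than one form, a non-greedy match with the /g flag in list context collects each block separately. Here is a minimal sketch, assuming the tags appear exactly as <form> and </form>. The sample HTML string and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample input with two forms; real input would be read from the file.
my $html = <<'END';
<p>intro</p>
<form>
<input name="a">
</form>
<p>middle</p>
<form>
<input name="b">
</form>
END

# /s lets . cross newlines, /g finds every match, and .*? stops at the
# nearest </form>, so each form is captured on its own.
my @forms = $html =~ /<form>(.*?)<\/form>/gs;

print scalar(@forms), " form(s) found\n";
print "---$_---\n" for @forms;
```

With a greedy .* the same pattern would return a single match running from the first <form> to the last </form>.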
 
Kenny (IT Application Executive) Commented:
In the HTML file, try to place the <form> and </form> tags on lines by themselves. Definitely do not put two form tags on the same line, or the code below will break.

In your Perl script:

 open (HTML, "htmlfile.htm") or die "Cannot open htmlfile.htm: $!";
 $Count=0;
 while (<HTML>)
   {
   $Line=$_;
   if ($Line =~ /<form>/i)
     {
     $Count=1;   # entered the form block
     }
   if ($Count != 0)
     {
     print $Line;
     # you can do whatever you want here
     }

   if ($Line =~ /<\/form>/i)
     {
     $Count=0;   # left the form block
     }
   }
 close (HTML);

Hope this helps.
 
Kenny (IT Application Executive) Commented:
Oh yeah... if you do not want to print the <form> and </form> lines themselves, reorder the checks as follows:


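while (<HTML>)
   {
   $Line=$_;

   # Check for the closing tag first so that line is not printed.
   if ($Line =~ /<\/form>/i)
     {
     $Count=0;
     }

   if ($Count != 0)
     {
     print $Line;
     # you can do whatever you want here
     }

   # Check for the opening tag last so that line is not printed either.
   if ($Line =~ /<form>/i)
     {
     $Count=1;
     }
   }
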
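Real form tags usually carry attributes (for example <form action="..." method="post">), which the literal /<form>/ pattern will not match. Here is a rough sketch of the same flag technique wrapped in a function that takes the two patterns as arguments. The function name lines_between, its parameters, and the sample lines are all hypothetical, not part of any library:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines from one matching $start through
# the next one matching $end. The matching lines themselves are kept only
# when $keep_tags is true.
sub lines_between {
    my ($lines, $start, $end, $keep_tags) = @_;
    my ($inside, @out) = (0);
    for my $line (@$lines) {
        my $is_start = !$inside && $line =~ $start;
        my $is_end   = ($inside || $is_start) && $line =~ $end;
        $inside = 1 if $is_start;
        push @out, $line if $inside && ($keep_tags || !($is_start || $is_end));
        $inside = 0 if $is_end;
    }
    return @out;
}

# Made-up sample lines standing in for the HTML file.
my @sample = (
    "<p>before</p>\n",
    "<FORM action=\"/go\" method=\"post\">\n",
    "<input name=\"q\">\n",
    "</form>\n",
    "<p>after</p>\n",
);

# \b[^>]*> allows attributes after the tag name; /i ignores case.
my $start = qr/<form\b[^>]*>/i;
my $end   = qr/<\/form>/i;

my @inner = lines_between(\@sample, $start, $end, 0);
my @outer = lines_between(\@sample, $start, $end, 1);
print "Without tags:\n", @inner;
print "With tags:\n", @outer;
```

Passing the patterns in as qr// objects means the same function works for any pair of start and end markers, not only form tags.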
 
ozo Commented:
perl -ne 'print if /<form>/../<\/form>/' file

But if you want to parse HTML, and not just pull out lines between two patterns that are themselves on different lines,
then you will need something a little more clever.
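The one-liner above relies on Perl's range operator (..) in scalar context, where it acts as a flip-flop: false until the left pattern matches, then true up to and including the line where the right pattern matches. Here is a small sketch of the same idea written out as a script, with a variant that skips the tag lines. The sample lines and variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the HTML file.
my @input = (
    "<p>before</p>\n",
    "<form>\n",
    "<input name=\"q\">\n",
    "</form>\n",
    "<p>after</p>\n",
);

my (@with_tags, @without_tags);
for my $line (@input) {
    # The flip-flop returns a sequence number while it is "on":
    # 1 on the opening line, and a value ending in "E0" on the closing line.
    if (my $seq = $line =~ /<form>/ .. $line =~ /<\/form>/) {
        push @with_tags, $line;
        push @without_tags, $line unless $seq == 1 or $seq =~ /E0$/;
    }
}

print "Including tag lines:\n", @with_tags;
print "Excluding tag lines:\n", @without_tags;
```

This is still line-based matching, so it shares the one-liner's limits: it does not understand nesting, comments, or tags split across lines.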