Solved

Perl script to extract pattern of specific text for a given directory and all of  its sub-directories

Posted on 2014-09-23
3
92 Views
Last Modified: 2015-08-23
I need to write a Perl program to  search every file (with particular file extension, say .sql) under all sub directories and grep the patterns of

"update "
"delete "
"alter table "

output of the report should be:

directory_path:fileA:NNNN: update  ......    full statement until a ";" is located
directory_path:fileA:NNNN: alter table ....  full statement until a ";" is located
directory_path:fileB:NNNN: delete ....

where [directory_path] should be the path from partent directory , current directory
             NNNNN is the line number in that file


Sample directory path:
Current directory
db-2.0.1.5
    sir-abcd
           fileA.sql
           fileB.sql
    sir-efjk
           fileC.sql
           fileD.sql
db-2.0.1.6
     sir-xyz
           fileA.sql
           fileC.sql
db-2.0.1.6

Note:fileA.sql under db-2.0.1.5/sir-abcd  may not be the same as  fileA.sql under db-2.0.1.6/sir-xyz

Therefore output could be:

db-2.0.1.5/sir-abcd:fileA:1234: update  ......    full statement until a ";" is located
db-2.0.1.6/sir-xyz:fileA:675: alter table ....  full statement until a ";" is located
directory_path:fileB:NNNN: delete ....

Note:  ordering of  db-2.0.1.5, db-2.0.1.6 is important
0
Comment
Question by:tindavid
  • 2
3 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 40338681
If there are several lines between the update  ...... and the full statement until a ";" is located
would you want
directory_path:fileA:NNNN:
on each of those lines, or only on the line that matched "update "?
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 40338698
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
find(sub{
    /\.sql$/ || return;
    open F,$_ or warn "$_ $!";
    my $name="$File::Find::dir:$_";
    while( <F> ){
         print "$name:$.: $_" if s/.*?(?=update |delete |alter table )// .. s/;.*/;/;
    }
    close F;
},".");
0
 

Author Comment

by:tindavid
ID: 40340746
Hi Ozo,

Looks ok, couple of improvement needs to be done.

1) for a DDL statement spread into lines, please contact them into single line
2) some data issue and the raw data needs to be filtered for actual DDL statement:

Assume all DDL must start on a line without any preceeding character and ended with ";" .  However, data in below migh or might now be a vlalid DDL:

          delete from table     <--  valid DDL  even has spaces before delete
rem  delete xxx                   <-- not a valid DDL, due to 'rem" or "Rem" indicates it is a comment line
delete from                         <--  valid DDL
   alter table                         <-- valid DDL
--         delete from table   <-- not a valid DDL, due to '--" indicates it is a comment line
 /*   delete from table    <-- not a valid DDL due to /* is the beginning of comment area, and */ need to end the comment area
     where xxxxx ....;
*/


Please be note that : --,  REM indicates the text after this key word act as comment line.
/* and */   indiactes any text in between will consider as comment
0

Featured Post

3 Use Cases for Connected Systems

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, testing some more, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article will show, step by step, how to integrate R code into a R Sweave document
Active Directory replication delay is the cause to many problems.  Here is a super easy script to force Active Directory replication to all sites with by using an elevated PowerShell command prompt, and a tool to verify your changes.
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question