Solved

AWK nesting and Regular Expression

Posted on 2009-07-06
Last Modified: 2012-05-07
I want to use one file as the source of my regular expressions while I run awk on another file. The idea is to have each new regular expression ($i) applied to a given file in turn.

I want to use awk to solve this problem:

See the code snippet for where I am stuck.

Thank you

PA

awk '

{ for ( i = 1; i <= NF; i++ )

     # I want to do :  /enterprises.9.9.166.1.5.1.1.2./  && $0~i$  on file ./MIBDump 

}' class_index

Question by:pierre-alex
12 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 24791621
what do you mean by
$0~i$  on file ./MIBDump
do you want $0 to come from  ./MIBDump?
do you want $i to come from /MIBDump?
and it looks like the expression just returns true or false.  do you actually want to do anything?
 

Author Comment

by:pierre-alex
ID: 24792932
Ozo,


I want to run a regular expression that looks for the occurrence of /enterprises.9.9.166.1.5.1.1.2./ in the file "MIBDump" AND the occurrence of $i, where $i is a value taken from the file "class_index".

So let's suppose the file class_index holds two values, 33151 and 39891. The script will then run a search for:

/enterprises.9.9.166.1.5.1.1.2./ && /33151/
then

/enterprises.9.9.166.1.5.1.1.2./ && /39891/


The search will be done on file "MIBDump".


(The purpose of $0~$i is to fetch an entire line of the file class_index and put it in the variable $i.)

Thanks

PA


 
LVL 8

Expert Comment

by:JIEXA
ID: 24810772
I'd concatenate the 2 files with separator. The code below is not tested, but you can understand what's done.
(cat expression.txt ; echo SEPARATOR ; cat data.txt ) | awk '

BEGIN{separatorSeen=0}

# Handle the separator first and skip it, so it is never stored as the expression
/^SEPARATOR$/{separatorSeen=1; next}

(separatorSeen==0){expression=$0}

(separatorSeen==1 && ...){  if ($0~expression) ... }'

 

Author Comment

by:pierre-alex
ID: 24841717
JIEXA

Thanks for the reply, but the files need to stay separate.

Regards

PA
 
LVL 8

Expert Comment

by:JIEXA
ID: 24842067


cat data.txt | awk '

BEGIN{

  getline < "expression.txt"

  expression=$0

}

{  if ($0~expression) ... }'

 
LVL 84

Expert Comment

by:ozo
ID: 24842998
If I am understanding what you are saying, it sounds like you are asking for
grep -F enterprises.9.9.166.1.5.1.1.2. MIBDump  | grep -f class_index
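For comparison, here is a minimal awk-only sketch of the same two-stage filter. It assumes class_index holds one pattern per line and uses the common NR == FNR idiom to tell the first input file from the second. The sample file contents and the temporary directory are invented for illustration; real MIBDump lines may look different.

```shell
#!/bin/sh
# Sample data below is invented for illustration only.
dir=$(mktemp -d)
cat > "$dir/class_index" <<'EOF'
33151
39891
EOF
cat > "$dir/MIBDump" <<'EOF'
enterprises.9.9.166.1.5.1.1.2.1043.33151 = Gauge32: 11111
enterprises.9.9.166.1.5.1.1.2.1043.50000 = Gauge32: 22222
enterprises.9.9.166.1.5.1.1.4.1043.39891 = Gauge32: 33333
enterprises.9.9.166.1.5.1.1.2.1043.39891 = Gauge32: 44444
EOF

out=$(awk '
    # First file (class_index): remember each non-empty line as a pattern.
    # Note: this idiom assumes the first file is not empty.
    NR == FNR { if ($0 != "") pat[++n] = $0; next }
    # Second file (MIBDump): fixed-string test for the OID prefix (like grep -F),
    # then a regex test against every stored pattern (like grep -f).
    index($0, "enterprises.9.9.166.1.5.1.1.2.") {
        for (i = 1; i <= n; i++)
            if ($0 ~ pat[i]) { print; break }
    }
' "$dir/class_index" "$dir/MIBDump")

printf '%s\n' "$out"
rm -rf "$dir"
```

With this sample data, only the first and last MIBDump lines should be printed: the second line matches the prefix but none of the patterns, and the third line has a different OID prefix.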

 

Author Comment

by:pierre-alex
ID: 24885421
Ozo, yes this is correct but it needs to be done in awk ...

Thank you

PA
 

Author Comment

by:pierre-alex
ID: 24885980
JIEXA:

Using "getline" seems a good idea, however, when used in the "BEGIN" section, only the FIRST expression in the file "expression.txt" is retrieved.

I have tried using "getline" in the main body of awk, but the behaviour is as follows:

For each line in the file "data.txt", the NEXT line in the file "expression.txt" is retrieved and used as a regular expression. So what I end up with is the FIRST line of "data.txt" tested against the second line of "expression.txt", then the SECOND line of "data.txt" against the third line of "expression.txt", etc.

This is not what I want, but it seems to be the natural behaviour of "getline" according to the documentation.

What I need is to match each line of "data.txt" against the first expression in "expression.txt", then each line of "data.txt" against the second expression, then each line of "data.txt" against the third expression, etc.

Can this be accomplished with an awk program in the first place?
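One possible way to get that "every data line per expression" order is sketched below. It relies on two standard awk features: `getline var < file`, which reads into a variable without touching $0, and `close(file)`, which makes the next getline on that file start again from its first line. The variable names (exprfile, datafile, pat, line) and the sample file contents are assumptions made up for this sketch.

```shell
#!/bin/sh
# Sample data below is invented for illustration only.
dir=$(mktemp -d)
printf '%s\n' 33151 39891 > "$dir/expression.txt"
cat > "$dir/data.txt" <<'EOF'
a.33151 = 1
b.39891 = 2
c.33151 = 3
EOF

out=$(awk -v exprfile="$dir/expression.txt" -v datafile="$dir/data.txt" '
BEGIN {
    # Outer loop: one expression at a time.
    while ((getline pat < exprfile) > 0) {
        # Inner loop: scan the whole data file for this expression.
        while ((getline line < datafile) > 0)
            if (line ~ pat)
                print pat ": " line
        # Closing the data file makes the next getline reopen it from the top.
        close(datafile)
    }
    close(exprfile)
}')

printf '%s\n' "$out"
rm -rf "$dir"
```

Because everything happens in BEGIN, awk is given no input files on the command line. The trade-off is that the data file is re-read once per expression, which is fine for small files but may be slow for large ones.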
 

thanks

PA
 

Author Comment

by:pierre-alex
ID: 24886473
JIEXA:

I spent some time rewriting the code to solve the problem using arrays.

(My attempt using "getline" was unfortunately fruitless, see previous post)

My new code is working OK (see snippet), but I have to manually populate the array in the "BEGIN" section of awk. That is awkward.


What I need is a way to populate my array arr[x] on initialization from a secondary file, "class_index".

Working code, primary and secondary files all can be found below.

Any suggestions?

Regards

PA
 


awk '
BEGIN {
    arr[1] = "102983"
    arr[2] = "103019"
    target_parent = "102975"
    #print arr[1]; print arr[2]
}

/enterprises.9.9.166.1.5.1.1.4/ {
    #print $0
    for ( i = 1; i <= 2; i++ ) {
        value = arr[i]

        # Trying to run a regular expression within a for loop: failed
        # /102975/ && $0~value
        # {
        #  print $0
        # }

        gsub(/.*\./, "", $1)
        child = $1
        #print child

        m = split($4, t, ".")
        parent = t[m]
        #print "last value:"
        #print t[m]

        if ( child == arr[i] && target_parent == parent ) {
            print "parent:", parent
            print "child:", child
        }
    }
}' MIBDump
 

Output:
 

parent: 102975

child: 102983


MIBDump
class-index
 
LVL 8

Accepted Solution

by:
JIEXA earned 125 total points
ID: 24886837
You can use a loop for getline!
BEGIN {

  c=1

  while((getline < "expression.txt") > 0)

  {

    arr[c]=$0

    c++

  }

}
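Put together with the earlier child/parent matching code, the getline loop might be used roughly as follows. This is only a sketch: the index_file variable, the use of `getline line < file` (reading into a variable instead of $0), and the sample MIBDump lines are assumptions made for illustration, not the asker's real data.

```shell
#!/bin/sh
# Sample data below is invented for illustration only.
dir=$(mktemp -d)
printf '%s\n' 102983 103019 > "$dir/class_index"
cat > "$dir/MIBDump" <<'EOF'
enterprises.9.9.166.1.5.1.1.4.1043.102983 = Gauge32: 102975
enterprises.9.9.166.1.5.1.1.4.1043.103019 = Gauge32: 555555
enterprises.9.9.166.1.5.1.1.4.1043.200000 = Gauge32: 102975
EOF

out=$(awk -v index_file="$dir/class_index" -v target_parent="102975" '
BEGIN {
    # Load every line of the index file into arr[1..n].
    # The "> 0" test also stops the loop if the file cannot be read (-1).
    while ((getline line < index_file) > 0)
        arr[++n] = line
    close(index_file)
}
/enterprises\.9\.9\.166\.1\.5\.1\.1\.4\./ {
    # child = last dotted component of field 1, parent = last of field 4
    m = split($1, c, "."); child = c[m]
    m = split($4, t, "."); parent = t[m]
    for (i = 1; i <= n; i++)
        if (child == arr[i] && parent == target_parent) {
            print "parent:", parent
            print "child:", child
        }
}' "$dir/MIBDump")

printf '%s\n' "$out"
rm -rf "$dir"
```

With this sample data, only the first MIBDump line has both a child listed in class_index and the target parent, so the output should match the two lines shown in the earlier post (parent: 102975, child: 102983).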

 
LVL 8

Expert Comment

by:JIEXA
ID: 24886840
Or if there are only 3 data lines:
BEGIN {

  file = "expression.txt"

  getline < file

  arr[1] = $0

  getline < file

  arr[2] = $0

  getline < file

  target_parent = $0

}

...

 

Author Closing Comment

by:pierre-alex
ID: 31600226
JIEXA:

Brilliant! Working like a charm now.

Thanks for your help

PA