Solved

Pass a variable to awk inside a Korn shell script.

Posted on 2004-04-23
Last Modified: 2013-12-13
I'm parsing a logfile for referring pages.
I want to structure my script such that I can pass a variable to awk to tell it which URL to use as the referring page. I can't figure out how to pass the variable.
I'm using the Korn shell.

Here's the script:

egrep ' 20[0-9] | 30[0-9] ' $1 | uniq | awk '{ print $7" "$11}' | awk ' $1 ~ /.htm/' | sed -f ~/bin/nl |  sed -f ~/bin/am | awk ' $2 ~ /\/inthenews\//'  | sort -fd | uniq -c > $2

This script produces a list of all referrals from any page in the "inthenews" directory.  I hardcoded "/inthenews/" because I can't figure out how to pass a variable to awk.

Thanks in advance for any advice.
Question by:jackstreet

Expert Comment

by:NovaDenizen
I recommend that you switch over to perl.  awk and sed are good for simple tasks, but perl has the power to do this in about a ten line script, and it would be easy to parameterize that.

Sermon over.  The difficulty with parameterizing the variable in the awk statement is that the argument to awk is in single quotes, which do not permit expansion of variable names.  The question is, can we rewrite this argument without using single quotes?  The answer is yes, and we can do it by escaping individual characters.

Assuming $dir contains the directory,
awk \ \$2\ \~\ /\\/$dir\\//
should do the trick.  Note that all spaces, the '$' in $2, the ~, and the backslashes are all individually escaped because we want awk to see them, and the $ in $dir is not escaped because we want ksh to expand the $dir variable.  This way, ksh passes ' $2 ~ /\/dirname\//' as its sole argument.

It may also be possible to do it using double quotes, but I don't know the precise details off the top of my head.

Author Comment

by:jackstreet
NovaDenizen,
I got this when I ran the code and passed 3 variables to the script: $1, $2, $3, which are, respectively, logfile name, directory name (inthenews), name of results file :

*****************************************
awk: syntax error at source line 1
context is
       >>> \ <<< inthenews\~/\/\//
awk: bailing out at source line 1

*****************************************

Looks like it thinks the $2 in "awk ' $2 ~..." should be the passed $2 variable and not the second field. It should be "awking" the second field ($2) for the $2 variable.
Yes? Clear as mud?
Thanks so much!
[Aside - Perl seems so dense compared to these little UNIX machines!]

Accepted Solution

by:NovaDenizen (earned 250 total points)
Ok, ksh is weirder than I thought.  I thought it would be the same as bash, but ksh has some rather obscure escaping behavior. Try this:
awk $' $2 ~ /\\/'$dirname$'\\//'
The argument is made up of three segments concatenated together.  $' $2 ~ /\\/' should not expand the $2 or ~ and should collapse the double backslash to a single, $dirname should be your directory name, and $'\\//' should finish up the argument with a single backslash.
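For comparison, the same three-segment idea can be written with plain single quotes, which every Bourne-style shell understands. This is only a minimal sketch, not the form used above: the sample lines and the `sample` and `matches` names are made up for illustration, and `dirname` is assumed to contain no regular-expression metacharacters.

```shell
dirname=inthenews

# Two made-up "page referrer" lines standing in for the real pipeline input.
sample='page1.htm /inthenews/story.htm
page2.htm /archive/old.htm'

# Segment 1 (single-quoted): awk sees $2, ~ and \/ untouched.
# Segment 2 (double-quoted): the shell expands $dirname.
# Segment 3 (single-quoted): closes the regular expression with \//.
matches=$(printf '%s\n' "$sample" | awk ' $2 ~ /\/'"$dirname"'\//')

printf '%s\n' "$matches"
```

Because the three segments touch with no spaces between them, the shell joins them into one argument before awk ever runs.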

Expert Comment

by:NovaDenizen
Double quotes might work too.
awk " \$2 \~ /\\/$dirname\\//"

Author Comment

by:jackstreet
I will check this Monday. Thanks!

Expert Comment

by:sunnycoder
Arguments can be passed to an awk script just like command-line arguments to any other program.

The number of arguments is held in the special variable ARGC, and the arguments themselves are held in the array ARGV (the same names as in C/C++, only in upper case).

If you can provide the exact input and output (formats), then maybe we can try to provide/suggest a better solution
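One way to apply this, sketched under assumptions: the directory is handed to awk as an ordinary command-line argument, read from `ARGV[1]` in a `BEGIN` block, and then blanked so that awk does not try to open it as an input file. The sample input and the `dir` and `matches` names are hypothetical.

```shell
dir=/inthenews/

matches=$(printf '%s\n' 'a.htm /inthenews/b.htm' 'c.htm /other/d.htm' |
  awk 'BEGIN { dir = ARGV[1]; ARGV[1] = "" }   # grab the argument, then hide it from file processing
       index($2, dir) > 0                       # literal substring test on field 2, no regex escaping
      ' "$dir")

printf '%s\n' "$matches"
```

With the only operand blanked out, awk falls back to reading standard input, so the command still works in the middle of a pipeline.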

Expert Comment

by:NovaDenizen
The problem is that he wants awk to see '$2', '~', and the '\'s, but wants ksh to see the '$dirname' parameter. It's just a matter of getting the escapes correct.

Author Comment

by:jackstreet
NovaDenizen,
After trying the double quotes I got this:

*****************************************
awk: syntax error at source line 1
context is
      $2 >>> \ <<< ~ /\/inthenews\//
awk: bailing out at source line 1

*****************************************
But after trying the first suggestion: awk $' $2 ~ /\\/'$dirname$'\\//'
It worked! I am able to pass a directory name as a variable to awk and it's "awking" the second field.
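The double-quoted attempt appears to fail because, inside double quotes, the shell removes a backslash only when it precedes `$`, a backquote, `"`, `\` or a newline. The `\~` therefore reaches awk intact, which is the stray backslash in the error message. A minimal sketch of the double-quoted form with that one backslash dropped (the sample input and the `prog` and `matches` names are invented for illustration):

```shell
dirname=inthenews

# \$ keeps the shell from expanding $2; \\ becomes a single backslash for awk.
# The ~ needs no escape inside double quotes.
prog=" \$2 ~ /\\/$dirname\\//"

# Show the program text that awk will receive.
printf '%s\n' "$prog"

matches=$(printf '%s\n' 'a.htm /inthenews/b.htm' 'c.htm /other/d.htm' | awk "$prog")
```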

I have to ask, how are you employing the first, third and fourth dollar signs in the above? All the documentation I've seen refers to the dollar sign as a means of variable substitution and you aren't using them for that.

It works great for directories but I may have to ask a related question if I can't figure out how to pass a partial URL as a variable.
Examples:  
WebHome/dirname/dirname/pagename
WebHome/dirname/pagename
WebHome/homepage

Is there a quick solution to that?

Author Comment

by:jackstreet
And one other example:
WebHome/dirname/pagename.html

Author Comment

by:jackstreet
Sorry -- in my question regarding the dollar signs I should have said the first and fourth dollar signs.

Expert Comment

by:NovaDenizen
The ksh construct $'...' tells ksh to treat the ... the same as a C compiler would treat the double-quoted string "...".  C does not treat '$' or '~' characters in any special way, so these pass through unchanged.  Then comes $dirname, which is not quoted or escaped, so ksh sees it as a variable and substitutes the variable value for it.  Then comes another $'...' sequence for the end.

The $'...' construct comes from ksh93. bash and zsh support it as well, but older Bourne-style shells such as ksh88 do not, so check which shell your script actually runs under.

The second question is harder because awk treats '/' characters as the boundaries of a regular expression literal. If you absolutely must use awk with a /.../ pattern, then you will need a routine that substitutes each "/" with "\/" so awk will know they are to be treated as regular characters. Also, '.' is a special character in regular expressions, which matches any single character, so it needs escaping too.

So, I think awk is inappropriate.  Told ya so :).  Perl is a winner here.  Here is a short script for you

#!/usr/bin/perl
while (<STDIN>) {
    @a = split(' ');
    if (index($a[1], $ARGV[0] != -1) { print $_ ; }  # Perl's argument array is @ARGV (upper case)
}

This looks at the second column of the input, and checks if the first script argument is a substring of the second column.  
If you want an equality check instead, substitute this line:
if ($a[1] eq $ARGV[0]) { print $_; }

Name the script something like 'ff2c' (find filename second column) or whatever, and use it instead of the awk command.
... | ff2c $dirname | ...



Expert Comment

by:NovaDenizen
Oops, I forgot a parenthesis in the first if statement.
if (index($a[1], $ARGV[0]) != -1)

Author Comment

by:jackstreet
I don't have perl on my system. I'm running a barebones Unix-on-Windows environment. I'll have to figure out what to do next.
Thanks!

Expert Comment

by:NovaDenizen
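Use sed to escape the filename before you run your mega-pipeline.
newname=$(echo "$dirname" | sed -e 's#/#\\/#g' -e 's/\./\\./g')
I might have the escapes a bit wrong there. The intent is to replace occurrences of '/' with '\/', and occurrences of '.' with '\.'. Note that the shell does not allow spaces around the '=' in an assignment, and that $( ) is safer than backquotes here because backquotes do their own backslash processing.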
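Put together, the escaping step might look like the sketch below. The URL, the sample input, and the `url`, `escaped` and `matches` names are placeholders. Only `/` and `.` are escaped here, so other regular-expression metacharacters in a URL would still need handling.

```shell
url=WebHome/dirname/pagename.html

# Turn / into \/ and . into \. so awk treats them literally inside /.../.
escaped=$(printf '%s\n' "$url" | sed -e 's#/#\\/#g' -e 's/\./\\./g')

printf '%s\n' "$escaped"

# The second sample line differs only where the dot should be, so it must not match.
matches=$(printf '%s\n' 'a.htm WebHome/dirname/pagename.html' 'b.htm WebHome/dirname/pagenameXhtml' |
  awk ' $2 ~ /'"$escaped"'/')
```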

Expert Comment

by:rein8
The following will also work:

awk '{ print myVar }' myVar="$aKshVar"

The assignment takes effect before awk reads its input, but it is not yet visible inside a BEGIN block.
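Building on that idea, here is a sketch that uses awk's standard `-v` option, which assigns the variable before any input is read. `index()` performs a literal substring test, so the slashes and dots in a partial URL need no escaping. The `target`, `ref` and `matches` names and the sample lines are invented for illustration.

```shell
target=WebHome/dirname/pagename.html

matches=$(printf '%s\n' 'a.htm WebHome/dirname/pagename.html' 'b.htm WebHome/homepage' |
  awk -v ref="$target" 'index($2, ref) > 0')

printf '%s\n' "$matches"
```

One caveat: awk interprets backslash escapes in a value assigned this way, so a value containing backslashes may need extra care.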