perl and sort - unique

Posted on 2006-11-23
Last Modified: 2008-02-01
I have a script that gives me what I need, though I'm sure it could be written better. The problem is that to get the output
I want, I have to run the script and pipe it to Unix sort -u.

I run the script like this:

./script.pl logfile.txt | sort -u


The script prints many duplicate lines and I need just one of each. Unix sort -u handles this fine,
but I'm trying to figure out how to build that into the script so it is all done in Perl.

Script I'm using:

#!/usr/bin/perl
use strict ;
use warnings ;

while(<>){
my $syncQ4=/user=(.*?)&password=&appid=(Sync2005Q4)/;
my $NotifyLink=/user=(.*?)&password=&fmt-out=text\/xml&refresh=0&appid=(NotifyLink)/;
my $XCAP=/appid=(XCAP)&user=(.*?)&password/;
my $WEB=/&tzid=&test/ && /user=(.*?)&password/;
print "$1\t$2\n" if $NotifyLink;
print "$1\t$2\n" if $syncQ4;
print "$2\t$1\n" if $XCAP;
print "$1\tWeb\n" if $WEB;
}


Sample of the output I get when running the script without piping to sort -u:

user1      Web
user2      NotifyLink
user3      Sync2005Q4
user4      XCAP
user2      NotifyLink
user2      NotifyLink
user4      XCAP
user4      XCAP
user1      Web
user1      Web


Thanks,
Question by:bt707
12 Comments

Expert Comment

by:EarleAke
ID: 18003279
OK, I am not on a Unix system, so this may need a little work. The idea is to push the lines onto an array, sort it, and then do your own uniq: keep track of the last line printed and skip the next line if it matches.

#!/usr/bin/perl
use strict ;
use warnings ;

while(<>){
my $syncQ4=/user=(.*?)&password=&appid=(Sync2005Q4)/;
my $NotifyLink=/user=(.*?)&password=&fmt-out=text\/xml&refresh=0&appid=(NotifyLink)/;
my $XCAP=/appid=(XCAP)&user=(.*?)&password/;
my $WEB=/&tzid=&test/ && /user=(.*?)&password/;
push(@answer, sprint "$1\t$2") if $NotifyLink;
push(@answer, sprint "$1\t$2") if $syncQ4;
push(@answer, sprint "$2\t$1") if $XCAP;
push(@answer, sprint "$1\tWeb") if $WEB;
}

$lastline="";

foreach $line (sort @answer) {
  next if ($line eq $lastline);
  printf("%s\n", $line);
  $lastline=$line;
}


Author Comment

by:bt707
ID: 18003354
EarleAke,

I am on Unix and getting a lot of errors. I commented out use strict for now, which removed some of them, but I'm not sure what to do about the errors on sprint.

Here are the errors I get. I see you have a good suggestion for what I'm trying to do, so I'll keep working with it.

Thanks,


Errors I'm getting right now:


String found where operator expected at ./Sort-Appid.pl line 10, near "sprint "$1\t$2""
        (Do you need to predeclare sprint?)
String found where operator expected at ./Sort-Appid.pl line 11, near "sprint "$1\t$2""
        (Do you need to predeclare sprint?)
String found where operator expected at ./Sort-Appid.pl line 12, near "sprint "$2\t$1""
        (Do you need to predeclare sprint?)
String found where operator expected at ./Sort-Appid.pl line 13, near "sprint "$1\tWeb""
        (Do you need to predeclare sprint?)
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 10.
syntax error at ./Sort-Appid.pl line 10, near "sprint "$1\t$2""
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 11.
syntax error at ./Sort-Appid.pl line 11, near "sprint "$1\t$2""
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 12.
syntax error at ./Sort-Appid.pl line 12, near "sprint "$2\t$1""
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 13.
syntax error at ./Sort-Appid.pl line 13, near "sprint "$1\tWeb""
Global symbol "$lastline" requires explicit package name at ./Sort-Appid.pl line 16.
Global symbol "$line" requires explicit package name at ./Sort-Appid.pl line 18.
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 18.
Global symbol "$line" requires explicit package name at ./Sort-Appid.pl line 19.
Global symbol "$lastline" requires explicit package name at ./Sort-Appid.pl line 19.
Global symbol "$line" requires explicit package name at ./Sort-Appid.pl line 20.
Global symbol "$lastline" requires explicit package name at ./Sort-Appid.pl line 21.
Global symbol "$line" requires explicit package name at ./Sort-Appid.pl line 21.
Execution of ./Sort-Appid.pl aborted due to compilation errors.



Expert Comment

by:ozo
ID: 18003896
remove the sprint

Expert Comment

by:ozo
ID: 18003909
add
my(@answer,$line,$lastline);

Expert Comment

by:ozo
ID: 18003915
Do you need it sorted, or do you just need it unique?
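
If uniqueness alone is enough, one common Perl idiom is a hash that records which lines have already been seen. The sketch below is only an illustration: unique_lines is a hypothetical helper name, and the sample list is made up to mirror the output shown in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the given lines with duplicates removed, keeping first-seen order.
# unique_lines is a hypothetical helper, not part of the original script.
sub unique_lines {
    my %seen;
    # $seen{$_}++ is false (0) the first time a line appears, true afterwards.
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data in the same "user<TAB>app" shape the script prints.
my @answer = (
    "user1\tWeb",
    "user2\tNotifyLink",
    "user2\tNotifyLink",
    "user4\tXCAP",
    "user1\tWeb",
);

print "$_\n" for unique_lines(@answer);
```

This keeps the first occurrence of each line in its original order, so no sort is involved. If sorted output is wanted as well, the result can be passed through sort.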

Author Comment

by:bt707
ID: 18003946
ozo,

Yes, sorting it would be nice. How do I add the sort?

Thanks,

Author Comment

by:bt707
ID: 18003978
ozo,

I guess I added the line and removed the sprint all wrong.

Here is what I put in, but I'm still getting errors.


#!/usr/bin/perl
use strict ;
use warnings ;

while(<>){
my $syncQ4=/user=(.*?)&password=&appid=(Sync2005Q4)/;
my $NotifyLink=/user=(.*?)&password=&fmt-out=text\/xml&refresh=0&appid=(NotifyLink)/;
my $XCAP=/appid=(XCAP)&user=(.*?)&password/;
my $WEB=/&tzid=&test/ && /user=(.*?)&password/;
push(@answer, "$1\t$2") if $NotifyLink;
push(@answer, "$1\t$2") if $syncQ4;
push(@answer, "$2\t$1") if $XCAP;
push(@answer, "$1\tWeb") if $WEB;
}

my(@answer,$line,$lastline);

$lastline="";

foreach $line (sort @answer) {
  next if ($line eq $lastline);
  printf("%s\n", $line);
  $lastline=$line;
}



# ./Sort-Appid.pl commandlog.txt
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 10.
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 11.
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 12.
Global symbol "@answer" requires explicit package name at ./Sort-Appid.pl line 13.
Execution of ./Sort-Appid.pl aborted due to compilation errors.


Thanks,





Author Comment

by:bt707
ID: 18003990
I see that I should have added the line my(@answer,$line,$lastline); above the push lines, but I'm still getting errors, so I'm still working on it.

Accepted Solution

by:ozo
ID: 18004111
my @answer should come before the loop that uses it:
while(<>){
my $syncQ4=/user=(.*?)&password=&appid=(Sync2005Q4)/;
my $NotifyLink=/user=(.*?)&password=&fmt-out=text\/xml&refresh=0&appid=(NotifyLink)/;
my $XCAP=/appid=(XCAP)&user=(.*?)&password/;
my $WEB=/&tzid=&test/ && /user=(.*?)&password/;
push(@answer, "$1\t$2") if $NotifyLink;
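
Putting the fixes from this thread together (drop the sprint, and declare the variables with my before they are first used), a complete version might look like the sketch below. It is a rearrangement under stated assumptions, not the exact script bt707 ended up with: extract_pairs and sorted_unique are hypothetical helper names, and the @log lines are invented sample input so the example runs on its own, whereas the original reads the log with while(<>). Each capture is also used right after its own match, since $1 and $2 always refer to the last successful match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the "user<TAB>app" strings found in one log line (possibly none).
# Hypothetical helper; the patterns are the ones from the question.
sub extract_pairs {
    my ($line) = @_;
    my @found;
    push @found, "$1\t$2" if $line =~ /user=(.*?)&password=&fmt-out=text\/xml&refresh=0&appid=(NotifyLink)/;
    push @found, "$1\t$2" if $line =~ /user=(.*?)&password=&appid=(Sync2005Q4)/;
    push @found, "$2\t$1" if $line =~ /appid=(XCAP)&user=(.*?)&password/;
    push @found, "$1\tWeb" if $line =~ /&tzid=&test/ && $line =~ /user=(.*?)&password/;
    return @found;
}

# Sort the lines, then skip any line equal to the one kept just before it.
sub sorted_unique {
    my @out;
    my $lastline = "";
    foreach my $line (sort @_) {
        next if $line eq $lastline;
        push @out, $line;
        $lastline = $line;
    }
    return @out;
}

# Invented sample input standing in for the real log file.
my @log = (
    'user=user1&password=&tzid=&test',
    'user=user2&password=&fmt-out=text/xml&refresh=0&appid=NotifyLink',
    'user=user3&password=&appid=Sync2005Q4',
    'appid=XCAP&user=user4&password=',
    'user=user2&password=&fmt-out=text/xml&refresh=0&appid=NotifyLink',
    'appid=XCAP&user=user4&password=',
);

my @answer;    # declared before the loop that pushes onto it
foreach my $entry (@log) {
    push @answer, extract_pairs($entry);
}

print "$_\n" for sorted_unique(@answer);
```

To run this against a real log, the foreach over @log could be swapped back for the while(<>) loop from the question, calling extract_pairs($_) on each line.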

Author Comment

by:bt707
ID: 18004321
Thanks ozo,

That fixed it, worked great.

It comes out unique now, the same as when I was piping it to Unix sort -u.

How can I now sort by the second column of the output?


Thanks,

Expert Comment

by:ozo
ID: 18004361
foreach $line ( map{/\s(.*)/}sort map{(/\s(\S+)/)[0]." $_"}@answer )
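
The one-liner above prefixes each line with its second column, sorts the prefixed strings, then strips the prefix off again. A longer but perhaps easier to read sketch of the same idea follows. It assumes the two columns are separated by a tab, as in the script's output; unique_by_second_column is a hypothetical helper name, the sample data is made up, and the // operator needs Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Drop duplicate lines, then sort by the second tab-separated column,
# breaking ties by comparing the whole line.
# Hypothetical helper, not part of the original script.
sub unique_by_second_column {
    my %seen;
    my @unique = grep { !$seen{$_}++ } @_;
    return map  { $_->[1] }                                    # recover the line
           sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } # key, then line
           map  { [ (split /\t/, $_, 2)[1] // '', $_ ] }       # [key, line]
           @unique;
}

# Made-up sample data in the "user<TAB>app" shape the script produces.
my @answer = (
    "user1\tWeb",
    "user2\tNotifyLink",
    "user3\tSync2005Q4",
    "user4\tXCAP",
    "user2\tNotifyLink",
);

print "$_\n" for unique_by_second_column(@answer);
```

Because duplicates are removed with a hash before sorting, this version does not rely on the $lastline check in the loop.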

Author Comment

by:bt707
ID: 18005309
perfect, Thanks!!