Europa MacDonald


Counting multiple conditions

The code below counts rows by the value of the third column (in my data set): it lists each third-column value along with the number of rows it appears in.

So for this sample data set:
101,102,143,145,146,149
101,102,143,145,147,148
101,102,143,145,247,149
102,120,143,147,248,149
102,134,144,245,346,447
102,125,144,145,446,548
102,125,144,145,446,549

when 101 is the first value it will list data for the third column as:

143 4
144 3

Could the code be adjusted to count all occurrences for 101 AND 102?
farzanj

Which code?

ASKER

apologies .... :)


#!/usr/bin/perl
use strict;
use warnings;
open M,"<master.vim" or die "master.vim $!";
my %c;

 /^101,[^,]*,(\d+)/ &&  $c{$1}++ while <M>;

close M;

open C,">count.txt" or die "count.txt $!";
print C "$_ $c{$_}\n" for sort{$a<=>$b}keys %c;
close C;
/^10[12],[^,]*,(\d+)/ &&  $c{$1}++ while <M>;
You mean

#!/usr/bin/perl
use strict;
use warnings;
open M,"<master.vim" or die "master.vim $!";
my %c;

 /^10[12],[^,]*,(\d+)/ &&  $c{$1}++ while <>;

close M;

open C,">count.txt" or die "count.txt $!";
print  "$_ $c{$_}\n" for sort{$a<=>$b}keys %c;
close C;


that alteration just hangs for a long time :(
If you changed <M> to <>, it will wait for you to enter the data on STDIN.
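To illustrate the difference between <M> and <> described above, here is a minimal sketch. The temporary file and the variable names are hypothetical stand-ins so the example runs on its own; they are not taken from the scripts in this thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for master.vim so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "101,102,143\n102,120,143\n";
close $tmp or die "close $path: $!";

# Reading from a handle that was opened on a named file.
open my $m, '<', $path or die "$path: $!";
my $from_handle = 0;
$from_handle++ while <$m>;
close $m;

# <> reads the files listed in @ARGV. With an empty @ARGV it falls back to
# STDIN and blocks until input arrives, which looks like a hang.
my $from_argv = 0;
{
    local @ARGV = ($path);
    $from_argv++ while <>;
}

print "handle: $from_handle, argv: $from_argv\n";
```

Both loops see the same two lines here because a file name was supplied in @ARGV. Run with no file name and nothing piped in, the second loop would sit waiting for keyboard input.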
I corrected <M> but it still hangs on my main data, and it doesn't write any result to count.txt for my short sample data list.
Sorry, I had made changes to see what you were getting.

This should run for you.


#!/usr/bin/perl
use strict;
use warnings;
open M,"<master.vim" or die "master.vim $!";
my %c;

 /^10[12],[^,]*,(\d+)/ &&  $c{$1}++ while <M>;

close M;

open C,">count.txt" or die "count.txt $!";
print C "$_ $c{$_}\n" for sort{$a<=>$b}keys %c;
close C;


It's not working for me at all.

this is the sample data

101,102,103,104,105,106
101,102,103,104,105,106
101,102,103,104,105,106
101,103,104,105,106,107
101,103,104,105,106,107
101,103,104,105,106,107
102,103,106,107,108,109
102,103,106,107,108,109

the program should show

101 103 3
101 104 3
102 106 2

Is this possible ?
changing
/^10[12],[^,]*,(\d+)/ &&  $c{$1}++ while <M>;
to
/^(10[12]),[^,]*,(\d+)/ &&  $c{"$1 $2"}++ while <M>;
should change the output from
103 3
104 3
106 2
to
101 103 3
101 104 3
102 106 2

But then you may also want to change
sort{$a<=>$b}keys %c
to
sort keys %c
to get rid of the "Argument isn't numeric" warning message
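A plain sort keys %c compares the combined "first third" keys as strings, which is fine while every value has the same number of digits. If numeric order on both columns is wanted without the warning, one option is to split each key before comparing. This is a sketch only; the sub name by_both_fields is made up for illustration, and the counts are the expected output from the sample above.

```perl
use strict;
use warnings;

# Composite "first third" keys with the counts expected for the sample data.
my %c = ('102 106' => 2, '101 104' => 3, '101 103' => 3);

# Compare the two numeric fields separately instead of the whole string,
# so <=> never sees a value like "101 103".
sub by_both_fields {
    my ($a1, $a3) = split ' ', $a;
    my ($b1, $b3) = split ' ', $b;
    $a1 <=> $b1 || $a3 <=> $b3;
}

my @ordered = sort by_both_fields keys %c;
print "$_ $c{$_}\n" for @ordered;
```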
Or you can try this:

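#!/usr/bin/perl
use strict;
use warnings;

my %count;

# Read the files named on the command line (or STDIN if none are given).
while(<>)
{
        # Skip lines that lack a numeric first and third column.
        my ($n1, $n2) = /^(\d+),[^,]+,(\d+)/ or next;
        $count{$n1}{$n2}++;
}

# Print "first third count", sorted numerically on both columns.
foreach my $v1 (sort { $a <=> $b } keys %count)
{
        foreach my $v2 (sort { $a <=> $b } keys %{$count{$v1}})
        {
                print $v1, " ", $v2, " ", $count{$v1}{$v2}, "\n";
        }
}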


using my sample data which I just ran from 1-60, count.txt contains

12 43 600
12 29 18240
12 42 1015
12 44 310
12 17 19840
12 26 23023
12 35 8008
12 40 2268
12 33 11200
12 46 33
12 39 3120
12 20 25578
12 18 22475
12 25 24288
12 15 11968
12 24 25300
12 30 16473
12 31 14688
12 28 19950
12 38 4125
12 19 24360
12 23 26000
12 32 12920
12 27 21560
12 37 5280
12 36 6578
12 34 9555
12 22 26325
12 14 6545
12 21 26208
12 45 128
12 41 1568
12 16 16368

when I use

/^(1[2]),[^,]*,(\d+)/ &&  $c{"$1 $2"}++ while <M>;

Maybe I have asked the question wrong ?
I thought you wanted
/^(10[12]),[^,]*,(\d+)/
Try my solution above by running like

./scriptname count.txt
farzanj, thanks, but I'm a non-programmer working on a stats project. I don't know how to do that yet.

ozo, what is /^(10[12]),[^,]*,(\d+)/ ?

my data starts 101 mostly, then 102 etc.
What I meant was that you need to put that in a file. You can name it anything, but I named it

scriptname

And the file you want it to read is count.txt, right?
And you need to make it executable:

chmod +x scriptname

Then on the command line you need to say
./scriptname count.txt
/(10[12])/ is equivalent to /(101|102)/
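For anyone else wondering what the full pattern does, here is the same regex spread out with the /x flag so each piece can carry a comment. This is only a sketch: the first two test lines come from the sample data in the question, and the third (starting with 103) is a made-up line to show a non-match.

```perl
use strict;
use warnings;

# The thread's pattern, laid out piece by piece.
my $pattern = qr{
    ^            # start of the line
    (10[12])     # capture 1: "101" or "102", same as writing 101|102
    ,            # comma after the first column
    [^,]*        # second column: any run of non-comma characters
    ,            # comma after the second column
    (\d+)        # capture 2: the digits of the third column
}x;

for my $line ('101,102,143,145,146,149',
              '102,125,144,145,446,548',
              '103,125,144,145,446,549') {   # hypothetical non-matching row
    if (my ($first, $third) = $line =~ $pattern) {
        print "$line -> first=$first third=$third\n";
    } else {
        print "$line -> no match\n";
    }
}
```

The character class [12] accepts exactly one character, either 1 or 2, which is why 10[12] only matches 101 and 102.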
Sorry, farzanj, you've lost me. I'm just hoping for a simple solution :)


thanks ozo

Is it possible to change the script to analyse two sets at the same time?
The script should already analyse two sets at the same time
unless you mean something else by at the same time than I am understanding.
I haven't explained myself properly.

Just now it analyses the first column (for 101) and the third column, and it counts.

Could it analyse the first column for 101, 102, 103 etc. and still count the rows and values of the third column?
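As an illustration of the generalisation asked about here, the sketch below accepts any number in the first column by using (\d+) in place of a fixed 101, and counts rows per first/third pair. It is an assumption-laden example, not a reconstruction of any answer in this thread: the sub names count_pairs and format_counts are invented, and the rows are the second sample data set, read through an in-memory handle so the script runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count rows per (first column, third column) pair from any readable handle.
sub count_pairs {
    my ($in) = @_;
    my %count;
    while (my $line = <$in>) {
        # Digits in column 1, anything in column 2, digits in column 3.
        next unless $line =~ /^(\d+),[^,]*,(\d+)/;
        $count{$1}{$2}++;
    }
    return \%count;
}

# Turn the nested counts into "first third rows" lines, sorted numerically.
sub format_counts {
    my ($count) = @_;
    my @lines;
    for my $first (sort { $a <=> $b } keys %$count) {
        for my $third (sort { $a <=> $b } keys %{ $count->{$first} }) {
            push @lines, "$first $third $count->{$first}{$third}";
        }
    }
    return @lines;
}

# Sample rows from the thread. For real data, open master.vim for reading
# here and print to a handle opened on count.txt instead of STDOUT.
my $sample = join "\n",
    '101,102,103,104,105,106', '101,102,103,104,105,106',
    '101,102,103,104,105,106', '101,103,104,105,106,107',
    '101,103,104,105,106,107', '101,103,104,105,106,107',
    '102,103,106,107,108,109', '102,103,106,107,108,109';

open my $in, '<', \$sample or die "in-memory open: $!";
print "$_\n" for format_counts(count_pairs($in));
close $in;
```

For the sample rows this prints one line per pair: 101 103 3, 101 104 3 and 102 106 2.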
ASKER CERTIFIED SOLUTION
ozo
