Sort values as text

This code

#!/usr/bin/perl
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
open STDOUT,">output.txt" or die "output.txt $!";
print reverse sort <>;

Works great. It sorts a list of data, with two columns, into descending order based on the first column.

The problem that I have now, is that I require need further sorting.

the sorted data looks like this:

60000 84
51000 1260
50100 1260
50010 1260
50001 1260
42000 5670
41100 12600
41010 12600
41001 12600
40200 5670
40110 12600
40101 12600
.....
00600 210
00510 2520
00501 2520
00420 9450
00411 21000
00402 9450
Descending order - everything good so far.

The problem is that there are groups within this first column.

All groups containing

6
5,1
4,2
4,1,1
3,3
3,2,1
3,1,1,1
2,2,2
2,2,1,1
2,1,1,1,1

should be grouped together, then put into descending order as before.

Could someone help me by adjusting the above code to perform this ?

thankyou
Europa MacDonaldChief slayer of dragonsAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

ozoCommented:
Could you clarify what the groups would be in the above example?
0
Europa MacDonaldChief slayer of dragonsAuthor Commented:
6   - 60000 06000 00600 etc
5,1 51000 50100 50010 15000 etc
4,2 42000 04020 etc
4,1,1
3,3 33000 - 00033
3,2,1
3,1,1,1
2,2,2
2,2,1,1
2,1,1,1,1  21111 12111 1121 etc

Is that clearer ?
0
ozoCommented:
What group would 123456 go into?
0
Microsoft Azure 2017

Azure has a changed a lot since it was originally introduce by adding new services and features. Do you know everything you need to about Azure? This course will teach you about the Azure App Service, monitoring and application insights, DevOps, and Team Services.

wilcoxonCommented:
When you say "groups containing" do you mean the number contains each of those digits in the order specified (so 5,1 would match 5100 and 5010 but not 1500)?

You can't do this with a simple sort.  You'll have to use a custom sort function as I did on your previous question (though it will be more complex).
0
Europa MacDonaldChief slayer of dragonsAuthor Commented:
there isnt a 123456

as per the original sample list, and the described groupings above,

only groups that contain these numbers and only these numbers plus a few zeros.

there should be a total of five digits, the numbers plus zeros.

ie

6   - 60000 06000 00600 etc      and all combinations of these
5,1 - 51000 50100 50010 15000 etc     and all combinations of these
4,2 - 42000 04020 etc and all     combinations of these
4,1,1- 41100       and all combinations of these
3,3 33000 - 00033   and all combinations of these
3,2,1-  32100
3,1,1,1 - 31110 and all combinations of these
2,2,2 - 22200
2,2,1,1 - 22110
2,1,1,1,1  21111 12111 1121 etc  

and so on

does that make it clearer ?
0
Europa MacDonaldChief slayer of dragonsAuthor Commented:
When you say "groups containing" do you mean the number contains each of those digits in the order specified (so 5,1 would match 5100 and 5010 but not 1500)?


must be 5 digits, with zeros making up the rest

so 5,1  could be

51000
50100
50010
50001
15000
05100
05010
05001
10500
   
etc

same for all combination
0
ozoCommented:
What group would 12456 go in?
0
wilcoxonCommented:
I hadn't seen your response about the rest 0s so this isn't perfect but it'll be close.
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
open OUT, '>output.txt' or die "could not write output.txt: $!";
print OUT sort col1_rev <>;
sub col1_rev {
    # split the columns and only compare the first one
    my $cola = (split /\s+/, $a)[0];
    my $colb = (split /\s+/, $b)[0];
    my $grpa = group($cola);
    my $grpb = group($colb);
    return $grpa <=> $grpb unless ($grpa == $grpb);
    return $colb <=> $cola;
}
sub group {
    my ($val) = @_;
    return 1 if ($val =~ m{6});
    return 2 if ($val =~ m{5.*1|1.*5});
    return 3 if ($val =~ m{4.*2|2.*4});
    return 4 if ($val =~ m{1.*1} and $val =~ m{4});
    return 5 if ($val =~ m{3.*3});
    return 6 if ($val =~ m{3} and $val =~ m{2} and $val =~ m{1});
    return 7 if ($val =~ m{1.*1.*1} and $val =~ m{3});
    return 8 if ($val =~ m{2.*2.*2});
    return 9 if ($val =~ m{2.*2} and $val =~ m{1.*1});
    return 10 if ($val =~ m{1.*1.*1.*1} and $val =~ m{2});
}

Open in new window

0
Europa MacDonaldChief slayer of dragonsAuthor Commented:
There isnt a 123456

These codes (in the first column) are counting the frequency of values in my list. The second column is the count.

there are five columns in this set of data, on each row, 1 2 3 4 and 5

a 0 means there are none of the values I am looking for in that column

I can only have a total of 6 values in each row, so if I find a group of 6 in one column, then there can be no values in any other column so I would have a pattern 6 0 0 0 0, or 0 0 6 0 0 depending on which column the values are found in. If I find a 5 in one column there can only be 1 incidence of another value that meets my search criteria in another column, so 5 in one column, 1 in another and the other three columns must be empty, and so on with 4 and 2, 4 and 1 and 1 etc
0
wilcoxonCommented:
Revised for rest must be 0s...  Rerevised since each column can only contain total 6 (mostly the same as previous code).
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
open OUT, '>output.txt' or die "could not write output.txt: $!";
print OUT sort col1_rev <>;
sub col1_rev {
    # split the columns and only compare the first one
    my $cola = (split /\s+/, $a)[0];
    my $colb = (split /\s+/, $b)[0];
    my $grpa = group($cola);
    my $grpb = group($colb);
    return $grpa <=> $grpb unless ($grpa == $grpb);
    return $colb <=> $cola;
}
sub group {
    my ($val) = @_;
    return 1 if ($val =~ m{6});
    return 2 if ($val =~ m{5.*1|1.*5});
    return 3 if ($val =~ m{4.*2|2.*4});
    return 4 if ($val =~ m{1.*1} and $val =~ m{4});
    return 5 if ($val =~ m{3.*3});
    return 6 if ($val =~ m{3} and $val =~ m{2} and $val =~ m{1});
    return 7 if ($val =~ m{1.*1.*1} and $val =~ m{3});
    return 8 if ($val =~ m{2.*2.*2});
    return 9 if ($val =~ m{2.*2} and $val =~ m{1.*1});
    return 10 if ($val =~ m{1.*1.*1.*1} and $val =~ m{2});
    die "$val does not match any established grouping";
}

Open in new window

0
Europa MacDonaldChief slayer of dragonsAuthor Commented:
wilcoxon - output file has no content at all
0
ozoCommented:
#!/usr/bin/perl
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
my %groups;
unshift @{$groups{join'',sort split//,(split)[0]}},$_ for sort <>;
open STDOUT,">output.txt" or die "output.txt $!";
print @$_,"\n" for values %groups;
0
wilcoxonCommented:
An alternate way to do the group checking (probably more efficient than using regexes)...
use strict;
use warnings;
use List::Util qw(sum);
@ARGV or @ARGV=('sample.txt');
open OUT, '>output.txt' or die "could not write output.txt: $!";
print OUT sort col1_rev <>;
sub col1_rev {
    # split the columns and only compare the first one
    my $cola = (split /\s+/, $a)[0];
    my $colb = (split /\s+/, $b)[0];
    my $grpa = group($cola);
    my $grpb = group($colb);
    return $grpa <=> $grpb unless ($grpa == $grpb);
    return $colb <=> $cola;
}
sub group {
    my ($val) = @_;
    my @dig = sort { $b <=> $a } split //, $val;
    die "$val is invalid\n" unless (sum(@dig) == 6);
    return 1 if ($dig[0] == 6);
    return 2 if ($dig[0] == 5 and $dig[1] == 1);
    return 3 if ($dig[0] == 4 and $dig[1] == 2);
    return 4 if ($dig[0] == 4 and $dig[1] == 1 and $dig[2] == 1);
    return 5 if ($dig[0] == 3 and $dig[1] == 3);
    return 6 if ($dig[0] == 3 and $dig[1] == 2 and $dig[2] == 1);
    return 7 if ($dig[0] == 3 and $dig[1] == 1 and $dig[2] == 1 and $dig[3] == 1);
    return 8 if ($dig[0] == 2 and $dig[1] == 2 and $dig[3] == 2);
    return 9 if ($dig[0] == 2 and $dig[1] == 2 and $dig[3] == 1 and $dig[4] == 1);
    return 10 if ($dig[0] == 2 and $dig[1] == 1 and $dig[2] == 1 and $dig[3] == 1 and $dig[4] == 1);
}

Open in new window

0
wilcoxonCommented:
Tested both the regex version and the latest variant and they both produce the expected output in output.txt for me.  Is sample.txt empty or is there an error if you run it from the command-line (perl script.pl)?

I hadn't gone back and rethought the groupings (until looking at ozo's latest).  group sub could be simplified further (or removed and written differently) since all values should add up to 6 but it works quickly as-is.
0
ozoCommented:
If you want the groups ordered, in addition to sorting within the groups:
#!/usr/bin/perl
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
my %groups;
unshift @{$groups{reverse sort split//,(split)[0]}},$_ for sort <>;
open STDOUT,">output.txt" or die "output.txt $!";
print @{$groups{$_}},"\n" for reverse sort keys %groups;

Another way to do it:
#!/usr/bin/perl
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
my %groups;
push @{$groups{reverse sort split//,(split)[0]}},$_ while <>;
open STDOUT,">output.txt" or die "output.txt $!";
print reverse(sort @{$groups{$_}}),"\n" for reverse sort keys %groups;

If you don't need the separator between groups:
#!/usr/bin/perl
use strict;
use warnings;
@ARGV or @ARGV=('sample.txt');
open STDOUT,">output.txt" or die "output.txt $!";
print /\0(.*)/s for reverse sort map{join"",reverse("\0",sort split//,(split)[0]),$_} <>;
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Perl

From novice to tech pro — start learning today.