Solved

Sorting sub routine

Posted on 2003-11-04
8
411 Views
Last Modified: 2010-03-04
I have the following sub-routine:


#########################################################################
#                                                                                                            #
# subroutine sort_data                                                                              #
# Subroutine that does the actual sort.                                                 #
# Accepts 4 params viz.                                                                         #
# 1. The column number to sort. Column no. start from 0.                         #
# 2. The type of sort numeric or alphabetic. Default is Alphabetic.       #
# 3. The order of sort. Default order is ascending                                    #
# 4. The referrence of the array that needs to be sorted.                         #
#########################################################################

sub sort_data {
                  
      my ($row,$type,$sort_order,$r_data) = @_;
      my (@array);
      
      if ($type eq "n") {
            if ($sort_order eq "d") {
                  @array = map { (split ('<->', $_))[1] }
                        reverse sort {$a <=> $b}
                        map { join ('<->', lc ((split('\|', $_))[$row]) , $_) }
                        @{$r_data};
            }
            else {
                  @array = map { (split ('<->', $_))[1] }
                        sort {$a <=> $b}
                        map { join ('<->', lc ((split('\|', $_))[$row]) , $_) }
                        @{$r_data};
            }
      }
      else {
            if ($sort_order eq "d") {
                  @array = map { (split ('<->', $_))[1] }
                        reverse sort {$a cmp $b}
                        map { join ('<->', lc ((split('\|', $_))[$row]) , $_) }
                        @{$r_data};
            }
            else {
                  @array = map { (split ('<->', $_))[1] }
                        sort {$a cmp $b}
                        map { join ('<->', lc ((split('\|', $_))[$row]) , $_) }
                        @{$r_data};
            }
      }

      return (\@array)
}

Someone rewrote it for me to make it more efficient and to make the warning that did showup in the error log file the moment I added the -W option to the perl command.

#########################################################################
#                                                                                                            #
# subroutine sort_data                                                                              #
# Subroutine that does the actual sort.                                                 #
# Accepts 4 params viz.                                                                         #
# 1. The column number to sort. Column no. start from 0.                         #
# 2. The type of sort numeric or alphabetic. Default is Alphabetic.       #
# 3. The order of sort. Default order is ascending                                    #
# 4. The referrence of the array that needs to be sorted.                         #
#########################################################################

sub sort_data {
                  
      my ($row,$type,$sort_order,$r_data) = @_;
      my (@array);
      
      if ($type eq "n") {
            if ($sort_order eq "d") {
                  @array = map { $_->[1] }
                        sort {$b->[0] <=> $a->[0]}
                        map { [ lc ((split('|', $_))[$row]) , $_] }
                        @{$r_data};
            }
            else {
                  @array = map { $_->[1] }
                        sort {$a->[0] <=> $b->[0]}
                        map { [ lc ((split('|', $_))[$row]) , $_] }
                        @{$r_data};
            }
      }
      else {
            if ($sort_order eq "d") {
                  @array = map { $_->[1] }
                        sort {$b->[0] cmp $a->[0]}
                        map { [ lc ((split('|', $_))[$row]) , $_] }
                        @{$r_data};
            }
            else {
                  @array = map { $_->[1] }
                        sort {$a->[0] cmp $b->[0]}
                        map { [ lc ((split('|', $_))[$row]) , $_] }
                        @{$r_data};
            }
      }

      return (\@array)
}

But it isn't working like the old version is! It isn't the same. What is different?

Forexample:

$row = 0;
$type = "n";
$sort_order = "a";

will work in the old version but not in the new version. Trying to learn perl by learning what is wrong and why. Can someone help me to fix the new code?
0
Comment
Question by:weversbv
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 9681758
 split('|', $_)
should be
  split('\|', $_)
0
 
LVL 18

Expert Comment

by:kandura
ID: 9681902
It doesn't work with numerical sorting, because the Schwartzian Transform uses '|' to split the rows,
but this is interpreted as an 'or' regexp. The first element in the anonymous array will always contain
only the $row-th character in the data.
I'm pretty sure your alphabetical sort would show the same problem!

You could write the whole routine a lot prettier by using the ? : operator to select two alternatives,
and by realizing that <=> and cmp return 1,0,-1 to tell how the two elements compare.

So I would write this:

sub sort_data
{
      my ($row,$type,$sort_order,$r_data) = @_;

      # $order will contain 1 for ascending and -1 for descending sort
      my $order = $sort_order eq 'd' ? -1 : 1;                    
      
      my @array = map { $_->[1] }
            sort { ( $type eq 'n' ?                              # is it a numerical sort?
                        $a->[0] <=> $b->[0] :                  # then use <=> operator
                        $a->[0] cmp $b->[0] )                  # else do lexical sort
                          * $order                        # multiply by $order to change the direction
                                                        # of the sort
                  }  map {
                        [ lc ((split('\|', $_))[$row]) , $_]      # use a Schwartzian Transform on the input.
                  } @$r_data;
      
      return \@array;
}
0
 
LVL 20

Expert Comment

by:jmcg
ID: 9684783
One good thing about the two versions of your code is that they keep the sort comparison routine short and simple. Kandura undoes much of the virtue of the Schwartzian transform by calculating too much in the compare routine.

Ozo has spotted the essential flaw in the rewriting. Although split will take strings as its first argument, the only string that is supposed to be given to it is one containing a single space. Other than that, you should be passing a regex to match. So all the places where your rewrite has

   split('|', $_)

you should have

   split( /\|/, $_)

or, just a slight performance improvement, since you only need to split enough times to get to the field (why are you calling it a "row"???) of interest.

  split( /\|/, $_, $row+2)

In a similar vein, you could forego the call to the 'lc' operator when doing numeric searches.

0
 

Author Comment

by:weversbv
ID: 9684967
jmcg:

or, just a slight performance improvement, since you only need to split enough times to get to the field (why are you calling it a "row"???) of interest.

  split( /\|/, $_, $row+2)

In a similar vein, you could forego the call to the 'lc' operator when doing numeric searches.

???

Could you help me to an example of what you mean?
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 18

Expert Comment

by:kandura
ID: 9685327
jmcg: you are right. In an attempt to write prettier code, I sacrified performance. I didn't think 1 'if' and 1 multiplication would matter much, but of course it adds up.

weversbv: what jmcg means is that:
1.  you don't have to split every row completely, because you only want the one column. So there's no need to split the remaining columns after that;
2.  you don't have to convert the column you're looking for to lowercase when you're sorting numerically.

To put it together, here's your new routine:

sub sort_data {
               
     my ($row,$type,$sort_order,$r_data) = @_;
     my (@array);
     
     if ($type eq "n") {
          if ($sort_order eq "d") {
               @array = map { $_->[1] }
                    sort {$b->[0] <=> $a->[0]}
                    map { [ (split(/\|/, $_, $row+2))[$row] , $_] }          #  <-- leave out the call to lc( )
                    @{$r_data};
          }
          else {
               @array = map { $_->[1] }
                    sort {$a->[0] <=> $b->[0]}
                    map { [ (split(/\|/, $_, $row+2))[$row] , $_] }          #  <-- same here
                    @{$r_data};
          }
     }
     else {
          if ($sort_order eq "d") {
               @array = map { $_->[1] }
                    sort {$b->[0] cmp $a->[0]}
                    map { [ lc ((split(/\|/, $_, $row+2))[$row]) , $_] }
                    @{$r_data};
          }
          else {
               @array = map { $_->[1] }
                    sort {$a->[0] cmp $b->[0]}
                    map { [ lc ((split(/\|/, $_, $row+2))[$row]) , $_] }
                    @{$r_data};
          }
     }

     return (\@array)
}
0
 

Author Comment

by:weversbv
ID: 9685606
I do know understand what you mean.

Is it possible to sort nummerical whenever it is possible.
So if the row to sort contains only nummerical values I woudl like to sort nummerical.
If not sort alphabetic. Is this possible?
0
 
LVL 20

Expert Comment

by:jmcg
ID: 9688430
If your array isn't very large (how large is that? I'm not sure: 1000? 5000?), you can combine numerical and lexical sort this way:

    sort { ($a->[0] <=> $b->[0]) || ($a->[0] cmp $b->[0]) } ...

As soon as you start to do something which requires examining each term and making decisions in the comparison routine, you slow things down. As the array being sorted gets larger, it becomes increasingly important to keep the comparison routine simple. And, eventually, you may need to use an external sort program.

For a combined sort such as you are asking for, on something sufficiently large, it may be worthwhile to convert the numbers into zero-filled fields large enough to represent the largest number present. Zero-filled fields, ie, 000123 instead of just 123, allow you to use a lexical sort for numbers.
0
 
LVL 18

Accepted Solution

by:
kandura earned 125 total points
ID: 9688498
Yes, that's possible. It means you have to investigate the data in the column before sorting.
That is, loop through the requested column. If you encounter a non-numerical value, do lexical sort, otherwise it'll be a numerical sort.

sub sort_data_check {
      my ($row,$type,$sort_order,$r_data) = @_;
      my @array = ();
      my $sort_type =  'n';      # I use this to get the sort type from the data

        # here I loop through the data, extracting the requested column
        # and investigating its type
        # note that this takes care of the first step in the Schwartzian Transform
      foreach (@$r_data)
      {
            my $k = (split(/\|/, $_, $row+2))[$row];     # this gets your column
            if($k==0 && $k ne '0') {
                  $sort_type = 'a';                                # it's not a number
            }
            push @array, [$k, $_];
      }

      if($sort_type ne 'n')                                               # convert the data to lower case
      {
            @array = map { lc($_->[0]); $_ } @array;
      }

      if ($sort_type eq "n") {
            if ($sort_order eq "d") {
                  @array = map { $_->[1] }
                        sort {$b->[0] <=> $a->[0]}
                        @array;
            } else {
                  @array = map { $_->[1] }
                        sort {$a->[0] <=> $b->[0]}
                        @array;
            }
      } else {
            if ($sort_order eq "d") {
                  @array = map { $_->[1] }
                        sort {$b->[0] cmp $a->[0]}
                                @array;
            } else {
                  @array = map { $_->[1] }
                        sort {$a->[0] cmp $b->[0]}
                        @array;
            }
      }

      return (\@array)
}



 
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many time we need to work with multiple files all together. If its windows system then we can use some GUI based editor to accomplish our task. But what if you are on putty or have only CLI(Command Line Interface) as an option to  edit your files. I…
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

930 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now