Solved

How to read in a file in Perl, put it in a hash table, and then sort it by a hash value (not the key)

Posted on 2010-11-23
Last Modified: 2012-05-10
I have a file named list.txt. Inside that file are many rows that look like this.

GA_ZTC      CC_9811111
IA_ZTC      CC_9811112
IA_ZTC      CC_9711233

Each column is separated by a tab. I need to sort the list by the second column. I was trying to read it into a hash table and sort on the value, but my code is not working. Ideally, I would end up with a two-column file sorted by the second column. I have thought about writing it out to another file, swapping the columns, sorting it, and then swapping them back, but that does not seem efficient. I would appreciate it if somebody could help me figure out how to do this with a hash table. Thanks in advance.
#!/usr/bin/perl

# Read in a file and print it out.

# use strict;

open(INFILE, "Bld_Org2_S3.txt"); # open for input
open(OUTFILE,">","sortedlist.txt");

sub hashValueAscending
{
   $val{$a} cmp $val{$b};
}

my %hash;

while (<INFILE>)
{
   chomp;
   my ($key, $val) = split /\t/;
   $hash{$key} .= exists $hash{$key} ? "$val" : $val;
   foreach $key (sort hashValueAscending (keys(%hash)))
   {
      print OUTFILE $hash{$key}."\t".$key."\n";
   }
}

#flock(INFILE, LOCK_UN);
close(INFILE);
close(OUTFILE);

Question by:dlnewman70
8 Comments
 
LVL 26

Assisted Solution

by:wilcoxon
wilcoxon earned 200 total points
ID: 34198856
This should work...
#!/usr/bin/perl
# Read in a file and print it out.

use strict;
use warnings;

open(INFILE, "Bld_Org2_S3.txt") or die "could not open Bld_Org2_S3.txt: $!";
open(OUTFILE,">","sortedlist.txt") or die "could not write sortedlist.txt: $!";

my %hash;
while (<INFILE>) { 
    chomp; 
    my ($key, $val) = split /\t/;
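# I'm not sure what you were trying to do with this line - the conditional effectively does nothing
# (both branches are $val), so it just appends $val to whatever is already in $hash{$key}
#   $hash{$key} .= exists $hash{$key} ? "$val" : $val;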
    # create the hash in reverse order to make it simpler
    $hash{$val} = $key;
}
close(INFILE);

foreach my $val (sort keys %hash) {   # "my" is needed here under "use strict"
    print OUTFILE "$hash{$val}\t$val\n";
}
close(OUTFILE);

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34198870
Hmm.  Two important questions I forgot to ask:
1) Are the values in column 1 unique?
2) Are the values in column 2 unique?

If the answer to #2 is no, then my above code will lose some data.
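
One way to keep a hash and still not lose rows when column 2 repeats is to store a list of column-1 values under each column-2 value. The following is only a minimal sketch of that idea, not the code posted above; the subroutine name sort_by_second_column and the sample rows (including the duplicated CC_9711233) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group column-1 values under each column-2 value so that duplicates are kept.
# Returns "col1\tcol2" strings ordered by column 2 (string comparison).
sub sort_by_second_column {
    my @lines = @_;
    my %by_val;    # column-2 value => array ref of column-1 values
    for my $line (@lines) {
        chomp(my $copy = $line);
        my ($key, $val) = split /\t/, $copy;
        next unless defined $val;    # skip blank or malformed lines
        push @{ $by_val{$val} }, $key;
    }
    my @sorted;
    for my $val (sort keys %by_val) {
        # Rows sharing a column-2 value stay in their original input order.
        push @sorted, "$_\t$val" for @{ $by_val{$val} };
    }
    return @sorted;
}

# Hypothetical sample rows in the same tab-separated layout as the question.
my @sample = (
    "GA_ZTC\tCC_9811111\n",
    "IA_ZTC\tCC_9811112\n",
    "IA_ZTC\tCC_9711233\n",
    "GA_ZTC\tCC_9711233\n",
);
print "$_\n" for sort_by_second_column(@sample);
```

In a real script the lines would come from the input filehandle instead of @sample. Both CC_9711233 rows survive because each hash value is a list rather than a single string.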
 
LVL 16

Accepted Solution

by:jmatix
jmatix earned 300 total points
ID: 34198996
If you don't care about using a hash, this one-liner will do it:

perl -e '@l = <>; print sort{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]} @l' data.txt >output.txt

If you are on windows:

perl -e "@l = <>; print sort{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]} @l" data.txt >output.txt
 

Author Comment

by:dlnewman70
ID: 34199555
Values in column 1 are not unique. I ran the code above and it removed a lot of lines. I assume this is the reason. I also had to rem out the USE STRICT; line. Other than that I think we are on the right path. If I can figure out the unique issue.

It is possible that column 1 and column 2 are not unique, but together the combination of them will yield unique results. Hopefully that makes sense.

For example,

A  1
A  2
B  1


 

Author Comment

by:dlnewman70
ID: 34199603
For further clarity in my example,

If original file looks like
A  1
A  2
B  1

I am trying to get the sort to produce the following
A  1
B  1
A  2
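
Because only the combination of both columns is unique, one option is to skip the hash entirely and sort a list of [column 1, column 2] pairs. This is a hedged sketch rather than anything posted in the thread; parse_rows and sort_rows are hypothetical names, and the input is assumed to be tab-separated as described in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse each tab-separated line into a [col1, col2] pair.
sub parse_rows {
    my @rows;
    for my $line (@_) {
        chomp(my $copy = $line);
        my @fields = split /\t/, $copy;
        push @rows, [ @fields[0, 1] ] if @fields >= 2;   # ignore short lines
    }
    return @rows;
}

# Sort on column 2 first, then on column 1 to break ties.
sub sort_rows {
    my @sorted = sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] } @_;
    return @sorted;
}

# The three-row example from the comment above, with tabs between columns.
my @rows = parse_rows("A\t1\n", "A\t2\n", "B\t1\n");
print join("\t", @$_), "\n" for sort_rows(@rows);
```

For the three sample rows this prints A 1, B 1, A 2, one row per line. Nothing is dropped because no row is ever used as a hash key.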

 

Author Comment

by:dlnewman70
ID: 34199693
jmatix, your one line code seems to work. Could you possibly explain the code briefly? I appreciate the fast response.
 
LVL 16

Expert Comment

by:jmatix
ID: 34199792
Basically it reads all lines into an array @l, then sorts the lines using the second field as the key and prints the sorted lines.

{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]}

The above code splits each line on tab characters and compares the second field (subscript [1]) of each line. If you want to sort descending, just interchange $a and $b:

{(split(/\t+/, $b))[1] cmp (split(/\t+/, $a))[1]}
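
The comparison block above splits both lines again on every comparison. For larger files, a common variation is to split each line once, sort on the cached field, and then recover the lines (often called a Schwartzian transform). The sketch below swaps in that technique and is not jmatix's code; sort_lines_by_field2 is a hypothetical name and the sample lines are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sort lines on their second tab-separated field, splitting each line once.
sub sort_lines_by_field2 {
    my @lines = @_;
    my @sorted =
        map  { $_->[1] }                 # 3. keep only the original line
        sort { $a->[0] cmp $b->[0] }     # 2. compare the cached sort keys
        map  {
            my @fields = split /\t+/, $_;
            my $sort_key = defined $fields[1] ? $fields[1] : '';
            $sort_key =~ s/\s+\z//;      # drop the trailing newline from the key
            [ $sort_key, $_ ];           # 1. pair each line with its key
        } @lines;
    return @sorted;
}

# Hypothetical sample lines; a real script could pass the lines read from <> instead.
my @out = sort_lines_by_field2("X\t3\n", "Y\t1\n", "Z\t2\n");
print @out;
```

As with the one-liner, swapping $a and $b in the sort block gives descending order.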

 

Author Closing Comment

by:dlnewman70
ID: 34199959
I really appreciate both experts who helped me with this problem. I split the points based upon the valuable input and the fact that both experts really helped get me pointed in the right direction. I gave jmatix the greater points because his solution seemed to work the best. Part of it was associated with the uniqueness of the fields.