Solved

How to read in a file in perl, put it in a hash table, and then sort it by a hash value (not the key)

Posted on 2010-11-23
Last Modified: 2012-05-10
I have a file named list.txt. Inside that file are many rows that look like this.

GA_ZTC      CC_9811111
IA_ZTC      CC_9811112
IA_ZTC      CC_9711233

Each column is separated by a tab. I need to sort the list by the second column. I was trying to read it into a hash table and sort on the value, but my code is not working well. Ideally, at the end I would have a two column file sorted by the second column. I have thought about writing it out to another file, swapping columns, sorting it and then reordering it, but I think this is not efficient coding. I would appreciate it if somebody could help me figure out how to do this with a hash table. Thanks in advance.
#!/usr/bin/perl
# Read in a file and print it out.

# use strict;

open(INFILE, "Bld_Org2_S3.txt"); # open for input
open(OUTFILE,">","sortedlist.txt");

sub hashValueAscending
{
   $val{$a} cmp $val{$b};
}

my %hash;
while (<INFILE>)
{
   chomp;
   my ($key, $val) = split /\t/;
   $hash{$key} .= exists $hash{$key} ? "$val" : $val;
   foreach $key (sort hashValueAscending (keys(%hash)))
   {
      print OUTFILE $hash{$key}."\t".$key."\n";
   }
}

#flock(INFILE, LOCK_UN);
close(INFILE);
close(OUTFILE);

Question by:dlnewman70
Assisted Solution

by:wilcoxon
wilcoxon earned 200 total points
This should work...
#!/usr/bin/perl
# Read in a file and print it out.

use strict;
use warnings;

open(INFILE, "Bld_Org2_S3.txt") or die "could not open Bld_Org2_S3.txt: $!";
open(OUTFILE,">","sortedlist.txt") or die "could not write sortedlist.txt: $!";

my %hash;
while (<INFILE>) { 
    chomp; 
    my ($key, $val) = split /\t/;
# I'm not sure what you were trying to do with this line - it effectively does nothing
#   $hash{$key} .= exists $hash{$key} ? "$val" : $val;
    # create the hash in reverse order to make it simpler
    $hash{$val} = $key;
}
close(INFILE);

foreach $val (sort keys %hash) {
    print OUTFILE "$hash{$val}\t$val\n";
}
close(OUTFILE);

Expert Comment

by:wilcoxon
Hmm.  Two important questions I forgot to ask:
1) Are the values in column 1 unique?
2) Are the values in column 2 unique?

If the answer to #2 is no, then my above code will lose some data.
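If column 2 can repeat, one way to keep every row while still using a hash is to store a list of column-1 values under each column-2 value. The following is only a minimal sketch under that assumption: the helper name `group_by_second_column` and the in-memory sample rows are made up for illustration, and a real script would read the lines from the input file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group column-1 values under each column-2 value,
# so rows that share a column-2 value are all kept.
sub group_by_second_column {
    my @lines = @_;
    my %by_val;
    for my $line (@lines) {
        chomp $line;
        my ($key, $val) = split /\t/, $line;
        next unless defined $val;          # skip blank or malformed lines
        push @{ $by_val{$val} }, $key;     # append instead of overwriting
    }
    return \%by_val;
}

# Illustrative tab-separated rows (not taken from the real input file).
my @sample  = ("A\t1\n", "A\t2\n", "B\t1\n");
my $grouped = group_by_second_column(@sample);

# Declaring the loop variable with "my" keeps this valid under "use strict".
for my $val (sort keys %$grouped) {
    for my $key (@{ $grouped->{$val} }) {
        print "$key\t$val\n";
    }
}
```

Within one column-2 value, rows come out in the order they were read, because `push` appends to the end of each list.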
Accepted Solution

by:jmatix
jmatix earned 300 total points
If you don't need to use a hash, this one-liner will do it:

perl -e '@l = <>; print sort{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]} @l' data.txt >output.txt

If you are on Windows:

perl -e "@l = <>; print sort{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]} @l" data.txt >output.txt
Author Comment

by:dlnewman70
Values in column 1 are not unique. I ran the code above and it removed a lot of lines; I assume this is the reason. I also had to comment out the "use strict;" line. Other than that, I think we are on the right path, if I can figure out the uniqueness issue.

It is possible that column 1 and column 2 are not unique, but together the combination of them will yield unique results. Hopefully that makes sense.

For example,

A  1
A  2
B  1


Author Comment

by:dlnewman70
To clarify my example further:

If the original file looks like
A  1
A  2
B  1

I am trying to get the sort to produce the following:
A  1
B  1
A  2
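
An ordering like this can be made explicit by sorting on column 2 first and using column 1 only to break ties. The sketch below assumes each row has already been split into a two-element array reference; `sort_rows` is a hypothetical name, and the sample rows mirror the example above rather than the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sort [col1, col2] pairs by column 2,
# then by column 1 when the column-2 values are equal.
sub sort_rows {
    my @rows = @_;
    return sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] } @rows;
}

# Sample rows matching the example (tab-separated, split into pairs).
my @rows = map { [ split /\t/ ] } ("A\t1", "A\t2", "B\t1");

print join("\t", @$_), "\n" for sort_rows(@rows);
```

With these three rows the script prints A 1, then B 1, then A 2. Spelling out the tie-breaker means the order of rows with equal column-2 values does not depend on how `sort` happens to treat ties.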

Author Comment

by:dlnewman70
jmatix, your one line code seems to work. Could you possibly explain the code briefly? I appreciate the fast response.
Expert Comment

by:jmatix
Basically, it reads all lines into an array @l, then sorts the lines using the second field as the key and prints the sorted lines.

{(split(/\t+/, $a))[1] cmp (split(/\t+/, $b))[1]}

The above code splits each line on tab characters (one or more, because of \t+) and compares the second field (subscript [1]) of each line. To sort in descending order, just swap $a and $b:

{(split(/\t+/, $b))[1] cmp (split(/\t+/, $a))[1]}
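
The comparison block above re-splits both lines every time two lines are compared. For larger files, a common alternative is to split each line once, sort on the cached field, and then recover the original lines (often called a Schwartzian transform). This is a sketch of that swapped-in technique, not the one-liner itself; `sort_lines_by_field` and its parameters are hypothetical names chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines sorted on one tab-separated field
# (0-based index). Pass a true third argument for descending order.
sub sort_lines_by_field {
    my ($lines, $index, $descending) = @_;
    return
        map  { $_->[1] }                            # 3. recover the original line
        sort { $descending ? $b->[0] cmp $a->[0]    # 2. compare the cached fields,
                           : $a->[0] cmp $b->[0] }  #    swapping $a/$b to descend
        map  {
            (my $text = $_) =~ s/\r?\n\z//;         # drop the line ending first
            my $field = (split /\t+/, $text)[$index];
            [ defined $field ? $field : '', $_ ];   # 1. cache the field with the line
        } @$lines;
}

# Sample rows from the question, assumed to be tab-separated.
my @lines = (
    "GA_ZTC\tCC_9811111\n",
    "IA_ZTC\tCC_9811112\n",
    "IA_ZTC\tCC_9711233\n",
);

print sort_lines_by_field(\@lines, 1);       # ascending on the second field
print sort_lines_by_field(\@lines, 1, 1);    # descending on the second field
```

The helper is meant to be called in list context, as in the `print` calls above. To use it on a file, the lines could be read with `my @lines = <>;` exactly as the one-liner does.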

Author Closing Comment

by:dlnewman70
I really appreciate both experts who helped me with this problem. I split the points based upon the valuable input and the fact that both experts really helped get me pointed in the right direction. I gave jmatix the greater points because his solution seemed to work the best. Part of it was associated with the uniqueness of the fields.