Newbie --  Why won't this work (sorting contents of files)

Posted on 2003-04-01
Last Modified: 2012-05-04
I have a directory containing about 250 text files, each with a hundred or so lines of tab-separated data.  The first field is a time field in epoch seconds.  I'm trying to sort the lines of each file, one file at a time, and output the sorted results to another file with the same name as the original but a different extension.  I tried this:

#!/usr/local/bin/perl

my $dirname = "/var/www/html/folding/data/members";
my $new = "/var/www/html/folding/data/members/temp.tmp";

opendir(DIR, $dirname) or die "can't opendir $dirname: $!";

while (defined($teamfile = readdir(DIR))) {
     next if $teamfile =~ /^\.\.?$/;
        open(OLD, "<$dirname/$teamfile") or die "can't open OLD: $!";
        open (NEW, ">$new") or die "can't open NEW: $!";
        select (NEW);
        my(@lines) = <OLD>;
        @lines = sort(@lines);
        my($line);
        foreach $line (@lines) {

        print NEW "$line";

   }

my $newname =  "/var/www/html/folding/data/sorted/" . $teamfile . ".sort";
system("mv $new $newname");
close(NEW);
close(OLD);

}
closedir(DIR);


But it failed.  For an original file named, say orgfile.his, I would get orgfile.his.sort, orgfile.his.sort.sort, orgfile.his.sort.sort.sort, etc. (all files are full size, and contain apparently what I'm looking for).  There were hundreds of these files (.sort.sort.sort....).

I'm *very* new to Perl, and I can't spot the error.  It's probably glaring to one of you guys and gals.  Someone willing to help a blind man?

Thanks
Question by:mjcoyne
10 Comments
 
LVL 27

Accepted Solution

by:
wilcoxon earned 150 total points
ID: 8248303
Change this line:

next if $teamfile =~ /^\.\.?$/;

to this:

next if ($teamfile =~ /^\.\.?$/ or $teamfile =~ /\.sort$/);

The problem is that the readdir is picking up your new files as they are written and you are not skipping files that already end in .sort.
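
For what it's worth, here is a minimal sketch of that filter on its own. The helper name files_to_sort is made up for illustration, and it assumes the input files all end in .his as in the example filename. It reads the whole directory listing before anything is written, so files created later can't turn up in the loop.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names in $dir that should be sorted.
# Assumes input files end in .his; everything else (., .., *.sort, temp.tmp) is skipped.
sub files_to_sort {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    # readdir in list context returns every entry at once, before any output exists.
    my @names = grep { /\.his$/ } readdir($dh);
    closedir($dh);
    my @ordered = sort @names;
    return @ordered;
}
```

You would then loop over files_to_sort($dirname) instead of calling readdir inside the while loop.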
 
LVL 48

Expert Comment

by:Tintin
ID: 8250100
I think you'd have to agree that it is much more simple to write it as:

#!/usr/local/bin/perl
use strict;

my $dirname = "/var/www/html/folding/data/members";

foreach my $teamfile (<$dirname/*.his>) {
      open FILE, $teamfile or die "can't open $teamfile: $!\n";
      open SORTED, ">$teamfile.sort" or die "Can not open $teamfile.sort $!\n";
      print SORTED sort <FILE>;
      close FILE;
      close SORTED;
}
 
LVL 27

Expert Comment

by:wilcoxon
ID: 8250156
Yep.  That is much simpler.  I wasn't thinking in terms of re-writing it - I was just looking at fixing his problem.  Personally, I'd re-write your code as below.  I'm a big believer in not using glob patterns (I had a script fail due to hitting the glob limit).

#!/usr/local/bin/perl

use strict;
use warnings;

my $dirname = "/var/www/html/folding/data/members";

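opendir DIR, $dirname or die "can't opendir $dirname: $!";
# Assign readdir's result explicitly; perls before 5.12 don't set $_ in a bare while (readdir DIR)
while (defined(my $teamfile = readdir DIR)) {
     next unless $teamfile =~ /\.his$/; # or whatever pattern you want to use to check the file
     my $path = "$dirname/$teamfile";   # readdir returns bare names, so add the directory
     open FILE, $path or die "can't open $path: $!\n";
     open SORTED, ">$path.sort" or die "Can not open $path.sort $!\n";
     print SORTED sort <FILE>;
     close FILE;
     close SORTED;
}
closedir DIR;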
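
One caveat that applies to every version above: sort with no comparison block compares the lines as strings. That gives the right order as long as every epoch value has the same number of digits, but a numeric comparison on the first tab-separated field is safer. A rough sketch, where sort_by_epoch is a hypothetical name and every line is assumed to start with an integer timestamp followed by a tab:

```perl
use strict;
use warnings;

# Hypothetical helper: order lines by their leading epoch-seconds field.
# Assumes each line begins with an integer followed by a tab.
sub sort_by_epoch {
    my @lines = @_;
    # Schwartzian transform: pull the key out once per line, compare numerically,
    # then throw the key away again.
    my @sorted = map  { $_->[1] }
                 sort { $a->[0] <=> $b->[0] }
                 map  { [ (split /\t/, $_, 2)[0], $_ ] } @lines;
    return @sorted;
}
```

In the loops above, this would replace "print SORTED sort <FILE>;" with "print SORTED sort_by_epoch(<FILE>);".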
LVL 48

Expert Comment

by:Tintin
ID: 8250260
Perl's glob is fine from 5.6.1 onwards.  
 
LVL 17

Author Comment

by:mjcoyne
ID: 8250300
I wish I could split the points among you guys...  I accepted wilcoxon's first answer among three correct answers 'cause he did correctly answer the question I asked...:)

See?  I told you I was a newbie...  Both examples of reworked code are obviously much simpler and cleaner ways of accomplishing what I was trying to do...  And, of course, shows the benefit of experience over newbieness...

Thanks to both of you!
 
LVL 48

Expert Comment

by:Tintin
ID: 8250317
You can split points by requesting it in the Community Support section.
 
LVL 27

Expert Comment

by:wilcoxon
ID: 8254538
Are you sure Tintin?  I believe perl's globbing has limits even in 5.8 - admittedly the limit is in the 1000s.  On the exmh-users list, this was just brought up recently as somebody wrote some mail threading code that used globbing and it failed for several people (one I remember had 11000 files in the directory).
 
LVL 48

Expert Comment

by:Tintin
ID: 8257029
The big difference in globbing from 5.6.1 onwards is that it is done internally rather than being limited by the limits of the csh.

I haven't seen anything to suggest that there is a limit in 5.8.x, but I guess it is possible with huge numbers of files.
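
If anyone wants to check this on their own perl, a quick sketch is to list the same directory both ways and compare the results. The helper names and the .his pattern are only for illustration, and since the built-in glob splits its pattern on whitespace, this assumes the directory path contains no spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: list *.his files using the built-in glob.
sub list_with_glob {
    my ($dir) = @_;
    my @paths = glob("$dir/*.his");
    return @paths;
}

# Hypothetical helper: list the same files using opendir/readdir.
sub list_with_readdir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    # readdir returns bare names, so prepend the directory to match glob's output.
    my @paths = map { $dir . '/' . $_ } grep { /\.his$/ } readdir($dh);
    closedir($dh);
    return @paths;
}
```

If the two lists have the same number of entries for a large directory, glob isn't hitting a limit on that platform.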
 
LVL 85

Expert Comment

by:ozo
ID: 8257352
I just tried it on over 32000 files with no problem
 
LVL 27

Expert Comment

by:wilcoxon
ID: 8257527
Odd.  I didn't check it (I don't have a directory with that many files) but I'm pretty sure the person on the exmh-users list specifically mentioned using perl 5.6.1.  Possibly working differently on different platforms?