It is the generated statistics file I am interested in (see attached example).
Specifically, I am interested in the number of "Molecular mimicry 14-mers" reported in the statistics file.
I want to modify the script so that none of the above files are generated.
Instead, the script should create a file called "Results" in the ~/candidates folder.
As I say, I am only interested in the molecular mimicry 14-mers, so instead of the three files, this file would contain just the number 23 (or whatever the number of molecular mimicry candidates turns out to be).
Also, if the same script is run twice, the second result should be appended to the first in the same file (i.e. 23 [next result here]).
Or if the script is run 1000 times, you'd generate 1000 results.
I think that this might do it. You need the "-final" file generated as input to your counters but I've added a line to remove it when it's done being used.
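#!/usr/bin/perl -w
use strict;

# Count header lines (those starting with ">") in the -in and -out files.
my $counter_in = 0;
my $counter_out = 0;
open (COUNTERIN, "<", "$ENV{TMP}/$ENV{PARASITE}-in")
    or die "Cannot open $ENV{TMP}/$ENV{PARASITE}-in: $!";
while (my $line = <COUNTERIN>) {
    if ($line =~ /^>/) {
        $counter_in++;
    }
}
close (COUNTERIN);
open (COUNTEROUT, "<", "$ENV{TMP}/$ENV{PARASITE}-out")
    or die "Cannot open $ENV{TMP}/$ENV{PARASITE}-out: $!";
while (my $line = <COUNTEROUT>) {
    if ($line =~ /^>/) {
        $counter_out++;
    }
}
close (COUNTEROUT);
my $total_proteins = $counter_in + $counter_out;

# Open Results in append mode (">>") so each run adds a line
# instead of overwriting the previous result.
open (STATISTICS, ">>", "$ENV{CANDIDATES}/Results")
    or die "Cannot open $ENV{CANDIDATES}/Results: $!";

# Count the candidate 14-mers listed in the -final file.
my %whole_proteins;
my $mmcandidates = 0;
open (FINAL, "<", "$ENV{CANDIDATES}/$ENV{PARASITE}-$ENV{HOST}-final")
    or die "Cannot open $ENV{CANDIDATES}/$ENV{PARASITE}-$ENV{HOST}-final: $!";
while (my $line = <FINAL>) {
    chomp $line;
    if ($line =~ /^(.+)-AA:\d+/) {
        $mmcandidates++;
        $whole_proteins{$1} = 1;
    }
}
close FINAL;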
# unlink takes a plain path; a leading "<" would be treated as part of the file name.
unlink("$ENV{CANDIDATES}/$ENV{PARASITE}-$ENV{HOST}-final");
print STATISTICS "Molecular mimicry candidate 14-mers\t$mmcandidates\n";
close STATISTICS;
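If you would rather have only the bare number on each line, as in the original request, something along these lines might be a starting point. It is only a sketch: the helper names count_candidates and append_result are made up for illustration, and the demo writes to a temporary directory with invented sample lines rather than using $ENV{CANDIDATES} and a real "-final" file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: count lines of the form "<protein>-AA:<position>".
sub count_candidates {
    my ($final_path) = @_;
    open(my $in, '<', $final_path) or die "Cannot read $final_path: $!";
    my $count = 0;
    while (my $line = <$in>) {
        $count++ if $line =~ /^.+-AA:\d+/;
    }
    close $in;
    return $count;
}

# Hypothetical helper: append one number per run.
# '>>' creates the file on the first run and appends on later runs.
sub append_result {
    my ($results_path, $count) = @_;
    open(my $out, '>>', $results_path) or die "Cannot append to $results_path: $!";
    print {$out} "$count\n";
    close $out or die "Cannot close $results_path: $!";
}

# Demo in a temporary directory (assumption: stands in for ~/candidates).
my $dir   = tempdir(CLEANUP => 1);
my $final = "$dir/demo-final";
open(my $fh, '>', $final) or die "Cannot write $final: $!";
print {$fh} "protA-AA:12\nprotA-AA:40\nnot a hit\nprotB-AA:7\n";
close $fh;

# Simulate running the script twice: Results ends up with two lines.
for my $run (1 .. 2) {
    append_result("$dir/Results", count_candidates($final));
}

# Remove the intermediate file once it has been counted.
unlink $final or warn "Could not remove $final: $!";
```

In a real run you would call the two helpers once per execution, pointing them at "$ENV{CANDIDATES}/$ENV{PARASITE}-$ENV{HOST}-final" and "$ENV{CANDIDATES}/Results".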