Posted on 2004-04-07
Last Modified: 2010-03-04
Hi All,

I know I've been posting a lot of questions recently, but this one should be easy enough for you and is difficult for me! The script below parses the input file below and prints two lists of residue numbers, one for chain (A) and one for chain (B), sorted and with each number occurring only once. For example, it prints:
Interface residues A: 5, 6, 7, 9
Interface residues B: 8, 10, 11, 12
and so on, in a nice format, for the whole file. What I have to do is parse the first part of the file as I've done before, but where the format changes, at the line "SER   5(A)(CA)   - PRO   6(A)(CA)   :   3.018", I need to print the numbers from there on in the same way, but on separate lines headed "Neighbouring residues". So my output file should look like this:
Interface residues A: (numbers for A from the first part of the file)
Interface residues B: (numbers for B from the first part of the file)
Neighbouring residues A: (numbers for A after the format changes)
Neighbouring residues B: (numbers for B after the format changes)

I know it's quite simple, but my efforts wouldn't work! Thanks


Input file:
PHE 119(A)( 906)   - THR  10(B)( 996)   :   4.441
PHE 119(A)( 911)   - GLN  11(B)(1002)   :   4.486
PHE 119(A)( 911)   - PRO  12(B)(1016)   :   4.203
PHE 119(A)( 914)   - TRP  13(B)(1025)   :   4.372
PHE 119(A)( 913)   - VAL  16(B)(1056)   :   3.810
PHE 119(A)( 913)   - PHE 119(B)(1874)   :   4.362
SER   5(A)(CA)   - PRO   6(A)(CA)   :   3.018
PRO   6(A)(CA)   - SER   5(A)(CA)   :   3.018
PRO   6(A)(CA)   - SER   7(A)(CA)   :   3.831
THR  10(A)(CA)   - SER   9(A)(CA)   :   3.816
THR  10(A)(CA)   - GLN  11(A)(CA)   :   3.778
GLN  11(A)(CA)   - THR  10(A)(CA)   :   3.778
GLN  11(B)(CA)   - PRO  12(B)(CA)   :   3.830
PRO  12(B)(CA)   - PRO   8(B)(CA)   :   4.140
PRO  12(B)(CA)   - GLN  11(B)(CA)   :   3.830


Script:
#!/usr/local/bin/perl

use strict;

my $datafile = '/home/paul/list/antotest';

my %chain;

open FILE, $datafile or die "Can not open $datafile $!\n";

open(OUTFILE,">antotest2")||die;

while (<FILE>) {
    if (/(\d+)\((.).*-.*\s+(\d+)\((.)/) {
     $chain{$2}->{$1} = $1;
     $chain{$4}->{$3} = $3;
    } else {
       print OUTFILE "";
    }
}
foreach my $chain (sort keys %chain) {
     print OUTFILE "Interfacing Residues Chain $chain: ";
     my $c = 0;
     foreach my $skey (sort {$a <=> $b} keys %{$chain{$chain}} ) {
          print  OUTFILE ","   if ($c > 0);
          $c++;
          printf OUTFILE "%-3d", $skey;
     }
     print OUTFILE "\n";
}
Question by:paulieomeara

Expert Comment

by:ozo
ID: 10780904
#!/usr/local/bin/perl

use strict;

my $datafile = '/home/paul/list/antotest';

my @chain;

open FILE, $datafile or die "Can not open $datafile $!\n";

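while (<FILE>) {
    # Each residue token looks like "119(A)( 906)" or "5(A)(CA)":
    # residue number, chain letter, then an atom number or an atom name.
    while( /(\d+)\((.)\)\(\s*(\w*)\)/g ){
        my($r,$c,$p)=($1,$2,$3);
        # Part 0 (atom field contains digits) = interfacing, part 1 (atom name) = neighboring
        $chain[ $p =~ /\d/ ? 0 : 1 ]{$c}{$r}++;
    }
}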
open(OUTFILE,">antotest2")||die $!;
for my $part ( 0,1 ){
  foreach my $chain ( sort keys %{$chain[$part]} ){
    print OUTFILE "${[qw(Interfacing Neighboring)]}[$part] Residues Chain $chain: ";
    my $c = 0;
    foreach my $skey (sort {$a <=> $b} keys %{$chain[$part]{$chain}} ) {
        print  OUTFILE ","   if ($c > 0);
        $c++;
        printf OUTFILE "%-3d", $skey;
    }
    print OUTFILE "\n";
  }
}
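
A minimal sketch of how the regex above tells the two file sections apart, in case it helps to see it in isolation. The helper name classify_line is hypothetical and not part of the script; it also assumes the atom field is either purely numeric (first section) or a name such as CA (second section), which is slightly stricter than the digit test used in the script.

```perl
use strict;
use warnings;

# classify_line is a hypothetical helper for illustration only.
# It returns one [part, chain, residue] triple per residue token on the line,
# where part 0 = interfacing section and part 1 = neighbouring section.
sub classify_line {
    my ($line) = @_;
    my @found;
    while ( $line =~ /(\d+)\((.)\)\(\s*(\w*)\)/g ) {
        my ( $residue, $chain, $atom ) = ( $1, $2, $3 );
        # Assumption: a purely numeric atom field marks the first section,
        # anything else (for example CA) marks the second.
        my $part = $atom =~ /^\d+$/ ? 0 : 1;
        push @found, [ $part, $chain, $residue ];
    }
    return @found;
}

# Two sample lines copied from the input file in the question.
for my $line (
    'PHE 119(A)( 906)   - THR  10(B)( 996)   :   4.441',
    'SER   5(A)(CA)   - PRO   6(A)(CA)   :   3.018',
) {
    print join( ' ',
        map { "part=$_->[0] chain=$_->[1] res=$_->[2]" } classify_line($line) ),
        "\n";
}
```

The trailing distance (for example 4.441) is never picked up, because the pattern requires a parenthesised chain letter straight after the number.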

Author Comment

by:paulieomeara
ID: 10781146
Hi ozo,

That's fantastic, thanks so much! Just one more thing: is there any way I can exclude the numbers that appear in Interfacing Residues Chain A from Neighboring Residues Chain A, and do the same for Chain B?

My current output is like this:
Interfacing Residues Chain A: 5  ,6  ,10 ,11 ,12 ,13 ,16 ,17 ,19 ,20 ,23 ,49 ,114,115,116,117,118,119
Interfacing Residues Chain B: 5  ,6  ,10 ,11 ,12 ,13 ,16 ,19 ,20 ,23 ,49 ,114,115,116,117,118,119
Neighboring Residues Chain A: 5  ,6  ,7  ,8  ,9  ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,48 ,49 ,50 ,113,114,115,116,117,118,119,120
Neighboring Residues Chain B: 5  ,6  ,7  ,8  ,9  ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,48 ,49 ,50 ,113,114,115,116,117,118,119,120




Accepted Solution

by:ozo (earned 500 total points)
ID: 10781256
# Assuming all the Interfacing Residue lines come before the Neighboring Residue lines
while (<FILE>) {
    while( /(\d+)\((.)\)\(\s*(\w*)\)/g ){
        my($r,$c,$p)=($1,$2,$3);
        # Skip residues already recorded as interfacing for this chain
        next if $chain[0]{$c}{$r};
        $chain[ $p =~ /\d/ ? 0 : 1 ]{$c}{$r}++;
    }
}
close FILE;
open(OUTFILE,">antotest2")||die $!;
for my $part ( 0,1 ){
  foreach my $chain ( sort keys %{$chain[$part]} ){
    print OUTFILE "${[qw(Interfacing Neighboring)]}[$part] Residues Chain $chain: ";
    print OUTFILE join',',map{sprintf"%-3d",$_} sort{$a<=>$b}keys %{$chain[$part]{$chain}};
    print OUTFILE "\n";
  }
}
close OUTFILE;
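
If the ordering assumption in the comment above cannot be guaranteed, one possible variant is to collect both sets first and filter the neighbouring residues only when printing. The following is a rough sketch under that assumption, not a drop-in replacement: build_report is a hypothetical helper name, the sample lines are copied from the question's input file rather than read from disk, and it prints plain comma-separated numbers instead of the padded "%-3d" format.

```perl
use strict;
use warnings;

# Sample lines taken from the input file in the question.
my @lines = (
    'PHE 119(A)( 906)   - THR  10(B)( 996)   :   4.441',
    'PHE 119(A)( 911)   - GLN  11(B)(1002)   :   4.486',
    'SER   5(A)(CA)   - PRO   6(A)(CA)   :   3.018',
    'GLN  11(B)(CA)   - PRO  12(B)(CA)   :   3.830',
    'PRO  12(B)(CA)   - PRO   8(B)(CA)   :   4.140',
);

# build_report is a hypothetical helper used only for this sketch.
sub build_report {
    my @input = @_;
    my ( %interface, %neighbour );
    for my $line (@input) {
        while ( $line =~ /(\d+)\((.)\)\(\s*(\w*)\)/g ) {
            my ( $residue, $chain, $atom ) = ( $1, $2, $3 );
            # Assumption: numeric atom field = interface section, name = neighbour section.
            if ( $atom =~ /^\d+$/ ) { $interface{$chain}{$residue} = 1 }
            else                    { $neighbour{$chain}{$residue} = 1 }
        }
    }
    my @report;
    for my $chain ( sort keys %interface ) {
        push @report, "Interfacing Residues Chain $chain: "
            . join( ',', sort { $a <=> $b } keys %{ $interface{$chain} } );
    }
    for my $chain ( sort keys %neighbour ) {
        # Filter at output time, so the order of the two sections in the file does not matter.
        my @only = grep { !exists $interface{$chain}{$_} } keys %{ $neighbour{$chain} };
        push @report, "Neighboring Residues Chain $chain: "
            . join( ',', sort { $a <=> $b } @only );
    }
    return @report;
}

print "$_\n" for build_report(@lines);
```

With the sample lines, residue 11 of chain B appears in both sections and is therefore listed only under Interfacing Residues Chain B.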