Solved

final output question ever!

Posted on 2004-04-12
Last Modified: 2010-03-04
Hi guys,

OK, this is the last question I shall ever ask on Perl... I promise! Ozo was helping me with this, and I thought I could fix it, but due to my Perl dyslexia I failed!
Basically, I have an output like this:

Output file:
Interfacing Residues Chain A:30 ,31 ,34 ,35 ,36 ,99 ,103,104,106,107,110,111,114
,117,118,119,120,122,123,126
Interface Residue matched cA:E  ,R  ,L  ,S  ,F  ,K  ,H  ,C  ,L  ,V  ,A  ,A  ,P
,F  ,T  ,P  ,A  ,H  ,A  ,D
Interfacing Residues Chain B:30 ,33 ,34 ,35 ,51 ,55 ,101,108,109,111,112,115,116
,119,122,123,124,125,127,128,131
Interface Residue matched cB:R  ,V  ,V  ,Y  ,P  ,M  ,E  ,N  ,V  ,V  ,C  ,A  ,H
,G  ,F  ,T  ,P  ,P  ,Q  ,A  ,Q
Neighboring Residues Chain A:29 ,32 ,33 ,37 ,98 ,100,102,105,108,109,112,113,115
,116,121,124,125,127
Neighboring Residue match cA:L  ,M  ,F  ,P  ,F  ,L  ,S  ,L  ,T  ,L  ,H  ,L  ,A
,E  ,V  ,S  ,L  ,K
Neighboring Residues Chain B:29 ,31 ,32 ,36 ,50 ,52 ,54 ,56 ,100,102,107,110,113
,114,117,118,120,121,126,129,130,132
Neighboring Residue match cB:G  ,L  ,L  ,P  ,T  ,D  ,V  ,G  ,P  ,N  ,G  ,L  ,V
,L  ,H  ,F  ,K  ,E  ,V  ,A  ,Y  ,K

1bbbAB
Surface Residues chainA:1  ,3  ,4  ,5  ,8  ,9  ,11 ,12 ,15 ,16 ,18 ,19 ,20 ,23 ,
37 ,38 ,40 ,41 ,42 ,44 ,45 ,47 ,48 ,49 ,50 ,51 ,53 ,54 ,56 ,57 ,60 ,61 ,64 ,67 ,
68 ,71 ,72 ,74 ,75 ,77 ,78 ,79 ,81 ,82 ,83 ,85 ,86 ,89 ,91 ,92 ,94 ,95 ,96 ,130,
134,137,138,139,140,141
Surface Resitype chainA:V  ,S  ,P  ,A  ,T  ,N  ,K  ,A  ,G  ,K  ,G  ,A  ,H  ,E  ,
P  ,T  ,K  ,T  ,Y  ,P  ,H  ,D  ,L  ,S  ,H  ,G  ,A  ,Q  ,K  ,G  ,K  ,K  ,D  ,T  ,
N  ,A  ,H  ,D  ,D  ,P  ,N  ,A  ,S  ,A  ,L  ,D  ,L  ,H  ,L  ,R  ,D  ,P  ,V  ,A  ,
T  ,T  ,S  ,K  ,Y  ,R
Surface Residues chainB:1  ,2  ,4  ,5  ,6  ,8  ,9  ,10 ,12 ,13 ,16 ,17 ,19 ,20 ,
21 ,22 ,37 ,39 ,40 ,41 ,43 ,44 ,46 ,47 ,49 ,58 ,59 ,61 ,62 ,63 ,65 ,66 ,69 ,72 ,
73 ,76 ,77 ,79 ,80 ,82 ,83 ,84 ,86 ,87 ,88 ,90 ,91 ,92 ,94 ,95 ,96 ,97 ,99 ,104,
135,139,143,144,145,146
Surface Resitype chainB:V  ,H  ,T  ,P  ,E  ,K  ,S  ,A  ,T  ,A  ,G  ,K  ,N  ,V  ,
D  ,E  ,W  ,Q  ,R  ,F  ,E  ,S  ,G  ,D  ,S  ,P  ,K  ,K  ,A  ,H  ,K  ,K  ,G  ,S  ,
D  ,A  ,H  ,D  ,N  ,K  ,G  ,T  ,A  ,T  ,L  ,E  ,L  ,H  ,D  ,K  ,L  ,H  ,D  ,R  ,
A  ,N  ,H  ,K  ,Y  ,H

I have to parse the file above. If any of the numbers from the "Interfacing Residues Chain A" line or the "Neighboring Residues Chain A" line occur in the "Surface Residues chainA" line, I have to remove the number from "Surface Residues chainA" and also its corresponding "Surface Resitype chainA" letter (which lies directly beneath it), without altering the format I have! The same has to be done for chain B: any number in "Interfacing Residues Chain B" or "Neighboring Residues Chain B" must be removed from "Surface Residues chainB", along with its corresponding "Surface Resitype chainB" letter.

Ozo's program looked like this:

# Read the whole input file into $_ in one go.
open (IN,'1bbbAB60') || die "Unable to open the Input File";
undef($/); $_=<IN>; close IN;

# Collect the interfacing and neighboring residue numbers for each chain.
@ChainA=();  @ChainB=();
if (m#Interfacing Residues Chain A: ?([\s,\d]*)#) {push(@ChainA,split(/\s*,\s*/,$1))};
if (m#Neighboring Residues Chain A: ?([\s,\d]*)#) {push(@ChainA,split(/\s*,\s*/,$1))};
if (m#Interfacing Residues Chain B: ?([\s,\d]*)#) {push(@ChainB,split(/\s*,\s*/,$1))};
if (m#Neighboring Residues Chain B: ?([\s,\d]*)#) {push(@ChainB,split(/\s*,\s*/,$1))};

# Collect the surface residue numbers for each chain.
@SurfaceResidueA=();  @SurfaceResidueB=();
if (m#Surface Residues chainA: ?([\s\d,]*)#) {push(@SurfaceResidueA,split(/\s*,\s*/,$1))};
if (m#Surface Residues chainB: ?([\s,\d]*)#) {push(@SurfaceResidueB,split(/\s*,\s*/,$1))};

# Keep only the surface residues that are not in the chain's lookup hash,
# re-pad them to three columns, and write them back into the text.
@ChainA{@ChainA}=(1)x@ChainA;
$SRA=join(",", map{sprintf("%-3s", $_)} grep {!$ChainA{$_}} @SurfaceResidueA);
s#(Surface Residues chainA:)[\d\s,]*#$1 $SRA#;
@ChainB{@ChainB}=(1)x@ChainB;
$SRB=join(",", map{sprintf("%-3s", $_)} grep {!$ChainB{$_}} @SurfaceResidueB);
s#(Surface Residues chainB:)[\d\s,]*#$1 $SRB#;

# Write the modified text to the output file.
open (OUT,">outty.txt") or die $!;
print OUT;
close OUT;

But unfortunately, it was still not removing the numbers from "Surface Residues chainA" (or chainB), nor the corresponding "Surface Resitype chainA" (or chainB) letters.

Ozo's output is below, in case he's here and can help me, because he's a genius!
Thanks Sarah XX



1bbbAB
Interfacing Residues Chain A: 30 ,31 ,34 ,35 ,36 ,99 ,103,104,106,107,110,111,11
4,117,118,119,120,122,123,126
Interfacing Residues Chain B: 30 ,33 ,34 ,35 ,51 ,55 ,101,108,109,111,112,115,11
6,119,122,123,124,125,127,128,131
Neighbouring Residues Chain A: 29 ,32 ,33 ,37 ,98 ,100,102,105,108,109,112,113,1
15,116,121,124,125,127
Neighbouring Residues Chain B: 29 ,31 ,32 ,36 ,50 ,52 ,54 ,56 ,100,102,107,110,1
13,114,117,118,120,121,126,129,130,132

1bbbAB
Surface Residues chainA: 1  ,4  ,8  ,15 ,16 ,19 ,38 ,41 ,44 ,45 ,50 ,51 ,53 ,61
,64 ,71 ,74 ,82 ,85 ,90 ,92 ,115,139,141
Surface Resitype chainA:V  ,P  ,T  ,G  ,K  ,A  ,T  ,T  ,P  ,H  ,H  ,G  ,A  ,K  ,
D  ,A  ,D  ,A  ,D  ,K  ,R  ,A  ,K  ,R
Surface Residues chainB: 2  ,5  ,6  ,9  ,16 ,21 ,22 ,40 ,43 ,44 ,47 ,49 ,52 ,56
,76 ,79 ,87 ,95 ,97 ,99 ,120,146
Surface Resitype chainB:H  ,P  ,E  ,S  ,G  ,D  ,E  ,R  ,E  ,S  ,D  ,S  ,D  ,G  ,
A  ,D  ,T  ,K  ,H  ,D  ,K  ,H
Question by:sarahJo
LVL 4

Accepted Solution

by:
vi_srikanth earned 500 total points
Could you clarify one thing? You said that the input of the above program will look like this:

Interfacing Residues Chain A:30 ,31 ,34 ,35 ,36 ,99 ,103,104,106,107,110,111,114
,117,118,119,120,122,123,126
.
.
.

If you look at the above input, there are line breaks (newline characters) within each record; for example, there is a line break after 114. Will this be the real case, or did you deliberately add these line breaks while posting your question? In other words, will the input to the above program have line breaks within each record or not? If it does, then we might have to tweak the code a little. If I'm not being clear, tell me.
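
If the input really did contain such line breaks, one possible tweak would be to rejoin the wrapped records before parsing. The sketch below is only an illustration: it assumes every unwanted break falls directly before or after a comma, as in the sample quoted above, and the helper name unwrap_records is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: rejoin records that were wrapped next to a comma.
# Assumes a newline adjacent to a comma is never a real record boundary.
sub unwrap_records {
    my ($text) = @_;
    $text =~ s/,\n/,/g;   # break right after a comma
    $text =~ s/\n,/,/g;   # break right before a comma
    return $text;
}

# Shortened sample modelled on the wrapped lines in the question.
my $wrapped = "Interfacing Residues Chain A:30 ,31 ,111,114\n,117,118\n"
            . "Surface Residues chainA:1  ,3  ,23 ,\n37 ,38 \n";

print unwrap_records($wrapped);
```

Because only the newline itself is deleted, the three-column padding of each number is left untouched. Breaks that fall in the middle of a number would need a different rule.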

Author Comment

by:sarahJo
Hi Vi srikanth,

No, each record in the input will be on a single line, like this:
Interfacing Residues Chain A:30 ,31 ,34 ,35 ,36 ,99 ,103

Sorry, it just got corrupted when I pasted it in. Thanks! Sarah
LVL 4

Expert Comment

by:vi_srikanth
The above program produces the following output for the above input. You said that "it was still not removing the numbers from ...". Can you pinpoint exactly which number was retained? For example, the input has "37" in Surface Residues chainA, and it was deleted in the final output.

Input:
--------------------------------------
Interfacing Residues Chain A:30 ,31 ,34 ,35 ,36 ,99 ,103,104,106,107,110,111,114,117,118,119,120,122,123,126
Interface Residue matched cA:E  ,R  ,L  ,S  ,F  ,K  ,H  ,C  ,L  ,V  ,A  ,A  ,P,F  ,T  ,P  ,A  ,H  ,A  ,D
Interfacing Residues Chain B:30 ,33 ,34 ,35 ,51 ,55 ,101,108,109,111,112,115,116,119,122,123,124,125,127,128,131
Interface Residue matched cB:R  ,V  ,V  ,Y  ,P  ,M  ,E  ,N  ,V  ,V  ,C  ,A  ,H,G  ,F  ,T  ,P  ,P  ,Q  ,A  ,Q
Neighboring Residues Chain A:29 ,32 ,33 ,37 ,98 ,100,102,105,108,109,112,113,115,116,121,124,125,127
Neighboring Residue match cA:L  ,M  ,F  ,P  ,F  ,L  ,S  ,L  ,T  ,L  ,H  ,L  ,A,E  ,V  ,S  ,L  ,K
Neighboring Residues Chain B:29 ,31 ,32 ,36 ,50 ,52 ,54 ,56 ,100,102,107,110,113,114,117,118,120,121,126,129,130,132
Neighboring Residue match cB:G  ,L  ,L  ,P  ,T  ,D  ,V  ,G  ,P  ,N  ,G  ,L  ,V,L  ,H  ,F  ,K  ,E  ,V  ,A  ,Y  ,K

1bbbAB
Surface Residues chainA:1  ,3  ,4  ,5  ,8  ,9  ,11 ,12 ,15 ,16 ,18 ,19 ,20 ,23 ,37 ,38 ,40 ,41 ,42 ,44 ,45 ,47 ,48 ,49 ,50 ,51 ,53 ,54 ,56 ,57 ,60 ,61 ,64 ,67 ,68 ,71 ,72 ,74 ,75 ,77 ,78 ,79 ,81 ,82 ,83 ,85 ,86 ,89 ,91 ,92 ,94 ,95 ,96 ,130,134,137,138,139,140,141
Surface Resitype chainA:V  ,S  ,P  ,A  ,T  ,N  ,K  ,A  ,G  ,K  ,G  ,A  ,H  ,E  ,P  ,T  ,K  ,T  ,Y  ,P  ,H  ,D  ,L  ,S  ,H  ,G  ,A  ,Q  ,K  ,G  ,K  ,K  ,D  ,T  ,N  ,A  ,H  ,D  ,D  ,P  ,N  ,A  ,S  ,A  ,L  ,D  ,L  ,H  ,L  ,R  ,D  ,P  ,V  ,A  ,T  ,T  ,S  ,K  ,Y  ,R
Surface Residues chainB:1  ,2  ,4  ,5  ,6  ,8  ,9  ,10 ,12 ,13 ,16 ,17 ,19 ,20 ,21 ,22 ,37 ,39 ,40 ,41 ,43 ,44 ,46 ,47 ,49 ,58 ,59 ,61 ,62 ,63 ,65 ,66 ,69 ,72 ,73 ,76 ,77 ,79 ,80 ,82 ,83 ,84 ,86 ,87 ,88 ,90 ,91 ,92 ,94 ,95 ,96 ,97 ,99 ,104,135,139,143,144,145,146
Surface Resitype chainB:V  ,H  ,T  ,P  ,E  ,K  ,S  ,A  ,T  ,A  ,G  ,K  ,N  ,V  ,D  ,E  ,W  ,Q  ,R  ,F  ,E  ,S  ,G  ,D  ,S  ,P  ,K  ,K  ,A  ,H  ,K  ,K  ,G  ,S  ,D  ,A  ,H  ,D  ,N  ,K  ,G  ,T  ,A  ,T  ,L  ,E  ,L  ,H  ,D  ,K  ,L  ,H  ,D  ,R  ,A  ,N  ,H  ,K  ,Y  ,H


Output:
----------------------------------------
Interfacing Residues Chain A:30 ,31 ,34 ,35 ,36 ,99 ,103,104,106,107,110,111,114,117,118,119,120,122,123,126
Interface Residue matched cA:E  ,R  ,L  ,S  ,F  ,K  ,H  ,C  ,L  ,V  ,A  ,A  ,P,F  ,T  ,P  ,A  ,H  ,A  ,D
Interfacing Residues Chain B:30 ,33 ,34 ,35 ,51 ,55 ,101,108,109,111,112,115,116,119,122,123,124,125,127,128,131
Interface Residue matched cB:R  ,V  ,V  ,Y  ,P  ,M  ,E  ,N  ,V  ,V  ,C  ,A  ,H,G  ,F  ,T  ,P  ,P  ,Q  ,A  ,Q
Neighboring Residues Chain A:29 ,32 ,33 ,37 ,98 ,100,102,105,108,109,112,113,115,116,121,124,125,127
Neighboring Residue match cA:L  ,M  ,F  ,P  ,F  ,L  ,S  ,L  ,T  ,L  ,H  ,L  ,A,E  ,V  ,S  ,L  ,K
Neighboring Residues Chain B:29 ,31 ,32 ,36 ,50 ,52 ,54 ,56 ,100,102,107,110,113,114,117,118,120,121,126,129,130,132
Neighboring Residue match cB:G  ,L  ,L  ,P  ,T  ,D  ,V  ,G  ,P  ,N  ,G  ,L  ,V,L  ,H  ,F  ,K  ,E  ,V  ,A  ,Y  ,K

1bbbAB
Surface Residues chainA: 1  ,3  ,4  ,5  ,8  ,9  ,11 ,12 ,15 ,16 ,18 ,19 ,20 ,23 ,38 ,40 ,41 ,42 ,44 ,45 ,47 ,48 ,49 ,50 ,51 ,53 ,54 ,56 ,57 ,60 ,61 ,64 ,67 ,68 ,71 ,72 ,74 ,75 ,77 ,78 ,79 ,81 ,82 ,83 ,85 ,86 ,89 ,91 ,92 ,94 ,95 ,96 ,130,134,137,138,139,140,141
Surface Resitype chainA:V  ,S  ,P  ,A  ,T  ,N  ,K  ,A  ,G  ,K  ,G  ,A  ,H  ,E  ,P  ,T  ,K  ,T  ,Y  ,P  ,H  ,D  ,L  ,S  ,H  ,G  ,A  ,Q  ,K  ,G  ,K  ,K  ,D  ,T  ,N  ,A  ,H  ,D  ,D  ,P  ,N  ,A  ,S  ,A  ,L  ,D  ,L  ,H  ,L  ,R  ,D  ,P  ,V  ,A  ,T  ,T  ,S  ,K  ,Y  ,R
Surface Residues chainB: 1  ,2  ,4  ,5  ,6  ,8  ,9  ,10 ,12 ,13 ,16 ,17 ,19 ,20 ,21 ,22 ,37 ,39 ,40 ,41 ,43 ,44 ,46 ,47 ,49 ,58 ,59 ,61 ,62 ,63 ,65 ,66 ,69 ,72 ,73 ,76 ,77 ,79 ,80 ,82 ,83 ,84 ,86 ,87 ,88 ,90 ,91 ,92 ,94 ,95 ,96 ,97 ,99 ,104,135,139,143,144,145,146
Surface Resitype chainB:V  ,H  ,T  ,P  ,E  ,K  ,S  ,A  ,T  ,A  ,G  ,K  ,N  ,V  ,D  ,E  ,W  ,Q  ,R  ,F  ,E  ,S  ,G  ,D  ,S  ,P  ,K  ,K  ,A  ,H  ,K  ,K  ,G  ,S  ,D  ,A  ,H  ,D  ,N  ,K  ,G  ,T  ,A  ,T  ,L  ,E  ,L  ,H  ,D  ,K  ,L  ,H  ,D  ,R  ,A  ,N  ,H  ,K  ,Y  ,H
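
Note that in the output above, 37 is gone from Surface Residues chainA, but the Surface Resitype chainA line still carries all of its letters, so the numbers and letters no longer line up. A minimal sketch of dropping each number together with the letter at the same position might look like the following. It assumes both lists have already been extracted as strings and have equal length; the helper names fields, padded and filter_pairs are made up for illustration and are not part of the program above.

```perl
use strict;
use warnings;

# Split a comma-separated list, trimming the padding around each item.
sub fields {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return split /\s*,\s*/, $text;
}

# Re-pad each item to three columns, as in the original file.
sub padded {
    return join ',', map { sprintf '%-3s', $_ } @_;
}

# Hypothetical helper: drop every residue number found in %$exclude,
# together with the letter at the same position in the residue-type list.
sub filter_pairs {
    my ($numbers, $letters, $exclude) = @_;
    my (@kept_num, @kept_let);
    for my $i (0 .. $#$numbers) {
        next if $exclude->{ $numbers->[$i] };
        push @kept_num, $numbers->[$i];
        push @kept_let, $letters->[$i];
    }
    return (\@kept_num, \@kept_let);
}

# Shortened sample taken from the chain A lines above.
my %exclude;
$exclude{$_} = 1 for fields('30 ,31 ,34 ,35 ,36 ,99'), fields('29 ,32 ,33 ,37 ,98');

my @numbers = fields('20 ,23 ,37 ,38 ,40');
my @letters = fields('H  ,E  ,P  ,T  ,K');

my ($num, $let) = filter_pairs(\@numbers, \@letters, \%exclude);
print 'Surface Residues chainA:', padded(@$num), "\n";
print 'Surface Resitype chainA:', padded(@$let), "\n";
```

With this sample, 37 and its letter P are dropped as a pair, and the remaining items keep their three-column padding. The filtered lists could then be substituted back into the text in the same way the program above does for the number lines.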

Author Comment

by:sarahJo
Hi Vi_srikanth,

My sincere apologies... it's working fine. One of my files was corrupted! Thank you.

LVL 4

Expert Comment

by:vi_srikanth
That's great.