Solved

Protein Cleaving Enzymes and Perl

Posted on 2009-05-12
Last Modified: 2012-05-06
Hi,

I have an HTML form which contains four radio buttons. Each button represents an enzyme which cuts a protein sequence in a different way, e.g.

if trypsin is selected, it'll cut a protein sequence after K or R unless the next amino acid is P.

I've been pseudo-coding this up, basically trying to spell out what I want my code to do: radio button selected -> depending on the enzyme, take the six sequences ($orfprotein1-6) and cut them in the relevant places, then return the results on the next HTML page:

ReadingFrame1: fragment1 fragment2 fragment3 etc etc
ReadingFrame2: etc etc
3
4
5
6
 
The same HTML output page should also contain a box to save this data as an output text file.

Here's the pseudo-code. It contains the info on which enzyme cuts where in a sequence:

#Cleave Reading Frames with selected enzyme from list of four radio buttons in form

# TRYPSIN - cuts at "K" or "R" unless the next amino acid is "P"
# Need to create a loop which runs along all six sequences ($orfprotein1, $orfprotein2 etc.), cutting the sequences at K and R and returning the fragments
# i.e. GHDTFGR // GHKPRPDAEK // STGDFY (stays intact for KP and RP)

my $Trypsin = $query->param('Trypsin');
# if Trypsin selected by radio button?? then call this loop....
# cleave fragments in loop and then return the fragments created in the HTML output below ($cleavedfragments)


# ENDOPROTL - cuts at "K" unless the next amino acid is "P"
# Need to create a loop which runs along all six sequences ($orfprotein1, $orfprotein2 etc.), cutting the sequences at K unless next to P and returning the fragments

my $EndoProtL = $query->param('EndoProtL');
# if EndoProtL selected by radio button?? then call this loop....
# cleave fragments in loop and then return the fragments created in the HTML output below ($cleavedfragments)


# ENDOPROTA - cuts at R unless the next amino acid is Proline P
# Need to create a loop which runs along all six sequences ($orfprotein1, $orfprotein2 etc.), cutting the sequences at R unless next to P and returning the fragments

my $EndoProtA = $query->param('EndoProtA');


# V8PROT- cuts at E unless the next amino acid is P
# same as above

my $V8prot = $query->param('V8prot');

Thanks in advance.
Question by:StephenMcGowan
 

Author Comment

by:StephenMcGowan
A clearer way of me explaining.. probably...

say if i have 6 sequences:

Reading Frame 1: SAEVIHQVEEALDTDEKEMLRDVAIDVVPPNVRDLALVELDILRERGKLSVGDLAELLYRVRRFDLLKRILKMDRKAVETHLLRNPHLVSDYRVLMAEIGEDLDKSDVSSLIFLMKDYMGRGKISKEKSFLDLVVELEKLNLVAPDQLDLLEKCLKNIHRIDLKTKIQKYKQSVQGAGTSYRNVLQAAIQKSLKDPSNNFRLHNGRSKEQRLKEQLGAQQEPVKKSIQESEAFLPQSIPEERYKMKSKPLGICLIIDCIGNETELLRDTFTSLGYEVQKFLHLSMHGISQILGQFACMPEHRDYDSFVCVLVSRGGSQSVYGVDQTHSGLPLHHIRRMFMGDSCPYLAGKPKMFFIQNYVVSEGQLEDSSLLEVDGPAMKNVEFKAQKRGLCTVHREADFFWSLCTADMSLLEQSHSSPSLYLQCLSQKLRQERKRPLLDLHIELNGYMYDWNSRVSAKEKYYVWLQHTLRKKLILSYT_;<br>
Reading Frame 2: RQSFFETPSLPWAMKSRNSCISVCMVYPRFLANLPVCPSTETTTALCVSW_;<br>
Reading Frame 3: PLSSREAKDVFYSELCGVRGPAGGQQPLGGGWASDEECGIQGSEARAVHSSPRS_;<br>
Reading Frame 4: HTEMQEFLDFIAQGSEGVSKKLCLIANAIDYQADS_;<br>
Reading Frame 5: QSIIRQIPRGLLFILYLSSGML_;<br>
Reading Frame 6: RTHQIYPNPHQSLPSALYSPKQGEGS_;<br>

and i choose Trypsin (cuts at K and R but not if next letter is P)

i'd expect to have the output:

SAEVIHQVEEALDTDEK
EMLR
DVAIDVVPPNVR
DLALVELDILR
ER
GK
LSVGDLAELLYR
VR
R
FDLLK
R
ILK
MDR
K
AVETHLLR
NPHLVSDYR
VLMAEIGEDLDK
SDVSSLIFLMK
DYMGR
GK
ISK
EK
SFLDLVVELEK
LNLVAPDQLDLLEK
CLK
NIHR
IDLK
TK
IQK
YK
QSVQGAGTSYR
NVLQAAIQK
SLK
DPSNNFR
LHNGR
SK
EQR
LK
EQLGAQQEPVK
K
SIQESEAFLPQSIPEER
YK
MK
SKPLGICLIIDCIGNETELLR
etc.. etc... etc

Basically it runs through the 6 sequences ($orfprotein1-6), with the fragments going into a list.
It doesn't matter about distinguishing between reading frames in the output; just output one list.
The cut sites will depend on the enzyme chosen with the radio buttons on the form.
i.e. Trypsin - K and R but not if followed by P
      EndoProtL - K but not if followed by P
      EndoProtA - R but not if followed by P
      V8Prot - E but not if followed by P
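
The four rules above differ only in which residue triggers a cut, so they can be kept in one lookup table. What follows is a minimal sketch, not code from this thread: the hash name %CLEAVAGE_RE is a hypothetical choice, and the sample peptide is the one from the trypsin comment in the pseudo-code.

```perl
use strict;
use warnings;

# Hypothetical lookup table: enzyme name => pattern matching each cut point.
# (?<=X) requires residue X just before the cut; (?!P) blocks a cut before proline.
my %CLEAVAGE_RE = (
    TRYPSIN   => qr/(?<=[KR])(?!P)/,
    ENDOPROTL => qr/(?<=K)(?!P)/,
    ENDOPROTA => qr/(?<=R)(?!P)/,
    V8PROT    => qr/(?<=E)(?!P)/,
);

# Sample peptide from the pseudo-code comment (KP and RP must stay intact).
my $peptide   = 'GHDTFGRGHKPRPDAEKSTGDFY';
my @fragments = split $CLEAVAGE_RE{TRYPSIN}, $peptide;

print "$_\n" for @fragments;
# Prints GHDTFGR, GHKPRPDAEK and STGDFY, one per line.
```

Because both assertions are zero-width, split consumes no characters and the K, R or E stays at the end of its fragment.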

hope that makes more sense.

Author Comment

by:StephenMcGowan
Thinking about it, i think the most efficient way of going about this is assigning each $orfprotein to a universal enzyme sub-routine i'll call "enzyme_digest".

so @fragments1 = enzyme_digest($orfprotein1)

What I'm hoping the subroutine "enzyme_digest" will do is the following:

---------------
It receives the CGI radio button values from the HTML form:

    my $Trypsin = $query->param('Trypsin');
    my $EndoProtL = $query->param('EndoProtL');
    my $EndoProtA = $query->param('EndoProtA');
    my $V8prot = $query->param('V8prot');

Depending on which one is set, it uses that variable (e.g. $Trypsin) as a trigger and feeds the sequence into a specific loop which cleaves at that enzyme's cleavage sites (for Trypsin: after K and R, but not when followed by P). It keeps looping, and every fragment cleaved is put into a one-column array, i.e. a list like the one I manually typed up above.

So picturing the subroutine overall, there will be four "radio button calls", each linked with its own specific cleavage loop, before the array is returned as @fragments.

Is this possible?
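
A subroutine along those lines might look like the sketch below. It is only an illustration under stated assumptions: the name enzyme_digest comes from the post above, but its signature (enzyme name first, then any number of sequences) and the %CUT_AFTER table are hypothetical, and the enzyme is passed in as a plain string rather than read from the form.

```perl
use strict;
use warnings;

# Residues after which each enzyme cuts (taken from the rules in the question).
my %CUT_AFTER = (
    Trypsin   => 'KR',
    EndoProtL => 'K',
    EndoProtA => 'R',
    V8prot    => 'E',
);

# Hypothetical helper: digest every sequence with the named enzyme and
# return all fragments as one flat list.
sub enzyme_digest {
    my ($enzyme, @sequences) = @_;
    my $residues = $CUT_AFTER{$enzyme}
        or die "Unknown enzyme '$enzyme'\n";

    # Cut after any listed residue unless the next residue is proline (P).
    my $cut_point = qr/(?<=[$residues])(?!P)/;

    return map { split $cut_point, $_ } @sequences;
}

my @fragments = enzyme_digest('EndoProtL', 'AKPKA', 'KK');
print join("\n", @fragments), "\n";
```

Building the pattern from a string of residues means a fifth enzyme only needs one more hash entry rather than another loop.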
                   
                     

Expert Comment

by:Adam314
I'm not exactly sure what you are asking... but to split the strings in the way you described, here is how you'd do that:
##### Your reading frame 1
my $str =
   'SAEVIHQVEEALDTDEKEMLRDVAIDVVPPNVRDLALVELDILRERGKLS'
  .'VGDLAELLYRVRRFDLLKRILKMDRKAVETHLLRNPHLVSDYRVLMAEIG'
  .'EDLDKSDVSSLIFLMKDYMGRGKISKEKSFLDLVVELEKLNLVAPDQLDL'
  .'LEKCLKNIHRIDLKTKIQKYKQSVQGAGTSYRNVLQAAIQKSLKDPSNNF'
  .'RLHNGRSKEQRLKEQLGAQQEPVKKSIQESEAFLPQSIPEERYKMKSKPL'
  .'GICLIIDCIGNETELLRDTFTSLGYEVQKFLHLSMHGISQILGQFACMPE'
  .'HRDYDSFVCVLVSRGGSQSVYGVDQTHSGLPLHHIRRMFMGDSCPYLAGK'
  .'PKMFFIQNYVVSEGQLEDSSLLEVDGPAMKNVEFKAQKRGLCTVHREADF'
  .'FWSLCTADMSLLEQSHSSPSLYLQCLSQKLRQERKRPLLDLHIELNGYMY'
  .'DWNSRVSAKEKYYVWLQHTLRKKLILSYT';

##### split after K or R, but not if the next character is P
##### Keep the K or R as part of the returned string
my @parts = split(/(?<=[KR])(?!P)/, $str);

##### print results
print join("\n", @parts) . "\n";
 

Author Comment

by:StephenMcGowan
Hi Adam,

Basically i have a block of 6 sequences ($orfprotein1-6).
The user can define which enzyme (choice of 4) they want to use on these 6 sequences.
Each enzyme cuts at a different place:

TRYPSIN - cuts at "K" or "R" unless the next amino acid is "P"
ENDOPROTL - cuts at "K" unless the next amino acid is "P"
ENDOPROTA - cuts at R unless the next amino acid is Proline P
V8PROT- cuts at E unless the next amino acid is P

so depending on the user choosing a certain radio button on a HTML form, a different action will be carried out on the 6 sequences and the fragments will go into a list, for example:

SAEVIHQVEEALDTDEK
EMLR
DVAIDVVPPNVR
DLALVELDILR
ER
GK
LSVGDLAELLYR
VR

Expert Comment

by:Adam314
You would use the same code, but with an if block to check what the user entered.
##### Check which radio the user selected
my $re;
if   ($cgi->param('radio') eq  'first') { $re=qr/(?<=[KR])(?!P)/; }
elsif($cgi->param('radio') eq 'second') { $re=qr/(?<=K)(?!P)/; }
elsif($cgi->param('radio') eq  'third') { $re=qr/(?<=R)(?!P)/; }
elsif($cgi->param('radio') eq 'fourth') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown radio selection\n";}

##### process all sequences
my @parts;
foreach my $seq (@orfprotein) {
    push @parts, split($re, $seq);
}

##### print results
print join("\n", @parts) . "\n";
 

Author Comment

by:StephenMcGowan
So i need to go from

Step 1.
User choosing radio button on form
Step 2.
Some way of this choice activating the specified action through CGI, in the case if Trypsin selected:

#if Trypsin selected as enzyme, cleave sequences at K and R
if my $Trypsin = $query->param('Trypsin');

my @fragments1 = split(/(?<=[KR])(?!P)/, $proteinorf1);
my @fragments2 = split(/(?<=[KR])(?!P)/, $proteinorf2);
my @fragments3 = split(/(?<=[KR])(?!P)/, $proteinorf3);
my @fragments4 = split(/(?<=[KR])(?!P)/, $proteinorf4);
my @fragments5 = split(/(?<=[KR])(?!P)/, $proteinorf5);
my @fragments6 = split(/(?<=[KR])(?!P)/, $proteinorf6);

and then join each of the arrays together into one long list.

If this is any clearer?

Expert Comment

by:Adam314

********** If your HTML is like this:
<input type="radio" name="enzyme" value="TRYPSIN" />TRYPSIN<br />
<input type="radio" name="enzyme" value="ENDOPROTL" />ENDOPROTL<br />
<input type="radio" name="enzyme" value="ENDOPROTA" />ENDOPROTA<br />
<input type="radio" name="enzyme" value="V8PROT" />V8PROT<br />

********** Then your perl needs to be like this:
my $re;
if   ($cgi->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($cgi->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

********** To break all proteins, and put them in the same array
my @parts;
foreach my $seq ($proteinorf1,$proteinorf2,$proteinorf3,$proteinorf4,$proteinorf5,$proteinorf6) {
    push @parts, split($re, $seq);
}

#Now, @parts contains everything
 

Author Comment

by:StephenMcGowan
Sorry Adam,

Just confused by this part:

##### process all sequences
my @parts;
foreach my $seq (@orfprotein) {
    push @parts, split($re, $seq);

to process all sequences, the sequences i have are $orfprotein1 through to 6. would these replace $seq, or keep "my $seq (@orfprotein) {" as you've left it?

Author Comment

by:StephenMcGowan
Hmm.. doesn't seem to be working, receiving Internal Server Error when i try it out.

Here's the script i've entered:
----------------------HTML------------------------
    <label>
      <input type='radio' name='enzyme' value='TRYPSIN' id='TRYPSIN' />
      Trypsin</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='ENDOPROTL' id='ENDOPROTL' />
      Endoproteinase Lys-C</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='ENDOPROTA' id='ENDOPROTA' />
      Endoproteinase Arg-C</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='V8PROT' id='V8PROT' />
      V8 proteinase (Glu-C)</label><br>

---------------------------PERL-------------------------------

# Select an enzyme from the radio buttons on form
my $re;
if   ($cgi->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($cgi->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}


To break all proteins, and put then in the same array
my @parts;
foreach my $seq
($proteinorf1,$proteinorf2,$proteinorf3,$proteinorf4,$proteinorf5,$proteinorf6)
{
    push @parts, split($re, $seq);
}
# Now, @parts contains everything

---------------------------------------------------------------

#HTML OUTPUT

print "Content-type: text/html

<html>
<title>Page 2</title>
<body>
Reading Frame 1: $orfprotein1;<br>
Reading Frame 2: $orfprotein2;<br>
Reading Frame 3: $orfprotein3;<br>
Reading Frame 4: $orfprotein4;<br>
Reading Frame 5: $orfprotein5;<br>
Reading Frame 6: $orfprotein6;<br><br>
Fragments:<br>
@parts;

</body>
</html>

";

Author Comment

by:StephenMcGowan
have since modified perl script,
my @parts;
foreach my $seq ($orfprotein1,$orfprotein2,$orfprotein3,$orfprotein4,$orfprotein5,$orfprotein6) {
    push @parts, split($re, $seq);
}
without any joy. :o/


Author Comment

by:StephenMcGowan
and "To break all proteins, and put then in the same array" now reads:
"# To break all proteins, and put then in the same array"

Expert Comment

by:Adam314
This code:
    foreach my $seq (@orfprotein) {
was when I assumed all of your proteins were in an array.  For processing them as individual variables, use this:
    foreach my $seq ($proteinorf1,$proteinorf2,$proteinorf3,$proteinorf4,$proteinorf5,$proteinorf6) {


In your HTML output, are you seeing literally @parts, instead of its contents?  If so, how are you generating the HTML output?  Try this:

print $cgi->header;
print "<html><body><pre>\n";
print join("\n", @parts) . "\n";
print "</pre></body></html>\n";
 

Author Comment

by:StephenMcGowan
Hi Adam,

Sorry, how would:

print $cgi->header;
print "<html><body><pre>\n";
print join("\n", @parts) . "\n";
print "</pre></body></html>\n";


fit into this:


print "Content-type: text/html

<html>
<title>Page 2</title>
<body>
Reading Frame 1: $orfprotein1;<br>
Reading Frame 2: $orfprotein2;<br>
Reading Frame 3: $orfprotein3;<br>
Reading Frame 4: $orfprotein4;<br>
Reading Frame 5: $orfprotein5;<br>
Reading Frame 6: $orfprotein6;<br><br>

</body>
</html>

";

Author Comment

by:StephenMcGowan
"In your HTML output, are you seeing literally @parts, instead of its contents?  If so, how are you generating the HTML output?"

No, i just receive an internal server error, i don't manage to see any output, just an error page.

Expert Comment

by:Adam314
Do you have access to your error logs?  If so, check them.  They will provide you with a more detailed error message.

Author Comment

by:StephenMcGowan
# Select an enzyme from the radio buttons on form
my $re;
if   ($cgi->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($cgi->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($cgi->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

Global symbol "$cgi" requires explicit package name at ./proteindigest.pl line 93

Author Comment

by:StephenMcGowan
use CGI; # a predefined module
my $query = new CGI;

is what i'm using at the start of the script.

Expert Comment

by:Adam314
I assumed:
    use CGI; # a predefined module
    my $cgi = CGI->new;

Either change your $query to $cgi (and change any other uses of $query to $cgi), or change $cgi to $query in what I posted.

Author Comment

by:StephenMcGowan
I've changed the $cgi's to $query. I've also run my script from the Unix command line with perl. This doesn't involve any data input, as the script is currently set up to take its input from the HTML text box and radio buttons.

I've attached the compile errors; you'll see they're all related to there being no input, so they're mostly uninitialized value errors.

I'll jot down my updated code:

----HTML FORM-----
#!/usr/bin/perl
# assessment.pl   # the name of the file

print "Content-type: text/html


<form id='form1' name='form1' method='post' action='ORFfinder.pl'>
  <label>Enter a DNA Sequence:<br>
    <textarea name='dna-textbox' id='dna-textbox' cols='45' rows='5'></textarea>
  </label>
  <p>Please select an enzyme:</p>
  <p>
    <label>
      <input type='radio' name='enzyme' value='TRYPSIN' id='TRYPSIN' />
      Trypsin</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='ENDOPROTL' id='ENDOPROTL' />
      Endoproteinase Lys-C</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='ENDOPROTA' id='ENDOPROTA' />
      Endoproteinase Arg-C</label>
    <br />
    <label>
      <input type='radio' name='enzyme' value='V8PROT' id='V8PROT' />
      V8 proteinase (Glu-C)</label><br>
  </p>
  <p>
    <input type='submit' name='button' id='button' value='Submit' />
     <input type='reset' name='Reset' id='Reset' value='Reset' />
       <input type='button' name='Help' id='Help' value='Help' />
       <input type='button' name='Upload' id='Upload' value='Upload' />
    <br />
  </p>
</form>
";

----PERL------

# Select an enzyme from the radio buttons on form
my $re;
if   ($query->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($query->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

# To cleave all proteins, and put them in the same array
my @parts;
foreach my $seq ($orfprotein1,$orfprotein2,$orfprotein3,$orfprotein4,$orfprotein5,$orfprotein6) {
    push @parts, split($re, $seq);
}

# Now, @parts contains everything

#HTML OUTPUT

<html>
<title>Page 2</title>
<body>
Reading Frame 1: $orfprotein1;<br>
Reading Frame 2: $orfprotein2;<br>
Reading Frame 3: $orfprotein3;<br>
Reading Frame 4: $orfprotein4;<br>
Reading Frame 5: $orfprotein5;<br>
Reading Frame 6: $orfprotein6;<br><br>
Fragments:<br>
</body>
</html>
";

picoreadout1.jpg

Expert Comment

by:Adam314
So everything works through your web browser then?

If not, you can add this to the top of your script (right after the #! line) :
    use CGI::Carp 'fatalsToBrowser';
It'll cause more detailed error messages to go to your browser, if possible.  If you still get the generic "Internal Server Error" error message, and you can't look in your error log, some common causes are:
    1) Your #! line not pointing to the correct perl binary
    2) The file not having read and execute permission for the web server user (use chmod 755 filename.pl)
    3) File having incorrect line endings (windows line endings on unix) - FTP file in ASCII mode to fix, or use dos2unix program
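
The three causes above can also be checked with a short script before uploading. This is a rough sketch rather than a standard tool: check_cgi_script is a hypothetical helper, and its permission test uses -r and -x, which reflect the user running the check and not necessarily the web server user.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of likely "Internal Server Error" causes.
sub check_cgi_script {
    my ($path) = @_;
    my @problems;

    # Read in raw mode so that \r characters are not translated away.
    open my $fh, '<:raw', $path or return ("cannot open $path: $!");
    my $first = <$fh>;
    $first = '' unless defined $first;
    my $rest = do { local $/; <$fh> };
    close $fh;
    my $content = $first . (defined $rest ? $rest : '');

    # 1) The #! line must name an interpreter that exists and is executable.
    if ($first =~ /^#!\s*(\S+)/) {
        my $interpreter = $1;
        push @problems, "interpreter $interpreter not found or not executable"
            unless -x $interpreter;
    }
    else {
        push @problems, 'no #! line';
    }

    # 2) Checked for the current user only (assumption, see above).
    push @problems, 'not readable and executable' unless -r $path && -x $path;

    # 3) Windows line endings on a Unix server.
    push @problems, 'Windows (CRLF) line endings' if $content =~ /\r\n/;

    return @problems;
}

# Check every file named on the command line.
for my $script (@ARGV) {
    my @problems = check_cgi_script($script);
    print "$script: ", (@problems ? join('; ', @problems) : 'looks OK'), "\n";
}
```

Run it as, for example, perl check.pl proteindigest.pl (check.pl being whatever name the sketch is saved under).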

Author Comment

by:StephenMcGowan
Hey Adam,

i went for the  use CGI::Carp 'fatalsToBrowser'; approach and it's pulled up this software error:

Software error:
Can't locate object method "new" via package "CGI" at /home/march09/campus12sm/public_html/ORFfinder.pl line 14.

Is this a method in the CGI module which i don't have?

If this is the case i'm guessing the programme falls over at this line:

my $query = new CGI;

This is in my ORFfinder.pl script

In this script the query takes the dna sequence from a textbox and feeds it in as string -> $dna1

$dna1 = $query->param('dna-textbox');

Author Comment

by:StephenMcGowan
I think:

"Software error:
Can't locate object method "new" via package "CGI" at /home/march09/campus12sm/public_html/ORFfinder.pl line 14."

was generated because I removed "use cgi;" and replaced it with "use CGI::Carp 'fatalsToBrowser';"

I've put "use cgi;" back in its place again and received the error:

Software error:
Can't locate cgi.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.8.7 /usr/local/share/perl/5.8.7 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl /usr/local/lib/perl/5.8.4 /usr/local/share/perl/5.8.4 .) at /home/march09/campus12sm/public_html/ORFfinder.pl line 11.
BEGIN failed--compilation aborted at /home/march09/campus12sm/public_html/ORFfinder.pl line 11.


Author Comment

by:StephenMcGowan
I find this strange, as:

Software error:
Can't locate cgi.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.8.7 /usr/local/share/perl/5.8.7 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl /usr/local/lib/perl/5.8.4 /usr/local/share/perl/5.8.4 .) at /home/march09/campus12sm/public_html/ORFfinder.pl line 11.
BEGIN failed--compilation aborted at /home/march09/campus12sm/public_html/ORFfinder.pl line 11.

seems to indicate CGI.pm hasn't been installed, but I have previously used CGI to pass form data from a text box to a Perl script and it worked fine.

Here's the CGI i have used so far for both my scripts:

-------Script1 ORFfinder.pl--------

use CGI::Carp 'fatalsToBrowser';
use cgi;
my $query = new CGI;

#Text box info to perl
$dna1 = $query->param('dna-textbox');

-------Script2 proteindigest.pl--------

use CGI::Carp 'fatalsToBrowser';
use cgi;
my $query = new CGI;

# Select an enzyme from the radio buttons on form
my $re;
if   ($query->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($query->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

This is all of the CGI code for my two scripts. I'm not sure if I need to call my $query = new CGI; twice; you'll see I have called it in each script.

Author Comment

by:StephenMcGowan
I think i've sorted most of it now, i was calling "use cgi;" but changed this to "use CGI;"

I'm now having problems displaying @parts;

my current html output reads:

#HTML OUTPUT

<html>
<title>Page 2</title>
<body>
Reading Frame 1: $orfprotein1;<br>
Reading Frame 2: $orfprotein2;<br>
Reading Frame 3: $orfprotein3;<br>
Reading Frame 4: $orfprotein4;<br>
Reading Frame 5: $orfprotein5;<br>
Reading Frame 6: $orfprotein6;<br><br>
Fragments:<br> #<- below fragments would be a long list of the fragments (@parts)
(i.e. print join("\n", @parts) . "\n";)
</body>
</html>
";
 
i've tried:
Fragments:<br>
print join("\n", @parts) . "\n";         <- within the body

and receive the error:
syntax error at ./proteindigest.pl line 121, near "join("\"
Execution of ./proteindigest.pl aborted due to compilation errors.
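
The syntax error most likely arises because the print join(...) line sits inside the double-quoted string that holds the HTML: the first " in join("\n" ends that string early. Only variables interpolate inside a string; function calls such as join do not run there. One way around this, sketched below with a hypothetical helper named render_page and made-up sample data, is to run join first and interpolate the result.

```perl
use strict;
use warnings;

# Hypothetical helper: build the whole results page as one string.
sub render_page {
    my ($frames, $parts) = @_;    # two array references

    # One "Reading Frame N: ..." line per sequence.
    my $frame_html = '';
    for my $i (0 .. $#$frames) {
        $frame_html .= 'Reading Frame ' . ($i + 1) . ": $frames->[$i]<br>\n";
    }

    # Run join() outside the string, then interpolate its result.
    my $fragment_html = join "<br>\n", @$parts;

    return <<"END_HTML";
Content-type: text/html

<html>
<head><title>Page 2</title></head>
<body>
$frame_html<br>
Fragments:<br>
$fragment_html
</body>
</html>
END_HTML
}

# Made-up sample data for illustration only.
print render_page(['AKB'], ['AK', 'B']);
```

Writing @parts directly inside a double-quoted string does interpolate, but the elements are joined with single spaces rather than line breaks.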

Expert Comment

by:Adam314
To generate the HTML, use this.  Remove all your existing code that is displaying html.
print $cgi->header;

print "<html><body>\n";
print "Reading Frame 1: $orfprotein1;<br>\n";
print "Reading Frame 2: $orfprotein2;<br>\n";
print "Reading Frame 3: $orfprotein3;<br>\n";
print "Reading Frame 4: $orfprotein4;<br>\n";
print "Reading Frame 5: $orfprotein5;<br>\n";
print "Reading Frame 6: $orfprotein6;<br><br>\n";
print "Parts:<br>\n";
print join("<br>\n", @parts) . "<br>\n";
print "</body></html>\n";
 

Author Comment

by:StephenMcGowan
Hi Adam,

I have output! buuuuut.... it seems that no matter which enzyme i select, it cleaves after every letter!
giving a long list of letters!

If i select Trypsin, it should only cut after letters K and R

so:
SAEVIHQVEEALDTDEK
EMLR
DVAIDVVPPNVR
etc etc...

here's the cutting code!

# Select an enzyme from the radio buttons on form
my $re;
if   ($query->param('enzyme') eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($query->param('enzyme') eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($query->param('enzyme') eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}


# To cleave all proteins, and put them in the same array
my @parts;
foreach my $seq ($orfprotein1,$orfprotein2,$orfprotein3,$orfprotein4,$orfprotein5,$orfprotein6) {
    push @parts, split($re, $seq);
}

# Now, @parts contains everything

Thanks for everything so far by the way Adam,

Much appreciated!

Stephen.
output1.jpg

Accepted Solution

by:Adam314
Try this code, run it in a script from the command line.  Try setting $enzyme to all of the possible values.  What is your output?  Is it what you wanted?

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
$Data::Dumper::Indent = 1;

my $orfprotein1 =
   'SAEVIHQVEEALDTDEKEMLRDVAIDVVPPNVRDLALVELDILRERGKLS'
  .'VGDLAELLYRVRRFDLLKRILKMDRKAVETHLLRNPHLVSDYRVLMAEIG'
  .'EDLDKSDVSSLIFLMKDYMGRGKISKEKSFLDLVVELEKLNLVAPDQLDL'
  .'LEKCLKNIHRIDLKTKIQKYKQSVQGAGTSYRNVLQAAIQKSLKDPSNNF'
  .'RLHNGRSKEQRLKEQLGAQQEPVKKSIQESEAFLPQSIPEERYKMKSKPL'
  .'GICLIIDCIGNETELLRDTFTSLGYEVQKFLHLSMHGISQILGQFACMPE'
  .'HRDYDSFVCVLVSRGGSQSVYGVDQTHSGLPLHHIRRMFMGDSCPYLAGK'
  .'PKMFFIQNYVVSEGQLEDSSLLEVDGPAMKNVEFKAQKRGLCTVHREADF'
  .'FWSLCTADMSLLEQSHSSPSLYLQCLSQKLRQERKRPLLDLHIELNGYMY'
  .'DWNSRVSAKEKYYVWLQHTLRKKLILSYT';

my $enzyme = 'V8PROT';

my $re;
if   ($enzyme eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($enzyme eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($enzyme eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($enzyme eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

my @parts;
push @parts, split($re, $orfprotein1);

print "Parts:\n    " . join("\n    ", @parts) . "\n";
 

Author Comment

by:StephenMcGowan
I've basically copied/pasted the above script into a new file "testenzyme.pl" and tried running it with perl.

I'm not sure if i have XML installed?

errored: "Can't locate XML/Simple.pm in @INC"

:o/
0
 
LVL 39

Expert Comment

by:Adam314
Remove line 5 (use XML::Simple;). That was left over from another question.

Author Comment

by:StephenMcGowan
Yep! exactly like that! so it would do the same for all six $orfproteins and return a long list,

Just like that!

:oD

Author Comment

by:StephenMcGowan
This is the call I've tried from the HTML form, determining the button chosen and its action:

I took your script above and went for the "my $enzyme" approach.

my $enzyme = $query->param('enzyme');

I'm not sure if the param field ('enzyme') is correct.

HTML FORM BUTTON:

<input type='radio' name='enzyme' value='TRYPSIN' id='TRYPSIN' />

my $enzyme = $query->param('enzyme');

# Select an enzyme from the radio buttons on form
my $re;
if   ($enzyme eq   'TRYPSIN') { $re=qr/(?<=[KR])(?!P)/; }
elsif($enzyme eq 'ENDOPROTL') { $re=qr/(?<=K)(?!P)/; }
elsif($enzyme eq 'ENDOPROTA') { $re=qr/(?<=R)(?!P)/; }
elsif($enzyme eq    'V8PROT') { $re=qr/(?<=E)(?!P)/; }
else {die "Unknown enzyme selection\n";}

# To cleave all proteins, and put them in the same array
my @parts;
foreach my $seq ($orfprotein1,$orfprotein2,$orfprotein3,$orfprotein4,$orfprotein5,$orfprotein6) {
    push @parts, split($re, $seq);
}

# Now, @parts contains everything
 

Author Comment

by:StephenMcGowan
Think i'm onto something here, but hit another stumbling block!

I submit the sequence in text box, hit a protein radio button, hit submit but receive the error message:

Software error:
Not a CODE reference at ./proteindigest.pl line 87.

which is the line:

my $enzyme = $query->param('enzyme');

~~~html radio button~~~

<input type='radio' name='enzyme' value='TRYPSIN' id='TRYPSIN' />

so trying to link the two but hitting the CODE reference error

Expert Comment

by:Adam314
Do you have this working now?  I'm assuming so, since you've accepted a post.  If not, let me know.

Author Comment

by:StephenMcGowan
Yeah, I think I've got it up and running now. The only problems I'm having now are on my other posts. In the end I put the two scripts into one extended script and made it work that way; I was having trouble calling different CGI elements in two separate scripts.