thestarcrossed

Perl: print statements, print to screen, print to file problems

perl newbie

I am a newbie.
I know some of the basics of Perl programming.
I now have to write code that will:
open a file
read from the file
remove the header
print the headerless portion to a new file.
This is my fourth week with Perl and I can handle STDIN.
My question: where in the code should I place my "print to file" statement?
I guess my understanding of the "flow" of the program is messed up.
Where should I declare the variables with "my"?
At the very beginning? Outside the subroutine?



Here is the code:


print "PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=";
chomp($seq_filename=<STDIN>);
#
open(PROTFILE,$seq_filename) or die "unable to open the file";
@seq=<PROTFILE>;
close PROTFILE;
#
#
foreach $newline (@seq) {
#
## discard blank newline
if ($newline =~ /^\s*$/) {
next;

## discard comment newline
} elsif($newline =~ /^\s*/) {
next;

# discard fasta header newline
} elsif($newline =~ /^>/) {
next;

## keep newline, add to sequence string
} else {
$sequence1 .= $newline;
}
#
}

# remove non-sequence data (in this case, whitespace) from $sequence string
 #Remove whitespace
$newline =~ s/\s//g;
@seq=split("",$newline); #splits string into an array

print " \nThe original file is:\n$sequence1 \n";


SOLUTION
Jason Minton
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
thestarcrossed

ASKER

Hello Jason,
Thank you for the code.
The trouble is that when I used it yesterday and today,
it gave me this error message:

Filehandle NEWFILE opened only for input at nonseqdat.pl line 63.


Thank you for your prompt response!
SOLUTION
Yes, I forgot the '>'.

Do what Talmash says... Or, if you want to open for append, use '>>'.
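
For reference, here is a minimal sketch of the three open modes. It assumes a throwaway temp file rather than the asker's real data, so the file and its contents are made up for illustration. A handle opened without '>' or '>>' is read-only, which is what the "opened only for input" message is complaining about.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, used only for this illustration.
my ($tmp_fh, $path) = tempfile();
close $tmp_fh;

# '>' opens for writing and truncates any existing content.
open(my $out, '>', $path) or die "Cannot open $path for writing: $!";
print $out "first line\n";
close $out;

# '>>' opens for appending; existing content is kept.
open($out, '>>', $path) or die "Cannot open $path for appending: $!";
print $out "second line\n";
close $out;

# '<' (the default when no mode is given) opens for reading only.
open(my $in, '<', $path) or die "Cannot open $path for reading: $!";
my @lines = <$in>;
close $in;

print @lines;
unlink $path;
```

The three-argument form of open keeps the mode separate from the filename, so a missing '>' is easier to spot.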
I guess there is a problem with the logic flow in my program.

I used the ">>" and yet it's giving me error messages. I give up for now.
ozo
What error messages is it giving you?
You probably don't want to use ">>"; use ">" instead. I was just letting you know that ">>" is available if you want to append to the file later.

What is the error?
} elsif($newline =~ /^\s*/) {
This will always match, since there will always be at least zero whitespace characters at the start of the line.
SOLUTION
I'd approach it like:

^\s*

Means match zero or more whitespace from the beginning of the line.  This will match *all* lines.

What do your comment lines begin with?
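
To see the difference, here is a small sketch that runs both patterns over a few made-up lines. It assumes comment lines start with '#', which the thread has not confirmed, so swap in whatever marker the file really uses.

```perl
use strict;
use warnings;

# Made-up sample lines: FASTA header, comment, blank line, sequence data.
my @samples = (">seq1 header\n", "# a comment\n", "\n", "MKTAYIAK\n");

# /^\s*/ succeeds on every line, because zero whitespace characters is allowed.
my @always = grep { /^\s*/ } @samples;

# Requiring a '#' after the optional whitespace matches comment lines only.
# ('#' as the comment marker is an assumption.)
my @comments = grep { /^\s*#/ } @samples;

# Lines that survive once blanks, comments and FASTA headers are skipped.
my @kept = grep { !/^\s*$/ && !/^\s*#/ && !/^>/ } @samples;

printf "/^\\s*/ matched %d of %d lines\n", scalar @always, scalar @samples;
printf "/^\\s*#/ matched %d of %d lines\n", scalar @comments, scalar @samples;
print "kept: @kept";
```

With the unanchored-looking `/^\s*/` test in the loop, every line hits `next`, so nothing is ever printed to the new file.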
SOLUTION
Error messages:
"my" variable $input masks earlier declaration in same scope at codon3.pl line 18.
Name "main::INFILE" used only once: possible typo at codon3.pl line 112.
PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=25na.pep
25na.pep
Use of uninitialized value in substr at codon3.pl line 33, <STDIN> line 2.
Use of uninitialized value in substr at codon3.pl line 33, <STDIN> line 2.
codon2aaTCASTCCSTCGSTCTSTTCFTTTFTTALTTGLTACYTATYTAA_TAG_TGCCTGTCTGA_TGGWCTALCTCL
CTGLCTTLCCAPCCCPCCGPCCTPCACHCATHCAAQCAGQCGARCGCRCGGRCGTRATAIATCIATTIATGMACATACCT
ACGTACTTAACNAATNAAAKAAGKAGCSAGTSAGARAGGRGTAVGTCVGTGVGTTVGCAAGCCAGCGAGCTAGACDGATD
GAAEGAGEGGAGGGCGGGGGGGTG
Use of uninitialized value in concatenation (.) or string at codon3.pl line 110, <STDIN> line 2.
I translated the sequence



 into the protein

128

Use of uninitialized value in print at codon3.pl line 111, <STDIN> line 2.

C:\Perl>



My code:

#!/usr/bin/perl
use strict;
print "PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=";
chomp(my $input=<STDIN>);

open PROTFILE,$input  or die "Unable to open $input $!\n";
open NEWFILE, ">newfile.dat" or die "Can not open /path/to/newfile.dat $!\n";
#open (OUTFILE,">>$myfile");
while (<PROTFILE>) {
  next if (/^\s*$|^>/);
  next if (/^\s*/);

  s/\s+//g;
  print NEWFILE "$_\n";
}
use strict;
use warnings;
my $input;
my @newarray1;
my $newline;
my $i;
my $myfile;
my $protein;
my $codon;
my $myresults;
my %genetic_code;
my $sequence;
$myfile = 'codon.txt';
$input = <STDIN>;
open (PROTFILE, $input);
open (OUTFILE,">>$myfile");
while($input){
$codon = substr($newline,$i,3);
$protein .= codon2aa($codon);
 ##calling the sub routine
## codon2aa
# # A subroutine to translate a sequence 3-character codon to an amino acid
# Version 3, using hash lookup
print "codon2aa";
sub codon2aa {
my($codon) = @newarray1;
$codon = uc ($codon);
 %genetic_code = (
'TCA' => 'S', # Serine
'TCC' => 'S', # Serine
'TCG' => 'S', # Serine
'TCT' => 'S', # Serine
'TTC' => 'F', # Phenylalanine
'TTT' => 'F', # Phenylalanine
'TTA' => 'L', # Leucine
'TTG' => 'L', # Leucine
'TAC' => 'Y', # Tyrosine
'TAT' => 'Y', # Tyrosine
'TAA' => '_', # Stop
'TAG' => '_', # Stop
'TGC' => 'C', # Cysteine
'TGT' => 'C', # Cysteine
'TGA' => '_', # Stop
'TGG' => 'W', # Tryptophan
'CTA' => 'L', # Leucine
'CTC' => 'L', # Leucine
'CTG' => 'L', # Leucine
'CTT' => 'L', # Leucine
'CCA' => 'P', # Proline
'CCC' => 'P', # Proline
'CCG' => 'P', # Proline
'CCT' => 'P', # Proline
'CAC' => 'H', # Histidine
'CAT' => 'H', # Histidine
'CAA' => 'Q', # Glutamine
'CAG' => 'Q', # Glutamine
'CGA' => 'R', # Arginine
'CGC' => 'R', # Arginine
'CGG' => 'R', # Arginine
'CGT' => 'R', # Arginine
'ATA' => 'I', # Isoleucine
'ATC' => 'I', # Isoleucine
'ATT' => 'I', # Isoleucine
'ATG' => 'M', # Methionine
'ACA' => 'T', # Threonine
'ACC' => 'T', # Threonine
'ACG' => 'T', # Threonine
'ACT' => 'T', # Threonine
'AAC' => 'N', # Asparagine
'AAT' => 'N', # Asparagine
'AAA' => 'K', # Lysine
'AAG' => 'K', # Lysine
'AGC' => 'S', # Serine
'AGT' => 'S', # Serine
'AGA' => 'R', # Arginine
'AGG' => 'R', # Arginine
'GTA' => 'V', # Valine
'GTC' => 'V', # Valine
'GTG' => 'V', # Valine
'GTT' => 'V', # Valine
'GCA' => 'A', # Alanine
'GCC' => 'A', # Alanine
'GCG' => 'A', # Alanine
'GCT' => 'A', # Alanine
'GAC' => 'D', # Aspartic Acid
'GAT' => 'D', # Aspartic Acid
'GAA' => 'E', # Glutamic Acid
'GAG' => 'E', # Glutamic Acid
'GGA' => 'G', # Glycine
'GGC' => 'G', # Glycine
'GGG' => 'G', # Glycine
'GGT' => 'G', # Glycine
);} print OUTFILE codon2aa ;
print codon2aa;
print "I translated the sequence\n\n$sequence\n\n into the protein\n\n$protein\n\n";
 print OUTFILE $myresults;
 close INFILE;
close OUTFILE;
exit;   }

SOLUTION
You can't just concatenate two scripts together and hope it will work. My script was based on the original code you supplied; I didn't realise you had a whole lot of other stuff in the script as well.

You declare my $input twice.
You never open INFILE, so what are you closing?
You never assign a value to $myresults or to $sequence.
You never assign any value to $newline or $i, so what should substr($newline,$i,3) do?
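
As a rough illustration of those points, here is a minimal sketch of the translation step on its own. It assumes the sequence is already in a string (the hard-coded ATGTTTTAA is made up), uses only a handful of entries from the table in the posted script, and returns 'X' for codons missing from the abridged table, which is a placeholder choice and not something from the thread. The translate helper is also a hypothetical name. Each variable is declared with "my" where it is first needed, and the subroutine reads its argument from @_ instead of an empty array.

```perl
use strict;
use warnings;

# Abridged table for illustration; the full 64-codon hash from the post goes here.
my %genetic_code = (
    'ATG' => 'M',    # Methionine
    'TTT' => 'F',    # Phenylalanine
    'TTC' => 'F',    # Phenylalanine
    'GGA' => 'G',    # Glycine
    'TAA' => '_',    # Stop
);

# Translate one 3-character codon to an amino acid by hash lookup.
sub codon2aa {
    my ($codon) = @_;    # the argument arrives in @_
    $codon = uc $codon;
    # 'X' for unknown codons is an assumption made for this sketch.
    return exists $genetic_code{$codon} ? $genetic_code{$codon} : 'X';
}

# Walk the sequence three bases at a time; a trailing partial codon is ignored.
sub translate {
    my ($sequence) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($sequence); $i += 3) {
        $protein .= codon2aa(substr($sequence, $i, 3));
    }
    return $protein;
}

my $sequence = 'ATGTTTTAA';    # made-up example input
my $protein  = translate($sequence);
print "I translated the sequence\n\n$sequence\n\n into the protein\n\n$protein\n\n";
```

Defining the subroutines outside the loop, and calling them with an explicit argument, means $i and the codon both have values by the time substr runs.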
So that was it. I will work on these, OZO, Tintin.
Thank you so much.
I tried learning this on my own.

ASKER CERTIFIED SOLUTION