Perl: print statements, print to screen, print to file problems

perl newbie

I am a newbie.
I know some of the basics of Perl programming.
I now have to write code that will:
open a file
read from the file
remove the header
print the header-less portion to a new file.
This is my fourth week with Perl, and I can handle STDIN.
My question: where in the code should I place my "print to file" statement?
I guess my understanding of the "flow" of the program is messed up.
Where should I declare the variables with "my"?
At the very beginning? Outside the subroutine?

Here is the code:

print "PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=";
chomp($seq_filename=<STDIN>);
#
open(PROTFILE,$seq_filename) or die "unable to open the file";
@seq=<PROTFILE>;
close PROTFILE;
#
#
foreach $newline (@seq) {
#
## discard blank newline
if ($newline =~ /^\s*$/) {
next;

## discard comment newline
} elsif($newline =~ /^\s*/) {
next;

# discard fasta header newline
} elsif($newline =~ /^>/) {
next;

## keep newline, add to sequence string
} else {
$sequence1 .= $newline;
}
#
}

# remove non-sequence data (in this case, whitespace) from $sequence string
#Remove whitespace
$newline =~ s/\s//g;
@seq=split("",$newline); #splits string into an array

print " \nThe original file is:\n$sequence1 \n";

SOLUTION

Jason Minton

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

thestarcrossed

ASKER

Hello Jason,
Thank you for the code.
The trouble is that when I used it yesterday and today,
it gave me this error message:

Filehandle NEWFILE opened only for input at nonseqdat.pl line 63.

Thank you for your prompt response!

SOLUTION

Talmash

Jason Minton

Yes, I forgot the '>'.

Do what Talmash says. Or, if you want to open the file for appending, use '>>'.
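
For readers who cannot see the hidden solutions, here is a minimal sketch of the open modes being discussed. The file name and the lexical filehandles are assumptions made for illustration, and the three-argument form of open is a stylistic choice here, not necessarily what the original solution used.

```perl
use strict;
use warnings;

my $file = 'demo_output.dat';    # hypothetical file name

# '>' opens for writing and truncates any existing content.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!\n";
print $out "first line\n";
close $out;

# '>>' opens for appending; existing content is kept.
open($out, '>>', $file) or die "Cannot open $file for appending: $!\n";
print $out "second line\n";
close $out;

# '<' (or no mode at all) opens for reading only. Printing to such a
# handle produces the "opened only for input" warning under 'use warnings'.
open(my $in, '<', $file) or die "Cannot open $file for reading: $!\n";
my @lines = <$in>;
close $in;

print "Read ", scalar(@lines), " lines from $file\n";
```

Leaving the mode off, as in open(NEWFILE, "newfile.dat"), behaves like the read-only case, which would explain the error message quoted above.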

thestarcrossed

ASKER

I guess there is a problem with the logic flow in my program.

I used ">>" and yet it's still giving me error messages. I give up for now.

ozo

what error messages is it giving you?

Jason Minton

You probably don't want to use ">>"; use ">" instead. I was just letting you know that ">>" is available if you want to append to the file later.

What is the error?

ozo

} elsif($newline =~ /^\s*/) {
this will always match, since there will always be at least zero whitespace characters at the start of the line
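
A small sketch that shows the point. The sample lines are made up for illustration; the two patterns are the ones from the posted code.

```perl
use strict;
use warnings;

# Hypothetical sample lines: sequence, indented sequence, blank, FASTA header.
my @samples = ("ACGT\n", "   ACGT\n", "\n", ">header\n");

my (@loose, @blank_only);
for my $line (@samples) {
    # /^\s*/ succeeds on every string, because zero whitespace is a valid match.
    push @loose, ($line =~ /^\s*/ ? 1 : 0);
    # /^\s*$/ requires the whole line to be whitespace, so only blank lines match.
    push @blank_only, ($line =~ /^\s*$/ ? 1 : 0);
}

for my $n (0 .. $#samples) {
    (my $shown = $samples[$n]) =~ s/\n/\\n/;    # make the newline visible
    printf "%-12s /^\\s*/ matched: %d   /^\\s*\$/ matched: %d\n",
        $shown, $loose[$n], $blank_only[$n];
}
```

Because the first pattern matches every line, the elsif branch that uses it skips everything that is not blank, and nothing ever reaches the else branch.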

SOLUTION

ozo

Tintin

I'd approach it like:

^\s*

Means match zero or more whitespace from the beginning of the line. This will match *all* lines.

What do your comment lines begin with?
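
Pending an answer to that, here is a rough sketch of the filtering loop, assuming comment lines begin with '#'. That marker is a guess, and the sub name strip_headers and the in-memory filehandles are only there to keep the example self-contained. It also shows one possible answer to the original question: the print to the output handle sits inside the read loop, after the lines to be discarded have been skipped.

```perl
use strict;
use warnings;

# Copy sequence lines from one handle to another, skipping blank lines,
# FASTA headers ('>'), and comment lines (ASSUMED to start with '#').
sub strip_headers {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        next if $line =~ /^\s*$/;    # blank line
        next if $line =~ /^\s*#/;    # comment line (assumed marker)
        next if $line =~ /^>/;       # FASTA header
        $line =~ s/\s+//g;           # drop all whitespace, including the newline
        print {$out} "$line\n";      # print-to-file happens here, once per kept line
    }
    return;
}

# Demo with in-memory handles; real code would open files by name instead.
my $input  = ">seq1 sample header\n\n# a comment\nACGT ACGT\nTTGA\n";
my $output = '';
open(my $in,  '<', \$input)  or die "Cannot read from string: $!\n";
open(my $out, '>', \$output) or die "Cannot write to string: $!\n";
strip_headers($in, $out);
close $in;
close $out;

print $output;
```

The variables are declared with my at the point where they are first needed, which is one common convention under use strict.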

SOLUTION

Tintin

thestarcrossed

ASKER

Error messages:
"my" variable $input masks earlier declaration in same scope at codon3.pl line 18.
Name "main::INFILE" used only once: possible typo at codon3.pl line 112.
PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=25na.pep
25na.pep
Use of uninitialized value in substr at codon3.pl line 33, <STDIN> line 2.
Use of uninitialized value in substr at codon3.pl line 33, <STDIN> line 2.
codon2aaTCASTCCSTCGSTCTSTTCFTTTFTTALTTGLTACYTATYTAA_TAG_TGCCTGTCTGA_TGGWCTALCTCL
CTGLCTTLCCAPCCCPCCGPCCTPCACHCATHCAAQCAGQCGARCGCRCGGRCGTRATAIATCIATTIATGMACATACCT
ACGTACTTAACNAATNAAAKAAGKAGCSAGTSAGARAGGRGTAVGTCVGTGVGTTVGCAAGCCAGCGAGCTAGACDGATD
GAAEGAGEGGAGGGCGGGGGGGTGUse of uninitialized value in concatenation (.) or string at codon3.pl line 110, <STDIN> line 2.
I translated the sequence

into the protein

128

Use of uninitialized value in print at codon3.pl line 111, <STDIN> line 2.

C:\Perl>

My code:

#!/usr/bin/perl
use strict;
print "PLEASE ENTER THE FILENAME OF THE YOUR SEQUENCE:=";
chomp(my $input=<STDIN>);

open PROTFILE,$input or die "Unable to open $input $!\n";
open NEWFILE, ">newfile.dat" or die "Can not open /path/to/newfile.dat $!\n";
#open (OUTFILE,">>$myfile");
while (<PROTFILE>) {
next if (/^\s*$|^>/);
next if (/^\s*/);

s/\s+//g;
print NEWFILE "$_\n";
}
use strict;
use warnings;
my $input;
my @newarray1;
my $newline;
my $i;
my $myfile;
my $protein;
my $codon;
my $myresults;
my %genetic_code;
my $sequence;
$myfile = 'codon.txt';
$input = <STDIN>;
open (PROTFILE, $input);
open (OUTFILE,">>$myfile");
while($input){
$codon = substr($newline,$i,3);
$protein .= codon2aa($codon);
##calling the sub routine
## codon2aa
# # A subroutine to translate a sequence 3-character codon to an amino acid
# Version 3, using hash lookup
print "codon2aa";
sub codon2aa {
my($codon) = @newarray1;
$codon = uc ($codon);
%genetic_code = (
'TCA' => 'S', # Serine
'TCC' => 'S', # Serine
'TCG' => 'S', # Serine
'TCT' => 'S', # Serine
'TTC' => 'F', # Phenylalanine
'TTT' => 'F', # Phenylalanine
'TTA' => 'L', # Leucine
'TTG' => 'L', # Leucine
'TAC' => 'Y', # Tyrosine
'TAT' => 'Y', # Tyrosine
'TAA' => '_', # Stop
'TAG' => '_', # Stop
'TGC' => 'C', # Cysteine
'TGT' => 'C', # Cysteine
'TGA' => '_', # Stop
'TGG' => 'W', # Tryptophan
'CTA' => 'L', # Leucine
'CTC' => 'L', # Leucine
'CTG' => 'L', # Leucine
'CTT' => 'L', # Leucine
'CCA' => 'P', # Proline
'CCC' => 'P', # Proline
'CCG' => 'P', # Proline
'CCT' => 'P', # Proline
'CAC' => 'H', # Histidine
'CAT' => 'H', # Histidine
'CAA' => 'Q', # Glutamine
'CAG' => 'Q', # Glutamine
'CGA' => 'R', # Arginine
'CGC' => 'R', # Arginine
'CGG' => 'R', # Arginine
'CGT' => 'R', # Arginine
'ATA' => 'I', # Isoleucine
'ATC' => 'I', # Isoleucine
'ATT' => 'I', # Isoleucine
'ATG' => 'M', # Methionine
'ACA' => 'T', # Threonine
'ACC' => 'T', # Threonine
'ACG' => 'T', # Threonine
'ACT' => 'T', # Threonine
'AAC' => 'N', # Asparagine
'AAT' => 'N', # Asparagine
'AAA' => 'K', # Lysine
'AAG' => 'K', # Lysine
'AGC' => 'S', # Serine
'AGT' => 'S', # Serine
'AGA' => 'R', # Arginine
'AGG' => 'R', # Arginine
'GTA' => 'V', # Valine
'GTC' => 'V', # Valine
'GTG' => 'V', # Valine
'GTT' => 'V', # Valine
'GCA' => 'A', # Alanine
'GCC' => 'A', # Alanine
'GCG' => 'A', # Alanine
'GCT' => 'A', # Alanine
'GAC' => 'D', # Aspartic Acid
'GAT' => 'D', # Aspartic Acid
'GAA' => 'E', # Glutamic Acid
'GAG' => 'E', # Glutamic Acid
'GGA' => 'G', # Glycine
'GGC' => 'G', # Glycine
'GGG' => 'G', # Glycine
'GGT' => 'G', # Glycine
);} print OUTFILE codon2aa ;
print codon2aa;
print "I translated the sequence\n\n$sequence\n\n into the protein\n\n$protein\n\n";
print OUTFILE $myresults;
close INFILE;
close OUTFILE;
exit; }

SOLUTION

ozo

Tintin

You can't just concatenate two scripts together and hope it will work. My script was based on the original code you supplied; I didn't realise you had a whole lot of other stuff in the script as well.

ozo

You declare my $input twice
you never open INFILE, so what are you closing?
you never assign a value to $myresults or to $sequence

ozo

you never assign any value to $newline or $i, so what should substr($newline,$i,3) do?
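
To make those points concrete, here is a small sketch of how the translation loop could be laid out. It is not the hidden solution. The codon table is cut down to four entries copied from the posted table, the input sequence is hypothetical, and returning 'X' for an unknown codon is an assumption of this sketch.

```perl
use strict;
use warnings;

# Partial codon table for illustration only (entries copied from the posted table).
my %genetic_code = (
    'ATG' => 'M',    # Methionine
    'TTT' => 'F',    # Phenylalanine
    'GGA' => 'G',    # Glycine
    'TAA' => '_',    # Stop
);

# Translate one codon. The codon arrives as an argument in @_,
# and the sub is defined at file level, not inside the loop.
sub codon2aa {
    my ($codon) = @_;
    $codon = uc $codon;
    # 'X' for unknown codons is an assumption of this sketch.
    return exists $genetic_code{$codon} ? $genetic_code{$codon} : 'X';
}

my $sequence = 'atgtttggataa';    # hypothetical input sequence
my $protein  = '';

# $i starts at 0 and advances one codon (3 bases) per pass,
# so substr always has a defined string and offset to work with.
for (my $i = 0; $i + 3 <= length($sequence); $i += 3) {
    my $codon = substr($sequence, $i, 3);
    $protein .= codon2aa($codon);
}

print "I translated the sequence\n\n$sequence\n\ninto the protein\n\n$protein\n";
```

Each variable is declared once with my and given a value before it is used, which avoids both the "masks earlier declaration" and the "uninitialized value" warnings.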

thestarcrossed

ASKER

So that was it.. I will work on these, OZO, Tintin.
Thank you so much..
I tried learning this on my own.

ASKER CERTIFIED SOLUTION

mjcoyne
