Solved

How to get rid of blank lines in the output from this Perl script

Posted on 2006-11-05
Last Modified: 2010-03-05
Folks,
I have a Perl script that parses the following sample file:
gene            1995..3119
                     /gene="dnaN"
                     /locus_tag="AAur_0002"
     CDS             1995..3119
                     /gene="dnaN"
                     /locus_tag="AAur_0002"
                     /EC_number="2.7.7.7"
                     /note="identified by match to protein family HMM PF00712;
                     match to protein family HMM PF02767; match to protein
                     family HMM PF02768; match to protein family HMM TIGR00663"
                     /codon_start=1
                     /transl_table=11
                     /product="DNA polymerase III, beta subunit"
                     /protein_id="tigr:AAur_0002"
                     /translation="MKFRVDRDVLAEAVTWTARSLSPRPPVPVLSGLLLKAEAGTVSL
                     SSFDYETSARLEIPADIAVEGTILVSGRLLADICRSLPSAPVEVETDGSKVTLTCRRS
                     SFHLATMPESEYPALPALPAISGTLPGDAFAQAVSQVIIAASKDDTLPILTGVRMEIE
                     DDLITLLATDRYRLAMREVPWKPVTPGISTSALVKSKTLNEVAKTLGGSGDINLALAD
                     DDSRLIGFESGGRTTTSLLVDGDYPKIRSLFPDSTPIHATVQTQELVEAVRRVSLVAE
                     RNTPVRLAFTQGLLNLDAGTGEDAQASEELEAQLSGEDITVAFNPHYLVEGLSVIETK
                     YVRFSFTTAPKPAMITAQAEADGEDQDDYRYLVMPVRLPN"
gene            5318..5872
                     /locus_tag="AAur_0005"
     CDS             5318..5872
                     /locus_tag="AAur_0005"
                     /note="identified by match to protein family HMM PF05258"
                     /codon_start=1
                     /transl_table=11
                     /product="putative protein of unknown function (DUF721)"
                     /protein_id="tigr:AAur_0005"
                     /translation="MAKDSRDGLQPGREPDEIDAAQAALNRMREAAAARGEVRQRAPR
                     PGSAPKRQGLRDTRGFAQFHGSGRDPLGLGKVVGRLVAERGWTSPVAVGSVMAEWETL
                     VGPDISSHCTPESFTDTTLHVRCDSTAWATQLRLLSTSLLEMFRNELGEGVVTSIHVL
                     GPSAPSWRKGGRSVNGRGPRDTYG"


Here is the script:
use strict;
use vars qw{ $table_line };

$table_line = '';
while (<>)
{
        if (/^\s+\/product=(.*)/)
        {
                my $product = $1;
                while (<>)
                {
                        last unless /^\s+\/product=(.*)/;
                        $product = $product.$1;
                }
                $table_line = $table_line.$product."\t";
        }
        if (/^\s+\/protein_id=(.*)/)
        {
                $table_line = $table_line.$1."\t";
        }
        if (/^\s+\/translation=(.*)/)
        {
                my $translation = $1;
                while (<>)
                {
                        last unless /^\s+\        (.*)/;
                        $translation = $translation.$1;
                }
                $table_line = $table_line.$translation."\t";
        }
        print "$table_line\n";
        $table_line = "";
}


Here is the output from the script, which parses the input file and writes the required entries in tab-separated format to an output file:












"DNA polymerase III, beta subunit"      "tigr:AAur_0002"
"MKFRVDRDVLAEAVTWTARSLSPRPPVPVLSGLLLKAEAGTVSLSSFDYETSARLEIPADIAVEGTILVSGRLLADICRSLPSAPVEVETDGSKVTLTCRRSSFHLATMPESEYPALPALPAISGTLPGDAFAQAVSQVIIAASKDDTLPILTGVRMEIEDDLITLLATDRYRLAMREVPWKPVTPGISTSALVKSKTLNEVAKTLGGSGDINLALADDDSRLIGFESGGRTTTSLLVDGDYPKIRSLFPDSTPIHATVQTQELVEAVRRVSLVAERNTPVRLAFTQGLLNLDAGTGEDAQASEELEAQLSGEDITVAFNPHYLVEGLSVIETKYVRFSFTTAPKPAMITAQAEADGEDQDDYRYLVMPVRLPN"






"putative protein of unknown function (DUF721)" "tigr:AAur_0005"
"MAKDSRDGLQPGREPDEIDAAQAALNRMREAAAARGEVRQRAPRPGSAPKRQGLRDTRGFAQFHGSGRDPLGLGKVVGRLVAERGWTSPVAVGSVMAEWETLVGPDISSHCTPESFTDTTLHVRCDSTAWATQLRLLSTSLLEMFRNELGEGVVTSIHVLGPSAPSWRKGGRSVNGRGPRDTYG"


I want the script not to print the blank lines; it should print only the required output in tab-separated format. Any clues on how I can get rid of these blank lines?
Question by:bjuneja_2000
3 Comments
 
LVL 17

Expert Comment

by:mjcoyne
ID: 17878438
Before printing, try:

$table_line =~ s/^\s*$//g;
 

Author Comment

by:bjuneja_2000
ID: 17878659
Hmm,
Actually, I tried that before and it didn't work; not sure why.
Any other clue?
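
A likely reason the substitution has no visible effect: s/^\s*$//g can at most turn $table_line into an empty string, and the script's print "$table_line\n" then still writes the newline. The sketch below illustrates this; row_after_substitution is a hypothetical helper introduced only for the demonstration and is not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns exactly what the script would write for one
# input line if the suggested substitution ran just before the print.
sub row_after_substitution {
    my ($table_line) = @_;
    $table_line =~ s/^\s*$//g;   # empties a whitespace-only string
    return "$table_line\n";      # the unconditional "\n" is still appended
}

# A line that matched none of the patterns leaves $table_line empty,
# yet one character (the newline) is still written.
printf "characters written for an empty row: %d\n",
    length row_after_substitution('');
```

So the blank lines come from the print statement itself, not from whitespace inside $table_line.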
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 17878692
       print "$table_line\n" if $table_line;
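
The guard works because the posted script prints once per input line, including lines that match none of the three patterns and so leave $table_line empty. Below is a minimal, runnable sketch of the same loop with the guard applied. It makes several assumptions for illustration: table_rows is a hypothetical helper, it reads from a list instead of <> so no input file is needed, and the continuation pattern /^\s{9,}(.*)/ is a stand-in that approximates the original's pattern for space-indented lines.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the posted loop, but taking lines as a list.
sub table_rows {
    my @lines = @_;
    my @rows;
    while (defined(my $line = shift @lines)) {
        my $table_line = '';
        if ($line =~ /^\s+\/product=(.*)/) {
            my $product = $1;
            # As in the original, read ahead; the line that ends this loop
            # is the one the checks below get to see.
            while (defined($line = shift @lines)) {
                last unless $line =~ /^\s+\/product=(.*)/;
                $product .= $1;
            }
            $line = '' unless defined $line;
            $table_line .= "$product\t";
        }
        if ($line =~ /^\s+\/protein_id=(.*)/) {
            $table_line .= "$1\t";
        }
        if ($line =~ /^\s+\/translation=(.*)/) {
            my $translation = $1;
            # Join continuation lines (assumed: nine or more leading blanks).
            while (defined($line = shift @lines)) {
                last unless $line =~ /^\s{9,}(.*)/;
                $translation .= $1;
            }
            $table_line .= "$translation\t";
        }
        # The guard from the accepted answer: skip rows that stayed empty.
        push @rows, $table_line if $table_line;
    }
    return @rows;
}

my $pad = ' ' x 21;   # qualifier indentation used in the sample file
my @sample = (
    'gene            5318..5872',
    $pad . '/locus_tag="AAur_0005"',
    '     CDS             5318..5872',
    $pad . '/codon_start=1',
    $pad . '/product="putative protein of unknown function (DUF721)"',
    $pad . '/protein_id="tigr:AAur_0005"',
    $pad . '/translation="MAKDSRDGLQPGREPDEIDAAQAALNRMREAAAARGEVRQRAPR',
    $pad . 'PGSAPKRQGLRDTRGFAQFHGSGRDPLGLGKVVGRLVAERGWTSPVAVGSVMAEWETL',
    $pad . 'VGPDISSHCTPESFTDTTLHVRCDSTAWATQLRLLSTSLLEMFRNELGEGVVTSIHVL',
    $pad . 'GPSAPSWRKGGRSVNGRGPRDTYG"',
);

print "$_\n" for table_rows(@sample);
```

A plain truth test is enough here because every non-empty row ends in a tab; if a row could ever be the single character "0", testing length($table_line) instead would be the safer choice.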