Solved

help for a beginner!

Posted on 1997-12-08
Last Modified: 2008-03-03
I am a beginner at Perl and know only a little about programming. I tried to write a Perl program that cuts each paragraph into lines holding as many words as
possible, and the last line of each paragraph should be single-spaced.


XXXXXXXXXXXXX.XXXXXXXXXX,XXXXXXXXXXX. XXXXXXXX.
XXXXXXXXXXXXX.XXXXXXXXXX,XXXXXXXXXXX. XXXXXXXX.


XXXXXXX. YYYYYY. YYYY,YYY. YYY,YYY. YYYY.  ZZZZ.ZZZZ.
XXXX,XXXX.
    ZZZZZZ.ZZZZZZZZZZZZZZZZZZZZZZZZZZ.ZZZZZZZZZZZZZ


And I want the output written to another file, like this:
XXXXXXXXXXXXX.XXXXXXXXXX,XXXXXXXXXXX. XXXXXXXX.
XXXXXXXXXXXXX.XXXXXXXXXX,XXXXXXXXXXX.

XXXXXXX. YYYYYY. YYYY,YYY. YYY,YYY. YYYY.  ZZZZ.ZZZZ.XXXX,XXXX.

ZZZZZZ.ZZZZZZZZZZZZZZZZZZZZZZZZZZ.ZZZZZZZZZZZZZ


Here is the code I tried in my program; however, it doesn't work at all. I tried to look for some Perl books for help, but I just don't know which one is good for me. If you know which Perl book is good for a beginner, please let me know.
--------------

#!/usr/bin/perl

 print "Input a file name:";
 $infilename=<STDIN>;
 chop ($infilename);
 open(IN, $infilename);

 print "Type output file name:";
 $outfilename=<STDIN>;
 chop ($outfilename);
  open(OUT, ">$outfilename");
  while(<IN>) {
 tr/\r//;  #remove all the "\r" from the input file.
 s/\n\s*\n////g; #replace empty lines with Tags
        s/\n+//g;  # remove newlines
        s/\t+////g;  # replace multiple tabs with nothing
        s/\n\n\t/\t/;  # remove new leading newline artifacts
        s/$/\n/;  # add newline at end

     print OUT $_;
  }
 close (IN);
 close (OUT);
Question by:waikap
 
LVL 1

Expert Comment

by:malec
ID: 1209098
I don't understand what you are trying to do. Why would you remove the end-of-line character (it is always at the end of the line) and then add it back to each line? Also, the lines s/\n\s*\n////g; and s/\t+////g; will produce syntax errors in Perl 5.
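
To illustrate the points above, here is a minimal sketch of how those operators could be written so that they parse. The sample string and the variable names are made up for illustration; this shows the operator syntax only, not a full solution to the wrapping problem.

```perl
use strict;
use warnings;

# Hypothetical sample input: CRLF line endings, a blank line, and tabs.
my $text = "first line\r\n\r\n\tsecond\t\tline\r\n";

# tr/\r// without the /d flag only counts matches; /d actually deletes them.
(my $clean = $text) =~ tr/\r//d;

# s/PATTERN/REPLACEMENT/g takes exactly three delimiters.
# An empty replacement is written as //, not ////.
$clean =~ s/\t+//g;          # drop runs of tabs
$clean =~ s/\n\s*\n/\n\n/g;  # reduce a blank-line gap to one empty line

print $clean;
```

Note that inside `while (<IN>)` with the default record separator, `$_` holds one line at a time, so a pattern such as `\n\s*\n` that needs two newlines can never match there. It only makes sense on a string that holds several lines, as in this sketch.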
0
 
LVL 84

Expert Comment

by:ozo
ID: 1209099
use Text::Wrap;
0
 
LVL 84

Expert Comment

by:ozo
ID: 1209100
use Text::Wrap qw(wrap $columns);
$columns = 48;
$/="";
while( <> ){
 s/\s+/ /g;
 print wrap("","",$_),"\n\n";
}
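
The snippet above relies on "paragraph mode": setting `$/` to the empty string makes each read return a whole blank-line-delimited paragraph instead of a single line. Below is a small self-contained sketch of the same idea, reading from an in-memory string so it can be run without an input file. The sample text, the width of 20, and the variable names are assumptions made for the demo.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

$columns = 20;   # assumed width for the demo; the snippet above uses 48

# Hypothetical input: two paragraphs separated by a blank line.
my $input = "one two three\nfour five six\n\nseven eight\nnine\n";

my @paragraphs;
{
    local $/ = "";   # paragraph mode: blank lines end a record
    open my $in, '<', \$input or die "cannot open in-memory file: $!";
    while (my $para = <$in>) {
        $para =~ s/\s+/ /g;      # fold newlines and runs of spaces
        $para =~ s/^ | $//g;     # trim the leftover edge spaces
        push @paragraphs, wrap("", "", $para);
    }
    close $in;
}

# One blank line between the re-wrapped paragraphs.
print join("\n\n", @paragraphs), "\n";
```

Each element of `@paragraphs` is one re-wrapped paragraph, so the lines that belonged together in the input are joined before being cut again at the chosen width.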
0
 
LVL 1

Expert Comment

by:ramsay
ID: 1209101
A good book is:
"Programming Perl", O'Reilly & Associates, Inc.

0
 
LVL 2

Accepted Solution

by:
haystor earned 200 total points
ID: 1209102
try something like this:

#!/usr/bin/perl

$COLUMNS = 80;  # number of columns to wrap at

print "input a file name: ";
$infilename = <STDIN>;
chomp $infilename;    # use chomp instead of chop
print "Type output file name: ";
$outfilename = <STDIN>;
chomp $outfilename;

open (IN, "$infilename") or die("no such file\n");
while (<IN>) {
    chomp;
    $line = $_;
    if ($line =~ /^\n\s*\n$/) {  # this would be a blank
       push (@newfile, "\n");           # line indicating para
       next;                                       # graph boundary
     }
    @words = split(/\s/, $line);
    $newline = "";
    foreach $i (@words) {
        if (length($newline) + length($i) < $COLUMNS) {
            $newline += " $i";  # note the space
        } else {
            $newline += "\n";
             push (@newfile, $newline);
             $newline = $i;
         }
         push(@newfile, "$newline\n");
     }
}

open (OUT, ">$outfilename") or die("cannot write to $outfilename");
foreach $i (@newfile) {
    print OUT "$i";
}
close IN;
close OUT;

This should get you considerably closer. Note how the =~ operator is used with pattern matching, as well as how I have used chomp and the foreach construct.
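
As a small illustration of the chomp and =~ points, here is a hedged sketch with made-up strings. It is not part of the program above; it only shows how chomp differs from chop and how =~ binds a pattern to a variable.

```perl
use strict;
use warnings;

my $with_newline    = "report.txt\n";
my $without_newline = "report.txt";

# chomp removes a trailing newline only if one is present.
chomp(my $chomped_nl    = $with_newline);
chomp(my $chomped_plain = $without_newline);

# chop removes the last character, whatever it is.
my $chopped_nl    = $with_newline;    chop $chopped_nl;
my $chopped_plain = $without_newline; chop $chopped_plain;

print "chomp: '$chomped_nl' '$chomped_plain'\n";
print "chop:  '$chopped_nl' '$chopped_plain'\n";

# =~ applies the pattern to the variable on its left instead of $_.
my $line = "   ";
my $is_blank = ($line =~ /^\s*$/) ? 1 : 0;
print "blank line? $is_blank\n";
```

With input typed at a prompt the two behave the same, but chop silently eats a real character whenever the newline is missing, which is why chomp is the safer habit.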

My diagnosis of your code is that you are learning from a "21 days" book. Hey, it will get you started (that's how I started), but it won't get you all the way there. You have enough of an idea now to move on to "Learning Perl" by Randal Schwartz. "Programming Perl" by Larry Wall should be used as a reference. These two are the definitive books on Perl, and you should not require any others for a while. Might I also suggest that you read the newsgroup comp.lang.perl.misc just to get an idea of the flow and style of Perl.

I have tried to answer the question for a person who is trying to learn to program. If you look at the differences between how I have done it and how you tried, you should look up in one of the two books above why it is done that way.

Good Luck

0
 
LVL 84

Expert Comment

by:ozo
ID: 1209103
haystor, I think you typed += where you meant to say .=
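
A minimal sketch of the difference, using made-up variable names: `.=` appends to a string, while `+=` converts both sides to numbers first, so adding non-numeric words leaves 0 in the variable.

```perl
use strict;
use warnings;

# .= is string concatenation.
my $joined = "";
$joined .= "hello";
$joined .= " world";

# += is numeric addition: non-numeric strings count as 0.
my $summed = "";
{
    no warnings 'numeric';   # silence the "isn't numeric" warnings for the demo
    $summed += " hello";
    $summed += " world";
}

print "joined: $joined\n";
print "summed: $summed\n";
```

This is consistent with a program that uses `+=` to build its lines printing 0 instead of the text.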
0
 
LVL 2

Expert Comment

by:haystor
ID: 1209104
Thank you, Ozo,

I'm sitting here studying Java as I am trying to program Perl.
0
 
LVL 2

Expert Comment

by:haystor
ID: 1209105
Allow me to try again, this time cutting and pasting instead of retyping:

I have written this one assuming that each line should be turned into a paragraph. I believe the variables can be followed easily. Just put this subroutine into a program and send it the columns, input file, and output file. It should do the rest, as well as echo the result back to STDOUT. You can comment that line out if needed.

If you meant that every paragraph should be defined by a blank line, just send a comment, and I will show you where to put that. Have fun.

#!/usr/local/bin/perl


$COLUMNS = 30;
$infile = "text.txt";
$outfile = "out.txt";

WrapIt($COLUMNS, $infile, $outfile);  # once you have the above three values
                                      # just put the WrapIt line in.

sub WrapIt {
    ($COLUMNS, $infile, $outfile) = @_;
    open (IN, "$infile") or print "cannot open filename\n";
    @lines = <IN>;
    close IN;
    chomp @lines;
   
    @newlines = ();
    foreach $i (@lines) {
      @words = split(/\s+/, $i);
      if ($i eq "") {
          push(@newlines, "");
          next;
      }
      foreach $j (@words) {
          if ($COLUMNS < (length($currentline) + 1 + length($j))) {
            push (@newlines, $currentline);
            $currentline = "$j";
          } else {
            if ($currentline) {
                $currentline .= " $j";
            } else {
                $currentline = "$j";
            }
          }
      }
      push(@newlines, $currentline);
      $currentline = "";
    }
   
   
    open (OUT, ">$outfile") or print "cannot write to output file\n";
    foreach $i (@newlines) {
      print OUT "$i\n";
      print "$i\n";  # echoes to STDOUT
    }
}
0
 

Author Comment

by:waikap
ID: 1209106
Hi haystor,

Thank you for your effort to help me with this problem. I tried both of your programs. The first one does not seem to work very well: it just gives me 0 on every line. The second one works; however, it gives something like the following. I don't know if that is a problem with the Perl I am using or something else (I am using Win32 Perl for Windows 95). I want to cut up each paragraph into lines that have as many words on each line as possible. I use an entirely blank line to mark a new paragraph. Maybe it will be much clearer if I give an example.

Test file:
1111111111111111111111111

22222222222222222222222222222222222222
2222222222222

3333333333333333333333333
3333333333333333333333333

444444444
44

5555555555555555555555555

6666666666666666666666666

The output of your second program:
1111111111111111111111111     <-- right


22222222222222222222222222222222222222
2222222222222                 <-- doesn't move up to make it one line

3333333333333333333333333
3333333333333333333333333   <-- same problem

444444444
44                          <-- same

5555555555555555555555555

6666666666666666666666666
 
I want to have:


111111111111111111111111


222222222222222222222222222222222222222222222222222


33333333333333333333333333333333333333333333333333

44444444444

5555555555555555555555555

6666666666666666666666666

Thank you for your two programs. They give me quite a lot of ideas about how to do it. I am looking at the code and trying to understand the meaning of that weird stuff, like s/\t+////, bit by bit. My problem is that I don't have any Perl reference (I ordered my books and they are coming soon), so I need to guess those things one by one. Anyway, I am starting to know much more about Perl and to appreciate its beauty. Thank you for your time and effort in helping me with this problem. I want to give you more points; however, that is all I have. Your time and patience deserve much more than the points I can give you!

Joe
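
For reference, the blank-line-delimited behaviour described in this comment could be sketched roughly as follows. This is not code from the thread: the subroutine names `wrap_words` and `wrap_text` are hypothetical, and the sketch assumes that lines of one paragraph are joined with a single space and that the input lines are already in an array.

```perl
use strict;
use warnings;

# Greedy wrap of the words of one paragraph; returns the wrapped lines.
sub wrap_words {
    my ($columns, @words) = @_;
    my @lines;
    my $current = "";
    for my $word (@words) {
        if ($current eq "") {
            $current = $word;
        } elsif (length($current) + 1 + length($word) <= $columns) {
            $current .= " $word";
        } else {
            push @lines, $current;
            $current = $word;
        }
    }
    push @lines, $current if $current ne "";
    return @lines;
}

# Treat blank lines as paragraph boundaries and join everything in between.
sub wrap_text {
    my ($columns, @input_lines) = @_;
    my (@output, @words);
    for my $line (@input_lines, "") {   # the trailing "" flushes the last paragraph
        if ($line =~ /^\s*$/) {
            if (@words) {
                push @output, "" if @output;   # one blank line between paragraphs
                push @output, wrap_words($columns, @words);
                @words = ();
            }
            next;
        }
        push @words, split ' ', $line;
    }
    return @output;
}

# Part of the test file from the comment above.
my @sample = ("444444444", "44", "", "5555555555555555555555555");
print "$_\n" for wrap_text(80, @sample);
```

The key difference from wrapping each input line separately is that words are collected across lines and only flushed when a blank line (or the end of the input) is reached.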
0
