Finding Blank Lines in a Text File

Posted on 2006-07-08
Last Modified: 2008-03-10
Hello,
I'm developing a poetry site in which each poem is stored in its own plain text file, and I am trying to parse those files. Here is an excerpt from a poem (they are all in the same "format"):

A New Dawn
September 27, 2002

Dawn slowly lend your light upon me,
Rain cleanse this mud on my soul.
Allow the river in my mind to crystallize each thought,
Love expose me to my love.

Autumn's first cool breeze sooths the summer's flame,
Rejuvenating my life and brining me back to my soul.
The autumn night freezes time for the next day,
Time, a gentle blanket you request, allow it to keep you warm,
My love.

A friendly reunion to my Angel, help me re-center,
Distant I've become, lost is where I'm found.
I surrender to the mirror so that I may find myself in this pure air,
My soul trusting spirals for they have never failed me.
Pain is a pretense, for I know the future is the light ahead,
My desire.



I've parsed out the poem's title and date, but I want to wrap each stanza (paragraph) in <p> tags. So I really just need to insert <p></p> wherever there is a blank (or whitespace-only) line, but I end up with an HTML file looking like this:

<p>A New Dawn
<br />September 27, 2002
</p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p>


I've tried matching \n\n with a regex (among other things), and using strcmp(), for(), foreach(), etc. You can see this in the comments in the PHP script. Can someone help me here? I need to get each stanza wrapped in its own <p> tags. Thanks!

Here's the PHP script. (I know it is messy, but I'm still developing it.)

<?php

//Written by JKS on July 8th, 2006..
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieves each line; store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has fewer than 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<p>" . $lines[0] . "<br />" . $lines[1] . "</p>");


//count the total amount of lines in the poem without the first two rows.
      //$total_lines_without_heading = ($total_lines - 2);

//find the breaks in the paragraphs (commented out to try using foreach()
//$regs = array();
//for($position = 3; $total_lines_without_heading >= $position; $position++) {
//      print("<br />" . $lines[$position]);
//}
//print("<br />" . $lines[$position++]);


$position = 3;
foreach($lines as $poem_lines) {
      //if (strcmp($poem_lines, '') { print("<p></p>"); }
      //if (!ereg("<br[[:space:]]/>", $poem_lines)) { print("no match"); }
      
      if (!$lines == 0) { print("<p></p>"); }
      else {
            print("<br />" . $lines[$position]);
            if($total_lines == $position) { break; }
            $position++;
      }
}


//closing of the XHTML document
print '</body>' . "\n";
print '</html>';

?>
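
A note on why the loop above prints nothing but empty paragraphs: `!` binds tighter than `==`, so `!$lines == 0` is evaluated as `(!$lines) == 0`. `$lines` is a non-empty array, so `!$lines` is `false`, and the loose comparison `false == 0` is `true`. The `<p></p>` branch therefore runs on every iteration, and the current line is never inspected. A minimal sketch of that behaviour (the sample array is made up for illustration):

```php
<?php
// Hypothetical sample data standing in for the poem's lines.
$lines = array("A New Dawn\n", "September 27, 2002\n", "\n", "Dawn slowly lend your light upon me,\n");

// `!` is applied first: a non-empty array is truthy, so !$lines is false.
$negated = !$lines;

// Loose comparison of false with 0 is true, whatever the current line holds.
$always_true = ($negated == 0);

// The condition from the script never looks at $poem_line,
// so it fires once per line and cannot detect blank lines.
$blank_hits = 0;
foreach ($lines as $poem_line) {
    if (!$lines == 0) {
        $blank_hits++;
    }
}
```

Here `$blank_hits` ends up equal to `count($lines)`, which matches the run of `<p></p>` pairs in the output shown in the question.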
Question by:damijim

Accepted Solution

by:
TeRReF earned 400 total points
ID: 17067211
Try this:

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines) === '') { // default trim() strips spaces, tabs, newlines, carriage returns, NUL bytes and vertical tabs
        print("<p></p>");
    } else {
          print("<br />" . $poem_lines);
          if($total_lines == $position) { break; }
          $position++;
    }
}
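
The blank-line test can also be pulled out into a small helper, which makes it easier to check on its own. This is only a sketch: the function name `is_blank_line` is made up here and is not part of the original script. It relies on `trim()` with its default character list, and casts the argument so that the `false` that `fgets()` returns at end of file counts as blank.

```php
<?php
// Hypothetical helper: true when a line holds nothing but whitespace.
function is_blank_line($line)
{
    // fgets() returns false at end of file; the cast turns that into ''.
    // trim() with no second argument strips spaces, tabs, newlines,
    // carriage returns, NUL bytes and vertical tabs.
    return trim((string) $line) === '';
}
```

With this in place the loop condition would read `if (is_blank_line($poem_lines))`. A line that contains only a backslash is correctly treated as text, which is not the case when a backslash ends up in a custom character list passed to `trim()`.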

Author Comment

by:damijim
ID: 17067427
awesome, thanks!

Expert Comment

by:TeRReF
ID: 17067434
You're welcome :)

Author Comment

by:damijim
ID: 17067444
ack, actually, why is it still outputting a line break before the </p><p>? See below..

<br /></p><p>Dawn slowly lend your light upon me,
<br />Rain cleanse this mud on my soul.
<br />Allow the river in my mind to crystallize each thought,
<br />Love expose me to my love.
<br /></p><p>Autumn's first cool breeze sooths the summer's flame,
<br />Rejuvenating my life and brining me back to my soul.
<br />The autumn night freezes time for the next day,
<br />Time, a gentle blanket you request, allow it to keep you warm,
<br />My love.
<br /></p><p>A friendly reunion to my Angel, help me re-center,

Author Comment

by:damijim
ID: 17067447
okay, nevermind.. it's the linebreak from the line above it. hrm, so I'll need to always check the next line to make sure it isn't blank before writing a <br />....

I changed this line to: print($poem_lines . "<br />\n");
and got this output:

The autumn night freezes time for the next day,
<br />
Time, a gentle blanket you request, allow it to keep you warm,
<br />
My love.
<br />
</p><p>A friendly reunion to my Angel, help me re-center,
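
One way to avoid looking ahead at the next line is to collect the lines of each stanza first and only then join them, so that `<br />` appears between lines and never after the last one. The sketch below assumes that approach; `group_stanzas` and `stanzas_to_html` are hypothetical names, not functions from the script above.

```php
<?php
// Hypothetical helper: group poem lines into stanzas (arrays of lines),
// treating any whitespace-only line as a separator.
function group_stanzas(array $lines)
{
    $stanzas = array();
    $current = array();
    foreach ($lines as $line) {
        // The cast covers the false that fgets() returns at end of file.
        $text = trim((string) $line);
        if ($text === '') {
            // Separator: store the stanza collected so far, if any.
            if ($current) {
                $stanzas[] = $current;
            }
            $current = array();
        } else {
            $current[] = $text;
        }
    }
    // The last stanza may not be followed by a blank line.
    if ($current) {
        $stanzas[] = $current;
    }
    return $stanzas;
}

// Hypothetical helper: one <p> per stanza, <br /> only between lines.
function stanzas_to_html(array $stanzas)
{
    $html = '';
    foreach ($stanzas as $stanza) {
        $html .= "<p>" . implode("<br />\n", $stanza) . "</p>\n";
    }
    return $html;
}
```

Because consecutive blank lines never produce an empty stanza, the trailing blank lines at the end of the poem file do not generate empty `<p>` elements.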

Expert Comment

by:TeRReF
ID: 17067456
Can you print your current code? I'll clean it up and make it work...

Author Comment

by:damijim
ID: 17067459
//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines) === '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>

Author Comment

by:damijim
ID: 17067470
err... sorry,

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieve each line (fgets reads at most 6999 bytes per call); store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has fewer than 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines) === '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>

Expert Comment

by:TeRReF
ID: 17067498
I think this should do it...

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
     $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
     if (!file_exists($poem)) {
          print 'This poem cannot be found or no longer exists. (error: 01)';
          die;
     }


//open the poem and retrieve each line (fgets reads at most 6999 bytes per call); store in the array $lines
     $opened_poem = fopen ($poem, 'r');
     while (!feof ($opened_poem)) {
          $buffer = fgets($opened_poem, 7000);
          $lines[] = $buffer;
     }
     fclose ($opened_poem); ///closes poem text file after read by system.


     $total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has fewer than 3 lines of text, return an error.
     if ($total_lines < 3) {
          print 'This poem cannot be found or no longer exists. (error: 02)';
          die;
     }


//write the head of the XHTML document
     print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
     print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
     print '<head>' . "\n";
     print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
     print '<title>' . $lines[0] . '</title>' . "\n";
     print '</head>' . "\n";
     print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting, then open the <p> for the first stanza
     print("<h1>" . array_shift($lines) . "</h1>\n<span>" . array_shift($lines) . "</span>\n<p>\n");
     // remove empty line under date line
     array_shift($lines);

     // now we only have the lines that contain the actual poem
     $firstline = true;
     foreach($lines as $poem_lines) {
          if (trim($poem_lines) === '') { // default trim() also covers spaces, tabs, \r, \n and NUL
               print("</p>\n<p>\n");
               $firstline = true;
          } else {
               $br = ($firstline) ? '' : "<br />\n";
               print($br . $poem_lines);
               $firstline = false;
          }
     }

//closing of the XHTML document
     print "\n" . "</p> \n" . '</body>' . "\n";
     print '</html>';

?>
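
For comparison, here is a sketch of the regex route mentioned in the question: split the poem body on blank lines in one step instead of tracking state line by line. The function name `poem_body_to_html` is hypothetical, and the sketch assumes the title, date and following blank line have already been removed and that the text is ASCII or UTF-8 (the default charset `htmlspecialchars()` expects in current PHP versions).

```php
<?php
// Hypothetical helper: convert the body of a poem (everything after the
// title and date) into <p> blocks by splitting on blank lines.
function poem_body_to_html($text)
{
    // \R matches \n, \r\n or \r. A separator is a line break followed by
    // one or more lines that contain only spaces or tabs.
    $stanzas = preg_split('/\R(?:[ \t]*\R)+/', trim($text));

    $html = '';
    foreach ($stanzas as $stanza) {
        if ($stanza === '') {
            continue; // empty input yields a single empty piece
        }
        // Trim and escape each line so characters like & or < are safe in HTML.
        $lines = array_map(function ($line) {
            return htmlspecialchars(trim($line));
        }, preg_split('/\R/', $stanza));
        $html .= "<p>" . implode("<br />\n", $lines) . "</p>\n";
    }
    return $html;
}
```

Runs of several blank lines, whitespace-only lines and Windows line endings all collapse into a single stanza break, so no empty `<p>` elements are produced at the end of the file.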

Author Comment

by:damijim
ID: 17067541
Thanks! It's working! :)

If you don't mind me asking, why does PHP insist on putting some HTML tags on new lines? For example, the statement below...

print("<h1>" . array_shift($lines) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");

outputs to the browser as...

<h1>A New Dawn
</h1>

But the "\n" in the PHP appears after the </h1>... there isn't any code telling it to put </h1> on a new line... it doesn't make sense to me. Is there a way to control this?

Thanks for your time!

Expert Comment

by:TeRReF
ID: 17067555
It's probably the newline that's in the original text file.
If you really want to get rid of them, you might want to try:
print("<h1>" . str_replace(array("\n", "\r"), '', array_shift($lines)) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");
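
To illustrate the point: `fgets()` returns each line together with its line ending, so the newline is part of the string that gets printed inside the element. A small sketch with a made-up sample line, using `rtrim()` with an explicit character list as an alternative to `str_replace()`:

```php
<?php
// A line as fgets() would return it: the text plus its line ending.
$title_line = "A New Dawn\n";

// Printed as-is, the newline lands inside the element, before </h1>.
$raw = "<h1>" . $title_line . "</h1>";

// rtrim() with an explicit character list removes only trailing \r and \n.
$clean = "<h1>" . rtrim($title_line, "\r\n") . "</h1>";
```

Reading the file with `file($poem, FILE_IGNORE_NEW_LINES)` instead of the `fgets()` loop would be another option, since that flag drops the line ending from every element of the returned array.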

Author Comment

by:damijim
ID: 17067562
okay, makes sense. thanks :)

Expert Comment

by:TeRReF
ID: 17067566
You're welcome.