Solved

Finding Blank Lines in a Text File

Posted on 2006-07-08
Last Modified: 2008-03-10
Hello,
I'm developing a poetry site in which each poem is stored in its own plain text file, and I am trying to parse these files. Here is an excerpt from a poem (they are all in the same "format"):

A New Dawn
September 27, 2002

Dawn slowly lend your light upon me,
Rain cleanse this mud on my soul.
Allow the river in my mind to crystallize each thought,
Love expose me to my love.

Autumn's first cool breeze sooths the summer's flame,
Rejuvenating my life and brining me back to my soul.
The autumn night freezes time for the next day,
Time, a gentle blanket you request, allow it to keep you warm,
My love.

A friendly reunion to my Angel, help me re-center,
Distant I've become, lost is where I'm found.
I surrender to the mirror so that I may find myself in this pure air,
My soul trusting spirals for they have never failed me.
Pain is a pretense, for I know the future is the light ahead,
My desire.



and I've parsed out the poem's title and date, but I want to wrap each stanza (paragraph) in <p> tags. So I really just need to insert <p></p> wherever there is a blank (or whitespace-only) line, but I end up with an HTML file looking like this:

<p>A New Dawn
<br />September 27, 2002
</p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p>


I've tried matching \n\n with a regex (among other things), as well as strcmp(), for(), foreach(), etc. You can see this in the comments in the PHP script. Can someone help me here? I need to get each stanza wrapped in its own <p> tags. Thanks!

Here's the PHP script below. (I know it is messy, but I'm in the process of developing it still.) Thanks!

<?php

//Written by JKS on July 8th, 2006..
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieves each line; store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has less than or equal to 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<p>" . $lines[0] . "<br />" . $lines[1] . "</p>");


//count the total amount of lines in the poem without the first two rows.
      //$total_lines_without_heading = ($total_lines - 2);

//find the breaks in the paragraphs (commented out to try using foreach()
//$regs = array();
//for($position = 3; $total_lines_without_heading >= $position; $position++) {
//      print("<br />" . $lines[$position]);
//}
//print("<br />" . $lines[$position++]);


$position = 3;
foreach($lines as $poem_lines) {
      //if (strcmp($poem_lines, '') { print("<p></p>"); }
      //if (!ereg("<br[[:space:]]/>", $poem_lines)) { print("no match"); }
      
      if (!$lines == 0) { print("<p></p>"); }
      else {
            print("<br />" . $lines[$position]);
            if($total_lines == $position) { break; }
            $position++;
      }
}


//closing of the XHTML document
print '</body>' . "\n";
print '</html>';

?>
Question by:damijim
13 Comments
 
LVL 29

Accepted Solution

by:
TeRReF earned 400 total points
ID: 17067211
Try this:

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines, " \t\n\r\0") == '') {
        print("<p></p>");
    } else {
          print("<br />" . $poem_lines);
          if($total_lines == $position) { break; }
          $position++;
    }
}
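
trim() with no second argument already strips spaces, tabs, newlines, carriage returns, NUL bytes and vertical tabs, so the blank-line test can also be written without a character list. A minimal sketch, assuming a hypothetical helper named is_blank_line() that is not part of the thread:

```php
<?php
// Hypothetical helper (not from the thread): true when a line holds only whitespace.
function is_blank_line($line)
{
    // trim() with its default character list removes " ", \t, \n, \r, NUL and vertical tab.
    return trim($line) === '';
}

// Made-up sample lines: one with text, three that count as blank.
$samples = array("My love.\n", "\n", "   \t\r\n", "");
foreach ($samples as $sample) {
    print is_blank_line($sample) ? "blank\n" : "text\n";
}
```

Using === avoids PHP's loose comparison rules, so only a truly empty string counts as blank.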
 
LVL 1

Author Comment

by:damijim
ID: 17067427
awesome, thanks!
 
LVL 29

Expert Comment

by:TeRReF
ID: 17067434
You're welcome :)
 
LVL 1

Author Comment

by:damijim
ID: 17067444
ack, actually, why is it still outputting a line break before the </p><p>? See below..

<br /></p><p>Dawn slowly lend your light upon me,
<br />Rain cleanse this mud on my soul.
<br />Allow the river in my mind to crystallize each thought,
<br />Love expose me to my love.
<br /></p><p>Autumn's first cool breeze sooths the summer's flame,
<br />Rejuvenating my life and brining me back to my soul.
<br />The autumn night freezes time for the next day,
<br />Time, a gentle blanket you request, allow it to keep you warm,
<br />My love.
<br /></p><p>A friendly reunion to my Angel, help me re-center,
 
LVL 1

Author Comment

by:damijim
ID: 17067447
okay, nevermind.. it's the linebreak from the line above it. hrm, so I'll need to always check the next line to make sure it isn't blank before writing a <br />....

I changed this line-> print($poem_lines . "<br />\n");
and got this output->

The autumn night freezes time for the next day,
<br />
Time, a gentle blanket you request, allow it to keep you warm,
<br />
My love.
<br />
</p><p>A friendly reunion to my Angel, help me re-center,
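
One way to avoid checking the next line is to collect the lines of each stanza first and join them afterwards, so that <br /> only ever lands between two lines. A rough sketch, assuming the title, the date and the blank line under them have already been removed from the array; stanzas_to_html() is a made-up name, not something from the script:

```php
<?php
// Hypothetical helper: turn the poem's body lines into one <p> block per stanza.
function stanzas_to_html(array $lines)
{
    $stanzas = array();
    $current = array();
    foreach ($lines as $line) {
        if (trim($line) === '') {
            // A blank line closes the stanza collected so far (if any),
            // so several blank lines in a row do not produce empty paragraphs.
            if (count($current) > 0) {
                $stanzas[] = $current;
                $current = array();
            }
        } else {
            $current[] = rtrim($line); // drop the trailing newline, keep indentation
        }
    }
    if (count($current) > 0) {
        $stanzas[] = $current; // last stanza when the file does not end with a blank line
    }

    $html = '';
    foreach ($stanzas as $stanza) {
        // implode() puts <br /> between lines only, never after the last one.
        $html .= "<p>" . implode("<br />\n", $stanza) . "</p>\n";
    }
    return $html;
}

// Made-up sample input in the same shape fgets() would produce.
print stanzas_to_html(array("My love.\n", "\n", "\n", "A friendly reunion,\n", "Distant I have become.\n"));
```

Because every stanza is opened and closed in one place, the output cannot end up with an unmatched </p>.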
 
LVL 29

Expert Comment

by:TeRReF
ID: 17067456
Can you print your current code? I'll clean it up and make it work...
 
LVL 1

Author Comment

by:damijim
ID: 17067459
//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines, " \t\n\r\0") == '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>
 
LVL 1

Author Comment

by:damijim
ID: 17067470
err... sorry,

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieves each line (up to two megabyte max); store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has less than or equal to 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines, " \t\n\r\0") == '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>
 
LVL 29

Expert Comment

by:TeRReF
ID: 17067498
I think this should do it...

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
     $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
     if (!file_exists($poem)) {
          print 'This poem cannot be found or no longer exists. (error: 01)';
          die;
     }


//open the poem and read it line by line (fgets reads at most 6999 bytes per call); store in the array $lines
     $lines = array();
     $opened_poem = fopen ($poem, 'r');
     while (($buffer = fgets($opened_poem, 7000)) !== false) {
          $lines[] = $buffer; // fgets returns false at end of file, so no empty entry is added
     }
     fclose ($opened_poem); //closes poem text file after read by system.


     $total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has fewer than 3 lines of text, return an error.
     if ($total_lines < 3) {
          print 'This poem cannot be found or no longer exists. (error: 02)';
          die;
     }


//write the head of the XHTML document
     print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
     print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
     print '<head>' . "\n";
     print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
     print '<title>' . $lines[0] . '</title>' . "\n";
     print '</head>' . "\n";
     print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
     // open the first stanza's <p> here; it is closed at the first blank line or at the end
     print("<h1>" . array_shift($lines) . "</h1>\n<span>" . array_shift($lines) . "</span>\n<p>\n");
     // remove empty line under date line
     array_shift($lines);

     // now we only have the lines that contain the actual poem
     $firstline = true;
     foreach($lines as $poem_lines) {
          if (trim($poem_lines, " \t\n\r\0") == '') {
               print("</p>\n<p>\n");
               $firstline = true;
          } else {
               $br = ($firstline) ? '' : "<br />\n";
               print($br . $poem_lines);
               $firstline = false;
          }
     }

//closing of the XHTML document
     print "\n" . "</p> \n" . '</body>' . "\n";
     print '</html>';

?>
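
For comparison, the regex idea from the question can also work if the whole file is handled as one string and split on runs of blank lines. A sketch under those assumptions, not a tested replacement for the script above; split_stanzas() is a hypothetical helper and the poem text is shortened sample data:

```php
<?php
// Hypothetical helper (not from the thread): split poem text into blocks
// separated by one or more blank or whitespace-only lines.
function split_stanzas($text)
{
    // Normalise Windows (\r\n) and old Mac (\r) line endings to \n first.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);
    // A break is a newline followed by one or more lines holding only spaces or tabs.
    return preg_split('/\n(?:[ \t]*\n)+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Shortened sample data standing in for file_get_contents("A_New_Dawn.txt").
$text = "A New Dawn\nSeptember 27, 2002\n\nDawn slowly lend your light upon me,\n"
      . "Rain cleanse this mud on my soul.\n \t\nMy love.\n\n\n";

$blocks  = split_stanzas($text);
$heading = explode("\n", array_shift($blocks)); // first block: title, then date
print "<h1>" . $heading[0] . "</h1>\n<span>" . $heading[1] . "</span>\n";
foreach ($blocks as $stanza) {
    // nl2br() inserts <br /> before each newline inside the stanza.
    print "<p>" . nl2br($stanza) . "</p>\n";
}
```

A plain \n\n pattern misses lines that contain only spaces or tabs and files saved with \r\n endings, which may be why the earlier regex attempts found nothing.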
 
LVL 1

Author Comment

by:damijim
ID: 17067541
Thanks! It's working! :)

If you don't mind me asking, why does PHP insist on putting some HTML tags on new lines? For example, the statement below...

print("<h1>" . array_shift($lines) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");

outputs to the browser as...

<h1>A New Dawn
</h1>

But the "\n" in the PHP appears after the </h1>... there isn't any code telling it to put </h1> on a new line... it doesn't make sense to me. Is there a way to control this?

Thanks for your time!
 
LVL 29

Expert Comment

by:TeRReF
ID: 17067555
It's probably the newline that's in the original text file.
If you really want to get rid of them, you might want to try:
print("<h1>" . str_replace(array("\n", "\r"), '', array_shift($lines)) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");
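
If only the line ending should go, rtrim() with an explicit character list removes trailing \r and \n and leaves the rest of the line alone. A small sketch; strip_line_ending() is a hypothetical name and the title string is sample data:

```php
<?php
// Hypothetical helper: drop only the trailing line ending, keep everything else.
function strip_line_ending($line)
{
    return rtrim($line, "\r\n");
}

$title = "A New Dawn\r\n"; // as fgets() would return it from a Windows-style file
print "<h1>" . strip_line_ending($title) . "</h1>\n"; // prints <h1>A New Dawn</h1>
```

Unlike str_replace(), this touches only the end of the string, which is where fgets() leaves the newline.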
 
LVL 1

Author Comment

by:damijim
ID: 17067562
okay, makes sense. thanks :)
 
LVL 29

Expert Comment

by:TeRReF
ID: 17067566
You're welcome.