Solved

Finding Blank Lines in a Text File

Posted on 2006-07-08
Last Modified: 2008-03-10
Hello,
I'm developing a poetry site in which each poem is stored in its own plain text file, and I am trying to parse those files. Here is an excerpt from a poem (they are all in the same "format"):

A New Dawn
September 27, 2002

Dawn slowly lend your light upon me,
Rain cleanse this mud on my soul.
Allow the river in my mind to crystallize each thought,
Love expose me to my love.

Autumn's first cool breeze sooths the summer's flame,
Rejuvenating my life and brining me back to my soul.
The autumn night freezes time for the next day,
Time, a gentle blanket you request, allow it to keep you warm,
My love.

A friendly reunion to my Angel, help me re-center,
Distant I've become, lost is where I'm found.
I surrender to the mirror so that I may find myself in this pure air,
My soul trusting spirals for they have never failed me.
Pain is a pretense, for I know the future is the light ahead,
My desire.



I've parsed out the poem's title and date, but I want to wrap each stanza (paragraph) in <p> tags. So I really just need to close one paragraph and open the next (</p><p>) wherever there is a blank (or whitespace-only) line, but I end up with an HTML file looking like this:

<p>A New Dawn
<br />September 27, 2002
</p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p><p></p>


I've tried matching \n\n with a regex (among other things), and using strcmp, for(), foreach(), etc. You can see this in my comments in the PHP script. Can someone help me here? I need to get each stanza wrapped in its own <p> tags, thanks!

Here's the PHP script below. (I know it is messy, but I'm in the process of developing it still.) Thanks!

<?php

//Written by JKS on July 8th, 2006..
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieves each line; store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has less than or equal to 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<p>" . $lines[0] . "<br />" . $lines[1] . "</p>");


//count the total amount of lines in the poem without the first two rows.
      //$total_lines_without_heading = ($total_lines - 2);

//find the breaks in the paragraphs (commented out to try using foreach()
//$regs = array();
//for($position = 3; $total_lines_without_heading >= $position; $position++) {
//      print("<br />" . $lines[$position]);
//}
//print("<br />" . $lines[$position++]);


$position = 3;
foreach($lines as $poem_lines) {
      //if (strcmp($poem_lines, '') { print("<p></p>"); }
      //if (!ereg("<br[[:space:]]/>", $poem_lines)) { print("no match"); }
      
      if (!$lines == 0) { print("<p></p>"); }
      else {
            print("<br />" . $lines[$position]);
            if($total_lines == $position) { break; }
            $position++;
      }
}


//closing of the XHTML document
print '</body>' . "\n";
print '</html>';

?>
Question by:damijim

Accepted Solution

by:TeRReF
Try this:

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines) == '') { // trim() strips spaces, tabs, newlines, carriage returns and NUL bytes by default
        print("<p></p>");
    } else {
          print("<br />" . $poem_lines);
          if($total_lines == $position) { break; }
          $position++;
    }
}
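
The blank-line test can also be looked at in isolation. Below is a minimal sketch, assuming PHP 7 or later for the type declarations; `is_blank_line` is a hypothetical helper name, not something from the thread. It relies on the fact that trim() without a second argument strips spaces, tabs, newlines, carriage returns, NUL bytes and vertical tabs from both ends of the string.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the thread):
// returns true when a line is empty or contains only whitespace.
function is_blank_line(string $line): bool
{
    // With no second argument, trim() strips " \t\n\r\0\x0B" from both ends.
    return trim($line) === '';
}

var_dump(is_blank_line("\n"));          // bool(true)
var_dump(is_blank_line("  \t\r\n"));    // bool(true)
var_dump(is_blank_line("My love.\n"));  // bool(false)
```

Comparing with === avoids any loose type juggling; the only value that passes is a genuinely empty string.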

Author Comment

by:damijim
awesome, thanks!

Expert Comment

by:TeRReF
You're welcome :)

Author Comment

by:damijim
ack, actually, why is it still outputting a line break before the </p><p>? See below..

<br /></p><p>Dawn slowly lend your light upon me,
<br />Rain cleanse this mud on my soul.
<br />Allow the river in my mind to crystallize each thought,
<br />Love expose me to my love.
<br /></p><p>Autumn's first cool breeze sooths the summer's flame,
<br />Rejuvenating my life and brining me back to my soul.
<br />The autumn night freezes time for the next day,
<br />Time, a gentle blanket you request, allow it to keep you warm,
<br />My love.
<br /></p><p>A friendly reunion to my Angel, help me re-center,

Author Comment

by:damijim
okay, nevermind.. it's the linebreak from the line above it. hrm, so I'll need to always check the next line to make sure it isn't blank before writing a <br />....

I changed this line to -> print($poem_lines . "<br />\n");
and got this output->

The autumn night freezes time for the next day,
<br />
Time, a gentle blanket you request, allow it to keep you warm,
<br />
My love.
<br />
</p><p>A friendly reunion to my Angel, help me re-center,

Expert Comment

by:TeRReF
Can you print your current code? I'll clean it up and make it work...

Author Comment

by:damijim
//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines, "\t\n\r\ \0") == '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>

Author Comment

by:damijim
err... sorry,

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
      $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
      if (!file_exists($poem)) {
            print 'This poem cannot be found or no longer exists. (error: 01)';
            die;
      }


//open the poem and retrieves each line (up to two megabyte max); store in the array $lines
      $opened_poem = fopen ($poem, 'r');
      while (!feof ($opened_poem)) {
            $buffer = fgets($opened_poem, 7000);
            $lines[] = $buffer;
      }
      fclose ($opened_poem); ///closes poem text file after read by system.


$total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has less than or equal to 3 lines of text, return an error.
      if ($total_lines < 3) {
            print 'This poem cannot be found or no longer exists. (error: 02)';
            die;
      }


//write the head of the XHTML document
      print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
      print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
      print '<head>' . "\n";
      print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
      print '<title>' . $lines[0] . '</title>' . "\n";
      print '</head>' . "\n";
      print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
      print("<h1>" . $lines[0] . "</h1>\n<br /><span>" . $lines[1] . "</span>");

$position = 3;
foreach($lines as $poem_lines) {
     if (trim($poem_lines, "\t\n\r\ \0") == '') {
        print("</p>\n<p>");
    } else {
          print($poem_lines . "<br />\n");
          if($total_lines == $position) { break; }
          $position++;
    }
}

//closing of the XHTML document
print "\n" . "</p> \n" . '</body>' . "\n";
print '</html>';

?>

Expert Comment

by:TeRReF
I think this should do it...

<?php

//Written by JKS on July 8th, 2006.. all rights reserved.
     $poem = "A_New_Dawn.txt";


//check to make sure the poem exists, if not provide an error message
     if (!file_exists($poem)) {
          print 'This poem cannot be found or no longer exists. (error: 01)';
          die;
     }


//open the poem and read each line (fgets reads at most 6999 bytes per call here); store in the array $lines
     $opened_poem = fopen ($poem, 'r');
     while (!feof ($opened_poem)) {
          $buffer = fgets($opened_poem, 7000);
          $lines[] = $buffer;
     }
     fclose ($opened_poem); ///closes poem text file after read by system.


     $total_lines = count($lines); //count the total amount of lines in the poem

//If the poem has fewer than 3 lines of text, return an error.
     if ($total_lines < 3) {
          print 'This poem cannot be found or no longer exists. (error: 02)';
          die;
     }


//write the head of the XHTML document
     print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n";
     print '<html xmlns="http://www.w3.org/1999/xhtml">' . "\n";
     print '<head>' . "\n";
     print '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />' . "\n";
     print '<title>' . $lines[0] . '</title>' . "\n";
     print '</head>' . "\n";
     print '<body>' . "\n";
//end of XHTML header


//output the title & date of the poem to the browser with formatting
     // open the first stanza's <p> here; the loop below closes and reopens it at every blank line
     print("<h1>" . array_shift($lines) . "</h1>\n<span>" . array_shift($lines) . "</span><p>");
     // remove empty line under date line
     array_shift($lines);

     // now we only have the lines that contain the actual poem
     $firstline = true;
     foreach($lines as $poem_lines) {
          if (trim($poem_lines) == '') { // blank or whitespace-only line ends the stanza
               print("</p>\n<p>\n");
               $firstline = true;
          } else {
               $br = ($firstline) ? '' : "<br />\n";
               print($br . $poem_lines);
               $firstline = false;
          }
     }

//closing of the XHTML document
     print "\n" . "</p> \n" . '</body>' . "\n";
     print '</html>';

?>
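
An alternative shape for the same idea, offered as a sketch rather than a replacement for the script above: collect the lines into stanzas first, then render each stanza with implode() so that <br /> appears only between lines. The function names `group_stanzas` and `render_stanzas` are hypothetical, and the sample array stands in for the poem lines left after the title, date and first blank line have been removed. PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper names; a sketch, not the code from the thread.

// Group the lines of a poem body into stanzas, treating any run of
// blank (whitespace-only) lines as a single separator.
function group_stanzas(array $lines): array
{
    $stanzas = [];
    $current = [];
    foreach ($lines as $line) {
        // The cast turns a stray false (fgets() at end of file) into ''.
        $text = trim((string) $line);
        if ($text === '') {
            // Blank line: close the stanza in progress, if there is one.
            if ($current !== []) {
                $stanzas[] = $current;
                $current = [];
            }
        } else {
            $current[] = $text;
        }
    }
    if ($current !== []) {
        $stanzas[] = $current; // last stanza when no blank line follows it
    }
    return $stanzas;
}

// Render each stanza as one <p>, with <br /> only between lines.
function render_stanzas(array $stanzas): string
{
    $html = '';
    foreach ($stanzas as $stanza) {
        $html .= "<p>" . implode("<br />\n", $stanza) . "</p>\n";
    }
    return $html;
}

// Sample input standing in for the poem body, including repeated and
// trailing blank lines like those in the poem file.
$body = ["Line one,\n", "Line two.\n", "\n", "  \n", "Line three.\n", "\n", "\n"];
echo render_stanzas(group_stanzas($body));
```

Because empty stanzas are never stored, repeated or trailing blank lines in the file do not produce empty <p></p> pairs, and every <p> that is opened is also closed.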

Author Comment

by:damijim
Thanks! It's working! :)

If you don't mind me asking, why does PHP insist on putting some HTML tags on new lines? For example, the statement below...

print("<h1>" . array_shift($lines) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");

outputs to the browser as...

<h1>A New Dawn
</h1>

But the "\n" in the PHP appears after the </h1>... there isn't any code telling it to put </h1> on a new line... it doesn't make sense to me. Is there a way to control this?

Thanks for your time!

Expert Comment

by:TeRReF
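It's probably the newline from the original text file: fgets() returns each line with its trailing newline still attached, so the </h1> ends up on the next line.
If you really want to get rid of it, you might want to try: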
print("<h1>" . str_replace(array("\n", "\r"), '', array_shift($lines)) . "</h1>\n<p>" . array_shift($lines) . "</p><p>");
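
A small sketch of the same effect with rtrim(), which removes only the line ending at the end of the string. The sample string is an assumption standing in for the first line as fgets() would return it (here with a Windows-style "\r\n" ending).

```php
<?php
// Stand-in for what fgets() returns for the first line of the poem file:
// the text plus its line ending.
$title_line = "A New Dawn\r\n";

// Strip only trailing carriage returns and newlines; other text is untouched.
$title = rtrim($title_line, "\r\n");

echo "<h1>" . $title . "</h1>\n"; // prints <h1>A New Dawn</h1>
```

Unlike str_replace(), this leaves any newline characters inside the string alone, which makes no difference for single lines read with fgets().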

Author Comment

by:damijim
okay, makes sense. thanks :)

Expert Comment

by:TeRReF
You're welcome.