evco asked:

Replace newlines with HTML paragraph tags

The following replaces 2 or more newlines with paragraph tags. However, if a paragraph (text delimited by 2 or more newlines) contains a single newline, it is not wrapped in paragraph tags.

I have a basic understanding of regular expressions, but I'm at a loss as to how to fix the regex below to correct the problem.

I've figured out that it DOES work when newlines are \r but not when they are \r\n or \n.

The example below will output:
<p>This is the first paragraph.</p>
This is the second
paragraph.

But I need it to output:
<p>This is the first paragraph.</p>
<p>This is the second
paragraph.</p>
$string = "This is the first paragraph.\n\nThis is the second\nparagraph.";
echo preg_replace("{(?:^|(?:\x0d\x0a){2,}|\x0a{2,}|\x0d{2,})(.+?)(?=(?:(\x0d\x0a){2,}|\x0d{2,}|\x0a{2,}|$))}", "<p>$1</p>", $string);


Tags: Regular Expressions, PHP


8/22/2022 - Mon
hielo

How about this:
$string = "This is the first paragraph.\n\nThis is the second\nparagraph.";
echo "<p>" . preg_replace("/\n\n+/","</p><p>",$string) . "</p>";

OR if you want newline between paragraphs:
$string = "This is the first paragraph.\n\nThis is the second\nparagraph.";
echo "<p>" . preg_replace("/\n\n+/","</p>\n<p>",$string) . "</p>";
ASKER
evco

I need to accommodate all operating systems, though.

I tried to build upon your example but can't quite get it there...
echo "<p>" . preg_replace("/^[\r|\n|\r\n]{2,}$/", "</p><p>", $string) . "</p>";


hielo

Try:
echo "<p>" . preg_replace("/(\r|\n|\r\n){2,}/", "</p><p>", $string) . "</p>";
 
ASKER
evco

It works... sort of. It wraps <p></p> around any number of newlines, so strings before a single newline also get wrapped in paragraph tags.
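One likely cause, assuming the test string used \r\n line endings: in (\r|\n|\r\n){2,} the engine can count the \r and the \n of a single \r\n as two separate repetitions, so one Windows line break already satisfies {2,}. Simply listing \r\n first is not enough, because the engine can backtrack and split the pair again. The sketch below uses an atomic group, (?>...), to stop that; the variable names are illustrative only.

```php
<?php
$string = "This is the second\r\nparagraph.\r\n\r\nThis is the third.";

// Plain alternation: "\r" then "\n" count as two repetitions,
// so a single "\r\n" is already treated as a paragraph break.
$loose = preg_replace('/(\r|\n|\r\n){2,}/', '</p><p>', $string);

// Atomic group with "\r\n" tried first: each "\r\n" is consumed as one unit
// and the engine may not backtrack into the group to split the pair.
$strict = preg_replace('/(?>\r\n|\r|\n){2,}/', '</p><p>', $string);

echo '<p>' . $strict . '</p>';
```

With the atomic group, only the double line break separates paragraphs and the single \r\n stays inside its paragraph.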
hielo

Try:
echo "<p>" . preg_replace("/(\r|\n|\r\n)(\r|\n|\r\n)+/", "</p><p>", $string) . "</p>";
ASKER
evco

It still does the same thing. I've spent so much time on this darn thing today that I think I'm starting to repeat some of my solutions out of desperation!

I'm wondering if it may be easier to tweak the original regex, since it got the job done most of the way. It's just ignoring chunks that contain a single newline. For example, it will skip wrapping the first paragraph in <p></p> because of the single newline. However, this problem only exists for newlines on Windows (\r\n) and Unix/Linux (\n); it works for Mac (\r):

$string = "This is a \n paragraph \n\n. And this is another paragraph \n\n";
preg_replace("{(?:^|(?:\x0d\x0a){2,}|\x0a{2,}|\x0d{2,})(.+?)(?=(?:(\x0d\x0a){2,}|\x0d{2,}|\x0a{2,}|$))}", "<p>$1</p>", $string);
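A possible explanation for the \r versus \n difference, offered as an assumption and not as the thread's accepted answer: by default the dot in PCRE does not match \n, so (.+?) cannot cross a single \n (or the \n half of \r\n), while a lone \r is typically matched like any other character. The sketch below adds the s (dot-all) modifier to a simplified copy of the original pattern; the ~ delimiters and variable names are my own choices.

```php
<?php
$string = "This is the first paragraph.\n\nThis is the second\nparagraph.";

// Same alternatives as the original pattern, written with ~ delimiters.
$pattern = '~(?:^|(?:\r\n){2,}|\n{2,}|\r{2,})(.+?)(?=(?:\r\n){2,}|\r{2,}|\n{2,}|$)~';

// Without "s", "." stops at "\n", so the second paragraph never matches.
$without = preg_replace($pattern, '<p>$1</p>', $string);

// With "s" (dot-all), "." may also match "\n" inside a paragraph.
$with = preg_replace($pattern . 's', '<p>$1</p>', $string);

echo $with;
```

Note that the matched blank lines are consumed by the replacement, so the paragraphs come out back to back.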
hielo

Try:

echo "<p>" . preg_replace("/(\n)(\n)+/", "</p><p>", ( preg_replace("/\r/","\n",$string)) ) . "</p>";
OR:
echo "<p>" . preg_replace("/(\n)(\n)+/U", "</p><p>", ( preg_replace("/\r/U","\n",$string)) ) . "</p>";


ASKER
evco

It works for \n and \r but not \r\n, and if you throw in some whitespace before or after the newlines it throws everything off. I've tried a couple of variations on what you gave me but didn't get anywhere.
hielo

>>but not \r\n ...
That is odd, because my last suggestion first changes \r to \n. So if you had \r\n, it gets converted to \n\n. The outer regex then takes that result, sees \n\n, and makes the substitution. I wonder if your input actually has \f (form feed):

echo "<p>" . preg_replace("/(\n)(\n)+/", "</p><p>", ( preg_replace("/[\r\f]/","\n",$string)) ) . "</p>";
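A possible explanation for the \r\n result, sketched under the assumption that the input uses Windows line endings: replacing every \r on its own turns each \r\n pair into \n\n, so a single line break becomes indistinguishable from a paragraph break. Treating \r\n as one unit avoids the doubling. The variable names are illustrative only.

```php
<?php
$string = "This is a\r\nparagraph\r\n\r\nThis is another paragraph";

// Replacing every "\r" by itself turns each "\r\n" pair into "\n\n".
$doubled = preg_replace("/\r/", "\n", $string);

// Matching "\r" with an optional "\n" after it keeps a single break single.
$normalized = preg_replace('/\r\n?/', "\n", $string);

// json_encode() is used here only to make the control characters visible.
echo json_encode($doubled), "\n", json_encode($normalized), "\n";
```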
ASKER
evco

Actually, I'm setting the input manually so I can control it. The only code being executed in my test script right now is the following. I change the newline format each time to test all three: \n, \r and \r\n

$string = "This is a\r\nparagraph\r\n\r\nThis is another paragraph\r\n\r\n";
echo "<p>" . preg_replace("/(\n)(\n)+/", "</p><p>", preg_replace("/[\r\f]/", "\n", $string));

I tried your solution but it wraps <p></p> around the content preceding the single newline.
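For reference, here is a minimal sketch of one way to meet the requirements stated so far (all three newline styles, single newlines kept inside a paragraph, stray spaces or tabs around blank lines tolerated). It is not the certified answer, which is not visible in this copy of the thread, and the helper name nl2p() is hypothetical.

```php
<?php
// Hypothetical helper; the name nl2p() does not come from the thread.
function nl2p(string $text): string
{
    // 1. Normalize "\r\n" and lone "\r" to "\n" so one rule covers all platforms.
    $text = preg_replace('/\r\n?/', "\n", $text);

    // 2. Split on two or more newlines, ignoring spaces or tabs around them.
    //    PREG_SPLIT_NO_EMPTY drops empty chunks.
    $paragraphs = preg_split(
        '/[ \t]*\n(?:[ \t]*\n)+[ \t]*/',
        trim($text),
        -1,
        PREG_SPLIT_NO_EMPTY
    );

    // 3. Wrap each chunk; single newlines inside a chunk are left untouched.
    return '<p>' . implode("</p>\n<p>", $paragraphs) . '</p>';
}

echo nl2p("This is a\r\nparagraph \r\n \r\nThis is another paragraph\r\n\r\n");
```

Splitting and re-joining avoids having to balance the opening and closing tags by hand. The sketch does not escape HTML in the input; that would be a separate step.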
ASKER CERTIFIED SOLUTION
hielo

Log in or sign up to see answer
ASKER
evco

Sweet! That seems to work. Now I just need to try it in the production code. I'm not at my office computer at the moment, so I will verify later this evening.
ASKER
evco

Thank you for your help, hielo. It works beautifully.