simdex

asked on

How to Stop WordPress from Removing Paragraph <p> Tags From Editor?

How do I stop WordPress from removing paragraph tags (<p> and </p>) from the post/page editor? I often copy and paste HTML code from a post or page, edit it in another text or HTML editor, then use the resulting HTML code elsewhere in other applications that do require <p>...</p> tags around paragraphs, not just line breaks.

Please let me know your thoughts. Any and all ideas and solutions are appreciated. Thanks in advance for your help!
Dan Craciun

Paste the HTML code in the "Text" tab of the editor.
There you'll see the HTML structure of your page/post.

HTH,
Dan
ASKER CERTIFIED SOLUTION
Jason C. Levine

simdex

ASKER

That doesn't address my question. I don't want WordPress to remove the <p> tags from the text/code editor, ever. By default, WordPress treats regular line breaks as a cue to start a new paragraph (I'm not talking about <br> tags). I don't want this.

Please let me know your thoughts at your convenience.

Dan, FYI:

Unless you disable wpautop(), your solution won't work. Pasting the text, then switching to visual mode and/or publishing, will strip the <p> tags.
WordPress has an aggressively opinionated stance toward content filtering, which usually mangles your content badly. Furthermore, it passes your content down a chain of plugins, each of which can take a whack at mangling it further. Well-written software uses validation to ensure that dangerous content is not submitted and, if the content is not well formed, bounces it back to you with an error so you can correct it. WordPress, on the other hand, seems to think it can guess what you meant: it will accept just about anything and then destroy it while trying to format it to its own nonsensical standards. This is generally regarded as an amateurish and lazy approach to content safety. WordPress runs about 22% of all active websites and is the chief offender when it comes to this lazy and illogical approach. The end result is that you have to fight the platform constantly just to get it to save what you gave it without destroying it, and in a lot of cases you can't do much about it except pick a different platform or install fifty plugins to try to make it play nice in the sandbox.

Plugins themselves often complicate things further, because many of them are not particularly well written either.

Most of them use regex to parse HTML, which is pretty much a cardinal sin in programming (use an XML parser, not regex). WordPress itself also does this, which constantly results in stray <p> tags and broken markup that fails validation. Anyhow, if you want to make WordPress stop messing up your content, I can explain the process, but it is neither straightforward nor easy to implement.

WordPress, without any plugins helping, mangles your content in about five locations, which collectively represent one round trip from the editor to the database, and then out of the database and back to your page:

- The wpautop filter
- The wpremovep filter
- The sanitizers that run before the content is inserted into the database, which include stripslashes and stripslashes_deep
- The JavaScript equivalent of wpautop that is loaded with the TinyMCE editor (yes, they do this in two different places, written in two different languages, and the two do not work the same way)
- The JavaScript equivalent of wpremovep that is also loaded with the TinyMCE editor (again, the same thing done in two different places in two different languages)

You need to disable all of these if you want WordPress not to mangle your content, and even that assumes that neither your plugins nor your theme apply further bad filtering to it.

wpautop (the backend version) can be disabled with

remove_filter( 'the_content', 'wpautop' );
remove_filter( 'the_excerpt', 'wpautop' );



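The remove_filter() calls above must use the same priority the filter was added with, which for wpautop is 10 (the default). With any other priority, remove_filter() finds nothing to remove, and WordPress will happily ignore your request, not tell you, and apply the filter anyway.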


There is no filter for removing wpremovep, stripslashes, or stripslashes_deep. Instead, you filter the content before they run, replacing whatever would get stripped with a placeholder that survives, and then swap the placeholder back for the regular markup when the post/page is retrieved from the database for display. You do this by registering your substitution filter extremely early and your reverse substitution filter extremely late. This is a constant problem if your site has content that contains backslashes, such as code samples. The same breakage also hits various other nitpicky characters that WordPress deems unsafe, usually with no way to whitelist them, which leaves the double replacement approach as your only remaining option.
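The double replacement idea can be sketched on its own, independent of WordPress. The snippet below is only an illustration of the logic, written in JavaScript; on a real site the two steps would live in the early and late filters described above. The token `%%BACKSLASH%%` and all three function names are hypothetical, and `stripBackslashes` is just a simplified stand-in for a sanitizer that removes backslashes.

```javascript
// Hypothetical placeholder; choose a token that never occurs in real content.
const BACKSLASH_TOKEN = '%%BACKSLASH%%';

// "Extremely early" step: hide backslashes before any sanitizer sees them.
function protectBackslashes(content) {
  return content.split('\\').join(BACKSLASH_TOKEN);
}

// "Extremely late" step: put the backslashes back when reading the content out.
function restoreBackslashes(content) {
  return content.split(BACKSLASH_TOKEN).join('\\');
}

// Simplified stand-in for a backslash-stripping sanitizer (an assumption,
// not the real implementation): drops each backslash and keeps the
// character that follows it.
function stripBackslashes(content) {
  return content.replace(/\\(.?)/g, '$1');
}

// Round trip: protect, sanitize, store, then restore on the way out.
const original = 'Use \\d+ to match digits in C:\\temp\\notes.txt';
const stored = stripBackslashes(protectBackslashes(original));
const displayed = restoreBackslashes(stored);
```

The placeholder has to be something the sanitizers leave alone and that your content never contains, otherwise the restore step will corrupt legitimate text.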

The JavaScript portion is problematic. There is no API provided to disable this functionality. However, if you want a quick answer, you can register a script into the admin footer on editor pages that replaces wp.editor.wpautop and wp.editor.wpremovep with functions that just return the same thing they get.
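A minimal sketch of that replacement script follows. The helper name `disableParagraphFilters` is hypothetical. The property names `wpautop` and `wpremovep` are taken from the text above; `autop` and `removep` are included only on the assumption that some core versions expose the functions under those names instead, so check what your version actually defines on `wp.editor` before relying on this.

```javascript
// Replace the named functions on `editor` with a pass-through and
// return the list of names that were actually found and replaced.
function disableParagraphFilters(editor, names) {
  const passThrough = (content) => content; // returns exactly what it receives
  const replaced = [];
  for (const name of names) {
    if (editor && typeof editor[name] === 'function') {
      editor[name] = passThrough;
      replaced.push(name);
    }
  }
  return replaced;
}

// Browser only: run this after the editor scripts have loaded,
// for example from a script printed in the admin footer.
if (typeof window !== 'undefined' && window.wp && window.wp.editor) {
  disableParagraphFilters(window.wp.editor, ['wpautop', 'wpremovep', 'autop', 'removep']);
}
```

Returning the list of replaced names makes it easy to log which functions were found, which helps when a WordPress update renames them.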

This works by using one of WordPress's chief weaknesses against it: they make everything globally accessible everywhere, which lets you edit just about anything, including internals that you should not be messing with. JavaScript in the browser also shares a single global scope, and anything attached to the window object can be replaced at any time. Most JavaScript libraries wrap their code in a self-executing function to prevent anything from messing with them, but WordPress just dumps it onto the window for anything to do anything with, which is why you can do this even though it is not something you are officially supposed to be able to edit.

This will, however, make your editor display everything on one line, because the WordPress editor is built around the opinion that you are going to do it their way whether you like it or not. Alternatively, you can install a plugin that swaps the editor for a better one, but you will still need to address the filter portion as described above in order to fully retain your content integrity.

It should also be noted that WordPress seems utterly committed to not giving you tools to keep things strictly how you put them in, and frequently changes its API in ways that make this process more difficult. As of the current version (4.9), this approach works, but they will inevitably break it again.