php - get number of characters outside of <>

I have some code that gets the number of characters in a post:

$num_chars = strlen(utf8_decode($content));

However, this count includes the characters inside HTML tags such as <img src="...">, <a href="...">, etc.

I use the character count to choose a particular layout based on how long the post is, so I need it to be the number of characters displayed to a reader, not including the HTML markup.

Any suggestions on the most efficient way to do that?

Thanks,

Chris
St_Aug_Beach_Bum Asked:

Gary Commented:
strip_tags()

$num_chars = strlen(strip_tags(utf8_decode($content))); 

Ray Paseur Commented:
Is the data UTF-8? If so, PHP has mb_strlen(), which might give a more accurate count. strlen() assumes that one byte == one character, and that isn't true with UTF-8. strip_tags() is probably OK, but it has its quirks and is notoriously unreliable with malformed (or even some well-formed) tags. Your exact results may be PHP-release dependent. You might want to check the notes on the online man page. If you set up a test case with some representative data I can show you how to test it.
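
To illustrate the byte-versus-character point, here is a minimal sketch. It assumes the mbstring extension is loaded and that the content really is UTF-8; the sample string is made up for the example and is written with byte escapes so the result does not depend on how the file is saved.

```php
<?php
// Sample markup (an assumption for illustration): "Café crème" wrapped in tags.
// \xC3\xA9 is "é" and \xC3\xA8 is "è" in UTF-8, two bytes each.
$content = "<p>Caf\xC3\xA9 <b>cr\xC3\xA8me</b></p>";

// Remove the tags first, so only the visible text is measured.
$text = strip_tags($content);

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$byteCount = strlen($text);
$charCount = mb_strlen($text, 'UTF-8');

echo $byteCount, "\n"; // 12
echo $charCount, "\n"; // 10
```

The two counts differ by one for each two-byte character, so the gap grows with the amount of accented or non-Latin text in the post.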
Gary Commented:
It's no longer UTF-8 at that point: the original code already runs it through utf8_decode().
Ray Paseur Commented:
Gary: I didn't overlook that. PHP is just getting into the 21st century with respect to multi-byte character sets, and utf8_decode() does not always work the way we wish it would. Please see the note here:
http://php.net/manual/en/function.utf8-decode.php#104907

Some suggest using iconv. I don't have much experience with it.
Gary Commented:
Even if the string is not decoded properly, it would (should) still be the same length
(though I may need to double-check that).

Edit: granted, for use elsewhere it may not be the best method.
St_Aug_Beach_Bum Author Commented:
Is it UTF-8? Is the content from a post in WordPress 3.9.1 UTF-8? I'm not sure.

I have PHP 5.3.3 on the server.

Testing...

Chris
Ray Paseur Commented:
"is the content from a post using wordpress 3.9.1 utf-8?"
I'm not sure either.  Some part of the answer may lie in what/whether the client copied and pasted from Word for Windows (thanks again, Obama).
Gary Commented:
Yes (or should be)
http://codex.wordpress.org/Converting_Database_Character_Sets

Why are you decoding the string to start with?
St_Aug_Beach_Bum Author Commented:
Ok, well I feel stupid, but I found I am asking the wrong question.

It's actually this function that is adjusting the layout, based on word count, not character count:

//for getting word count in single.php
function wcount(){
 ob_start();
 the_content();
 $content = ob_get_clean();
 return sizeof(explode(" ", $content));
}
 
Now... I tried changing the $content to this:

 $content = strip_tags(ob_get_clean());

That caused a slight drop in the word count, but it is still counting a lot of what is in the HTML tags as words.
Gary Commented:
Post an example of the string.
Gary Commented:
str_word_count()

Splitting a string on a single space isn't likely to give you a real word count.
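
A small sketch of why splitting on a single space miscounts: doubled spaces and trailing spaces produce empty pieces, and a newline does not separate words at all. The sample text is made up for illustration.

```php
<?php
// Sample text (an assumption): a double space, a newline, and a trailing space.
$text = "one  two\nthree four ";

// Naive approach from the question: split on a single space.
// Pieces are "one", "", "two\nthree", "four", "".
$naive = count(explode(" ", $text));

// Split on any run of whitespace and drop empty pieces.
$split = count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));

// Built-in word counter.
$words = str_word_count($text);

echo $naive, "\n"; // 5
echo $split, "\n"; // 4
echo $words, "\n"; // 4
```

Post content from an editor usually contains newlines and runs of spaces, so the naive count can drift in either direction.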
Gary Commented:
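This seems to work fine. It doesn't use strip_tags(), as that may end up joining two words together.

<?php
$string = "<a>some text</a>some other text <p>some text</p>some other text";

// Replace every tag with a space so the words on either side stay separate.
$string = preg_replace('/<[^>]*>/', ' ', $string);

// Count the words that are left.
echo str_word_count($string); // 10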
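
To show the difference with strip_tags() concretely, here is a minimal sketch. The helper name count_visible_words() is hypothetical and not part of the thread; it simply wraps the regex approach above.

```php
<?php
// Hypothetical helper (name chosen for this example only).
function count_visible_words($html)
{
    // Replace each tag with a space so adjacent words stay separated.
    $text = preg_replace('/<[^>]*>/', ' ', $html);
    return str_word_count($text);
}

$html = '<p>some text</p>some other text';

// strip_tags() removes the tags without adding a space,
// leaving "some textsome other text".
$withStripTags = str_word_count(strip_tags($html));

// Replacing tags with spaces keeps "text" and "some" apart.
$withSpaces = count_visible_words($html);

echo $withStripTags, "\n"; // 4
echo $withSpaces, "\n";    // 5
```

In the wcount() function from the question, this would mean returning count_visible_words($content) instead of sizeof(explode(" ", $content)). One caveat: str_word_count() is not multibyte-aware, so counts for text with many non-ASCII characters may be off.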

St_Aug_Beach_Bum Author Commented:
Thank you both very much for your help on this!