  • Status: Solved

php - get number of characters outside of <>

I have code that gets the number of characters in a post:

$num_chars = strlen(utf8_decode($content));

However, this count includes the characters inside HTML tags such as <img src="...">, <a href="...">, etc.

I use the character count to choose a layout based on how long the post is, so the count needs to be the number of characters displayed to a reader and must not include the HTML markup.

Any suggestions on the most efficient way to do that?

Thanks,

Chris
Asked by: St_Aug_Beach_Bum
1 Solution
 
Gary Commented:
strip_tags()

$num_chars = strlen(strip_tags(utf8_decode($content))); 

 
Ray Paseur Commented:
Is the data UTF-8?  If so, PHP has mb_strlen(), which may give a more accurate count.  strlen() assumes that one byte equals one character, and that isn't true with UTF-8.  strip_tags() is probably OK, but it has its quirks and is notoriously unreliable with malformed (or even some well-formed) tags.  Your exact results may depend on the PHP release.  You might want to check the user notes on the online manual page.  If you set up a test case with some representative data, I can show you how to test it.
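
To illustrate the byte-versus-character point, here is a minimal sketch. The sample string and variable names are made up for the example, and it assumes the mbstring extension is loaded. It also shows one strip_tags() quirk: text from neighbouring elements ends up run together.

```php
<?php
// Sample data (hypothetical): "café" written as explicit UTF-8 bytes, wrapped in a tag.
$content = "<p>caf\xC3\xA9</p>";

$text = strip_tags($content);          // "café": 5 bytes, 4 characters

$bytes = strlen($text);                // counts bytes
$chars = mb_strlen($text, 'UTF-8');    // counts characters

echo $bytes, " bytes, ", $chars, " characters\n";   // prints "5 bytes, 4 characters"

// strip_tags() removes tags without putting anything in their place,
// so the text of adjacent elements is joined.
$joined = strip_tags('<p>one</p><p>two</p>');
echo $joined, "\n";                    // prints "onetwo"
```

If the layout depends on what the reader sees, the character count from mb_strlen() is probably the number to use, provided the content really is UTF-8.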
 
Gary Commented:
It's not in UTF-8 at that point: the code has already called utf8_decode().
 
Ray Paseur Commented:
Gary: I didn't overlook that.  PHP is just getting into the 21st century with respect to multi-byte character sets, and utf8_decode() does not always work the way we wish it would.  Please see the note here:
http://php.net/manual/en/function.utf8-decode.php#104907

Some suggest using iconv().  I don't have much experience with it.
 
Gary Commented:
Even if the string is not decoded properly, it would (should) still be the same length (though I may need to double-check that).

Edit: Granted, for use elsewhere it may not be the best method.
 
St_Aug_Beach_Bum (Author) Commented:
Is it UTF-8?  Is the content of a post in WordPress 3.9.1 UTF-8?  I'm not sure.

I have PHP 5.3.3 on the server.

Testing...

Chris
 
Ray Paseur Commented:
"Is the content of a post in WordPress 3.9.1 UTF-8?"
I'm not sure either.  Part of the answer may depend on whether the client copied and pasted from Word for Windows, and on what they pasted (thanks again, Obama).
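
One way to settle the question is to test a sample of the stored text directly. A minimal sketch, assuming you can get the raw post content into a string; the helper name is_valid_utf8() is made up for this example. It relies on preg_match() failing on invalid input when the u modifier is set. Plain ASCII also passes, since ASCII is a subset of UTF-8.

```php
<?php
// Hypothetical helper: returns true when $s is a valid UTF-8 byte sequence.
function is_valid_utf8($s)
{
    // With the "u" modifier, PCRE rejects subjects that are not valid UTF-8,
    // so preg_match() does not return 1 for them.
    return preg_match('//u', $s) === 1;
}

$utf8   = "caf\xC3\xA9";   // "café" encoded as UTF-8
$latin1 = "caf\xE9";       // "café" encoded as ISO-8859-1

var_dump(is_valid_utf8($utf8));     // bool(true)
var_dump(is_valid_utf8($latin1));   // bool(false)
```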
 
Gary Commented:
Yes (or should be)
http://codex.wordpress.org/Converting_Database_Character_Sets

Why are you decoding the string to start with?
 
St_Aug_Beach_Bum (Author) Commented:
OK, well, I feel stupid, but I found I was asking the wrong question.

It's actually this function that is adjusting the layout, based on word count, not character count:

//for getting word count in single.php
function wcount(){
 ob_start();
 the_content();
 $content = ob_get_clean();
 return sizeof(explode(" ", $content));
}
 
Next, I tried changing the $content assignment to this:

 $content = strip_tags(ob_get_clean());

That caused a slight drop in the word count, but it is still counting a lot of what is inside the HTML tags as words.
 
Gary Commented:
Post an example of the string.
 
Gary Commented:
str_word_count()

Splitting a string on a single space isn't likely to give you a real word count.
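
To see why, here is a small sketch with a made-up sample string. explode() splits on single spaces only, so leading or doubled spaces produce empty pieces, and a newline does not separate words at all.

```php
<?php
// Made-up sample: two leading spaces, and a newline between "two" and "three".
$text = "  one two\nthree";

// explode() yields array("", "", "one", "two\nthree").
$pieces = explode(" ", $text);
$by_space = count($pieces);

// str_word_count() looks for runs of letters instead.
$by_word = str_word_count($text);

echo "explode: ", $by_space, ", str_word_count: ", $by_word, "\n";
// prints "explode: 4, str_word_count: 3"
```

Keep in mind that str_word_count() only treats letters, apostrophes and hyphens as word characters, so a number on its own is not counted as a word.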
 
Gary Commented:
This seems to work fine. It doesn't use strip_tags(), because that can leave two words run together.

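<?php
$string = "<a>some text</a>some other text <p>some text</p>some other text";

// Replace each tag with a space so the words on either side of it stay separate.
$string = preg_replace('/<[^>]*>/', ' ', $string);

// Count the words in the remaining text.
echo str_word_count($string); // prints 10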
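
Building on the snippet above, the same idea could be wrapped in a helper that wcount() calls on the buffered the_content() output. This is only a sketch: html_word_count() is a made-up name, and splitting on whitespace is swapped in for str_word_count() here so that numbers and accented words are counted too. It assumes the content is UTF-8.

```php
<?php
// Hypothetical helper: count the words a reader would see in an HTML fragment.
function html_word_count($html)
{
    // Replace each tag with a space so words on either side stay separate.
    $text = preg_replace('/<[^>]*>/', ' ', $html);

    // Turn entities such as &amp; or &eacute; into the characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Split on runs of whitespace and drop the empty pieces.
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    return count($words);
}

echo html_word_count('<a href="#">some text</a>some other text <p>PHP 5.3.3</p>');
// prints 7
```

The tag pattern is deliberately simple. It can be confused by a ">" inside an attribute value or by a stray "<" in the text, so it is worth testing against representative posts.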

 
St_Aug_Beach_Bum (Author) Commented:
Thank you both very much for your help on this!