• Status: Solved
• Priority: Medium
• Security: Public

# Truncate text by words

I posted this question earlier and got an acceptable answer. I then noticed that the solution produced no result when the sample contained fewer words than the number requested in the truncation.

I am trying to truncate a text field to a certain number of words. The truncation needs to fall on word boundaries. Here is the solution I got:

\$pat = '~^(\S+\s+(?=\S)){4}~' ;
\$sub = "hello, foo-bar!\nbaz quux whatever" ;
preg_match(\$pat, \$sub, \$match);
echo "'", trim(array_shift(\$match)), "'\n";

If the sample has more words than the number in the pattern (4 in this case), this works beautifully. If the sample is shorter, it returns nothing, and I would like it to return the original sample instead. I hope this makes sense.

1 Solution

Commented:
Try this:

\$pat = '~^(\S+\s+(?=\S)){4}~' ;
\$sub = "hello, foo-bar!\nbaz quux whatever" ;
if(strlen(\$sub) > 4)
{
preg_match(\$pat, \$sub, \$match);
\$result = "'" . trim(array_shift(\$match)) . "'\n";
}
else
{
\$result = \$sub;
}

echo \$result;

Author Commented:
Thanks ykf2000, but your solution counts characters, not words. Your condition would be true for the first word alone, because "hello" is already longer than 4 characters, and I need to know whether the sample has 4 words or fewer.
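
To illustrate the difference between the two checks, here is a minimal sketch that counts whitespace-separated words instead of characters. The helper name `count_words` is hypothetical and not part of the posted code; it simply assumes that a "word" is any run of non-whitespace characters, as in the pattern above.

```php
<?php
// Hypothetical helper: counts runs of non-whitespace characters.
function count_words(string $text): int
{
    // PREG_SPLIT_NO_EMPTY drops empty pieces caused by leading or trailing whitespace.
    return count(preg_split('~\s+~', $text, -1, PREG_SPLIT_NO_EMPTY));
}

$sub = "hello, foo-bar!\nbaz quux whatever";

var_dump(strlen("hello") > 4);     // character test: true for a single word
var_dump(count_words("hello") > 4); // word test: false for a single word
var_dump(count_words($sub) > 4);    // word test: true, the sample has five words
```

With a check like this, the `if` would only run the regex when there really are more than 4 words.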

Commented:

\$pat = '~^(\S+\s+(?=\S)){4}~' ;
\$sub = "hello, foo-bar!\nbaz quux whatever" ;
preg_match(\$pat, \$sub, \$match);
if(strlen(trim(array_shift(\$match))) > 0)
{
\$result = "'" . trim(array_shift(\$match)) . "'\n";
}
else
{
\$result = \$sub;
}

echo \$result;

Author Commented:
This did not work as written, but it was close. As written, it returned only the Xth (here, 4th) word when the IF condition was true. I had to repeat the preg_match before assigning the first result.


Author Commented:
I am sure there is a cleaner way to do this than my modification of the above recommendation, which runs the regex twice. Are there any regex experts out there with a solution?

Commented:
Hmmm, I tried to post something before but it just says 'no text'... strange...

Anyway, I figured you could use {0,4} instead of {4} in your regular expression... that way you match at least 0 times, and at most 4 times, and that's what you want, right?
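
A minimal sketch of the `{0,4}` idea, for anyone who wants to try it. The helper name `first_match` and the second pattern (`$variant`) are assumptions added for illustration, not something taken from the posts above. One thing worth checking with `{0,4}`: each repetition requires whitespace followed by another word, so on a short sample the final word is left out of the match. The variant instead matches the first word and then up to three more "whitespace plus word" groups.

```php
<?php
// Pattern from the thread with {4} relaxed to {0,4}.
$loose = '~^(\S+\s+(?=\S)){0,4}~';
// Assumed variant: the first word, then up to three more "whitespace + word" groups.
$variant = '~^\s*\S+(?:\s+\S+){0,3}~';

// Hypothetical helper: returns the trimmed text matched by $pattern, or '' if nothing matched.
function first_match(string $pattern, string $subject): string
{
    // preg_match fills $match itself; $match[0] is the whole matched text.
    preg_match($pattern, $subject, $match);
    return trim($match[0] ?? '');
}

$long  = "hello, foo-bar!\nbaz quux whatever";
$short = "hello world";

var_dump(first_match($loose, $long));    // first four words
var_dump(first_match($loose, $short));   // "hello" only: the last word is dropped
var_dump(first_match($variant, $long));  // first four words
var_dump(first_match($variant, $short)); // "hello world"
```

If the sample can have exactly four words or fewer and the whole sample should come back, the variant is probably the safer of the two.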

Maybe there's an easier solution using the 'split'-function, and a for-loop to show the first 4 elements of the resulting array, anybody care to comment on that?
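
The split idea could look roughly like the sketch below. It swaps the old `split` function for `preg_split` and uses `array_slice` in place of a for-loop; the function name `truncate_words_split` is hypothetical. Note that this approach joins the words back with single spaces, so the original line break in the sample is not preserved.

```php
<?php
// Hypothetical helper: keeps at most $limit whitespace-separated words.
function truncate_words_split(string $text, int $limit = 4): string
{
    // Split on runs of whitespace, dropping empty pieces from leading or trailing space.
    $words = preg_split('~\s+~', $text, -1, PREG_SPLIT_NO_EMPTY);
    // array_slice simply returns fewer items when the text is short,
    // so no separate length check is needed.
    return implode(' ', array_slice($words, 0, $limit));
}

var_dump(truncate_words_split("hello, foo-bar!\nbaz quux whatever"));
var_dump(truncate_words_split("hello world"));
```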

Ciao,

netapi

Author Commented:
YES! That is the answer I was looking for. Thank you very much.
Question has a verified solution.
