Solved

PHP Function to break sentence after X characters and add dots...

Posted on 2013-02-05
Last Modified: 2013-02-23
Hi,

I'm struggling with a function to break a sentence after X characters. What I had worked okay until the broken word contained special characters like:
ö ë & "

If the cut at character X landed inside the bytes of a UTF-8 special char, strange chars would be shown.

How can I build a function that also avoids breaking special chars in the output?

Thanks!!
Question by:peps03
22 Comments
 
LVL 12

Expert Comment

by:sivagnanam chandrakanth
ID: 38854417
Try this

<?php
$string= "Remove a counter from the row ö ë & specified by key at the granularity specified by column_path. Note that all the values in column_path besides column_path.column_family are truly optional: you can remove the entire row by just specifying the ColumnFamily, or you can remove a SuperColumn or a single Column by specifying those levels too. Note that counters have limited support for deletes: if you remove a counter, you must wait to issue any following update until the delete has reached all the nodes and all of them have been fully compacted.";
$parts = str_split($string, $split_length = 10);
echo "<pre>";
print_r($parts);

?>

 

Author Comment

by:peps03
ID: 38854469
Thx. But what I actually meant is a function to which I can pass a string and a number of characters, say 20, like:

$input = 'hello there, i am trying to get this working';
echo shorten($input, 20);
Output:
Hello there, i am tr..

But it should also work if the 20th character is a special char like: ë or &amp;
 
LVL 12

Expert Comment

by:sivagnanam chandrakanth
ID: 38854486
ok..Try this

<?php
$string= "Removö ë & counter from the row ö ë & specified by key at the granularity specified by column_path. Note that all the values in column_path besides column_path.column_family are truly optional: you can remove the entire row by just specifying the ColumnFamily, or you can remove a SuperColumn or a single Column by specifying those levels too. Note that counters have limited support for deletes: if you remove a counter, you must wait to issue any following update until the delete has reached all the nodes and all of them have been fully compacted.";

echo shorten($string,0,10);

function shorten($str,$start,$len){

return substr($str,$start,$len);

}
?>

 

Author Comment

by:peps03
ID: 38854524
Thanks.

I mean when this happens:

$string= "Frlkrrtï"; // = ï

echo shorten(utf8_decode($string),0,8);

function shorten($str,$start,$len){

return substr($str,$start,$len);

}



Change 8 to 9 and it works,
but when it is 8, it doesn't.

and ï is saved in the db like ï
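
A minimal sketch of what is probably going on here, assuming the string is UTF-8 and setting utf8_decode() aside: strlen() and substr() count bytes, and ï takes two bytes, so a length of 8 stops halfway through it. The variable names are only for illustration.

```php
<?php
// "Frlkrrtï" written with an explicit code point so the bytes are unambiguous (PHP 7+).
$string = "Frlkrrt\u{00EF}";

// strlen() counts bytes: 7 ASCII letters + 2 bytes for ï.
$bytes = strlen($string);

// substr() also works on bytes, so a length of 8 keeps only the first half of ï.
$cut8 = substr($string, 0, 8);
$cut9 = substr($string, 0, 9);

// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
$cut8IsValid = preg_match('//u', $cut8) === 1;   // false: broken character at the end
$cut9IsValid = preg_match('//u', $cut9) === 1;   // true: ï is complete

var_dump($bytes, $cut8IsValid, $cut9IsValid);    // int(9), bool(false), bool(true)
```

This is why 9 appears to work and 8 does not: the limit is being applied to bytes rather than characters.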
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 38854555
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38854757
What you're referring to is called a "teaser fragment" in publishing.  You may need to make multi-byte function changes here, but this works for me most of the time.
<?php // RAY_teaser_fragment.php
error_reporting(E_ALL);


// CREATE A TEASER FRAGMENT HEADLINE
// RETURN FIRST FEW WHOLE WORDS FOLLOWED BY ELLIPSES
// WITH A LINK TO THE FULL ARTICLE
// $length IS MINIMUM TRUNCATION CHARACTER COUNT


function teaser_fragment($text, $length=32, $url='#', $delim='|||')
{
    // IF TRUNCATION IS NEEDED
    if (strlen($text) > $length)
    {
        // IF TRUNCATION IS NEEDED, BREAK STRING APART
        $t = wordwrap($text, $length, $delim);
        $a = explode($delim, $t);
        $z = '...';
    }
    // IF TRUNCATION IS NOT NEEDED
    else
    {
        $a[0] = $text;
        $z = NULL;
    }

    // CONSTRUCT THE FRAGMENT WITH THE LINK AND ADD ELLIPSIS (LINK) TO THE END
    $teaser
    = '<a target="_blank" href="'
    . $url
    . '">'
    . $a[0]
    . $z
    . '</a>'
    ;
    return $teaser;
}



// USE CASES
echo "<pre>";
echo PHP_EOL;
echo "1...5...10...15...20...25...30...35...40...45..." . PHP_EOL;
echo teaser_fragment('Now is the time for all good men to come to the aid of their party');

echo PHP_EOL;
echo teaser_fragment('Now is the time for all good men to come to the aid of their party', 300);

echo PHP_EOL;
echo teaser_fragment('Now is the time for all good men to come to the aid of their party', 15, 'http://en.wikipedia.org/wiki/Filler_text');


HTH, ~Ray
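
For the multi-byte changes mentioned above, here is a hedged sketch of a word-boundary cut that counts characters instead of bytes. It is not the function from this post: utf8_word_teaser() is a hypothetical name, it returns plain text without the link, it treats the limit as a maximum, and it assumes valid UTF-8 input. It relies only on PCRE with the /u modifier, where "." matches one whole character.

```php
<?php
// Hypothetical helper: cut at the last whitespace within $limit characters (not bytes).
function utf8_word_teaser(string $text, int $limit, string $tail = '...'): string
{
    $limit = max(1, $limit);

    // preg_match_all() returns the number of matches, i.e. the character count.
    if (preg_match_all('/./us', $text) <= $limit) {
        return $text;
    }

    // Longest prefix of at most $limit characters that is followed by whitespace.
    if (preg_match('/^.{1,' . $limit . '}(?=\s)/us', $text, $m) === 1) {
        return rtrim($m[0]) . $tail;
    }

    // No whitespace within range: fall back to a hard cut at $limit characters.
    preg_match('/^.{0,' . $limit . '}/us', $text, $m);
    return $m[0] . $tail;
}

// "Fööd music and drinks" limited to 12 characters.
echo utf8_word_teaser("F\u{00F6}\u{00F6}d music and drinks", 12), PHP_EOL;   // Fööd music...
```

The greedy quantifier plus the lookahead makes the regex engine back off one character at a time until it sits just before whitespace, so no word and no multi-byte character is split.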
 

Author Comment

by:peps03
ID: 38855071
Is there no function to count how many bytes a certain special character consists of?
Then you could add that to the length set for the requested output
(without generating longer output, just to prevent special characters that consist of multiple bytes from getting broken).
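
One way to see how many bytes each character occupies, sketched under the assumption that the input is valid UTF-8. char_byte_lengths() is a hypothetical helper name; it splits the string into characters with preg_split() and the /u modifier, then measures each one with strlen(), which counts bytes.

```php
<?php
// Hypothetical helper: pairs each character of a UTF-8 string with its byte length.
function char_byte_lengths(string $text): array
{
    // With the /u modifier an empty pattern splits between characters, not bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return [];   // input was not valid UTF-8
    }

    $lengths = [];
    foreach ($chars as $char) {
        $lengths[] = [$char, strlen($char)];   // strlen() counts bytes
    }
    return $lengths;
}

// "Föë&": F and & take 1 byte each, ö and ë take 2 bytes each.
foreach (char_byte_lengths("F\u{00F6}\u{00EB}&") as [$char, $bytes]) {
    echo $char, ' => ', $bytes, PHP_EOL;
}
```

In practice it is usually simpler to cut by characters directly than to adjust a byte limit, but the per-character lengths make the cause of the broken output visible.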
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38856181
Have you tried strlen()?  Also, if you can post a link (please do not post the text) to a sample document we might be able to give you an example or two.
 
LVL 9

Expert Comment

by:rinfo
ID: 38858849
Maybe you need to use the multibyte string functions to handle Unicode characters in your routines.
Refer to this:
http://php.net/manual/en/ref.mbstring.php
 

Author Comment

by:peps03
ID: 38859570
Thanks.

Can this function spot if a character consists of multiple bytes? And maybe count all the bytes of the special characters? That would actually be all i need.

Could you give an example of that?

Thanks!
 
LVL 9

Expert Comment

by:rinfo
ID: 38860162
Well, first you have to enable php_mbstring.dll in php.ini.
After that you may try this code.
The important thing here is to specify the encoding you are using:
mb_internal_encoding("UTF-8");  //set encoding here
  $string=  "Remove a counter from the row ö ë & specified by key at";
  $stringLen = mb_strlen($string) ; //total length of the string
  
  $stringPart1 = mb_substr($string,0,10) ; //get part of the string to retain = retain first 10 chars
 
  $stringPart2 = mb_substr($string,10,$stringLen); //part of the string to replace with '.'
  
  $stringPart2 = mb_ereg_replace( "/(^\s+)|(\s+$)/us", ".", $stringPart2);
  $string = $stringPart1.$stringPart2;

 
LVL 9

Expert Comment

by:rinfo
ID: 38860201
I have tested the code.
Sorry, but it's not working.
The results are the same as you mentioned in your post.
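
For comparison, a minimal sketch of the mbstring approach reduced to the shorten() function asked for earlier in the thread. It assumes the mbstring extension is loaded and that the input is UTF-8; the encoding is passed explicitly on each call rather than relying on mb_internal_encoding(). The '..' tail and the rtrim() are choices made for this sketch.

```php
<?php
// Sketch of shorten(); assumes the mbstring extension is loaded and $text is UTF-8.
function shorten(string $text, int $limit, string $tail = '..'): string
{
    // mb_strlen() counts characters, so ö and ë each count as 1.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }

    // mb_substr() cuts on character boundaries, so a multi-byte character is never split.
    return rtrim(mb_substr($text, 0, $limit, 'UTF-8')) . $tail;
}

echo shorten('hello there, i am trying to get this working', 20), PHP_EOL;   // hello there, i am tr..
echo shorten("Frlkrrt\u{00EF} and more", 8), PHP_EOL;                        // Frlkrrtï..
```

Note that this only helps when the special characters are stored as real UTF-8 characters; HTML entities such as &amp;amp; would still be counted letter by letter.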
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38866048
Please post a link to a sample document.  Thanks, ~Ray
 

Author Comment

by:peps03
ID: 38867837
I don't have a link or page; it's just a simple function.

This is what i have:

<?php
	$string0 = 'Föööd music &amp; DJ&#039;s';
	$string1 = 'Fööd music &amp; DJ&#039;s';


function limit_letters($string, $letter_limit){
	
	
	$string2 = preg_replace('/&(.*?);/si', '-', $string);
	$countstring = mb_strlen(utf8_decode($string2), 'UTF-8');
		
	//if(strlen($countstring) > $letter_limit){
	if($countstring > $letter_limit){
		$dots1 = '..';
		$stringfixed = substr(htmlspecialchars_decode($string),0,$letter_limit);
	}else{
		$dots1 = '';
		$stringfixed = $string;
	};
	
	$stringfixed = utf8_decode($stringfixed);
	return rtrim($stringfixed).$dots1;
}

echo '<br><br>'.limit_letters($string0, 16).'<br><br>';
echo '<br><br>'.limit_letters($string1, 16).'<br><br>';
?>



As you can see, the difference between string0 and string1 is 1 character (ö).
But the outputs differ by 4 letters. I need this to be a difference of 1 character, as ö is only 1 character (but more bytes...).
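
A hedged sketch of one way to get a one-character difference with strings like these, assuming the input is UTF-8 text that may contain HTML entities. limit_letters_utf8() is a hypothetical rewrite, not the function above: it decodes the entities first so each counts as one character, cuts by characters with a /u regex, and only then re-encodes for output.

```php
<?php
// Hypothetical rewrite of limit_letters(); assumes UTF-8 input that may contain HTML entities.
function limit_letters_utf8(string $string, int $letter_limit, string $dots = '..'): string
{
    $flags = ENT_QUOTES | ENT_HTML401;

    // Turn &amp; and &#039; back into single characters so each counts as 1.
    $decoded = html_entity_decode($string, $flags, 'UTF-8');

    // With /u and /s, "." matches exactly one UTF-8 character (newlines included).
    $pattern = '/^.{0,' . max(0, $letter_limit) . '}/us';
    if (preg_match($pattern, $decoded, $m) !== 1 || $m[0] === $decoded) {
        return $string;   // nothing to cut, or the input was not valid UTF-8
    }

    // Re-encode only after cutting, so an entity can never be split in half.
    return htmlspecialchars(rtrim($m[0]), $flags, 'UTF-8') . $dots;
}

// Prints: Föööd music &amp; DJ..
echo limit_letters_utf8("F\u{00F6}\u{00F6}\u{00F6}d music &amp; DJ&#039;s", 16), PHP_EOL;
// Prints: Fööd music &amp; DJ&#039;..
echo limit_letters_utf8("F\u{00F6}\u{00F6}d music &amp; DJ&#039;s", 16), PHP_EOL;
```

Rendered in a browser, the two results are "Föööd music & DJ.." and "Fööd music & DJ'..", which differ by exactly one visible character.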
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38868116
The reason I am asking for a link to an external file containing the sample document is that the simple act of copying and posting the data may mung the multi-byte characters.  Rather than try to run a test on munged data, I would like to be able to use PHP to open the file that contains the original test data.  So please put the sample data online somewhere and post the URL of the online file here.  Thanks, ~Ray
 

Author Comment

by:peps03
ID: 38868170
http://pjpn.eu/shorten-chars/

this is the same code, but online.
 
LVL 109

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 38870936
This seems to test out OK.  The function is at line 15; the test cases are at the end.

<?php // RAY_temp_peps03.php
error_reporting(E_ALL);
echo '<pre>';

// MAN PAGE: http://www.joelonsoftware.com/articles/Unicode.html
// MAN PAGE: http://www.columbia.edu/kermit/utf8-t1.html
// MAN PAGE: http://www.utf-8.com/
// MAN PAGE: http://www.unicode.org/ucd/
// MAN PAGE: http://www.unicode.org/faq/line_breaking.html
// MAN PAGE: http://www.unicode.org/reports/tr14/
// MAN PAGE: http://www.php.net/manual/en/mbstring.supported-encodings.php
// MAN PAGE: http://www.php.net/manual/en/function.mb-split.php#99851

// MAKE A STRING FRAGMENT OF THE CORRECT LENGTH
function mb_teaser_fragment($str, $len, $tail=NULL)
{
    $arr = preg_split('/(?<!^)(?!$)/u', $str);
    $arr = array_slice($arr,0,$len);
    return implode('', $arr) . $tail;
}

// GET THE TEST DATA SET
$url = 'http://pjpn.eu/shorten-chars/';
$raw = file_get_contents($url);

// REMOVE THE EXTRANEOUS STUFF
$raw = str_replace('<br><br><br><br>', '|', $raw);
$raw = strip_tags($raw);
$raw = str_replace('Untitled Document', '', $raw);
$raw = trim($raw);

// ENABLE BROWSER DISPLAY
echo '<meta charset="utf-8" />' . PHP_EOL;

// EXPLODE THE MULTI-BYTE STRING INTO AN ARRAY OF TEST STRINGS
$arr = explode('|', $raw);

// ADD A NON-MULTI-BYTE STRING
$arr[] = 'Food is delicious';

// SHOW THE COMPARISONS OF THE LENGTH WITH DIFFERENT CHARACTER SETS
mb_internal_encoding("UTF-8");
foreach ($arr as $utf)
{
    $cnt =    strlen($utf);
    $mbt = mb_strlen($utf);
    echo PHP_EOL . $utf . " StrLen=$cnt AND Mb_Strlen=$mbt";
}

// MAKE SOME TESTS
$utf = $arr[0];
echo PHP_EOL . $utf;
echo PHP_EOL . mb_teaser_fragment($utf,  1);
echo PHP_EOL . mb_teaser_fragment($utf,  2);
echo PHP_EOL . mb_teaser_fragment($utf,  3);
echo PHP_EOL . mb_teaser_fragment($utf,  4);
echo PHP_EOL . mb_teaser_fragment($utf,  5);
echo PHP_EOL . mb_teaser_fragment($utf,  6);
echo PHP_EOL . mb_teaser_fragment($utf,  7);
echo PHP_EOL . mb_teaser_fragment($utf,  8);
echo PHP_EOL . mb_teaser_fragment($utf,  9);
echo PHP_EOL . mb_teaser_fragment($utf, 10);
echo PHP_EOL . mb_teaser_fragment($utf, 11);
echo PHP_EOL . mb_teaser_fragment($utf, 12);
echo PHP_EOL . mb_teaser_fragment($utf, 13);
echo PHP_EOL;

$utf = $arr[2];
echo PHP_EOL . $utf;
echo PHP_EOL . mb_teaser_fragment($utf,  1);
echo PHP_EOL . mb_teaser_fragment($utf,  2);
echo PHP_EOL . mb_teaser_fragment($utf,  3);
echo PHP_EOL . mb_teaser_fragment($utf,  4);
echo PHP_EOL . mb_teaser_fragment($utf,  5);
echo PHP_EOL . mb_teaser_fragment($utf,  6);
echo PHP_EOL . mb_teaser_fragment($utf,  7);
echo PHP_EOL . mb_teaser_fragment($utf,  8);
echo PHP_EOL . mb_teaser_fragment($utf,  9);
echo PHP_EOL . mb_teaser_fragment($utf, 10);
echo PHP_EOL . mb_teaser_fragment($utf, 11);
echo PHP_EOL . mb_teaser_fragment($utf, 12);
echo PHP_EOL . mb_teaser_fragment($utf, 13);
echo PHP_EOL . mb_teaser_fragment($utf, 14);
echo PHP_EOL . mb_teaser_fragment($utf, 15);
echo PHP_EOL . mb_teaser_fragment($utf, 16);
echo PHP_EOL . mb_teaser_fragment($utf, 17);
echo PHP_EOL . mb_teaser_fragment($utf, 18);
echo PHP_EOL . mb_teaser_fragment($utf, 19);
echo PHP_EOL . mb_teaser_fragment($utf, 20);


Best regards, ~Ray
 

Author Comment

by:peps03
ID: 38904904
Is there a method to count specified characters in a string?

Say I can count all the spaces, lowercase and capital letters, and numbers in a string; I can then subtract that amount from the total length of the string. That tells me the amount of special characters in the string.

This is what I was thinking:

Say you want to echo only the first 10 chars of a string. The string = 'Fööd Jazzz and drinks' (= Fööd Jazzz and drinks)

The first ten chars would echo 'Fööd Jaz' instead of 'Fööd Jazzz' because each special char counts for 2.
These would be the 10 chars counted: 'Fööd Jaz'

So only 6 letters and spaces are counted. That leaves 4 counted for the special chars. Knowing that every 2 of those make up 1 'normal' char, the output limit of 10 should be increased to 12 to output the desired text.

So is it possible to count pre-specified characters in a given string somehow?
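
The counting described above can be sketched with strlen() and PCRE alone, assuming the string is valid UTF-8. The variable names are illustrative. Note that the "4 special chars" in the reasoning above are really 4 bytes belonging to 2 characters.

```php
<?php
$string = "F\u{00F6}\u{00F6}d Jazzz and drinks";   // "Fööd Jazzz and drinks"

// strlen() counts bytes; preg_match_all() returns how many matches it found.
$byteCount = strlen($string);                              // 23 bytes
$charCount = preg_match_all('/./us', $string);             // 21 characters
$nonAscii  = preg_match_all('/[^\x00-\x7F]/u', $string);   // 2 multi-byte characters

// Bytes occupied by the first 10 characters, i.e. the length substr() would need.
preg_match('/^.{0,10}/us', $string, $m);
$bytesForTen = strlen($m[0]);                              // 12

echo "$byteCount bytes, $charCount characters, $nonAscii multi-byte", PHP_EOL;
echo $m[0], " needs $bytesForTen bytes", PHP_EOL;
```

The 12 matches the adjusted limit worked out by hand above, although matching the first 10 characters directly makes the byte arithmetic unnecessary.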
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38904946
I think you should post a new question for this.  It's been two weeks since the original question, which was answered with a tested-and-working code example.

Best regards, ~Ray
 

Author Comment

by:peps03
ID: 38905685
I've requested that this question be closed as follows:

Accepted answer: 0 points for peps03's comment #a38904904

for the following reason:

Thanks Ray!
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 38905686
I believe the author accidentally posted a close request instead of accepting the answer, which is accompanied by a tested and working code example at this URL.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28020210.html#a38870936

If that's wrong and I misunderstood the question or the close request, I'd like a chance to find out what the issues were.  Thanks, ~Ray