Let's say I have the following sentence:

A quick brown fox jumped over a lazy dog.

However, I have a limit: only 25 characters are allowed in that sentence. Cutting it off at the limit might leave me with something like:

A quick brown fox jum

However, that fragment doesn't make grammatical sense, so I would prefer to find the last word that fits while staying within the 25-character limit. This would give us something like:

A quick brown fox

This is shorter than the 25-character limit, but it makes more grammatical sense, i.e. no word is broken up and we keep the maximum number of whole words while staying within the limit.

How can I write a function that takes a string and a character limit such as 25 and, if the string exceeds the limit, returns the string with the maximum number of whole words possible?

+12  A: 

It's easy enough using regex:

function first_few_words($text, $limit) {
    // nothing to trim if the text already fits
    if (strlen($text) <= $limit) {
        return $text;
    }
    // grab one extra letter - it might be a space
    $text = substr($text, 0, $limit + 1);
    // take off non-word characters + part of word at end
    $text = preg_replace('/[^a-z0-9_\-]+[a-z0-9_\-]*\z/i', '', $text);
    return $text;
}

echo first_few_words("The quick brown fox jumps over the lazy dog", 25);

Some extra features of this implementation:

  • Also splits words at line breaks and tabs.
  • Keeps a final word that ends exactly at the limit (character 25 here).

Edit: changed regex so that only letters, digits, '_' and '-' are considered word characters.

too much php
+1 That is far more practical
karim79
Better to use `mb_substr` for multi byte safety
James Wheare
Can you change this so punctuation items like `?,.-` are also considered as word delimiters?
Click Upvote
@jwheare: Unfortunately I'm not confident enough to write a regex that handles multi-byte characters ... :-S
too much php
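Following up on the two comments above (multi-byte safety, and punctuation as word delimiters), here is a minimal sketch of how `first_few_words` might be adapted. It assumes the input is valid UTF-8 and that the mbstring extension is loaded. The name `first_few_words_mb` is hypothetical, and treating everything except Unicode letters, digits and `_` as a delimiter is one possible reading of the request, not the original author's code.

```php
<?php
// Hypothetical multi-byte variant of first_few_words(); assumes UTF-8 input.
function first_few_words_mb(string $text, int $limit): string
{
    // Nothing to trim when the text already fits.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    // Grab one extra character: if it is a delimiter, the last word is complete.
    $cut = mb_substr($text, 0, $limit + 1, 'UTF-8');
    // Drop the final run of delimiters plus any partial word after it.
    // \p{L} = letters, \p{N} = digits; anything else (spaces, ?,.-) delimits.
    $trimmed = preg_replace('/[^\p{L}\p{N}_]+[\p{L}\p{N}_]*\z/u', '', $cut);
    // preg_replace() returns null on error (e.g. invalid UTF-8): fall back.
    $trimmed = $trimmed ?? $cut;
    // A single word longer than the limit has no delimiter to cut at,
    // so hard-truncate it as a last resort.
    if (mb_strlen($trimmed, 'UTF-8') > $limit) {
        $trimmed = mb_substr($trimmed, 0, $limit, 'UTF-8');
    }
    return $trimmed;
}

echo first_few_words_mb("The quick brown fox jumps over the lazy dog", 25);
// The quick brown fox jumps
```

Because `-` is now a delimiter, hyphenated words may be split at the hyphen, and trailing punctuation is removed along with the partial word.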
A: 

You can try adapting this function. I took the idea from the PHP site and adapted it to my needs. It takes the "head" and "tail" of a string and reduces the string (respecting word boundaries) to the given length. For your needs, you may be fine stripping out the "tail" part of the function.

function strMiddleReduceWordSensitive($string, $max = 50, $rep = ' [...] ') {
    // nl2space() and utf8decode() are not PHP built-ins; they are presumably
    // helpers defined elsewhere (newlines to spaces, UTF-8 decoding).
    $string = nl2space(utf8decode($string));
    $strlen = mb_strlen($string);

    // Short enough already: nothing to reduce.
    if ($strlen <= $max)
        return $string;

    // Characters left for head + tail once the replacement is accounted for.
    $lengthtokeep = $max - mb_strlen($rep);
    // Half for the head, half for the tail (rounded down).
    $length = intval($lengthtokeep / 2);

    // Head: cut at $length bytes, then back up to the last space.
    $tempHead = mb_strcut($string, 0, $length);
    $headEnd = strrpos($tempHead, ' ') + 1;
    $head = trim(mb_strcut($tempHead, 0, $headEnd));

    // Tail: take the last $length bytes, then skip ahead to the first space.
    $tempTail = mb_strcut($string, -$length);
    $tailStart = strpos($tempTail, ' ') + 1;
    $tail = trim(mb_strcut($tempTail, $tailStart));

    return $head . $rep . $tail;
}

The Disintegrator
what is this nonsense?
orlandu63
+2  A: 
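<?php
    function wordwrap_explode($str, $chars)
    {
        $code = '@@@';
        // explode() into a variable first: array_shift() takes its argument
        // by reference, so passing a function result directly raises a notice
        $lines = explode($code, wordwrap($str, $chars, $code));
        return $lines[0];
    }
    echo wordwrap_explode('A quick brown fox jumped over a lazy dog.', 25);
?>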

Output:

A quick brown fox jumped
inakiabt
I think you meant to use `$code` as the first argument to `explode(...)`?
too much php
yap, thanks....
inakiabt