I'm trying to cut a chunk of text down to around 30 characters. If it's already shorter than that, I want the original string back unchanged. On top of that, the text contains forum-style codes, so I want to strip out everything between square brackets ([]).

I'm using a pair of functions to do that; forum_extract is the one I call.

function forum_extract($text) {
        return str_replace("\n",'<br />', limit_text(preg_replace('/\[[^\]]+\]/', '', $text), 30));
}

function limit_text($text, $limit) {
        if (strlen($text) <= $limit)
                return $text;

        $words = str_word_count($text, 2);
        $pos = array_keys($words);
        return substr($text, 0, $pos[$limit]) . '...';
}

The problem comes in limit_text when the provided $text is shorter than the limit. All I get back is a "...".

For that to happen, the text must have got past the guard clause in limit_text. But how?

Here is a literal that gets passed into limit_text but comes out as "...":

Friend of ours paid 150€ the other day from Malaga. Spread across 4 people it didn't seem to bad, given it was a 8+ hour day for the driver, petrol etc.
+2  A: 

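I think the problem is your $pos[$limit] lookup. It only works if $limit is a valid index into $pos, but $pos is a 0-based array with one entry per word, each entry being the character offset of that word in your string. So $pos[30] asks for the offset of the 31st word, not for character 30. If the text has fewer words than that, $pos[$limit] is undefined, substr() effectively gets a length of 0 (at least on PHP versions before 8), and only the "..." is left.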

Let's take a look at the example from the PHP manual:

$str = "Hello fri3nd, you're
    looking          good today!";
$words = str_word_count($str, 2);
/*
 * Array
 * (
 *     [0] => Hello
 *     [6] => fri
 *     [10] => nd
 *     [14] => you're
 *     [29] => looking
 *     [46] => good
 *     [51] => today
 * )
 */

$pos = array_keys($words);
/*
 * Array
 * (
 *     [0] => 0
 *     [1] => 6
 *     [2] => 10
 *     [3] => 14
 *     [4] => 29
 *     [5] => 46
 *     [6] => 51
 * )
 */
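
Applied to the question, a rough check like the one below shows the same thing. The sample sentence is my own ASCII stand-in for the asker's text (an assumption), picked because it is longer than 30 characters but contains only 10 words.

```php
<?php
// Sample text is an assumption: longer than 30 characters,
// but with far fewer than 31 words.
$text = 'Friend of ours paid 150 euros the other day from Malaga.';
$limit = 30;

// One entry per word; each entry is that word's offset in $text.
// Digits are not word characters, so "150" is skipped.
$pos = array_keys(str_word_count($text, 2));

var_dump(strlen($text) > $limit); // bool(true): the guard clause is passed
var_dump(count($pos));            // int(10): one entry per word
var_dump(isset($pos[$limit]));    // bool(false): there is no 31st word
```

So the text gets past the length check, but the index used for the cut position does not exist.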

Without having tested the following code, I'd try:

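function limit_text($text, $limit) {
    // Short enough already: return the text unchanged
    if (strlen($text) <= $limit) {
        return $text;
    }

    // Keys are the character offsets of the words, values are the words
    $words = str_word_count($text, 2);
    // Fall back to the full length if no word crosses the limit
    $cut_pos = strlen($text);
    foreach ($words as $pos => $word) {
        $end = $pos + strlen($word);
        // Cut right after the first word that reaches past the limit
        if ($end > $limit) {
            $cut_pos = $end;
            break;
        }
    }
    return substr($text, 0, $cut_pos) . '...';
}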
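
Note that the function above cuts after the first word that crosses the limit, so the result can be somewhat longer than $limit. If you would rather stay within the limit (not counting the "..."), a variant along these lines might work. The name limit_text_within and the hard-cut fallback are my own assumptions, not part of the code above.

```php
<?php
// Hypothetical variant: cut at the end of the last word that still fits.
function limit_text_within(string $text, int $limit): string
{
    if (strlen($text) <= $limit) {
        return $text;
    }

    $cut_pos = 0;
    // Keys are the offsets of the words, values are the words themselves
    foreach (str_word_count($text, 2) as $pos => $word) {
        $end = $pos + strlen($word);
        if ($end > $limit) {
            break; // this word would cross the limit
        }
        $cut_pos = $end;
    }

    // Assumption: if not even the first word fits, do a hard cut at $limit
    if ($cut_pos === 0) {
        $cut_pos = $limit;
    }

    return substr($text, 0, $cut_pos) . '...';
}

echo limit_text_within('Friend of ours paid 150 euros the other day from Malaga.', 30);
// Friend of ours paid 150 euros...
```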
Stefan Gehrig
+1  A: 

The last three lines of your limit_text function are incorrect: they index $pos by word number, not by character position. You probably want to cut off at a word boundary, and I would let PHP do that using wordwrap():

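# Wrap at word boundaries, leaving room for the three dots
$text = wordwrap($text, $limit - 3);
# Or a bit faster (depending on your average string size,
# wrapping 20 KB of text could take some time):
# $text = wordwrap(substr($text, 0, $limit), $limit - 3);
# Keep only the first line; compare against false, since strpos() can return 0
$pos = strpos($text, "\n");
if ($pos !== false) {
    $text = substr($text, 0, $pos);
}
return $text . '...';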
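
Put together as a complete function, that could look roughly like this. The name limit_text_wordwrap is hypothetical and the guard clause is carried over from the question; treat it as a sketch rather than tested production code.

```php
<?php
// Hypothetical helper name; wraps the wordwrap() idea in a full function.
function limit_text_wordwrap(string $text, int $limit): string
{
    // Short enough already: return the text unchanged
    if (strlen($text) <= $limit) {
        return $text;
    }

    // Break into lines of at most $limit - 3 characters at word boundaries
    $wrapped = wordwrap($text, $limit - 3, "\n");

    // Keep the first line only. A newline that was already in $text
    // also ends the excerpt here.
    $first = explode("\n", $wrapped, 2)[0];

    return $first . '...';
}

echo limit_text_wordwrap('Friend of ours paid 150 euros the other day from Malaga.', 30);
// Friend of ours paid 150...
```

Because the wrap width is $limit - 3, the excerpt including the dots stays within $limit as long as the first word is not longer than the wrap width.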
soulmerge