Here is what I am trying to do. I have a block of text and I would like to extract the first 50 words from the string without cutting off a word in the middle. That is why I would prefer to count words rather than characters; if characters were good enough, I could just use a left() function.

I know the str_word_count($var) function will return the number of words in a string, but how would I return only the first 50 words?

I'm going for full immersion in PHP and I'm not yet familiar with many of the string functions.

Thanks in advance, Jason

+4  A: 

str_word_count takes an optional second parameter that tells it what to return.

Passing 1 makes it return an array of strings that are the words:

$words = str_word_count($var, 1);

Then you can slice things up with something like:

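// array_slice() copes with fewer than 50 words, so min() is only a safeguard
$len = min(50, count($words));
$first_fifty = array_slice($words, 0, $len);
$excerpt = implode(' ', $first_fifty); // back to a string; punctuation is not preserved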
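
If the original punctuation and spacing need to survive, one option is to ask str_word_count for format 2, which returns each word keyed by its position in the string, and then cut the original string right after the Nth word. This is only a sketch: first_words() is a made-up name, not something from this thread, and str_word_count only treats letters, apostrophes and hyphens as word characters by default, so a token like "50" is not counted as a word.

```php
<?php
// Hypothetical helper (name and signature are illustrative only).
// Returns $text cut right after its first $limit words.
function first_words(string $text, int $limit = 50): string
{
    if ($limit < 1) {
        return '';
    }
    // Format 2: array of [byte offset in $text => word]
    $words = str_word_count($text, 2);
    if (count($words) <= $limit) {
        return $text; // nothing to cut
    }
    $offsets = array_keys($words);
    $lastOffset = $offsets[$limit - 1];
    // End of the last kept word = its offset plus its length
    $end = $lastOffset + strlen($words[$lastOffset]);
    return substr($text, 0, $end);
}
```

Called as first_words($var, 50), it hands back the untouched string whenever there are 50 words or fewer.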
geofflane
+1  A: 

Are you sure you want a fixed number of words? For something like a "preview", it's generally better to take up to 300 characters and cut off at a word boundary, in which case you can use something like:

if (strlen($str) > 300) {
    $str = substr($str, 0, 300);
    $pos = strrpos($str, ' ');
    // If there is no space in the last 100 chars, just truncate
    if ($pos !== false && $pos > 200) {
        $str = substr($str, 0, $pos);
    }
    // You may also want to add an ellipsis:
    // $str .= '...';
}
Zarel
Hi, whoever modded down my answer, could you explain what's wrong with it? I concede that it's not nearly as detailed as jason's answer, but I don't see anything wrong with it...
Zarel
You got my upvote, but did you forget what to do if there is no space?
Thinker
Actually, I addressed that situation. I've edited the answer to add a comment on the specific line that handles it.
Zarel
+5  A: 

I would recommend against using the number of words as a baseline. You could easily end up with much less or much more data than you intended to display.

One approach I've used in the past is to ask for a desired length, but make sure it does not truncate a word. Here's something that may work for you:

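function function_that_shortens_text_but_doesnt_cutoff_words($text, $length)
{
    // Only shorten text that is longer than the requested length
    if (strlen($text) > $length) {
        // Cut at the first space at or after $length, so the result may run
        // slightly past $length. Note: strpos() returns false when no such
        // space exists, in which case substr() returns an empty string.
        $text = substr($text, 0, strpos($text, ' ', $length));
    }

    return $text;
}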
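
One edge case worth knowing about: when there is no space at or after $length, strpos() returns false and the substr() call ends up producing an empty string. Here is a minimal sketch of a guarded variant, assuming it is acceptable to return the whole text in that case. The name shorten_at_word_boundary() is made up for this example.

```php
<?php
// Hypothetical variant of the function above (the name is illustrative only).
function shorten_at_word_boundary(string $text, int $length): string
{
    if (strlen($text) <= $length) {
        return $text;
    }
    // Look for the first space at or after $length, as the original does
    $pos = strpos($text, ' ', $length);
    // No space found: the rest of the text is one word, so keep it whole
    return $pos === false ? $text : substr($text, 0, $pos);
}
```

Whether to return everything, or to fall back to a hard cut at $length, depends on how long a single "word" can realistically get in your data.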

That said, if you pass 1 as the second parameter to str_word_count, it will return an array containing all the words, and you can use array manipulation on that. You could also, although it's somewhat hacky, explode the string on spaces. But that leaves a lot of room for error, such as things that are not words getting counted as words.

P.S. If you need a Unicode-safe version of the above function, and have either the mbstring or iconv extension installed, simply replace all the string functions with their mb_ or iconv_ prefixed equivalents.
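
For what it's worth, the swap might look roughly like this with mbstring. This is a sketch that assumes the extension is loaded and the input is UTF-8; the function name is made up, and the check for false is an addition that the byte-based version does not have.

```php
<?php
// Hypothetical multibyte variant; assumes the mbstring extension and UTF-8 input.
function mb_shorten_at_word_boundary(string $text, int $length): string
{
    if (mb_strlen($text, 'UTF-8') <= $length) {
        return $text;
    }
    // Offsets and lengths are counted in characters here, not bytes
    $pos = mb_strpos($text, ' ', $length, 'UTF-8');
    // Added guard: keep the whole text when no space follows $length
    return $pos === false ? $text : mb_substr($text, 0, $pos, 'UTF-8');
}
```

With the byte-based functions, $length counts bytes, so accented or non-Latin text would hit the limit earlier than the same number of visible characters.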

jason
All the responses are quality, but you explained why I should go by characters instead of words. Thanks!
JasonBartholme
By the way, you don't address the situation where there is no space in the first `$length` characters, and you can still end up with *much* less data than you intended to display if the only space is in the fourth character or so (see my answer for the solution).
Zarel
Honestly, for what I've used this method for, that would never ever ever be an issue. Good point, though.
jason