I found this code which will match at most 300 chars, then break at the next nearest word-break:

 $var = 'This is a test text 1234567890 test check12.'; # 44 chars
 preg_match('/^.{0,300}(?:.*?)\b/iu', $var, $matches);
 echo $matches[0];

44 is lower than 300, so I expect the output to be the same as $var.

But the output is:

 This is a test text 1234567890 test check12   # 43 chars

$matches[0] is not giving me the dot at the end, even though $var has it. Can anyone tell me how to get the full string (with the dot)?

A: 

In your

(?:.*?)

You should get rid of the * I think. This means that it must match at least once, but up to infinite times. So you will find that your period is in the second match.

To be honest, I would just use the pattern

 preg_match('/^(.){0,300}\b/iu', $var, $matches);
Laykes
'*' means 0 or more. '+' means 1 or more.
thetaiko
+1  A: 

I could get the expected result in either of two ways:

  • Removing the \b
  • Replacing the \b with $

EDIT:

In your pattern, the position just before the dot at the end of the string is a word boundary (between 2 and .), while the end of the string after the dot is not. The \b therefore forces the match to stop before the dot. If you put a .* after the \b, you'll see that it will match the dot.

See this for more info on how word boundaries in regex work.
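
To make the effect visible, here is a small sketch that runs the original pattern and the two variants from the list above against the sample string from the question. The array keys are just illustrative labels, and strlen() is used because the sample is plain ASCII.

```php
<?php
// Compare the match length of the original pattern with the two variants.
$var = 'This is a test text 1234567890 test check12.'; // 44 chars

$patterns = [
    'original'    => '/^.{0,300}(?:.*?)\b/iu', // must end on a word boundary
    'no_boundary' => '/^.{0,300}(?:.*?)/iu',   // \b removed
    'end_anchor'  => '/^.{0,300}(?:.*?)$/iu',  // \b replaced with $
];

$lengths = [];
foreach ($patterns as $label => $pattern) {
    preg_match($pattern, $var, $matches);
    $lengths[$label] = strlen($matches[0]);
    echo $label, ': ', $lengths[$label], " chars\n";
}
// original:    43 chars (backtracks to the boundary between "2" and ".")
// no_boundary: 44 chars
// end_anchor:  44 chars
```

Note that the $ variant only returns the whole string here because the sample is shorter than 300 characters and contains no newline; it no longer stops at a word break for longer input.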

codaddict
+1  A: 
 '/^.{300}(?:.*?)\b|^.{0,300}/u'

I'm not sure why you want this though. Here is my answer to a similar question, but cutting at the previous nearest space.
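
As a rough illustration of how this two-branch pattern behaves, the sketch below wraps it in a helper. The function name truncateAtWordBreak and the $limit parameter are made up for this example, and it assumes the second alternative is meant to be ^.{0,300}. The first branch applies when at least $limit characters are available and extends to the next word boundary; the second branch returns shorter strings unchanged.

```php
<?php
// Hypothetical helper (name and $limit parameter are illustrative only).
// Assumes $limit >= 0 and that $text contains no newlines, since "."
// does not match a newline without the "s" modifier.
function truncateAtWordBreak(string $text, int $limit = 300): string
{
    // Branch 1: exactly $limit chars, then lazily up to the next word boundary.
    // Branch 2: fallback for strings that branch 1 cannot match.
    $pattern = '/^.{' . $limit . '}(?:.*?)\b|^.{0,' . $limit . '}/u';

    if (preg_match($pattern, $text, $matches) === 1) {
        return $matches[0];
    }
    return $text; // e.g. invalid UTF-8: leave the input untouched
}

$var = 'This is a test text 1234567890 test check12.';
echo truncateAtWordBreak($var), "\n";     // whole string, dot included
echo truncateAtWordBreak($var, 12), "\n"; // This is a test
```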

Mark Byers
+2  A: 

Using preg_match to break at 300 chars seems like a bad idea. Why don't you just use:

substr($var, 0, strpos($var, ' ', 300));

For strings that are longer than 300 chars and contain a space after that point, this will give you the first 300 chars broken at the next space without using regular expressions.
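
The one-liner above relies on strpos() finding a space at or after offset 300. With the 44-char string from the question the offset lies outside the string, so depending on the PHP version you get an error or an empty result. A minimal sketch that guards both cases could look like this; the function name truncateAtNextSpace and the $limit parameter are hypothetical, and the functions used are byte-based, so multibyte text would need mb_strpos()/mb_substr() instead.

```php
<?php
// Hypothetical helper (name is illustrative): cut at the first space found
// at or after $limit bytes, and return the input unchanged otherwise.
function truncateAtNextSpace(string $text, int $limit = 300): string
{
    if (strlen($text) <= $limit) {
        return $text; // shorter than the limit: nothing to cut
    }

    $pos = strpos($text, ' ', $limit);

    // No space after the limit means the rest is a single word: keep it all.
    return $pos === false ? $text : substr($text, 0, $pos);
}

$var = 'This is a test text 1234567890 test check12.';
echo truncateAtNextSpace($var), "\n";     // whole string, dot included
echo truncateAtNextSpace($var, 12), "\n"; // This is a test
```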

thetaiko