Edit: OK, I can't read, thanks to Col. Shrapnel for the help. If anyone comes here looking for the same thing to be answered... print_r(preg_split('/([\!|\?|\.|\!\?])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));

Is there any way to split a string on a set of delimiters, and retain the position and character(s) of the delimiter after the split?

For example, using the delimiters !, ?, . and !?, turning this:

$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

into this:

array('Hello', '.', 'A question', '?', 'How strange', '!', 'Maybe even surreal', '!?', 'Who knows', '.');

Currently I'm trying to use print_r(preg_split('/([\!|\?|\.|\!\?])/', $string)); to capture the delimiters as a subpattern, but I'm not having much luck.

A: 

Simply pass the PREG_SPLIT_DELIM_CAPTURE flag to preg_split:

$str = 'Hello. A question? How strange!';
$var = preg_split('/([!?.])/', $str, 0, PREG_SPLIT_DELIM_CAPTURE);
// $var is now equivalent to:
$var = array(
    0 => "Hello",
    1 => ".",
    2 => " A question",
    3 => "?",
    4 => " How strange",
    5 => "!",
    6 => "",
);
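
The trailing empty string in the result above appears because the input ends with a delimiter. A minimal sketch of one way to drop it, assuming you never need empty pieces, is to combine PREG_SPLIT_DELIM_CAPTURE with the PREG_SPLIT_NO_EMPTY flag ($parts is just an illustrative variable name):

```php
<?php
$str = 'Hello. A question? How strange!';

// Combine both flags with a bitwise OR:
// DELIM_CAPTURE keeps the captured delimiters,
// NO_EMPTY discards empty pieces such as the one after the final "!".
$parts = preg_split(
    '/([!?.])/',
    $str,
    -1, // -1 means "no limit"
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

print_r($parts);
```

Note that the leading spaces on " A question" and " How strange" are still present; this flag only removes pieces that are completely empty.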
ircmaxell
A: 

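Your comment sounds like you've found the relevant flag, but your regex was a little off: inside a character class, | is a literal character and \!\? cannot match as a two-character unit, so !? has to be an alternative outside the class, listed before the single characters. I'm going to add this anyway: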

preg_split('/(!\?|[!?.])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

Note that this leaves a leading space on every piece after the first, and an empty string at the end when the input ends with a delimiter, so you'll probably want to run them all through trim() as well.

Results:

$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
print_r(preg_split('/(!\?|[!?.])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE));

Array
(
    [0] => Hello
    [1] => .
    [2] =>  A question
    [3] => ?
    [4] =>  How strange
    [5] => !
    [6] =>  Maybe even surreal
    [7] => !?
    [8] =>  Who knows
    [9] => .
    [10] => 
)
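
The trim() cleanup mentioned above could look roughly like the following. This is only a sketch, not the one true way to do it: it assumes empty pieces should be dropped entirely, and $pieces, $trimmed and $result are illustrative names.

```php
<?php
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

// "!\?" is listed first so that "!?" is captured as one delimiter
// instead of being split into "!" and "?".
$pieces = preg_split('/(!\?|[!?.])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

// Strip the leading space left on every piece after the first.
$trimmed = array_map('trim', $pieces);

// Drop pieces that are empty after trimming (the trailing one here),
// then reindex so the keys run 0..n-1 again.
$result = array_values(array_filter($trimmed, function ($piece) {
    return $piece !== '';
}));

print_r($result);
```

The explicit `!== ''` comparison is deliberate: array_filter() without a callback would also discard a piece consisting of "0".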
Chad Birch
A: 

You can also split on the space after a ., !, ? or !?. This only works if you can guarantee that a space follows every such character.

You can do this by matching a space preceded by a positive lookbehind, (?<=\.|\?|!). No separate alternative is needed for !?, because it ends in ?. This makes the regex:

'/(?<=\.|\?|!) /'

Then check whether each resulting string ends with !?: if so, split off its last two characters as the delimiter; if not, split off its last character.
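
A rough sketch of that two-step approach, under the assumption that every sentence ends with one of the delimiters and is followed by exactly one space ($sentences, $result and $len are illustrative names):

```php
<?php
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

// Step 1: split on a space that directly follows ".", "?" or "!".
// The delimiter stays attached to the end of each sentence.
$sentences = preg_split('/(?<=\.|\?|!) /', $string);

// Step 2: peel the delimiter off the end of each sentence.
$result = [];
foreach ($sentences as $sentence) {
    // "!?" is two characters long; every other delimiter is one.
    $len = (substr($sentence, -2) === '!?') ? 2 : 1;
    $result[] = substr($sentence, 0, -$len); // the text before the delimiter
    $result[] = substr($sentence, -$len);    // the delimiter itself
}

print_r($result);
```

If the input might contain text with no closing delimiter, the last one or two characters would be peeled off regardless, so that case would need an extra check.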

Pindatjuh