Hello, hopefully this is a quick and simple one. Using PHP, I'm trying to split a string into an array, but only at the last instance of whitespace. So far I have...

$str="hello this is     a    space";
$arr=preg_split("/\s+/",$str);
print_r($arr);

Array ( [0] => hello [1] => this [2] => is [3] => a [4] => space )

...which splits by all instances of whitespace.

How can I expand this regular expression to split by only the last instance of whitespace? To become...

Array ( [0] => hello this is     a [1] => space )

Thank you in advance for your help!

+4  A: 

Try:

$arr=preg_split("/\s+(?=\S*+$)/",$str);


Edit

A short explanation:

The '(?= ... )' is called a positive look ahead. For example, 'a(?=b)' will only match a single 'a' if the next character (the one to the right of it) is a 'b'. Note that the 'b' is NOT a part of the match!
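
As a small illustration of the lookahead described above (a sketch only; the sample string and variable names are made up for this example), `preg_match_all` with `PREG_OFFSET_CAPTURE` shows that only the 'a' is matched, and only where a 'b' follows:

```php
<?php
// Positive lookahead: match 'a' only when the next character is 'b'.
$subject = 'ab ac ab';   // hypothetical sample input
preg_match_all('/a(?=b)/', $subject, $m, PREG_OFFSET_CAPTURE);

// Each entry is [matched text, byte offset]; the 'b' is never part of the match.
$lookaheadMatches = $m[0];
foreach ($lookaheadMatches as [$text, $offset]) {
    echo "matched '$text' at offset $offset\n";
}
```

The 'a' in 'ac' is skipped because the character after it is not a 'b'.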

The '\S' is just a short-hand for the character class '[^\s]'. In other words: it matches a single character other than a white space character. The '+' after the '*' makes the quantifier possessive: once '\S*' has matched as many characters as it can, it will not give any of them back through backtracking.

Finally, the '$' denotes the end of the string.

To recap: the complete regex '\s+(?=\S*+$)' would read in plain English as follows:

"match one or more white space characters, but only if they are followed by zero or more characters other than white space characters and then the end of the string".

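Putting the pieces together, here is a minimal sketch using the string from the question. The trailing-whitespace case is an extra check that is not in the question; it is included only to show one behaviour worth knowing about.

```php
<?php
// The string from the question.
$str = "hello this is     a    space";
$arr = preg_split('/\s+(?=\S*+$)/', $str);
print_r($arr);   // two elements: "hello this is     a" and "space"

// Assumed edge case: if the input ends in whitespace, that trailing run is
// itself the "last instance", so the second element comes back empty.
$trailing = preg_split('/\s+(?=\S*+$)/', "hello world  ");

// Trimming the right-hand side first avoids the empty element.
$trimmed = preg_split('/\s+(?=\S*+$)/', rtrim("hello world  "));
```

If trailing whitespace is possible in your input, calling `rtrim` before splitting, as in the last line, is one simple way to deal with it.
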
Bart Kiers
Excellent, thank you, that works perfectly. Just for my understanding, could you explain how the '(?=\S*+$)' part of that expression works, please?
Martin Chatterton
You're welcome Martin. See the 'edit' for an explanation.
Bart Kiers
That's great Bart, thank you. I'd give you more points if I could!
Martin Chatterton
+2  A: 

This should work:

$str="hello this is a  space";

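// (.*?) is lazy, so the whole whitespace run is consumed by \s+ and
// $matches[1] does not end with a trailing space
preg_match('~^(.*?)\s+(\S+)$~', $str, $matches);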
$result = array($matches[1], $matches[2]);

You could do it without a regex:

$parts = array_map('trim', explode(' ', $str));
$result = array(
    // rtrim removes the trailing space left by empty pieces from repeated spaces
    rtrim(implode(' ', array_slice($parts, 0, -1))),
    end($parts)
);

or

$lastSpace = strrpos($str, ' ');
$str1 = trim(substr($str, 0, $lastSpace));
$str2 = trim(substr($str, $lastSpace));
$result = array( $str1, $str2 );
Tom Haigh
Thanks Tom, two clever alternative solutions.
Martin Chatterton
A: 

Are the * and + after \S duplicated? Either /\s+(?=\S+$)/ or /\s+(?=\S*$)/ should be enough, depending on the need.

unigg
Not duplicated - the `*+` is a single quantifier, and `\S*+` can often be more efficient than `\S*` because it never backtracks. Read up on "possessive quantifiers" for more details.
Peter Boughton
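
To make the difference between `\S*` and `\S*+` visible, here is a small sketch. The patterns and the test word are chosen only for illustration and are not from the thread. A greedy quantifier gives characters back when the rest of the pattern needs them; a possessive one does not.

```php
<?php
// Greedy: \S* first takes all of "space", then gives back the final 'e'
// so that the literal 'e' in the pattern can match.
$greedy = preg_match('/^\S*e$/', 'space');

// Possessive: \S*+ takes all of "space" and never gives anything back,
// so the literal 'e' has nothing left to match and the pattern fails.
$possessive = preg_match('/^\S*+e$/', 'space');

var_dump($greedy, $possessive);   // int(1) and int(0)
```

In the split pattern from the accepted answer, `\S*+` is followed directly by `$`, so it selects the same split point as `\S*` would; the possessive form simply lets a failed attempt give up sooner.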