I am trying to parse a list of operating system instances with their unique identifiers. I am looking for a way to parse a text string and store the values in two variables. The string to be parsed is as follows:

"Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}
A: 

This gets the two strings as first and second elements of a string array using str_replace and preg_split:

$s = "\"Ubuntu 9.10\" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}";
$s = str_replace("\" ", "|", $s); // replace the closing quotation mark and the space after it with a delimiter
$s = str_replace("\"", "", $s); // remove quotation marks
$vars = preg_split('/\|/', $s); // split by delimiter
print_r($vars);
JYelton
`explode()` is faster than `preg_split()` given a single-character delimiter.
Adam Backstrom
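For reference, a minimal sketch of what the `explode()` suggestion might look like. The variable names `$name` and `$id` are illustrative, and stripping the braces from the identifier is an assumption about what the asker wants; the answer above leaves them in place.

```php
<?php
// Sketch only: explode() in place of preg_split(), as the comment suggests.
$s = '"Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}';

// Split once on the closing quote + space; the limit of 2 stops after the first split.
list($name, $id) = explode('" ', $s, 2);

$name = ltrim($name, '"'); // drop the leading quotation mark
$id   = trim($id, '{}');   // drop the surrounding braces (assumed to be unwanted)

echo $name, "\n", $id, "\n";
```

This also avoids the intermediate `|` delimiter, so a name that happens to contain `|` would not be split in the wrong place.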
+1  A: 

You can use regular expressions to match the groups you need:

$str = '"Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}';
preg_match('/^"(.*)" {(.*)}$/', $str, $matches);

You can make the regular expression narrower based on the values (e.g. the second .* could be [0-9a-f-]+), but the pattern above is sufficient. $matches[1] will be "Ubuntu 9.10", and $matches[2] will be "40f2324d-a6b2-44e4-90c3-0c5fa82c987d".

Michael Mrozek
+1 for use of regex capture groups; though the second `.*` would need to be `[0-9a-f\-]`, no? :)
JYelton
@JYelton Ah, yes, although the escape isn't technically necessary. Thanks, fixed
Michael Mrozek
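A sketch of the narrower pattern discussed in these comments, assuming identifiers only ever contain lowercase hex digits and hyphens. The `[^"]*` for the name and the escaped braces are additions for illustration, not part of the answer above, and `$os` / `$ident` are just placeholder names.

```php
<?php
$str = '"Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}';

// [^"]* keeps the name from running past its closing quote;
// [0-9a-f-]+ restricts the identifier to lowercase hex digits and hyphens.
// The braces are escaped for readability.
if (preg_match('/^"([^"]*)" \{([0-9a-f-]+)\}$/', $str, $matches)) {
    $os    = $matches[1];
    $ident = $matches[2];
    echo $os, "\n", $ident, "\n";
} else {
    echo "no match\n";
}
```

Checking the return value of preg_match() means a malformed line is reported instead of silently leaving `$matches` empty.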
+2  A: 

I've been looking for an excuse to read the docs for sscanf():

sscanf($s, '"%[^"]" {%[^}]}', $os, $ident);
echo $os, "<br>", $ident;

Followup: For interest's sake, out of the three answers currently on this question:

sscanf: 0.92999792098999 seconds
preg_match: 4.73761510849 seconds
str_replace x2 + preg_split: 3.7644839286804 seconds

Benchmark here. Funny that two str_replace() and a preg_split() are faster than the preg_match().
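
The linked benchmark script is not reproduced in this text. The following is only a rough sketch of how such a comparison might be timed, not the script that produced the numbers above; the `time_it()` helper and the iteration count are hypothetical, and timings will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return elapsed seconds.
function time_it(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$s = '"Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d}';
$n = 100000; // illustrative iteration count, not taken from the original benchmark

$results = [
    'sscanf' => time_it(function () use ($s) {
        sscanf($s, '"%[^"]" {%[^}]}', $os, $ident);
    }, $n),
    'preg_match' => time_it(function () use ($s) {
        preg_match('/^"(.*)" {(.*)}$/', $s, $matches);
    }, $n),
    'str_replace x2 + preg_split' => time_it(function () use ($s) {
        $t = str_replace("\" ", "|", $s);
        $t = str_replace("\"", "", $t);
        $vars = preg_split('/\|/', $t);
    }, $n),
];

foreach ($results as $label => $seconds) {
    printf("%s: %.4f seconds\n", $label, $seconds);
}
```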

Adam Backstrom
Nice; I didn't even know PHP had `sscanf()`
Michael Mrozek
I tend to use printf() and sprintf() a lot, but I had yet to use sscanf(). It's a bit clunky, working around the default behavior of breaking at whitespace.
Adam Backstrom
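To illustrate the whitespace behavior mentioned in the last comment, here is a small sketch comparing `%s` with the `%[^"]` character set used in the answer. The variable names are illustrative only.

```php
<?php
// %s stops at the first whitespace character, so only the first word is captured.
sscanf('Ubuntu 9.10', '%s', $word);

// %[^"] reads everything up to the next double quote, spaces included.
sscanf('"Ubuntu 9.10"', '"%[^"]"', $full);

echo $word, "\n"; // Ubuntu
echo $full, "\n"; // Ubuntu 9.10
```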