I have a function which accepts a string parameter such as: "var1=val1 var2=val2 var3='a list of vals'";

I need to parse this string and pick out the var/val combinations. That is easy enough until something like var3='a list of vals' is introduced. Obviously I can't explode the string into an array using whitespace as the delimiter, which has me kind of stuck. I want to create an array from this string with the var/val pairs properly assigned. How can I do this when I have something like var3?

A: 

Okay, you can't change it. I would use an algorithm like this:

1) Replace each string contained inside quotes with a unique ID, and store the ID and the original string in an array.

So

var1=val1 var2=val2 var3='a list of vals'

becomes

var1=val1 var2=val2 var3=asifab

array("asifab" => 'a list of vals')

2) Split by spaces

array("var1=val1", "var2=val2", "var3=asifab")

array("asifab" => 'a list of vals')

3) split by equal signs

array("var1"=>"val1", "var2"=>"val2", "var3"=>"asifab")

array("asifab" => 'a list of vals')

4) For each value, see if it's a key in your ID array, and if it is, split the stored string by spaces and use that as the value

array("var1"=>"val1", "var2"=>"val2", "var3"=>array("a", "list", "of", "vals"))
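The four steps above can be sketched roughly as follows. This is only an illustration, not the answerer's own code: the function name parseWithPlaceholders and the NUL-delimited placeholder format are assumptions made for the example, and the closure requires PHP 5.3 or later.

```php
<?php
// Hypothetical helper: name and placeholder format are illustrative only.
function parseWithPlaceholders($input) {
    $stash = array();

    // Step 1: replace each quoted string with a placeholder that contains
    // no spaces, remembering the original text under that placeholder.
    $masked = preg_replace_callback(
        "/'([^']*)'/",
        function ($m) use (&$stash) {
            $id = "\x00" . count($stash) . "\x00";
            $stash[$id] = $m[1];
            return $id;
        },
        $input
    );

    $vars = array();
    // Step 2: split the masked string on whitespace.
    foreach (preg_split('/\s+/', trim($masked), -1, PREG_SPLIT_NO_EMPTY) as $pair) {
        // Step 3: split each piece on the first equals sign.
        $parts = explode('=', $pair, 2);
        if (count($parts) !== 2) {
            continue; // not a var=val pair, skip it
        }
        list($name, $value) = $parts;

        // Step 4: if the value is a placeholder, restore the stored string
        // and split it on whitespace.
        if (isset($stash[$value])) {
            $value = preg_split('/\s+/', $stash[$value], -1, PREG_SPLIT_NO_EMPTY);
        }
        $vars[$name] = $value;
    }
    return $vars;
}

print_r(parseWithPlaceholders("var1=val1 var2=val2 var3='a list of vals'"));
```

Because the quoted text is masked before any splitting happens, spaces and equals signs inside the quotes never reach steps 2 and 3. Escaped quotes inside a value are not handled by this sketch.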

Tom Ritter
Unfortunately yes, I am stuck working within this structure. The function prototype requires that the list of var/vals be passed as a single string.
Nicholas Kreidberg
$argument in this case is a string though so I can't traverse it with foreach().
Nicholas Kreidberg
A: 

Use RegEx with preg_split()?

I'm not great with RE, but I'm sure you can use this to prevent splitting the string inside the single quotes.
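One way this might look, as a sketch only: the lookahead below allows a split on whitespace only when an even number of single quotes remains to the right, which means the whitespace is outside any quoted value. The function names splitOutsideQuotes and parsePairs are made up for the example, and the approach assumes quotes are balanced and never escaped.

```php
<?php
// Hypothetical helper: split on whitespace that is not inside single quotes.
function splitOutsideQuotes($input) {
    // The lookahead succeeds only if the rest of the string contains an
    // even number of single quotes, i.e. we are not inside a quoted value.
    return preg_split("/\s+(?=(?:[^']*'[^']*')*[^']*$)/", trim($input));
}

// Hypothetical helper: turn the pieces into a name => value array.
function parsePairs($input) {
    $vars = array();
    foreach (splitOutsideQuotes($input) as $pair) {
        // Limit of 2 keeps any "=" inside the value intact.
        $parts = explode('=', $pair, 2);
        if (count($parts) === 2) {
            $vars[$parts[0]] = trim($parts[1], "'"); // drop surrounding quotes
        }
    }
    return $vars;
}

print_r(parsePairs("var1=val1 var2=val2 var3='a list of vals'"));
```

The lookahead rescans the remainder of the string at every candidate split point, so this is best suited to short inputs.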

Austin Hyde
+1  A: 

If the format of the string is set in stone, you could do something like:

$string = "var1=val1 var2=val2 var3='this is a test'";

$vars = array();
$i = 0;
while ($i < strlen($string)) {

    $eqIndex = strpos($string, "=", $i);
    // trim() drops the space left over when the previous value was quoted
    $varName = trim(substr($string, $i, $eqIndex - $i));

    $i = $eqIndex + 1;

    if ($string[$i] == "'") 
    {
        $varEndIndex = strpos($string, "'", ++$i);
    }
    else
    {
        $varEndIndex = strpos($string, " ", $i);
        if ($varEndIndex === FALSE) $varEndIndex = strlen($string);
    }

    $varValue = substr($string, $i, $varEndIndex - $i);

    $vars[$varName] = $varValue;

    $i = $varEndIndex + 1;
}

print_r($vars);

EDIT:

More robust function that handles escaped chars in the quoted values:

function getVarNameEnd($string, $offset) {

    $len = strlen($string);
    $i = $offset;
    while ($i < $len) {

        if ($string[$i] == "=")
            return $i;
        $i++;
    }

    return $len;
}

function getValueEnd($string, $offset) {

    $len = strlen($string);
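    $i = $offset;
    $quotedValue = false; // initialize so unquoted values never read an undefined variable
    if ($i < $len && $string[$i] == "'") {
        $quotedValue = true;
        $i++;
    }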
    while ($i < $len) {

        if ($string[$i] == "\\" && $quotedValue)
            $i++;
        else if ($string[$i] == "'" && $quotedValue)
            return $i + 1;
        else if ($string[$i] == " " && !$quotedValue)
            return $i;
        $i++;
    }

    return $len;
}

function getVars($string) {

    $i = 0;
    $len = strlen($string);
    $vars = array();
    while ($i < $len) {

        $varEndIndex = getVarNameEnd($string, $i);
        $name = substr($string, $i, $varEndIndex - $i);
        $i = $varEndIndex + 1;

        $valEndIndex = getValueEnd($string, $i);
        $value = substr($string, $i, $valEndIndex - $i);
        $i = $valEndIndex + 1;

        $vars[$name] = $value;
    }

    return $vars;
}

$v = getVars("var1=var1 var2='this is a test' var3='this has an escaped \' in it' var4=lastval");
print_r($v);
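As far as I can tell from tracing getVars(), quoted values come back with their surrounding single quotes and escape backslashes still in place. If the bare text is wanted, a small post-processing step could look like the sketch below; unquoteValue is a hypothetical helper, not part of the answer above.

```php
<?php
// Hypothetical helper: strip surrounding single quotes and unescape the
// contents; values that are not quoted are returned unchanged.
function unquoteValue($value) {
    if (strlen($value) >= 2 && $value[0] === "'" && substr($value, -1) === "'") {
        // stripslashes() turns \' back into ' and \\ back into \
        return stripslashes(substr($value, 1, -1));
    }
    return $value;
}

echo unquoteValue("'this has an escaped \\' in it'"), "\n";
echo unquoteValue("lastval"), "\n";
```

It could be applied to the whole result with array_map('unquoteValue', $v).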
Matt Bridges
What if '=' is part of the value in single quotes?
Artem Russakovskii
If you use query string syntax, single quotes and equal signs are escaped using the % character.
Matt Bridges
I added a solution for the type of string you provided.
Matt Bridges
This is working great for the example I gave, doing what I can now to try and "break" it :P -- THANKS.
Nicholas Kreidberg
This solution won't work when there are escaped single-quotes in the values. Working on a better function for you
Matt Bridges
Excellent! How can I check up front (before parsing) to ensure that if one single quote exists that it has another? Obviously they should always be in pairs.
Nicholas Kreidberg
Only having one problem at this juncture -- I can't find a test to catch the condition where:"var1=val1 var2=this is a test" <-- notice the complete omission of single quotes.
Nicholas Kreidberg
+1  A: 

This is traditionally why query strings use & as the delimiter and not spaces.

If you can do that, then just use parse_str to get the data out.

If not, you'll need to do regex:

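// PHP's PCRE has no /g modifier (preg_match_all already finds every match),
// and the pattern is double-quoted so the single quotes inside it are legal.
preg_match_all("/(\S*)=('.*?'|\S*)/", $your_string, $matches);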
print_r($matches);
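To go from the raw matches to a name => value array, something along these lines may work. It is a sketch with a slightly altered pattern: the name part is [^\s=]+ rather than \S* so that an equals sign inside a quoted value is not mistaken for the separator. The function name parseWithRegex is made up for the example, and escaped quotes inside values are not handled.

```php
<?php
// Hypothetical helper built on preg_match_all with PREG_SET_ORDER.
function parseWithRegex($input) {
    $vars = array();
    // Group 1: the name (no whitespace, no "=").
    // Group 2: either a single-quoted run or a run of non-whitespace.
    preg_match_all("/([^\s=]+)=('[^']*'|\S*)/", $input, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $vars[$m[1]] = trim($m[2], "'"); // drop surrounding quotes
    }
    return $vars;
}

print_r(parseWithRegex("var1=val1 var2=val2 var3='a list of vals'"));
```

Note that an unquoted multi-word value such as var2=this is a test yields only "this" for var2; the remaining words contain no "=" and are silently skipped, so that case needs a separate check if it must be rejected.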
UltimateBrent
The immediate problem with your current code is .* is greedy - it will match to the last ', which is too much.
Artem Russakovskii
Yes make it something like (\S*)=('.*?'|\S*)..
merkuro
Yeah, just did the edit when I saw your comment, merk. Not sure why I'm rated down now, though...
UltimateBrent
A: 

You can use a regular expression to find all matching var=val pairs, such as

(\w[0-9A-Za-z]+)=(\'?\w([0-9A-Za-z ]|\\\'|\\=)+\'?)

then you can use preg_match_all to parse them from there, if the string of the second group starts with a ' character you can parse the list.

Mathew Hall
Your current version won't parse values with spaces (\w doesn't include spaces). Also, what happens with escaped single quotes which are part of the value?
Artem Russakovskii
Ah yes good points, fixed as needed.
Mathew Hall
A: 

I'm afraid this problem cannot be solved with simple regex or by simple splitting. Have a look at the str_getcsv() function in PHP 5.3. I think you can make it do exactly what you want.

array str_getcsv  ( string $input  [, string $delimiter  [, string $enclosure  [, string $escape  ]]] )

You can specify delimiter as space instead of comma and enclosure as single quote instead of double quote. If you can, dig up the implementation of this function, understand it, and learn from it. Otherwise get PHP 5.3 to use it.

Edit: There, if you don't have PHP 5.3:

if(!function_exists('str_getcsv')) {
    function str_getcsv($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
        $fp = fopen("php://memory", 'r+');
        fputs($fp, $input);
        rewind($fp);
        $data = fgetcsv($fp, null, $delimiter, $enclosure); // $escape only got added in 5.3.0
        fclose($fp);
        return $data;
    }
}

Credit: http://www.electrictoolbox.com/php-str-getcsv-function/

Edit: Here's the implementation in Perl: http://search.cpan.org/~makamaka/Text-CSV-1.12/lib/Text/CSV.pm. You can download the source and see the algorithms. If you are up for it :)

Artem Russakovskii
That function does appear to do what I need to do however we haven't upgraded to 5.3 yet... *Sigh*
Nicholas Kreidberg
Then find its definition and copy it :)
Artem Russakovskii
This is a LOT of work for something you can easily do with regex. You also might not have the permissions necessary to run that function depending on the environment.
UltimateBrent
Unfortunately this function doesn't parse out the single-quoted strings properly. str_getcsv("var1=test1 var2='testing 1 2 3...'", " ", "'"); results in an array with 5 members.
Nicholas Kreidberg
A: 

Haven't given the whole thing much thought, but what about this? Maybe A LITTLE too much code for such a small task :)

<?php
  function parse_vars($string)
  {
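    $exploded = explode(" ", $string);
    $return = array();
    $current = null; // key of the most recent var=val pair
    foreach($exploded AS $entry){
      if(strpos($entry, "=") === false){
        // no "=": treat the piece as a continuation of the previous value
        if($current !== null) $return[$current] .= " ".$entry;
      }else{
        // limit of 2 keeps any further "=" inside the value
        list($key, $value) = explode("=", $entry, 2);
        $return[$key] = $value;
        $current = $key;
      }
    }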
    return $return;
  }

  $string = "var1=val1 var2=val2 var3='a list of vals'";
  print_r(parse_vars($string));
  die();
?>

By the way I still prefer the regex solution with "(\S*)=('.*?'|\S*)" ...

merkuro
A: 

Perhaps you want the parse_str() function?

Here's the example from PHP.net:

<?php
$str = "first=value&arr[]=foo+bar&arr[]=baz";
parse_str($str); // one-argument form: deprecated in PHP 7.2, removed in PHP 8.0
echo $first;  // value
echo $arr[0]; // foo bar
echo $arr[1]; // baz

parse_str($str, $output);
echo $output['first'];  // value
echo $output['arr'][0]; // foo bar
echo $output['arr'][1]; // baz

?>

It seems to do exactly what you're looking for.

redwall_hp