I have a really long string in a certain pattern such as userAccountName: abc userCompany: xyz userEmail: [email protected] userAddress1: userAddress2: userAddress3: userTown: .....and so on. This pattern repeats.

I need to process this string so that I can get at the values of userAccountName:, userCompany:, etc., preferably in an associative array or a similarly convenient format.

Is there an easy way to do this or will I have to write my own logic to split this string up into different parts?

+2  A: 

A simple regular expression such as userAccountName:\s*(\w+)\s+ can capture each value; you can then use the captured matches to build a data structure.
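
As a minimal sketch of that idea, the snippet below runs one such pattern per known key and collects the captures. The sample string and the list of keys are assumptions made up for illustration. Because \w+ only matches word characters, empty values and values such as e-mail addresses will not be captured by this pattern.

```php
<?php
// Hypothetical sample input with two repetitions of the pattern.
// The trailing space matters: the pattern ends in \s+.
$raw = 'userAccountName: abc userCompany: xyz userAccountName: def userCompany: uvw ';

// Keys we expect to find (an assumption for this sketch).
$keys = ['userAccountName', 'userCompany'];

$columns = [];
foreach ($keys as $key) {
    // Build e.g. /userAccountName:\s*(\w+)\s+/ and collect every capture.
    preg_match_all('/' . preg_quote($key, '/') . ':\s*(\w+)\s+/', $raw, $m);
    $columns[$key] = $m[1]; // one entry per occurrence of the key
}

print_r($columns);
```

Each key ends up with a list of its values in order of appearance, so the n-th entries of the lists belong to the n-th record as long as no value was skipped.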

Alan Haggai Alavi
+2  A: 

If you can arrange for the data to be formatted like a URL query string (i.e., var=data&var2=data2), then you could use parse_str, which does almost exactly what you want, I think. Some straightforward mangling of your input data would get it into that form.
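
One possible way to do that mangling, as a rough sketch: rewrite every "key: " label into "&key=" and hand the result to parse_str. The sample string is an assumption. The sketch also assumes that values contain no &, + or % characters and no word followed by a colon, and that it is applied to one record at a time, since a repeated key would overwrite the earlier value.

```php
<?php
// Hypothetical single record.
$raw = 'userAccountName: abc userCompany: xyz userTown: Leeds';

// Turn each "key: " label (with surrounding whitespace) into "&key=".
$query = preg_replace('/\s*(\w+):\s*/', '&$1=', $raw);

// Drop the leading "&" left by the first label.
$query = ltrim($query, '&');

// parse_str fills $fields with key => value pairs.
parse_str($query, $fields);

print_r($fields);
```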

McPherrinM
+1  A: 

You might have to use regex or your own logic.

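Are you guaranteed that a word followed by ": " never appears within the values themselves? If so, you could split the string into an array of alternating keys and values. A plain explode on ': ' is not enough, because each piece would hold a value followed by the next key; preg_split with PREG_SPLIT_DELIM_CAPTURE can split on the "key:" labels while keeping the key names. You'd then have to walk through this array and format it the way you want. Here's a rough (probably inefficient) example I threw together quickly:

<?php
// Sample input; replace with your real data.
$dataString = 'userAccountName: abc userCompany: xyz userAccountName: def userCompany: uvw';
// Splitting on each "key:" label and capturing the key name yields
// ['', key, value, key, value, ...]; drop the empty leading element.
$keysAndValuesArray = array_slice(
    preg_split('/\s*(\w+):\s*/', $dataString, -1, PREG_SPLIT_DELIM_CAPTURE), 1);
$firstKeyName = 'userAccountName';
$associativeDataArray = array();
$currentIndex = -1;
$numItems = count($keysAndValuesArray);
for ($i = 0; $i < $numItems; $i += 2) {
    // The first key of the pattern marks the start of a new record.
    if ($keysAndValuesArray[$i] == $firstKeyName) {
         $associativeDataArray[] = array();
         ++$currentIndex;
    }
    $associativeDataArray[$currentIndex][$keysAndValuesArray[$i]] = $keysAndValuesArray[$i+1];
}

var_dump($associativeDataArray);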
PCheese
+1  A: 

If you can write a regexp (for my example I'm assuming there are no colons in the values), you can parse it with preg_split or preg_match_all like this:

<?php

  $raw_data = "userAccountName: abc userCompany: xyz";
  $raw_data .= " userEmail: [email protected] userAddress1: userAddress2: ";

  $data = array();
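  // The optional value group first grabs everything up to the next colon,
  // then backtracks to the last whitespace (or the end of the string), so
  // the next label is left for the following match.
  // For an empty value the group does not participate and $item[2] is unset.
  if (preg_match_all('/([a-z0-9_]+):\s+([^:]*(?:\s+|$))?/i', $raw_data,
                     $items, PREG_SET_ORDER)) {
    foreach ($items as $item) {
      $data[$item[1]] = isset($item[2]) ? trim($item[2]) : '';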
    }
    print_r($data);
  }

?>

If that's not the case, please describe the grammar of your string in a bit more detail.

drdaeman
Won't [^:]+ grab too much of any following "label"?
PatrikAkerstrand
Yes, sorry, didn't notice it. I'll edit the post with more appropriate solution (using preg_match_all, not preg_split). Have to think more and recheck everything before posting...
drdaeman
A: 

PCRE is built into PHP and can handle this with a regexp like:

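if ($c = preg_match_all('/userAccountName: (?<userAccountName>\w+) userCompany: (?<userCompany>\w+) userEmail: /', $txt, $matches))
{
  // Named groups use the (?<name>...) syntax. With preg_match_all, each
  // named entry is an array holding one captured value per match.
  $userAccountName = $matches['userAccountName'];
  $userCompany = $matches['userCompany'];
  // and so on...
}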

The hardest part is getting the right regexp for your needs. You can have a look at http://txt2re.com for some help.

eric espie
A: 

If I were you, I would try to convert the string to JSON format with a regexp.

Then simply decode it as JSON (for example with json_decode).

DaNieL
A: 

I think the solution closest to what I was looking for, I found at http://www.justin-cook.com/wp/2006/03/31/php-parse-a-string-between-two-strings/. I hope this proves useful to someone else. Thanks everyone for all the suggested solutions.
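
For readers who cannot follow the link, the general idea of pulling out the text between two known markers might look like the sketch below. This is not the code from the linked page; the helper name stringBetween and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: returns the trimmed text between $start and $end,
// or null if either marker is missing.
function stringBetween($haystack, $start, $end)
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);            // skip past the start marker
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return trim(substr($haystack, $from, $to - $from));
}

// Assumed sample input.
$raw = 'userAccountName: abc userCompany: xyz userEmail: ';

$name    = stringBetween($raw, 'userAccountName:', 'userCompany:');
$company = stringBetween($raw, 'userCompany:', 'userEmail:');
```

This only finds the first occurrence of each marker, so for a repeating pattern the string would first have to be cut into individual records.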

Gaurav