I want to make a class for parsing flat-file database information into one large, analogous multidimensional array. I had the idea of formatting the database in a sort of Python-esque format, as follows:

"tree #1":
    "key" "value"
    "sub-tree #1":
     "key" "value"
     "key #2" "value"
     "key #3" "value"

I am trying to make it parse this and build an array as it goes, throwing the keys/values into it, and I want it to be very dynamic and expandable. I've tried many different techniques and have been stumped in each attempt. This is my most recent:

function parse($file=null) {
 $file = $file ? $file : $this->dbfile;

 ### character variables

 # get values of 
 $src = file_get_contents($file);
 # current character number
 $p = 0;

 ### array variables

 # temp shit
 $a = array();
 # set $ln keys
 $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
 # indent level
 $ilvl = 0;

 ### go time

 while (strlen($src) > $p) {
  $chr = $src[$p];
  # quote
  if ($chr == "\"") {
   if ($ln["q"] == 1) { // quote open?
    $ln["q"] = 0; // close it
    if (!$ln["k"]) { // key yet?
     $ln["k"] = $ln["s"]; // set key
     $ln["s"] = null;
     $a[$ln["k"]] = $ln["v"]; // write to current array
    } else { // value time
     $ln["v"] = $ln["s"]; // set value
     $ln["s"] = null;
    }
   } else {
    $ln["q"] = 1; // open quote
   }
  }

  elseif ($chr == "\n" && $ln["q"] == 0) {
   $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
   $llvl = $ilvl;

  }
  # beginning of subset
  elseif ($chr == ":" && $ln["q"] == 0) {
   $ilvl++;
   if (!array_key_exists($ilvl,$a)) { $a[$ilvl] = array(); }
   $a[$ilvl][$ln["k"]] = array("@mbdb-parent"=> $ilvl-1 .":".$ln["k"]);
   $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
   $this->debug("INDENT++",$ilvl);
  }
  # end of subset
  elseif ($chr == "}") {
   $ilvl--;
   $this->debug("INDENT--",$ilvl);
  }
  # other characters
  else {
   if ($ln["q"] == 1) {
    $ln["s"] .= $chr;
   } else {
    # error
   }
  }
  $p++;
 }
 var_dump($a);
}

I honestly have no idea where to go from here. The thing troubling me most is setting the multidimensional values like $this->c["main"]["sub"]["etc"] the way I have it here. Can it even be done? How can I actually nest the arrays as the data is nested in the db file?

A: 

If you just want to load and save an array to a file ...

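// Build a PHP file that returns the array when included
$content = "<?php\nreturn " . var_export($array, true) . ';';
// Save it to configfile.php
file_put_contents('configfile.php', $content);
// Load the array back
$array = include 'configfile.php';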
OIS
A: 

Well, you could use serialize and unserialize, but that would be no fun, right? You should be using formats specifically designed for this purpose, but for the sake of the exercise, I'll try and see what I can come up with.

There seem to be two kinds of data in your flat file: key-value pairs and arrays. Key-value pairs are denoted by two quoted strings on a line, and arrays by one quoted string followed by a colon. As you go through the file, you must parse each row and determine what it represents. That's easy with regular expressions. The hard part is keeping track of the current nesting level and acting accordingly. Here's a function that parses the tree you provided:

function parse_flatfile($filename) {
    $file = file($filename);

    $result = array();
    $open = false;
    foreach($file as $row) {
        $level = strlen($row) - strlen(ltrim($row));
        $row = rtrim($row);
        // Regular expression to catch key-value pairs
        $isKeyValue = preg_match('/"(.*?)" "(.*?)"$/', $row, $match);        
        if($isKeyValue == 1) {
            if($open && $open['level'] < $level) {
                $open['item'][$match[1]] = $match[2];
            } else {
                $open = array('level' => $level - 1, 'item' => &$open['parent']);                
                if($open) {
                    $open['item'][$match[1]] = $match[2];
                } else {
                    $result[$match[1]] = $match[2];
                }
            }
        // Regular expression to catch arrays
        } elseif(($isArray = preg_match('/"(.*?)":$/', $row, $match)) > 0) {
            if($open && $open['level'] < $level) {
                $open['item'][$match[1]] = array();
                $open = array('level' => $level, 'item' => &$open['item'][$match[1]], 'parent' => &$open['item']);
            } else {
                $result[$match[1]] = array();
                $open = array('level' => $level, 'item' => &$result[$match[1]], 'parent' => false);
            }
        }    
    }    
    return $result;
}

I won't go into greater detail on how that works, but in short, as we progress deeper into the array, the array currently being filled and its parent are stored as references in $open, and so on. Here's a more complex tree using your notation:

"tree_1":
    "key" "value"
    "sub_tree_1":
        "key" "value"
        "key_2" "value"
        "key_3" "value"
    "key_4" "value"
    "key_5" "value"
"tree_2":
   "key_6" "value"
    "sub_tree_2":
        "sub_tree_3":
            "sub_tree_4":
                "key_6" "value"
                "key_7" "value"
                "key_8" "value"
                "key_9" "value"
                "key_10" "value"

And to parse that file you could use:

$result = parse_flatfile('flat.txt');
print_r($result);

And that would output:

Array
(
[tree_1] => Array
 (
 [key] => value
 [sub_tree_1] => Array
  (
  [key] => value
  [key_2] => value
  [key_3] => value
  )    
 [key_4] => value
 [key_5] => value
 )    
[tree_2] => Array
 (
 [key_6] => value
 [sub_tree_2] => Array
  (
  [sub_tree_3] => Array
   (
   [sub_tree_4] => Array
    (
    [key_6] => value
    [key_7] => value
    [key_8] => value
    [key_9] => value
    [key_10] => value
    )    
   )    
  )    
 )    
)

I guess my test file covers all the bases, and it should work without breaking. But I won't give any guarantees.
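One case the test file does not exercise is a line that steps back more than one indentation level at once. Since $open keeps only a single parent reference, such a line may end up attached to the wrong array. A minimal sketch of an alternative, assuming PHP 7+ and the same notation, keeps a stack of references to every open array instead. The name parse_flat_lines() is hypothetical and not part of the function above:

```php
<?php
// Sketch: indentation-based parser that keeps a stack of open arrays.
// parse_flat_lines() is a hypothetical name chosen for this example.
function parse_flat_lines(array $lines): array
{
    $result = [];
    // Each stack entry holds the indent of an open array and a reference to it.
    // Indent -1 marks the root, so every real line is deeper than the root.
    $stack = [['indent' => -1, 'node' => &$result]];

    foreach ($lines as $line) {
        $row = rtrim($line);
        if ($row === '') {
            continue; // skip blank lines
        }
        $indent = strlen($row) - strlen(ltrim($row));

        // Close every open array that is indented at least as deep as this line.
        while ($stack[count($stack) - 1]['indent'] >= $indent) {
            array_pop($stack);
        }
        $parent = &$stack[count($stack) - 1]['node'];

        if (preg_match('/^\s*"(.*?)" "(.*?)"$/', $row, $m)) {
            // Key-value pair: store it in the innermost open array.
            $parent[$m[1]] = $m[2];
        } elseif (preg_match('/^\s*"(.*?)":$/', $row, $m)) {
            // Array header: create the sub-array and make it the innermost one.
            $parent[$m[1]] = [];
            $stack[] = ['indent' => $indent, 'node' => &$parent[$m[1]]];
        }
        unset($parent); // drop the reference before the next iteration
    }
    return $result;
}
```

It could be called as parse_flat_lines(file('flat.txt')). Lines that match neither pattern are silently ignored in this sketch, which is a design shortcut rather than a recommendation.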

Transforming a multidimensional array to flatfile using this notation will be left as an exercise to the reader :)
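For anyone who wants a starting point for that exercise, here is a rough sketch. The name to_flatfile() is hypothetical, four spaces per level is an assumption, and quotes inside keys or values are not escaped, so treat it as an outline only:

```php
<?php
// Sketch: turn a nested array back into the quoted, indented notation.
// to_flatfile() is a hypothetical name; four spaces per level is an assumption.
function to_flatfile(array $data, int $depth = 0): string
{
    $out = '';
    $pad = str_repeat('    ', $depth);
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Array: write the header line, then recurse one level deeper.
            $out .= $pad . '"' . $key . '":' . "\n";
            $out .= to_flatfile($value, $depth + 1);
        } else {
            // Scalar: write a key-value line. No escaping is done here.
            $out .= $pad . '"' . $key . '" "' . $value . '"' . "\n";
        }
    }
    return $out;
}
```

Saving would then be something like file_put_contents('flat.txt', to_flatfile($result)).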

Tatu Ulmanen
Thanks! I actually had no idea what ampersands did before variables, and that's a huge help for what I am trying to accomplish. And yes, I know what I am trying to accomplish is impractical. Transforming that array into a flat file should be cake! :)
ADFDSADSA
A: 

This is all going to depend on how human-readable you want your "flat file" to be.

Want human-readable?

  • XML
  • YAML

Semi-human-readable?

  • JSON

Not really human-readable?

  • Serialized PHP (also PHP-only)
  • MySQL dump

Writing your own format is going to be painful. Unless you want to do this purely for the academic experience, I say don't bother.

Looks like JSON might be a happy medium for you.

$configData = array(
    'tree #1' => array(
        'key'         => 'value'
      , 'sub-tree #1' => array(
          'key'    => 'value'
        , 'key #2' => 'value'
        , 'key #3' => 'value'
      )
  )
);

//  Save config data
file_put_contents( 'path/to/config.json', json_format( json_encode( $configData ) ) );

//  Load it back out
$configData = json_decode( file_get_contents( 'path/to/config.json' ), true );

//  Change something
$configData['tree #1']['sub-tree #1']['key #2'] = 'foo';

//  Re-Save (same as above)
file_put_contents( 'path/to/config.json', json_format( json_encode( $configData ) ) );

You can get the json_format() function here, which just pretty-formats for easier human-reading. If you don't care about human-readability, you can skip it.
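Note that json_format() is not a built-in PHP function. If you are on PHP 5.4 or later, the built-in JSON_PRETTY_PRINT flag of json_encode() gives similar output without an extra helper. A minimal sketch, assuming PHP 7.1+ for the type declarations; save_config() and load_config() are hypothetical helper names:

```php
<?php
// Sketch: save and reload nested config data with built-in pretty-printing.
// save_config() and load_config() are hypothetical helper names.
function save_config(string $path, array $data): void
{
    // JSON_PRETTY_PRINT adds newlines and indentation to the encoded output.
    $json = json_encode($data, JSON_PRETTY_PRINT);
    if ($json === false) {
        throw new RuntimeException('Could not encode config: ' . json_last_error_msg());
    }
    file_put_contents($path, $json);
}

function load_config(string $path): array
{
    // Passing true as the second argument yields associative arrays, not objects.
    $data = json_decode(file_get_contents($path), true);
    if (!is_array($data)) {
        throw new RuntimeException('Could not decode config file: ' . $path);
    }
    return $data;
}
```

Usage mirrors the snippet above: call save_config('path/to/config.json', $configData) to write, and load_config('path/to/config.json') to read it back.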

Peter Bailey