Hey folks! I need some help with regular expressions. I need to extract virtual keys from a URL, as some sort of router in an application. Here are the params:

Rule: /books/:category/:id/:keyname
Data: /books/php/12345/this-is-a-test-keyname

The output should be something like this:

array(
  'category' => 'php',
  'id' => '12345',
  'keyname' => 'this-is-a-test-keyname'
);

So, the question is: how can I do this in PHP?

P.S. Combinations of rules can vary. The keys are the words prefixed with the ':' symbol, for example:

/book-:id/:category/:keyname
/book/:id_:category~:keyname

P.S. 2: This is the code I had before. It works, but it is not flexible.

function rule_process($rule, $data) {
    // split into chunks and drop the empty ones
    $ruleItems = array_clean(explode('/', $rule));
    $dataItems = array_clean(explode('/', $data));

    // rule and data are supposed to have the same structure
    if (count($ruleItems) != count($dataItems)) {
        return false;
    }

    $result = array();
    foreach ($ruleItems as $index => $ruleValue) {
        // a chunk is a key when it looks like ":name"
        if (preg_match('/^:\w+$/', $ruleValue)) {
            // found a key: store the data chunk under the name without ':'
            $result[substr($ruleValue, 1)] = $dataItems[$index];
        }
    }

    return count($result) > 0 ? $result : false;
}

// Returns the array without empty strings, reindexed from 0.
// (The old version modified a copy, and call-time pass-by-reference
// like array_clean(&$x) is a fatal error since PHP 5.4.)
function array_clean($array) {
    return array_values(array_filter($array, 'strlen'));
}

In fact, this version of the router may be enough for me, but I'm interested in how to make a flexible solution. By the way, some timings (30 runs of 10000 operations each):

TEST #0 => Time:0.689285993576, Failures: 0
TEST #1 => Time:0.684408903122, Failures: 0
TEST #2 => Time:0.683394908905, Failures: 0
TEST #3 => Time:0.68522810936, Failures: 0
TEST #4 => Time:0.681587934494, Failures: 0
TEST #5 => Time:0.681943893433, Failures: 0
TEST #6 => Time:0.683794975281, Failures: 0
TEST #7 => Time:0.683885097504, Failures: 0
TEST #8 => Time:0.684013843536, Failures: 0
TEST #9 => Time:0.684071063995, Failures: 0
TEST #10 => Time:0.685361146927, Failures: 0
TEST #11 => Time:0.68728518486, Failures: 0
TEST #12 => Time:0.688632011414, Failures: 0
TEST #13 => Time:0.688556909561, Failures: 0
TEST #14 => Time:0.688539981842, Failures: 0
TEST #15 => Time:0.689876079559, Failures: 0
TEST #16 => Time:0.689854860306, Failures: 0
TEST #17 => Time:0.68727684021, Failures: 0
TEST #18 => Time:0.686210155487, Failures: 0
TEST #19 => Time:0.687953948975, Failures: 0
TEST #20 => Time:0.687957048416, Failures: 0
TEST #21 => Time:0.686664819717, Failures: 0
TEST #22 => Time:0.686244010925, Failures: 0
TEST #23 => Time:0.686643123627, Failures: 0
TEST #24 => Time:0.685017108917, Failures: 0
TEST #25 => Time:0.686363935471, Failures: 0
TEST #26 => Time:0.687278985977, Failures: 0
TEST #27 => Time:0.688650846481, Failures: 0
TEST #28 => Time:0.688835144043, Failures: 0
TEST #29 => Time:0.68886089325, Failures: 0

So it's fast enough. I'm testing on an ordinary laptop, so this one can certainly be used on a real website.

Any other solutions?

+1  A: 

I don't think this is possible with just one regular expression. Zend Framework works just like your example. Have a look at their source code.

Philippe Gerber
Yeah, I saw such routes in Rails.
A: 

I would start by defining a pattern for each element:

$element=array(
    'id'=>'(\d+)',
    'category'=>'([^/]+)',
    'keyname'=>'([^/]+)'
);

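Then build up a regex:

$rule="/book-:id/:category/:keyname";

//escape regex metacharacters; note that preg_quote also turns ':' into '\:'
$pattern=preg_quote($rule, '#');
$map=array();
$map[]=null;

function initrule($matches) 
{
    //forgive the globals - quickest way to demonstrate this, in
    //production code I'd wrap this into a class...
    global $element;
    global $map;

    //remember the order we did these replacements
    $map[]=$matches[1];

    //return the desired pattern, or a generic one for unknown names
    return isset($element[$matches[1]]) ? $element[$matches[1]] : '([^/]+)';
}

//match the escaped '\:name' placeholders left behind by preg_quote
$pattern=preg_replace_callback('/\\\\:(\w+)/', "initrule", $pattern);

//add delimiters and anchor the pattern to the whole URL
$pattern='#^'.$pattern.'$#';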

You can then use that pattern on your target data; the array of matches you get back should correspond to the element names in the $map array, e.g. the name for $matches[1] is in $map[1], and so on.
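To make the last step concrete, here is a minimal sketch of pairing the captured matches with the names in the map. The helper names `build_rule` and `extract_keys` are hypothetical, and a closure replaces the globals. It assumes each element pattern contains exactly one capture group, and that names without an entry may fall back to a generic `([^/]+)`.

```php
<?php
// Hypothetical helper: builds the pattern and the name map for one rule.
function build_rule($rule, array $element) {
    $map = array(null);                  // index 0 is the full match
    $pattern = preg_replace_callback(
        '/\\\\:(\w+)/',                  // preg_quote() turns ":" into "\:"
        function ($m) use (&$map, $element) {
            $map[] = $m[1];              // remember the order of the names
            // assumption: unknown names fall back to one path segment
            return isset($element[$m[1]]) ? $element[$m[1]] : '([^/]+)';
        },
        preg_quote($rule, '#')
    );
    return array('#^' . $pattern . '$#', $map);
}

// Hypothetical helper: returns name => value pairs, or false on no match.
function extract_keys($rule, $url, array $element) {
    list($pattern, $map) = build_rule($rule, $element);
    if (!preg_match($pattern, $url, $matches)) {
        return false;
    }
    $result = array();
    foreach ($map as $i => $name) {
        if ($name !== null) {
            $result[$name] = $matches[$i];   // $matches[$i] belongs to $map[$i]
        }
    }
    return $result;
}

$element = array('id' => '(\d+)', 'category' => '([^/]+)');
$keys = extract_keys(
    '/book-:id/:category/:keyname',
    '/book-12345/php/this-is-a-test-keyname',
    $element
);
print_r($keys);
```

Because the `id` pattern is `(\d+)`, a URL such as `/book-abc/php/x` does not match and the function returns false, which gives a basic form of validation for free.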

Paul Dixon
But what if I have 20-30 different rules? The problem is that I cannot make a universal solution.
You would have to build a pattern for each rule and test them in sequence. Otherwise, I don't think you're going to have an easy way to figure out what each of the captured matches corresponds to.
Paul Dixon
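Testing the patterns in sequence could look roughly like the sketch below. The `$routes` table and the `route` function are hypothetical names, and the patterns are written out by hand here purely for illustration. The first pattern that matches wins, so the order of the table matters.

```php
<?php
// Hypothetical route table: each entry pairs a ready-made pattern with the
// names of its capture groups, in the order the groups appear.
$routes = array(
    array('#^/book-(\d+)/([^/]+)/([^/]+)$#',  array('id', 'category', 'keyname')),
    array('#^/books/([^/]+)/(\d+)/([^/]+)$#', array('category', 'id', 'keyname')),
    array('#^/book/(\d+)_([^/_]+)~([^/]+)$#', array('id', 'category', 'keyname')),
);

// Tries every pattern in order and returns the keys of the first match,
// or false when no rule matches the URL.
function route($url, array $routes) {
    foreach ($routes as $route) {
        list($pattern, $names) = $route;
        if (preg_match($pattern, $url, $matches)) {
            array_shift($matches);                  // drop the full match
            return array_combine($names, $matches); // name => captured value
        }
    }
    return false;
}

print_r(route('/books/php/12345/this-is-a-test-keyname', $routes));
print_r(route('/book/12345_php~this-is-a-test-keyname', $routes));
```

With 20-30 rules this is still only a short loop over precompiled patterns, and the patterns can be built once and cached instead of being rebuilt on every request.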
+1  A: 

Try this simple solution:

    $data = Array (
            "/book/:id/:category/:keyname" => "/book/12345/php/this-is-a-test-keyname",
            "/book-:id/:category/:keyname" => "/book-12345/php/this-is-a-test-keyname",
            "/book/:id_:category~:keyname" => "/book/12345_php~this-is-a-test-keyname",
    );


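    foreach ($data as $rule => $uri) {
            // turn every ":name" into a named group matching one path segment
            $reRule = preg_replace('/:([a-z]+)/', '(?P<\1>[^/]+)', $rule);
            $reRule = str_replace('/', '\/', $reRule);

            // anchor the pattern so the whole URI has to match
            preg_match('/^' . $reRule . '$/', $uri, $matches);
            print_r($matches);
    }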

The only downside is that you cannot do fancy data validation at this point, so you have to do it elsewhere. It could also get messy if the rules conflict with the regex syntax (you'd have to do some heavy escaping here).
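One possible way around both downsides, sketched under assumptions: split the rule on its `:name` placeholders, escape only the literal parts with `preg_quote`, and let an optional `$constraints` array supply a stricter sub-pattern per key. Both `rule_to_regex` and `$constraints` are hypothetical names, not part of the answer above.

```php
<?php
// Hypothetical helper: turns a rule into an anchored regex with named groups.
// $constraints maps key names to sub-patterns; keys without an entry
// fall back to "[^/]+" (one path segment).
function rule_to_regex($rule, array $constraints = array()) {
    // Split on ":name" placeholders and keep the names in the result, so
    // even indexes hold literal text and odd indexes hold key names.
    $parts = preg_split('/:([a-z]+)/', $rule, -1, PREG_SPLIT_DELIM_CAPTURE);
    $regex = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $regex .= preg_quote($part, '#');    // literal text: escape it
        } else {
            $sub = isset($constraints[$part]) ? $constraints[$part] : '[^/]+';
            $regex .= '(?P<' . $part . '>' . $sub . ')';
        }
    }
    return '#^' . $regex . '$#';
}

$regex = rule_to_regex('/book.v2/:id/:category', array('id' => '\d+'));

preg_match($regex, '/book.v2/12345/php', $m);
echo $m['id'], ' ', $m['category'], "\n";            // 12345 php

var_dump(preg_match($regex, '/bookXv2/12345/php'));  // int(0): the dot is literal
var_dump(preg_match($regex, '/book.v2/abc/php'));    // int(0): id must be numeric
```

Using `#` as the delimiter also removes the need to escape every slash in the rule.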

Maciej Łebkowski
At least something to play with, thanks. By the way, I posted my current code; it's not flexible (only one type of delimiter).
A: 

I've got another version of the script: http://blog.sosedoff.com/2009/09/20/rails-like-php-url-router/

Dan Sosedoff