views:

93

answers:

2

After thinking about This Question and giving an answer to it I wanted to do more about that to train myself.

So I wrote a function which will calc the length of an given function. Th given php-file has to start at the beginning of the needed function. Example: If the function is in a big phpfile with lots of functions, like

/* lots of functions */
function f_interesting($arg) {
    /* function */
}
/* lots of other functions */

then $part3 of my function will require to begin like that (after the starting-{ of the interesting function):

    /* function */
}
/* lots of other functions */

Now that's not the problem, but I would like to know if there are an cleaner or simplier ways to do this. Here's my function: (I already cleaned a lot of testing-echo-commands) (The idea behind it is explained here)

function f_analysis ($part3) {
    if(isset($part3)) {
        $char_array = str_split($part3); //get array of chars
        $end_key = false; //length of function
        $depth = 0; //How much of unclosed '{'
        $in_sstr = false; //is next char inside in ''-String?
        $in_dstr = false; //is nect char inside an ""-String?
        $in_sl_comment = false; //inside an //-comment?
        $in_ml_comment = false; //inside an /* */-comment?
        $may_comment = false; //was the last char an '/' which can start a comment?
        $may_ml_comment_end = false; //was the last char an '*' which may end a /**/-comment?
        foreach($char_array as $key=>$char) {
            if($in_sstr) {
                if ($char == "'") {
                    $in_sstr = false;
                }
            }
            else if($in_dstr) {
                if($char == '"') {
                    $in_dstr = false;
                }
            }
            else if($in_sl_comment) {
                if($char == "\n") {
                    $in_sl_comment = false;
                }
            }
            else if($in_ml_comment) {
                if($may_ml_comment_end) {
                    $may_ml_comment_end = false;
                    if($char == '/') {
                        $in_ml_comment = false;
                    }
                }
                if($char == '*') {
                    $may_ml_comment_end = true;
                }
            }
            else if ($may_comment) {
                if($char == '/') {
                    $in_sl_comment = true;
                }
                else if($char == '*') {
                    $in_ml_comment = true;
                }
                $may_comment = false;
            }
            else {
                switch ($char) {
                    case '{':
                        $depth++;
                        break;
                    case '}':
                        $depth--;
                        break;
                    case '/':
                        $may_comment = true;
                        break;
                    case '"':
                        $in_dstr = true;
                        break;
                    case "'":
                        $in_sstr = true;
                        break;
                }
            }

            if($depth < 0) {
                $last_key = $key;
                break;
            }
        }
    } else echo '<br>$part3 of f_analysis not set!';
    return ($last_key===false) ? false : $last_key+1; //will be false or the length of the function
}
A: 

You could probably reduce the number of state variables a little, but truthfully... yes, it will be messy code. I would probably get rid of $may_ml_comment_end and peek ahead for the next character when I encounter an asterisk, for example. You will need to rewrite your foreach loop to a regular for loop be able to do that without creating a bigger mess though.

PS: I don't see you handling the escape character yet. Without the above approach, that would introduce another boolean variable.

Another problem with your current code is that characters immediately following a / don't get interpreted as they should. However unlikely

echo 5/'2';  // NB: no space in between

is valid in PHP and would break your parser.

Thorarin
+1  A: 

Tokenizer (Example) - Learn it, love it.

Alix Axel