Is there any way to match a function block in JavaScript source code using regular expressions?
(Really I'm trying to find the opposite of that, but I figured this would be a good place to start.)
No, it is not possible. Regexes can't match pairs of characters nested to an arbitrary depth. So something like this would fool a regex:
    function foo() {
        if(bar) {
            baz();
        } // oops, regex would think this was end of function
    }
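To make the failure concrete, here is a minimal sketch of a naive pattern run against that same function. The pattern and the variable names (`naive`, `source`, `naiveMatch`) are made up for illustration; they are not from any library.

```javascript
// Naive attempt: match "function name(...) {", then lazily take
// everything up to the first "}" that follows.
const naive = /function\s+\w+\s*\([^)]*\)\s*\{[\s\S]*?\}/;

const source = [
  "function foo() {",
  "  if (bar) {",
  "    baz();",
  "  }",
  "}",
].join("\n");

const naiveMatch = source.match(naive)[0];

// The match stops at the brace that closes the `if`,
// so the function's own closing brace is left out.
console.log(naiveMatch);
console.log(naiveMatch === source); // false
```

Making the quantifier greedy does not help: it would run to the last "}" in the whole input, which overshoots as soon as the file has more than one function.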
However, you could create a fairly simple grammar to do it (in EBNF-ish form):
    javascript_func
        : "function" ID "(" ")" "{" body* "}"
        | "function" ID "(" params ")" "{" body* "}"
        ;

    params
        : ID
        | params "," ID
        ;

    body
        : [^{}]*            // assume this is like a regex
        | "{" body* "}"
        ;
Oh, this is also assuming you have some kind of lexer to strip out whitespace and comments.
There are certain things that regular expressions just aren't very good at. That doesn't mean it's impossible to build an expression that will work, just that it's probably not a good fit. Matching function blocks is one of those things:
JavaScript function blocks tend to cover multiple lines, and you are going to want to find the matching "{" and "}" braces that signify the start and end of the block, which could be nested to an unknown depth. You also need to account for potential braces used inside comments. RegEx will be painful for this.
That doesn't mean it's impossible, though. You might have additional information about the nature of the functions you're looking for. If you can do things like guarantee no braces in comments and limit nesting to a specific depth, you could still build an expression to do it. It'll be somewhat messy and hard to maintain, but at least within the realm of the possible.
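As a rough sketch of what such a depth-limited expression could look like, the pattern below is built programmatically for a chosen maximum nesting depth. The helpers `blockPattern` and `functionRegex` are hypothetical names invented for this example. It assumes there are no braces in comments, strings, or regex literals, and that parameter lists contain no parentheses.

```javascript
// Build a pattern for a "{ ... }" block that tolerates at most
// `maxDepth` levels of nested braces inside it.
function blockPattern(maxDepth) {
  let inner = "[^{}]*"; // depth 0: no nested braces at all
  for (let i = 0; i < maxDepth; i++) {
    // each pass allows one more level: plain characters or a nested block
    inner = "(?:[^{}]|\\{" + inner + "\\})*";
  }
  return "\\{" + inner + "\\}";
}

// Hypothetical helper: a named function header followed by a bounded block.
function functionRegex(maxDepth) {
  const header = "\\bfunction\\s+\\w+\\s*\\([^)]*\\)\\s*";
  return new RegExp(header + blockPattern(maxDepth), "g");
}

const sample = [
  "function foo() {",
  "  if (bar) {",
  "    baz();",
  "  }",
  "}",
  "function qux() { return 1; }",
].join("\n");

console.log(sample.match(functionRegex(2))); // both functions, each in full
console.log(sample.match(functionRegex(0))); // only qux; foo nests too deeply
```

The generated pattern grows with every extra level, which is where the "messy and hard to maintain" part comes from. Anything nested deeper than the chosen limit is silently skipped rather than reported.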
Not really, no.
Function blocks aren't regular and so regular expressions aren't the right tool for the job. See, in order to capture a function block in JS, you need to count instances of { and balance them against instances of }, otherwise you're going to match too much or too little. Regular expressions can't do this kind of counting.
Just read in the file you're trying to look at and manage the nesting recursively. It's conceptually very easy to manage this way.
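A minimal sketch of that idea follows. It uses a depth counter rather than actual recursion, which amounts to the same bookkeeping. The names `findFunctionBlocks` and `stripFunctionBlocks` are invented for this example. It assumes named `function` declarations only and no braces inside strings, comments, or regex literals; handling those properly would need a real tokenizer.

```javascript
// Find every outermost `function name(...) { ... }` block in `source`.
function findFunctionBlocks(source) {
  const header = /\bfunction\s+\w+\s*\([^)]*\)\s*\{/g;
  const blocks = [];
  let m;
  while ((m = header.exec(source)) !== null) {
    let depth = 1;            // the header already consumed one "{"
    let i = header.lastIndex; // first character after that "{"
    while (i < source.length && depth > 0) {
      if (source[i] === "{") depth++;
      else if (source[i] === "}") depth--;
      i++;
    }
    if (depth !== 0) break;   // unbalanced braces: give up
    blocks.push({ start: m.index, end: i, text: source.slice(m.index, i) });
    header.lastIndex = i;     // resume after the block, skipping nested functions
  }
  return blocks;
}

// The "opposite": everything that is not inside a function block.
function stripFunctionBlocks(source) {
  let out = "";
  let pos = 0;
  for (const block of findFunctionBlocks(source)) {
    out += source.slice(pos, block.start);
    pos = block.end;
  }
  return out + source.slice(pos);
}

const code = "var a = 1;\nfunction foo() {\n  if (bar) {\n    baz();\n  }\n}\nvar b = 2;\n";
console.log(findFunctionBlocks(code)[0].text);
console.log(stripFunctionBlocks(code));
```

The regex is only used to spot the function header; the brace matching is done by the counter, so nesting depth is no longer a limit.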