I'm using the Jison parser generator for JavaScript and am having problems with my language specification.

The program I'm writing will be a calculator that can handle feet, inches and sixteenths. In order to do this, I have the following specification:

%%
([0-9]+\s*"'")?\s*([0-9]+\s*"\"")?\s*([0-9]+\s*"s")? {return 'FIS';}
[0-9]+("."[0-9]+)?\b  {return 'NUMBER';}
\s+                   {/* skip whitespace */}
"*"                   {return '*';}
"/"                   {return '/';}
"-"                   {return '-';}
"+"                   {return '+';}
"("                   {return '(';}
")"                   {return ')';}
<<EOF>>               {return 'EOF';}

Most of these lines come from a basic calculator specification. I simply added the first line.

The regex correctly matches feet, inches and sixteenths, such as 6'4" (six feet, 4 inches) or 4"5s (4 inches, 5 sixteenths), with any amount of whitespace between the numbers and the unit indicators.

The problem is that the regex also matches the empty string. As a result, the lexer always records a FIS token at the start of the line, and then parsing fails.
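
The behaviour can be reproduced outside Jison. This is only a sketch in plain JavaScript: it assumes the lexer tries each rule anchored at the start of the remaining input, and the names `fis` and `matchedText` are made up for the illustration.

```javascript
// Plain-JavaScript version of the FIS rule, anchored at the start of the input.
// (Assumption: this mirrors how the generated lexer applies the rule.)
const fis = /^([0-9]+\s*')?\s*([0-9]+\s*")?\s*([0-9]+\s*s)?/;

// Returns the text the rule would consume, or null if it does not match at all.
function matchedText(input) {
  const m = fis.exec(input);
  return m === null ? null : m[0];
}

console.log(matchedText(`6'4"`));         // 6'4"
console.log(matchedText(`4"5s`));         // 4"5s
console.log(matchedText(`3 + 4`) === ""); // true: a zero-length match, not a failure
```

Because every group is optional, the rule never fails; on input that is not a measurement it simply succeeds with a zero-length match.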

Here is my question: is there a way to modify this regex to guarantee that it will only match a non-zero length string?

EDIT: Although the regex has capturing groups in it, I do not need to capture those groups. I know I could use non-capturing groups, but it's a little clearer without the (?:...).

A: 

The problem is that everything in your first line is optional - either ? (0 or 1) or * (0 or more).

I'm not too familiar with the imperial system (I've never seen sixteenths before...), but perhaps something like

([0-9]+\s*["'s])+    (with whatever escaping is necessary for the " and ' - I'm not a javascript guy)

This definitely ensures that it doesn't match an empty string. The problem is that it would also allow something like 5s 4" 6', which is probably not quite what you want...
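
A rough check of that pattern in plain JavaScript (a sketch only; the `^` anchor and the names `anyOrder` and `anyOrderMatch` are assumptions added for the test, not part of the lexer file):

```javascript
// One or more "number + unit" components, in any order.
const anyOrder = /^([0-9]+\s*["'s])+/;

// Returns the matched text, or null when the pattern does not match.
function anyOrderMatch(input) {
  const m = anyOrder.exec(input);
  return m === null ? null : m[0];
}

console.log(anyOrderMatch(`3 + 4`));  // null: no zero-length match any more
console.log(anyOrderMatch(`6'4"`));   // 6'4"
console.log(anyOrderMatch(`5s4"6'`)); // 5s4"6' (the ordering is not enforced)
```

Note that, as written, the pattern has no whitespace between components, so an input such as 6' 4" would be matched one component at a time; adding \s* inside the repeated group would be one way to allow it.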

Jon
Yeah... that's the solution I'm currently working with. You're absolutely right about the ordering problem. Right now, that's handled by the `FIS.fromString` method (defined elsewhere).
Dancrumb
A: 

You can add (?=.) at the beginning of your regex.
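
Here is a small plain-JavaScript sketch of what the lookahead does. It assumes the rule is applied anchored at the start of the remaining input. The second pattern, `withUnit`, is a hypothetical stricter variant added for comparison and is not the one from this thread; all variable names are made up.

```javascript
// Body of the original FIS rule, kept as a raw string so backslashes survive.
const body = String.raw`([0-9]+\s*')?\s*([0-9]+\s*")?\s*([0-9]+\s*s)?`;

// (?=.) only asserts that at least one character follows.
const withAnyChar = new RegExp("^(?=.)" + body);

// Hypothetical stricter lookahead: a number followed by a unit must come next.
const withUnit = new RegExp(String.raw`^(?=[0-9]+\s*['"s])` + body);

console.log(withAnyChar.exec(""));               // null: empty input is rejected
console.log(withAnyChar.exec(`6'4"`)[0]);        // 6'4"
console.log(withAnyChar.exec(`+ 3`)[0] === "");  // true: still zero-length here

console.log(withUnit.exec(`+ 3`));               // null
console.log(withUnit.exec(`3 + 4`));             // null: left for the NUMBER rule
console.log(withUnit.exec(`4"5s`)[0]);           // 4"5s
```

In plain JavaScript, then, (?=.) rules out a match on empty input but can still produce a zero-length match in front of a character that is not part of a measurement, so a more specific lookahead may be needed depending on the other tokens.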

tiftik
Thanks! Unfortunately, that regex is not supported by Jison, but kudos for solving the *actual* question I asked... adding this *does* match the string correctly
Dancrumb
Correction... this regex **is** supported by Jison. For my specific need, I had to use: (?=[^0-9*/\-+()]), so that it wouldn't match the other tokens.
Dancrumb