Given an alternation like /(foo|foobar|foobaz)/, does Perl 6 make any promises about which of the three alternatives will be tried first? If it does, where in the documentation does it make that promise?

See the related question "Does Perl currently (5.8 and 5.10) make any promises about the order alternations will be used?"

+6  A: 

S05 says:

To that end, every regex in Perl 6 is required to be able to distinguish its "pure" patterns from its actions, and return its list of initial token patterns (transitively including the token patterns of any subrule called by the "pure" part of that regex, but not including any subrule more than once, since that would involve self reference, which is not allowed in traditional regular expressions). A logical alternation using | then takes two or more of these lists and dispatches to the alternative that matches the longest token prefix. This may or may not be the alternative that comes first lexically.

However, if two alternatives match at the same length, the tie is broken first by specificity. The alternative that starts with the longest fixed string wins; that is, an exact match counts as closer than a match made using character classes. If that doesn't work, the tie is broken by one of two methods. If the alternatives are in different grammars, standard MRO (method resolution order) determines which one to try first. If the alternatives are in the same grammar file, the textually earlier alternative takes precedence. (If a grammar's rules are defined in more than one file, the order is undefined, and an explicit assertion must be used to force failure if the wrong one is tried first.)

This seems to be a very different promise from the one made in Perl 5.

Chas. Owens
+8  A: 

To put it in only a few words: the alternatives should be matched (at least notionally) in parallel, and the longest match wins. If you want sequential alternations, you can use the double bar ||, which promises a left-to-right order just like | does in Perl 5 regexes.
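A minimal Perl 6 sketch of that difference, using the alternation from the question. The input string 'foobar' and the variable names are assumptions chosen for illustration; they do not come from S05.

```raku
# Input chosen so that two of the three alternatives can match at position 0.
my $input = 'foobar';

# `|` considers the alternatives (notionally) in parallel;
# the one matching the longest token prefix wins, regardless of textual order.
my $longest = $input ~~ / foo | foobar | foobaz /;
say ~$longest;   # foobar

# `||` tries the alternatives strictly left to right, like `|` in Perl 5,
# so the first alternative that matches is taken.
my $first = $input ~~ / foo || foobar || foobaz /;
say ~$first;     # foo
```

With | the order in which the alternatives are written does not affect this result; with || putting foo first hides the longer alternatives.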

moritz