Hello,

I'm currently learning Lua. Regarding pattern matching in Lua, I found the following sentence in the Lua documentation on lua.org:

Nevertheless, pattern matching in Lua is a powerful tool and includes some features that are difficult to match with standard POSIX implementations.

As I'm familiar with POSIX regular expressions, I would like to know whether there are any common examples where Lua pattern matching is "better" than regular expressions -- or did I misinterpret the sentence? And if there are such examples: why is pattern matching better suited than regular expressions in those cases?

Thanks very much,

harald

+1  A: 

http://lua-users.org/wiki/LibrariesAndBindings contains a listing of functionality including regex libraries if you wish to continue using them.

To answer the question (and note that I'm by no means a Lua guru): the language has a strong tradition of being used in embedded applications, where a full regex engine would unduly increase the size of the code on the platform. Such an engine is sometimes much larger than the entire Lua library itself.

[Edit] I just found the place in the online version of Programming in Lua (an excellent resource for learning the language) where this is described as following from one of the principles of the language: see the comments below [/Edit]

I find personally that the default pattern matching Lua provides satisfies most of my regex-y needs. Your mileage may vary.

Keith Pimmel
http://www.lua.org/pil/20.1.html
Keith Pimmel
OK -- I thought it wasn't just about the size. I read that Lua's pattern matching library is about 500 LOC compared to regexp libs with ~4000 LOC. That's cool, but I thought it was also about convenience: I do a lot with regexps and I know that this stuff can get very complex and complicated. So: are there any other features that make Lua's pattern matching more convenient or easier to use than POSIX regexps, besides the LOC? Please keep in mind: it's about learning, not flaming.
harald
I'd agree with what Norman posted (which is why he would get my upvote if I had the reputation!). I can't add much more than that other than the personal aesthetic of using it - it just feels better to me. Again, YMMV :) FWIW, when I bounce between differing regex/pattern-matching styles (sed vs. Lua, for instance), it gives me a headache and often sends me running to the documentation. I tend to stay in the tool that I use most often for this, which happens to be Lua.
Keith Pimmel
+13  A: 

Are there any common examples where Lua pattern matching is "better" compared to regular expressions?

It is not so much particular examples as the overall design: Lua patterns have a higher signal-to-noise ratio than POSIX regular expressions, and that design is often preferable.

Here are some factors that contribute to the good design:

  • Very lightweight syntax for matching common character types including uppercase letters (%u), decimal digits (%d), space characters (%s) and so on. Any character type can be complemented by using the corresponding capital letter, so the pattern %S matches any nonspace character.

  • Quoting is extremely simple and regular. The quoting character is %, so it is always distinct from the string-quoting character \, which makes Lua patterns much easier to read than POSIX regular expressions (when quoting is necessary). It is always safe to quote symbols, and it is never necessary to quote letters, so you can just go by that rule of thumb instead of memorizing which symbols are special metacharacters.

  • Lua offers "captures" and can return multiple captures as the result of a match call. This interface is much, much better than capturing substrings through side effects or having some hidden state that has to be interrogated to find captures. Capture syntax is simple: just use parentheses.

  • Lua has a "shortest match" modifier, written -, to go along with the "longest match" operator *. So for example s:find '%s(%S-)%.' finds the shortest sequence of nonspace characters that is preceded by a space and followed by a dot.

  • The expressive power of Lua patterns is comparable to POSIX "basic" regular expressions, without the alternation operator |. What you are giving up is "extended" regular expressions with |. If you need that much expressive power I recommend going all the way to LPEG which gives you essentially the power of context-free grammars at quite reasonable cost.
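
As a rough illustration of the points above, here is a minimal sketch using only the standard string library. The sample strings and variable names are made up for the example; they are not from the Lua documentation.

```lua
-- Character classes: %u is an uppercase letter, %l a lowercase letter,
-- %d a decimal digit; the capitalized form is the complement (%S = nonspace).
local word  = ("hello World"):match("%u%l+")   -- "World"
local first = ("  two words"):match("%S+")     -- "two"

-- Quoting: % escapes any symbol, so "%-" is a literal hyphen.
-- Captures: each parenthesized group comes back as its own return value.
local y, m, d = ("released 2009-03-17"):match("(%d+)%-(%d+)%-(%d+)")

-- Shortest match (-) versus longest match (*) on the same hypothetical input.
local s = "see foo.bar.baz now"
local shortest = s:match("%s(%S-)%.")          -- "foo"
local longest  = s:match("%s(%S*)%.")          -- "foo.bar"

print(word, first)
print(y, m, d)
print(shortest, longest)
```

Note that the captures arrive as ordinary return values, so there is no hidden match state to query afterwards.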

Norman Ramsey
Thanks -- a lot of information. I think I have to delve deeper into Lua pattern matching before I fully understand what was meant by the quoted sentence ...
harald