How are regular expressions processed?

+4 A:

Regular expressions can be modeled as a Deterministic Finite State Machine. That would probably be a good place to start if you wanted to "process" one.

scompt.com 2009-07-31 20:48:55

James Conigliaro 2009-07-31 21:04:24

@James. More commonly, yes they are. The often imitated Perl implementation is non-deterministic. There are some engines that are deterministic though. See my answer below.

Steve Wortham 2009-07-31 21:07:05

+3 A:

A regular expression describes a ruleset for a state machine. It generally moves over a string one character at a time, making decisions based on what happened on previous characters and what is described in the regex.

Any regex can also be written as a loop over a string one character at a time. Some of these could be fairly simple, but the power of regex is found when what appears to be a simple regex, with a few lookbehinds and subgroups would take a thousand lines of code to reproduce in your own state machine.

Rex M 2009-07-31 20:50:00

+2 A:

This question is very broad. This is not a complete answer but Jeff Moser has an excellent write-up on his blog that walks through .NET's regex process: How .NET Regular Expressions Really Work

I suspect other answers will shed light on other areas of regular expressions unless your question is updated to be more specific.

Ahmad Mageed 2009-07-31 20:50:33

+2 A:

This will depend on which regex implementation you're referring to.

There are 2 common but different techniques used in regex engines:

This MSDN article explains several techniques implemented in various engines and then goes onto explain .NET's implementation and why Microsoft chose what they chose for .NET.

They go even more in-depth in the various articles you see listed here.

Steve Wortham 2009-07-31 20:52:04

+1 A:

http://www.moserware.com/2009/03/how-net-regular-expressions-really-work.html

huh 2009-07-31 20:53:05

A:

Despite what everyone here says about state machines, you can write a remarkably simple regex recogniser using recursive techiques with very little state. There are examples of these in two of Brian Kernighan's books Software Tools In Pascal and The Practice Of Programming.

anon 2009-07-31 21:01:19

ansaurus

tags:

views:

answers:

How are regular expressions processed?

related questions