Your question doesn’t seem to be about regular expressions themselves, but only about the syntax generally used to write them. Many hardcore coders have come to accept this syntax as succinct and powerful, but for longer regular expressions it is really unreadable and unmaintainable.
Some people have already mentioned the “x” flag in Perl, which helps a bit, but not much.
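For reference, .NET has an equivalent of that flag, RegexOptions.IgnorePatternWhitespace. A minimal sketch of what it buys you, using a pattern for an optionally negative integer (the class and method names here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VerboseRegexDemo
{
    // With IgnorePatternWhitespace, unescaped whitespace in the pattern is
    // ignored and '#' starts a comment that runs to the end of the line.
    private static readonly Regex NumberRegex = new Regex(@"
        -?                 # optional minus sign
        (?<number> \d+ )   # one or more digits, captured as 'number'
        ", RegexOptions.IgnorePatternWhitespace);

    public static List<string> ExtractNumbers(string input)
    {
        var result = new List<string>();
        foreach (Match match in NumberRegex.Matches(input))
        {
            result.Add(match.Groups["number"].Value);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string n in ExtractNumbers("a -12 b 7"))
        {
            Console.WriteLine(n);   // prints 12, then 7
        }
    }
}
```

The comments help, but the pattern itself is still the same punctuation-heavy notation, which is why I say it only helps a bit.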
I like regular expressions a lot, but not the syntax. It would be nice to be able to construct a regular expression from readable, meaningful method names. For example, instead of this C# code:
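// Match is named explicitly: on older .NET versions MatchCollection only
// enumerates as object, so "var match" would not expose .Groups.
foreach (Match match in Regex.Matches(input, @"-?(?<number>\d+)"))
{
    Console.WriteLine(match.Groups["number"].Value);
}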
you could have something much more verbose but much more readable and maintainable:
// Hypothetical fluent API; none of these methods exist on System.Text.RegularExpressions.Regex.
int number = 0;
Regex r = Regex.Char('-').Optional().Then(
    Regex.Digit().OneOrMore().Capture(c => number = int.Parse(c))
);
foreach (var match in r.Matches(input))
{
    Console.WriteLine(number);
}
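This is just a quick idea; I know there are other, unrelated maintainability issues with this (although I would argue they are fewer and more minor). An extra benefit is compile-time verification: a misspelled or misplaced method call is rejected by the compiler, whereas a malformed pattern string is only detected at run time, when the regular expression is constructed.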
Of course, if you think this is over the top and too verbose, you can still have a regular expression syntax that is somewhere in between, perhaps...
instead of: -?(?<number>\d+)
could have: ("-" or "") + (number = digit * [1..])
This is still a million times more readable and only twice as long. Such a syntax can easily be made to have the same expressive power as normal regular expressions, and it can certainly be integrated into a programming language’s compiler for static analysis.
I don’t really know why there is so much opposition to rethinking the syntax for regular expressions even when entire programming languages are rethought (e.g. Perl 6, or when C# was new). Furthermore, the above very-verbose idea is not even incompatible with “old” regular expressions; the API could easily be implemented as one that constructs an old-style regular expression under the hood.
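To illustrate that last point, here is a rough sketch of such a wrapper. Everything in it is hypothetical: the class is called Pattern only to avoid clashing with the real Regex class, and Capture takes a group name instead of the callback I used above, to keep the example short. Each method simply builds up an old-style pattern string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical fluent wrapper; none of these names exist in .NET.
public sealed class Pattern
{
    private readonly string source;   // old-style regex fragment built so far

    private Pattern(string source) { this.source = source; }

    // Literal character, escaped so that metacharacters stay literal.
    public static Pattern Char(char c) => new Pattern(Regex.Escape(c.ToString()));

    public static Pattern Digit() => new Pattern(@"\d");

    // Wrap in a non-capturing group so the quantifier applies to the whole fragment.
    public Pattern Optional() => new Pattern("(?:" + source + ")?");
    public Pattern OneOrMore() => new Pattern("(?:" + source + ")+");

    // Concatenation: this fragment followed by the next one.
    public Pattern Then(Pattern next) => new Pattern(source + next.source);

    // Named capture group around the fragment.
    public Pattern Capture(string name) => new Pattern("(?<" + name + ">" + source + ")");

    public override string ToString() => source;
    public Regex ToRegex() => new Regex(source);
}

public static class PatternDemo
{
    public static void Main()
    {
        Pattern p = Pattern.Char('-').Optional()
            .Then(Pattern.Digit().OneOrMore().Capture("number"));

        Console.WriteLine(p);   // prints (?:-)?(?<number>(?:\d)+)

        foreach (Match match in p.ToRegex().Matches("a -12 b 7"))
        {
            Console.WriteLine(match.Groups["number"].Value);   // prints 12, then 7
        }
    }
}
```

The generated pattern has more grouping than a hand-written one, but nobody has to read it; the existing engine does all the matching.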