Is \d++ a valid regular expression in programming languages that don't support possessive quantifiers? Is it equivalent to (\d+)+?

When I test it in Python, the error sre_constants.error: multiple repeat is raised. In C#, it throws a runtime exception: System.ArgumentException: parsing "\d++" - Nested quantifier +. boost::xpressive rejects it as well.
But \d++...+ is considered valid in boost::regex:

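#include <iostream>
#include <boost/regex.hpp>

int main() {
    const wchar_t* s = L"abc123";
    // Three consecutive '+' after \d.
    boost::wregex e(L"\\d+++", boost::regex::normal);
    boost::wcmatch m;

    if (boost::regex_search(s, m, e)) {
        std::wcout << m[0] << std::endl;
    }
}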

The output is 123.

+1  A: 

The above code throws an instance of boost::bad_expression with "Invalid preceding regular expression" for me. This is on a Red Hat Linux system, compiled with gcc 3.4.6 and Boost 1_32.

aeh
I compiled it with Visual C++ 2008, using Boost 1.36 and 1.44.
Explogit
+1  A: 

Without possessive quantifiers, what would \d++ (or (\d+)+) mean?

Let's assume it were valid syntax, and that we read it as "one or more (one or more digits)". In that case we could still reduce the expression to \d+: since \d+ can match a single digit, (\d+)+ could be simplified to (\d)+, which still matches one or more digits. Therefore, \d++ would be redundant.
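
The reduction can be checked empirically. Here is a minimal sketch in Python (the language the question tested with), assuming only the standard re module. The sample strings and the helper name first_match are made up for illustration. It compares what (\d+)+ and \d+ select on each input.

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
SAMPLES = ["abc123", "7", "no digits", "12 345", ""]

NESTED = re.compile(r"(\d+)+")  # "one or more (one or more digits)"
SIMPLE = re.compile(r"\d+")     # "one or more digits"


def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


for s in SAMPLES:
    # Both patterns are expected to select the same substring.
    print(repr(s), first_match(NESTED, s), first_match(SIMPLE, s))
```

Only the overall match is compared here. The capturing group in (\d+)+ is an extra that \d+ does not have, so the two are interchangeable only when the group's contents are not needed.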

I am not aware of any regular expression engine where \d++ is valid syntax, aside from engines that support possessive quantifiers.
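
Whether a given engine accepts \d++ can be probed directly. Here is a minimal sketch, again assuming Python's standard re module. Versions before 3.11 reject the pattern with a "multiple repeat" error, as the question reports. Python 3.11 and later added possessive quantifiers and so accept it, which is consistent with the statement above. The helper name compile_error is hypothetical.

```python
import re
import sys


def compile_error(pattern):
    """Return None if the pattern compiles, else the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


for pattern in (r"\d+", r"(\d+)+", r"\d++"):
    # The result for \d++ depends on the interpreter version.
    print(sys.version_info[:2], pattern, compile_error(pattern))
```

The same try-and-report approach carries over to other engines, although the exception type and message will differ.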

Daniel Vandersluis