I'm searching the pattern "(.*)\1" on the text "blabl" with regexec() and get successful match but empty matches in regmatch_t structures. What exactly has been matched?
The regex .* can match successfully a string of zero characters, or the nothing that occurs between adjacent characters.
So your pattern is matching zero characters in the parens, and then matching zero characters immediately following that.
So if your regex was /f(.*)\1/
it would match the string "foo" between the 'f' and the first 'o'.
You might try using .+
instead of .*
, as that matches one or more instead of zero or more. (Using .+ you should match the 'oo' in 'foo')
\1 is the backreference typically used for replacement later or when trying to further refine your regex by getting a match within a match. You should just use (.*), this will give you the results you want and will automatically be given the backreference number 1. I'm no regex expert but these are my thoughts based on my limited knowledge.
As an aside, I always revert back to RegexBuddy when trying to see what's really happening.
\1 is the "re-match" instruction. The question is, do you want to re-match immediately (e.g., BLABLA)
/(.+)\1/
or later (e.g., BLAahem**BLA**)
/(.+).*\1/