This is more out of curiosity than anything else, as I'm failing to find any useful info on Google about this function (CORE::substcont).
While profiling and optimising some old, slow XML parsing code, I found that the following regex calls substcont 31 times every time the line is executed, and takes a huge amount of time:
Calls: 10000 Time: 2.65s Sub calls: 320000 Time in subs: 1.15s
$handle =~s/(>)\s*(<)/$1\n$2/g;
# spent 1.09s making 310000 calls to main::CORE:substcont, avg 4µs/call
# spent 58.8ms making 10000 calls to main::CORE:subst, avg 6µs/call
Compared to the immediately preceding line:
Calls: 10000 Time: 371ms Sub calls: 30000 Time in subs: 221ms
$handle =~s/(.*)\s*(<\?)/$1\n$2/g;
# spent 136ms making 10000 calls to main::CORE:subst, avg 14µs/call
# spent 84.6ms making 20000 calls to main::CORE:substcont, avg 4µs/call
The number of substcont calls is quite surprising, especially seeing as I would've thought that the second regex would be more expensive. This is, obviously, why profiling is a Good Thing ;-)
I've subsequently changed both of these lines to remove the unnecessary backrefs, with dramatic results for the badly-behaving line:
Calls: 10000 Time: 393ms Sub calls: 10000 Time in subs: 341ms
$handle =~s/>\s*</>\n</g;
# spent 341ms making 10000 calls to main::CORE:subst, avg 34µs/call
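For anyone who wants to reproduce the comparison outside my code base, here is a minimal sketch using the core Benchmark module. The sample string and the sub names (with_captures, without_captures) are made up for illustration and are not my real data, so the absolute numbers will differ from the profiler output above. It also prints how many times the pattern matches in the sample, which can be compared against the per-execution substcont count the profiler reports.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input (not the real XML): several elements
# separated by runs of spaces.
my $sample = join '', map { "<item>$_</item>  " } 1 .. 15;

# Original form: captures the delimiters and interpolates $1 and $2
# into the replacement.
sub with_captures {
    my $handle = shift;
    $handle =~ s/(>)\s*(<)/$1\n$2/g;
    return $handle;
}

# Rewritten form: the replacement contains no capture variables.
sub without_captures {
    my $handle = shift;
    $handle =~ s/>\s*</>\n</g;
    return $handle;
}

# Sanity check: both forms should produce identical output.
die "outputs differ"
    unless with_captures($sample) eq without_captures($sample);

# Count how many times the pattern matches in the sample string.
my $matches = () = $sample =~ />\s*</g;
print "matches per string: $matches\n";

# Run each form for at least one CPU second and print a comparison table.
cmpthese(-1, {
    with_captures    => sub { with_captures($sample) },
    without_captures => sub { without_captures($sample) },
});
```

Changing the number of elements in the sample string should show whether the cost of the capturing form scales with the number of matches.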
So, my question is: why should the original have been making SO many calls to substcont, and what does substcont even do in the regex engine that takes so long?