I have worked on a program that allows users to enter their own regex and you are right - they can (and do) enter regex that can take a long time to finish - sometimes longer than than the lifetime of the universe. What is worse, while processing a regex Python holds the GIL, so it will not only hang the thread that is running the regex, but the entire program.
Limiting the length of the regex will not work, since the problem is backtracking. For example, matching the regex r"(\S+)+x"
on a string of length N that does not contain an "x" will backtrack 2**N times. On my system this takes about a second to match against "a"*21
and the time doubles for each additional character, so a string of 100 characters would take approximately 19167393131891000 years to complete (this is an estimate, I have not timed it).
For more information read the O'Reilly book "Mastering Regular Expressions" - this has a couple of chapters on performance.
edit
To get round this we wrote a regex analysing function that tried to catch and reject some of the more obvious degenerate cases, but it is impossible to get all of them.
Another thing we looked at was patching the re module to raise an exception if it backtracks too many times. This is possible, but requires changing the Python C source and recompiling, so is not portable. We also submitted a patch to release the GIL when matching against python strings, but I don't think it was accepted into the core (python only holds the GIL because regex can be run against mutable buffers).