I believe that during flash programming, any attempted access to flash will stall the CPU.
So what you want to do is make sure the critical code (interrupt handlers, the watchdog kicker, etc.) can run out of RAM during a program operation. The last time I used the STM32 (probably ~2 years ago), that's exactly what I did.
So just to be clear, to answer the question at the end of your post:
Another way to ask this question is "would running my flash programming code from RAM avoid the flash page erasing stall?".
I believe the answer is "no". It doesn't matter so much where the flash programming driver is located; what matters is what your code does while the erase / program operation is in progress. If the CPU tries to access flash during an operation, even to fetch instructions for your program or read a table of constants, I believe it will stall.
I know for a fact that this is how the NXP flash works on their ARM uCs, but I wanted to cite chapter & verse for the STM32 as well. For some reason, the flash programming manual seems to be unavailable right now, but I found the following language in a similar document (PM0068, I believe):
An ongoing Flash memory operation will not block the CPU as long as the CPU does not access the Flash memory.
and
If a read/write operation [to flash] is initiated during programming, (BSY bit set), the CPU stalls until the ongoing main Flash memory programming is over.
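To make that concrete, here is a minimal sketch in C of the kind of wait loop I mean: it polls BSY and kicks the watchdog without touching flash. The names (`RAMFUNC`, `flash_wait_idle`, the `.ramfunc` section) are my own for illustration, not ST's library API. It also assumes a GCC toolchain and a linker script and startup code that copy `.ramfunc` into RAM. The BSY bit position is what I remember from the STM32F1 parts, so check it against the reference manual for your device.

```c
#include <stdint.h>

/* Assumption: BSY is bit 0 of FLASH_SR (true on the STM32F1 parts I used;
 * other families differ, so verify against your reference manual). */
#define FLASH_SR_BSY    (1u << 0)
/* Writing 0xAAAA to IWDG_KR reloads the independent watchdog. */
#define IWDG_KEY_RELOAD 0xAAAAu

/* Hypothetical section name: the linker script must place ".ramfunc" in RAM
 * and the startup code must copy it there. On a non-ARM host build the
 * attribute is dropped so the logic can still be compiled and exercised. */
#if defined(__GNUC__) && defined(__arm__)
#define RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAMFUNC
#endif

/* Spin until the flash controller is idle, kicking the watchdog each pass.
 * Registers are passed in as pointers so nothing here depends on a vendor
 * header. The body calls no other functions and reads no const tables, so
 * when it is located in RAM it should generate no flash accesses.
 * Returns 0 once BSY clears, or -1 if max_spins passes elapse first. */
RAMFUNC int flash_wait_idle(volatile const uint32_t *flash_sr,
                            volatile uint32_t *iwdg_kr,
                            uint32_t max_spins)
{
    while (*flash_sr & FLASH_SR_BSY) {
        if (max_spins == 0u) {
            return -1;              /* gave up waiting */
        }
        max_spins--;
        *iwdg_kr = IWDG_KEY_RELOAD; /* keep the watchdog from resetting us */
    }
    return 0;
}
```

On the target you would start the erase or program operation and then call this with the addresses of FLASH_SR and IWDG_KR. If interrupts stay enabled during the operation, their handlers and the vector table would need to live in RAM as well, since a vector fetch is itself a flash read.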