I think you're operating under a misunderstanding: These executables aren't huge because they just lump the interpreter in there, they're huge because the whole runtime is in there.
On Windows, most of your runtime is already installed, so you don't have to ship it. You think your program is small, but a quick look at the virtual memory mappings will tell you that even a small "hello-world" type program written in C is a couple megabytes big.
That's just how big useful runtimes are.
If you really want to keep your ship-size small, your only choice is to use the runtimes that are already there, and that means C/C++ and (recently) dot-net.
If you really can't swallow the runtime, Forth is as small as it gets.
The best, most aggressive dynamic languages with the best compilers for Windows are the commercial Lisps. They do a lot of inlining and pruning when producing executables, so you end up shipping only what you use. They are still 1.5x to 5x larger than C/C++ programs.
As far as languages that you know: Perl is as fat as they get. ActiveState has perlapp which I'm sure you're already aware of, but you dismissed because of it's size. Revisit it if you can.
Now, to answer your question (is) there is some technical reason inhibiting the availability of such a best-of-both-world feature?: Yes.
Perl cannot be statically analyzed (proof), which means there's no way for a perl compiler to tell what can be discarded. That means every part of Perl's runtime needs to be available to your program becuase there's no way for your program to indicate what parts can be discarded.
That means that getting a smaller executable is equivalent to getting a smaller runtime, and you should be comfortable accepting that if the perl developers knew how to make the perl runtime smaller without discarding any features, they'd probably do it.
If you are willing to write in a strict subset of Python or PHP, these languages can be analyzed. Shed Skin and HipHop-php are pretty good, but they're still quite large, and they don't support all of Pythons and PHP's features which means that some modules will simply not work. To my knowledge, nobody has implemented pruning for either of these languages (most of the focus in these compilers is in improving their lackluster performance) and it may be another decade or more before anyone bothers, however these still will be the restrictions you have to accept when doing this sort of thing.