I'd say you're about half right. If you graphed interpreted execution speed, with language level on the X axis and speed on the Y axis, you'd get a "bathtub" curve -- interpreting an extremely low-level language can be pretty fast, and interpreting an extremely high-level language can be pretty fast. In between the two, interpretation is substantially slower.
Given an extremely high-level input language (e.g., APL), you get minimal interpretation overhead because you do a great deal of work for each bit of input code you parse and dispatch -- a single primitive can operate on an entire array, so the interpretation cost is amortized over a lot of real work.
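As a rough illustration (not APL itself, just a hypothetical whole-array primitive in C, with made-up names like `OP_ADD_ARRAYS`), the shape of the argument looks like this: one dispatched operation does n elements' worth of work, so for large arrays the fetch/decode cost is essentially noise.

```c
#include <stddef.h>

enum op { OP_ADD_ARRAYS /* , other whole-array primitives ... */ };

/* One dispatched "instruction" performs n elements' worth of work,
 * so the cost of fetching and decoding it is spread over the whole array. */
static void execute(enum op o, double *dst, const double *a,
                    const double *b, size_t n)
{
    switch (o) {
    case OP_ADD_ARRAYS:
        for (size_t i = 0; i < n; i++)
            dst[i] = a[i] + b[i];
        break;
    }
}
```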
At the opposite extreme, you can get pretty decent speed simply because with a sufficiently low-level language, the interpretation becomes almost trivial. The inner interpreter of a Forth implementation is a prime example of this. Threaded code of this kind also represents a given program quite compactly, which tends to be cache friendly, so at least in theory you could get execution as fast as (or even somewhat faster than) pure machine code.
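To get a feel for how little an inner interpreter has to do, here's a toy sketch in C -- function-pointer "threading" rather than a real Forth, which would normally use direct or indirect threading with the inner loop in assembly. The word names (`lit5`, `dup_`, and so on) are made up for the example.

```c
#include <stdio.h>

typedef void (*word_fn)(void);

static long stack[64];
static int  sp;                   /* data stack pointer */
static int  running = 1;

static void lit5(void) { stack[sp++] = 5; }                 /* push literal 5 */
static void dup_(void) { stack[sp] = stack[sp - 1]; sp++; } /* DUP            */
static void add_(void) { sp--; stack[sp - 1] += stack[sp]; }/* +              */
static void dot_(void) { printf("%ld\n", stack[--sp]); }    /* .              */
static void bye_(void) { running = 0; }                     /* BYE            */

int main(void)
{
    /* Threaded "program": 5 DUP + . BYE  -- prints 10. */
    const word_fn program[] = { lit5, dup_, add_, dot_, bye_ };
    const word_fn *ip = program;

    while (running)
        (*ip++)();                /* the entire inner interpreter */
    return 0;
}
```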
The original intent was that most JVMs, the .NET "managed" environment, Smalltalk byte-code interpreters, etc., would fit the latter case. As most who've tried to develop these have quickly found, it's hard to keep the interpretation overhead low enough to accomplish this. Every one I know of that fits into this class has the inner loop written in assembly language by hand, usually by a good assembly language programmer. I'd almost go so far as to say that if you try to use a higher-level language (even C), it's going to be slower, and probably significantly so: when you're paying overhead for every input instruction, even one extra instruction in the inner loop will almost certainly lead to a measurable speed penalty.
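To make that overhead concrete, here's a hypothetical byte-code dispatch loop in plain C (not any real JVM or CLR -- the opcodes and layout are invented for the example). Every VM instruction pays for the fetch, the switch dispatch, and the loop branch regardless of how little work the instruction itself does, which is why hand-tuned assembly (or compiler tricks like computed goto) can pay off so much here.

```c
#include <stdint.h>
#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

static void run(const uint8_t *code)
{
    long stack[64];
    int sp = 0;

    for (;;) {
        uint8_t op = *code++;         /* fetch: paid on every VM instruction    */
        switch (op) {                 /* dispatch: paid on every VM instruction */
        case OP_PUSH:  stack[sp++] = *code++;            break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case OP_PRINT: printf("%ld\n", stack[--sp]);     break;
        case OP_HALT:  return;
        }
    }
}

int main(void)
{
    const uint8_t program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    run(program);                     /* prints 5 */
    return 0;
}
```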