COM interfaces are rather like Java interfaces in one respect - they don't have data members. This means that interface inheritance behaves differently from class inheritance when multiple inheritance is used.
To start with, consider non-virtual inheritance with diamond-shaped inheritance patterns...
- B inherits A
- C inherits A
- D inherits B and C
An instance of D contains two separate instances of the data members of A. That means that when a pointer-to-A points into an instance of D, it needs to identify which instance of A within D it means - the pointer is different in each case, and pointer casts are not simple relabellings of the type - the address changes too.
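As a rough illustration of the non-virtual diamond, here is a minimal C++ sketch. The member names (a, x, y, z) and the two helper functions are made up for this example. It only shows that the two upcast paths reach different A subobjects and that writing through one path leaves the other copy alone.

```cpp
#include <cassert>

// Non-virtual diamond: D ends up with two separate A subobjects.
struct A { int a = 0; };
struct B : A { int x = 0; };
struct C : A { int y = 0; };
struct D : B, C { int z = 0; };

// True when the A reached through B and the A reached through C
// live at different addresses inside the same D.
bool hasTwoDistinctAs(D& d) {
    A* viaB = static_cast<B*>(&d);  // pick the A inside B
    A* viaC = static_cast<C*>(&d);  // pick the A inside C
    return viaB != viaC;
}

// Writes a different value through each path and checks that
// neither write clobbers the other copy.
bool copiesAreIndependent(D& d) {
    static_cast<B&>(d).a = 1;
    static_cast<C&>(d).a = 2;
    return static_cast<B&>(d).a == 1 && static_cast<C&>(d).a == 2;
}
```

Note that a direct conversion from D* to A* does not compile here, because the compiler cannot tell which A is meant. You have to go through B or C first.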
Now consider the same diamond with virtual inheritance. Instances of B, C and D each contain a single instance of A. If you think of B and C as having a fixed layout (including the A instance), this is a problem. If B's layout is [A, x] and C's layout is [A, y], then [B, C, z] is not valid for D - it would contain two instances of A. What you have to use is something like [A, B', C', z], where B' is everything from B except the inherited A, and likewise for C'.
This means that if you have a pointer-to-B, you don't have a single scheme for dereferencing the members inherited from A. Finding those members is different depending on whether the pointer points to a pure-B or a B-within-D or a B-within-something-else. The compiler needs some run-time clue (virtual tables) to find the inherited-from-A members. You end up needing several pointers to several virtual tables in the D instance, as there's a vtable for the inherited B, one for the inherited C and so on, which implies some memory overhead.
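The virtual case can be sketched the same way. Again, the member names and helpers are invented for illustration. The offsetOfA helper measures how far the A subobject sits from a given B; the actual numbers are compiler-specific, but with common ABIs they differ between a stand-alone B and a B-within-D, which is why the lookup has to happen at run time.

```cpp
#include <cassert>
#include <cstdint>

// Virtual diamond: B and C share a single A inside D.
struct A { int a = 0; };
struct B : virtual A { int x = 0; };
struct C : virtual A { int y = 0; };
struct D : B, C { int z = 0; };

// True when both upcast paths arrive at the same A subobject.
bool sharesOneA(D& d) {
    A* viaB = static_cast<B*>(&d);
    A* viaC = static_cast<C*>(&d);
    return viaB == viaC;
}

// Distance in bytes from a B to its A. The conversion B* -> A*
// consults run-time information, so the result depends on what
// the B is embedded in. The values are implementation-defined.
std::intptr_t offsetOfA(B* b) {
    A* a = b;
    return reinterpret_cast<std::intptr_t>(a) -
           reinterpret_cast<std::intptr_t>(b);
}
```

Calling offsetOfA on a plain B and on the B part of a D will typically give two different values, though nothing in the language guarantees what they are.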
Single inheritance doesn't have these issues. Memory layout of instances is kept simple, and virtual tables are simpler too. That's why Java disallows multiple inheritance for classes. In interface inheritance there are no data members, so again these problems simply don't arise - there's no issue of which-inherited-A-within-D, nor of different ways to find A-within-B depending on what that particular B happens to be within. Both COM and Java can allow multiple inheritance of interfaces without having to handle these complications.
EDIT
I forgot to say - without data members, there is no real difference between virtual and non-virtual inheritance. However, with Visual C++, the layout is probably different even if there are no data members - using the same rules for each inheritance style consistently irrespective of whether any data members are present or not.
Also, the COM memory-layout matches the Visual-C++ layout (for supported inheritance types) because it was designed to do that. There's no reason why COM couldn't have been designed to support multiple and virtual inheritance of "interfaces" with data members. Microsoft could have designed COM to support the same inheritance model as C++, but chose not to - and there's no reason why they should have done otherwise.
Early COM code was often written in C, meaning hand-written struct layouts that had to precisely match the Visual-C++ layout to work. Layouts for multiple and virtual inheritance - well, I wouldn't volunteer to do it manually. Besides, COM was always its own thing, meant to link code written in many different languages. It was never intended to be tied to C++.
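To give a feel for what "hand-written struct layouts" means, here is a small sketch in C-style C++. ICounter, CounterImpl and the helper functions are hypothetical and this is not a real COM interface (there is no IUnknown, no HRESULT, no calling-convention macros). It only shows the general shape: an object whose first member is a pointer to a table of function pointers.

```cpp
#include <cassert>

struct ICounter;  // hypothetical interface, for illustration only

// Hand-written "vtable": one function pointer per method, in a fixed order.
struct ICounterVtbl {
    int (*Increment)(ICounter* self);
    int (*Get)(ICounter* self);
};

// The interface itself is just a pointer to that table.
struct ICounter {
    const ICounterVtbl* lpVtbl;
};

// The implementation embeds the interface as its FIRST member, so a
// pointer to the interface is also a pointer to the implementation.
struct CounterImpl {
    ICounter iface;
    int value;
};

int Counter_Increment(ICounter* self) {
    return ++reinterpret_cast<CounterImpl*>(self)->value;
}

int Counter_Get(ICounter* self) {
    return reinterpret_cast<CounterImpl*>(self)->value;
}

const ICounterVtbl kCounterVtbl = { Counter_Increment, Counter_Get };

CounterImpl makeCounter() {
    return CounterImpl{ { &kCounterVtbl }, 0 };
}

// A caller only ever sees ICounter* and dispatches through the table.
int bumpTwice(ICounter* c) {
    c->lpVtbl->Increment(c);
    return c->lpVtbl->Increment(c);
}
```

With single inheritance this stays manageable, since a derived interface just appends entries to the table. Doing the same by hand for multiple or virtual inheritance layouts would be far more error-prone.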
YET MORE EDITING
I realised I missed a key point.
In COM, the only layout issue that matters is the virtual table, which only has to handle method dispatch. There are significant differences in layout depending on whether you take the virtual or non-virtual approach, similar to the layout of an object with data members...
- For non-virtual, the D vtable contains an A-within-B vtable and an A-within-C vtable.
- For virtual, the A only occurs once within D's vtable, but the object contains multiple vtables and pointer casts need address changes.
With interface-inheritance, this is basically implementation detail - there's only one set of method implementations for A.
In the non-virtual case, the two copies of the A virtual table would be identical (leading to the same method implementations). It's a slightly larger virtual table, but the per-object overhead is less and the pointer casts are just type-relabelling (no address change). It's a simpler and more efficient implementation.
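The "one set of method implementations for A" point can be sketched with plain C++ abstract classes standing in for interfaces. IA, IB, IC and Impl are hypothetical names, and this is ordinary C++ rather than real COM. It only demonstrates that a single override serves both inherited copies of the A methods; it says nothing about how a particular compiler lays out the tables.

```cpp
#include <cassert>

// Data-free "interfaces" forming a non-virtual diamond.
struct IA { virtual int id() const = 0; };
struct IB : IA { virtual int b() const = 0; };
struct IC : IA { virtual int c() const = 0; };

// One implementing class. The single id() below overrides the IA
// method inherited through IB and the one inherited through IC.
struct Impl : IB, IC {
    int id() const override { return 42; }
    int b() const override { return 1; }
    int c() const override { return 2; }
};

// Dispatch IA::id through the IB path.
int idViaB(const Impl& o) {
    const IB& asB = o;
    return asB.id();
}

// Dispatch IA::id through the IC path.
int idViaC(const Impl& o) {
    const IC& asC = o;
    return asC.id();
}
```

Both paths end up in the same Impl::id, so with no data members in IA there is nothing to tell the two copies apart.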
COM can't detect the virtual case because there's no indicator in the object or vtable. Also, there's no point supporting both conventions when there are no data members. It just supports the one simple convention.