imul is more flexible because its two- and three-operand forms accept fairly arbitrary operand registers, whereas mul necessarily uses eax as one of its inputs and writes the result into edx:eax. That makes imul easier for the compiler to use.
imul is nominally for signed integer types, but when multiplying two 32-bit values, the least significant 32 bits of the result are the same whether you consider the values to be signed or unsigned. In other words, the difference between a signed and an unsigned multiply becomes apparent only if you look at the "upper" half of the result, which mul puts in edx and which the two- and three-operand forms of imul simply discard. In C, the result of an arithmetic operation has the same type as its operands (if you multiply two int together, you get an int, not a long long): the "upper half" is not retained. Hence the C compiler only needs what imul provides, and since imul is easier to use than mul, the C compiler uses imul.
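To make the "same lower half, different upper half" point concrete, here is a minimal C sketch. The helper names (low_unsigned, low_signed, high_unsigned, high_signed) are made up for this illustration, and the code assumes the usual two's-complement wrap-around when a uint32_t is converted to int32_t, which mainstream compilers provide.

```c
#include <stdint.h>

/* Low 32 bits of the full product, operands treated as unsigned. */
static uint32_t low_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}

/* Low 32 bits of the full product, operands reinterpreted as signed.
   Assumption: uint32_t -> int32_t conversion wraps (two's complement). */
static uint32_t low_signed(uint32_t a, uint32_t b) {
    int64_t p = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)(uint64_t)p;
}

/* High 32 bits of the unsigned product: what mul leaves in edx. */
static uint32_t high_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* High 32 bits of the signed product: this half depends on signedness. */
static uint32_t high_signed(uint32_t a, uint32_t b) {
    int64_t p = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)((uint64_t)p >> 32);
}
```

With a = 0xFFFFFFFF (which is -1 when read as signed) and b = 2, both low_* helpers return 0xFFFFFFFE, while high_unsigned returns 1 and high_signed returns 0xFFFFFFFF. Only the upper half tells the two interpretations apart.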
As a second-order effect, since C compilers use imul and not mul, Intel and AMD invest more effort in optimizing imul than mul, making the former faster on recent processors. This makes imul even more attractive.
mul is useful when implementing big-number arithmetic. In C, in 32-bit mode, you should get some mul instructions by multiplying long long values together. But, depending on the compiler and OS, those mul opcodes may be hidden in some dedicated function, so you will not necessarily see them. In 64-bit mode, long long has only 64 bits, not 128, and the compiler will simply use imul.
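As an illustration of the big-number case, here is a rough sketch of a single-limb multiply over 32-bit limbs. The function name bignum_mul_limb and the little-endian limb layout are assumptions made for this example, not taken from any particular library. The widening (uint64_t) multiply is the kind of expression for which a 32-bit compiler may emit mul, since both halves of the product are needed; whether it actually does depends on the compiler and its settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply the n-limb number a[] (least significant
   limb first) by the single limb b. Writes n limbs to r[] and returns the
   carry-out limb. */
static uint32_t bignum_mul_limb(uint32_t *r, const uint32_t *a,
                                size_t n, uint32_t b) {
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Full 32x32 -> 64 product plus carry; this cannot overflow,
           because (2^32 - 1)^2 + (2^32 - 1) < 2^64. */
        uint64_t t = (uint64_t)a[i] * b + carry;
        r[i] = (uint32_t)t;            /* low half stays in this limb */
        carry = (uint32_t)(t >> 32);   /* high half moves to the next limb */
    }
    return carry;
}
```

Here the upper half of each product is exactly what gets carried into the next limb, which is why an unsigned full-width multiply is the natural fit.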