If you think "PC == Windows", then adding assembler to a C program doesn't hurt much. If you step into the Unix world, you'll have lots of different CPUs: PPC in the PS3 and Xbox 360, in old Macs and in many powerful servers. For many small devices, you'll have ARM. Embedded devices (which account for the vast majority of installed CPUs today) often use their own custom CPU with a special instruction set.
So while many PCs today will be able to run Intel code, that accounts only for a small fraction of all CPUs out there.
That said, x86 code is not always the same, either. There are two main reasons for assembly code: you need to access special features (like interrupt registers) or you want to optimize the code. In the first case, the code is pretty portable across x86 CPUs. In the latter case, each CPU is a little bit different. Some of them have SSE. But SSE was soon extended by SSE2, which was followed by SSE3 and SSE4. AMD has its own extensions (such as 3DNow!). Soon, there will be AVX. On the opcode level, each instruction has slightly different timing on the various versions of CPUs.
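To make the "each CPU is a little bit different" point concrete, here is a minimal sketch of checking at run time which SIMD extension is available before choosing an optimized code path. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` is a compiler builtin (not standard C); the function name `best_simd_level` is made up for this example, and on any other compiler or CPU it simply reports "none".

```c
/* Returns the name of the newest SIMD extension (from a short list)
   that the running CPU reports. Hypothetical helper for illustration. */
const char *best_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* GCC/Clang builtins; they query CPUID behind the scenes. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))    return "avx";
    if (__builtin_cpu_supports("sse4.2")) return "sse4.2";
    if (__builtin_cpu_supports("sse3"))   return "sse3";
    if (__builtin_cpu_supports("sse2"))   return "sse2";
    if (__builtin_cpu_supports("sse"))    return "sse";
    return "none";
#else
    /* Not x86, or a compiler without these builtins: no x86 SIMD path. */
    return "none";
#endif
}
```

A program would call `best_simd_level()` once at startup and pick the matching routine, always keeping a plain C version as the fallback. Note that this only tells you which instructions exist, not how fast they are on that particular CPU.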
To make things worse, some opcodes have bugs that are fixed only in specific steppings of a CPU. On top of that, some opcodes are much faster on certain versions of CPUs than on others.
Next, you'll need to interface this assembly code with the C part. That usually means you need to deal with ABI issues.
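As a rough illustration of the interfacing problem, the sketch below uses GCC/Clang extended inline assembly, which lets the compiler pick the registers instead of you hard-coding them for one calling convention. It assumes GCC or Clang on x86 with the default AT&T syntax; the function name `add_ints` is hypothetical, and every other platform gets the plain C fallback.

```c
/* Adds two ints. Uses a single x86 instruction via inline assembly
   where available, plain C everywhere else. */
int add_ints(int a, int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __asm__("addl %1, %0"   /* AT&T order: a = a + b */
            : "+r"(a)       /* a: read and written, any register */
            : "r"(b)        /* b: read-only input, any register */
            : "cc");        /* the flags register is clobbered */
    return a;
#else
    return a + b;           /* portable fallback */
#endif
}
```

Calling `add_ints(2, 3)` gives 5 on either path. The constraints are the part that matters: they tell the compiler what the assembly reads, writes and destroys, so it can fit the snippet into whatever ABI the target uses. A separate `.s` file would not get that help, and you would have to follow the platform's calling convention by hand.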
So you can see that this can become arbitrarily complex.