Assume we have a memory-mapped device occupying a certain range of the address space. The CPU wants to read something from the device, so it reads a word in that range. What really happens? If the memory controller also responds, there will be contention on the bus, since both RAM and the device are trying to answer the same request.
It's pretty architecture- and system-dependent, I guess. On ARM, for example, at the lowest level you can't have one address pointing to two memory locations at the same time. The MMU has to manage that: your kernel (or other low-level code) sets up the MMU for each situation so that your devices aren't mapped to the same spot as the RAM you want to use.
It can't happen, because there are more physical addresses than RAM addresses.
At the physical level the bus has more address lines than are needed for just the RAM. The processor either decodes the addresses and selects RAM with some and I/O with others, or sometimes just uses an entire address line to mean RAM if it's in one state or I/O if it's in another.
In an embedded system, it might be implemented exactly as I described, but mostly there is another layer involving the MMU and multiple I/O busses. Rarely, an always-on MMU of some type will specify the I/O bus or memory-vs-I/O in a page table entry, but then that PTE is selected by address, so it all comes down to addressing in the end.
Ultimately, then, some type of addressing mechanism causes a path to be taken that activates a physical signal somewhere. The hardware just decodes the addresses so that RAM and I/O never get selected at the same time.
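To make the "one address line means RAM or I/O" case concrete, here is a minimal software model of such a decoder. It is only a sketch under made-up assumptions: a 16-bit bus, A15 as the select line, one invented device register at 0x8000, and unmapped I/O reading back as 0xFF. None of these numbers come from a real chip.

```python
# Hypothetical 16-bit address bus: the top address line (A15) picks the target.
IO_SELECT_BIT = 1 << 15  # assumption: A15 high selects I/O, A15 low selects RAM


def decode(address):
    """Return the single target that is selected for this address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address does not fit on a 16-bit bus")
    return "IO" if address & IO_SELECT_BIT else "RAM"


class Bus:
    def __init__(self):
        self.ram = bytearray(0x8000)   # 32 KiB of RAM at 0x0000-0x7FFF
        self.io_regs = {0x8000: 0x5A}  # one made-up device register

    def read(self, address):
        if decode(address) == "RAM":
            return self.ram[address]
        # RAM is not selected here, so only the device answers.
        return self.io_regs.get(address, 0xFF)  # assumption: unmapped I/O reads 0xFF
```

Because `decode` returns exactly one target for every address, the model has no way for RAM and the device to answer the same read, which is the point of doing the decode in hardware.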
The possibly confusing part is that with an MMU you can put an I/O page right next to a RAM page in the virtual address space, but the physical addresses those pages map to are likely to be far apart.
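A toy translation table shows the effect. The page size, the page numbers and both physical base addresses below are invented for illustration; a real MMU walks multi-level tables and also checks permissions.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Hypothetical page table: virtual page number -> physical base address.
PAGE_TABLE = {
    0x10: 0x0004_0000,  # virtual page 0x10 -> ordinary RAM
    0x11: 0xFE00_0000,  # virtual page 0x11 -> device registers, far away physically
}


def translate(virtual_address):
    """Translate a virtual address to a physical one using the table above."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in PAGE_TABLE:
        raise LookupError(f"page fault at {virtual_address:#x}")
    return PAGE_TABLE[vpn] + offset
```

Here the virtual addresses 0x10FFF and 0x11000 are one byte apart, yet `translate` sends the first to RAM and the second to the device's region, so the two never share a physical address.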
There are various methods to implement that:
Support in the memory controller, e.g. allowing certain memory ranges to be redirected to another controller. This is implicit in NUMA architectures: the "closest" memory controller handles a certain address range and passes all other requests on. Somewhere in the chain, or at the end of it, you can put circuitry to handle memory-mapped devices. This is common in microcontrollers, which usually handle on-chip RAM, PROM and/or flash as well as externally connected memory.
Hardware with direct memory access (DMA): this can be provided by any controller that allows multiple devices to access the same memory. The external device simply writes into the RAM that you are going to read from. You need an additional protocol for synchronization, but that can be supported or provided by the memory controller.
Soft faulting: in virtual memory systems, accessing an invalid address causes a soft fault, and the fault handler can then provide the actual value to be read from a port. This comes with an obvious performance penalty, but for small data sets it may be negligible compared to the actual hardware access.
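The soft-faulting variant is the easiest of the three to mimic in software, so here is a rough sketch of it. Everything in it is hypothetical: `PageFault`, `FakeDevice` and `Machine` are illustration-only names, the "device" just returns a fixed byte, and the offset within the trapped page is used as the port number purely for convenience.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


class PageFault(Exception):
    """Raised when an access hits the deliberately unmapped page."""

    def __init__(self, address):
        super().__init__(f"fault at {address:#x}")
        self.address = address


class FakeDevice:
    """Stand-in for a device behind an I/O port; always returns 0x42."""

    def read_port(self, port):
        return 0x42


class Machine:
    def __init__(self, device, trapped_page):
        self.ram = {}                     # sparse RAM: address -> byte value
        self.device = device
        self.trapped_page = trapped_page  # page left invalid on purpose

    def raw_read(self, address):
        if address // PAGE_SIZE == self.trapped_page:
            raise PageFault(address)
        return self.ram.get(address, 0)

    def read(self, address):
        try:
            return self.raw_read(address)
        except PageFault as fault:
            # Fault handler: satisfy the read from the port instead of RAM.
            port = fault.address % PAGE_SIZE
            return self.device.read_port(port)
```

Every access to the trapped page pays for raising and handling the exception, which mirrors the performance penalty mentioned above.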
Disclaimer: these are educated guesses, with some information picked up by looking over other people's shoulders. But I prefer thinking about it before asking Wikipedia :)
The chipset and BIOS of a PC work together to put a hole in the RAM. If you try to address the space reserved for I/O, the memory controller isn't allowed to respond.
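As a rough model of that hole, the memory controller can be thought of as answering only when the address is backed by RAM and lies outside the reserved window. The layout here is assumed for the sake of the example (4 GiB installed, the top 1 GiB of the 32-bit space reserved for I/O); real chipsets configure the window differently.

```python
GIB = 1 << 30

# Assumed layout, not taken from any specific chipset.
HOLE_START = 3 * GIB  # first address reserved for I/O
HOLE_END = 4 * GIB    # exclusive upper bound of the hole


def memory_controller_claims(physical_address, installed_ram=4 * GIB):
    """Return True if the RAM controller is allowed to answer this address."""
    if HOLE_START <= physical_address < HOLE_END:
        return False  # inside the hole: leave the bus to the device
    return 0 <= physical_address < installed_ram
```

With these numbers, `memory_controller_claims(0x1000)` is true, while any address from 0xC0000000 up to 0xFFFFFFFF is left for a device to answer.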