Shared memory gives about the highest bandwidth of any form of IPC available, but it's also a pain to manage -- you need to synchronize access to the shared memory yourself, just as you would with threads. If you really need that raw bandwidth, it's about the best there is -- but a design that needs that kind of bandwidth often has a poorly chosen dividing line between the processes, in which case it may be unnecessarily difficult to get it to work well.
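For concreteness, here's a minimal sketch of what that synchronization looks like with POSIX shared memory and a process-shared mutex. The segment name, struct layout, and message are just illustrative, not anything prescribed:

```c
/* Sketch: shared-memory IPC with explicit synchronization.
 * Compile with: cc shm_demo.c -lpthread -lrt
 * Error checking omitted for brevity. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared {
    pthread_mutex_t lock;   /* must be process-shared, not thread-only */
    char buf[256];
};

int main(void)
{
    /* Create and size the shared segment, then map it. */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(struct shared));
    struct shared *sh = mmap(NULL, sizeof *sh,
                             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* You have to build the synchronization yourself: a mutex marked
     * PTHREAD_PROCESS_SHARED so it works across process boundaries. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&sh->lock, &attr);

    if (fork() == 0) {                      /* child: writer */
        pthread_mutex_lock(&sh->lock);
        strcpy(sh->buf, "hello via shared memory");
        pthread_mutex_unlock(&sh->lock);
        _exit(0);
    }

    wait(NULL);                             /* parent: reader */
    pthread_mutex_lock(&sh->lock);
    printf("%s\n", sh->buf);
    pthread_mutex_unlock(&sh->lock);

    shm_unlink("/demo_shm");
    return 0;
}
```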
Also note that pipes (for one example) are a lot easier to use, and still offer pretty serious bandwidth -- they still (normally) use a kernel-allocated buffer in memory, but the kernel automates synchronizing access to it. The bandwidth loss comes from the fact that automated synchronization has to use a fairly pessimistic locking algorithm. Even so, that doesn't impose a huge amount of overhead...
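For comparison, here's the same exchange over a pipe -- the kernel handles the buffering and the locking, so there's no synchronization code at all on your side (again, the message and variable names are just illustrative):

```c
/* Sketch: the same parent/child exchange over a pipe.
 * Error checking omitted for brevity. */
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    pipe(fds);                              /* fds[0] = read end, fds[1] = write end */

    if (fork() == 0) {                      /* child: writer */
        close(fds[0]);
        const char msg[] = "hello via pipe";
        write(fds[1], msg, sizeof msg);     /* kernel serializes access to the buffer */
        _exit(0);
    }

    close(fds[1]);                          /* parent: reader */
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof buf);
    if (n > 0)
        printf("%s\n", buf);
    wait(NULL);
    return 0;
}
```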