Why do programmers tend to turn to assembly instead of GPGPU when they run into serious performance problems? I know that the architectures are different, but implementing an algorithm for two different architectures seems like a reasonable cost given the performance benefit (at least in some cases).
These days programmers tend to look for simple ways of writing multithreaded programs in higher-level languages rather than writing anything in assembly. It's much cheaper and faster to buy an 8-core machine and use something like PLINQ than to hand-optimize processor instructions.
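To give a feel for how little code that kind of high-level parallelism takes, here is a minimal sketch in Python rather than C#/PLINQ. It uses the standard library's `concurrent.futures` as a stand-in for PLINQ, and the prime-counting workload and the function names (`is_prime`, `count_primes_serial`, `count_primes_parallel`) are made up purely for illustration.

```python
import math
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; deliberately CPU-bound so that parallelism can pay off."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def count_primes_serial(numbers):
    """Baseline: one core, a plain map over the input."""
    return sum(map(is_prime, numbers))


def count_primes_parallel(numbers, workers=None, executor_cls=ProcessPoolExecutor):
    """Same computation, with the map spread across a pool of workers.

    executor_cls is a hypothetical knob for this sketch: ProcessPoolExecutor
    uses several cores, while ThreadPoolExecutor is handy for quick checks
    but will not speed up CPU-bound work in CPython because of the GIL.
    """
    with executor_cls(max_workers=workers) as pool:
        # chunksize batches the work items to cut inter-process overhead
        # (it has no effect with a thread pool).
        return sum(pool.map(is_prime, numbers, chunksize=1024))
```

The parallel version differs from the serial one by essentially one line, which is the point: no instruction-level tuning is involved. When using the process pool from a script, the call should sit under an `if __name__ == "__main__":` guard, and any actual speedup depends on the workload and the number of cores.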
Only in the last 6-12 months have graphics accelerator manufacturers begun to actively support and promote this technology, and standards like OpenCL are only beginning to emerge. I suppose you are asking because of personal experience: did you suggest GPGPU as a solution to a performance problem, only to see a different route taken? Because the technology is so new, few people actually know what it is, fewer still will feel adventurous enough to try it, and even fewer will find themselves in a real-life project where the risk, platform implications, etc. justify such a decision.