I have a C# prototype that is heavily data parallel, and I've had extremely successful order of magnitude speed ups using the Parallel.For construct in .NETv4. Now I need to write the program in native code, and I'm wondering what my options are. I would prefer something somewhat portable across various operating systems and compilers, but if push comes to shove, I can go with something Windows/VC++.
I've looked at OpenMP, which looks interesting, but I have no idea whether this is looked upon as a good solution by other programmers. I'd also love suggestions for other portable libraries I can look at.