With Windows 7 probably going to RTM next October (and DirectX 11 with it), would it be worth waiting for DirectX 11's explicit GPGPU features, which should work across GPU vendors (ATI/Nvidia; I don't mean Windows/Linux/Mac/whatever), or should I create a CUDA application now?
From a learning point of view, I think you would benefit from starting with CUDA now, since it will help you a lot with thinking in terms of data parallelism, which is what GPUs are good at. Then, when/if you turn to DirectX 11, you will have a good foundation for working with it. It does depend on how much time you have available (i.e. whether you have time to experiment with stuff just for the learning experience).
Alternatively, Apple is pushing for OpenCL (Open Computing Language) to be the general solution, though not much is known about it at this point. This is another technology you could wait for and check out.
Microsoft's PDC is being held later this month; maybe they will announce some useful info on DX11 to help you make up your mind.
My general advice is that there is a lot you can learn now and reuse later (with DX11 or OpenCL), but you have to ask yourself whether you are willing to learn a technology which might not make it in the long run. Anyway, these are just my thoughts; I don't have a huge amount of experience with CUDA yet.
On a highly speculative note, my gut feeling is that vendor-specific APIs such as CUDA won't survive for long and that DirectX and/or OpenCL are the only solutions which have a future (unless they really botch their implementations, which I doubt).
In my experience, the major hurdle in moving from general-purpose processor programming to GPGPU programming is the conceptual leap. The key here is data-parallel code.
Even in a multi-threaded environment on a CPU, each thread is doing its own thing at a low level, and synchronization between threads is a relatively rare occurrence. To use the power of a GPU, you need to be running thousands of threads which are logically executing the same instructions, on different data, almost completely in sync.
Learning the CUDA syntax is relatively quick compared to getting one's head around the data-parallel paradigm, so if you intend to tool up for GPGPU programming, starting with CUDA now would be a very worthwhile move.
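To make the "same instructions, different data" idea a bit more concrete, here is a rough sketch in plain C++ (not real CUDA, so it runs on any machine). It only mimics the shape of a GPU kernel: one small function that handles a single element, selected by an index. The names saxpy_kernel, launch and run_saxpy are made up for illustration, and the loops in launch stand in for what a GPU would do with thousands of concurrent threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The "kernel": the work of ONE logical thread, identified only by its index i.
// In CUDA this would be a __global__ function, and i would be computed as
// blockIdx.x * blockDim.x + threadIdx.x.
// Assumption: x has at least as many elements as y.
void saxpy_kernel(std::size_t i, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    if (i < y.size()) {  // guard: the launch may cover more indices than elements
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical stand-in for a GPU launch: calls the kernel once per index,
// one after another. A real GPU would run these invocations concurrently.
void launch(std::size_t blocks, std::size_t threads_per_block, float a,
            const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t b = 0; b < blocks; ++b) {
        for (std::size_t t = 0; t < threads_per_block; ++t) {
            saxpy_kernel(b * threads_per_block + t, a, x, y);
        }
    }
}

// Computes a * x + y element-wise and returns the result.
std::vector<float> run_saxpy(float a, const std::vector<float>& x,
                             std::vector<float> y) {
    const std::size_t threads_per_block = 256;
    // Round up so every element is covered by some (block, thread) pair.
    const std::size_t blocks =
        (y.size() + threads_per_block - 1) / threads_per_block;
    launch(blocks, threads_per_block, a, x, y);
    return y;
}
```

Calling run_saxpy(3.0f, x, y) on two equally sized vectors returns a vector holding 3 * x[i] + y[i] for every i. The thing to notice is that the kernel has no loop of its own and no knowledge of its neighbours; that restriction is what lets the hardware run every index at once, and it is the habit of mind that takes longer to pick up than the syntax.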
If you want the learning experience, go for it!
Another alternative is AMD/ATI's Stream SDK, which you can download here: http://ati.amd.com/technology/streamcomputing/sdkdwnld.html
Nvidia's CUDA and ATI's CAL are roughly equivalent in features. CUDA only works on Nvidia GPUs and CAL only works on ATI GPUs.
Eventually, there will be good cross-platform development tools, but that's a huge void right now. DirectX 11 compute shaders and OpenCL will be fighting it out to be the tool of choice, but neither one is available yet.
If you want to build a "real" app, and not just a throw-away learning experience, and you want it to work across vendors, there are some alternatives: Brook, for example. Also, people have been doing GPGPU work with both DirectX and OpenGL (not OpenCL) for several years, without waiting for explicit GPGPU features. Go to gpgpu.org for pointers.
Both DirectX 11 Compute Shaders and OpenCL are mainly based on CUDA, so it is definitely worth starting to work with CUDA now. Basically, they all use the same memory model and have a similar syntax, which is closer to CUDA than to Brook+ (which you would use with the Stream SDK).
However, if you want DX11, there is no need to wait: just grab the November 2008 DirectX SDK from Microsoft, which comes with a DX11 preview that you can already use to write (at least) simple compute shader applications.