I want to start learning how to program in CUDA: not just the language, but also program design. For example, I've heard that kernels should be written without conditionals so that all the threads run the same instructions and synchronization overhead stays minimal.
I've also heard that the Python wrapper is a lot more intuitive to use and code with than the C library.
So, assuming that the languages I already know or don't know aren't a barrier, which language is best for starting to learn CUDA?
Which one gives you the best idea of the dos and don'ts in CUDA, along with the easiest learning curve?
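
To make the "kernels without conditionals" idea concrete, here is a minimal sketch in CUDA C++ of what I understand it to mean, and I may have this wrong. The kernel names, the `outputs_match` helper, and the test data are all made up for illustration. The first kernel picks its result with an `if`/`else` on the data, while the second computes the same value with `fmaxf` so every thread follows one instruction path. For a branch this small the compiler may well emit the same code for both, so treat it as an illustration of the style rather than a measured optimization.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Data-dependent branch: neighbouring threads may take different paths.
__global__ void scale_branchy(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {                      // bounds check only
        if (in[i] > 0.0f) {
            out[i] = in[i] * 2.0f;
        } else {
            out[i] = 0.0f;
        }
    }
}

// Same result without a data-dependent branch: clamp, then scale.
__global__ void scale_branchless(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {                      // bounds check only
        out[i] = fmaxf(in[i], 0.0f) * 2.0f;
    }
}

// Hypothetical helper: runs both kernels on the same input and
// reports whether their outputs agree element by element.
bool outputs_match() {
    const int n = 1024;
    float *in = nullptr, *a = nullptr, *b = nullptr;
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMallocManaged(&a, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMallocManaged(&b, n * sizeof(float)) != cudaSuccess) return false;

    // Alternate signs so adjacent threads would take different branches.
    for (int i = 0; i < n; ++i) {
        in[i] = (i % 2 == 0) ? float(i) : -float(i);
    }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale_branchy<<<blocks, threads>>>(in, a, n);
    scale_branchless<<<blocks, threads>>>(in, b, n);

    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    for (int i = 0; ok && i < n; ++i) {
        ok = (a[i] == b[i]);
    }

    cudaFree(in);
    cudaFree(a);
    cudaFree(b);
    return ok;
}

int main() {
    printf("%s\n", outputs_match() ? "match" : "mismatch");
    return 0;
}
```

This should build with `nvcc` and needs a CUDA-capable device to run. It is this kind of design habit, rather than syntax, that I'd like the learning path to teach.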