ansaurus

Question

Answer 1

+2 A:

for every reiteration of i, size, index, code, etc. have to be copied from host to device.. the way you have your program, there is not much you can do. For best results, consider moving entire i loop on the device, this way you will not have host to device copies.

Trust is great for some things, however where performance is concerned and algorithm does not quite fit available functions, you may have to rewrite for best performance without using thrust algorithms explicitly.

aaa 2010-03-09 03:36:50

Solved it with complete new device code, using thrust to copy everything i need.

macs 2010-03-09 19:46:49

ansaurus

tags:

views:

answers:

Optimize CUDA with Thrust in a loop

related questions