Hi. I'm trying to add some CUDA functionality to my existing game engine. Specifically, I want to render to a texture, post-process it with CUDA, and then use the processed texture in shaders when rendering. More precisely, I want to compute a summed area table for a texture, because doing it with multiple passes over screen-aligned quads (as is usually advised) is terribly slow for a 2048x2048 texture.
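For readers unfamiliar with the term: a summed area table stores, at each texel, the sum of all texels above and to the left of it, inclusive. The following is only a minimal CPU reference sketch in C++, not the CUDA or multi-pass GPU version discussed here. The names buildSummedAreaTable and boxSum and the flat row-major float layout are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Builds an inclusive summed area table for a row-major image.
// sat[y * width + x] holds the sum of src over all (i, j) with i <= x, j <= y.
// Doubles are used for the sums to limit precision loss on large textures.
std::vector<double> buildSummedAreaTable(const std::vector<float>& src,
                                         std::size_t width, std::size_t height)
{
    std::vector<double> sat(width * height, 0.0);
    for (std::size_t y = 0; y < height; ++y) {
        double rowSum = 0.0;  // running sum of the current row up to x
        for (std::size_t x = 0; x < width; ++x) {
            rowSum += src[y * width + x];
            const double above = (y > 0) ? sat[(y - 1) * width + x] : 0.0;
            sat[y * width + x] = rowSum + above;
        }
    }
    return sat;
}

// Sum of src over the inclusive rectangle [x0..x1] x [y0..y1],
// computed from at most four table lookups.
double boxSum(const std::vector<double>& sat, std::size_t width,
              std::size_t x0, std::size_t y0, std::size_t x1, std::size_t y1)
{
    const double d = sat[y1 * width + x1];
    const double b = (y0 > 0) ? sat[(y0 - 1) * width + x1] : 0.0;
    const double c = (x0 > 0) ? sat[y1 * width + (x0 - 1)] : 0.0;
    const double a = (x0 > 0 && y0 > 0) ? sat[(y0 - 1) * width + (x0 - 1)] : 0.0;
    return d - b - c + a;
}
```

A reference like this can be handy for checking the output of a GPU implementation on a small texture before moving to the full 2048x2048 case.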
I'm not really sure that CUDA will make it much faster, but at least I want to give it a try.
I installed the latest CUDA (5.0). All the samples compile and work fine.
But I'm facing a rather silly error and I can't find what causes it.
I did everything exactly as in the D3D9 samples:
cudaGetDeviceCount - returns 1 device (my card), good.
cudaSetDevice - I set it to my device 0 (though this seems to be optional). No error returned.
cudaD3D9SetDirect3DDevice - I pass my D3D device, which is not null. No error returned.
cudaGraphicsD3D9RegisterResource - I pass a pointer to an uninitialized cudaGraphicsResource*, a 2D texture, and cudaGraphicsRegisterFlagsNone. After that I get:
All CUDA-capable devices are busy or unavailable
Why? I can't see any difference between the API calls in my app and those in the samples.
The only cause of this error I've found by googling so far is that cudaDeviceProp::computeMode may be set to a non-default value (such as cudaComputeModeProhibited), but that's not my case (I checked with cudaGetDeviceProperties).
Thanks in advance.
Things are getting even weirder: if I create a fresh project, I can write and run any CUDA code there without a problem. For example, cudaMallocPitch runs fine right at the start of main. BUT pasting the same cudaMallocPitch call at the start of main in my old engine project produces that error!
Ah, stupid me. It turns out that a long time ago I changed the default heap/stack allocation sizes in the linker settings. Setting them back to the default values fixed the error.
I should probably close this topic, but I don't see how.