Yes! We've made great progress on converting my project to CUDA! Without any optimizations, it's already 6 times as fast as the CPU version. That's not as much as I'd hoped for, but, hell, it's real-time(-ish) now!
No pretty pictures this time... Instead, a YouTube video!
Grasping the basic idea of keeping host and device well separated isn't all that hard. Actually putting it into practice, however, caused quite a few headaches. Debugging a program with hundreds of threads running in parallel is hard, I tell ya. But now all of the basic functionality seems to be in place.
Still on the to-do list:
- Octree - way more objects in the scene!
- Optimization - 6 times as fast as a single CPU core should be far from the limit!
- Block editor - we really need to be able to place/remove objects for our assignment...