Posted 27 October 2012 - 03:54 PM
The conclusion I came to is that truly general GPU computing, the kind you can switch into and out of arbitrarily and as often as you like, just isn't here yet. On current hardware it's best to think of CS as a separate "mode" that your GPU can run in: it can only run in one mode at a time, and switching between modes carries overhead.
It's also worth bearing in mind that loading everything onto the GPU is probably not a wise idea. The CPU is still a very capable processor, and it is better than the GPU at the kind of branchy, recursive code you seem to have in mind. The GPU also risks ending up with fewer resources available for its real job of transforming vertices and shading pixels. You need to choose and distribute your workloads wisely, as the key to performance is balancing load across both processors.
I did some experiments in a similar direction earlier this year for some dynamic texture updates I was doing. The usage pattern was to dispatch a bunch of CS invocations, switch back to normal rendering, draw, dispatch more CS, and so on a few times per frame. The performance wasn't horrible, but it was certainly a good deal slower than just doing the calculations on the CPU and incurring the overhead of updating a texture that may currently be in use for drawing.
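To make the "mode" idea concrete, here is a minimal sketch that counts how often a frame's work list changes between compute and graphics. It is a toy model only: the `Mode` enum, `countModeSwitches`, and the two example frames are hypothetical names invented for illustration, it calls no real graphics API, and it says nothing about what a switch actually costs on any given hardware.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical labels for the two "modes" described above.
enum class Mode { Compute, Graphics };

// Counts how many times consecutive work items in one frame change mode.
// Each change stands in for one CS <-> rendering switch.
std::size_t countModeSwitches(const std::vector<Mode>& frame) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < frame.size(); ++i) {
        if (frame[i] != frame[i - 1]) {
            ++switches;
        }
    }
    return switches;
}

// The interleaved pattern from the experiment: CS, draw, CS, draw, CS, draw.
std::vector<Mode> interleavedFrame() {
    return {Mode::Compute, Mode::Graphics, Mode::Compute,
            Mode::Graphics, Mode::Compute, Mode::Graphics};
}

// The same six work items with all compute grouped ahead of all drawing.
// Assumption: no draw depends on being issued between two compute passes.
std::vector<Mode> batchedFrame() {
    return {Mode::Compute, Mode::Compute, Mode::Compute,
            Mode::Graphics, Mode::Graphics, Mode::Graphics};
}
```

Under this model the interleaved frame changes mode five times and the batched frame once. Whether the work can really be reordered like that depends on the data dependencies between the passes, so treat this as a way of reasoning about the pattern rather than as a measurement.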
It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.