Though, there is some amount of research that has been done to implement rendering using GPGPU methods. For a very recent one, see a paper by Laine and Karras: High-Performance Software Rasterization on GPUs (in High-Performance Graphics 2011). Their measure is that the traditional rendering APIs beat a CUDA-implemented renderer by a factor of 2-8x.
Also, other sources exist. Perhaps a more interesting one is this: Alternative Rendering Pipelines in CUDA. The prospects for implementing a renderer in CUDA might be to solve the order-independent-transparency problem (see e.g. this), or to enable real-time raytracing.
I have no experience with GPGPU but the argument put forward was, that GPGPU will always be faster because the API layer is thinner while accessing the exact same hardware and because of shared memory also.
In the light of what I linked to above, I don't think this statement is true at all.
If someone knows about published papers similar to Laine and Karras, please share a link. This is an interesting topic to me as well.