[quote name='kauna' timestamp='1327045376' post='4904502']
Doesn't this mean that there is a dependency between the GPUs, so they need to synchronize the frame buffers, which of course has a negative impact on performance?
As far as I know, with SLI/CrossFire there shouldn't be any dependency between the current and the previous frame, otherwise the GPUs need to synchronize.
Cheers!
Yes, that's true. Though there's always a small dependency between the two, and there are heavy restrictions on what can be transferred from one to the other without killing performance. Don't forget that SLI/CrossFire cards have a bridge connecting them. To my knowledge this is for direct memory transfer between the cards without passing through the motherboard. There's also the ability to account for some latency between the two.
It's likely that cross-card performance will get faster over the next few generations of cards due to their more frequent use as general-purpose units.
[/quote]
Remember that your monitor is only connected to one of your cards, so in standard AFR rendering on both Radeon and GeForce GPUs every second frame is sent over the SLI/CrossFire bridge. Synchronization between the GPUs therefore isn't always a problem, since the SLI/CrossFire bridge has an insane amount of bandwidth. Doing post-processing on the second GPU should not add any overhead if you ask me: the first GPU will not be touching that data again, so it does not have to wait for the second GPU in any way and can immediately begin rendering the next frame.

I do however fail to see how this is going to give any benefit over standard AFR. A single GPU has about 16.666ms to render a complete frame for a smooth 60 FPS frame rate. With 2 GPUs in AFR, each only has to spew out 30 frames per second, meaning they have around 33.333ms per frame each. In the optimal case for doing post-processing on another GPU, rendering and post-processing would take exactly the same amount of time, meaning 16.666ms each. The actual data transfer between the GPUs would not take any effective rendering time away from either GPU, since it can be done in parallel with the first GPU rendering the next frame. The data transfer would however add a small overhead to the total time it takes to get from OpenGL/DirectX commands to a completely rendered frame.
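To make the frame-budget numbers concrete, here is a quick sketch of the arithmetic. The function name `afr_budget_ms` is made up for this post, and it assumes ideal AFR scaling with no synchronization or transfer cost.

```python
def afr_budget_ms(target_fps: float, num_gpus: int) -> float:
    """Time each GPU may spend on one frame under ideal AFR (assumed model)."""
    frame_interval_ms = 1000.0 / target_fps  # time between displayed frames
    # Each GPU only renders every num_gpus-th frame, so its budget scales up.
    return frame_interval_ms * num_gpus


print(round(afr_budget_ms(60, 1), 3))  # 16.667
print(round(afr_budget_ms(60, 2), 3))  # 33.333
```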
Consider this extremely artistic little chart:
AFR:
GPU 1 <---Frame0---> <---Frame2---> <---Frame4---> etc
GPU 2        <---Frame1---> <---Frame3---> etc

Postprocessing on a different GPU:
Fr? = Frame no ?
R = rendering
D = data transfer
P = postprocessing
GPU 1 <---Fr0 R---> <---Fr1 R---> <---Fr2 R---> <---Fr3 R---> etc
Data                <-Fr0 D->     <-Fr1 D->     <-Fr2 D->     <-Fr3 D->
GPU 2                         <---Fr0 P---> <---Fr1 P---> <---Fr2 P---> etc
After the first frame has been rendered and transmitted by the first GPU, both GPUs would be working 100% of the time without any stalling. The data transfer overhead would however make it worse than basic AFR, because the added delay would be perceived as input lag.
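The same comparison can be sketched as a tiny timeline simulation. Everything here is a simplified model for illustration, not a measurement: the function names and the 16/2/16 ms timings are made up, the transfer is assumed to overlap fully with GPU 1's next render, and the AFR model ignores the cost of sending every second frame over the bridge for display.

```python
def split_pipeline(render_ms, transfer_ms, post_ms, frames):
    """(start, finish) times per frame when GPU 1 renders and GPU 2 post-processes."""
    timeline = []
    link_free = 0.0  # when the bridge can start the next transfer
    gpu2_free = 0.0  # when GPU 2 can start the next post-processing pass
    for i in range(frames):
        start = i * render_ms                 # GPU 1 renders back to back
        rendered = start + render_ms
        transferred = max(rendered, link_free) + transfer_ms
        link_free = transferred
        finished = max(transferred, gpu2_free) + post_ms
        gpu2_free = finished
        timeline.append((start, finished))
    return timeline


def afr(frame_ms, frames, num_gpus=2):
    """(start, finish) times per frame when each GPU does a whole frame by itself."""
    interval = frame_ms / num_gpus            # GPUs start evenly staggered
    return [(i * interval, i * interval + frame_ms) for i in range(frames)]


# Hypothetical timings: 16 ms render, 2 ms transfer, 16 ms post-processing.
split = split_pipeline(16, 2, 16, 4)
plain = afr(16 + 16, 4)  # in AFR one GPU does render + post-processing itself

print([finish - start for start, finish in split])  # latency per frame, split
print([finish - start for start, finish in plain])  # latency per frame, AFR
```

Under these assumptions both setups finish a frame every 16 ms, but each frame in the split setup takes 34 ms from start to finish instead of 32 ms. The difference is exactly the transfer time, which is the extra input lag described above.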