Branching in SM3.0?

10 comments, last by rpsathe 17 years, 9 months ago
Hi,

> Do you mind if I ask how the performance is? I recently did something similar on a 6600GT and the performance was roughly 1/4 of what I expected.

No, I don't have any performance numbers. For me, it was not an option because I was running into shader constant limits.

I don't think the two can be compared, as it's not apples to apples. On the CPU it's just one memory access, array[index], and this addressing mode is supported by the hardware. On the GPU, you're going through a sampler, which scales the texture coordinate in hardware and then fetches the data from video memory (which runs at a different speed than system memory). If you're using anything other than the nearest-point filter, it will fetch multiple values from video memory. So it WILL be slower.
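To make the difference concrete, here is a rough CPU-side sketch in Python of what the two paths do. It is only a model, not real sampler behaviour: the function names (cpu_lookup, sample_point, sample_linear, index_to_u), the 1D texture, the clamp addressing, and the texel-center convention are all assumptions made for illustration.

```python
import math

def cpu_lookup(table, index):
    # CPU path: a single indexed memory read.
    return table[index]

def index_to_u(index, n):
    # Hypothetical helper: normalized coordinate of the center of texel `index`.
    return (index + 0.5) / n

def sample_point(texels, u):
    # Nearest-point filter: scale the normalized coordinate to texel space,
    # then fetch exactly one texel (clamp addressing assumed).
    n = len(texels)
    i = min(max(math.floor(u * n), 0), n - 1)
    return texels[i], 1  # (value, number of texel fetches)

def sample_linear(texels, u):
    # Linear filter: fetch the two neighbouring texels and blend them.
    n = len(texels)
    x = u * n - 0.5  # texel centers are assumed to sit at (i + 0.5) / n
    i0 = math.floor(x)
    frac = x - i0
    a = texels[min(max(i0, 0), n - 1)]
    b = texels[min(max(i0 + 1, 0), n - 1)]
    return a * (1.0 - frac) + b * frac, 2  # (value, number of texel fetches)

table = [10.0, 20.0, 30.0, 40.0]
print(cpu_lookup(table, 2))
print(sample_point(table, index_to_u(2, len(table))))
print(sample_linear(table, index_to_u(2, len(table))))
```

In this model the point filter does one fetch per lookup and the linear filter does two (a 2D bilinear filter would do four), on top of the coordinate scaling that the plain array access never needs.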


Hope this theory makes sense.
-Rahul
Folks,
Thanks, everyone, for answering my questions. The problem turned out to be somewhere else: incorrect sampling, and the way I was passing the data to textures.

It only appeared to be an SM3.0 problem. I am still getting slightly different results on REF and on real hardware, but I suppose that is just due to FP precision.
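If the remaining differences really are just floating-point precision, one way to check is to compare the two outputs with a tolerance instead of exact equality. A minimal Python sketch, assuming both results have been read back into plain lists of floats; the name results_match and the tolerance values are hypothetical and would need tuning:

```python
import math

def results_match(reference, hardware, rel_tol=1e-5, abs_tol=1e-6):
    # Element-wise comparison with a tolerance rather than exact equality.
    return len(reference) == len(hardware) and all(
        math.isclose(r, h, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, h in zip(reference, hardware)
    )

print(results_match([1.0, 2.0], [1.0000001, 2.0]))
print(results_match([1.0], [1.01]))
```

Differences that stay within the tolerance are consistent with rounding; anything larger is more likely a real bug.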

Thanks
-Rahul

This topic is closed to new replies.
