i'm interested in how often they would get used at all.. well, i wouldn't use them. what i would definitely need is a very fast branching/looping unit, which does not execute all the different pathways, but only the required ones.
..
i have no real solution for this either.. but yeah, for non-branching pixelshaders, you could process 4 pixels in parallel and gain a lot of speed. tested it myself, got a 2x increase over normal sse code in my raytracer..
definitely a good thing for amd64, which has 2x the amount of sse registers :D can't wait for mine.. and i love that it isn't that deeply pipelined => branches don't hurt that much. and its memory performance is amazing. i can't wait :D
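to make the "4 pixels in parallel" point concrete, here is a minimal sketch with sse intrinsics. it is not the raytracer code mentioned above; the function name `shade4`, the structure-of-arrays layout and the simple diffuse term are assumptions picked for illustration. the only "branch" (clamping negative values) is replaced by a max, so all 4 pixels follow the same instruction stream.

```cpp
#include <xmmintrin.h> // SSE intrinsics (x86 / x86-64 only)

// Hypothetical example shader: out = max(0, dot(n, l)) for 4 pixels at once.
// The normals are stored structure-of-arrays, so each __m128 register holds
// one component (x, y or z) of 4 neighbouring pixels.
inline void shade4(const float* nx, const float* ny, const float* nz,
                   float lx, float ly, float lz, float* out) {
    // Broadcast the light direction and accumulate the dot product per lane.
    __m128 d = _mm_mul_ps(_mm_loadu_ps(nx), _mm_set1_ps(lx));
    d = _mm_add_ps(d, _mm_mul_ps(_mm_loadu_ps(ny), _mm_set1_ps(ly)));
    d = _mm_add_ps(d, _mm_mul_ps(_mm_loadu_ps(nz), _mm_set1_ps(lz)));
    // Branch-free clamp: max() stands in for "if (d < 0) d = 0".
    d = _mm_max_ps(d, _mm_setzero_ps());
    _mm_storeu_ps(out, d);
}
```

as soon as the 4 pixels want to take different branches, this breaks down: you either compute both sides and mask, or fall back to scalar code.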
there's one other way (possibly) which you "could" do it..
a scanline-derivative buffer..
means you allocate a buffer, width x 2 x dd-instructions in size.
and then you execute your code, and as you execute one pixel after the other, you just store the values it has at the dd-instructions in the buffer. why width x 2 x dd? because for ddy you need the info of the previous scanline above you.. (or so.. could be done more optimally, but i'm tired..)
// then, calculating the ddx would mean
void ddx(float4& dst, float4 cur) {
    // difference to the value the pixel to the left had at this dd-instruction
    dst = cur - dd_cur_buffer[ddinstr + ddinstrcount*(pixelxpos - 1)];
    dd_cur_buffer[ddinstr + ddinstrcount*pixelxpos] = cur;
    ++ddinstr;
}
// and
void ddy(float4& dst, float4 cur) {
    // difference to the value the pixel above had at this dd-instruction
    dst = cur - dd_old_buffer[ddinstr + ddinstrcount*pixelxpos];
    dd_cur_buffer[ddinstr + ddinstrcount*pixelxpos] = cur;
    ++ddinstr;
}
and for each scanline you would
std::swap(dd_cur_buffer,dd_old_buffer); // just pointers:D
ddinstrcount == number of dd-instructions in the whole program, pixelxpos == current x position from the left of the screen..
ddinstr, initialized to zero for each incoming pixel, gets incremented per dd-instruction..
something like this.. i hope you get the idea..
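put together, the idea could look roughly like the sketch below. this is just one way to do it, under some assumptions: the `float4` struct, the `DerivativeBuffer` class and its method names are made up for illustration, ddx/ddy return the result instead of writing to a dst reference, and ddx returns zero at the left edge because there is no pixel to the left. on the very first scanline the "old" row is still all zeros, so ddy is not meaningful there.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 4-component value, standing in for the float4 used above.
struct float4 { float x, y, z, w; };

inline float4 operator-(const float4& a, const float4& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w};
}

// Two rows (current and previous scanline) of width * ddinstrcount slots.
class DerivativeBuffer {
public:
    DerivativeBuffer(std::size_t width, std::size_t ddinstrcount)
        : count_(ddinstrcount),
          cur_(width * ddinstrcount),  // value-initialized to zeros
          old_(width * ddinstrcount) {}

    // Call once per pixel before running the shader: resets ddinstr to zero.
    void begin_pixel(std::size_t pixelxpos) { x_ = pixelxpos; instr_ = 0; }

    // Call once per finished scanline; swapping vectors only swaps pointers.
    void end_scanline() { std::swap(cur_, old_); }

    float4 ddx(const float4& cur) {
        // Assumption: no left neighbour at x == 0, so report zero there.
        float4 d = (x_ > 0) ? cur - cur_[slot(x_ - 1)] : float4{0, 0, 0, 0};
        cur_[slot(x_)] = cur;
        ++instr_;
        return d;
    }

    float4 ddy(const float4& cur) {
        float4 d = cur - old_[slot(x_)];  // same pixel, previous scanline
        cur_[slot(x_)] = cur;
        ++instr_;
        return d;
    }

private:
    // Slot of the current dd-instruction for the pixel at column x.
    std::size_t slot(std::size_t x) const { return instr_ + count_ * x; }

    std::size_t count_;
    std::vector<float4> cur_, old_;
    std::size_t x_ = 0, instr_ = 0;
};
```

the catch is that every pixel has to hit the same dd-instructions in the same order, otherwise the ddinstr index of neighbouring pixels no longer lines up. so this only works for shaders that don't branch around their dd-instructions.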
is that a good idea? or not? dunno.. as i said, i'm tired as hell..
If that's not the help you're after then you're going to have to explain the problem better than what you have. - joanusdmentia
davepermen.net