Vertex shader slower than fixed function pipeline?
On a GeForce 6800/7800, your vertex shader generates the following *native* code for the GPU:
401F9C6C 01CD400D 8106C0C3 60411F80 DP4 o[HPOS].x, v[OPOS], c[212];
401F9C6C 01CD500D 8106C0C3 60409F80 DP4 o[HPOS].y, v[OPOS], c[213];
401F9C6C 01CD600D 8106C0C3 60405F80 DP4 o[HPOS].z, v[OPOS], c[214];
401F9C6C 01CD700D 8106C0C3 60403F80 DP4 o[HPOS].w, v[OPOS], c[215];
401F9C6C 00400808 0106C083 60419F9D MOV o[TEX0].xy, v[TEX0].xyxx;
(The hexcodes are the 128-bit instruction words.) The "fixed-function" path produces exactly the same code, but with different constant register indexes.
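For anyone unfamiliar with the mnemonics: DP4 is a four-component dot product, so the four DP4s in the listing above amount to a 4x4 matrix times the vertex position. Here is a minimal C sketch of that arithmetic. The names Vec4, dp4 and transform_position are made up for illustration, and it assumes the four constant registers hold the rows of the combined modelview-projection matrix, which the listing suggests but does not state.

```c
/* Hypothetical stand-in for a 4-component GPU register. */
typedef struct Vec4 { float x, y, z, w; } Vec4;

/* DP4: four-component dot product. */
float dp4(Vec4 a, Vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* rows[0..3] play the role of c[212]..c[215] (assumed to be the rows of
   the modelview-projection matrix); opos plays the role of v[OPOS]. */
Vec4 transform_position(const Vec4 rows[4], Vec4 opos)
{
    Vec4 hpos;
    hpos.x = dp4(opos, rows[0]);  /* DP4 o[HPOS].x, v[OPOS], c[212]; */
    hpos.y = dp4(opos, rows[1]);  /* DP4 o[HPOS].y, v[OPOS], c[213]; */
    hpos.z = dp4(opos, rows[2]);  /* DP4 o[HPOS].z, v[OPOS], c[214]; */
    hpos.w = dp4(opos, rows[3]);  /* DP4 o[HPOS].w, v[OPOS], c[215]; */
    return hpos;
}
```

Changing which constant registers hold the rows changes only the indexes, not the work done per vertex, which is why the two paths should cost the same.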
If you want to do very accurate timing on the GPU, you can use the EXT_timer_query extension. It defines the following enum, types, and functions.
#define GL_TIME_ELAPSED_EXT 0x88BF
typedef __int64 GLint64EXT;
typedef unsigned __int64 GLuint64EXT;
void glGetQueryObjecti64vEXT(GLuint id, GLenum pname, GLint64EXT *params);
void glGetQueryObjectui64vEXT(GLuint id, GLenum pname, GLuint64EXT *params);
Use the glBeginQuery/glEndQuery mechanism with the GL_TIME_ELAPSED_EXT target to specify a timing interval. A call to glGetQueryObjectui64vEXT with <pname> GL_QUERY_RESULT returns the elapsed time in nanoseconds.
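A rough sketch of how that could look in C follows. To keep it compilable without GL headers or a context, it declares stand-in typedefs and takes the entry points through a hypothetical TimerQueryApi table that your application would fill in (for example via wglGetProcAddress). The table, measure_gpu_ns and ns_to_ms are illustrative names, not part of the extension, and the APIENTRY calling-convention decoration is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs and enums so the sketch compiles on its own; in a real
   program these come from <GL/gl.h> and <GL/glext.h>. */
typedef unsigned int GLuint;
typedef unsigned int GLenum;
typedef int GLsizei;
typedef uint64_t GLuint64EXT;

#define GL_QUERY_RESULT     0x8866
#define GL_TIME_ELAPSED_EXT 0x88BF

/* Hypothetical table of GL entry points, filled in by the application. */
typedef struct TimerQueryApi {
    void (*GenQueries)(GLsizei n, GLuint *ids);
    void (*DeleteQueries)(GLsizei n, const GLuint *ids);
    void (*BeginQuery)(GLenum target, GLuint id);
    void (*EndQuery)(GLenum target);
    void (*GetQueryObjectui64vEXT)(GLuint id, GLenum pname,
                                   GLuint64EXT *params);
} TimerQueryApi;

/* Times whatever GL work `draw` submits; returns elapsed GPU nanoseconds. */
uint64_t measure_gpu_ns(const TimerQueryApi *api,
                        void (*draw)(void *user), void *user)
{
    GLuint query = 0;
    GLuint64EXT elapsed_ns = 0;

    assert(api != NULL && draw != NULL);

    api->GenQueries(1, &query);
    api->BeginQuery(GL_TIME_ELAPSED_EXT, query);
    draw(user);                      /* the commands being timed */
    api->EndQuery(GL_TIME_ELAPSED_EXT);

    /* Asking for GL_QUERY_RESULT waits until the result is available. */
    api->GetQueryObjectui64vEXT(query, GL_QUERY_RESULT, &elapsed_ns);
    api->DeleteQueries(1, &query);
    return (uint64_t)elapsed_ns;
}

/* Convenience conversion for printing. */
double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1.0e6;
}
```

Wrapping the shader path and the fixed-function path in separate calls and comparing the two numbers should settle the question. If the wait stalls your frame, you can poll GL_QUERY_RESULT_AVAILABLE first and read the result a frame or two later.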
BTW, in case you're curious, your vertex shader produces the following native code on Radeon X800/X1800.
00100201 00D10002 00D10001 00D10005 DP4 o[0].x, c[0], v[0];
00200201 00D10022 00D10001 00D10005 DP4 o[0].y, c[1], v[0];
00400201 00D10042 00D10001 00D10005 DP4 o[0].z, c[2], v[0];
00800201 00D10062 00D10001 00D10005 DP4 o[0].w, c[3], v[0];
00F02203 01648000 01248000 01248005 MOV o[1], R0.0001;
00304203 00D10041 01248041 01248045 MOV o[2].xy, v[2];
The ATI driver inserts an extra instruction to move (0,0,0,1) into the primary color interpolant, but otherwise, it's the same native instruction sequence that Nvidia hardware uses.