
Viper173

Another Performance mystery


okkkkkk.... I thought using VAR would amazingly boost my performance, but instead I'm getting 600,000 tris/s with light, fog, culling, textures and blending disabled (on a 1333 / GeForce2). That doesn't look too great, especially when comparing it to the usual 7 Mi tris/s demos...

alright, here's my code (using Delphi):

/////// first initializing VAR and feeding in the array ////

    exts := String(PChar(glGetString(GL_EXTENSIONS)));
    if Pos('GL_NV_vertex_array_range', exts) > 0 then
    begin
      wglAllocateMemoryNV := wglGetProcAddress('wglAllocateMemoryNV');
      wglFreeMemoryNV := wglGetProcAddress('wglFreeMemoryNV');
      glVertexArrayRangeNV := wglGetProcAddress('glVertexArrayRangeNV');
      glFlushVertexArrayRangeNV := wglGetProcAddress('glFlushVertexArrayRangeNV');
      if (not Assigned(wglAllocateMemoryNV)) or
         (not Assigned(wglFreeMemoryNV)) or
         (not Assigned(glVertexArrayRangeNV)) or
         (not Assigned(glFlushVertexArrayRangeNV)) then
        raise Exception.Create('Error loading GL_NV_vertex_array_range.' + #10 +
          'Please check if your 3D card''s drivers are installed properly.');
    end;

    glEnable(GL_VERTEX_ARRAY);
    Vert_count := 100000;
    randomize;

    // Allocate video memory for the vertex array.
    va := wglAllocateMemoryNV(Vert_Count * SizeOf(TVARVertex), 0, 0, 0.5);
    if va = nil then
      raise Exception.Create('Couldn''t allocate video memory!');

    for v := 0 to Vert_Count - 1 do
    begin
      PVARVertex(Integer(va) + v*SizeOf(TVARVertex))^.VX := (Random(10) - 5)/10;
      PVARVertex(Integer(va) + v*SizeOf(TVARVertex))^.VY := (Random(10) - 5)/10;
      PVARVertex(Integer(va) + v*SizeOf(TVARVertex))^.VZ := 0;
    end;

    glInterleavedArrays(GL_V3F, SizeOf(TVARVertex), va);
    SetLength(ia, Vert_Count);
    for f := 0 to Vert_Count - 1 do
    begin
      ia[f] := f;
    end;
    glVertexArrayRangeNV(VERT_COUNT * SizeOf(TVARVertex), va);
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_NV);

////////// and finally the rendering code:

    glActiveTextureARB(gl_texture0_arb);
    glDisable(gl_texture_2d);
    glActiveTextureARB(gl_texture1_arb);
    glDisable(gl_texture_2d);
    glDisable(GL_FOG);
    glDisable(GL_LIGHTING);
    glDisable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glEnable(GL_DEPTH_TEST);
    gldisable(GL_CULL_FACE);
    glDisable(GL_ALPHA_TEST);
    glDepthMask(GL_TRUE);
    glColor4f(1, 1, 0, 1);
    glLoadIdentity;
    glTranslatef(-5, -5, -10);
    glDrawElements(GL_TRIANGLE_Strip, Vert_Count, GL_UNSIGNED_INT, @ia[0]);

///////////// well that's it. and here's the TVARVertex TYPE: //////////////

    TVARVertex = packed record
      VX, VY, VZ: Single;
    end;
    PVARVertex = ^TVARVertex;

    va : PVARVertex;

where did my gpu/cpu (/whatever) power go??????

It could be hundreds of things, e.g. fill rate (though 100,000 verts in one call is a tad too large). The easiest way to find out what went wrong is to download someone's working code, e.g. from www.delphi3d.net, or NVIDIA should have something.

http://uk.geocities.com/sloppyturds/gotterdammerung.html

hmmm..., normal records aren't helping either.

The funny thing is that I tried to copy from the delphi3d demo as far as possible :-)
Still, it's 10 times slower, almost as if I were rendering the random triangles with glBegin.

Make sure your vertex array range is valid, just before calling glDrawElements().

glGetIntegerv(GL_VERTEX_ARRAY_RANGE_VALID_NV, &u);

If u is GL_FALSE, then you did something that invalidated the range, and VAR will be disabled.

glDrawElements(GL_TRIANGLE_Strip, Vert_Count, GL_UNSIGNED_INT, @ia[0]);

Use GL_UNSIGNED_SHORT indices. I have found this to be a big boost in my applications. GF2 cannot handle larger ones, and the driver must break them down. GF3 can handle bigger indices, but I don't think even it can handle 32-bit ones (check the specs).
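
A 16-bit index can only address vertices 0 to 65535, so the 100,000-vertex strip from the first post cannot be drawn in a single call once the indices are GL_UNSIGNED_SHORT. Here is a rough C sketch of one way to cut such a strip into batches. The names `StripBatch` and `split_strip` are made up for illustration and are not part of OpenGL or the VAR extension. It assumes each batch is drawn with the vertex pointer rebased to the batch's first vertex, so the indices inside a batch run from 0 to count - 1. Neighbouring batches share two vertices so that no triangle is lost, and every batch starts at an even offset so the strip's winding order is unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* 16-bit indices can address vertices 0..65535, i.e. 65536 vertices per batch. */
#define MAX_BATCH_VERTS 65536u

/* Hypothetical helper type: one draw call's worth of a longer triangle strip. */
typedef struct {
    size_t first;  /* offset of the batch's first vertex within the full strip */
    size_t count;  /* number of vertices in this batch (at most MAX_BATCH_VERTS) */
} StripBatch;

/*
 * Splits a triangle strip of total_verts vertices into batches that fit
 * 16-bit indices. Consecutive batches overlap by two vertices, because every
 * triangle in a strip reuses the two vertices before it.
 * Returns the number of batches written. If max_batches is too small the
 * result is truncated, so callers should size the array generously.
 */
size_t split_strip(size_t total_verts, StripBatch *out, size_t max_batches)
{
    size_t n = 0;
    size_t first = 0;

    assert(out != NULL);
    if (total_verts < 3) {
        return 0;  /* fewer than 3 vertices form no triangle */
    }

    while (n < max_batches) {
        size_t remaining = total_verts - first;
        size_t count = remaining < MAX_BATCH_VERTS ? remaining : MAX_BATCH_VERTS;

        out[n].first = first;
        out[n].count = count;
        n++;

        if (first + count >= total_verts) {
            break;  /* this batch reached the end of the strip */
        }
        first += count - 2;  /* step back two vertices for the overlap */
    }
    return n;
}
```

For 100,000 vertices this gives two batches, (first 0, count 65536) and (first 65534, count 34466), which together cover the same 99,998 triangles as the original single call. Since the indices in the first post are simply 0, 1, 2, ..., a non-indexed draw per batch would work as well.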

edit: Looking at your code, I also realized that you use 3 floats for each vert. This is bad, as the data is only aligned on 4-byte boundaries. VAR requires alignment on 8-byte boundaries. You need an extra 4 bytes of padding in your struct, and you need to update the stride accordingly.

[edited by - MisterAnderson42 on September 13, 2002 3:06:02 AM]

quote:

edit: Looking at your code, I also realized that you use 3 floats for each vert. This is bad, as the data is only aligned on 4-byte boundaries. VAR requires alignment on 8-byte boundaries. You need an extra 4 bytes of padding in your struct, and you need to update the stride accordingly.


That's incorrect. GF2 VAR requires 8-byte alignment only if you use short types (2 bytes). For floats, 4-byte alignment is fine. GF3+ does not require 8-byte alignment at all.
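
To make that rule concrete, here is a small C sketch that checks a vertex array layout against it. Everything in it is an assumption built only from what is said in this thread: the enum `CompType` and the function `var_layout_ok` are hypothetical names, the stride rule for shorts (non-zero multiple of 8) is taken from the Vertex / GL_SHORT row of NVIDIA's table quoted later in the thread, and the float stride check (multiple of 4) is only inferred, so that every vertex stays 4-byte aligned. It models the GF2 case and is not a substitute for the extension spec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for GL_FLOAT and GL_SHORT component types. */
typedef enum { COMP_FLOAT, COMP_SHORT } CompType;

/* Pointer alignment as described in the post above (GF2):
 * 8 bytes for short components, 4 bytes for floats. */
static size_t required_alignment(CompType type)
{
    return type == COMP_SHORT ? 8 : 4;
}

/*
 * Returns true if an array starting at `address` with the given byte stride
 * satisfies the rules as stated in this thread.
 *  - shorts: pointer 8-byte aligned, stride a non-zero multiple of 8
 *  - floats: pointer 4-byte aligned; the stride check (multiple of 4) is an
 *    assumption so that each vertex stays 4-byte aligned
 */
bool var_layout_ok(CompType type, uintptr_t address, size_t stride)
{
    if (address % required_alignment(type) != 0) {
        return false;
    }
    if (type == COMP_SHORT) {
        return stride != 0 && stride % 8 == 0;
    }
    return stride % 4 == 0;
}
```

Under these assumptions the layout from the first post (three floats, stride 12, 4-byte-aligned pointer) passes, so no padding is needed there. Three shorts packed with a stride of 6 would fail and would need padding to a stride of 8.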

quote:
That's incorrect. GF2 VAR requires 8-byte alignment only if you use short types (2 bytes). For floats, 4-byte alignment is fine. GF3+ does not require 8-byte alignment at all.

I stand corrected. Should double check myself before posting nonsense.

Alright, I'm using GL_UNSIGNED_SHORT instead now, and I'm getting as much as 1.5 Mi tris/s.

But that pointer alignment thing is giving me a headache.

I found this at nvidia:

Furthermore, each defined array must fall into one of the following formats:

Array         Size      Type               Stride               Pointer Alignment
Color         3         GL_FLOAT
Color         4         GL_FLOAT
Color         3         GL_UNSIGNED_BYTE   ≠ 0
Color         4         GL_UNSIGNED_BYTE
Normal        -         GL_FLOAT
Normal        -         GL_SHORT           Multiple of 8, ≠ 0   8-byte
TexCoord      1         GL_SHORT           ≠ 0
TexCoord      2         GL_SHORT
TexCoord      3         GL_SHORT           Multiple of 8, ≠ 0   8-byte
TexCoord      4         GL_SHORT                                8-byte
TexCoord      1,2,3,4   GL_FLOAT
Vertex        2         GL_SHORT
Vertex        3         GL_SHORT           Multiple of 8, ≠ 0   8-byte
Vertex        4         GL_SHORT                                8-byte
Vertex        2,3,4     GL_FLOAT
VertexWeight  1         GL_FLOAT


OK, OK, the table is screwed... here's the Vertex line:

array: Vertex, size: 3, type: GL_SHORT, stride: multiple of 8, not equal to 0, pointer alignment: 8-byte

So first of all I changed from
glInterleavedArrays(GL_V3F, SizeOf(TVARVertex), va);
to
glVertexPointer(3, GL_FLOAT, 12, va);

So how do I have to interpret the table?
What's with GL_SHORT?
And "multiple of 8" seems to be incorrect too...

