Nooooo!!!

I am getting really shit results with Direct3D and I can see no reason for it. I have an 8MB nVidia Vanta and I'm trying to make a 2D game. I have tried everything. The caps say my card can only support 2 simultaneous textures, so I made just one large (1024x1024) texture, but that gives me the same results as using 12 256x256 textures: I am only getting around 38 FPS. Can anyone help? I really don't want to go back to DirectDraw, but I may have to...

CEO Plunder Studios

The maximum simultaneous textures cap is purely for the multitexture cascade - it refers to the maximum number of textures which can be applied to a single polygon in a single Draw*Primitive* call. It's nothing to do with the maximum number of textures you can have per frame (i.e. you can have 200 CreateTexture calls and 500 SetTexture calls on a chip which only supports 2 *simultaneous* textures).


A much more likely source of bad performance will be the way you create and access vertex buffers, and how you use Draw*Primitive*.

A few general hints:

- BATCH stuff up - if you're making a call to Draw*Primitive* with fewer than about 100 vertices, then your code is inefficient. For a 2D engine using D3D, you should be able to get down to around 1 Draw*Primitive* per texture. Indexed primitives help a great deal here. If you're making one Draw*Primitive* call per sprite, then performance will be hideous.

- Depending on your engine, it's probably a good idea to leave the WORLD transformation matrix set to identity and use DYNAMIC vertex buffers with the correct locking flags (see the archives, DirectX FAQ, DirectX SDK documentation, nVidia site etc. for details of good dynamic VB use).
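
As a rough illustration of the batching hint above, here is a minimal C++ sketch that packs every sprite sharing one texture into a single vertex array plus a 16-bit index list (4 vertices and 6 indices per quad). It only builds the CPU-side data and makes no Direct3D calls; the names Vertex, Sprite, Batch and buildBatch are made up for this example, and it assumes each sprite is an axis-aligned quad showing the whole texture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pre-transformed 2D vertex: screen position plus texture coords.
struct Vertex { float x, y, u, v; };

// Hypothetical sprite: an axis-aligned quad that shows the whole texture.
struct Sprite { float x, y, w, h; };

// CPU-side geometry for every sprite that shares one texture.
struct Batch {
    std::vector<Vertex> vertices;        // 4 per sprite
    std::vector<std::uint16_t> indices;  // 6 per sprite (two triangles)
};

Batch buildBatch(const std::vector<Sprite>& sprites) {
    // 16-bit indices can address at most 65536 vertices.
    assert(sprites.size() * 4 <= 65536);

    Batch batch;
    batch.vertices.reserve(sprites.size() * 4);
    batch.indices.reserve(sprites.size() * 6);

    for (const Sprite& s : sprites) {
        const int base = static_cast<int>(batch.vertices.size());

        // Four corners: top-left, top-right, bottom-right, bottom-left.
        batch.vertices.push_back({s.x,       s.y,       0.0f, 0.0f});
        batch.vertices.push_back({s.x + s.w, s.y,       1.0f, 0.0f});
        batch.vertices.push_back({s.x + s.w, s.y + s.h, 1.0f, 1.0f});
        batch.vertices.push_back({s.x,       s.y + s.h, 0.0f, 1.0f});

        // Two triangles that reuse the shared corners via indices.
        for (int offset : {0, 1, 2, 0, 2, 3})
            batch.indices.push_back(static_cast<std::uint16_t>(base + offset));
    }
    return batch;
}
```

With 30 sprites on one texture this yields 120 vertices and 180 indices, which could then be copied into a vertex buffer and an index buffer and submitted with one indexed Draw*Primitive* call instead of 30 separate ones.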



--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com

Well, the 1024x1024 was the only texture...
Anyway, I tried taking away just one of the 256x256 textures from drawing the background and the frame rate shot up to the monitor's refresh rate... very weird...
Is there a way to check how much total VRAM there is, how much is left and so on? Maybe that would help...

CEO Plunder Studios

Read S1CA's post. What he's saying (or what I THINK he's saying) is: create a dynamic vertex buffer. Don't use the world matrix to move your sprites around. Instead, keep a copy of each sprite's vertices as plain arrays or something like that. Each frame, go through these arrays and transform the vertices so that the sprite is where it should be on screen.
Then lock the dynamic vertex buffer, copy the entire array of vertices over and then unlock.
Then draw the entire vertex buffer with one SetTexture() and DrawPrimitive() call for each texture.
This of course means you need the data ordered by texture.

If I've misinterpreted Simon's response then someone please correct me.
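
The steps above could be sketched roughly like this in C++. It is only the CPU side: sort the sprites by texture, move each sprite's local vertices to its screen position, and record one contiguous vertex range per texture. The Sprite, DrawRange, Frame and buildFrame names are invented for the example, a plain translation stands in for whatever transform the game really needs, and each quad is expanded to 6 vertices on the assumption that it is drawn as a non-indexed triangle list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, u, v; };

// Hypothetical sprite: four local-space corners, a screen position, a texture id.
struct Sprite {
    int textureId;
    float posX, posY;
    Vertex local[4];
};

// One contiguous run of vertices sharing a texture, i.e. one draw call.
struct DrawRange {
    int textureId;
    std::size_t firstVertex;
    std::size_t vertexCount;
};

struct Frame {
    std::vector<Vertex> vertices;   // what would be copied into the locked buffer
    std::vector<DrawRange> ranges;  // one SetTexture + draw per entry
};

Frame buildFrame(std::vector<Sprite> sprites) {
    // Order the data by texture so each texture forms a single run.
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.textureId < b.textureId;
                     });

    // Corner order that turns one quad into two triangles.
    static const int kCorners[6] = {0, 1, 2, 0, 2, 3};

    Frame frame;
    frame.vertices.reserve(sprites.size() * 6);
    for (const Sprite& s : sprites) {
        if (frame.ranges.empty() || frame.ranges.back().textureId != s.textureId)
            frame.ranges.push_back({s.textureId, frame.vertices.size(), 0});

        // "Transform" on the CPU: here just a translation to the sprite position.
        for (int c : kCorners) {
            const Vertex& v = s.local[c];
            frame.vertices.push_back({v.x + s.posX, v.y + s.posY, v.u, v.v});
        }
        frame.ranges.back().vertexCount += 6;
    }
    return frame;
}
```

Each frame, the vertices array would be copied into the locked dynamic vertex buffer in one go, and the loop over ranges would issue one SetTexture() and one draw call per entry.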

Good luck,
Toby

Gobsmacked - by Toby Murray

Funkymunky:
It's a shame the compiler doesn't understand wildcards either


elis-cool:
Are you doing lots of texture "minification", where the size of the texture used on screen is smaller than the size of the texture in memory? If so, use mipmapping. The effect you describe sounds like a combination of "fill rate" and "texel cache" issues.
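
For a feel of what mipmapping costs in memory, here is a small C++ sketch that counts the levels in a full mip chain and adds up their size. The function names are made up for the example, and the bytes-per-texel value is an assumption, since the thread never says which texture format is in use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Number of levels in a full mip chain, halving each dimension down to 1x1.
int mipLevelCount(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    int levels = 1;
    while (width > 1 || height > 1) {
        width = std::max<std::size_t>(1, width / 2);
        height = std::max<std::size_t>(1, height / 2);
        ++levels;
    }
    return levels;
}

// Total bytes for the full chain at an assumed, uncompressed texel size.
std::size_t mipChainBytes(std::size_t width, std::size_t height,
                          std::size_t bytesPerTexel) {
    assert(width > 0 && height > 0);
    std::size_t total = 0;
    for (;;) {
        total += width * height * bytesPerTexel;
        if (width == 1 && height == 1) break;
        width = std::max<std::size_t>(1, width / 2);
        height = std::max<std::size_t>(1, height / 2);
    }
    return total;
}
```

A 1024x1024 texture has 11 levels. At an assumed 2 bytes per texel the full chain takes 2,796,202 bytes, against 2,097,152 bytes for the top level alone, so the chain adds about a third.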


tobymurray:
Yep, that's what I meant. The reason is that one of the biggest killers of performance for 3D hardware is sending small batches of data to any Draw* calls. If you need a SetTransform per sprite, that implies that each Draw* call is only ever getting 2 polygons and 4ish vertices to play with - to D3D and 3D hardware that's BAAAAD ...


elis-cool:
...which is much worse than the cost of locking a buffer properly. The dynamic locks shouldn't cost you very much anyway if you're doing them properly because:

a) modern drivers use "buffer renaming" to decrease the risk of serialisation (the situation where either the CPU or GPU is waiting [stall] for the other to finish something before it can proceed).

b) to a T&L chip, a matrix change is a state change, which implies a potential flush (something which increases serialisation risk).

It's a common misconception that locks themselves are expensive - they're not - what's expensive is incorrect or excessive use of locks, which causes serialisation and stalling.

Bad use of lock flags can make performance even worse than no locks since you end up reversing the effectiveness of buffer renaming. [I think that's where a lot of the bad reputation for dynamic VB locks comes from - people try it, get it wrong, see their framerate drop and then slag off locks from then on.]

(check the references I mentioned, they all say the same thing)
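
One commonly described way of choosing lock flags for a dynamic vertex buffer is to append with a "no overwrite" lock while there is room, and to start again at offset 0 with a "discard" lock once the buffer is full. The C++ sketch below models only that decision and makes no Direct3D calls: LockMode is a stand-in for the real D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE flags, DynamicBufferCursor and LockRequest are invented names, and sizes are counted in vertices.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the real D3DLOCK_DISCARD / D3DLOCK_NOOVERWRITE lock flags.
enum class LockMode { Discard, NoOverwrite };

// What the caller would lock: which mode, and at which vertex offset.
struct LockRequest {
    LockMode mode;
    std::size_t offset;
};

// Hypothetical write cursor for one dynamic vertex buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity)
        : capacity_(capacity), next_(0) {}

    // Reserve room for `count` vertices and say how the buffer should be locked.
    LockRequest reserve(std::size_t count) {
        assert(count > 0 && count <= capacity_);

        // Not enough room left: wrap to the start of the buffer.
        if (next_ + count > capacity_)
            next_ = 0;

        // Writing at offset 0 discards the old contents, so the driver can hand
        // back fresh memory; anywhere else we promise not to touch data that
        // earlier draw calls may still be reading.
        const LockMode mode =
            (next_ == 0) ? LockMode::Discard : LockMode::NoOverwrite;

        const LockRequest request{mode, next_};
        next_ += count;
        return request;
    }

private:
    std::size_t capacity_;  // buffer size in vertices
    std::size_t next_;      // first unused vertex
};
```

For a 100-vertex buffer, three reservations of 40 vertices come back as discard at 0, no-overwrite at 40, then discard at 0 again. In real code the offset and count would be scaled by the vertex stride before being passed to the lock call.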

--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com
