Norman Barrows

index buffers, D3DPOOL_DEFAULT, and lost device


Recommended Posts

I was just reading up on index buffers and came across this in the DirectX docs, under the topic "Index Buffers (Direct3D 9)":

 

 

"Note Always use D3DPOOL_DEFAULT, except when you don't want to use video memory or use large amounts of page-locked RAM when the driver is putting vertex or index buffers into AGP memory." 

 

 

I recently implemented handling lost devices, and switched everything possible from default to managed memory, with the understanding that there was no real cost penalty.
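
For illustration, here's a minimal sketch of the lost-device pattern being described, assuming the usual D3D9 setup (g_device, g_defaultPoolIB, g_presentParams, and RecreateDefaultPoolResources are placeholder names, not anything from the thread). The point is that only D3DPOOL_DEFAULT resources need to be released before Reset() and rebuilt afterwards; the runtime restores D3DPOOL_MANAGED ones on its own:

```cpp
// Sketch only: placeholder globals standing in for whatever the app already has.
#include <d3d9.h>

extern IDirect3DDevice9*      g_device;
extern IDirect3DIndexBuffer9* g_defaultPoolIB;     // created in D3DPOOL_DEFAULT
extern D3DPRESENT_PARAMETERS  g_presentParams;

void RecreateDefaultPoolResources();               // recreates and refills g_defaultPoolIB etc.

void HandleLostDevice()
{
    HRESULT hr = g_device->TestCooperativeLevel();
    if (hr == D3DERR_DEVICELOST)
        return;                                    // still lost, try again next frame
    if (hr == D3DERR_DEVICENOTRESET)
    {
        // Release everything in D3DPOOL_DEFAULT before Reset()...
        if (g_defaultPoolIB) { g_defaultPoolIB->Release(); g_defaultPoolIB = NULL; }
        if (SUCCEEDED(g_device->Reset(&g_presentParams)))
            RecreateDefaultPoolResources();        // ...and rebuild it afterwards.
                                                   // D3DPOOL_MANAGED resources survive as-is.
    }
}
```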

 

But now I find this, with, of course, no explanation of why you should always use default.

 

They do mention hardware index caching as giving performance boosts to indexed drawing.
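
For reference, the indexed-drawing path being referred to looks roughly like this (the buffers, stride, and counts are placeholders for whatever the app already has bound):

```cpp
// Sketch of an indexed draw call in D3D9.
#include <d3d9.h>

void DrawIndexedMesh(IDirect3DDevice9* device,
                     IDirect3DVertexBuffer9* vb,
                     IDirect3DIndexBuffer9* ib,
                     UINT stride, UINT numVertices, UINT numTriangles)
{
    device->SetStreamSource(0, vb, 0, stride);     // vertex data
    device->SetIndices(ib);                        // the index buffer in question
    device->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,
                                 0,                // BaseVertexIndex
                                 0,                // MinVertexIndex
                                 numVertices,      // NumVertices referenced
                                 0,                // StartIndex
                                 numTriangles);    // PrimitiveCount
}
```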

 

Anybody have any idea what's going on here?

 

 

 


From the corresponding article on vertex buffers we get the following:

 

It is possible to force vertex and index buffers into system memory by specifying D3DPOOL_SYSTEMMEM, even when the vertex processing is done in hardware. This is a way to avoid overly large amounts of page-locked memory when a driver is putting these buffers into AGP memory.

 

This indicates to me that the distinction is only between buffers created in D3DPOOL_DEFAULT versus D3DPOOL_SYSTEMMEM.

 

Using D3DPOOL_MANAGED should be seen as the equivalent of actually creating two copies of the resource - one in the default pool, the other in the system memory pool.  The D3D runtime then looks after everything else for you.  This is discussed some here: http://legalizeadulthood.wordpress.com/2009/10/12/direct3d-programming-tip-9-use-the-managed-resource-pool/
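
To make the distinction concrete, here's a minimal sketch of creating the same index buffer in each of the pools under discussion; only the D3DPOOL argument changes (the device pointer, size, and 16-bit format are assumptions):

```cpp
// Sketch: one helper, three possible pool choices.
#include <d3d9.h>

IDirect3DIndexBuffer9* CreateIB(IDirect3DDevice9* device, UINT bytes, D3DPOOL pool)
{
    // D3DPOOL_DEFAULT   - driver decides placement (usually VRAM/AGP); lost on device reset.
    // D3DPOOL_MANAGED   - runtime keeps a system-memory copy and restores the video-memory
    //                     copy after a reset (the "two copies" described above).
    // D3DPOOL_SYSTEMMEM - forces the buffer into system memory.
    IDirect3DIndexBuffer9* ib = NULL;
    HRESULT hr = device->CreateIndexBuffer(bytes, D3DUSAGE_WRITEONLY, D3DFMT_INDEX16,
                                           pool, &ib, NULL);
    return SUCCEEDED(hr) ? ib : NULL;
}
```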


Using D3DPOOL_MANAGED should be seen as the equivalent of actually creating two copies of the resource - one in the default pool, the other in the system memory pool.

 

Yes, that was my understanding.

 

Perhaps they recommend default because it eliminates the overhead of D3DPOOL_MANAGED?

 

I wish they weren't so cryptic about everything.

 

Sometimes I think they don't really know themselves.

 

They bought the code from Brender, and a million fingers must have touched it by now.

 

There may not be anyone at MS who knows everything about the system and the best way to use it.


The original code that Microsoft bought bears no relationship to current versions.  D3D3 didn't have vertex or index buffers (it didn't even have what we'd recognise as a "Draw" call) so you can't really make comparisons.

 

The problem with D3D9 was that it straddled multiple hardware generations. At the time it was originally released we were completing the transition from software T&L to hardware T&L and beginning the transition from the fixed pipeline to shaders, and D3D9 had to support them all (it also supports 2 major shader model generations). So it's inevitable that it suffers from compromises and complexities that a clean API fully targeted at a specific hardware generation wouldn't, and a lot of the programming advice for it must be read in the context of 2002/2003 hardware (you see similar with OpenGL, where much advice you can find is quite firmly rooted in 1997/1998). Fast-forward a few years and much of it is no longer relevant; hardware doesn't really work the way D3D9 was designed any more, so referring to older documentation just serves to confuse.

 

In this case the best thing is to set up your index buffer in D3DPOOL_DEFAULT, give it a good intensive benchmarking, set it up in D3DPOOL_MANAGED, run the same benchmark, then make a decision. If the facts you discover through this process contradict advice in the documentation, then remember that the documentation is ancient and was written around a completely different generation of hardware.
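
As a rough illustration of that approach, something like the following could time the same scene twice, once with the buffer created in each pool (g_device and RenderFrameWithIB stand in for whatever the app already has):

```cpp
// Sketch: time N frames of the existing render path with a given index buffer.
#include <windows.h>
#include <d3d9.h>

extern IDirect3DDevice9* g_device;
void RenderFrameWithIB(IDirect3DIndexBuffer9* ib);   // draws the test scene with that buffer

double TimeFrames(IDirect3DIndexBuffer9* ib, int frames)
{
    LARGE_INTEGER freq, start, end;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);
    for (int i = 0; i < frames; ++i)
        RenderFrameWithIB(ib);
    QueryPerformanceCounter(&end);
    return double(end.QuadPart - start.QuadPart) / double(freq.QuadPart);  // seconds
}

// Usage idea: create the buffer in D3DPOOL_DEFAULT, call TimeFrames(ib, 1000),
// recreate it in D3DPOOL_MANAGED, call TimeFrames again, then compare.
```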


Using D3DPOOL_MANAGED should be seen as the equivalent of actually creating two copies of the resource - one in the default pool, the other in the system memory pool.

 

Yes, that was my understanding.

 

Perhaps they recommend default because it eliminates the overhead of D3DPOOL_MANAGED?

Of course they recommend D3DPOOL_DEFAULT. It enables driver engineers to implement a few important optimizations.

 

 

Using D3DPOOL_MANAGED should be seen as the equivalent of actually creating two copies of the resource - one in the default pool, the other in the system memory pool.

 

I wish they weren't so cryptic about everything.

 

Sometimes I think they don't really know themselves.

Wrong. They know it very well, trust me on this. :)

You need to read between the lines. Get all performance-related papers/presentations from GDC and other events, mainly from nVidia's developer portal. There is a lot of info that you will not find anywhere else. Open 5-10 of them and enjoy. :)


In this case the best thing is to set up your index buffer in D3DPOOL_DEFAULT, give it a good intensive benchmarking, set it up in D3DPOOL_MANAGED, run the same benchmark

 

Indeed. Science doesn't lie. And neither do timers.

For real science though, you'd need all the GPUs from the past 15 years that you want to support, as they're what's going to determine a lot of the behavior...

e.g. Some GPUs might have 2 memory controllers, one for reading system RAM over the AGP bus, and one for reading local VRAM. Counter-intuitively, on such a GPU it can be faster to keep a small amount of data in system RAM in order to utilize both controllers for parallel fetching. This could mean keeping vertices in VRAM and indices in system RAM.
Of course I wouldn't recommend this any more - just an example of how much of the API's performance characteristics actually depend on the GPU/driver...
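
Purely to illustrate the split being described (and, as noted above, not a recommendation today), the creation calls would differ only in the pool argument:

```cpp
// Sketch: vertices in the default pool (typically VRAM), indices forced into system memory.
#include <d3d9.h>

void CreateSplitBuffers(IDirect3DDevice9* device, UINT vbBytes, UINT ibBytes,
                        IDirect3DVertexBuffer9** vb, IDirect3DIndexBuffer9** ib)
{
    device->CreateVertexBuffer(vbBytes, D3DUSAGE_WRITEONLY,
                               0,                  // no FVF; assume a vertex declaration is used
                               D3DPOOL_DEFAULT, vb, NULL);
    device->CreateIndexBuffer(ibBytes, D3DUSAGE_WRITEONLY, D3DFMT_INDEX16,
                              D3DPOOL_SYSTEMMEM, ib, NULL);
}
```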


For real science though, you'd need all the GPUs from the past 15 years that you want to support, as they're what's going to determine a lot of the behavior...

e.g. Some GPUs might have 2 memory controllers, one for reading system RAM over the AGP bus, and one for reading local VRAM. Counter-intuitively, on such a GPU it can be faster to keep a small amount of data in system RAM in order to utilize both controllers for parallel fetching. This could mean keeping vertices in VRAM and indices in system RAM.
Of course I wouldn't recommend this any more - just an example of how much of the API's performance characteristics actually depend on the GPU/driver...

 

 

Yes, see, this is the thing. It's such a shifting target (the user's graphics capabilities), and God only knows what kind of PC they might want to run it on. In order to avoid the whole protracted mess, I've been trying to develop to the lowest common denominator, which seems to be DirectX 9 fixed function.

 

I remember what it was like to be an impoverished college student with a hand-me-down PC with the previous generation of graphics on it.

 

There's no reason to leave those dollars on the table (make a game they won't buy because it won't run well, or at all, on their PC), as long as I can get the desired results from DX9 fixed function.

 

Perhaps the more fundamental question to ask is how far back, in terms of Windows and DirectX versions, I should attempt to support. I.e., what's so old that I shouldn't be worrying about it?

 

I take it that supporting DX9-only PCs is something I should not be worrying about?

 

Right now the system requirements for the DirectX and Windows stuff I'm using are Windows 2000 and DirectX 9.

 

How difficult would it be to convert a fixed-function DX9 app to shaders and DX10 or DX11?

 

FVF goes away, but I'm just calling mesh->GetFVF and device->SetFVF in one place (maybe two).
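
In D3D9 terms, the non-FVF route is a vertex declaration (D3D10/11 have the analogous input-layout concept). A minimal sketch, assuming a position/normal/one-UV vertex layout - the layout itself is just an assumption about the mesh format:

```cpp
// Sketch: replace SetFVF with a vertex declaration.
#include <d3d9.h>

IDirect3DVertexDeclaration9* CreateDecl(IDirect3DDevice9* device)
{
    const D3DVERTEXELEMENT9 elements[] =
    {
        { 0,  0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
        { 0, 12, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   0 },
        { 0, 24, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
        D3DDECL_END()
    };
    IDirect3DVertexDeclaration9* decl = NULL;
    device->CreateVertexDeclaration(elements, &decl);
    return decl;   // call device->SetVertexDeclaration(decl) where SetFVF used to go
}
```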

 

Is there shader code available that replicates basic fixed-function capabilities? All I'm doing is aniso mip-mapping, some alpha test, and a little alpha blending. No real messing with texture stages or anything like that.

When I was working on the never-released Caveman v2.0 back in 2008-2010, I was unimpressed with the blend ops available and wrote my own 10-channel real-time weighted texture blender. If I had to do blend ops again, I'd probably do it myself and then send the results to DirectX. I almost prefer doing things that way. It's more like good old-fashioned "gimme a pointer and lemme party on the bitmap!" programming. Nowadays, you throw your own private parties on your own private bitmaps, then send them off to DirectX for display. Or use shaders and party on DirectX's bitmap (so to speak).
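
A minimal sketch of that "party on your own bitmap" approach, assuming a lockable A8R8G8B8 texture (e.g. created in D3DPOOL_MANAGED) and two same-sized source images already in system memory; the 50/50 weighting is just an example:

```cpp
// Sketch: blend two source images on the CPU into a locked texture, then let D3D sample it.
#include <d3d9.h>

void BlendIntoTexture(IDirect3DTexture9* tex, const DWORD* srcA, const DWORD* srcB,
                      UINT width, UINT height)
{
    D3DLOCKED_RECT lr;
    if (FAILED(tex->LockRect(0, &lr, NULL, 0)))
        return;
    for (UINT y = 0; y < height; ++y)
    {
        DWORD* row = (DWORD*)((BYTE*)lr.pBits + y * lr.Pitch);
        for (UINT x = 0; x < width; ++x)
        {
            DWORD a = srcA[y * width + x];
            DWORD b = srcB[y * width + x];
            // 50/50 per-channel average; the 0xFEFEFEFE mask keeps channels from bleeding.
            row[x] = ((a & 0xFEFEFEFE) >> 1) + ((b & 0xFEFEFEFE) >> 1);
        }
    }
    tex->UnlockRect(0);
}
```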


For a reasonable baseline, consider the following.

 

The ATI R300 was introduced in 2002.

The GeForce FX was introduced in 2003.

The Intel GMA 900 was introduced in 2004.

 

All of these are D3D9 parts and all support shader model 2; therefore, to find any common consumer hardware that doesn't support shaders at this level, you need to go back roughly a decade.

 

Now look at the latest Steam Hardware Survey: http://store.steampowered.com/hwsurvey - in this case, a total of 99.53% of the machines surveyed have SM2 or higher capable GPUs, and almost 98% have SM3 or better.

 

Concerning the "roll your own texture blending on the CPU" approach, this is a BAD idea for a number of reasons.  The major reason is that it's more-or-less guaranteed to introduce many CPU/GPU synchronization points per frame, which is the number 1 cause of performance loss.  A secondary reason is that the CPU will never be as fast as the GPU for this kind of operation.  A third reason is that you're introducing all manner of latency and bandwidth considerations to your program.

 

The thing is - these are exactly the kind of problems that shaders solve for you.  You get to have any kind of arbitrary blend you wish without having to deal with synchronization or latency problems, and you get massive parallelism completely for free.  The end result is simpler, faster and more robust code that runs well on the stupefyingly overwhelming majority of hardware.
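
As a rough sketch of how little "plug and play" shader code that can amount to, here's a two-texture weighted blend in ps_2_0 compiled at runtime through D3DX (the HLSL source, register choices, and helper name are assumptions, not anyone's shipping code; link against d3dx9.lib):

```cpp
// Sketch: compile a tiny weighted-blend pixel shader at runtime.
#include <d3d9.h>
#include <d3dx9.h>
#include <string.h>

static const char* g_blendPS =
    "sampler texA  : register(s0);                                \n"
    "sampler texB  : register(s1);                                \n"
    "float4 weight : register(c0);    // blend factor in weight.x \n"
    "float4 main(float2 uv : TEXCOORD0) : COLOR0                  \n"
    "{                                                            \n"
    "    return lerp(tex2D(texA, uv), tex2D(texB, uv), weight.x); \n"
    "}                                                            \n";

IDirect3DPixelShader9* CreateBlendShader(IDirect3DDevice9* device)
{
    ID3DXBuffer* code = NULL;
    ID3DXBuffer* errors = NULL;
    IDirect3DPixelShader9* ps = NULL;
    if (SUCCEEDED(D3DXCompileShader(g_blendPS, (UINT)strlen(g_blendPS),
                                    NULL, NULL, "main", "ps_2_0", 0,
                                    &code, &errors, NULL)))
    {
        device->CreatePixelShader((const DWORD*)code->GetBufferPointer(), &ps);
    }
    if (code)   code->Release();
    if (errors) errors->Release();
    return ps;   // at draw time: SetPixelShader, SetTexture(0/1, ...), SetPixelShaderConstantF(0, ...)
}
```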

 

The very fact that you've introduced this, and coupled with some of your previous posts, leads me to suspect that you're the type of person who doesn't trust the hardware, that you have a preference for doing things yourself in software even if it comes at the expense of your own code quality or performance.  That's an attitude you need to lose, to be honest.


The very fact that you've introduced this, and coupled with some of your previous posts, leads me to suspect that you're the type of person who doesn't trust the hardware, that you have a preference for doing things yourself in software even if it comes at the expense of your own code quality or performance.  That's an attitude you need to lose, to be honest.

 

No, I'm just lazy. <g>

 

A lazy perfectionist, that's me.

 

So I want results, which will probably require shaders, but I don't want to write shaders unless necessary. Thus my question about the availability of "plug and play" boilerplate shader code.

 

Based on the performance I'm getting now, and what I want to get, it's starting to look like shaders will be in my near future.

 

Based on that, the system requirements should change from DX9 fixed function to whatever the typical requirements are for a title coming out in the near future.

 

This means I'll be able to take advantage of newer hardware.

 

The real-time texture blender I wrote was an act of desperation. When I simply cannot get the required effect from the existing libraries, I'm forced to write my own low-level stuff.

 

I've only had to do it four times in 32 years of writing PC games:

1. A real-time zoom, scale, mirror, and rotate blitter in assembly for a blitter engine, around the time of Wing Commander II.

2. A perspective-correct texture-mapped poly engine, when MS bought rend386 and before they re-released it as DirectX 1.0.

3. A 50-channel real-time wave mixer, around the time when S.O.S., MILES Audio, and Diamondware SDK were the preferred audio solutions and typically only supported 8 channels with no "stepping".

4. And the texture blender (DirectX 8 era).

 

While I can do low-level stuff, I prefer building games. To me, low-level stuff is a necessary evil.

 

[famous gamedev quote of mine coming up...]

 

but "Sometimes you gotta break a few eggs to make a REAL mayonnaise".
