jmoyers

DX9: Alpha Transparency Blending, Particle System



Howdy. I'm coding up a relatively simple engine for fun. I haven't written C++ in years and thought I'd jump in pretty heavy; I've been working on it for about a week now. My question is twofold:

1. Alpha transparency using the alpha channel built into textures. I have it implemented, and it seemed to work just dandy. But then came the fact that I had not implemented back-to-front drawing. As far as I can tell (and from a whole lot of searching), my problem is that when objects happen to get drawn front to back from time to time, especially in the semi-random particle generator system I just coded, the nearer transparent geometry causes a depth-test fail where it shouldn't (the pixels are transparent, but they still occlude). You can tell because a black square occasionally outlines the texture. The behavior can be seen here: (excuse the wee pyramid, that was from a previous test) Could somebody confirm my understanding of this issue? I'd really like to not have to sort the objects by their z coordinate, so if somebody knows a workaround, I'd love to hear it. Also, sorting back to front seems contrary performance-wise, as you are doing a crud-load of overdraw.

2. This question involves particle system design. I started working on it today and ran into some issues. I generally haven't read any articles on this and just went on instinct, as I couldn't find many resources that seemed up to date. What I generally heard was: point sprites suck. So I made it geometry-based, the default being a quad, since that was easy enough to texture-map. I heard point sprites are shoddily supported across different hardware; is that still true by today's standards? I'm using an STL list contained within a ParticleGenerator class to store the particles from a specific source (i.e., one instantiation of the ParticleGenerator class). I then merge all the particle generators into another STL list of pointers.
From that list I create and re-create a vertex buffer specifically for the particles every render cycle (since some particles die and more are generated according to a set frequency and lifetime). Is there a better way to do this? Is recreating a vertex buffer every cycle bad practice? Now that I think about it, since every particle from a given generator currently uses the same geometry, I could do away with a lot of that. But the question still stands: say I wanted a single ParticleGenerator to load a dynamic number of models and spit them out at random, or say I wanted to morph geometry in general.

I've based all translation calculations on GetTickCount(), because I found time() to be, uh, lacking. Is this the most appropriate and accurate function I could be using?

Any comments on any of this I'd be very glad to hear. This is basically just going on discussion between a friend and me; neither of us is a graphics guru, though he's written a ray tracer.

[Edited by - jmoyers on April 26, 2007 12:54:35 PM]

Answering your two main questions:

1. You'd get something that looked like that if:

a) D3DRS_ZWRITEENABLE was TRUE. You need to disable Z writes for things with transparency.

and/or

b) Alpha blending isn't enabled or the blend mode you have set up is incorrect. D3DRS_ALPHABLENDENABLE=TRUE, D3DRS_SRCBLEND=D3DBLEND_SRCALPHA, D3DRS_DESTBLEND=D3DBLEND_INVSRCALPHA is the most common set up for straight transparency when using sprites which have an alpha channel. D3DRS_ALPHABLENDENABLE=TRUE, D3DRS_SRCBLEND=D3DBLEND_ONE, D3DRS_DESTBLEND=D3DBLEND_ONE is a common set up for 'glowing' sprites that saturate to white when rendered on top of each other; ONE:ONE blends don't need alpha in the sprite (if that's the effect you're after).

and/or

c) You are using plain old SRCALPHA:INVSRCALPHA transparency and your texture doesn't have any alpha information (either in a D3DFORMAT that doesn't have an alpha channel or the whole of that alpha channel is set to be opaque).

and/or

d) Your texture blending setup (legacy/fixed function) or pixel shader (programmable) isn't set up to write the alpha from the texture (or other source) properly.
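Pulling (a), (b) and (d) together, the fixed-function D3D9 states might be set up like the following sketch. This is not code from the thread: `device` is a hypothetical valid `IDirect3DDevice9*`, and the texture-stage lines assume you want the texture's alpha modulated by diffuse alpha.

```cpp
// Typical state block for alpha-blended sprites (D3D9 fixed function).
// Assumes `device` is a valid IDirect3DDevice9* (hypothetical name).
device->SetRenderState(D3DRS_ZENABLE, TRUE);                // still depth-test against the opaque scene
device->SetRenderState(D3DRS_ZWRITEENABLE, FALSE);          // (a) no depth writes for transparent things
device->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);       // (b) enable blending
device->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);  //     straight transparency
device->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_INVSRCALPHA);
// (d) fixed function: take alpha from the texture, modulated by diffuse
device->SetTextureStageState(0, D3DTSS_ALPHAOP,   D3DTOP_MODULATE);
device->SetTextureStageState(0, D3DTSS_ALPHAARG1, D3DTA_TEXTURE);
device->SetTextureStageState(0, D3DTSS_ALPHAARG2, D3DTA_DIFFUSE);
```

For the additive 'glow' effect mentioned above, swap both blend factors for D3DBLEND_ONE.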


2. Point sprites are indeed still poorly supported. Not many commercial games use them, so they're always a very low priority for the device driver writers at the graphics card companies and remain buggy or unsupported; and because they're buggy and poorly supported (plus other reasons, like cross-platform compatibility), commercial games don't use them. Catch-22.


Now for your other questions:

3. Sorting:

a) when you render semi-transparent things with SRCALPHA:INVSRCALPHA (linear interpolate) style blend modes, some form of Z sorting is necessary to make them look right. (The linear interpolate operation isn't commutative)

b) with ONE:ONE (additive) style blends, you don't need to sort. (Addition is commutative).

c) For things like particles, you don't often notice when the sort order is wrong (the black outline is a bug elsewhere in your code) so it's quite common to not bother sorting them at all, particularly if the particles or camera are moving quickly.

d) Even if you do sort, it doesn't usually need to be absolutely perfect, so you can rely on fast techniques like coarse bucket sorting: partition Z into, say, 16 lists ("buckets"); when processing the particles, decide which of the 16 buckets each particle falls into and add it to the appropriate list; when rendering, walk each list in turn.

e) A std::sort really isn't so bad performance-wise on a modern machine; you'll have much bigger performance drains than that unless you have a crazy number of particles per system.


4. Overdraw: erm, but that's the nature of alpha blending; you want to see some of what's behind a particle, so you MUST overdraw. Avoiding unnecessary overdraw is only an optimisation for opaque things, where it can be avoided. Tip: get it working, profile to find what's slow, then worry about optimisation!


5. Destroying an old and creating a new vertex buffer each time you render: NOOOOO! BAD BAD BAD BAD BAD! From the description of your system, this is likely to be the #1 source of bad performance in your whole application!

So what's a more optimal way to do it?

a) when your application first starts, create a DYNAMIC vertex buffer for each particle vertex format you have (probably only one or two). Make it big enough for say 2000-4000 vertices. Make it WRITEONLY too.

b) also at app startup, create and fill an index buffer for your vertex buffer. Since particles are usually quads, every entry follows the same pattern, so the whole thing can be pre-generated. Index buffers reduce the number of vertices that need to be shifted to the card and are required to get any vertex caching.

c) every frame, each time you come to draw a particle system, lock the vertex buffer with the correct flags. See the "What's a good usage pattern for vertex buffers if I'm generating dynamic data?" topic in the DirectX FAQ which you'll find a copy of in the SDK help file. It doesn't matter whether you leave some of the buffer unfilled or if you fill the buffer a few times.

d) generate your particle 'quad' vertices directly into the locked buffer, sequentially, avoiding reads (beware of read-modify-write patterns like "buffer[n].position += w"). You shouldn't need to store the generated quad vertices anywhere except directly in the locked buffer (saving a pointless memory copy).

There are other more advanced variations such as using hardware instancing and handling the simulation on the GPU, but the above is enough to get you pretty optimal results.

The above dynamic VB scheme is also still appropriate if you wanted to use geometry instead of quads, though you might need to also have a dynamic index buffer if the geometry differed from particle to particle.


6. GetTickCount() is ok. QueryPerformanceCounter() or timeGetTime() are ok too.

Best reply ever. Seriously, lots of good information.

Setting ZWRITEENABLE to false before the particle loop fixed the problem of having to sort the particles and sped up my render a lot. std::sort was a severe limitation, and I think I would have implemented my own sort (perhaps a bucket sort like you were saying) if I hadn't read this and realized you could disable Z writes.

Additive blending actually sounds like a cool effect and I may switch to that for my current purposes. Right now, I'm using an emissive material just so I didn't have to screw around with lighting.

I realized my error in creating the vertex buffer over and over again when I hit SUPER bad performance, so I changed it to load one set of vertices per particle generator and have each particle use the same memory offset to access that specific object in the vertex buffer. Worked wonders.

That information you gave on dynamic vertex buffer is great information for the water deformation routines my friend is writing, though.

I've never written anything that uses tick counts to try to keep calculations realtime. That said, I have a problem that is perhaps a topic for a different thread in a different section of the forum, but I'll put it here since this thread is already going (feel free to move it where you please, mods).

When the CPU bogs down for whatever reason, you get fewer renders per second. Since my generateParticle routine is driven by calls to ParticleGenerator.update() (i.e., essentially dependent on framerate), I have done something like the following:


int currTime = GetTickCount();
int numParticlesToGenerate = (currTime - lastTick) / frequency;
while (numParticlesToGenerate > 0) {
    generateParticle(0);
    numParticlesToGenerate--;
    lastTick = GetTickCount();
}



Where frequency is a variable you set to the interval in milliseconds between particle generations. The only problem with this setup is that if the CPU is really bogged down, you get "clumps" of particles, all generated at the same GetTickCount(). Any good way to solve this, or any good articles I might read on the subject? I've tried setting a negative life offset that feeds into all the calculations that determine a particle's current world position, but without much luck.

Anyway, here are some screens of the particle generator working properly. Using std::sort, it starts to bog down around 1000 particles. Without it, using the zwriteenable-false technique, it can go up to 10000 particles.

This one is using a pyramid as the model instead of a quad.


Shows a negative acceleration in the Y.


Kind of neat transforming the particle generator...



More suggestions or reading would be rad.

Best regards,
Josh

[Edited by - jmoyers on April 26, 2007 12:14:29 PM]

Solved my problem of both the "clumps" and an error in my velocity and distance calculations for each particle. Thought I'd share them with anyone who was interested.


int currTime = GetTickCount();
float numParticlesToGenerate = (float)(currTime - lastTick) / frequency;
if ((int)numParticlesToGenerate > 0)
    lastTick = GetTickCount();
while ((int)numParticlesToGenerate > 0) {
    numParticlesToGenerate--;
    generateParticle(-numParticlesToGenerate * frequency);
}




Where generateParticle accepts an offset for the startingLife of the particle being generated. This allows for the following:

Say the period of generation is 10 ms and 24 ms have passed since the last render cycle, so 2 particles are generated this cycle.

The life of the first particle generated will be offset by 14 ms and the second by 4 ms; that is, each particle is back-dated to the moment it should have been emitted. Resetting lastTick rounds a bit of time off per render cycle (up to one period's fractional remainder), but it's not noticeable, and it keeps a steady flow. There are ways to get it more accurate (carrying the remainder over, etc.).

Secondly, something I didn't think about when I was calculating distance traveled: if there is acceleration of any kind, you are actually integrating under a curve (or under a sloped line, if the acceleration is constant). I was doing the calculations discretely, which treated the velocity over the last however-many ms as constant at its highest, end-of-step value, so the flow moved faster than it should have. For constant acceleration the displacement over deltaT is d = v*deltaT + 0.5*a*deltaT^2. The revised code looks like the following:


while (current != particleList.end()) {
    // Remove dead particles first.
    while ((*current)->isDead()) {
        delete (*current);
        current = particleList.erase(current);
        if (current == particleList.end())
            break;
    }
    if (current != particleList.end()) {
        D3DVECTOR currVelocity = (*current)->getLastVelocity();
        D3DVECTOR acc = (*current)->getAcceleration();
        int deltaT = (*current)->getDeltaT();

        // d = v*t + 0.5*a*t^2, per axis
        D3DXMATRIX translation2;
        D3DXMatrixTranslation(
            &translation2,
            deltaT * (currVelocity.x + 0.5f * acc.x * deltaT),
            deltaT * (currVelocity.y + 0.5f * acc.y * deltaT),
            deltaT * (currVelocity.z + 0.5f * acc.z * deltaT));
        (*current)->getVelocity(); // recomputes the stored velocity; return value unused
        (*current)->translate(translation2);
        current++;
    }
}




Results look like the following:


Now to start on some cool engine stuff.. like I don't know. Shaders or something.
