Optimizing entity rendering?

Started by
11 comments, last by larspensjo 11 years ago

Hello,

I've already implemented a basic rendering system for my entities. It's working, but it's not very efficient. Take a look:


class DeferredRenderSystem : public EntitySystem<DeferredRenderSystem>
{
public:
    DeferredRenderSystem(Gfx3D& gfx3D): m_pGfx3D(&gfx3D) {};
    void Update(EntityManager& entityManager)
    {
        //todo: HUGE cleanup
        EntityManager::entityVector* vEntities = entityManager.EntitiesWithComponents<ModelComponent, Position>();
        if(vEntities)
        {
            const Camera* pCamera = m_pGfx3D->GetCamera();
            D3DXMATRIX mWorld, mTrans, mRotX, mRotY, mRotZ, mScale;
            Effect* effect;
            Model* model;
            Material* material;
            // set g-buffer
            m_pGfx3D->Set3DRenderTarget(0, RenderTargets::DEFERRED_MATERIAL);
            m_pGfx3D->Set3DRenderTarget(1, RenderTargets::DEFERRED_POSITION);
            m_pGfx3D->Set3DRenderTarget(2, RenderTargets::DEFERRED_NORMAL);
            m_pGfx3D->ClearAllBound();
            for(auto entity : *vEntities)
            {
                D3DXMatrixIdentity(&mWorld);
                Position* pos = entity->GetComponent<Position>();
                ModelComponent* mod = entity->GetComponent<ModelComponent>();
                //render components
                effect = m_pGfx3D->GetEffect(mod->m_effect);
                model = m_pGfx3D->GetModel(mod->m_model);
                material = m_pGfx3D->GetMaterial(mod->m_material);
                //set effect constants
                effect->SetTexture("Material", m_pGfx3D->GetTexture(material->GetTexture()));
                effect->SetTexture("Bump",  m_pGfx3D->GetTexture(material->GetBumpMap()));
                effect->SetMatrix("ViewProj", pCamera->GetViewProjectionMatrix());

                //rotation
                if(Rotation* pRotation = entity->GetComponent<Rotation>())
                {
                    D3DXMatrixRotationX(&mRotX, DEGTORAD(pRotation->m_x));
                    D3DXMatrixRotationY(&mRotY, DEGTORAD(pRotation->m_y));
                    D3DXMatrixRotationZ(&mRotZ, DEGTORAD(pRotation->m_z));
                    mWorld *= mRotX;
                    mWorld *= mRotY;
                    mWorld *= mRotZ;
                }
                //scaling
                if(Scalation* pScalation = entity->GetComponent<Scalation>())
                {
                    D3DXMatrixScaling(&mScale, pScalation->m_x, pScalation->m_y, pScalation->m_z);
                    mWorld *= mScale;
                }
                //translation
                D3DXMatrixTranslation(&mTrans, pos->m_x, pos->m_y, pos->m_z);
                mWorld *= mTrans;
                effect->SetMatrix("World", mWorld);

                effect->Begin();
                model->Render();
                effect->End();
            }
            delete vEntities;
            //unbind g-buffer
            m_pGfx3D->Set3DRenderTarget(0, RenderTargets::BACKBUFFER);
            m_pGfx3D->Set3DRenderTarget(1, RenderTargets::BACKBUFFER);
            m_pGfx3D->Set3DRenderTarget(2, RenderTargets::BACKBUFFER);
        }
    }

private:
    Gfx3D* m_pGfx3D;
};

So, aside from me not using a render queue (to be resolved another time), there are two main problems:

- I have to reconstruct and multiply all matrices each frame, even if the object did not change at all.

- I have to access the model, effect, and material from my graphics module every frame, even if they did not change.

This is due to my attempt to follow the approach that each entity's components should hold only data, and as little data as possible, so the involved components look like this:


struct Position : Component<Position>
{
	Position(float x, float y, float z): m_x(x), m_y(y), m_z(z) {};

	float m_x, m_y, m_z;
};

struct ModelComponent : Component<ModelComponent>
{
	ModelComponent(LPCWSTR model, unsigned int material, LPCWSTR effect): m_model(model), m_material(material), m_effect(effect) {};

	std::wstring m_model, m_effect;
	unsigned int m_material;
};

The obvious advantage is that my components are completely independent of the actual render implementation and are easily editable, e.g. in my editor. So I don't think I should change this entirely (unless someone can give me a good reason to do so), but rather introduce some sort of caching. My question is now: which one of these approaches would you consider the best (most flexible, ...), and why?

- Give each component additional members, such as a translation matrix and a dirty flag for my position component. The rendering system would then use these components and, if the dirty flag is set, update the matrix from the component's float position members (the same applies to the model component, etc.).

- Same as above, but instead of a dirty flag with the render system being responsible for updating the matrix, access the position through getters and setters, so that the matrix is always updated whenever the position changes.

- Have a CacheComponent on each entity that is responsible for storing things like the actual translation matrix for the position component. As above, I would have a dirty flag and have the render system update the cached matrices if necessary.

- Same as above, but instead of a CacheComponent, have each system implement the caching functionality itself, working the same way.

- Something else?

What do you think would be the "best" approach here?

A dirty flag seems entirely reasonable to store on your position component and set whenever the position changes. You can add setters to the component for changing the position that will automatically set the flag, and the render system can check that flag before it updates cached transformation matrices in your model component. Usually I'd be all for creating an entirely new RenderComponent just for this purpose, but I see the "position dirty" flag being useful in other gameplay scenarios and the model component is already handling a lot of render-specific stuff.
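
A minimal sketch of the dirty-flag idea described above, under some loud assumptions: `Mat4` is a plain array standing in for `D3DXMATRIX` so the snippet compiles without D3DX, and `CachedTransform`, `MakeTranslation` and `GetWorld` are hypothetical names that do not appear in the original code. Only the translation part is cached here; rotation and scale would follow the same pattern.

```cpp
#include <array>

// Hypothetical stand-in for D3DXMATRIX (row-major, translation in the last row).
using Mat4 = std::array<float, 16>;

struct Position
{
    Position(float x, float y, float z) : m_x(x), m_y(y), m_z(z) {}

    // Every write goes through the setter, so the flag cannot be forgotten.
    void Set(float x, float y, float z)
    {
        m_x = x; m_y = y; m_z = z;
        m_dirty = true;
    }

    float X() const { return m_x; }
    float Y() const { return m_y; }
    float Z() const { return m_z; }

    bool IsDirty() const { return m_dirty; }
    void ClearDirty() { m_dirty = false; }

private:
    float m_x, m_y, m_z;
    bool m_dirty = true; // starts dirty so the first frame builds the cache
};

// Hypothetical render-side cache, e.g. kept alongside the model data.
struct CachedTransform
{
    Mat4 world{};
    int rebuildCount = 0; // for demonstration only
};

inline Mat4 MakeTranslation(float x, float y, float z)
{
    Mat4 m{};                              // all zeros
    m[0] = m[5] = m[10] = m[15] = 1.0f;    // identity diagonal
    m[12] = x; m[13] = y; m[14] = z;       // translation row
    return m;
}

// Rebuilds the cached matrix only when the position has changed.
inline const Mat4& GetWorld(Position& pos, CachedTransform& cache)
{
    if (pos.IsDirty())
    {
        cache.world = MakeTranslation(pos.X(), pos.Y(), pos.Z());
        ++cache.rebuildCount;
        pos.ClearDirty();
    }
    return cache.world;
}
```

One caveat with this design: if more than one system wants to react to position changes, a single flag that the render system clears is not enough, and a per-consumer flag or a change counter might be a better fit.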

Another optimization you'll want to perform is splitting your rendering into three phases; an initial phase where you make a list of everything that needs to be rendered, a second phase where you sort all those objects by effect and material to reduce renderstate changes, and a third phase where you actually perform the render. You mentioned wanting to add a "renderqueue", but I wasn't sure if this is what you meant by that.
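
The three phases could be sketched roughly like this. The integer ids are an assumption (the original components use strings for the model and effect and an integer for the material), and `RenderItem`, `SortQueue` and `CountEffectSwitches` are hypothetical names. The last function only simulates phase three by counting how often the effect would have to be switched.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical queue entry filled in phase 1 (collecting everything visible).
struct RenderItem
{
    std::uint32_t effectId;
    std::uint32_t materialId;
    std::uint32_t modelId;
};

// Phase 2: order by effect first, then material, so equal state ends up adjacent.
inline void SortQueue(std::vector<RenderItem>& queue)
{
    std::sort(queue.begin(), queue.end(),
              [](const RenderItem& a, const RenderItem& b)
              {
                  return std::tie(a.effectId, a.materialId)
                       < std::tie(b.effectId, b.materialId);
              });
}

// Phase 3 (simulated): walk the queue and count effect switches.
inline int CountEffectSwitches(const std::vector<RenderItem>& queue)
{
    int switches = 0;
    bool first = true;
    std::uint32_t current = 0;
    for (const RenderItem& item : queue)
    {
        if (first || item.effectId != current)
        {
            // A real renderer would end the previous effect and begin the new one here.
            current = item.effectId;
            first = false;
            ++switches;
        }
        // A real renderer would set per-object constants and draw here.
    }
    return switches;
}
```

With four items alternating between two effects, the unsorted queue needs four switches and the sorted one needs two.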

1. Do you ever need scaling?

2. Why do you build rotation matrices for each axis individually?

3. After seeing this:

delete vEntities;

I assume that this then creates it on the heap every frame?

EntityManager::entityVector* vEntities = entityManager.EntitiesWithComponents<ModelComponent, Position>();

This is expensive.

4. Do these functions perform some kind of search?

effect = m_pGfx3D->GetEffect(mod->m_effect);
model = m_pGfx3D->GetModel(mod->m_model);
material = m_pGfx3D->GetMaterial(mod->m_material);

5. What's going on behind the scenes for these functions?

effect->Begin();
effect->End();

@belfegor:

1. Do you ever need scaling?

I assume so, yes. I'm planning to have various effects that increase or decrease the size of certain units and/or buildings, for example in one of my study projects.

2. Why do you build rotation matrices for each axis individually?

Because... you've never seen me do this :O ... heck, I really didn't find the D3DXMatrixRotationYawPitchRoll() function :/

3. After seeing this:

delete vEntities;



I assume that this then creates it on the heap every frame?

EntityManager::entityVector* vEntities = entityManager.EntitiesWithComponents<ModelComponent, Position>();

This is expensive.

Yes, you are correct, I'm well aware that this part will need some major optimization. Any advice on this? I assume that the best thing would be to write a custom iterator that simply skips all entities without the given components (a bitset is used to determine this), but I could use some help on this one, maybe a tutorial on writing custom vector iterators (if this is possible in that form)?

4. Do these functions perform some kind of search?

effect = m_pGfx3D->GetEffect(mod->m_effect);
model = m_pGfx3D->GetModel(mod->m_model);
material = m_pGfx3D->GetMaterial(mod->m_material);

Something along those lines:


Model* Gfx3D::GetModel(const std::wstring& sFileName)
{
	std::wstring sDir = L"../../Repo/Game/Meshes/";
	sDir += sFileName;
	if(!HasModel(sDir))
	{
		Model* pModel = new Model(*m_pDevice, sDir);
		m_mModels[sDir] = pModel;
		return pModel;
	}
	else
	{
		return m_mModels[sDir];
	}
}

bool Gfx3D::HasModel(const std::wstring& sFileName) const
{
	return m_mModels.find(sFileName) != m_mModels.end();
}

5. What's going on behind the scenes for these functions?

effect->Begin();
effect->End();

Those are the high-level wrappers for whatever effect framework might be used. Since I'm currently using only the D3DXEffect framework, they call Begin() and BeginPass(0) as well as EndPass() and End(). This might of course get optimized when I start implementing my render queue...

@Zipster:

A dirty flag seems entirely reasonable to store on your position component and set whenever the position changes. You can add setters to the component for changing the position that will automatically set the flag, and the render system can check that flag before it updates cached transformation matrices in your model component. Usually I'd be all for creating an entirely new RenderComponent just for this purpose, but I see the "position dirty" flag being useful in other gameplay scenarios and the model component is already handling a lot of render-specific stuff.

Thanks, this comes close to my thoughts, so I think I'm going to stick with this.

Another optimization you'll want to perform is splitting your rendering into three phases; an initial phase where you make a list of everything that needs to be rendered, a second phase where you sort all those objects by effect and material to reduce renderstate changes, and a third phase where you actually perform the render. You mentioned wanting to add a "renderqueue", but I wasn't sure if this is what you meant by that.

Yes, this is exactly what I was referring to. Nevertheless, even with the best render queue as you described it, I would still need a way to avoid those unnecessary matrix creations etc., hence my question.

Yes, you are correct, I'm well aware that this part will need some major optimization. Any advice on this?

I really have no idea, but like you I would like to hear some opinions on this matter as well.

For now I just have a bunch of different vectors, one for each type of "entity".


effect->Begin();

If I understand correctly, this will set the vertex/pixel shader for every entity? This might have a big impact on performance.

I really have no idea, but like you I would like to hear some opinions on this matter as well.

For now I just have a bunch of different vectors, one for each type of "entity".

This was one of my first thoughts too. But my entities are composed of a whole lot of components (right now I have 12 implemented), and those can theoretically be combined arbitrarily, which means that with a separate vector for each "type" of entity I'd need up to 2^12 of them right now, so this is out of the question. Unless I made some assumptions about which components can actually be put together, but this would defeat the purpose and make it less modular, right?

If I understand correctly, this will set the vertex/pixel shader for every entity? This might have a big impact on performance.

Well, it really depends on how the D3DXEffect framework handles things internally, but based on what I know, this might be the case. I noticed a significant rise in FPS (a bad way to measure, I know) when moving the begin and end calls out of the loop and having an "ApplyChanges" call before every model render call. That isn't an option in general, since every model can have a different effect, but this will be solved once I implement my render queue. The queue will also eliminate unnecessary vertex and index buffer changes as well as shader constant changes, etc. But I'll tackle that one another time.

Also, I'm surprised this hasn't been mentioned:

What does your profiler tell you?

Rather than simply guessing at what is slow, use a profiler, which will take measurements and pinpoint exactly what is slow.

Then you can look at improving those specific items.

It looks like you are using lazy loading in your Search/Get functions, but for your entities, try to use a quicker algorithm for find() if you can, it's usually linear time in complexity. I used a binary search algorithm for named entities, at the expense of taking a slightly longer insertion time when ordering them alphabetically. Also, make sure to handle name clashes elegantly if you happen to try to create a new entity with the name of an existing one.

I would store one world transformation as a single component instead of having one each for Position, Rotation and Scaling. If you need them, you can extract position and scale very easily from a 4x4 matrix (rotation takes a bit more work though). Store rotations as quaternions, and you can still allow the option to take any sort of input like Euler and Axis-Angle, and convert those internally.
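
To illustrate the claim that position and scale are easy to pull back out of a 4x4 world matrix, here is a small sketch. It assumes the D3D row-vector layout (translation in the last row) and a matrix composed as scale, then rotation, then translation; `Mat4`, `Vec3`, `ExtractTranslation` and `ExtractScale` are hypothetical names, not D3DX functions. Negative scale factors cannot be recovered this way, since only the row lengths are read.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for D3DXMATRIX: row-major, translation in the last row.
using Mat4 = std::array<float, 16>;

struct Vec3 { float x, y, z; };

// The translation is simply the first three entries of the last row.
inline Vec3 ExtractTranslation(const Mat4& m)
{
    return { m[12], m[13], m[14] };
}

// For world = scale * rotation * translation, each of the first three rows is a
// unit rotation axis multiplied by one scale factor, so its length is that factor.
inline Vec3 ExtractScale(const Mat4& m)
{
    auto rowLength = [&m](int row)
    {
        const float a = m[row * 4 + 0];
        const float b = m[row * 4 + 1];
        const float c = m[row * 4 + 2];
        return std::sqrt(a * a + b * b + c * c);
    };
    return { rowLength(0), rowLength(1), rowLength(2) };
}
```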

Also, are you reading the debug runtime messages and warnings for redundant state changes?

New game in progress: Project SeedWorld

My development blog: Electronic Meteor

What does your profiler tell you?

Rather than simply guessing at what is slow, use a profiler, which will take measurements and pinpoint exactly what is slow.

Well, my question isn't necessarily motivated by an actual performance issue. I did profile (it showed me that GetModel(), GetEffect() and GetTexture() are by far the methods that take up the most time), but it's more like this: I wrote this render system as quick and dirty as possible to get things showing up, well aware that the code is anything but optimal. I asked this question based on the assumption that e.g. creating those matrices every frame is slower than storing them. While this is an assumption, isn't it a pretty accurate one? Avoiding up to 5 matrix multiplications per frame per object should do at least something; it might not be the #1 performance killer in the app, but it's still going to be faster... right? One might call this premature optimisation, but isn't it pretty obvious in this case?
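
Since the profile points at the Get functions, one possible direction is sketched below. The posted GetModel() builds a path string and searches the map twice on every call (once in HasModel(), once in operator[]). This sketch does a single lookup per call and lets the render side remember the resolved pointer, so the string lookup happens once per entity rather than once per frame. `ModelCache`, `ResolvedModel`, `Resolve` and the `Meshes/` directory are all hypothetical, and `Model` is reduced to a stub.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Stub standing in for the real Model class.
struct Model
{
    std::wstring path;
};

class ModelCache
{
public:
    // Returns the model for fileName, loading it on first use.
    // A cache hit costs one hash lookup and no string concatenation.
    Model* Get(const std::wstring& fileName)
    {
        auto it = m_models.find(fileName);
        if (it == m_models.end())
        {
            ++m_loads;
            it = m_models.emplace(fileName,
                                  std::make_unique<Model>(Model{ m_dir + fileName })).first;
        }
        return it->second.get();
    }

    int LoadCount() const { return m_loads; }

private:
    std::wstring m_dir = L"Meshes/"; // placeholder directory
    std::unordered_map<std::wstring, std::unique_ptr<Model>> m_models;
    int m_loads = 0;
};

// Hypothetical render-side data kept per entity, outside the data-only component.
struct ResolvedModel
{
    Model* model = nullptr;
};

// Resolves the name once; later frames reuse the stored pointer.
inline Model* Resolve(ResolvedModel& resolved, ModelCache& cache, const std::wstring& name)
{
    if (!resolved.model)
        resolved.model = cache.Get(name);
    return resolved.model;
}
```

The component itself can stay a plain name, which keeps it editor-friendly. The stored pointer would need to be reset if the name on the component changes, for example with the same dirty-flag approach discussed earlier in the thread.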

It looks like you are using lazy loading in your Search/Get functions, but for your entities, try to use a quicker algorithm for find() if you can, it's usually linear time in complexity. I used a binary search algorithm for named entities, at the expense of taking a slightly longer insertion time when ordering them alphabetically. Also, make sure to handle name clashes elegantly if you happen to try to create a new entity with the name of an existing one.

Hm, interesting thought, but are you sure we are talking about the same topic here? I wouldn't want to search my entities by name. In this case, for example, I want every entity that has a position and a model component, no matter what the entity itself is. Wouldn't it be counterproductive if I had to search for entities by name here? All my render system cares about is that it can actually render an entity (i.e. it has at least a position and a renderable). I might add some more criteria once I get to implementing more complex things... The actual implementation of "EntitiesWithComponents" doesn't even use find at all; it uses this:


template<typename C, typename... Components>
EntityManager::entityVector* EntityManager::EntitiesWithComponents(void)
{

    ComponentMask mask = componentMask<C, Components...>();
    entityVector* vMatchingEntities = new entityVector;
    for(entityVector::const_iterator ii = m_vEntities.cbegin(); ii != m_vEntities.cend(); ++ii)
    {
        if( (mask & (*ii)->GetMask()) == mask)
            vMatchingEntities->push_back(*ii);
    }
    return vMatchingEntities;
}

I store all my entities in a vector<>. Now I know this implementation is lacking, but because I might need to access entities by any combination of components at any time, there is not much room for optimisation, at least on that level. As I suggested, I might use a custom iterator to simply skip over all entities that don't match the wanted components, but other than that, I don't really know what else to optimize.
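
One way to get rid of the per-frame `new`/`delete` without writing a full custom iterator is a visitor that applies a callback to each matching entity in place. This is a swapped-in technique, not the iterator approach mentioned above. The sketch assumes the component mask is a `std::bitset`, and `Entity` and `ForEachWithMask` are simplified, hypothetical stand-ins for the real classes.

```cpp
#include <bitset>
#include <vector>

using ComponentMask = std::bitset<32>; // assumed width

// Minimal stand-in for the real entity class.
struct Entity
{
    ComponentMask mask;
    int id;

    const ComponentMask& GetMask() const { return mask; }
};

// Calls fn(entity) for every entity whose mask contains all bits of `required`.
// No temporary vector is allocated.
template<typename Fn>
void ForEachWithMask(const std::vector<Entity*>& entities,
                     const ComponentMask& required,
                     Fn&& fn)
{
    for (Entity* entity : entities)
    {
        if ((entity->GetMask() & required) == required)
            fn(*entity);
    }
}
```

The render system's loop body would then move into the callback. This still tests every entity each frame, so it removes the allocation but not the linear scan.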

Also, are you reading the debug runtime messages and warnings for redundant state changes?

Yes, I do. Or rather, I tried; there are way too many of them. But again, those might be reduced to a minimum once I actually implement a render queue, so right now I don't see much point in reducing them. I guess my next goal should really be that render queue, but I somehow can't motivate myself to do it :/

Hi,

Little bit off-topic, sorry about that, but where did you get your "Component" system ideas from? Is that from a specific framework?

best regards
