Indoor rendering

All clear, thanks for your replies. Will come back with the results.

perfection.is.the.key
As promised, I'm back with the results.

We went the portal way, so a level designer places portals manually as faces, describing each portal's visibility in terms of the sectors it sees.

Each room is a separate object, so when a portal is placed we define which objects it sees.




indoor6.jpg







The engine itself checks which portals are seen in the frustum (by testing each portal's AABB against the frustum) and generates a list of sectors visible from the camera.

indoor7.jpg
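To make that check concrete, here is a minimal sketch of an AABB-vs-frustum test and the gathering of the visible sector list. The Vec3/Plane/Frustum types and the Portal record are invented for the example; the engine above uses its own CBoundingBox and portal classes.

#include <vector>

// Assumed minimal math types - placeholders for the engine's own classes.
struct Vec3 { float x, y, z; };

struct Plane
{
    Vec3  n;    // plane normal, pointing to the inside of the frustum
    float d;    // plane equation: dot(n, p) + d >= 0 for points on the inside

    float Distance(const Vec3& p) const { return n.x * p.x + n.y * p.y + n.z * p.z + d; }
};

struct AABB    { Vec3 min, max; };
struct Frustum { Plane planes[6]; };

// Conservative AABB-vs-frustum test: the box is rejected only if it lies
// completely behind one of the planes (the classic "positive vertex" trick).
bool AABBIntersectsFrustum(const AABB& box, const Frustum& fr)
{
    for (const Plane& pl : fr.planes)
    {
        // Pick the corner of the box farthest along the plane normal.
        Vec3 p;
        p.x = (pl.n.x >= 0.0f) ? box.max.x : box.min.x;
        p.y = (pl.n.y >= 0.0f) ? box.max.y : box.min.y;
        p.z = (pl.n.z >= 0.0f) ? box.max.z : box.min.z;
        if (pl.Distance(p) < 0.0f)
            return false;   // whole box is behind this plane
    }
    return true;            // intersecting or fully inside
}

// Hypothetical portal record: its AABB plus the PVS (ids of the sectors it "sees").
struct Portal { AABB aabb; std::vector<int> pvs; };

// Gather the sectors that become visible through the portals currently in view.
std::vector<int> GatherVisibleSectors(const std::vector<Portal>& portals, const Frustum& fr)
{
    std::vector<int> visible;
    for (const Portal& portal : portals)
        if (AABBIntersectsFrustum(portal.aabb, fr))
            visible.insert(visible.end(), portal.pvs.begin(), portal.pvs.end());
    return visible;
}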




I think this would satisfy the requirements for this project.

Now I need to 'cut' the frustum down to the portal size while iterating through the portals.

Will of course come back after this gets implemented.


perfection.is.the.key
The images aren't loading for me. Can you check the URLs?
It was working earlier today; I guess the server is just down.

@xynapse

There are two alternatives to cutting. A simple one is to just adjust the four frustum planes to 'touch' the new portal; this keeps the checking simple and the results are as good as with cutting.

The other alternative is to create a portal 'stack', where you just keep adding frustum planes based on the new portals. The nice part about that is that you can vectorize it nicely (e.g. SSE), and you can add 'anti-portals' by just inverting the frustum planes, e.g. where there is a door etc.
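A minimal sketch of that plane-stack idea as I understand it, assuming the portal corners are wound so that the generated normals face inward; all type and helper names here are invented for the illustration, not actual engine code.

#include <vector>

struct Vec3 { float x, y, z; };

static Vec3  Sub(const Vec3& a, const Vec3& b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3  Cross(const Vec3& a, const Vec3& b) { return { a.y * b.z - a.z * b.y,
                                                            a.z * b.x - a.x * b.z,
                                                            a.x * b.y - a.y * b.x }; }
static float Dot(const Vec3& a, const Vec3& b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Plane { Vec3 n; float d; };      // dot(n, p) + d >= 0 means "in front"

// The stack starts with the camera frustum planes and grows as portals are crossed.
using PlaneStack = std::vector<Plane>;

// Build one culling plane per portal edge, passing through the camera position.
// 'corners' are the portal polygon vertices, wound so the normals face inward.
void PushPortalPlanes(PlaneStack& stack, const Vec3& eye, const std::vector<Vec3>& corners)
{
    const size_t n = corners.size();
    for (size_t i = 0; i < n; ++i)
    {
        const Vec3& a = corners[i];
        const Vec3& b = corners[(i + 1) % n];
        Vec3 normal = Cross(Sub(a, eye), Sub(b, eye));
        stack.push_back({ normal, -Dot(normal, eye) });   // plane contains the eye point
    }
    // For an anti-portal (occluder, e.g. a closed door) you would build the same
    // planes with flipped normals and treat "behind all of them" as occluded.
}

// A point (or a conservative bounding-volume test built on top of this) is visible
// only if it lies in front of every plane currently on the stack.
bool PointVisible(const PlaneStack& stack, const Vec3& p)
{
    for (const Plane& pl : stack)
        if (Dot(pl.n, p) + pl.d < 0.0f)
            return false;
    return true;
}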

Sorry, the server was under maintenance yesterday.

Now, as far as I can tell, the portal approach is a very straightforward and easygoing technique.




Here is the portal class definition; I know some people might find it useful when searching the forum, so it might help them a bit.






class C3DPortal
{
public:
    C3DPortal();
    ~C3DPortal();

    // Sets up the portal from the PVS definition string exported with the mesh.
    bool Parse(const std::string& strPVSDefinition);

    void SetAABB(const CBoundingBox& rAABB);
    CBoundingBox* GetAABB();

    // Debug rendering of the portal's bounding box.
    void RenderAABB();

private:
    CBoundingBox     m_AABB;             // bounds used for the frustum visibility test
    bool             m_bActive;          // whether the portal is currently active
    std::string      m_strPVSDefinition; // raw definition string as imported
    std::vector<int> m_iPVS;             // ids of the sectors this portal sees
};


The Parse method is used to set up the portal; the string it takes comes directly from the 3ds Max object properties window, as shown below:




indoor8.jpg

So when I import the mesh data, I read in the portal definition as a string and pass it to a new C3DPortal instance for it to perform its internal setup.

This approach allows me to do future portal expansions/customizations for level designers based on the parsing facility.
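For anyone curious, here is a minimal sketch of what the parsing inside Parse() could look like, assuming the user-defined property string is simply a comma- or whitespace-separated list of sector ids. The exact format typed into the 3ds Max properties window isn't shown here, so treat this as an illustration only.

#include <sstream>
#include <string>
#include <vector>

// Illustration only: the real C3DPortal stores the string and the id list in
// m_strPVSDefinition / m_iPVS; here we just show the parsing itself.
bool ParsePVSDefinition(const std::string& strPVSDefinition, std::vector<int>& outPVS)
{
    outPVS.clear();

    // Accept both "1 2 5" and "1,2,5" by replacing separators with spaces first.
    std::string cleaned = strPVSDefinition;
    for (char& c : cleaned)
        if (c == ',' || c == ';')
            c = ' ';

    std::istringstream stream(cleaned);
    int sectorId = 0;
    while (stream >> sectorId)
        outPVS.push_back(sectorId);      // one entry per room/sector the portal sees

    // Fail if nothing could be read - a portal that sees nothing is useless.
    return !outPVS.empty();
}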




A vector of ints holds the PVS (a list of IDs of the rooms the portal "sees").



This is kind of a base portal - nothing else is needed for it to work right now.

As I am now approaching the visibility tests based on portals, I came up with the idea that it would be good to calculate/store the 'portal frustum' when the portal is created.


We can do that, as we have its AABB.


So later on, when checking portal visibility, instead of cutting the frustum per frame I'd have that data precalculated at the portal creation phase. What do you think, guys?




@Krypt0n - I liked the second idea; can I ask you to help out with the details of this approach?




Thanks guys for keeping this post going.




Be back with an update later today.
perfection.is.the.key
OK, got it working as expected:

[media]
[/media]




There is a small thing not finished yet. As you might notice at the end of the video, when I enter a room the 'next room' is not rendered - this is because I am currently limiting my renderer to render only those sectors whose portals are within the sector the camera is in. You can see it happening exactly when I cross the sectors' AABBs.




This will be implemented next.
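The usual way to handle that case is to recurse: when a portal in the current sector passes the visibility test, narrow the frustum (or push onto the plane stack) and repeat the test in the sector the portal leads to. A rough sketch, with the sector/portal storage invented for the example and the two helper tests left as trivial placeholders standing in for the earlier sketches:

#include <set>
#include <vector>

// Hypothetical scene layout for this sketch: each sector knows its portals,
// and each portal knows which sector it leads into.
struct Plane     { float nx, ny, nz, d; };
struct Frustum   { std::vector<Plane> planes; };        // camera frustum, narrowed per portal
struct PortalRef { int targetSector; /* AABB, corner points, ... */ };
struct Sector    { std::vector<PortalRef> portals; };

// Placeholders standing in for the tests sketched earlier in the thread.
bool    PortalVisible(const PortalRef&, const Frustum&)    { return true; }  // AABB vs frustum
Frustum NarrowFrustum(const Frustum& fr, const PortalRef&) { return fr; }    // clamp planes to the portal

// Start from the sector the camera is in; every sector reachable through a
// chain of visible portals ends up in 'visible'.
void CollectVisibleSectors(const std::vector<Sector>& sectors, int current,
                           const Frustum& fr, std::set<int>& visible)
{
    if (!visible.insert(current).second)
        return;                                         // already visited, avoid cycles

    for (const PortalRef& portal : sectors[current].portals)
        if (PortalVisible(portal, fr))
            CollectVisibleSectors(sectors, portal.targetSector,
                                  NarrowFrustum(fr, portal), visible);
}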




Right, time for a coffee.










perfection.is.the.key
This thread is very interesting but there's something I don't get about acceleration structures in AD 2011.

From my (little) experience and (maybe poor) understanding there are two huge limiting factors when it comes down to rendering speed.

1- batch as much as you can in order to limit the number of draw calls
2- limit GPU cycles spent on a per-pixel basis

The theory is pretty obvious: by using an acceleration structure you limit your draw calls by rendering only the potentially visible (polygon) set. At the same time, when using occlusion culling, you indirectly limit the GPU cycles by reducing overdraw.

My problem being: there are different ways to solve the same problem without side effects.

Problem: you have 100 objects to render, not cloned/instanced, each made up of 2 materials, one shared and one chosen from a set of 8 different materials.

Worst case scenario: select material 1, render, select material 2, render, switch to the next object.
200 draw calls and 400 material (texture/shader) switches.

Second solution: select material 1, render each object, select material 2, render each object using that material, then switch to the next material.
200 draw calls and 9 material (texture/shader) switches.

Now, if we use an acceleration structure like a BSP or an octree, we are actually splitting objects, introducing more polygons. So if an object gets split, that implies we have two different objects, thus increasing the total object count (and draw calls). On the other hand, some acceleration structures can reduce overdraw, so this might still be a winner.

What I'm asking is: if I merged all the polygons sharing the same material and took advantage of a z-pass (or used a deferred renderer), what kind of performance would I get?

Even if I didn't create a supermerged object for the z-pass I would be able to issue:
9 draw calls to get the z-pass.
9 draw calls for the actual rendering, with 0 overdrawing (guaranteed by the z-pass).

I can submit 18 draw calls compared to 200+ and I'm 100% sure there's no overdraw at all... and as for rendering more polygons, polygon count usually isn't a big problem in 2011... or at least it's not as limiting as shading.
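For reference, the z-pass mentioned here is just a depth pre-pass; in OpenGL terms it could look roughly like this (a sketch with an invented DrawBatch() helper, not a complete renderer):

#include <GL/gl.h>   // platform GL header; may need the platform's windowing header first

// Stand-in for whatever actually issues the draw call for one material group
// (binds the shader/texture, draws the merged vertex/index range, ...).
void DrawBatch(int /*materialIndex*/) { /* issue the draw call */ }

const int kMaterialCount = 9;   // the 9 material groups from the example above

void RenderFrame()
{
    // Pass 1: lay down depth only - no color writes, cheapest possible shading.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_TRUE);
    glDepthFunc(GL_LESS);
    for (int m = 0; m < kMaterialCount; ++m)
        DrawBatch(m);            // 9 draw calls for the z-pass

    // Pass 2: full shading; only fragments matching the prelaid depth survive,
    // so expensive pixels are shaded exactly once (no overdraw).
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_FALSE);       // depth is already correct, no need to write it again
    glDepthFunc(GL_LEQUAL);      // GL_EQUAL also works if the geometry is identical
    for (int m = 0; m < kMaterialCount; ++m)
        DrawBatch(m);            // 9 draw calls for the shading pass
}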

How can any acceleration structure render something faster than that?
I was just working through my own setup and hit the place where you are at. I went through and set up a "sortkey" based on this: http://realtimecollisiondetection.net/blog/?p=86 . So now I have all my objects grouped by material as the most significant field, with the texture as the least significant. I had a setup of over 400 objects, but I was really only using 10 or so textures. So in my update phase I sort the objects, and then I do a post-sort phase where I check whether sort keys match. For example, an object moved into the scene that wasn't there in the previous frame, and it uses the same texture - now these two objects can be batch-merged, combining two draw calls into one.
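A minimal sketch of such a sort key, following the bitfield layout idea from that blog post; the field widths and names are made up for the example.

#include <cstdint>

// One 64-bit key per draw call; sorting the keys sorts the draws.
// Most significant bits = most expensive state to change (material/shader),
// least significant = cheapest (a coarse depth bucket inside a state group).
struct DrawKey
{
    uint64_t key;

    static DrawKey Make(uint32_t materialId, uint32_t textureId, uint32_t depthBucket)
    {
        DrawKey k;
        k.key = (uint64_t(materialId  & 0xFFFF) << 48) |   // material/shader: most significant
                (uint64_t(textureId   & 0xFFFF) << 32) |   // texture: next
                 uint64_t(depthBucket);                    // coarse depth for front-to-back within a state
        return k;
    }

    bool operator<(const DrawKey& other) const { return key < other.key; }

    // Two draws whose material and texture bits match can be merged or instanced.
    bool SameState(const DrawKey& other) const { return (key >> 32) == (other.key >> 32); }
};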

For static objects, I combine them on load; dynamic objects are merged on a per-frame basis.

I also use instancing for the case where the sort keys are an exact match (same material, same texture).

It's not that complicated if you draw out your steps in advance.

I also don't know whether mine is the appropriate path to take, but it cut me down from 400 draw calls to 15 or so, and my polys per batch/draw went up a large amount as well.

Just my 0.02.
Code makes the man

How can any acceleration structure render something faster than that?


You'll have to profile if the benefit of the reduced draw call count outweighs the penalty from the vertex buffer updates needed by dynamically merging your visible set of objects. For minimum overdraw you'd also have to change the order of the objects in your merged vertex buffer according to their distance.

My gut feeling is that draw calls are becoming less expensive as CPU speeds go up, while updating large vertex buffers can be costly, so there might be "hiccups" as you for example rotate your camera, and the visible set (or the distance ordering) changes.

For reducing overdraw, you can also do something like setting a threshold distance where you render the closest objects front-to-back without state-sorting, then switch to state-sorting for the rest.

Octrees can also be used without involving any splitting of the objects, this is commonly accomplished by so-called "loose octrees" where the objects can come out halfway from the octree cell they're based in.
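For what it's worth, the usual loose-octree placement rule (looseness factor 2) picks the node purely from the object's size and center, so nothing ever has to be split. A sketch, assuming the world is an axis-aligned cube starting at the origin:

#include <algorithm>
#include <cmath>

// Pick the loose-octree cell for an object, Thatcher Ulrich style (looseness = 2):
// a node at depth d has regular size worldSize / 2^d and loose size twice that,
// so an object of radius r fits at the deepest depth where the cell size is >= 2 * r.
struct CellCoord { int depth, x, y, z; };

CellCoord ChooseLooseCell(float worldSize, int maxDepth,
                          float cx, float cy, float cz, float radius)
{
    // Deepest depth whose regular cell size still covers the object's diameter.
    int depth = 0;
    if (radius > 0.0f)
        depth = int(std::floor(std::log2(worldSize / (2.0f * radius))));
    depth = std::clamp(depth, 0, maxDepth);

    // Index the cell by the object's *center* - the loose bounds guarantee containment.
    const float cellSize = worldSize / float(1 << depth);
    const int   cells    = 1 << depth;
    auto toIndex = [&](float c) { return std::clamp(int(c / cellSize), 0, cells - 1); };

    return { depth, toIndex(cx), toIndex(cy), toIndex(cz) };
}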

[quote name='undead' timestamp='1320236467' post='4879639']
How can any acceleration structure render something faster than that?


You'll have to profile if the benefit of the reduced draw call count outweighs the penalty from the vertex buffer updates needed by dynamically merging your visible set of objects. My gut feeling is that draw calls are becoming less expensive as CPU speeds go up, while updating large vertex buffers can be costly, so there might be "hiccups" as you for example rotate your camera, and the visible set changes.

For reducing overdraw, you can also do something like setting a threshold distance where you render the closest objects front-to-back without state-sorting, then switch to state-sorting for the rest.

Octrees can also be used without involving any splitting of the objects, this is commonly accomplished by so-called "loose octrees" where the objects can come out halfway from the octree cell they're based in.
[/quote]
Well, my idea is to logically divide your level into object types.

Let's consider 4 different object types:

- static, unsorted (a static indoor level)
- static, sorted (static translucent objects)
- not static, unsorted (a character)
- not static, sorted (a moveable glass)

Of course my approach is intended only for static objects that don't need sorting. In that case there's no need to update the vertex buffer, no special work to be performed when the camera moves, etc. Yes, it's brute force and inelegant, but I don't see how an octree (loose or standard) can be faster than just merging static geometry.

And even if your geometry won't fit into a single vertex buffer, you can still use multiple VBs containing geometry grouped by the same criteria (for example: materials 1 to 50 go into the first vertex buffer, materials 51 to 70 into the second, etc.).
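A rough sketch of that grouping step at load time, with the mesh data layout invented for the example: all static meshes sharing a material are concatenated into one buffer range, so each group renders with a single draw call.

#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Invented minimal mesh data for the sketch.
struct Vertex { float px, py, pz, nx, ny, nz, u, v; };
struct StaticMesh
{
    int materialId;
    std::vector<Vertex>   vertices;
    std::vector<uint32_t> indices;
};

// One merged range per material - each drawable with a single call.
struct MergedBatch
{
    int materialId;
    std::vector<Vertex>   vertices;
    std::vector<uint32_t> indices;
};

std::vector<MergedBatch> MergeStaticByMaterial(const std::vector<StaticMesh>& meshes)
{
    std::map<int, MergedBatch> byMaterial;
    for (const StaticMesh& mesh : meshes)
    {
        MergedBatch& batch = byMaterial[mesh.materialId];
        batch.materialId = mesh.materialId;

        // Re-base the indices onto the already-merged vertices before appending.
        const uint32_t base = uint32_t(batch.vertices.size());
        batch.vertices.insert(batch.vertices.end(), mesh.vertices.begin(), mesh.vertices.end());
        for (uint32_t idx : mesh.indices)
            batch.indices.push_back(base + idx);
    }

    std::vector<MergedBatch> result;
    for (auto& entry : byMaterial)
        result.push_back(std::move(entry.second));
    return result;
}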

As for transparency, unless you use order-independent transparency, a portal/octree isn't enough to accurately resolve sorting, which in theory should be performed per polygon.
BSPs can be useful in this case.

My point being, as far as I can see, the OP has a good portal prototype working on static geometry, which might be more useful for objects than for the level itself.

Maybe I am missing something... :unsure:

This topic is closed to new replies.
