Archived

This topic is now archived and is closed to further replies.

Rendering indoor level (using textures, BSP, PVS, lightmaps, ...)


Hi guys & coders! My first post here. Sorry, I suppose this has been answered many times before, but it's urgent for me and the "Search Forums" feature (which rules the world, by the way) is disabled... I've got a serious one (problem, I mean). I'm working on a mostly indoor/city 3D engine. It's my first 3D work, but I have 8 years of experience programming games. So far I've implemented Unreal .T3D importing, textures, lightmaps, and solid leafy BSPs with collision detection, and it looks quite cool. I'm going to use the BSP tree to calculate the PVS information.

But here's my problem: I've got no idea how to render the scene considering all the leaves, textures, lightmaps, and PVS. The options I've considered:

- One static vertex buffer + one index buffer: possible, but bloody slow I suppose, because it would mean rendering no more than, say, 1-5 polys per DrawIndexedPrimitive call... but after all it seems the best option to me.
- One static vertex buffer + one index buffer per leaf: doesn't solve the problem with textures.
- Dynamic index buffer: possible, but should I send all 40,000 vertices to the card every frame when only a few hundred are rendered? I have to change texture states sometimes, so send 40,000 vertices a hundred times? Not good, I guess. And the vertices cannot be sorted smartly, because I have to consider both leaves and textures.
- Locking the vertex buffer: I'd rather avoid this; NVIDIA doesn't recommend it. Plus, this solves the problem only with both a dynamic index and a dynamic vertex buffer.
- A vertex buffer for every texture? An index buffer for every texture? Vertex and index buffers for every leaf?

Nope, I cannot find any smart way. Please! I'm sure many of you have been working on a 3D engine similar to mine. I appreciate any advice! Thank you very much.

BTW: Does anybody of you know how Carmack renders his Q3 scenes? My engine, although it loads UT files, is more similar to Q3's (well, that doesn't mean I wanna compare them).

Ah, yes... I remember being in this position when I was coding my first serious 3D engine...

I generated the PVS from the BSP tree data and, after checking some debug stats, noticed that I wasn't actually rendering that many polys per frame. For example, a typical level contained approximately 25-30 thousand polys - not a massive number. But when using the PVS, at any given time I was only required to render between, say, 100 and 5000 polys per frame. This is after trivial rejection, culling, etc. Of course, the number of polys visible from a given camera position is also dependent upon the physical structure of the level...

So, to get up and running quickly, I created a dynamic vertex buffer capable of holding a couple of thousand polys. I then simply filled it each frame with the visible geometry (via the PVS) and flushed it as and when necessary. I also threw in some simple optimisations, such as sorting by material/texture and renderstates, etc.

This may or may not be of much use, but it provided a good enough 'engine' whilst I was learning, experimenting and prototyping various 3D-related stuff. Obviously it might not be the best or fastest way to go, but it's a good start.

Good luck!
Regards,
Sharky

I agree with Sharky. Don't be afraid to use dynamic vertex buffers (i.e. locking them every frame, even several times). They can be very fast, provided you use the correct flags to create and lock them.

As for how I set up my rendering: up until now I have had a wrapper for D3D and OpenGL. Each entity has access to this wrapper and calls its DrawPrimitive() method as required. So, say, each model is rendered separately and in turn.

However, I have recently been working with pixel/vertex shaders, per-pixel lighting/bump mapping and stencil shadows. For correct and robust scene rendering in this case, I really need to draw the entire scene in distinct passes: e.g. I first need to render the entire scene with ambient light only, then shadows and diffuse lighting, then specular... This requires me to have access to all visible primitives in one location in my code, i.e. I can't just render each model, mesh or BSP leaf separately.

What I'm thinking of doing is this: have a 'Collector' class. Pass this class through the hierarchy of entities in the game world. Each entity adds whatever it wants to this Collector (if visible), which can be meshes, triangle strips or triangle lists. The Collector could then possibly reorder these primitives by texture and rendering technique. Finally, the core rendering method will take this filled Collector (which should now contain everything visible this frame), fill a dynamic vertex buffer with the triangle information, and do the required rendering passes as necessary.

I still have to experiment with this. If optimised, I think it would work well.

Sorry if that doesn't answer your question at all!

quote:
Original post by ALBI
I got a serious one (problem I mean).. Working on a mostly indoor/city 3D engine - it's my first 3D work (but I got 8 years experience programming games). I've so far implemented Unreal .T3D importing, textures, lightmaps, solid leafy BSPs with collision detection and it looks quite cool. I'm going to use the BSP tree to calculate the PVS information. But what's my problem. I've got no idea how to render the scene considering all leaves, textures, lightmaps and PVS.

1 static vertex + 1 index buffer - possible but bloody slow i suppose, because it would mean to render not more than say 1-5 polys per DrawIP call... but afterall seems the best for me...

1 static vertex + 1 index buffer per leaf - doesn''t solve problem with textures



What you should do is not draw right away, but send all polygons to your own draw routine, which builds a list in memory. You can sort by texture, and then, when parsing of the BSP is finished, pump the triangles out into vertex buffers, all triangles with the same texture in one go.

quote:

dynamic index buffer - possible but should i send all the 40000 vertices to the card every frame when only a few hundred are rendered? I have to change texture states sometimes so send 40000 vertices hundred times? Not good i guess. And the vertices cannot be sorted smartly because i have to consider both leaves and textures...



That's probably not the way, no. You can build vertex buffers, one per texture. The good thing is that if the world is divided into cells (areas - a convex space, in BSP terms), you only need to rebuild the buffers when the camera moves into another cell.

quote:

locking vertex buffer - I'd rather avoid this, NVIDIA doesn't recommend it... plus this solves the prob only with both dynamic index and vertex buffers.

vertex buffer for every texture? index buffer for every texture? Vertex and index buffers for every leaf? Nope, I cannot find any smart way. Please! I'm sure many of you have been working on a 3D engine similar to mine. I appreciate any advice! Thank you very much.


BTW: Does anybody of you know how Carmack renders his Q3 scene? Well my engine, although it loads UT files, is more similar to Q3's one (well, that doesn't mean I wanna compare them).


When studying how Quake/Q2/Q3 work, you must remember that Quake 1 used the BSP tree to get all polygons in back-to-front sorted order. Quake 2 had a software renderer as well, next to OpenGL, so it worked this way too. Many techniques in Q1 are now obsolete, especially the clipping in screen space.

With modern hardware the z-buffer takes care of the order, and you should not use the BSP this way. The traditional BSP divides a scene down to the individual polygon; Q3 only divides it into areas - convex spaces with polygons as walls. Q3 doesn't bother with Z ordering; it just sends all the polygons down to the drawing batcher.

Q3 is more portal-based: every BSP node defines a bounding box, and every leaf defines a bounding box and an area. It uses the bounding boxes to cull large parts of the tree when they fall outside the frustum, after the PVS processing. The PVS is a bit list as big as the total number of areas. Q3 doesn't really use the BSP to draw the scene anymore; it's used for collision detection and to determine which area you are in, and then it just draws the visible areas.

Sharky, Poia & Fidelio_: Once again, thanks a lot for your answers. It's much clearer for me now, and I finally managed to render my level the way I wanted (that means rendering it like before, when I had no BSP tree, but now being able to render only the BSP leaves I want).

It's quite slow, but I suppose that's because I'm not using an index buffer, and because of the drastic increase in polygon count (more than 2x) due to BSP polygon splitting. On my GeForce4 MX-440 at 800x600x32 I got 150 fps rendering the level brute-force (with static indices and vertices and without BSP splits), and now I get 20 fps using a dynamic vertex buffer. When I don't fill my dynamic vertex buffer every frame (and do it only on startup), the fps is still only about 40. But this is the full scene, no optimisations. After I implement PVS and frustum culling I'm hoping it will be about 200 fps again.

If you're interested, here are a few screenshots of the imported Tempest map from Unreal Tournament (they're 3 weeks old; it looks the same now but, as I said, is much slower). They have no extreme effects yet, but I like 'em.

Shot 1
Shot 2
Shot 3

ALBI

[edited by - ALBI on September 14, 2002 7:22:30 AM]

quote:
Original post by ALBI
Sharky, Poia & Fidelio_: Once again, thanks a lot for your answers. It's much clearer for me now, and I finally managed to render my level the way I wanted (that means rendering it like before, when I had no BSP tree, but now being able to render only the BSP leaves I want).

It's quite slow, but I suppose that's because I'm not using an index buffer, and because of the drastic increase in polygon count (more than 2x) due to BSP polygon splitting. On my GeForce4 MX-440 at 800x600x32 I got 150 fps rendering the level brute-force (with static indices and vertices and without BSP splits), and now I get 20 fps using a dynamic vertex buffer. When I don't fill my dynamic vertex buffer every frame (and do it only on startup), the fps is still only about 40. But this is the full scene, no optimisations. After I implement PVS and frustum culling I'm hoping it will be about 200 fps again.

If you're interested, here are a few screenshots of the imported Tempest map from Unreal Tournament (they're 3 weeks old; it looks the same now but, as I said, is much slower). They have no extreme effects yet, but I like 'em.



Check out the following URL:
http://www.gamedesign.net/archive/old.gamedesign.net/quake2/tutorials/dewan_hint/

It says:
Unlike qbsp3 which breaks up polygons to fit into polygon nodes, then tries to merge them again later, q3map for Quake 3 Arena leaves polygons unbroken, and makes no attempt to merge later. The ultimate effect is roughly the same. The bsp tree itself is still built as normal.

The BSP process also gives axial planes a high priority, and determines which candidate planes would split the least number of polygons. Now that polygon order isn't important anymore for Q3 (the whole node gets sent to the drawing batcher unsorted), it's also no longer important to split polygons. I believe that if you don't split, the results will visually be the same, but you will be drawing a lot less.

I experimented with this a few months ago and found the following:

On software T&L cards, use a dynamic VB and IB, both sorted by texture/whatever else.

On hardware T&L cards, use a static VB and a dynamic IB, with the IB sorted by texture. The VB doesn't need to be sorted.

I found these to be the best setups for me personally, but of course it may vary in different situations.

Using a dynamic IB and a static VB with software T&L means that basically the entire VB gets processed in order to render only a few polys (everything between MinIndex and MaxIndex, or whatever they're called, in the DrawIndexedPrimitive call).
When using both dynamic, you can sort both the VB and IB by texture and use one DrawIndexedPrimitive call for each texture "block", which works out to be one of the fastest ways to do this sort of thing.

If you come up with a better solution, I'd be interested to hear it.
Good luck,
Toby

Gobsmacked - by Toby Murray

quote:
Original post by Fidelio_
Check out the following URL:
http://www.gamedesign.net/archive/old.gamedesign.net/quake2/tutorials/dewan_hint/

I believe that if you don't split, the results will visually be the same, but you will be drawing a lot less.



The info on this page is really interesting; thanks for the link. As for not splitting polys... this is how I did it in my old (solid-node-based) BSP tree, but in leafy BSPs you at least have to split them during the building process to be able to compute the leaves' bounding boxes for cheap frustum culling. But it's true that you don't necessarily have to use the split polys for rendering, and thus you can save plenty of polys. Good tip; I'll try to add this feature to the smart-cheap-tiny-and-sexy "fillbuffers" routine which is forming in my head (well, I got one in my code too, but it is far, far away from sexy).


quote:
Original post by tobymurray
I experimented with this a few months ago and found the following:

On software T&L cards, use a dynamic VB and IB, both sorted by texture/whatever else.

On hardware T&L cards, use a static VB and a dynamic IB, with the IB sorted by texture. The VB doesn't need to be sorted.

I found these to be the best setups for me personally, but of course it may vary in different situations.

Using a dynamic IB and a static VB with software T&L means that basically the entire VB gets processed in order to render only a few polys (everything between MinIndex and MaxIndex, or whatever they're called, in the DrawIndexedPrimitive call).
When using both dynamic, you can sort both the VB and IB by texture and use one DrawIndexedPrimitive call for each texture "block", which works out to be one of the fastest ways to do this sort of thing.

If you come up with a better solution, I'd be interested to hear it.



Thanks a lot for the interesting info. Currently I'm using a dynamic vertex buffer ONLY. I don't suppose it's a good way, especially for older cards, but in my opinion it might be better than a static VB and dynamic IB (on any card). I guess the slowest thing on hardware T&L cards is locking the buffers. But of course I have no proof of that.
