3D Engine Acceleration Techniques

18 comments, last by Prozak 21 years, 1 month ago
It is not the engine design, because the limitation is on the GPU, not the CPU. That is, there is no difference between 750 MHz and 2 GHz computers if they both have the same video board (well, in fact there is a difference, but a VERY small one).
And I don't have too much overdraw either (except in some cases, when the player is near water); in general, the overdraw is around 20% or so...

Height Map Editor | Eternal Lands | Fast User Directory
quote:
2. Since display lists tend to require quite a bit of fillrate (or something like that), they're likely to decrease performance (depends on the actual situation, e.g. it may depend on the amount of memory needed to store the display list).


*Cry* Please explain how display lists could have any impact on the fill rate, since the rendered geometry is exactly the same as if you just used immediate mode (IM) or standard vertex arrays (VAs).

DLs are not likely to decrease your performance; they're likely to increase it. They can occasionally slow things down a bit, but I'd blame that on the drivers or on misuse (like trying to put 200 MB of data in display lists, or rebuilding them every frame).

Y.
Occlusion culling or inverse portals work quite well for large open areas or outdoors...
quote:Original post by Ysaneya
*Cry* Please explain how display lists could have any impact on the fill rate, since the rendered geometry is exactly the same as if you just used IM or standard VAs.

DLs are not likely to decrease your performance; they're likely to increase it


Yes, they will increase performance if they are used correctly. For example, if you're using vertex arrays, encapsulating the glDrawElements calls in display lists makes little or no sense. I don't remember all the theory exactly, since I haven't used display lists in quite a while...

You are correct that the geometry is exactly the same in display lists as in IM or VAs (assuming you use your DLs for geometry; you might just as well use them to encapsulate state changes and such), but that geometry is stored in a different way. That may affect performance in varying ways, depending on the implementation of the DLs, the information (and the amount of it) that the DL contains, and your bottlenecks. I'm not sure fillrate is the problem here; perhaps it's memory bandwidth or some other related thing... I don't remember.

Btw, the OpenGL FAQ has some good pointers regarding this issue under "Display Lists and Vertex Arrays".


Michael K.,
Designer and Graphics Programmer of "The Keepers"


We come in peace... surrender or die!
Michael K.
So you really have no idea, right? Well, I have an idea. There is no way in hell a DL could decrease fill performance. The only (far-fetched) scenario that could possibly impact framerate in the way you describe is if the DL eats up that final piece of video memory, forcing you to swap textures in and out every frame.

Putting glDrawElements in a display list will probably help, at least with good drivers, since it will transfer the vertices to more optimal memory (like AGP or video memory). Whether there is a difference between immediate mode and vertex arrays inside a DL is not as clear, but it's usually simpler to have a single way to render everything, i.e. vertex arrays. DLs are a simple way to dramatically increase geometry performance on most implementations, but might not be feasible since they're immutable.

Of course, this only matters *if you're geometry limited*, which 99% of the time you aren't. If you're interested in geometry performance I suggest you check out Ysaneya's excellent OpenGL geometry benchmark here.

Here are a few practical acceleration tips for the original poster:

* Use vertex arrays, specifically glDrawRangeElements
* Use display lists or the extensions NV_vertex_array_range or ATI_vertex_array_object.
* Make sure your triangles are roughly in triangle strip order, i.e. tris that are close in the index stream are close in the mesh. NB: You don't have to use actual strips, just send regular tris in rough strip order.
* Decrease resolution, you're probably fill limited.
* Send large batches of stuff to the graphics card. Don't tinker around with individual triangles, just render them.
* Avoid setting redundant state and avoid changing state if you can. Don't go overboard though, like sorting *triangles* by texture (see previous point).
* Don't render what you can't see. Google words: spatial partitioning, octree, quadtree, BSP, occlusion culling, dPVS manual, frustum culling.
* Avoid blending, it eats memory bandwidth. If you have to use it, enable alpha test as well and use it to discard fragments that have little impact on the framebuffer.
* Avoid using esoteric OpenGL functions that are often slow, e.g. two-sided lighting, edge flags, selection, feedback and polygon antialiasing.
* Use mip mapping.
* Finally, make it work first, then make it fast.
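
The "don't render what you can't see" tip can be illustrated with a bounding-sphere frustum test. This is only a minimal sketch, not code from anyone's engine in this thread: the `Plane`, `Sphere` and `visibleObjects` names are made up for the example, and it assumes the six frustum planes have unit-length normals pointing towards the inside of the view volume.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// A plane nx*x + ny*y + nz*z + d = 0 whose normal points into the frustum.
// Assumption: (nx, ny, nz) is unit length, so the result is a true distance.
struct Plane {
    float nx, ny, nz, d;
};

// Bounding sphere of an object (hypothetical representation).
struct Sphere {
    float x, y, z, radius;
};

// Signed distance from the sphere centre to the plane.
inline float signedDistance(const Plane& p, const Sphere& s) {
    return p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
}

// Returns false only when the sphere lies entirely behind one of the planes.
inline bool sphereInFrustum(const std::array<Plane, 6>& frustum, const Sphere& s) {
    for (const Plane& p : frustum) {
        if (signedDistance(p, s) < -s.radius) {
            return false;  // completely outside this plane: cull it
        }
    }
    return true;  // inside or intersecting: draw it (conservative test)
}

// Collects the indices of the objects that survive the test.
inline std::vector<std::size_t> visibleObjects(const std::array<Plane, 6>& frustum,
                                               const std::vector<Sphere>& bounds) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        if (sphereInFrustum(frustum, bounds[i])) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The test is deliberately conservative: a sphere that straddles a plane is kept. Culling like this reduces geometry work only, so it is not expected to change the frame rate of a scene that is fill limited.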

[edited by - GameCat on March 7, 2003 9:33:34 AM]
I'm writing a 3D terrain engine and was getting 570 fps on a GeForce4 Ti 4600 with 32768 textured polys (tri-stripped, in locked vertex arrays). I then decided to restructure my engine a little so that it was a tad speedier at loading the terrain. Lo and behold, once I was done I got 620 fps. All I did was move a few index arrays to a static class, make sure they were only set up once, and similar load-time changes that I would never have thought would help my fps. Sometimes a simple restructuring, just very simple code changes, can speed things up.
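
A rough sketch of the "set the index arrays up once" idea, in case it helps someone. The 128 x 128 quad grid, the row-by-row vertex layout and the function names are assumptions chosen only so that the triangle count matches the 32768 figure above; the actual engine layout is not known.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Strip = std::vector<std::uint32_t>;

// Builds one triangle strip per row of a (quadsX x quadsZ) terrain grid.
// Assumption: vertices are stored row by row, (quadsX + 1) per row.
std::vector<Strip> buildRowStrips(std::uint32_t quadsX, std::uint32_t quadsZ) {
    std::vector<Strip> strips;
    strips.reserve(quadsZ);
    const std::uint32_t vertsPerRow = quadsX + 1;
    for (std::uint32_t z = 0; z < quadsZ; ++z) {
        Strip strip;
        strip.reserve(2 * vertsPerRow);
        for (std::uint32_t x = 0; x < vertsPerRow; ++x) {
            strip.push_back(z * vertsPerRow + x);        // vertex on this row
            strip.push_back((z + 1) * vertsPerRow + x);  // vertex on the next row
        }
        strips.push_back(std::move(strip));
    }
    return strips;
}

// Built the first time it is requested, then reused on every later call,
// so no per-frame work goes into regenerating indices.
const std::vector<Strip>& terrainStrips() {
    static const std::vector<Strip> strips = buildRowStrips(128, 128);
    return strips;
}

// A strip of n indices describes n - 2 triangles.
std::size_t triangleCount(const std::vector<Strip>& strips) {
    std::size_t total = 0;
    for (const Strip& s : strips) {
        total += s.size() - 2;
    }
    return total;
}
```

Each row strip here has 258 indices (256 triangles), and 128 rows give 32768 triangles in total.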

-------------------
Realm Games Company
There's a load of dirty detail: the angle of the texture to the camera can affect the speed of texel retrieval, there is the overhead of re-transforming shared vertices, and so on.

Visibility tests for objects can be sped up several times by subdividing space. Once the world is split into areas you can do a dead fast "if object x is not where it was last time, search", and then for each visible area consider only its own objects.
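
To make the "is it still where it was last time" check concrete, here is a minimal uniform-grid sketch. The `Grid` class, the cell-key packing and the 2D (x, z) layout are all assumptions made for the example, not anything taken from a real engine.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

using CellKey = std::uint64_t;

struct Object {
    int id;
    float x, z;
    CellKey cell;  // cell the object was filed under last time
};

class Grid {
public:
    explicit Grid(float cellSize) : cellSize_(cellSize) {}

    // Packs the two integer cell coordinates into one 64-bit key.
    CellKey cellOf(float x, float z) const {
        const auto cx = static_cast<std::int32_t>(std::floor(x / cellSize_));
        const auto cz = static_cast<std::int32_t>(std::floor(z / cellSize_));
        return (static_cast<CellKey>(static_cast<std::uint32_t>(cx)) << 32) |
               static_cast<std::uint32_t>(cz);
    }

    void insert(Object& o) {
        o.cell = cellOf(o.x, o.z);
        cells_[o.cell].push_back(o.id);
    }

    // Call after moving an object. Returns true if it had to be re-filed.
    bool update(Object& o) {
        const CellKey now = cellOf(o.x, o.z);
        if (now == o.cell) {
            return false;  // still where it was last time: nothing to do
        }
        std::vector<int>& old = cells_[o.cell];
        old.erase(std::remove(old.begin(), old.end(), o.id), old.end());
        o.cell = now;
        cells_[now].push_back(o.id);
        return true;
    }

    // Objects filed under one cell; the caller decides which cells are visible.
    const std::vector<int>& objectsIn(CellKey cell) const {
        static const std::vector<int> empty;
        const auto it = cells_.find(cell);
        return it == cells_.end() ? empty : it->second;
    }

private:
    float cellSize_;
    std::unordered_map<CellKey, std::vector<int>> cells_;
};
```

The common case (object did not leave its cell) costs two divisions and a compare, and the renderer only has to walk the object lists of the cells it has already decided are visible.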

A* pathfinding can be sped up a lot with pooled memory, partial paths, etc. There's a great article by Dan Higgins.
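
As a rough idea of what "pooled memory" could look like for search nodes, here is a small fixed-capacity pool. It is a sketch under my own assumptions (the `PathNode` fields and the `NodePool` interface are invented here) and is not meant to reproduce the approach in the article mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical A* search node.
struct PathNode {
    int x = 0;
    int y = 0;
    float costSoFar = 0.0f;
    float estimate = 0.0f;
    PathNode* parent = nullptr;
};

// Fixed-capacity pool: nodes come out of one preallocated block and are
// all released at once, so a search never touches the general-purpose heap.
class NodePool {
public:
    explicit NodePool(std::size_t capacity) : nodes_(capacity), used_(0) {}

    // Returns nullptr when the pool is exhausted; a caller could treat that
    // as the point to stop and hand back a partial path.
    PathNode* allocate(int x, int y, PathNode* parent) {
        if (used_ == nodes_.size()) {
            return nullptr;
        }
        PathNode* n = &nodes_[used_++];
        *n = PathNode{};  // clear whatever the previous search left behind
        n->x = x;
        n->y = y;
        n->parent = parent;
        return n;
    }

    // Releases every node in one step, ready for the next search.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::vector<PathNode> nodes_;
    std::size_t used_;
};
```

Because the backing vector never grows after construction, pointers handed out during a search stay valid until `reset()` is called.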

If you can afford the memory, have plenty of variable levels of detail for models.

Tell Windows to knock your threads' priorities up a notch; you can get a big speedup if other applications are running in the background.

One of the biggest bottlenecks is memory. Someone's written an AMD MMX memcpy() which is about 3x faster than movsd. It accommodates arbitrary alignment and sizes, so you could write an even faster blit version which expects a cache-aligned start address and a length which is an integer multiple of 4 or 32 or whatever.

If you use a lot of inefficient standard library functions (especially math and string manipulation) it may be worth writing your own.

A little software rendering, e.g. for the HUD, can reduce the number of API calls (e.g. DirectX calls). Your own blit is likely to be faster when used on many small areas.

Sometimes loops and recursive functions can be eliminated, e.g. by replacing the sum of a series with its closed form.
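
A tiny illustration of the series case, using the arithmetic series 1 + 2 + ... + n as the example (my choice; the post does not say which series). The function names are made up.

```cpp
#include <cstdint>

// Loop version: n additions.
std::uint64_t sumLoop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Closed form: 1 + 2 + ... + n = n(n + 1) / 2, a constant amount of work.
// Note: n * (n + 1) wraps around for very large n, so keep n well below 2^32.
std::uint64_t sumClosed(std::uint64_t n) {
    return n * (n + 1) / 2;
}
```

Both functions return the same value for any n in the safe range; only the amount of work differs.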

Write performance-monitoring code. I have a function definition macro that counts the number of times a function is called in the debug build.
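
Something along these lines would do the counting. This is a guess at the idea rather than the macro described above: it is a statement placed at the top of a function body instead of a function-definition macro, and `COUNT_CALL`, `callCounts` and `heightAt` are invented names.

```cpp
#include <cstdio>
#include <map>
#include <string>

// One counter per function name, shared by the whole program.
inline std::map<std::string, unsigned long>& callCounts() {
    static std::map<std::string, unsigned long> counts;
    return counts;
}

#ifndef NDEBUG
// Debug build: bump the counter for the enclosing function.
#define COUNT_CALL() (++callCounts()[__func__])
#else
// Release build: compiles away to nothing.
#define COUNT_CALL() ((void)0)
#endif

// Example of an instrumented function (placeholder body).
float heightAt(int x, int z) {
    COUNT_CALL();
    return 0.25f * static_cast<float>(x + z);
}

// Prints every counter, e.g. at shutdown.
void dumpCallCounts() {
    for (const auto& entry : callCounts()) {
        std::printf("%s: %lu\n", entry.first.c_str(), entry.second);
    }
}
```

The map lookup is slow compared with a plain static counter, which is acceptable for a debug-only tool but would distort timings if left in a profiled build.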

Critically examine your data structures for cache-thrashing.

my brain hurts...

********


A Problem Worthy of Attack
Proves Its Worth by Fighting Back
spraff.net: don't laugh, I'm still just starting...
quote:Original post by Raduprv
From what I noticed, the slower things in OpenGL are alpha testing and [alpha] blending. So try to avoid them as much as possible.
I once implemented frustum culling in my engine, and got a drastic 0% increase in the frame rate (even though the polygon count decreased by 60%) :D


Raduprv, you can't change the refresh rate of your monitor (actually, you can adjust it in the BIOS menu); it's always going to be 60 FPS, 75, or maybe 80 FPS... Sound familiar? If you notice your frame rate stuck at 60 FPS no matter what, then you have VSYNC enabled. This has been talked about a lot in the forums, especially when public testers report "60 FPS." That's a false figure, but they don't know any better when they report it. No, I don't know how to disable it so that I could see the actual engine performance, but many people here do. More than likely, your frustum culling eased the strain on your CPU and GPU, but you just couldn't notice because of the mechanics of VSYNC, which I can't explain to you at the GPU pipeline level (or maybe driver level?), because I don't know anything about it. But, yep, that's probably your problem.
Keep coming back, because it's worth it, if you work it, so work it, you're worth it!
63616C68h, do you really think I am so stupid as to leave VSync on while testing the performance of my engine (or any other engine, for that matter)?

Height Map Editor | Eternal Lands | Fast User Directory
quote:you can't change the refresh rate of your monitor


What does the monitor refresh rate setting in Windows do, then?

This topic is closed to new replies.
