Performance problems at ~1,000,000 triangles. Am I doing something wrong?

28 comments, last by SimonForsman 10 years, 3 months ago

I saw the original few posts but nothing further. Try batching them if those cubes never change. You shouldn't need to set the dirt alpha values etc. You should put those all in one giant 3D model and render it. You can also divide it up into, say, a 4x4 grid, where you have 16 smaller parts of a bigger model and only draw the ones that you see.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

For the OP:

I assume you are attempting something like shown in this tutorial: http://en.wikibooks.org/wiki/OpenGL_Programming/Glescraft_1

If so, go through that and the following pages and see how they handle rendering. Notice that they build an entire chunk (16x16x16 or whatever) of cubes all at once and put them in a VBO, and draw each chunk with one draw call. You can optimize even more (and reduce z-fighting and artifacts when you get to doing lighting) by not building faces that are adjacent to opaque blocks. Only rebuild chunks when something has changed.
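As a rough illustration of the chunk approach described above, here is a minimal CPU-side sketch in C++. It is not the tutorial's code: `Chunk`, `Vertex`, `kFaces` and `buildChunkMesh` are hypothetical names, the 16x16x16 size is an assumption, and blocks outside the chunk are treated as air so border faces are always kept. It only builds the vertex list and skips faces that touch an opaque neighbour.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int kChunkSize = 16;  // assumed chunk edge length

struct Vertex { float x, y, z; };

// Hypothetical chunk storage: 0 = air, anything else = opaque block.
struct Chunk {
    std::array<std::uint8_t, kChunkSize * kChunkSize * kChunkSize> blocks{};

    static bool inside(int x, int y, int z) {
        return x >= 0 && y >= 0 && z >= 0 &&
               x < kChunkSize && y < kChunkSize && z < kChunkSize;
    }
    // Positions outside the chunk count as air (assumption).
    std::uint8_t at(int x, int y, int z) const {
        if (!inside(x, y, z)) return 0;
        return blocks[(z * kChunkSize + y) * kChunkSize + x];
    }
    void set(int x, int y, int z, std::uint8_t value) {
        assert(inside(x, y, z));
        blocks[(z * kChunkSize + y) * kChunkSize + x] = value;
    }
};

// One entry per cube face: neighbour direction plus the four corners
// of that face on the unit cube, counter-clockwise seen from outside.
struct Face { int dx, dy, dz; int corners[4][3]; };

constexpr Face kFaces[6] = {
    {+1, 0, 0, {{1, 0, 0}, {1, 1, 0}, {1, 1, 1}, {1, 0, 1}}},
    {-1, 0, 0, {{0, 0, 0}, {0, 0, 1}, {0, 1, 1}, {0, 1, 0}}},
    {0, +1, 0, {{0, 1, 0}, {0, 1, 1}, {1, 1, 1}, {1, 1, 0}}},
    {0, -1, 0, {{0, 0, 0}, {1, 0, 0}, {1, 0, 1}, {0, 0, 1}}},
    {0, 0, +1, {{0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1}}},
    {0, 0, -1, {{0, 0, 0}, {0, 1, 0}, {1, 1, 0}, {1, 0, 0}}},
};

// Builds one triangle list for the whole chunk. A face is emitted only
// when the neighbouring cell in that direction is air.
std::vector<Vertex> buildChunkMesh(const Chunk& chunk) {
    std::vector<Vertex> out;
    constexpr int kTriangleCorners[6] = {0, 1, 2, 0, 2, 3};  // quad -> 2 triangles
    for (int z = 0; z < kChunkSize; ++z)
        for (int y = 0; y < kChunkSize; ++y)
            for (int x = 0; x < kChunkSize; ++x) {
                if (chunk.at(x, y, z) == 0) continue;  // air: nothing to draw
                for (const Face& f : kFaces) {
                    if (chunk.at(x + f.dx, y + f.dy, z + f.dz) != 0) continue;  // hidden face
                    for (int i : kTriangleCorners)
                        out.push_back({static_cast<float>(x + f.corners[i][0]),
                                       static_cast<float>(y + f.corners[i][1]),
                                       static_cast<float>(z + f.corners[i][2])});
                }
            }
    return out;
}
```

The resulting vector would typically be uploaded to the chunk's VBO once (for example with `glBufferData`) and drawn with a single `glDrawArrays(GL_TRIANGLES, ...)` call, then rebuilt only when a block in that chunk changes.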

Doing what is shown in that tutorial, you can render a very large amount of cubes without any problem. Unless I'm misunderstanding something, I don't think instancing is going to help at this point.

@Laztrezort: Thanks for the page, that's an interesting find. I hadn't considered giving each chunk its own VBO. I'll go through the tutorials to see how it works. I believe I can only set one texture per VBO. That shouldn't be a problem, since I can merge the possible textures into one PNG. However, a single VBO means every cube in a chunk would share the same material. I'll have to think about the ramifications of that.

I planned to not render cubes that cannot be seen, but I thought I should ask since a couple million triangles shouldn't be a problem for today's graphics cards.

@dpadam450: is there a particular batching technique you recommend I look into?

Merging textures into one (or as few as possible) larger textures (usually called a "texture atlas") is a standard technique, precisely so the number of draw calls can be reduced. You just need to give each vertex an appropriate texture coordinate. Same thing can be done with other material data - the more you can reduce the number of calls, the better performance you will see. This is the most important optimization you can probably make at this point.

I believe dpadam450 was referring to the same thing with "batching." This is basically just putting as many vertices in a VBO as you can, to reduce the amount of state changes and draw calls. In fact, a good amount of rendering optimization is based around sorting data intelligently into batches.

A million triangles is nothing for any modern hardware - as long as you are not making thousands of separate draw calls. You will find that the difference in rendering speed between batching and not batching is huge.

I just moved each chunk into its own vertex buffer. It's running at 60fps in debug mode. I haven't even begun to remove blocks that cannot be seen. This means I'll have to add textures earlier than I thought.

Thank you for your help.

Regarding textures, the simplest thing you can do (it sounds like you are doing cubed terrain like Minecraft) is to take a square texture and divide it into an even grid of 2x2, 4x4, etc. sections. If one cube maps to dirt, put that in the top-left section. When it comes to texture coordinates, every tile is the same size as one grid division, so you can simply offset the coordinates by some amount on x and some amount on y to map them to the proper square in your texture grid.
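The offset arithmetic above can be sketched as follows. This is only an illustration: `UvRect` and `atlasTileUv` are hypothetical names, tiles are assumed to be numbered row by row starting at the section with UV (0, 0), and whether that section ends up at the top-left or bottom-left of the image depends on how the texture is loaded.

```cpp
#include <cassert>

// UV rectangle of one tile inside the atlas, in the 0..1 range.
struct UvRect { float u0, v0, u1, v1; };

// tilesPerSide: the atlas is a tilesPerSide x tilesPerSide grid (2, 4, ...).
// tileIndex: row-major tile number, 0 being the tile at UV (0, 0).
UvRect atlasTileUv(int tileIndex, int tilesPerSide) {
    assert(tilesPerSide > 0);
    assert(tileIndex >= 0 && tileIndex < tilesPerSide * tilesPerSide);
    const float tileSize = 1.0f / static_cast<float>(tilesPerSide);
    const int column = tileIndex % tilesPerSide;
    const int row = tileIndex / tilesPerSide;
    const float u0 = static_cast<float>(column) * tileSize;  // offset on x
    const float v0 = static_cast<float>(row) * tileSize;     // offset on y
    return {u0, v0, u0 + tileSize, v0 + tileSize};
}
```

Each face of a cube would then use the four corners of the returned rectangle as its texture coordinates, so every block type in a chunk can be drawn from the same texture.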

Also note that 1 million primitives is a large amount. A few days ago I asked how many triangles per second today's GPUs can process, and no one gave me an answer there. Maybe I can now treat your result of 40 M triangles as a possible answer (?)

Actually, on modern high-end GPUs you can render billions of triangles.

“There are thousands and thousands of people out there leading lives of quiet, screaming desperation, where they work long, hard hours at jobs they hate to enable them to buy things they don't need to impress people they don't like.” - Nigel Marsh

Could you show (or have you seen) some proof that actual on-screen rendering runs at this speed? Billions of triangles means X * billions of bytes per second (I do not know exactly what X is, but I think that comes to up to about 100 billion bytes per second).

I have not tested this on my GPU yet. Next month or so I will try to measure it.

I haven't done much graphics programming myself, nor do I know if this is worth pointing out, but a section of your code has O(n^3) complexity, which is not very efficient. Phong shading is also expensive to compute compared to methods like Gouraud shading or the Blinn-Phong model (unless you alter the light source direction).

As I said, I have no idea if this is actually relevant to how you would implement what you want, but I felt you may be interested if you didn't already know.

The GeForce GTX 680 has a memory bandwidth of 192 GB/s (192 billion bytes per second). If you use geometry shaders, you can represent a fixed-size cube (12 triangles) with a single point (12 bytes, so 1 byte per triangle). You are likely to become limited by the shader units or the fill rate (around 30 billion pixels per second) before you hit the memory bandwidth limit, unless you use very high-resolution textures.
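The bandwidth arithmetic above can be written out as a small back-of-the-envelope calculation. The 192 GB/s and 12-byte figures come from the post; the helper name `trianglesPerSecondBound` and the 36-byte unindexed-triangle comparison are assumptions added for illustration. The result is only a ceiling on vertex data traffic and ignores caches, shader throughput and fill rate.

```cpp
#include <cstdint>

// Figures quoted in the post above.
constexpr std::uint64_t kBandwidthBytesPerSec = 192'000'000'000ULL;  // 192 GB/s
constexpr std::uint64_t kBytesPerPoint = 12;     // one position: 3 floats
constexpr std::uint64_t kTrianglesPerCube = 12;  // 6 faces x 2 triangles

// Upper bound on triangles per second if reading the input primitives
// from memory were the only cost (hypothetical helper).
constexpr std::uint64_t trianglesPerSecondBound(std::uint64_t bandwidthBytesPerSec,
                                                std::uint64_t bytesPerPrimitive,
                                                std::uint64_t trianglesPerPrimitive) {
    return bandwidthBytesPerSec / bytesPerPrimitive * trianglesPerPrimitive;
}

// Point-per-cube layout: 12 bytes expand to 12 triangles, i.e. 1 byte each.
constexpr std::uint64_t kPointCubeBound =
    trianglesPerSecondBound(kBandwidthBytesPerSec, kBytesPerPoint, kTrianglesPerCube);

// For comparison (assumption): unindexed triangles with three 12-byte
// positions each, i.e. 36 bytes per triangle.
constexpr std::uint64_t kPlainTriangleBound =
    trianglesPerSecondBound(kBandwidthBytesPerSec, 36, 1);
```

Under these assumptions the point-per-cube layout allows at most 192 billion triangles per second of input traffic, versus roughly 5.3 billion for plain 36-byte triangles, which is why the other limits mentioned in the post are expected to be hit first.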

I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

This topic is closed to new replies.