Voxel Rendering (OpenGL) +1 for help


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

15 replies to this topic

#1 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 11:33 AM

Hello thar. I have written a voxel engine in Java (using OpenGL) which is pretty good. There is one downside: the FPS. The engine only tries to render blocks within your view distance, and only the blocks that you can see. It doesn't even render the faces you don't see. The blocks are rendered just by using active rendering. This keeps a steady 30 FPS with a view distance of a mere 30 blocks in front of the player. I tried to speed the rendering up with vertex buffer objects, but the game used way too much RAM (2.9 GB to be precise). I then made only 1 vertex buffer object and just translated it for each cube. This brought the RAM back to normal levels, but the frame rate dropped to around 4 - 9 FPS... I then switched to display lists and made just one display list that would be translated for each cube to keep RAM to a minimum. Again, 6 - 11 FPS.
Do you know of any way to render lots of cubes with minimal RAM and high FPS? I know Minecraft does it relatively well, and that uses OpenGL.


#2 larspensjo   Members   -  Reputation: 1529

Posted 19 August 2012 - 11:57 AM

How many faces (triangles) do you show?

The following is a world at 160 blocks viewing distance. With a low-budget graphics card, it renders at about 30 FPS with 250,000 triangles and 500 draw calls in full widescreen mode. It is based on VBOs.

You say that you use far too much memory. How much data do you have per vertex? I use the following per vertex:
[source lang="cpp"]
#include <glm/glm.hpp>

struct VertexDataf {
    glm::vec3 fNormal;
    glm::vec2 fTexture;
    glm::vec3 fVertex;
    float fIntensity; // Precomputed light intensity of this vertex
    float fAmbient;
};
[/source]
I use back-face culling to skip faces I can't see.
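
Since the engine in question is written in Java, here is a minimal sketch of how the same per-vertex layout could be packed into one interleaved buffer. It assumes the struct above is tightly packed (10 floats, 40 bytes); the class and method names (`VertexLayout`, `allocate`, `putVertex`) are hypothetical and not taken from either poster's code. Only `java.nio` is used, so nothing here depends on a particular OpenGL binding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical Java counterpart of VertexDataf: 10 floats per vertex, interleaved. */
final class VertexLayout {
    // normal (3) + texture (2) + position (3) + intensity (1) + ambient (1)
    static final int FLOATS_PER_VERTEX = 3 + 2 + 3 + 1 + 1;
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * Float.BYTES; // 40

    // Byte offsets of each attribute inside one vertex, in declaration order.
    static final int NORMAL_OFFSET = 0;
    static final int TEXTURE_OFFSET = 3 * Float.BYTES;   // 12
    static final int POSITION_OFFSET = 5 * Float.BYTES;  // 20
    static final int INTENSITY_OFFSET = 8 * Float.BYTES; // 32
    static final int AMBIENT_OFFSET = 9 * Float.BYTES;   // 36

    /** Allocates a direct, native-order buffer large enough for vertexCount vertices. */
    static FloatBuffer allocate(int vertexCount) {
        return ByteBuffer.allocateDirect(vertexCount * STRIDE_BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Appends one vertex in the interleaved order listed above. */
    static void putVertex(FloatBuffer out,
                          float nx, float ny, float nz,
                          float u, float v,
                          float x, float y, float z,
                          float intensity, float ambient) {
        out.put(nx).put(ny).put(nz);
        out.put(u).put(v);
        out.put(x).put(y).put(z);
        out.put(intensity).put(ambient);
    }
}
```

The stride and offset constants are the values that would be handed to the attribute-pointer calls of whatever OpenGL binding is in use, so that one buffer can feed all five attributes.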

Using only one vertex buffer object for one block means you need to change the transformation matrix for every block. This is probably very expensive, with one draw call for every block. Please describe the data structures you used for the VBOs, and maybe we can identify why they take too much RAM.

blocks.jpeg
Current project: Ephenation.
Sharing OpenGL experiences: http://ephenationopengl.blogspot.com/

#3 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 12:24 PM

Well, I changed it so that only the voxels that are active have their VBOs initialized, and that brought the RAM usage down to a gig or so.
The voxels are drawn using quads, and each VBO holds 72 floats for the vertex positions of all six faces. Even though a cube has only 8 vertices, I have to specify them per face, so it's basically 3 (spatial coordinates) * 4 (vertices per face) * 6 (faces) = 72. I'm not sure if there is a way round that or not. I'm also not sure why the static VBO was costing so much FPS.

At the moment I'm only using 1 chunk. The chunk is 128*128*128, but only a tenth of that is active because the terrain generation is at a low level.
Is there a way to have only 8 vertices in a VBO and draw cubes from that?

Edited by CryoGenesis, 19 August 2012 - 12:25 PM.


#4 nox_pp   Members   -  Reputation: 490

Posted 19 August 2012 - 02:18 PM

CryoGenesis wrote: "I then made only 1 Vertex buffer object and just translated it for each cube."


Are you saying that you have a draw call for each cube? If possible, batch your cubes into as few draw calls as possible. If they all use the same shader, then you would only need to batch on differing textures... but you could batch even harder than that by using a 3D texture and selecting an appropriate texture from within it based upon some parameter. You lose the ability to specify things like your model matrix in a standard uniform, but you could potentially just index into a uniform buffer object... otherwise just repeat that data on each vertex.

At that point, you're basically replicating instanced rendering...which may be the natural next step.

Between Scylla and Charybdis: First Look <-- The game I'm working on

 

Object-Oriented Programming Sucks <-- The kind of thing I say


#5 larspensjo   Members   -  Reputation: 1529

Posted 19 August 2012 - 02:23 PM

CryoGenesis wrote: "At the moment I'm only using 1 chunk. The chunk is 128*128*128 but only a 10th of that is active due to the terrain generation being at a low level. Is there a way to only have 8 vertexes in a VBO and draw cubes from that?"

That would sure save a lot of memory, but I don't know how to make it efficient. Maybe by using a geometry shader to generate vertices. Having one draw call for every block, with an updated model matrix every time, is not feasible.
I am using chunks of 32x32x32 blocks, but I can't say that 128 is wrong. In my case, network transfer speed is important, and chunks that are too big would be inefficient. When drawing at 160 blocks viewing distance, I need at most about 130 chunks. That makes 32x32x32x130 = roughly 4 million possible cubes.
A disadvantage of chunks that are too big is that the chance increases that some part of a chunk is visible, forcing you to draw all of it. The disadvantage of chunks that are too small is that there will be very many draw operations, which can become inefficient. Maybe 128 is a little too big? I think 32 is a good compromise, still leading to lists of triangles long enough to be efficient.
There is one way to save vertex data space. If your chunks are 128 in width, you can represent cube addresses in a byte (instead of a 32-bit float). That would reduce the memory by a factor of 4. You then need to have the shader translate the byte number to a world position coordinate (which is just what it does anyway).
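
The byte-encoding idea can be sketched in Java as follows. This is only an illustration under stated assumptions: the names `BytePositions`, `pack` and `toWorld` are made up for this example, and `toWorld` merely mimics on the CPU what the shader would do with a chunk origin passed in separately. Corner coordinates of a 128-wide chunk run from 0 to 128, which fits in an unsigned byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: store chunk-local vertex positions as unsigned bytes instead of floats. */
final class BytePositions {
    static final int CHUNK_SIZE = 128; // corner coordinates run 0..128

    /** Packs x, y, z components (each 0..255) into one byte per component. */
    static ByteBuffer pack(int[] xyz) {
        ByteBuffer out = ByteBuffer.allocateDirect(xyz.length).order(ByteOrder.nativeOrder());
        for (int c : xyz) {
            if (c < 0 || c > 255) {
                throw new IllegalArgumentException("coordinate out of byte range: " + c);
            }
            out.put((byte) c); // values above 127 wrap to negative in Java; read back with & 0xFF
        }
        out.flip();
        return out;
    }

    /** CPU stand-in for the shader step: chunk origin plus the unsigned local coordinate. */
    static float toWorld(byte local, float chunkOrigin) {
        return chunkOrigin + (local & 0xFF);
    }
}
```

Three bytes per position instead of three 4-byte floats gives the factor of 4 mentioned above. In OpenGL the attribute would then be declared with an unsigned byte type (for example `GL_UNSIGNED_BYTE` in `glVertexAttribPointer`); padding each position to 4 bytes may be worth considering for alignment, which is an assumption about typical hardware rather than something stated in this thread.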

#6 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 02:38 PM


Hey, thanks. The world size is 128 just because of engine testing. I'm not looking to make the game multiplayer either.
I like the idea of having the blocks only hold a byte.
Each block holds 96 bits for its position, so only having 24 bits would be a hell of a good idea.
The blocks also hold 3 integers with their voxel positions. I could just scrap those and use the byte position instead.
Although this brings down the memory usage, I expect that the VBOs would bring it straight back up again. Each block holds a VBO with a ridiculous number of floats. I tried converting it to display lists, but that didn't increase the FPS. It decreased it instead.
I might have a look at the Minecraft source code and see how that voxel engine draws the cubes.
Still, thanks for the idea!

#7 larspensjo   Members   -  Reputation: 1529

Posted 19 August 2012 - 03:31 PM

CryoGenesis wrote: "Each block holds a VBO with a ridiculous amount of floats."

Actually, the VBOs can be encoded as byte integers also. I know it works, because that is the way I did it originally.

I had to go for floats eventually, because my terrain is sent through a smoothing filter (see same scene with smoothing below).
blocks1.jpeg

#8 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 03:52 PM


I turned the positions etc. to bytes. It only saved around 50k of memory, in exchange for some bugs. I changed it back to floats just so the code is clean.
How did you manage to render that much with VBOs?

#9 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 04:36 PM



Oh and also good job on that. It looks really good.

#10 powly k   Members   -  Reputation: 653

Posted 19 August 2012 - 04:46 PM

Wait, each block holds a VBO, as in you have a VBO for each cube? You should have as few VBOs as you can. In your case, that could pretty much be one: you upload your data once (or whenever it changes) and render using only a single glDrawArrays (or similar) call per frame. Or divide the world into smaller blocks of multiple cubes (which you might be doing; your description wasn't too clear about that) and make one call for every N³ block of cubes, still not one for every cube. You might also want to draw with index lists, though that might not be a good idea if you want per-face normal information, which is not absolutely necessary but makes lighting a bit easier.
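
As a rough illustration of the "one buffer per chunk" approach, the sketch below walks a chunk once and collects the quad vertices of every face that borders air into a single array. It is a simplified assumption-laden example, not anyone's actual engine code: `ChunkMesher` and `buildQuads` are hypothetical names, the chunk is a plain `boolean[][][]` of solid flags, only positions are emitted, and faces on the chunk boundary are always treated as exposed.

```java
import java.util.Arrays;

/** Sketch: build one vertex array per chunk, emitting only faces that border air. */
final class ChunkMesher {
    // Offsets to the six neighbours: +x, -x, +y, -y, +z, -z.
    private static final int[][] DIRS = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}
    };
    // Four corners per face (same order as DIRS), counter-clockwise seen from outside.
    private static final int[][][] FACE_CORNERS = {
        {{1, 0, 0}, {1, 1, 0}, {1, 1, 1}, {1, 0, 1}},
        {{0, 0, 0}, {0, 0, 1}, {0, 1, 1}, {0, 1, 0}},
        {{0, 1, 0}, {0, 1, 1}, {1, 1, 1}, {1, 1, 0}},
        {{0, 0, 0}, {1, 0, 0}, {1, 0, 1}, {0, 0, 1}},
        {{0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1}},
        {{0, 0, 0}, {0, 1, 0}, {1, 1, 0}, {1, 0, 0}},
    };

    /** Returns x, y, z per vertex, 4 vertices per exposed face, in chunk-local coordinates. */
    static float[] buildQuads(boolean[][][] solid) {
        int sx = solid.length, sy = solid[0].length, sz = solid[0][0].length;
        float[] out = new float[1024];
        int n = 0;
        for (int x = 0; x < sx; x++) {
            for (int y = 0; y < sy; y++) {
                for (int z = 0; z < sz; z++) {
                    if (!solid[x][y][z]) continue; // air produces no geometry
                    for (int f = 0; f < 6; f++) {
                        int nx = x + DIRS[f][0], ny = y + DIRS[f][1], nz = z + DIRS[f][2];
                        boolean inChunk = nx >= 0 && nx < sx && ny >= 0 && ny < sy
                                && nz >= 0 && nz < sz;
                        if (inChunk && solid[nx][ny][nz]) continue; // face hidden by a neighbour
                        if (n + 12 > out.length) out = Arrays.copyOf(out, out.length * 2);
                        for (int[] c : FACE_CORNERS[f]) {
                            out[n++] = x + c[0];
                            out[n++] = y + c[1];
                            out[n++] = z + c[2];
                        }
                    }
                }
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

The returned array would be uploaded once into a single VBO and drawn with one glDrawArrays call per chunk, then rebuilt only when a block in that chunk changes. A lone cube still yields 72 floats, but a solid 3x3x3 group yields 54 faces instead of 162, and buried cubes cost nothing.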

Also, if you only have a few VBOs, they take pretty much no RAM at all, only VRAM. Assuming you use a vec3 for every position, normal and texture coordinate, plus a baked float for ambient occlusion or whatever, you'll be using 40 bytes per vertex and 960 bytes per cube (24 vertices). With 200 MB of VRAM, that gives you over 200k cubes (a lot more in usual situations if you cull the underground faces) to play with, which ought to be more than enough.

#11 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 05:21 PM

I tried to use one static VBO for the cubes and just translate the cube for each draw. It lagged the game like hell. I'm looking to keep around (32*32*32)*5 cubes in RAM. That is all the cubes in the current chunk and all the cubes in the adjacent chunks. When you move to the next chunk, all the data is saved and whatever chunks haven't been loaded or generated are generated.
The problem is that the VBO takes in vertices for each face. That's 72 float variables stored for EACH cube, not to mention texture coordinates. When I last ran this on a 128*128*128 chunk it managed to run fast, but it took up around 2 GB of RAM.

#12 powly k   Members   -  Reputation: 653

Posted 19 August 2012 - 05:44 PM

You're doing Vertex Arrays (VA) instead of Vertex Buffer Objects (VBO) if your index count makes your RAM usage go up. (Or you're developing on a laptop with shared memory, which would be very, very sad.)

I'm also interested in how you got it to take over 2 GB of whatever memory it takes. If you go the very unoptimized route and store everything (underground, air, all the cubes there are) in your VBO, you'd still have only 32x32x32x5x72x4 bytes = 47,185,920 bytes, under 50 MB. If you optimize it a bit and only store and show the ground layer, you'll probably be looking at a few megabytes of data.

EDIT: Ah, you got the >2 GB with 128³, which would still come to only around 600 MB by my calculations. I wonder what's going wrong here...
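
The arithmetic in this post can be checked with a few lines of Java. The class name `VboBudget` and its method are invented for this example; the only inputs are the figures quoted in the thread (72 position floats per cube, 4 bytes per float, positions only).

```java
/** Worked check of the position-data estimates quoted above. */
final class VboBudget {
    // 3 coordinates x 4 vertices per face x 6 faces
    static final int FLOATS_PER_CUBE = 3 * 4 * 6;

    /** Bytes of position data if every cube stores all six faces as floats. */
    static long bytesForCubes(long cubes) {
        return cubes * FLOATS_PER_CUBE * Float.BYTES;
    }

    public static void main(String[] args) {
        long fiveSmallChunks = 32L * 32 * 32 * 5;
        long oneBigChunk = 128L * 128 * 128;
        System.out.println(bytesForCubes(fiveSmallChunks)); // 47185920  (under 50 MB)
        System.out.println(bytesForCubes(oneBigChunk));     // 603979776 (around 600 MB)
    }
}
```

Both figures assume every cell of the chunk is a stored cube, so they are upper bounds for position data alone; neither comes close to the 2 GB that was reported.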

Edited by powly k, 19 August 2012 - 05:47 PM.


#13 CryoGenesis   Members   -  Reputation: 495

Posted 19 August 2012 - 06:00 PM

Ah yes, I'm using a laptop with 256 MB of dedicated VRAM. The laptop itself is pretty good. It runs on a 2.4 GHz dual-core AMD with a Radeon HD graphics card. My laptop isn't a high-end gaming laptop, but it should be able to run anything that I make by myself.
On startup only the active blocks have VBOs; when you add a block to the world, it initializes its VBO.
I think one of the main reasons for the 2 GB memory usage was that I was doing 128*128*(some random number between 5 and 20) blocks that all had a VBO that held 72 float variables.
It's very inefficient and I'm trying to find out how to have VBOs with only 8 floats for the vertices.

#14 nox_pp   Members   -  Reputation: 490

Posted 20 August 2012 - 10:02 AM

CryoGenesis wrote: "It's very in-efficient and I'm trying to find out how to have VBOs with only 8 floats for the vertexes."


Well, if you want to take that idea to the extreme...you can get away with 1 float per voxel if you use a geometry shader, but I can't speak to the speed or plausibility of that. You could also apply the same idea by sending your 8 floats through, and then using the geometry shader to generate whatever else you need. And finally, you can try sending 8 floats plus an integer index buffer.
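
For the index-buffer option, a minimal sketch of the data involved is shown below, read here as 8 shared corner vertices (24 floats) plus an index list. The class name `IndexedCube` is hypothetical. As noted earlier in the thread, sharing corners like this means a corner cannot carry a different normal or texture coordinate for each face it belongs to.

```java
/** Sketch: a unit cube described by 8 shared corners plus an index list. */
final class IndexedCube {
    /** The 8 unique corners, 3 floats each (24 floats instead of 72). */
    static final float[] CORNERS = {
        0, 0, 0,   1, 0, 0,   1, 1, 0,   0, 1, 0,   // corners 0..3 (z = 0)
        0, 0, 1,   1, 0, 1,   1, 1, 1,   0, 1, 1    // corners 4..7 (z = 1)
    };

    /** 12 triangles (36 indices) into CORNERS, counter-clockwise seen from outside. */
    static final byte[] INDICES = {
        1, 2, 6,   1, 6, 5,   // +x face
        0, 4, 7,   0, 7, 3,   // -x face
        3, 7, 6,   3, 6, 2,   // +y face
        0, 1, 5,   0, 5, 4,   // -y face
        4, 5, 6,   4, 6, 7,   // +z face
        0, 3, 2,   0, 2, 1    // -z face
    };
}
```

That is 96 bytes of positions plus 36 index bytes per cube shape, compared with 288 bytes for 72 floats. In OpenGL such data is drawn with an indexed call such as glDrawElements; whether this ends up faster than a batched per-face mesh is not something this thread establishes.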

Edited by nox_pp, 20 August 2012 - 10:06 AM.


#15 powly k   Members   -  Reputation: 653

Posted 20 August 2012 - 11:20 AM

I'm still concerned about the number of VBOs you have. When you say you have 128x128x5~20 blocks, they're all in the same VBO, right?

#16 CryoGenesis   Members   -  Reputation: 495

Posted 20 August 2012 - 02:35 PM

The chunk size is 128*128*128 and each block has its own VBO. Only the blocks that are generated at startup get a VBO, and the rest aren't rendered until they are added.



