sprite_hound

vbo update from heightmap

sprite_hound    461
I've been messing around with an FFT water implementation, and on profiling the code, I find that updating my VBO is currently pretty slow. This isn't surprising, since my VBO setup is really inefficient. I've been looking around, and no one seems to recommend any particular way of doing things. Quad strips? Triangle strips? Indexed triangle strips? Interleaved VBO or not? Use a vertex attribute for the heights in a separate buffer (might be better for updating every frame)? Current update and rendering code is here:

// update:
void FFTWater::updateVBO(bool init) {
	std::vector<cml::vector3f> vboVertices;
	std::vector<cml::vector3f> vboNormals;
	
	numQuads = 0;
	for (unsigned int x = 0; x != xSize-1; ++x) {
		for (unsigned int z = 0; z != zSize-1; ++z) {
			vboVertices.push_back( cml::vector3f(getWorldX(x), getWorldHeight(x,z+1), getWorldZ(z+1)) );
			vboVertices.push_back( cml::vector3f(getWorldX(x+1), getWorldHeight(x+1,z+1), getWorldZ(z+1)) );
			vboVertices.push_back( cml::vector3f(getWorldX(x+1), getWorldHeight(x+1,z), getWorldZ(z)) );
			vboVertices.push_back( cml::vector3f(getWorldX(x), getWorldHeight(x,z), getWorldZ(z)) );
			
			vboNormals.push_back( normal[index(x,z+1)] );
			vboNormals.push_back( normal[index(x+1,z+1)] );
			vboNormals.push_back( normal[index(x+1,z)] );
			vboNormals.push_back( normal[index(x,z)] );
			
			++numQuads;
		}
	}
	
	vboVertexSize = vboVertices.size() * sizeof(cml::vector3f);
	vboNormalSize = vboNormals.size() * sizeof(cml::vector3f);
	
	if (init) {
		vbo.setData(vboNormalSize + vboVertexSize, 0, GL_DYNAMIC_DRAW);
	}
	vbo.replaceData(vboNormalSize, &vboNormals[0], 0);
	vbo.replaceData(vboVertexSize, &vboVertices[0], vboNormalSize);
}

// rendering:

	VertexBuffer::enableState(GL_NORMAL_ARRAY);
	VertexBuffer::enableState(GL_VERTEX_ARRAY);
	vbo.bind();
	
#define BUFFER_OFFSET(i) ((char*)NULL + (i))
	glNormalPointer(GL_FLOAT, 0, 0);
	glVertexPointer(3, GL_FLOAT, 0, BUFFER_OFFSET(vboNormalSize));
#undef BUFFER_OFFSET
	
	glDrawArrays(GL_QUADS, 0, numQuads * 4);
	
	VertexBuffer::unbind();
	VertexBuffer::disableState(GL_VERTEX_ARRAY);
	VertexBuffer::disableState(GL_NORMAL_ARRAY);


Nairb    766
Since you know how big your buffer is going to be, why not just create a fixed-sized buffer to start with instead of a dynamic vector?
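
A minimal sketch of that idea, assuming a hypothetical `Vec3` stand-in for `cml::vector3f` and a hypothetical `GridScratch` holder (neither name is from the original code): the storage is sized once and only the heights are overwritten in place each frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cml::vector3f, so the sketch compiles on its own.
struct Vec3 { float x, y, z; };

// Hypothetical scratch storage that lives as long as the water object.
// It is sized once in the constructor and never grows afterwards.
struct GridScratch {
    std::size_t xSize, zSize;
    std::vector<Vec3> vertices;   // one entry per grid point, no duplicates

    GridScratch(std::size_t xs, std::size_t zs)
        : xSize(xs), zSize(zs), vertices(xs * zs) {}

    // Overwrite only the heights in place; x and z are left untouched.
    void setHeights(const std::vector<float>& height) {
        assert(height.size() == vertices.size());
        for (std::size_t i = 0; i != vertices.size(); ++i)
            vertices[i].y = height[i];
    }
};
```

Calling `setHeights` every frame performs no allocation, and `&vertices[0]` stays valid, so the same pointer can be handed to whatever upload call is in use.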

I've read an interleaved VBO can give better cache performance, but I really haven't noticed much difference in practice.

If only your heights are changing (not your x & z values, I assume?), yea, it might be beneficial to send those in as an attribute. That way you only have to update & send the heights and normals on the CPU-end.

Indexed triangle strips might help out. You shouldn't ever have to update the index buffer, which is good. But I can't say exactly how much it will help.

How are you sending your data across? I've noticed a significant difference between glBufferSubData and glMap/UnmapBuffer (where glBufferSubData performed much better).

You'll really have to test various things to find out what works best.

_the_phantom_    11250
For simplicity of data, indexed triangle lists in strip order are the way to go; all the benefits of an indexed triangle list with only index data overhead.

Update wise, if you are replacing the whole buffer the trick is to do a discard and update.

This is done by binding the VBO, calling glBufferData() with a NULL data pointer, and then calling it again with the data. This effectively says to the driver 'I'm done with the content of the VBO, please give me a new buffer to write to if the old one isn't free' and can improve your speeds, as you won't be syncing on buffer reads/writes.

Other than that Nairb had some good tips about cutting down the amount of data you need to upload. A static VBO with X,Z values in and a dynamic one with Y and normal data would certainly help as you've just cut out 1/3 of the data to upload.
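
To put rough numbers on that saving, here is a small sketch that counts the bytes streamed per frame. The function names and the 64 by 64 grid are assumptions for illustration only; the layouts are 6 floats per vertex (position plus normal) versus 4 floats per vertex (height plus normal).

```cpp
#include <cassert>
#include <cstddef>

// Bytes streamed per frame when every vertex carries position + normal (6 floats).
inline std::size_t fullBytes(std::size_t vertexCount) {
    return vertexCount * 6 * sizeof(float);
}

// Bytes streamed per frame when x and z sit in a static buffer and only
// the height (1 float) and the normal (3 floats) are updated.
inline std::size_t splitBytes(std::size_t vertexCount) {
    return vertexCount * 4 * sizeof(float);
}

// Vertices sent when each grid cell is an unshared quad, as in the first post.
inline std::size_t quadVertexCount(std::size_t xSize, std::size_t zSize) {
    assert(xSize >= 1 && zSize >= 1);
    return 4 * (xSize - 1) * (zSize - 1);
}

// Vertices sent when grid points are shared through an index buffer.
inline std::size_t sharedVertexCount(std::size_t xSize, std::size_t zSize) {
    return xSize * zSize;
}
```

For a 64 by 64 grid with 4-byte floats, `fullBytes(quadVertexCount(64, 64))` comes to 381024 bytes, while `splitBytes(sharedVertexCount(64, 64))` comes to 65536 bytes. The split layout alone accounts for the one-third reduction; the rest comes from sharing vertices through indices.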

sprite_hound    461
@Nairb: Thanks for the tips :)

Quote:
Original post by phantom
For simplicity of data, indexed triangle lists in strip order are the way to go; all the benefits of an indexed triangle list with only index data overhead.


@Phantom: Please could you clarify what you mean by an indexed triangle list in strip order? Use GL_TRIANGLES for the primitive type and send in strip style indices?

MARS_999    1627
Quote:
Original post by sprite_hound
@Nairb: Thanks for the tips :)

Quote:
Original post by phantom
For simplicity of data, indexed triangle lists in strip order are the way to go; all the benefits of an indexed triangle list with only index data overhead.


@Phantom: Please could you clarify what you mean by an indexed triangle list in strip order? Use GL_TRIANGLES for the primitive type and send in strip style indices?


Yes, use an IBO to order the triangles in strip pattern.

sprite_hound    461
Quote:
Original post by phantom
For simplicity of data, indexed triangle lists in strip order are the way to go; all the benefits of an indexed triangle list with only index data overhead.


Well, I've fiddled some, and I'm still not entirely sure what this means.

Index generation is like so:


std::vector<unsigned int> indices;
// Despite the name, this is the number of indices in the strip:
// 2*xSize per row, plus one stitching index between consecutive rows.
numTriangles = xSize*2 * (zSize-1) + zSize-2;
bool even = true;
for (unsigned int z = 0; z != zSize-1; ++z) {
	if (even) {
		// even rows run left to right
		int x;
		for (x = 0; x != static_cast<int>(xSize); ++x) {
			indices.push_back(x + z*xSize);
			indices.push_back(x + z*xSize + xSize);
		}
		if (z != zSize-2) {
			indices.push_back(--x + z*xSize);
		}
	}
	else {
		// odd rows run right to left
		int x;
		for (x = xSize-1; x >= 0; --x) {
			indices.push_back(x + z*xSize);
			indices.push_back(x + z*xSize + xSize);
		}
		if (z != zSize-2) {
			indices.push_back(++x + z*xSize);
		}
	}
	even = !even;
}
vboIndices.setData(indices.size()*sizeof(unsigned int), &indices[0]);








And the primitives are drawn thusly:


VertexBuffer::enableState(GL_VERTEX_ARRAY);
vboVertices.bind();
glVertexPointer(3, GL_FLOAT, 0, NULL);

// GL_INDEX_ARRAY is the colour-index array, not the element indices,
// so it does not need to be enabled for glDrawElements.
vboIndices.bind();
glDrawElements(GL_TRIANGLE_STRIP, numTriangles, GL_UNSIGNED_INT, NULL);
vboIndices.unbind();

VertexBuffer::unbind();
VertexBuffer::disableState(GL_VERTEX_ARRAY);








It works fine, but is this what you meant? (i.e. should I be using GL_TRIANGLE_STRIP?).

[Edited by - sprite_hound on April 24, 2008 6:25:26 PM]

MARS_999    1627
No, you use GL_TRIANGLES, but you order the IBO to draw them like strips. You may have to use degenerate triangles depending on how you go about this.
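
One way to see the difference is to expand existing strip indices into a triangle list. The sketch below is an illustration only, and `stripToList` is a hypothetical helper name. It assumes the usual strip convention: every window of three consecutive indices is one triangle, odd-numbered triangles have their winding flipped, and windows with a repeated index are degenerate and draw nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expand triangle-strip indices into an equivalent GL_TRIANGLES index list.
std::vector<unsigned int> stripToList(const std::vector<unsigned int>& strip) {
    std::vector<unsigned int> list;
    if (strip.size() < 3) return list;
    list.reserve((strip.size() - 2) * 3);
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        unsigned int a = strip[i];
        unsigned int b = strip[i + 1];
        unsigned int c = strip[i + 2];
        // Degenerate triangles (used to stitch rows) cover no area: skip them.
        if (a == b || b == c || a == c) continue;
        // Odd triangles in a strip have reversed winding: swap to restore it.
        if (i % 2 == 1) { unsigned int t = a; a = b; b = t; }
        list.push_back(a);
        list.push_back(b);
        list.push_back(c);
    }
    assert(list.size() % 3 == 0);
    return list;
}
```

Feeding it the strip 0, 1, 2, 3 gives 0, 1, 2, 2, 1, 3. Passing the raw strip indices straight to GL_TRIANGLES instead consumes them three at a time, which would explain a pattern of disconnected triangles with alternating facing.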

zedz    291
Are you sure it's the updating of the VBOs that is causing the slowdown?
Because the CPU code you have there is hugely inefficient.

Defining a vector and then using many push_back() calls each frame is not the way to go; also, the construction of the cml::vector3f() as well as getWorldX() etc. aren't the fastest methods.

I wouldn't be surprised if you could speed this up by 10x at least.

NineYearCycle    1538
Just one possible idea but how about just defining the heightmap using the fft and then modifying the vertex position within a vertex shader? Then you can just use a flat plane defined as a vbo and it never needs to be updated.

The heightmap and normals can be passed in as texture samplers. I've done it to displace the surface of a sphere and was quite impressed with the results for a relatively simple shader.

I'm at work now but PM me and I'll post the code I wrote as I'm sure you could adapt it for your needs. Even if you don't it gives you one more option ;)

Another thing is that, as someone has pointed out, you're doing a push_back on your vectors to build all of the data each frame. You can speed things up quite simply here by calling "reserve" on the vector with the number of elements that you'll be populating it with, and/or by storing the vector itself to avoid rebuilding it every frame. I bet that's not really adversely affecting your performance at this point anyway, but it makes no sense to waste those cycles when you don't have to.

Andy

sprite_hound    461
@MARS_999: Enh... I suspect I'm just being dense. If I use GL_TRIANGLES with the indices set up like a strip then I get a nice pattern of disconnected triangles facing forwards and backwards.

I'm now just setting up the VBO once at the start, and updating the heights and normals as attributes every frame (setData uses glBufferData, and replaceData uses glBufferSubData):


// Init code, called once... adding 1 to the sizes is to handle tiling.
void FFTWater::initVBO() {
	bool even = true;
	int xSize = this->xSize+1;
	int zSize = this->zSize+1;
	for (int z = 0; z != zSize-1; ++z) {
		if (even) {
			int x;
			for (x = 0; x != xSize; ++x) {
				indices.push_back(x + z*xSize);
				indices.push_back(x + z*xSize + xSize);
			}
			if (z != zSize-2) {
				indices.push_back(--x + z*xSize);
			}
		}
		else {
			int x;
			for (x = xSize-1; x >= 0; --x) {
				indices.push_back(x + z*xSize);
				indices.push_back(x + z*xSize + xSize);
			}
			if (z != zSize-2) {
				indices.push_back(++x + z*xSize);
			}
		}
		even = !even;
	}
	vboIndices.setData(indices.size()*sizeof(unsigned int), &indices[0]);

	std::vector<cml::vector3f> vertices;
	for (int z = 0; z != zSize; ++z) {
		for (int x = 0; x != xSize; ++x) {
			vertices.push_back( cml::vector3f(getWorldX(x), 0.f, getWorldZ(z)) );
		}
	}
	vboVertices.setData(vertices.size()*sizeof(cml::vector3f), &vertices[0]);
	vboNormalSize = vertices.size()*sizeof(cml::vector3f);
	vboHeightSize = vertices.size()*sizeof(float);
	vboAttributes.setData(vboNormalSize + vboHeightSize, 0, GL_STREAM_DRAW);
	vboAttributes.replaceData(vboNormalSize, &normal[0], 0);
	vboAttributes.replaceData(vboHeightSize, &floatHeight[0], vboNormalSize);
}

// update called every frame...
void FFTWater::update(float dTime) {
	// ... (update heights, normals)
	vboAttributes.setData(vboNormalSize + vboHeightSize, NULL, GL_STREAM_DRAW);
	vboAttributes.replaceData(vboNormalSize, &normal[0], 0);
	vboAttributes.replaceData(vboHeightSize, &floatHeight[0], vboNormalSize);
}

// rendering...
void FFTWater::render(Renderer* rd) {
	// ...

	VertexBuffer::enableState(GL_VERTEX_ARRAY);
	vboVertices.bind();
	glVertexPointer(3, GL_FLOAT, 0, NULL);

	VertexBuffer::enableState(GL_NORMAL_ARRAY);
	vboAttributes.bind();
	glNormalPointer(GL_FLOAT, 0, NULL);

	// Todo: add this to shaderProg class stuff.
	// Note: glBindAttribLocation only takes effect the next time the program
	// is linked, so this call really belongs before linking.
	glEnableVertexAttribArray(1);
	glBindAttribLocation(shaderProgUnlit->getID(), 1, "Height");
	vboAttributes.bind();
#define BUFFER_OFFSET(i) ((char*)NULL + (i))
	glVertexAttribPointer(1, 1, GL_FLOAT, GL_FALSE, 0, BUFFER_OFFSET(vboNormalSize));
#undef BUFFER_OFFSET

	// GL_INDEX_ARRAY is the colour-index array, not the element indices,
	// so it does not need to be enabled for glDrawElements.
	vboIndices.bind();
	glDrawElements(GL_TRIANGLES, indices.size(), GL_UNSIGNED_INT, NULL);
	vboIndices.unbind();

	glDisableVertexAttribArray(1);

	VertexBuffer::unbind();
	VertexBuffer::disableState(GL_NORMAL_ARRAY);
	VertexBuffer::disableState(GL_VERTEX_ARRAY);

	// ...
}





@zedz: Yep, sorry, should have posted my altered code sooner.

@NineYearCycle: That sounds an interesting idea. PM'ing you. :)

NineYearCycle    1538
Ok well this is the 'raw' glsl vertex shader that I used for transforming the vertex of a sphere so it won't immediately fit your needs:

varying vec3 lightDir, normal;
uniform float time;
uniform sampler2D myTexture;

void main()
{
	gl_TexCoord[0] = gl_MultiTexCoord0;

	lightDir = normalize(vec3(gl_LightSource[0].position));
	normal = gl_NormalMatrix * gl_Normal;

	vec4 color = texture2D(myTexture, vec2(gl_TexCoord[0]));
	// gl_Vertex is a vec4, so take .xyz before adding the vec3 offset
	vec3 pos = gl_Vertex.xyz + (gl_Normal * color.z);
	vec4 pos4 = vec4(pos.x, pos.y, pos.z, 1.0);
	gl_Position = vec4(gl_ModelViewProjectionMatrix * pos4);
}



In this version I've cut down a bit and removed everything except the vertex modification.

uniform sampler2D myTexture;

void main()
{
	// getting the texture coords for the sampler
	gl_TexCoord[0] = gl_MultiTexCoord0;

	// In this case my source image "heightmap" is RGBA -
	// - and height is only 8-bits in the alpha channel.
	// This is a bit rubbish but yours could be a full float format.
	vec4 heightColour = texture2D(myTexture, gl_TexCoord[0].st);

	// up is the direction we want to move the vertex, this was called
	// "normal" in the version above.
	const vec3 up = vec3(0.0, 1.0, 0.0);

	// now we modify the position of the gl_Vertex
	// (gl_Vertex is a vec4, so take .xyz before adding the vec3 offset)
	// this last section is a bit messy and could be improved, a lot!!!
	vec3 pos = gl_Vertex.xyz + (up * heightColour.a);
	vec4 pos4 = vec4(pos.x, pos.y, pos.z, 1.0);
	gl_Position = vec4(gl_ModelViewProjectionMatrix * pos4);
}



There's nothing to it.

All I'm really doing is writing the height value into a texture, and then sampling it in the vertex shader. I use it to scale a vector called "normal" in the original shader, or "up" in the second listing. This gives me a scaled vector that I can add to the gl_Vertex that we're currently dealing with and set the gl_Position using it.

As you can see from the comments I've added the version I've given you is not an ideal situation. For a start I'm only using 8-bits of the 32-bit format from my texture. I don't need the extra precision so I save one texture channel but you will probably want to use a floating point texture format, or encode your height into the full texture.
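
To give a feel for what 8 bits costs, here is a CPU-side sketch of the round trip a height takes through an 8-bit channel. The helper names and the [lo, hi] height range are assumptions for illustration; the shader itself only sees the normalised 0 to 1 value and would need the same range to rescale it.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map a height in [lo, hi] to the 8-bit value an RGBA8 channel would store.
inline std::uint8_t encodeHeight8(float h, float lo, float hi) {
    assert(hi > lo);
    float t = (h - lo) / (hi - lo);   // normalise to [0, 1]
    if (t < 0.0f) t = 0.0f;           // clamp anything outside the range
    if (t > 1.0f) t = 1.0f;
    return static_cast<std::uint8_t>(std::floor(t * 255.0f + 0.5f));
}

// Recover the height from the stored channel value.
inline float decodeHeight8(std::uint8_t v, float lo, float hi) {
    return lo + (static_cast<float>(v) / 255.0f) * (hi - lo);
}
```

With a height range of 2 units, the 256 levels are spaced 2/255 (about 0.0078) apart, so the round trip can be off by up to half of that. Whether that is visible depends on wave amplitude and viewing distance; a floating point texture avoids the question.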

You'll want to add a pre-generated normal map to the above and write your own fragment shader, but this should get you started. It only took me a few hours to learn how to do this the first time, and if you can write an FFT water simulation I'm sure you're up to it!

Andy

_the_phantom_    11250
Quote:
Original post by sprite_hound
@MARS_999: Enh... I suspect I'm just being dense. If I use GL_TRIANGLES with the indices set up like a strip then I get a nice pattern of disconnected triangles facing forwards and backwards.


Chances are you set up your index data wrong then.

Given a set of vertices in triangle strip order {(0,0), (0,1), (1,0), (1,1)}, two triangles can be drawn using the following index setup: { 0,1,2,2,1,3 }.

This is a triangle list in strip order with indices.

The advantage of this is that there is no need for connecting strips of triangles between sections; you can simply start a new triangle wherever you need to. The strip access pattern allows you to make use of the post-T&L cache, and all it costs is indices, which are often a small overhead (16 bits per index).
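
Applied to a whole grid, the same pattern can be generated directly. This is a sketch under assumptions rather than the code used in the thread: `buildGridIndices` is a hypothetical name, and vertices are assumed to be stored row by row so that grid point (x, z) has index x + z * xSize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build GL_TRIANGLES indices for an xSize by zSize grid of vertices,
// emitting each cell as two triangles in strip order: a,b,c then c,b,d.
std::vector<unsigned int> buildGridIndices(unsigned int xSize, unsigned int zSize) {
    assert(xSize >= 2 && zSize >= 2);
    std::vector<unsigned int> indices;
    indices.reserve(static_cast<std::size_t>(xSize - 1) * (zSize - 1) * 6);
    for (unsigned int z = 0; z + 1 < zSize; ++z) {
        for (unsigned int x = 0; x + 1 < xSize; ++x) {
            unsigned int a = x + z * xSize;   // this row, this column
            unsigned int b = a + 1;           // this row, next column
            unsigned int c = a + xSize;       // next row, this column
            unsigned int d = c + 1;           // next row, next column
            indices.push_back(a); indices.push_back(b); indices.push_back(c);
            indices.push_back(c); indices.push_back(b); indices.push_back(d);
        }
    }
    return indices;
}
```

A 2 by 2 grid yields exactly 0, 1, 2, 2, 1, 3. No stitching indices are needed between rows, and the index buffer never changes once built. If the grid has at most 65536 vertices, the same values fit in 16-bit indices.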

sprite_hound    461
@NineYearCycle: Ah. Cool. Yeah, that's far simpler than I was imagining it to be.

I wonder how that compares speed-wise. I'll stick something together to test it.

@Phantom: Thanks. I'll fiddle with my indices.

sprite_hound    461
Well, I changed the indices, and tried out using textures for the height and normal data.

No noticeable change in performance (though this is fine, 'cos it's fast enough for now).

I think I'm gonna stick with the vertex attributes for no particularly good reason.
