Is Vertex Buffer extension "much more" faster than Display Lists?

28 comments, last by owl 20 years, 5 months ago
I had this problem. I tried the second way, and I guess GL matrix calls are illegal between glBegin and glEnd. I'm no pro, but I'm pretty sure that what you do (and what I'm doing) is compile a vertex array manually:

vertexout = (vertexin * bonematrix1 * boneweight1) + (vertexin * bonematrix2 * boneweight2) .... if you have more affecting bones.

and then render that array. It seems slow to have to process every vertex individually, but I guess it's the way to do it.
You might want to look into the GL_ARB_vertex_blend extension (ATI have a demo of its usage), as this does what you want but basically in hardware.
Having had a quick look at the code, all the software work you have to do is lerp the bones; once that's done, you use the result to set up multiple modelview matrices, one for each bone as required.

http://www.ati.com/developer/vertexblend.html
this thread finally made me play around with it. matrix palette, weights, blending.. one would think everything you need is there but of course for some reason nvidia decided that their driver will not support vertex blending and matrix palette anymore, so im kind of clueless how to dynamically select the right program matrix from within the shader. maybe its time for a new card.
f@dz http://festini.device-zero.de
quote:
1) Do you calculate the new position of the vertices and then assemble them all into 1 array and pass it to the card? In this case how do you calculate the new position with just matrices? Do you extract the rotation/translation values from the matrix? If you do, how do you extract them? (Sorry, I'm not good with matrix math.)

Each bone contains a 4x4 local matrix which describes its position and rotation relative to its parent (well actually it uses a quaternion and position vector, but that's not relevant), and an absolute matrix which describes its global position. The bones (and other rigid meshes) are animated by interpolating between positions and rotations which are sampled at about 30fps.

Bones are attached to a skin, a skin being a single mesh, with each vertex affected by one or more bones' absolute position/rotation matrices. When the bones are updated, the vertices affected by that bone are updated too.

If you want to know specifics about matrices and how they're used to transform shapes, there's plenty of material on the web and in maths text books. The Matrix and Quaternion FAQ is a good place to start.

quote:
vertexout = (vertexin * bonematrix1 * boneweight1) + (vertexin * bonematrix2 * boneweight2) .... if you have more affecting bones.


Almost. Each vertex has an offset vector for each bone affecting it, which describes the local position of the vertex in relation to the bone. You'd transform that offset vector by the absolute bone matrix.

____________________________________________________________www.elf-stone.com | Automated GL Extension Loading: GLee 5.00 for Win32 and Linux

quote:Original post by Trienco
first: it seems some drivers have weird vbo issues resulting in bad performance. if you either allocate too much or too little memory (depending on card and driver) you might just get system memory. you could for example try to allocate less than 6mb of video memory or more than 12 (even if you dont need it, just for testing).

about skinning. it required vertex programs to let me see an efficient solution to the problem i had with it (meaning: i do absolutely NOT want to touch the vertices of the skin myself). the ability of storing 12 matrices/bones should easily let you do the whole skinning in the vertex shader, though having a lot of work to do for each vertex i would definitely make sure to use the cache as good as possible. more complex models would require multiple calls and changing the matrices.

btw. if you use lock/unlock and do that maybe only once it might very well be that the driver is copying it to video memory which might be more or less as fast as vbo, depending on how they implemented it. anyways, you are not using "just vertex arrays".




Yes, but lock/unlock only gave me about a 10fps gain in speed, which still puts my app well over 19M triangles/s with just normal old vertex arrays (triangle strips of course).
quote:Original post by benjamin bunny
quote:
1) Do you calculate the new position of the vertices and then assemble them all into 1 array and pass it to the card? In this case how do you calculate the new position with just matrices? Do you extract the rotation/translation values from the matrix? If you do, how do you extract them? (Sorry, I'm not good with matrix math.)

Each bone contains a 4x4 local matrix which describes its position and rotation relative to its parent (well actually it uses a quaternion and position vector, but that's not relevant), and an absolute matrix which describes its global position. The bones (and other rigid meshes) are animated by interpolating between positions and rotations which are sampled at about 30fps.

Bones are attached to a skin, a skin being a single mesh, with each vertex affected by one or more bones' absolute position/rotation matrices. When the bones are updated, the vertices affected by that bone are updated too.

If you want to know specifics about matrices and how they're used to transform shapes, there's plenty of material on the web and in maths text books. The Matrix and Quaternion FAQ is a good place to start.

quote:
vertexout = (vertexin * bonematrix1 * boneweight1) + (vertexin * bonematrix2 * boneweight2) .... if you have more affecting bones.


Almost. Each vertex has an offset vector for each bone affecting it, which describes the local position of the vertex in relation to the bone. You'd transform that offset vector by the absolute bone matrix.


Thx for the info, but I think my question has not been answered. I'll try to rephrase it. I already understand how bones and skin work. What I am trying to understand is how you get your animated mesh result given only the matrix of the bone. What I also want to know is how you pass your data.

In 1 array which was created by calculating each and every vertex of that mesh? Similar to honayboyz's method. Won't this be slow? Also how do I interpret this:

Frame 1
vertexin * bonematrix1 * boneweight1

Vertexin is a position describing XYZ of a vertex.
BoneMatrix is a 4X4 matrix.
BoneWeight is just a float.

So how do i multiply these? As for interpolation i guess it would be like this?
Assuming frame 1 and frame 3 are keyframes.

Frame 1
vertexin * bonematrix1 * boneweight1

Frame 3
vertexin * bonematrix1 * boneweight1

Assuming Frame1 and Frame3 are the positions of the vertex at their respective frames:

Frame 2 = (Frame 3 - Frame 1)/2

Am I right? Or is there something wrong with this method?



Or do you render by applying the bone's matrix and then render the vertices of each bone, making it look like this:

apply(Bone1.matrice);
render(Bone1.vertices);
apply(Bone2.matrice);
render(Bone2.vertices);
etc...
I believe this method will be a lot faster since you don't have to recalculate each and every vertex, but I am stumped as to how non-rigid meshes will work.

Also I only have a GF2MX so using vertex programs is out. I want to know how it is done in software first.



quote:Original post by GamerSg

Thx for the info, but I think my question has not been answered. I'll try to rephrase it. I already understand how bones and skin work. What I am trying to understand is how you get your animated mesh result given only the matrix of the bone. What I also want to know is how you pass your data.

In 1 array which was created by calculating each and every vertex of that mesh? Similar to honayboyz's method. Won't this be slow? Also how do I interpret this:



it depends on what you can use and how many bones affect one vertex. if you do it in software you can do everything you like, but have to update the array before drawing (i think i wouldnt do it when updating its bone because that might happen more than once every frame). if you have support for the right extensions you can have more info per vertex, ie: indices of the matrices/bones influencing it and a weight for each of them. just load the right matrices to the palette, pass that info and let the hardware do the work.

quote:
Frame 1
vertexin * bonematrix1 * boneweight1

Vertexin is a position describing XYZ of a vertex.
BoneMatrix is a 4X4 matrix.
BoneWeight is just a float.


so what? a vector * matrix will result in a vector and a vector * float will result in a scaled vector.

quote:
So how do i multiply these? As for interpolation i guess it would be like this?
Assuming frame 1 and frame 3 are keyframes.


interpolation is one of the few things, where quaternions really shine, because i think you cant just interpolate two matrices without the result being useless.
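For reference, a small sketch of spherical linear interpolation (slerp) between two unit quaternions, which is the usual way to blend bone rotations between keyframes. The `Quat` struct and the 0.9995 cutoff are choices made for this example; the thread does not show anyone's actual implementation.

```cpp
#include <cmath>

struct Quat { float w, x, y, z; };  // assumed to be unit length

// Interpolate from a (t = 0) to b (t = 1) along the shorter arc.
Quat slerp(Quat a, Quat b, float t) {
    float dot = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;

    // q and -q describe the same rotation; flip b so we take the short way.
    if (dot < 0.0f) {
        b = {-b.w, -b.x, -b.y, -b.z};
        dot = -dot;
    }

    float wa, wb;
    if (dot > 0.9995f) {
        // Nearly identical rotations: plain lerp avoids dividing by ~0.
        wa = 1.0f - t;
        wb = t;
    } else {
        const float theta = std::acos(dot);
        const float s = std::sin(theta);
        wa = std::sin((1.0f - t) * theta) / s;
        wb = std::sin(t * theta) / s;
    }

    Quat q = {wa * a.w + wb * b.w, wa * a.x + wb * b.x,
              wa * a.y + wb * b.y, wa * a.z + wb * b.z};

    // Renormalize to remove accumulated floating-point drift.
    const float len = std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z);
    return {q.w / len, q.x / len, q.y / len, q.z / len};
}
```

Halfway between the identity and a 180 degree turn about z, this yields a 90 degree turn about z, so the interpolated bone rotates instead of collapsing.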

quote:
Frame 1
vertexin * bonematrix1 * boneweight1

Frame 3
vertexin * bonematrix1 * boneweight1

Assuming Frame1 and Frame3 are the positions of the vertex at their respective frames:

Frame 2 = (Frame 3 - Frame 1)/2

Am I right? Or is there something wrong with this method?


that will work for translations, but imagine a hand being turned from thumbs up to thumbs down. if you interpolate like this your thumb won't rotate but just move straight down. as i said, that's one of the few things that would make me look into quaternions, even if i usually think they are hyped too much.
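The thumb example can be checked with a few lines. Note that the in-between position is Frame1 + (Frame3 - Frame1) * t, which for t = 0.5 is the average of the two frames rather than (Frame3 - Frame1)/2. The sketch assumes a thumb tip one unit from a joint at the origin that turns 180 degrees about the z axis; the helper names are invented for the example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Linear interpolation between two positions: a + (b - a) * t.
Vec3 lerpPoint(const Vec3& a, const Vec3& b, float t) {
    return {a.x + (b.x - a.x) * t,
            a.y + (b.y - a.y) * t,
            a.z + (b.z - a.z) * t};
}

// Rotate a point about the z axis by the given angle in radians.
Vec3 rotateZ(const Vec3& v, float radians) {
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return {c * v.x - s * v.y, s * v.x + c * v.y, v.z};
}

// Distance of a point from the origin (the joint in this example).
float length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}
```

With `up = {0, 1, 0}` and `down = {0, -1, 0}`, `lerpPoint(up, down, 0.5f)` lands on the joint itself, while `rotateZ(up, pi / 2)` stays one unit away from it. That shrinking is why positions alone are not enough for rotating bones.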

what might work in most cases and is just an idea that might fail horribly: interpolate the 4 vectors of the matrix independently and renormalize them (except position). so if this is the (say) x axis for matrix 1 and 2:

|
|
|

------

just interpolating would end in the middle of the two endpoints, which would not be unit length and would screw up the matrix. if you normalize it, it should lie on the arc between them and be fine. but off the top of my head i have no clue if the matrix would always stay orthogonal with this method. also it would require a lot of square roots. and it will definitely blow up if you interpolate between say x and -x because 0 is hard to normalize ,-)
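A sketch of that idea for a single axis, with the degenerate case reported instead of dividing by zero. The function name `nlerpAxis` and the 1e-6 threshold are assumptions made for this example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Interpolate two unit axis vectors linearly, then renormalize.
// Returns false when the blended vector is (nearly) zero length, which
// happens halfway between opposite axes such as x and -x.
bool nlerpAxis(const Vec3& a, const Vec3& b, float t, Vec3& out) {
    const Vec3 v = {a.x + (b.x - a.x) * t,
                    a.y + (b.y - a.y) * t,
                    a.z + (b.z - a.z) * t};
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (len < 1e-6f) {
        return false;  // nothing sensible to normalize
    }
    out = {v.x / len, v.y / len, v.z / len};
    return true;
}
```

The normalized result does lie on the arc between the two inputs, but it only moves at constant angular speed for t = 0.5, and interpolating the three axes separately gives no guarantee that they stay perpendicular, so a re-orthogonalization step would still be needed.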

quote:
Or do you render by applying the bone's matrix and then render the vertices of each bone, making it look like this:

apply(Bone1.matrice);
render(Bone1.vertices);
apply(Bone2.matrice);
render(Bone2.vertices);


that would work, but a) multiple draw calls for just a small amount of data is slow and b) you would have cracks at the joints.

quote:
I believe this method will be a lot faster since you don't have to recalculate each and every vertex, but I am stumped as to how non-rigid meshes will work.


well, if your driver/hardware allows it you would basically do the same, but instead of changing the matrix all the time you tell the card in advance which matrix to use for each vertex. that way your model will still be closed.

problem is: if you bend too much it will still look weird, thats why you usually want each vertex to be influenced by more than one bone, introducing the weights.

what's troubling me here is only one thing: if you have one bone per vertex you can store the vertex relative to the bone in its original position and get the final position just by multiplying with the bone. if you have more than one bone i could either think of first multiplying with the inverse of the bone's original matrix and then with the bone's current matrix (doubling the work in the shader) or storing copies of the vertex for each bone which could already be relative to this bone. the first approach should work in hardware (but wastes precious space in the matrix palette), the second would only work in software and require more memory.

but for the first version it should be possible to just concatenate OrgBone^-1 and Bone, shouldnt it? though that way each bone would store its inverse original world matrix, its current local matrix (relative to parent) and current world matrix. then you shouldnt have to store each vertex relative to anything but just leave it as is.

quote:
Also i only have a GF2MX so using Vertex Programs is out. I want to know how it is done in software first.


without support for matrix palette and vertex blending? as he described above, by updating your vertices yourself.

[edited by - Trienco on November 16, 2003 4:06:06 AM]
f@dz http://festini.device-zero.de
quote:Original post by Trienco
but for the first version it should be possible to just concatenate OrgBone^-1 and Bone, shouldnt it? though that way each bone would store its inverse original world matrix, its current local matrix (relative to parent) and current world matrix. then you shouldnt have to store each vertex relative to anything but just leave it as is.


That would work. It's an extra matrix multiply per bone per vertex though.

____________________________________________________________www.elf-stone.com | Automated GL Extension Loading: GLee 5.00 for Win32 and Linux

i tried it (in software, as i'm still figuring out how to select the right program matrix in my vertex program.. *sigh* relative addressing looked so nice)...

anyway, i ended up storing 4 matrices per bone. so for every bone i calculate its inverse matrix for the original position (OrgInv)

when one bone is changed this change is applied to its local matrix (LocMatrix), then this and all following bones are recomputed. (for the sake of being lazy i abuse opengl for the matrix math). first the parents global matrix is multiplied with the current local matrix to get the current global matrix (that dirty position part should sooner or later vanish)

void update() {
  glPushMatrix();
  if (parent) {
    glLoadMatrixf(parent->GlobMatrix);
    glMultMatrixf(LocMatrix);
    glGetFloatv(GL_MODELVIEW_MATRIX, GlobMatrix);
    GlobMatrix[12] = parent->GlobMatrix[12] + parent->length * parent->GlobMatrix[4];
    GlobMatrix[13] = parent->GlobMatrix[13] + parent->length * parent->GlobMatrix[5];
    GlobMatrix[14] = parent->GlobMatrix[14] + parent->length * parent->GlobMatrix[6];
  }


for the root bone local and global are the same
else memcpy(GlobMatrix, LocMatrix, sizeof(GlobMatrix));

then the global matrix is multiplied by the original inverse

glLoadMatrixf(GlobMatrix);
glMultMatrixf(OrgInv);
glGetFloatv(GL_MODELVIEW_MATRIX, EffMatrix);
glPopMatrix();

until i figure out an elegant way for it i use EffMatrix as the effective matrix used for skinning.

this continues for all children from here, in a loop if it's more than one:

if (child) child->update();
}

depending on how many bones are changed it might either be cheaper to only do a big update in the end or updates starting at the changed bone. so as a maximum it should be one additional matrix mult per bone but not per vertex (unless you multiply first with global then inverse in the vertex program, but i think the above will be more efficient).

its of course pointless if all vertices are only affected by a single bone because its no problem to store the vertex relative to the bone in the first place. but for multiple bones per vertex it should work fine, plus you dont need to touch the original mesh (except of course adding indices and weights for each vertex).

whats causing a constant stream of curses at this time is this:

PARAM BoneMat[4] = { state.matrix.program[0] };

program[0-9] contain the matrices for each bone, at this time the w component of the vertex is abused as single index. what i need is (doesnt matter how indirect)

PARAM BoneMat[4] = { state.matrix.program[vertex.position.w] };

just in a form that is valid for vertex programs.

ADDRESS A;
ARL A.x, vertex.position.w;
PARAM BoneMat[4] = { state.matrix.program[A.x] };

doesn't work, because BoneMat is initialized before anything is done and most likely relative addressing wouldn't be allowed here anyway. all that's keeping me from moving to multiple weighted bones is getting that thing to also work in hardware.

[edited by - Trienco on November 16, 2003 9:29:26 AM]
f@dz http://festini.device-zero.de
quote:Original post by benjamin bunny
quote:Original post by Ready4Dis
Something isn't right about that.. I'm running an athlon xp 1600, radeon 9200, and drawing my terrain + models I am hitting over 20M triangles per second... using just vertex arrays, so no clue how you're getting such low numbers with normal VA's and VBO's. Actually, I am using lock/unlock with the vertex arrays. It falls back to not using lock/unlock if not available, I haven't had a chance to implement VBO's yet (I think it's finally implemented for my card in the latest drivers, i have to check though, last time I looked it wasn't there).


I suspect it's because I'm only sending 300 vertices per batch, which isn't particularly efficient with vertex arrays or VBO. I get much higher numbers with terrain, where I send data in blocks of at least 1039 vertices. Bear in mind also that lighting makes a big difference to frame rate.


I'm sending 289 vertices per batch, so I don't think that's it... possibly because I don't have lighting enabled I think, because I have textured, and colored vertices both, but not lighting & normals (didn't see your second post when I posted that, so once I add that stuff, it may drop a bit).


--- Edit ---
Typo

[edited by - Ready4Dis on November 16, 2003 11:09:38 AM]

[edited by - Ready4Dis on November 16, 2003 11:09:59 AM]
