OpenGL GPU Skinning and frame interpolation

Hello. I've been trying to do skinning, and so far I think I've mostly figured out the math behind it. I've successfully loaded an MD5 mesh and I'm working on implementing CPU skinning for a small animation following this guide: [url="http://3dgep.com/?p=1053"]http://3dgep.com/?p=1053[/url]. I think I can get software skinning working quite easily, but I definitely want to do the skinning in a vertex shader. I found another tutorial on the same website that covers this: [url="http://3dgep.com/?p=1356"]http://3dgep.com/?p=1356[/url]. The "problem" is that it seems to do the bone interpolation on the CPU and then submit the data to the GPU each frame. Transforming 100 bones shouldn't be that CPU heavy on its own, but I might potentially have a thousand instances of the same mesh, each in a different animation. Therefore I also want to use instancing to reduce the CPU load. Since my game is an RTS, the GPU is severely underused even though I offloaded fog-of-war rendering to it, so trading GPU cycles for CPU cycles is a good thing in my case.

I could just multi-thread the bone interpolation for an almost linear increase in performance, but I still thought it should be possible to offload almost everything onto the GPU. That's when I stumbled over this white paper: [url="http://developer.download.nvidia.com/SDK/10/direct3d/Source/SkinnedInstancing/doc/SkinnedInstancingWhitePaper.pdf"]http://developer.download.nvidia.com/SDK/10/direct3d/Source/SkinnedInstancing/doc/SkinnedInstancingWhitePaper.pdf[/url]. It seems to do exactly what I want by storing bone matrices in a texture, which was something I had thought of doing. However, the implementation in the white paper does not seem to do any kind of bone interpolation, and simply rounds to the closest frame (though this isn't written anywhere). I'm 99.9% sure they don't re-upload the bone matrices each frame, since they seem to keep the bone data per animation and frame, not per individual instance. Losing interpolation seems like a huge step backwards, so I would definitely not implement skinning that way if that turned out to be the cost.

I figured I could just upload even "rawer" bone data to my animation texture, meaning I'd keep a 3D vector and a quaternion per bone instead of a matrix and then do the interpolation between the two frames in the vertex shader. The amount of data sampled would only increase by about 33%:

1 matrix per weight = RGBA 32-bit float x 3
2 vectors and 2 quaternions = RGBA 32-bit float x 4

I would also have to upload additional static data for the weights (a position for each weight). The problem is the additional logic needed to transform each vertex, since the interpolation would have to be redone for each vertex. I think this additional cost will be almost unnoticeable though, since the vertex shader should be bandwidth/texture limited anyway. If there happens to be a built-in function to do slerp (I've found mix(...), but I'm not sure if it's the right one) the additional logic cost should be negligible.

In short, I'd port this exact function to a GLSL vertex shader:
[CODE]
for (int i = 0; i < m.vertices.size(); i++) {
    Vertex v = m.vertices.get(i);
    float x = 0, y = 0, z = 0;
    for (int k = 0; k < v.weightCount; k++) {
        // v.startWeight = this vertex's first index into the mesh's weight list
        Weight w = m.weights.get(v.startWeight + k);
        // Joint contains a position and an orientation quaternion. For animation
        // I'd be using the current frame's joints, not the bind-pose joints, of course.
        Joint j = bindPoseJoints.get(w.joint);
        rot(j.orientation, w.position, temp);  // rotate the weight position by the joint orientation (temp is a temporary Vec3)
        Vector3f.add(temp, j.position, temp);  // add the joint position
        temp.scale(w.bias);                    // scale by this weight's bias
        x += temp.x;
        y += temp.y;
        z += temp.z;
    }
    vertexData[mesh].putFloat(x).putFloat(y).putFloat(z); // stage the skinned position for upload to OpenGL
}
[/CODE]
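To make that concrete, here's a minimal sketch of how the ported vertex shader might look, assuming the per-frame joints are stored as two RGBA32F texels per joint (position, then orientation quaternion) in a buffer texture. All names (uJointTex, uFrame0, inWeight0 and so on) are hypothetical; also note that GLSL's mix() is a plain lerp, not a slerp, so the quaternion is lerped and renormalized (nlerp) here:

[CODE]
#version 330
// A sketch only -- names and data layout are assumptions, not tested code.

uniform samplerBuffer uJointTex; // 2 texels per joint: [pos.xyz, pad], [quat.xyzw]
uniform int uFrame0;   // texel offset of the keyframe before the current time
uniform int uFrame1;   // texel offset of the keyframe after the current time
uniform float uLerp;   // position between the two keyframes, in [0, 1]
uniform mat4 uMvp;

in vec4 inJoints;      // up to 4 joint indices per vertex
in vec4 inWeight0;     // xyz = weight position, w = bias (likewise below)
in vec4 inWeight1;
in vec4 inWeight2;
in vec4 inWeight3;

// Rotate v by the unit quaternion q.
vec3 rotate(vec4 q, vec3 v) {
    return v + 2.0 * cross(q.xyz, cross(q.xyz, v) + q.w * v);
}

vec3 skinWeight(float joint, vec4 weight) {
    int j = int(joint) * 2; // two texels per joint
    // Interpolate the joint position between the two keyframes.
    vec3 p = mix(texelFetch(uJointTex, uFrame0 + j).xyz,
                 texelFetch(uJointTex, uFrame1 + j).xyz, uLerp);
    // nlerp the orientation; assumes neighbouring keyframes were stored in the
    // same hemisphere (dot(q0, q1) >= 0) so the short path is taken.
    vec4 q = normalize(mix(texelFetch(uJointTex, uFrame0 + j + 1),
                           texelFetch(uJointTex, uFrame1 + j + 1), uLerp));
    return (p + rotate(q, weight.xyz)) * weight.w;
}

void main() {
    vec3 pos = skinWeight(inJoints.x, inWeight0)
             + skinWeight(inJoints.y, inWeight1)
             + skinWeight(inJoints.z, inWeight2)
             + skinWeight(inJoints.w, inWeight3);
    gl_Position = uMvp * vec4(pos, 1.0);
}
[/CODE]
For instancing, uFrame0/uFrame1/uLerp would presumably become per-instance attributes rather than uniforms.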

I'm pretty much a skinning n00b, but these are my thoughts on it. The main problem is the amount of static data needed per vertex (4 x vec3 per vertex for the weight positions), but if that cost is acceptable I strongly suspect that this will perform better than doing the interpolation on the CPU, at least in my case.

I think it's better to bind a texture containing already-interpolated matrices. Interpolate on the CPU using an appropriate structure (most likely 3D vectors for translation and scaling, and a quaternion for rotation).

With each vertex, pass your bone indices and weights as attributes (you probably want to cap them at a reasonable number; I think most people use a maximum of 4 weights per vertex), and use the indices as look-up values into the texture, as in the sketch below.

I haven't done GPU skinning yet, but I plan to pretty soon, and I think that's how I'm going to do it. It sounds reasonable for both the CPU and the GPU, with each doing the appropriate task.
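For what it's worth, the look-up side of that could be a sketch like the one below, assuming each bone's interpolated matrix occupies three RGBA32F texels (the rows of a 3x4 matrix) in a buffer texture; uBoneTex and fetchBone are made-up names:

[CODE]
uniform samplerBuffer uBoneTex; // 3 texels per bone: the rows of its 3x4 matrix

mat4 fetchBone(int bone) {
    vec4 r0 = texelFetch(uBoneTex, bone * 3 + 0);
    vec4 r1 = texelFetch(uBoneTex, bone * 3 + 1);
    vec4 r2 = texelFetch(uBoneTex, bone * 3 + 2);
    // The texels hold rows, so rebuild a column-major mat4 with an
    // implicit (0, 0, 0, 1) bottom row.
    return mat4(vec4(r0.x, r1.x, r2.x, 0.0),
                vec4(r0.y, r1.y, r2.y, 0.0),
                vec4(r0.z, r1.z, r2.z, 0.0),
                vec4(r0.w, r1.w, r2.w, 1.0));
}
[/CODE]
Each vertex would then blend the (up to four) matrices selected by its bone indices, weighted by its weights.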

I realize that the GPU would have to interpolate 1 to 4 bones per vertex. With an average of 2.5 weights per vertex and around 1,000 vertices, that's about 2,500 bone interpolations per instance. A CPU implementation would only have to process each bone once, which in the case of my test model means only 33 interpolations per instance.

In the end I think I'll just implement both to see if, and by how much, the GPU version is slower...

EDIT: Oh, and the function in my first post does not contain any frame interpolation. Doh!

Things that don't change: weights and bone indices. So it's pretty obvious to keep those on the GPU. Also keep the initial (bind-pose) vertex positions there. Take your for loop and put it on the GPU. For each object, compute the current frame's matrices and send all of those bones down for that object. If two instances of the same character model are used, one running and one walking, then you send the walk bone matrices for the one and the run bone matrices for the other.
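As a sketch of that approach, the vertex shader boils down to a weighted blend of per-object bone matrices uploaded as a uniform array (the array size of 64 and all names here are assumptions):

[CODE]
#version 330
uniform mat4 uBones[64]; // this object's current-frame bone matrices, uploaded per draw
uniform mat4 uMvp;

in vec3 inPosition; // bind-pose position, kept on the GPU
in vec4 inIndices;  // bone indices, kept on the GPU
in vec4 inWeights;  // bone weights, kept on the GPU

void main() {
    // Blend the four influencing bone matrices by their weights,
    // then transform the bind-pose position once.
    mat4 skin = inWeights.x * uBones[int(inIndices.x)]
              + inWeights.y * uBones[int(inIndices.y)]
              + inWeights.z * uBones[int(inIndices.z)]
              + inWeights.w * uBones[int(inIndices.w)];
    gl_Position = uMvp * (skin * vec4(inPosition, 1.0));
}
[/CODE]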

[quote name='theagentd' timestamp='1327020506' post='4904419']
I realize that the GPU would have to interpolate 1 to 4 bones per vertex. With an average of 2.5 weights per vertex and around 1,000 vertices, that's about 2,500 bone interpolations per instance. A CPU implementation would only have to process each bone once, which in the case of my test model means only 33 interpolations per instance.

In the end I think I'll just implement both to see if, and by how much, the GPU version is slower...

EDIT: Oh, and the function in my first post does not contain any frame interpolation. Doh!
[/quote]

Having just done that, I can tell you that by running all four weights per vertex on the GPU (33 bones, you say? You're animating Bob, aren't you? :) ), you'll be looking at around a 30-100-fold [b]increase[/b] in speed compared to smart per-weight skinning on the CPU. The relevant timings for me in debug mode are ~1-1.5 milliseconds for skinning on the CPU and 0.025-0.1 milliseconds on the GPU (this is on a three-year-old mobile GeForce 240GT). This assumes bones are animated/interpolated on the CPU and passed to the shader via a uniform.

[quote name='irreversible' timestamp='1327023957' post='4904431']
[quote name='theagentd' timestamp='1327020506' post='4904419']
I realize that the GPU would have to interpolate 1 to 4 bones per vertex. With an average of 2.5 weights per vertex and around 1,000 vertices, that's about 2,500 bone interpolations per instance. A CPU implementation would only have to process each bone once, which in the case of my test model means only 33 interpolations per instance.

In the end I think I'll just implement both to see if, and by how much, the GPU version is slower...

EDIT: Oh, and the function in my first post does not contain any frame interpolation. Doh!
[/quote]

Having just done that, I can tell you that by running all four weights per vertex on the GPU (33 bones, you say? You're animating Bob, aren't you? :) ), you'll be looking at around a 30-100-fold [b]increase[/b] in speed compared to smart per-weight skinning on the CPU. The relevant timings for me in debug mode are ~1-1.5 milliseconds for skinning on the CPU and 0.025-0.1 milliseconds on the GPU (this is on a three-year-old mobile GeForce 240GT). This assumes bones are animated/interpolated on the CPU and passed to the shader via a uniform.
[/quote]
Yes, of course the GPU is faster than the CPU, but the real fight is between the hybrid CPU/GPU solution and a pure GPU solution. You're describing the "hybrid" one, where the bone interpolation is done on the CPU and all per-vertex calculations are done on the GPU. In my first post I suggested offloading the interpolation to the GPU too, which would save some CPU cycles and bandwidth in exchange for a possibly big GPU hit. You're comparing the CPU/GPU solution to a pure CPU solution. And yes, I hope to get a few Bobs swinging their lanterns around soon. =D

[quote name='theagentd' timestamp='1327025055' post='4904438']
In my first post I suggested offloading the interpolation to the GPU too, which would save some CPU cycles and bandwidth in exchange for a possibly big GPU hit. You're comparing the CPU/GPU solution to a pure CPU solution. And yes, I hope to get a few Bobs swinging their lanterns around soon. =D
[/quote]

Don't bother. Precompute your skeleton at whatever framerate your target is (say 40-50 FPS) for all animations, and round to the closest precomputed frame when rendering. You'll end up with zero computation time, and the memory footprint is negligible.
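To illustrate the rounding, here's a tiny GLSL sketch (the uniform names are made up, and the same computation could just as well be done per object on the CPU):

[CODE]
uniform float uBakedFps; // framerate the skeletons were precomputed at
uniform int uFrameCount; // number of baked frames in this animation

// Pick the precomputed frame closest to the current animation time (seconds).
int closestFrame(float animTime) {
    return int(animTime * uBakedFps + 0.5) % uFrameCount;
}
[/CODE]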

It's also good to note that not much animated stuff is actually rendered in any game other than an RTS. In that case you could worry about other things.

[quote name='irreversible' timestamp='1327026591' post='4904441']
[quote name='theagentd' timestamp='1327025055' post='4904438']
In my first post I suggested offloading the interpolation to the GPU too, which would save some CPU cycles and bandwidth in exchange for a possibly big GPU hit. You're comparing the CPU/GPU solution to a pure CPU solution. And yes, I hope to get a few Bobs swinging their lanterns around soon. =D
[/quote]

Don't bother. Precompute your skeleton at whatever framerate your target is (say 40-50 FPS) for all animations, and round to the closest precomputed frame when rendering. You'll end up with zero computation time, and the memory footprint is negligible.
[/quote]
But what if I want a super slow-motion effect that effectively drops the game speed to 1/10th? That would force me to precompute waaay too many frames for each animation. Also, I'd prefer the accuracy of real-time bone interpolation. It's a good suggestion though. Maybe precomputing a few extra frames to triple the framerate of each animation, and then using not slerp but a simple lerp-plus-normalize (nlerp) on the quaternion in the vertex shader, would produce accurate enough motion. You could even precompute with cubic interpolation, too.

Each frame requires about a kilobyte of data. Each animation may be shared between different meshes if they have the same skeleton layout, so the number of animations won't be that large. Let's say 20 units with different animation sets, each with 10 different 2-second animations. That'd be 20 x 10 x 2 x 24 frames of raw data, where each frame is 33 bones x 7 floats x 4 bytes per float, so 20 x 10 x 2 x 24 x 33 x 7 x 4 bytes = about 8.5MB of data. Even multiplying the number of frames by 4 only gives around 34MB, so precomputing is a very feasible solution. Nice idea!


[quote name='dpadam450' timestamp='1327027864' post='4904445']
It's also good to note that not much animated stuff is actually rendered in any game other than an RTS. In that case you could worry about other things.
[/quote]
I don't really understand what you're saying...? ._.

[quote name='theagentd' timestamp='1327032459' post='4904460']
But what if I want a super slow-motion effect that effectively drops the game speed to 1/10th? That would force me to precompute waaay too many frames for each animation. Also, I'd prefer the accuracy of real-time bone interpolation. It's a good suggestion though. Maybe precomputing a few extra frames to triple the framerate of each animation, and then using not slerp but a simple lerp-plus-normalize (nlerp) on the quaternion in the vertex shader, would produce accurate enough motion. You could even precompute with cubic interpolation, too.
[/quote]

Question: is the "what if" actually a realistic expectation? Will you be implementing bullet time?

If the answer is yes, then you could do some dynamic branching: if the required animation speed drops below a certain threshold (say, the precomputed framerate), interpolate the skeleton for that model on the fly. Note that bone interpolation is cheap for an average model; 30-60 bones isn't that big of a deal, really, and it's okay to do it when needed. The real optimization you should be looking into here is the actual number of updates per second. If you're running hundreds of models at 100 FPS, then updating them every frame is going to affect said FPS. If you update each skeleton every other frame, you won't be compromising any visual quality and you'll have effectively halved the workload. What is really expensive here is the skinning, which is done on the GPU. Note that the Bob model actually has a relatively low poly count (around 800 vertices); take a proper model with 10k vertices and 50 bones and you'll be looking at roughly the same animation load on the CPU, but a skinning phase that is some 15 times more expensive on the GPU. Vertex counts ramp up as time goes on; there's really no need to ramp up the number of bones in most cases.

[quote name='theagentd' timestamp='1327032459' post='4904460']
Each frame requires about a kilobyte of data. Each animation may be shared between different meshes if they have the same skeleton layout, so the number of animations won't be that large. Let's say 20 units with different animation sets, each with 10 different 2-second animations. That'd be 20 x 10 x 2 x 24 frames of raw data, where each frame is 33 bones x 7 floats x 4 bytes per float, so 20 x 10 x 2 x 24 x 33 x 7 x 4 bytes = about 8.5MB of data. Even multiplying the number of frames by 4 only gives around 34MB, so precomputing is a very feasible solution. Nice idea!
[/quote]

It's likelier that you'll want to store your animation as a list of matrices, not 7 floats per bone, for faster access. Also, think about it realistically: if your model comes at 24 FPS, then multiplying that by 2 will be enough for any human. 100 keyframes per second would give you smooth slow motion, which you very likely won't be needing.

[quote name='dpadam450' timestamp='1327027864' post='4904445']
It's also good to note that not much animated stuff is actually rendered in any game other than an RTS. In that case you could worry about other things.
[/quote]
[quote name='theagentd' timestamp='1327032459' post='4904460']
I don't really understand what you're saying...? ._.
[/quote]

What dpadam450 means is that the only genre in which you will realistically encounter a large number of models that need to be animated individually is an RTS game. In the general case you'll have 10 models tops running around at one time, which is a breeze to animate on the CPU.

Regarding the static vertex data, most of the implementations I've seen use a UByte*4 for the associated bone indices and a UByte*4 for the weights of those indices. This limits each vertex to being associated with at most 4 bones; if a vertex is associated with fewer bones, it still performs the same math as if it had 4, but uses weights of 0.0 for the extra bones.
I've usually seen the dynamic/animated bone data represented as a 4x3 (or 3x4) matrix containing the rotation/scale/translation transform relative to the bind pose.
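As a small sketch of the shader side of that layout (attribute names are made up here; the key detail is that the weights are bound as normalized bytes and the indices as un-normalized ones):

[CODE]
// Bound with glVertexAttribPointer(loc, 4, GL_UNSIGNED_BYTE, GL_FALSE, ...),
// so each byte arrives as a float in 0..255, ready to be cast for indexing.
in vec4 inBoneIndices;

// Bound with glVertexAttribPointer(loc, 4, GL_UNSIGNED_BYTE, GL_TRUE, ...),
// so each byte is normalized to 0.0..1.0; padded weights of 0.0 simply
// contribute nothing to the blended transform.
in vec4 inBoneWeights;
[/CODE]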
[quote name='irreversible' timestamp='1327056930' post='4904544']Also, think about it realistically: if your model comes at 24 FPS, then multiplying that by 2 will be enough for any human. 100 keyframes per second would give you smooth slow motion, which you very likely won't be needing.[/quote]Where does the magic number 24 (or 48) come from? ;P
[quote]What dpadam450 means is that the only genre in which you will realistically encounter a large number of models that need to be animated individually is an RTS game. In the general case you'll have 10 models tops running around at one time, which is a breeze to animate on the CPU.
[/quote]Modern FPS games often have ~50 characters on-screen at once ;)
I'm doing a sports game at the moment with 30 characters, each with 60 bones, who all have multiple different animation sources [i]blended together[/i] unpredictably and [url="http://en.wikipedia.org/wiki/Inverse_kinematics"]IK[/url] applied on top; the whole skeletal update part is still fairly cheap and only takes a few milliseconds.

I'd personally just implement it in a way that's easily understood first ([i]especially if I were fairly new to skinned animation, which, admittedly, I am[/i]) and work on writing a more optimal version after getting the basic one working, [i]if it actually turns out to perform badly[/i].

@Irreversible
To be honest, I probably won't be implementing any bullet-time effects, but I will have an adjustable game speed, which could drop to a very low value. I still think doing the interpolation in real time is more accurate: even if the animation framerate matches the game FPS, it would still be more accurate to interpolate for the exact time. Maybe it really is an unnoticeable difference in 99.999% of all cases. I might not be able to afford the additional cost of lots of slerps each frame even with multithreaded joint interpolation, so getting rid of it and just keeping precomputed bone matrices in GPU memory might be the best choice anyway. Memory is something I can afford to use more of, so precomputing to about 60-120 frames per second should give enough smoothness in all possible cases. Now I know what the animation quality setting in games does... >_>

I am actually making a real-time strategy game, so I might have about 100 units on screen at the same time.


@Hodgman
I've read up quite a lot on GPU skinning and I have more than enough experience with shaders to implement this. Storing the joint translation and orientation in a matrix is probably the best idea, as it eliminates the weight positions that would otherwise have to be stored per vertex. I'm loading MD5 meshes and animations, and the maximum number of weights per vertex that format supports is 4, so I'll just stick with that. It also doesn't support joint scales, which simplifies things further. If using MD5 is a bad idea for some reason, please stop me now!!!
24 frames per second comes from the specific model I'm animating.


In other news, I just got my software skinning working, so Bob is (happily?) waving his lantern around. FPS dropped from 83 to 14 due to the skinning being done on the CPU (well, with 1,000 instances, though xD). Next I'll move the skinning to a vertex shader but keep the joint interpolation on the CPU, which is the standard approach, right? Lastly, I'll try a pure GPU solution with precomputed joints stored in a texture.

EDIT: My software implementation is obviously bottlenecked by the skinning. Skinning takes about 65% of the frame time at the moment, possibly a lot more if you count methods that are shared with other parts of the game.

[quote name='Hodgman' timestamp='1327059015' post='4904550']
Regarding the static vertex data, most of the implementations I've seen use a UByte*4 for the associated bone indices and a UByte*4 for the weights of those indices. This limits each vertex to being associated with at most 4 bones; if a vertex is associated with fewer bones, it still performs the same math as if it had 4, but uses weights of 0.0 for the extra bones.
I've usually seen the dynamic/animated bone data represented as a 4x3 (or 3x4) matrix containing the rotation/scale/translation transform relative to the bind pose.
[/quote]

Incidentally, I don't have this working yet, but I'm packing indices into float vectors at a ratio of 3:1 while maintaining 8-bit precision (I haven't done the actual math as to what the maximum practical precision is, but the packing is the same as RGB2Float). This limits the model to 255 bones, which should be enough in even the most fringe cases, and it enables more concurrently influencing bones without increasing storage. As for packing the weights into byte values, that results in a precision of about 0.0039 (1/255). I'm actually fairly curious as to whether this is enough (if it is, I'll definitely want to pack my weights as well). Also, I'm limiting myself to 4 concurrent data streams since I'm using transform feedback to do the skinning; the largest vector type that can be passed through TF is a vec4, which limits the number of weights that can be blended to 4 for now.
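For illustration, here's one way such a 3:1 unpack could look in GLSL. This assumes a hypothetical CPU-side encoding of f = i0 + i1/256.0 + i2/65536.0; note that a 32-bit float's 24-bit mantissa holds exactly three bytes, so this sits right at the precision limit and a real implementation would need careful rounding guards:

[CODE]
// Unpack three 8-bit bone indices from one float (sketch only; assumes the
// hypothetical encoding f = i0 + i1/256.0 + i2/65536.0 on the CPU side).
vec3 unpackIndices(float f) {
    float i0 = floor(f);
    float r  = (f - i0) * 256.0;
    float i1 = floor(r);
    float i2 = floor((r - i1) * 256.0 + 0.5); // round to absorb float error
    return vec3(i0, i1, i2);
}
[/CODE]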

[quote name='Hodgman' timestamp='1327059015' post='4904550']
[quote name='irreversible' timestamp='1327056930' post='4904544']Also, think about it realistically: if your model comes at 24 FPS, then multiplying that by 2 will be enough for any human. 100 keyframes per second would give you smooth slow motion, which you very likely won't be needing.[/quote]Where does the magic number 24 (or 48) come from? ;P
[/quote]

Oh, that's from the Bob model discussed above :)
[quote]Modern FPS games often have ~50 characters on-screen at once ;)
I'm doing a sports game at the moment with 30 characters, each with 60 bones, who all have multiple different animation sources [i]blended together[/i] unpredictably and [url="http://en.wikipedia.org/wiki/Inverse_kinematics"]IK[/url] applied on top; the whole skeletal update part is still fairly cheap and only takes a few milliseconds.
[/quote]

A fair point, but it really boils down to what the game is about. I'm personally targeting a non-kinematic solution (which, admittedly, raises the question of why one would need skeletal animation at all).

What I'm saying is two things, which a moderator on gamedev apparently doesn't understand, so they rate it down.

Don't over-optimize something that doesn't need it. Whatever method you go with will probably be fine unless you are drawing a really massive, or even moderate, amount of animated stuff. Unless you have an artist to make 50 models for an FPS (which I find way too high a number anyway), don't worry too much about a bottleneck that may or may not exist for your specific game. In most cases you just take all the bones on the CPU, blend between the previous keyframe and the next one into new bones, and send those down to the GPU.

[quote]I'm personally targeting a non-kinematic solution (which, admittedly, raises the question of why one would need skeletal animation at all).[/quote]
Kinematics is movement, so you're probably thinking of inverse kinematics, i.e. the inverse of movement. If you have an animated character, it has bones created by an artist in order to make the frames of animation. Any animated 3D model has a skeleton.
