OpenGL problem writing complex vertex shader

5 posts in this topic

Greetings, everyone.

Recently I've been interested in Warcraft3's model system.

I downloaded the War3ModelEditor source code (from: http://home.magosx.com/index.php?topic=6.0), read it, and wrote a program which can render Warcraft3 models using OpenGL ES.

When I run this code on an Android phone, it looks good, but when there are more than 5 models on the screen, the FPS becomes very low.

Currently I do all the bone animation (matrix calculation and vertex position calculation) on the CPU side.

I think it might be faster to do all this work on the GPU side.

But I just don't know how to do it.

The Warcraft3 vertex position calculation is complex for me.

Let me explain a little more.

In a Warcraft3 model, each vertex is linked to one or more bones.

Here is how the War3ModelEditor calculates a vertex's position:

step1. for each bone[i], calculate matrix_list[i]
step2. for each vertex
           position = (matrix_list[vertex_bone[0]] * v
                     + matrix_list[vertex_bone[1]] * v
                     + ...
                     + matrix_list[vertex_bone[n-1]] * v) / n

note: n is the length of 'vertex_bone', each vertex may have a different 'vertex_bone'.
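
As a reference for the CPU-side version of step 2, here is a minimal Python sketch of the same averaging. The function names `mat_vec`, `skin_vertex` and `translation`, the row-major matrix layout, and the sample data are assumptions made for illustration; they are not taken from War3ModelEditor.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def skin_vertex(matrix_list, vertex_bone, v):
    """Average matrix_list[b] * v over every bone b linked to the vertex."""
    total = [0.0, 0.0, 0.0, 0.0]
    for b in vertex_bone:
        p = mat_vec(matrix_list[b], v)
        total = [t + x for t, x in zip(total, p)]
    n = len(vertex_bone)
    return [t / n for t in total]


def translation(tx, ty, tz):
    """Hypothetical helper: a 4x4 translation matrix (row-major)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


# Two made-up bone matrices and one vertex linked to both of them.
matrix_list = [translation(2, 0, 0), translation(0, 4, 0)]
print(skin_vertex(matrix_list, [0, 1], [1.0, 1.0, 1.0, 1.0]))
# prints [2.0, 3.0, 1.0, 1.0]
```

Because the input has w = 1 and the result is divided by n, the output keeps w = 1 for ordinary bone matrices.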


Actually, several vertices can share the same 'vertex_bone' array,

while several other vertices share another 'vertex_bone' array.

For example, a model with 500 vertices may have only 35 different 'vertex_bone' arrays.

But I don't know how to make use of this to optimize performance.
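
One possible way to use the shared 'vertex_bone' arrays, sketched in Python under the assumption that the blend is the plain unweighted average from the pseudocode: because the average is linear, the bone matrices of each distinct array can be averaged once per frame, and each vertex then needs only one matrix multiply and one group index. The names `average_matrices`, `build_group_matrices` and the sample data are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def average_matrices(matrix_list, vertex_bone):
    """Element-wise average of the bone matrices named in vertex_bone."""
    n = len(vertex_bone)
    return [[sum(matrix_list[b][r][c] for b in vertex_bone) / n
             for c in range(4)]
            for r in range(4)]


def build_group_matrices(matrix_list, groups):
    """One averaged matrix per distinct 'vertex_bone' array."""
    return [average_matrices(matrix_list, g) for g in groups]


# Made-up data: two bone matrices (translations) and two distinct groups.
matrix_list = [
    [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    [[1, 0, 0, 0], [0, 1, 0, 4], [0, 0, 1, 0], [0, 0, 0, 1]],
]
groups = [[0, 1], [1]]
group_matrices = build_group_matrices(matrix_list, groups)

# A vertex in group 0 is transformed with a single multiply.
print(mat_vec(group_matrices[0], [1.0, 1.0, 1.0, 1.0]))
# prints [2.0, 3.0, 1.0, 1.0]
```

With this layout a shader would index one matrix per vertex instead of looping over bones, at the cost of uploading one matrix per group (35 in the example above) rather than one per bone.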

Step 1 may be easy. Since a typical Warcraft3 model has fewer than 30 bones, we can do this step on the CPU side without much of a performance hit.

But step 2 is quite complex.

If I write a vertex shader (GLSL) it will be something like this:

uniform mat4 u_matrix_list[50]; /* there might be more ?? */
attribute float a_n;
attribute float a_vertex_bone[4]; /* there might be more ?? */
attribute vec4 a_position;
void main() {
    float i;
    vec4 p = vec4(0.0, 0.0, 0.0, 1.0);
    for (i = 0; i < a_n; ++i) {
        p += u_matrix_list[int(a_vertex_bone[int(i)])] * a_position;
    }
    gl_Position = p / float(a_n);
}


There are some problems.

1. When I compile the vertex shader above (on my laptop, rather than an Android phone), it reports 'success' with the warning message 'OpenGL does not allow attributes of type float[4]'.

And sometimes (when I change the order of the 3 attributes) it causes my program to crash, with the message 'The NVIDIA OpenGL driver lost connection with the display driver due to exceeding the Windows Time-Out limit and is unable to continue.'

2. The book <OpenGL ES 2.0 Programming Guide>, page 83, says that 'OpenGL ES only mandates that array indexing be supported by constant integral expressions (there is an exception to this, which is the indexing of uniform variables in vertex shaders that is discussed in Chapter 8)', so the expression 'a_vertex_bone[int(i)]' might not work on some OpenGL ES hardware.

Actually I've never written such a complex(?) shader before.

Could anyone give me some advice?

Thank you.


You're on the right track!  A uniform array of bones, and vertex attributes that index into said array is the common way to handle this.

For your specific problem, I have a solution that should work but will limit you to 4 bones per vertex (I can't imagine this is a problem for WC3 models, but please let me know if it is.)

You could try representing your bone weights as a vec4 instead of an array in the attribute. From there, you could add a second vec4 attribute representing how many bones affect a vertex (such as [1.0, 1.0, 0.0, 0.0] for two bones).

Finally, if you take the dot product of this vector with itself, you conveniently get the number of bones out! (If we call the vector above v, then dot(v,v) = (1.0*1.0 + 1.0*1.0 + 0.0*0.0 + 0.0*0.0) = 2.0.)

This would change your attribs to:

attribute vec4 a_position;
attribute vec4 bone_weights;


You would also remove the for loop above, and just say

vec4 p = vec4(0,0,0,1);
gl_Position = p / dot(bone_mask,bone_mask);

Hope this helps!
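
The CPU side of this idea could look roughly like the Python sketch below: each variable-length bone list is padded into a fixed 4-slot index vector plus a 0/1 mask. The function name `pack_bones` and the choice to drop any bones beyond the fourth are assumptions for illustration, not something the thread prescribes.

```python
def pack_bones(vertex_bone, max_bones=4):
    """Pad a variable-length bone list into fixed-size attribute data.

    Returns (indices, mask). Unused slots get index 0.0 and mask 0.0,
    so they contribute nothing when each term is multiplied by its mask.
    Bones beyond max_bones are dropped (an assumption of this sketch).
    """
    used = list(vertex_bone[:max_bones])
    pad = max_bones - len(used)
    indices = [float(b) for b in used] + [0.0] * pad
    mask = [1.0] * len(used) + [0.0] * pad
    return indices, mask


indices, mask = pack_bones([7, 3])
print(indices)                     # prints [7.0, 3.0, 0.0, 0.0]
print(mask)                        # prints [1.0, 1.0, 0.0, 0.0]
print(sum(m * m for m in mask))    # prints 2.0, the bone count
```

The two returned lists map directly onto two vec4 vertex attributes, which avoids both the float[4] attribute and the variable-length loop.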


Koehler, thank you very much for your reply. It helps me a lot.

Especially the 'dot product', that is wonderful.

But let me point this out.

The code "vec4 p = vec4(0,0,0,1);" you wrote should actually be "vec4 p = vec4(0,0,0,0);", or the transformation will not be correct.

Based on your idea, I've changed my source code.

I'm not very familiar with OpenGL version 2.0 and above. Fortunately I got it working :).

And there are still some issues that need to be thought about.

Let me put my shader source code down here:

(Yes, you can see there's something like gl_TextureMatrix and gl_ModelViewProjectionMatrix. That's because the first version of my program was written on an old PC which only supports OpenGL 1.4. I'll modify these when necessary.)

/* vertex shader */
uniform mat4 u_matrix_list[202];
attribute vec3 a_position;
attribute vec2 a_texcoord;
attribute vec4 a_mat_indices; /* bone indices stored as floats */
attribute vec4 a_mat_weights; /* 1.0 for each used slot, 0.0 otherwise */
varying vec2 v_texcoord;
void main() {
    v_texcoord = (gl_TextureMatrix[0] * vec4(a_texcoord, 0.0, 1.0)).xy;
    vec4 p0 = vec4(a_position, 1.0);
    vec4 p = vec4(0.0, 0.0, 0.0, 0.0);
    /* GLSL uses constructor syntax int(x); the C-style cast (int)x is not standard */
    p += (u_matrix_list[int(a_mat_indices[0])] * p0) * a_mat_weights[0];
    p += (u_matrix_list[int(a_mat_indices[1])] * p0) * a_mat_weights[1];
    p += (u_matrix_list[int(a_mat_indices[2])] * p0) * a_mat_weights[2];
    p += (u_matrix_list[int(a_mat_indices[3])] * p0) * a_mat_weights[3];
    /* dot(w, w) is the number of used slots because each weight is 0.0 or 1.0 */
    p /= dot(a_mat_weights, a_mat_weights);
    gl_Position = gl_ModelViewProjectionMatrix * p;
}

/* fragment shader */
uniform sampler2D tex;
uniform vec4 u_color;
varying vec2 v_texcoord;
void main() {
    gl_FragColor = u_color * texture2D(tex, v_texcoord);
}


Issues:

1. I wrote "uniform mat4 u_matrix_list[202];", which is a very large array for a GPU.

I found that many of Warcraft3's unit models have fewer than 100 bones. For example, a water elemental has 69 bones, and a footman has 49 bones.

But the building models have many more bones. When I used the model 'AncientOfLore.mdx' for testing, I found that it has 202 bones, so I declared such a large array. According to the MDX format, there can be up to 256 nodes (since the node's ID is a BYTE). But when I wrote "uniform mat4 u_matrix_list[256];", glLinkProgram failed with the error message "error C6007: Constant register limit exceeded; more than 1024 constant registers needed to compiled program".

I hear that storing a mat4 as 3 vec4 may save some space, but that may not be enough. OpenGL ES 2.0 only guarantees 128 vec4 vertex uniforms (glGetIntegerv with GL_MAX_VERTEX_UNIFORM_VECTORS), so we can only use 128 / 3 = 42 bones or fewer?
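
The arithmetic can be checked with a small Python sketch. It also shows the 3-row packing, assuming row-major affine bone matrices whose last row is (0, 0, 0, 1). The names `max_bones`, `pack_affine` and `transform_point`, and the `reserved_vectors` value of 8, are hypothetical. Note that 256 mat4 uniforms need 256 * 4 = 1024 vec4 registers on their own, which is consistent with the link error once any other uniform is added.

```python
def max_bones(uniform_vectors, vec4_per_bone, reserved_vectors=0):
    """How many bone matrices fit into a vec4 uniform budget."""
    return (uniform_vectors - reserved_vectors) // vec4_per_bone


def pack_affine(m):
    """Keep rows 0..2 of a row-major 4x4 affine matrix (3 vec4 per bone)."""
    return [list(m[0]), list(m[1]), list(m[2])]


def transform_point(rows, p):
    """Apply the packed rows to a point (x, y, z), treating w as 1."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(r[c] * v[c] for c in range(4)) for r in rows]


print(max_bones(128, 4))                       # prints 32 (full mat4)
print(max_bones(128, 3))                       # prints 42 (3 vec4 per bone)
print(max_bones(128, 3, reserved_vectors=8))   # prints 40

# A made-up translation by (2, 0, 0), packed and applied to a point.
rows = pack_affine([[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
print(transform_point(rows, [1.0, 1.0, 1.0]))  # prints [3.0, 1.0, 1.0]
```

The budget is shared with every other vertex uniform in the program, so the usable bone count is somewhat below 42 in practice.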

Or we can try to use a texture to store some more data. The book <OpenGL ES 2.0 Programming Guide> says that "Samplers in a vertex shader are optional". The POWERVR SGX seems to support them, but we need some more information to decide whether or not to use this approach.

2. Yes, the <Warcraft III Art Tools Documention.pdf> says that "Up to four bones can influence one vertex.", so we can use a vec4 attribute to simulate a float[4] array.

But I found there are some exceptions. For example, a water elemental has some vertices that are influenced by up to 6 bones. This is not very critical, because we can add 2 more attributes to fix it.

In my test I just use the first 4 bones and ignore the last 2, and it looks fine without any obvious problem. So let's just ignore it for now :)


Here are some snapshots of my test program.

I'd like to share my happy feeling with you. Thank you again.

[attachment=16979:testGL.01.png]

[attachment=16980:testGL.02.png]

[attachment=16981:testGL.03.png]

[attachment=16982:testGL.04.png]


Glad to see you caught my mistake. I was calling "indices" weights, also. Clearly I didn't test that code :/

Those results look good! I am surprised that the ancients have so many bones. If I had to guess, WC3 probably did software skinning, so it didn't matter?

As an option, maybe you could look through the model and split the mesh based on the bone indices accessed (half for indices < 110 or something, half for >= 110) and do two draw calls for the big guys. This would work best if the pieces don't rely on the root bones too much.

Alternatively you could split the model and duplicate the most shared bones into each of the two smaller models' bone arrays, changing the indices on your vertex data appropriately. That still might let you cut down the number enough to fit into your uniform space.

Edited by Koehler

Yes, 'AncientOfLore.mdx' has many bones. When I found this for the first time, I was surprised too.

Once again, Warcraft3's models do not obey the rules laid out in the <Warcraft III Art Tools Documention.pdf>.

According to the documentation, a building should have at most 15 bones, and a really big unit should have at most 30 bones.

By the way, the OpenGL 2.0 spec was released in 2004. Warcraft3 was released before that. So I think Warcraft3 does not use a shader to do the bone animation.

I've noticed that not all the bones are used by the mesh. Some of the bones are used for attaching another model, or by a particle emitter, etc.

For example, when an AncientOfLore tree is badly damaged, some places on the tree body will be on fire. Each place uses a particle emitter to draw the fire, and a particle emitter needs a bone. Simply speaking, 6 places of fire will use 6 bones.

There is a concept named "geoset" in Warcraft3's models. A geoset contains data like vertex positions, texture coords, normals, and the indices of the bone matrices. One model may have one or more geosets.

Before today I thought that each vertex in each geoset could be linked to any bone of the model. When I saw the words "split the mesh", I guessed we could make use of the geosets directly, rather than splitting the mesh with an algorithm.

So I did a simple test.

The 'AncientOfLore.mdx' model has 12 geosets, and in the animation sequence "stand work alternate" 6 of them are visible (the documentation says that one model should have at most 5 visible geosets!). The numbers of bones used by these geosets are: 27, 62, 3, 3, 8, 2. All of these numbers are much smaller than 202.
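
To draw a geoset with only the bones it actually uses, the model-wide bone indices have to be remapped to a compact per-geoset list. A minimal Python sketch of that remapping follows; the name `build_palette` and the sample indices are made up for illustration.

```python
def build_palette(vertex_bones):
    """Collect the bones a geoset uses and remap them to local indices.

    vertex_bones: one list of model-wide bone indices per vertex.
    Returns (palette, local), where palette[i] is the model-wide index
    of local bone i, and local holds the remapped per-vertex lists.
    """
    palette = sorted({b for bones in vertex_bones for b in bones})
    remap = {b: i for i, b in enumerate(palette)}
    local = [[remap[b] for b in bones] for bones in vertex_bones]
    return palette, local


palette, local = build_palette([[200, 5], [5], [17, 200]])
print(palette)  # prints [5, 17, 200]
print(local)    # prints [[2, 0], [0], [1, 2]]
```

Per draw call, only the matrices listed in `palette` would be uploaded, in that order, so the uniform array size depends on the geoset rather than on the whole model.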

But for OpenGL ES, 62 bones is still too many, and that geoset will need to be split into smaller parts.

So if I need to display 'AncientOfLore.mdx' on my Android phone, I have to design an algorithm to split a geoset into two or more smaller geosets.

The next step is to design and implement this algorithm. I think that will not be easy for me, but I'll try it.
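
One simple starting point for such a split is a greedy pass over the triangles, sketched below in Python. This is only an illustration under stated assumptions, not a tuned or optimal algorithm: the function name `split_by_bone_limit` and the sample data are hypothetical, and the result depends on triangle order. Bones shared between chunks simply appear in more than one chunk's bone list, which matches the "duplicate the most shared bones" suggestion above.

```python
def split_by_bone_limit(triangles, vertex_bones, max_bones):
    """Greedily group triangles so each chunk uses at most max_bones bones.

    triangles: list of (v0, v1, v2) vertex index tuples.
    vertex_bones: one list of bone indices per vertex.
    Returns a list of (triangle_list, sorted_bone_list) chunks.
    """
    chunks = []
    current_tris, current_bones = [], set()
    for tri in triangles:
        tri_bones = set()
        for v in tri:
            tri_bones.update(vertex_bones[v])
        if len(tri_bones) > max_bones:
            raise ValueError("a single triangle needs more bones than the limit")
        # Start a new chunk when adding this triangle would exceed the limit.
        if current_tris and len(current_bones | tri_bones) > max_bones:
            chunks.append((current_tris, sorted(current_bones)))
            current_tris, current_bones = [], set()
        current_tris.append(tri)
        current_bones |= tri_bones
    if current_tris:
        chunks.append((current_tris, sorted(current_bones)))
    return chunks


# Made-up geoset: 6 vertices with one bone each, 3 triangles.
vertex_bones = [[0], [1], [2], [3], [4], [5]]
triangles = [(0, 1, 2), (3, 4, 5), (0, 1, 2)]
for tris, bones in split_by_bone_limit(triangles, vertex_bones, 3):
    print(tris, bones)
```

In this example the first and third triangles use the same bones but end up in separate chunks, because the second triangle sits between them. Sorting or clustering triangles by their bone sets before the pass would likely reduce the number of chunks.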
