About StanLee

  1. StanLee

    Computing Matrices on the GPU

    The goal is the following: every frame I shoot rays from my camera into the scene. At the intersection points I create virtual cameras, which are then used to render the scene again (or only a subset of the objects). Creating a virtual camera is analogous to creating a view-projection matrix.

    The current approach: I render the scene from my camera and output the world positions and normals via multiple render targets into two textures. So I have two textures on the GPU which contain the world positions and normals of the scene as seen from the camera. To create my view-projection matrices I only need to fetch the position and the normal from these textures (I always assume the up-vector to be vec3(0, 1, 0)). At the moment I download these two textures and create the matrices on the client side, which are then sent back to the server side for further rendering.

    The models with 200K vertices are by no means absurd. My GPU runs fine at 40-60 fps in a scene with about 1 million vertices and goes even higher when not all vertices are in the camera frustum.
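    As a concrete illustration of the matrix construction described above: a view matrix for such a virtual camera can be assembled in GLSL directly from a sampled position and normal. The helper below is a hypothetical sketch (not the author's code), assuming up = vec3(0, 1, 0) as stated in the post; it degenerates when the normal is (anti)parallel to the up-vector, and it reproduces the column-major layout that glm::lookAt produces.

    ```glsl
    // Hypothetical sketch: build a view matrix from a world-space position
    // and normal, with the fixed up-vector vec3(0, 1, 0) from the post.
    mat4 viewFromPositionNormal(vec3 pos, vec3 normal)
    {
        vec3 f = normalize(normal);                        // look along the normal
        vec3 s = normalize(cross(f, vec3(0.0, 1.0, 0.0))); // right vector
        vec3 u = cross(s, f);                              // recomputed up
        // Column-major, matching glm::lookAt.
        return mat4(vec4( s.x,  u.x, -f.x, 0.0),
                    vec4( s.y,  u.y, -f.y, 0.0),
                    vec4( s.z,  u.z, -f.z, 0.0),
                    vec4(-dot(s, pos), -dot(u, pos), dot(f, pos), 1.0));
    }
    ```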
  2. StanLee

    Computing Matrices on the GPU

      I'd suggest that threading isn't even necessary here. This really reads a lot like a misguided attempt to save memory by storing just 6 floats per object rather than all 16 of a 4x4 matrix, and in this case it may very well be a more useful optimization to just burn the extra memory in exchange for more efficient ALU usage. 100 objects is, quite frankly, chickenfeed: 1996-class hardware could deal with that easily enough.

    It's not about saving memory. As I have stated, I need to create MVP matrices based on normals and positions which are stored in two different textures. Doing this on the CPU implies downloading the textures from the server side, extracting the positions and normals, and then computing the matrices, which are then sent to the server again. I am already doing this, but downloading textures every frame is too much of a performance killer.
  3. StanLee

    Computing Matrices on the GPU

    But wouldn't this mean two texture accesses (normal and position texture) plus matrix construction per vertex thread? I have models with a vertex count of 200,000 and more. It would surprise me if this were faster than my proposed method.
  4. StanLee

    Computing Matrices on the GPU

    I am using version 4.5. Unfortunately the proprietary framework I am working with does not support compute shaders yet.
  5. Hi,

    I'm in the situation that I need to create a lot of model-view-projection (MVP) matrices (around 100 and more) out of normal and position vectors which are stored in textures. I have already implemented this by downloading the two textures and creating the matrices on the client side, but as predicted this is very slow; the impact on performance is simply too big.

    How can I do this entirely on the GPU? I don't even need the matrices on the client side; they are only used later for rendering.

    My idea so far is to use an SSBO (Shader Storage Buffer Object) to store the matrices. I take every sample point on my texture (each of which corresponds to one matrix), put them into a VBO, and render that to a screen quad. This should result in one fragment shader thread per sample point. In the fragment shader I then sample the position and the normal from my textures, construct an MVP matrix and store it in the SSBO. To get the right index into the SSBO, every sample point is passed to the vertex shader together with its corresponding SSBO index.

    The question is: is this the fastest possible way to generate the matrices? Are there faster alternatives? I also thought about using 2D textures to store the matrices, but this would imply 4 texture accesses later to fetch one matrix (considering a 4-channel texture), and I don't know whether that is faster than using SSBOs.

    Regards, Stan
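    The approach described above could look roughly like the following fragment shader. This is a hedged sketch with assumed names (positionTex, normalTex, ssboIndex and sampleUV are placeholders, not the author's actual interface): one fragment per sample point reads position and normal, builds a view matrix with up = (0, 1, 0), and writes the resulting MVP into the SSBO slot passed down from the vertex shader.

    ```glsl
    #version 450
    // Sketch only: names and bindings are assumptions, not the author's code.
    layout(std430, binding = 0) buffer MatrixBuffer {
        mat4 mvps[];                 // one MVP per sample point
    };
    uniform sampler2D positionTex;   // world positions from the MRT pass
    uniform sampler2D normalTex;     // world normals from the MRT pass
    uniform mat4 projection;
    flat in int ssboIndex;           // passed through from the vertex shader
    in vec2 sampleUV;                // texel to read position/normal from

    void main() {
        vec3 pos = texture(positionTex, sampleUV).xyz;
        vec3 f   = normalize(texture(normalTex, sampleUV).xyz);
        // View matrix looking along the normal, up = (0, 1, 0).
        vec3 s = normalize(cross(f, vec3(0.0, 1.0, 0.0)));
        vec3 u = cross(s, f);
        mat4 view = mat4(vec4( s.x,  u.x, -f.x, 0.0),
                         vec4( s.y,  u.y, -f.y, 0.0),
                         vec4( s.z,  u.z, -f.z, 0.0),
                         vec4(-dot(s, pos), -dot(u, pos), dot(f, pos), 1.0));
        mvps[ssboIndex] = projection * view;
    }
    ```

    The shader writes no color output; it exists only for its SSBO side effect, one matrix per rasterized sample point.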
  6. StanLee

    Atomic Add for Float Type

    I understand. =) I will stick to the NV extension for now, but try this out later. Thanks! =)
  7. StanLee

    Atomic Add for Float Type

    Hmmm, so only NV has atomics on floats. Maybe I'll stick to NV-only support for now.

    By using integers, do you mean converting the values to integers before saving them in the buffer?
  8. Hi,

    I was searching the specification for a way to do atomic addition on buffer variables of type float, but unfortunately I only found atomic operations on integer types: https://www.opengl.org/wiki/Atomic_Variable_Operations

    This is a big problem for me now. My fragment shader looks like this:

    struct ItemType {
        float R, G, B, A;
    };

    layout (std430, binding = 0) buffer BufferObject {
        ItemType items[];
    };

    uniform int ID;
    //... other uniforms and variables

    void main(void){
        // ... computations. Result: vec4 CompResult
        atomicAdd(items[ID].R, CompResult.r); // not working
        atomicAdd(items[ID].G, CompResult.g); // not working
        atomicAdd(items[ID].B, CompResult.b); // not working
    }

    This fragment shader is executed n times with the same ID for one rendering call (n is the number of rendered fragments). Obviously I have to perform the addition atomically, but somehow there is no function which can do this for float types.

    Has anyone else dealt with this problem before? Is there a way to circumvent it? I already had the idea to build my own per-pixel mutex just as it is done in the Red Book, but I fear that this would destroy my performance completely.

    Best Regards, Stan
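    For what it's worth, a common core-GLSL workaround (no NV extension) is to store the accumulator as a uint bit pattern and emulate the float addition with a compare-and-swap loop. This is a sketch with placeholder names, not a drop-in replacement for the struct layout above:

    ```glsl
    // The accumulator is declared as uint and reinterpreted as float.
    layout(std430, binding = 0) buffer Accum { uint bitsR[]; };

    void atomicAddFloat(uint idx, float value) {
        uint expected, old = bitsR[idx];
        do {
            expected = old;
            float sum = uintBitsToFloat(expected) + value;
            // atomicCompSwap returns the value found in memory; if another
            // invocation changed it since our read, retry with the new value.
            old = atomicCompSwap(bitsR[idx], expected, floatBitsToUint(sum));
        } while (old != expected);
    }
    ```

    Each iteration retries until no other invocation modified the slot between the read and the swap. Under heavy contention this can get slow, which is the same caveat as the per-pixel mutex idea from the Red Book.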
  9. Hi,

    currently I am working on a new algorithm for realtime indirect illumination for my master's thesis, but I am not very versed in the mathematics behind it. The goal is not to create stunning photorealistic images but rather the development and performance analysis of the algorithm. Nevertheless I would like to generate images that head in the right direction with respect to the underlying math.

    At the moment I use Phong shading as the illumination model for direct lighting. The situation in my algorithm is the following: I have something like photons in my scene. To gather indirect light for one photon, I render the scene from its perspective. Assume only those fragments are rendered which are directly illuminated by a light source. To account only for diffuse indirect illumination, I thought I could compute the diffuse color value with the Phong model for every rendered fragment. Then all these values are summed up and divided by the number of rendered fragments. The resulting values (one per R, G, B channel) would be the indirect diffuse lighting contribution of the photon at its particular position.

    Am I correct so far? Is this mathematically a reasonable approximation of the indirect illumination?

    The next question is what to do with such a photon. Assume I render the scene from the perspective of my camera and a fragment with a photon at its position is rendered. How do I incorporate the photon's diffuse color value into the lighting computation when using the Phong model? My idea so far is to add the photon's diffuse color to the final color value (the result of the direct lighting computation alone) of the fragment, maybe with an appropriate weight factor to prevent overexposure.

    Best Regards, Stan
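    To make the quantity being averaged concrete: the per-fragment value summed here would be the diffuse (Lambertian) part of the Phong model. A hypothetical GLSL sketch with placeholder names:

    ```glsl
    // Placeholder sketch: diffuse Phong term for one rendered fragment.
    vec3 diffuseTerm(vec3 normal, vec3 lightDir, vec3 lightColor, vec3 albedo)
    {
        float nDotL = max(dot(normalize(normal), normalize(lightDir)), 0.0);
        return albedo * lightColor * nDotL;
    }
    // Summing diffuseTerm over all n rendered fragments and dividing by n
    // gives the averaged indirect diffuse value for the photon, as described.
    ```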
  10. Thanks Hodgman! That was exactly the problem. After passing the fov in radians everything worked perfectly fine. I had defined the symbol GLM_FORCE_RADIANS somewhere, and from then on the function took the angle in radians and not in degrees.
  11. I have tried out your suggestion but the problem remains. I have even fixed the camera position to the origin:

    auto viewMatrix = glm::lookAt(glm::vec3(0), glm::vec3(viewDir + glm::vec3(0)), up);
    cubeMapContext._V = cubeMapContext._MV = viewMatrix;
    cubeMapContext._MVP = cubeMapContext._P * cubeMapContext._MV;
    _sceneGraph->renderSubtree(cubeMapContext);

    What I have also tried is loading a cube map from file to see whether my cube map rendering works correctly. And indeed it does: when loading a cube map, everything fits perfectly together. Only when I generate the cube map myself as described above do the faces not fit together. Here is my shader code for the cube map rendering:

    // vertex shader
    #version 430
    layout(location=0) in vec3 vertex;
    out vec3 ex_uv;
    uniform mat4 mvp;
    void main(){
        ex_uv = vertex;
        gl_Position = mvp * vec4(vertex, 1.f);
    }

    // fragment shader
    #version 430
    in vec3 ex_uv;
    out vec4 color;
    uniform samplerCube cubeMap;
    void main(){
        color = texture(cubeMap, ex_uv);
    }

    And this is the client code initiating the rendering of the cube map:

    void render(const SceneContext& context, const glm::mat4x4& tmat){
        glEnable(GL_TEXTURE_CUBE_MAP_SEAMLESS);
        _shader->bind();                                   // glUseProgram(shaderProgramID);
        _shader->loadMatrix4("mvp", glm::value_ptr(context._MVP));
        // int _uniID = glGetUniformLocation(shaderProgramID, "mvp");
        // glUniformMatrix4fv(_uniID, 1, GL_FALSE, valuePtr);
        _shader->loadUniform("cubeMap", _cubeMap->bind()); // glUniform1i(uniID, x); uniID = _cubeMap->bind()
        _unitCube->bindVertices(0);
        // glEnableVertexAttribArray(location); location = 0
        // glBindBuffer(GL_ARRAY_BUFFER, _vBufferID);
        // glVertexAttribPointer(location, 3, GL_FLOAT, GL_FALSE, 0, 0);
        // glBindBuffer(GL_ARRAY_BUFFER, 0);
        _unitCube->render();                               // glDrawArrays(GL_TRIANGLES, 0, _vertices.size());
        _unitCube->unbind();                               // glDisableVertexAttribArray(0);
        _shader->unbind();                                 // glUseProgram(0);
        glDisable(GL_TEXTURE_CUBE_MAP_SEAMLESS);
    }

    Is there maybe something conceptually wrong?
I only want to create an environment map/panoramic view of my sponza scene. Isn't it possible to do it the way I do? Regards, Stan
  12. Hi,

    I am trying to render a scene to a cube map, but the faces do not fit together. I think there is something wrong with the perspective.

    I define a framebuffer and render the scene six times with different viewing directions (one render pass for each face of the cube map). Then I place the camera at the origin and render a unit cube with the cube map.

    The creation of the cube map:

    TextureCubeMap::TextureCubeMap(const unsigned int size) : _size(size), _boundSlot(-1){
        glGenTextures(1, &_texID);
        glBindTexture(GL_TEXTURE_CUBE_MAP, _texID);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_BASE_LEVEL, 0);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MAX_LEVEL, 0);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
        for (int loop = 0; loop < 6; ++loop) {
            glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + loop, 0, GL_RGB, size, size, 0, GL_RGB, GL_UNSIGNED_BYTE, 0);
        }
        glBindTexture(GL_TEXTURE_CUBE_MAP, 0);
    }

    int TextureCubeMap::bind(){
        ...
        glEnable(GL_TEXTURE0 + _boundSlot);
        glActiveTexture(GL_TEXTURE0 + _boundSlot);
        glEnable(GL_TEXTURE_CUBE_MAP);
        glBindTexture(GL_TEXTURE_CUBE_MAP, _texID);
        ...
        return _boundSlot;
    }

    void TextureCubeMap::unbind(){
        ...
        glActiveTexture(GL_TEXTURE0 + _boundSlot);
        glBindTexture(GL_TEXTURE_2D, 0);
        ...
    }

    And this is the part where I render the scene to the cube map six times:

    void RenderToCubeMap::render(SceneContext& scene_context){
        glBindFramebuffer(GL_FRAMEBUFFER, _fboID);
        const GLenum drawBuffer = { GL_COLOR_ATTACHMENT0 };
        glDrawBuffers(1, &drawBuffer);
        glViewport(0, 0, (GLsizei)_cubeMap->getSize(), (GLsizei)_cubeMap->getSize());
        SceneContext cubeMapContext = scene_context; // contains all relevant matrices: projection, view and model matrix
        cubeMapContext._P = glm::perspective(90.f, 1.f, 1.f, 1000.f);
        _cubeMap->bind();
        for (unsigned int iFace = 0; iFace < 6; iFace++){
            glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_CUBE_MAP_POSITIVE_X + iFace, _cubeMap->getTexID(), 0);
            IRenderPass::startRenderPass(); // enable depth test, backface culling, clear depth & draw buffer
            glm::vec3 viewDir, up;
            switch (GL_TEXTURE_CUBE_MAP_POSITIVE_X + iFace){
            case GL_TEXTURE_CUBE_MAP_POSITIVE_X:
                viewDir = glm::vec3(1.0f, 0.0f, 0.0f);  up = glm::vec3(0.f, -1.f, 0.f); break;
            case GL_TEXTURE_CUBE_MAP_NEGATIVE_X:
                viewDir = glm::vec3(-1.0f, 0.0f, 0.0f); up = glm::vec3(0.f, -1.f, 0.f); break;
            case GL_TEXTURE_CUBE_MAP_POSITIVE_Y:
                viewDir = glm::vec3(0.0f, 1.0f, 0.0f);  up = glm::vec3(0.f, 0.f, 1.f);  break;
            case GL_TEXTURE_CUBE_MAP_NEGATIVE_Y:
                viewDir = glm::vec3(0.0f, -1.0f, 0.0f); up = glm::vec3(0.f, 0.f, -1.f); break;
            case GL_TEXTURE_CUBE_MAP_POSITIVE_Z:
                viewDir = glm::vec3(0.0f, 0.0f, 1.0f);  up = glm::vec3(0.f, -1.f, 0.f); break;
            case GL_TEXTURE_CUBE_MAP_NEGATIVE_Z:
                viewDir = glm::vec3(0.0f, 0.0f, -1.0f); up = glm::vec3(0.f, -1.f, 0.f); break;
            }
            auto viewMatrix = glm::lookAt(glm::vec3(scene_context._camPos), glm::vec3(viewDir + scene_context._camPos), up);
            cubeMapContext._V = viewMatrix;
            cubeMapContext._MV = viewMatrix * cubeMapContext._M;
            cubeMapContext._MVP = cubeMapContext._P * cubeMapContext._MV;
            _sceneGraph->renderSubtree(cubeMapContext);
        }
        _cubeMap->unbind();
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        IRenderPass::endRenderPass(); // disable depth test, backface culling
    }

    What am I doing wrong? Is there something wrong with the projection matrix? I am really thankful for any input.

    Best regards, Stan
  13. StanLee

    3D Texture upload fails

    I checked the data again and saw that it only uses 12 bits, i.e. the original data values are in the range 0 - 4095. That means my normalized values are mapped into [0, 4095/65535 = 0.062485]. That makes sense, of course! Thank you guys!
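    Given this diagnosis, one way to make use of the full [0, 1] range is simply to rescale the sampled value in the shader. A sketch, assuming the 12-bit data stays uploaded as 16-bit unsigned shorts:

    ```glsl
    // 12-bit data in a 16-bit normalized upload lands in [0, 4095/65535].
    // Rescale back to [0, 1] after sampling (65535.0 / 4095.0 ≈ 16.0):
    float raw = texture(volumeTexture, pos.xyz).r;
    float volValue = clamp(raw * (65535.0 / 4095.0), 0.0, 1.0);
    ```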
  14. Hello guys,

    I am developing a direct volume renderer and have issues with the 3D texture upload. I have a CT scan with 421 slices, and each slice has a size of 512x512 with one channel containing an unsigned short value. This is how I upload the data at the moment:

    glBindTexture(GL_TEXTURE_3D, _volumeTextureID);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_BORDER);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_BORDER);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_BORDER);
    glTexImage3D(GL_TEXTURE_3D, 0, GL_R16F, 512, 512, 421, 0, GL_RED, GL_UNSIGNED_SHORT, _volumeData);

    The problem shows up in the fragment shader: the values of the volume texture do not seem to cover the range [0, 1]. I do raycasting in a cube which contains my 3D texture. As you can see in the for loop, I check whether the sampled value of my volume texture is bigger than 0.1 and set the color output to red. Unfortunately I don't see anything red in the cube. The values are all below 0.1, because when I check for values below 0.1 I can already see the lung, spine etc.
    #version 430
    out vec4 color;
    uniform sampler2D backTex;
    uniform sampler2D frontTex;
    uniform sampler3D volumeTexture;
    smooth in vec4 position;
    uniform int width;
    uniform int height;
    uniform uint samplingSteps;

    void main(void){
        vec2 texCoor = vec2((gl_FragCoord.x - 0.5) / width, (gl_FragCoord.y - 0.5) / height);
        vec3 front = texture(frontTex, texCoor).xyz;
        vec3 back = texture(backTex, texCoor).xyz;
        float rayLength = length(back - front); // renamed to avoid shadowing the built-in length()
        vec3 dir = normalize(back - front);
        vec3 stepVec = dir * 1.0 / samplingSteps;
        vec4 pos = vec4(front, 0);
        float volValue = 0;
        vec4 finalColor = vec4(1);
        for(int i = 0; i < samplingSteps * rayLength; i++){
            volValue = texture(volumeTexture, pos.xyz).r;
            if(volValue >= 0.1){
                finalColor = vec4(0.5, 0, 0, 0.5);
                break;
            }
            pos.xyz += stepVec;
            if(pos.x > 1 || pos.y > 1 || pos.z > 1){
                break;
            }
        }
        color = finalColor;
    }

    What is happening here? Why do all my values in the volume texture end up in [0, 0.1] instead of [0, 1]?

    Best regards, StanLee