Hyunkel

Members
  • Content count

    307
Community Reputation

401 Neutral

About Hyunkel

  • Rank
    Member
  1. OpenGL FBO Questions

      The reason I initially wanted to do this is that my shadow maps for directional lights have 4 cascades, and I tried generating them simultaneously by quadrupling the geometry in the geometry shader. By rendering only to the depth buffer, I was able to avoid allocating an additional 4 R32f 2048x2048 color attachment slices. After some profiling I noticed that this isn't really any faster than generating the cascades in individual passes though, provided that I use decent frustum culling for the individual cascades.   This is precisely what I am doing now. It is also much easier to have either a linear or exponential shadow map this way, which is needed for ESM.   Thanks again! Hyu
  2. OpenGL FBO Questions

    I've tried a few more different combinations, and it seems like it is not possible to create a texture that can be used both as a depth attachment and as a color attachment. But as you've mentioned, I can either simply output to gl_FragDepth for the final pass, or for the shadow map generation, output the depth to a color attachment, using a shared RBO across all shadow maps.   Thank you for all the helpful comments! Cheers, Hyu
  3. OpenGL FBO Questions

    Good to know, thanks!   But I can't use GL_R32F as a depth attachment when generating the shadow map, can I?
  4. OpenGL FBO Questions

    Yes, that is correct. I would like to avoid allocating an extra texture for each shadow map if possible.   Makes perfect sense, thanks!
  5. Hello,

     I'm doing shadow mapping with exponential shadow maps (ESM) in OpenGL 4.3. This requires blurring the shadow map, which I do with a normal 2-pass Gaussian blur. What I want to do is the following:

     Shadow map -> (Vertical Blur Shader) -> Intermediate Texture -> (Horizontal Blur Shader) -> Shadow Map

     The shadow map is a DepthComponent32f texture and the intermediate texture uses R32f. The first pass works fine, but for the second pass, where I want to write back to the shadow map, I can't seem to use the shadow map as an FBO color attachment, so I'm unable to write back to it.

     I've also noticed, completely by accident, that I can sample from a texture that I am currently writing to, without any ill results. For example I can do: Texture -> (Vertical Blur Shader) -> Texture

     To recap:
       • Is there a way to use a DepthComponent texture as a color attachment in an FBO?
       • Why can I sample a texture that I'm currently writing to? Is this legal in OpenGL 4.3, or is the behavior undefined? What happens behind the scenes? Does it internally create a new texture to write to, and then discard the old one when the draw call finishes?

     Cheers, Hyu
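     For reference, the pass structure described above can be sketched on the CPU. This is only an illustration under stated assumptions, not the poster's shader code: the `Image` struct, the `blurPass` and `blurShadowMap` functions, and the 3-tap kernel are all hypothetical. Each pass reads from one buffer and writes to a different one, so no pass ever samples the buffer it is writing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel float image standing in for an R32f texture.
struct Image {
    int width;
    int height;
    std::vector<float> texels;

    Image(int w, int h, float fill = 0.0f)
        : width(w), height(h),
          texels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), fill) {}

    // Clamp-to-edge read, similar in spirit to a clamped texture fetch.
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// One 1D blur pass with an assumed 3-tap kernel (0.25, 0.5, 0.25).
// (dx, dy) = (0, 1) blurs vertically, (1, 0) blurs horizontally.
// src and dst must be different images: the pass never reads its own output.
void blurPass(const Image& src, Image& dst, int dx, int dy) {
    assert(&src != &dst);
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            dst.texels[static_cast<std::size_t>(y) * dst.width + x] =
                0.25f * src.at(x - dx, y - dy) +
                0.50f * src.at(x, y) +
                0.25f * src.at(x + dx, y + dy);
        }
    }
}

// Shadow map -> (vertical blur) -> intermediate -> (horizontal blur) -> result.
Image blurShadowMap(const Image& shadowMap) {
    Image intermediate(shadowMap.width, shadowMap.height);
    Image result(shadowMap.width, shadowMap.height);
    blurPass(shadowMap, intermediate, 0, 1);
    blurPass(intermediate, result, 1, 0);
    return result;
}
```

     The result lands in a third buffer rather than back in the source, which mirrors the option of writing the final pass to a separate color texture instead of the depth texture.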
  6. Thanks for the suggestions!   Unfortunately the HD3000 does not support ARB_debug_output, and afaik Intel's Graphics Performance Analyzer doesn't support OpenGL on Windows yet.
  7. Hi,

     I'm currently working on a game project which uses OpenTK targeting OpenGL 3.1. After implementing instanced rendering (instanced vertex attributes using glDrawArraysInstanced), I've noticed a horrible performance drop on our Intel test machine (Intel HD3000). It went from 2ms frame time to over 2000ms. All other machines (using AMD and Nvidia cards) are performing better with instancing. I've checked the Intel site to make sure that the HD3000 supports OpenGL 3.1 and that it has the latest drivers.

     Do you have any ideas what could be causing this issue?

     Buffer Structures:

         [StructLayout( LayoutKind.Sequential )]
         struct VertexData
         {
             public Vector3 Position;
             public Vector3 Normal;
             public Vector2 TexCoord;

             public static readonly int SizeInBytes = Marshal.SizeOf( new VertexData() );

             public VertexData( Vector3 position, Vector3 normal, Vector2 texcoord )
             {
                 Position = position;
                 Normal = normal;
                 TexCoord = texcoord;
             }
         };

         [StructLayout( LayoutKind.Sequential )]
         public struct InstanceData
         {
             public Vector4 SpriteRect;
             public Vector4 DestinationRect;
             public Color4 Color;
             public Vector4 Scissors;

             public static readonly int SizeInBytes = Marshal.SizeOf( new InstanceData() );
         }

     VAO Creation:

         float v1 = -1f;
         float v2 = 1f;

         VertexData[] Vertices = new VertexData[]
         {
             new VertexData( new Vector3(v1, v2, 0), new Vector3(0, 0, 1), new Vector2(0, 0)),
             new VertexData( new Vector3(v2, v2, 0), new Vector3(0, 0, 1), new Vector2(1, 0)),
             new VertexData( new Vector3(v1, v1, 0), new Vector3(0, 0, 1), new Vector2(0, 1)),
             new VertexData( new Vector3(v1, v1, 0), new Vector3(0, 0, 1), new Vector2(0, 1)),
             new VertexData( new Vector3(v2, v2, 0), new Vector3(0, 0, 1), new Vector2(1, 0)),
             new VertexData( new Vector3(v2, v1, 0), new Vector3(0, 0, 1), new Vector2(1, 1))
         };

         Buffer vertexBuffer = Buffer.CreateVertexBuffer( Vertices, VertexData.SizeInBytes );
         InstanceBuffer = Buffer.CreateInstanceBuffer( InstanceData.SizeInBytes, 4096 );

         GL.GenVertexArrays( 1, out VAOHandle );
         GL.BindVertexArray( VAOHandle );

         // Vertex Buffer
         vertexBuffer.Bind();
         GL.VertexAttribPointer( 0, 3, VertexAttribPointerType.Float, false, VertexData.SizeInBytes, 0 );
         GL.EnableVertexAttribArray( 0 );
         GL.VertexAttribPointer( 1, 3, VertexAttribPointerType.Float, false, VertexData.SizeInBytes, Vector3.SizeInBytes );
         GL.EnableVertexAttribArray( 1 );
         GL.VertexAttribPointer( 2, 2, VertexAttribPointerType.Float, false, VertexData.SizeInBytes, Vector3.SizeInBytes * 2 );
         GL.EnableVertexAttribArray( 2 );

         // Instance Buffer
         InstanceBuffer.Bind();
         GL.VertexAttribPointer( 3, 4, VertexAttribPointerType.Float, false, InstanceData.SizeInBytes, 0 );
         GL.EnableVertexAttribArray( 3 );
         GL.VertexAttribDivisor( 3, 1 );
         GL.VertexAttribPointer( 4, 4, VertexAttribPointerType.Float, false, InstanceData.SizeInBytes, Vector4.SizeInBytes );
         GL.EnableVertexAttribArray( 4 );
         GL.VertexAttribDivisor( 4, 1 );
         GL.VertexAttribPointer( 5, 4, VertexAttribPointerType.Float, false, InstanceData.SizeInBytes, Vector4.SizeInBytes * 2 );
         GL.EnableVertexAttribArray( 5 );
         GL.VertexAttribDivisor( 5, 1 );
         GL.VertexAttribPointer( 6, 4, VertexAttribPointerType.Float, false, InstanceData.SizeInBytes, Vector4.SizeInBytes * 3 );
         GL.EnableVertexAttribArray( 6 );
         GL.VertexAttribDivisor( 6, 1 );

         GL.BindVertexArray( 0 );

     Vertex Shader:

         #version 140

         // Vertex Data
         in vec3 in_position;
         in vec3 in_normal;
         in vec2 in_texcoord;

         // Instance Data
         in vec4 in_spriteRect;
         in vec4 in_destinationRect;
         in vec4 in_color;
         in vec4 in_scissors;

         // Output
         out vec3 vs_normal;
         out vec2 vs_texcoord;
         out vec4 vs_color;
         out vec4 vs_scissors;

         void main()
         {
             // Texture Coordinates
             vs_texcoord = in_texcoord;
             vs_texcoord *= in_spriteRect.zw;
             vs_texcoord += in_spriteRect.xy;

             // Position
             vec4 Position = vec4( in_position, 1.0f );

             // Normalize to [0, 1]
             Position.xy = Position.xy * 0.5f + 0.5f;

             // Apply Destination Transform
             Position.xy *= in_destinationRect.zw;
             Position.xy += in_destinationRect.xy;

             // Normalize to [-1, 1]
             Position.xy = Position.xy * 2.0f - 1.0f;

             // In OpenGL -1,-1 is the bottom left screen corner
             // In DirectX -1,-1 is the top left screen corner
             Position.y += 2.0f - in_destinationRect.w * 2.0f;

             vs_normal = in_normal;
             vs_color = in_color;
             vs_scissors = in_scissors;

             gl_Position = Position;
         }

     Fragment Shader:

         #version 140

         uniform sampler2D Tex;

         // Input
         in vec3 vs_normal;
         in vec2 vs_texcoord;
         in vec4 vs_color;
         in vec4 vs_scissors;

         // Output
         out vec4 out_frag_color;

         bool ScissorTest()
         {
             return gl_FragCoord.x > vs_scissors.x &&
                    gl_FragCoord.y > vs_scissors.y &&
                    gl_FragCoord.x < vs_scissors.x + vs_scissors.z &&
                    gl_FragCoord.y < vs_scissors.y + vs_scissors.w;
         }

         void main()
         {
             out_frag_color = vec4(0, 0, 0, 0);
             if(ScissorTest())
                 out_frag_color = texture( Tex, vs_texcoord ) * vs_color;
         }

     Rendering:

         InstanceBuffer.Write( InstanceDataCPU, InstanceData.SizeInBytes, InstanceCount );
         GL.BindVertexArray( VAOHandle );
         BindTexture( TextureHandle, 0, texture );
         GL.UseProgram( ProgramHandle );
         GL.DrawArraysInstanced( PrimitiveType.Triangles, 0, 6, InstanceCount );
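     The strides and offsets passed to GL.VertexAttribPointer above imply a tightly packed layout for both structs. A rough way to double-check those numbers is to mirror the structs and let the compiler verify them. This sketch is C++ rather than the post's C#; the `Vec2`/`Vec3`/`Vec4` types are hypothetical stand-ins for the OpenTK types, and reading the second argument of `CreateInstanceBuffer` as an instance count is an assumption.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the OpenTK vector types (plain packed floats).
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Mirrors the post's VertexData struct.
struct VertexData {
    Vec3 position;  // attribute 0
    Vec3 normal;    // attribute 1
    Vec2 texCoord;  // attribute 2
};

// Mirrors the post's InstanceData struct (Color4 is treated as four floats).
struct InstanceData {
    Vec4 spriteRect;       // attribute 3
    Vec4 destinationRect;  // attribute 4
    Vec4 color;            // attribute 5
    Vec4 scissors;         // attribute 6
};

// Strides used by the attribute pointers.
static_assert(sizeof(VertexData) == 32, "vertex stride should be 32 bytes");
static_assert(sizeof(InstanceData) == 64, "instance stride should be 64 bytes");

// Offsets used by the attribute pointers.
static_assert(offsetof(VertexData, normal) == sizeof(Vec3), "normal at 12");
static_assert(offsetof(VertexData, texCoord) == sizeof(Vec3) * 2, "texcoord at 24");
static_assert(offsetof(InstanceData, destinationRect) == sizeof(Vec4), "dest at 16");
static_assert(offsetof(InstanceData, color) == sizeof(Vec4) * 2, "color at 32");
static_assert(offsetof(InstanceData, scissors) == sizeof(Vec4) * 3, "scissors at 48");

// Bytes needed to hold `count` instances (assumes 4096 in the post is a count).
constexpr std::size_t instanceBufferBytes(std::size_t count) {
    return count * sizeof(InstanceData);
}
```

     Under that assumption, 4096 instances occupy 256 KiB. The checks only confirm that the layout matches the pointers; they say nothing about the cause of the slowdown.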
  8. I think that is indeed the best way to go about it. I'll need to make some modifications to my cube to sphere mapping formula, but that shouldn't be too difficult. Thank you for your advice!
  9. To some extent, yes. But the base algorithm generating the terrain generates vertices for a [-1, 1] cube, which are then mapped to a unit sphere. All of this is done entirely on the GPU, which means single precision. I can scale and translate these vertices of course, which improves normal vector generation. But the trade-off is that I introduce some jitter in the vertex positions, which will influence the normal vectors. It is fine most of the time, but in some situations it creates easily recognizable patterns. I've also experimented with using partial double precision, but support for this is unfortunately still very limited.
  10. I generate procedural planets pretty much entirely using compute shaders. (CPU manages a modified quad tree for LOD calculations.) The compute shader outputs vertex data on a per terrain patch basis, which is stored in buffers. Normal vectors are calculated during this stage by using a Sobel operator on the generated position data:

      [source lang="cpp"]
      // Only operate on non-padded threads
      if((GroupThreadID.x > 0) && (GroupThreadID.x < PaddedX - 1) &&
         (GroupThreadID.y > 0) && (GroupThreadID.y < PaddedY - 1))
      {
          // Generate normal vectors
          float3 C  = VertexPosition;
          float3 T  = GetSharedPosition(GroupThreadID.x,     GroupThreadID.y + 1);
          float3 TR = GetSharedPosition(GroupThreadID.x + 1, GroupThreadID.y + 1);
          float3 R  = GetSharedPosition(GroupThreadID.x + 1, GroupThreadID.y);
          float3 BR = GetSharedPosition(GroupThreadID.x + 1, GroupThreadID.y - 1);
          float3 B  = GetSharedPosition(GroupThreadID.x,     GroupThreadID.y - 1);
          float3 BL = GetSharedPosition(GroupThreadID.x - 1, GroupThreadID.y - 1);
          float3 L  = GetSharedPosition(GroupThreadID.x - 1, GroupThreadID.y);
          float3 TL = GetSharedPosition(GroupThreadID.x - 1, GroupThreadID.y + 1);

          float3 v1 = normalize((TR + 2.0*R + BR) * 0.25 - C);
          float3 v2 = normalize((TL + 2.0*T + TR) * 0.25 - C);
          float3 v3 = normalize((TL + 2.0*L + BL) * 0.25 - C);
          float3 v4 = normalize((BL + 2.0*B + BR) * 0.25 - C);

          float3 N1 = cross(v1, v2);
          float3 N2 = cross(v3, v4);

          Normal = (N1 + N2) * 0.5;

          // Write Normal to Shared Memory
          SharedMemory[GroupIndex].Normal = Normal;
      }
      [/source]

      This works very well in most situations. Unfortunately, once I get to very high LOD levels, floating point precision causes quite a few issues. In order to illustrate the problem I make the compute shader generate a sphere of radius 1. I then use the following code to display the error of the generated normal vectors:

      [source lang="cpp"]float3 NormalError = abs(Normal - normalize(PositionWS)) * 10.0;[/source]

      LOD 16 - First signs of errors, no visual artifacts. [img]http://i.imgur.com/bPtfZ.jpg[/img]

      LOD 20 - First visual artifacts. Can be masked with normal mapping or some Perlin noise. [img]http://i.imgur.com/kShNl.jpg[/img]

      LOD 24 (highest LOD) - Visual artifacts are visible all over the terrain. [img]http://i.imgur.com/QhehH.jpg[/img]

      At this LOD, vertices are only 0.0000000596 units apart from each other, hence the problem with my current method for generating normal vectors. I understand that I'm pushing the limits of floating point precision here, and not having that high of a terrain resolution isn't that big of an issue, but I was wondering if anyone had any ideas on how to squeeze out a little more detail?

      Cheers, Hyu
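      The spacing quoted above, 0.0000000596, is about 2^-24, which is half the gap between adjacent single-precision values just above 1.0 (that gap is 2^-23). The sketch below illustrates why neighbor differences can vanish when positions are stored as absolute coordinates on a unit sphere, and why the same offsets survive when kept relative to a local origin. It is a one-dimensional simplification with hypothetical function names, not the poster's shader code, and it assumes a patch-local origin is available.

```cpp
#include <cmath>
#include <limits>

// Gap between `value` and the next representable float above it.
float ulpAbove(float value) {
    return std::nextafter(value, std::numeric_limits<float>::infinity()) - value;
}

// Difference between two neighbors stored as absolute coordinates.
// Each sum is rounded to the float grid around `origin` before subtracting,
// so offsets smaller than that grid can be lost.
float absoluteDifference(float origin, float offsetA, float offsetB) {
    float a = origin + offsetA;
    float b = origin + offsetB;
    return b - a;
}

// The same difference when positions are kept relative to the origin.
// Small offsets sit on a much finer float grid, so the difference survives.
float localDifference(float offsetA, float offsetB) {
    return offsetB - offsetA;
}
```

      With an origin of 1.0 and a neighbor offset of 2^-24, the absolute form returns zero while the local form returns the offset unchanged, which is consistent with the artifacts appearing at the highest LOD.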
  11. I agree that texture arrays are ideal in a situation where you have many sprites of the same (or nearly the same) size. However, as you said, it depends entirely on what the OP is doing. The last project I worked on for example, had completely arbitrary sprite sizes. I suppose we both agree that if you can use texture arrays instead of sprite sheets, you should definitely go for it.
  12. [quote name='Gavin Williams' timestamp='1347885008' post='4980854'] OMG No ! Not sprite sheets if you are using DX11. Use a texture array, they are so much better. Tricky to setup and get to know but so much more powerful after that. [/quote] Texture arrays are definitely very useful, but I don't think they are a replacement for sprite sheets. Why do you consider texture arrays so much better for sprite storage? They definitely have some useful properties, such as texture clamping, which is problematic with sprite sheets, but what happens when individual sprites have different resolutions or different numbers of animation frames?
  13. Constant Buffer usage

    Thank you, this was extremely helpful!
  14. I went over some of my code with a colleague yesterday and he was quite surprised by how I manage my constant buffers. Except for a few rare and very specific situations, I only use 2 "global" constant buffers. The first is a per-frame buffer, which contains data which only needs to be updated once per frame.

      [source lang="cpp"]
      cbuffer PerFrameCB : register (b0)
      {
          float4x4 CameraView       : packoffset( c0.x);
          float4x4 CameraProjection : packoffset( c4.x);
          float4   CameraPosition   : packoffset( c8.x);
          float4   SunDirection     : packoffset( c9.x);
          float2   ViewportSize     : packoffset(c10.x);
      }
      [/source]

      The second buffer is used for everything else. This buffer is 1 KB in size, and is updated whenever new data is needed, which is multiple times per frame. Both of these buffers are always bound to the registers b0 and b1 for all shader stages.

      I've been told that this is the "wrong" way to do it. I'm supposed to split this up into individual constant buffers. However I don't understand why that is the case. If I split my current b1 constant buffer into X buffers, not only do I still need to update these buffers, but I'll also need to bind a new constant buffer whenever new data is needed. I don't see how my method is wrong, but I'm a little paranoid when I hear such claims because I am self-taught. So I figured it's better to ask than to potentially do something wrong.

      Cheers, Hyu
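      As a side note on the per-frame buffer above, the packoffset registers map directly to byte offsets, since each constant register (c#) holds 16 bytes. The following CPU-side mirror is a sketch for checking that mapping, not the poster's code: the `Float4`, `Float4x4`, and `registerOffset` names and the explicit padding member are assumptions added for illustration.

```cpp
#include <cstddef>

// Hypothetical CPU-side types: four packed floats and a 4x4 float matrix.
struct Float4 { float x, y, z, w; };
struct Float4x4 { Float4 rows[4]; };

// CPU-side mirror of PerFrameCB from the post.
struct PerFrameCB {
    Float4x4 cameraView;        // c0 to c3
    Float4x4 cameraProjection;  // c4 to c7
    Float4   cameraPosition;    // c8
    Float4   sunDirection;      // c9
    float    viewportSize[2];   // c10.xy
    float    padding[2];        // assumed padding up to a full 16-byte register
};

// Byte offset of constant register c<reg> (16 bytes per register).
constexpr std::size_t registerOffset(std::size_t reg) {
    return reg * 16;
}

static_assert(offsetof(PerFrameCB, cameraView) == registerOffset(0), "c0");
static_assert(offsetof(PerFrameCB, cameraProjection) == registerOffset(4), "c4");
static_assert(offsetof(PerFrameCB, cameraPosition) == registerOffset(8), "c8");
static_assert(offsetof(PerFrameCB, sunDirection) == registerOffset(9), "c9");
static_assert(offsetof(PerFrameCB, viewportSize) == registerOffset(10), "c10");

// The buffer size is kept at a multiple of 16 bytes.
static_assert(sizeof(PerFrameCB) % 16 == 0, "size must be 16-byte aligned");
```

      With this layout the per-frame buffer comes to 176 bytes, i.e. 11 registers.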
  15. I'm a bit confused. You say the error is related to 'SlimDX.Direct3D10.Device.OpenSharedResource(System.IntPtr)' yet you change 'SlimDX.Direct3D11.Resource.FromSwapChain(swapChain, 0)' Can you show a little more code? Specifically the part where you obtain and access the shared resource.