
pnt1614

Member
  • Content count

    115
  • Joined

  • Last visited

Community Reputation

407 Neutral

About pnt1614

  • Rank
    Member
  1. AFAIK, in DirectX 11 there are two ways to get the total number of samples that pass the depth test and the stencil test:
     1. Use an unordered access view with an internal counter: increment the counter for each processed sample, then copy it to a staging buffer to read the value back.
     2. Use a hardware occlusion query (D3D11_QUERY_OCCLUSION).
     But both approaches slow down performance, so is there another way to get the total number of samples that pass the depth test and the stencil test?
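     For reference, a minimal C++ sketch of the occlusion-query path (the second option above), assuming an existing device and immediate context; g_pDevice and g_pContext are placeholder names:

     // Create the occlusion query once.
     D3D11_QUERY_DESC qd = {};
     qd.Query = D3D11_QUERY_OCCLUSION;
     ID3D11Query* pQuery = nullptr;
     g_pDevice->CreateQuery(&qd, &pQuery);

     // Wrap the draw calls whose passing samples should be counted.
     g_pContext->Begin(pQuery);
     // ... draw ...
     g_pContext->End(pQuery);

     // Read the result back; in practice this is usually done a frame or two
     // later to avoid stalling the GPU.
     UINT64 samplesPassed = 0;
     while (g_pContext->GetData(pQuery, &samplesPassed, sizeof(samplesPassed), 0) == S_FALSE)
     {
         // spin, or do other work and poll again next frame
     }
     pQuery->Release();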
  2. Thanks, Hodgman, for the debugging tip. I will try it.
  3. Thanks, Crossbones and MJP, for the replies.

     If I render with the same index buffer and vertex buffer but without the geometry shader, it works perfectly: the vertex shader just performs the transformation and the pixel shader outputs a color. So there is nothing wrong with the index buffer or the vertex buffer, and I suspect the geometry shader, but I do not know exactly how the geometry shader is invoked to process the input vertices. My program stops responding, the screen turns black and then goes back to normal, and sometimes I get a "pure virtual function" error message.

     To debug the shaders, I use a query to check the number of shader invocations while rendering a cube. The query data look like this:

     IAVertices       72   // Adjacent indices double the index count, so 72 input vertices instead of 36?
     IAPrimitives     12
     VSInvocations    26   // There are only 24 vertices in the vertex buffer, so why 26 vertex shader invocations?
     GSInvocations    12   // Sometimes this value is 0 and I do not know why
     GSPrimitives     12   // Even when GSInvocations is 0, the number of primitives output by the geometry shader is still 12. Why?
     ...

     On the C++ side, I use an instance of a class (Model_Loader) to load a model and create the buffers (index buffer and vertex buffer). In the main source code I get these two buffers and bind them to the GPU for rendering. I am a newbie at low-level C++ debugging, so I do not know where to begin. It looks like this:

     // Model loading class
     class Model_Loader
     {
     public:
         void LoadModel(....);
         void CreateBuffers();
     private:
         ID3D11Buffer* m_pVB;
         ID3D11Buffer* m_pIB;
     };

     // Main C++ code
     Model_Loader* g_pModelLoader = new Model_Loader();
     g_pModelLoader->LoadModel(...);
     g_pModelLoader->CreateBuffers();

     I also looked at the ID3D11Buffer interface and saw that it has pure virtual functions. Could this be the cause of the "pure virtual function" call, and how can I test that?
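     For context, counters like the ones above come from a pipeline-statistics query; a minimal sketch of issuing and reading one back, with g_pDevice and g_pContext as placeholder names:

     D3D11_QUERY_DESC qd = {};
     qd.Query = D3D11_QUERY_PIPELINE_STATISTICS;
     ID3D11Query* pStats = nullptr;
     g_pDevice->CreateQuery(&qd, &pStats);

     g_pContext->Begin(pStats);
     // ... draw the cube with the geometry shader bound ...
     g_pContext->End(pStats);

     // D3D11_QUERY_DATA_PIPELINE_STATISTICS holds IAVertices, IAPrimitives,
     // VSInvocations, GSInvocations, GSPrimitives, PSInvocations, and so on.
     D3D11_QUERY_DATA_PIPELINE_STATISTICS stats = {};
     while (g_pContext->GetData(pStats, &stats, sizeof(stats), 0) == S_FALSE)
     {
         // poll until the result is available
     }
     pStats->Release();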
  4. I want to render silhouette edges using the geometry shader. First I create an index buffer (with adjacency indices) and a vertex buffer, then render. For testing, the geometry shader just passes the input triangle through to the pixel shader; with triangle adjacency, the triangle's own vertices are at indices 0, 2, and 4 of the input array.

     struct VS_OUTPUT
     {
         float4 posH : SV_POSITION;
     };

     struct GS_OUTPUT
     {
         float4 posH : SV_POSITION;
     };

     [maxvertexcount(3)]
     void Silhouette_Edges_GS(triangleadj VS_OUTPUT input[6], inout TriangleStream<GS_OUTPUT> output)
     {
         GS_OUTPUT v[3];
         for (uint i = 0; i < 3; i++)
         {
             v[i].posH = input[i * 2].posH;   // the triangle's own vertices are at indices 0, 2, 4
             output.Append(v[i]);
         }
         output.RestartStrip();
     }

     I get a "pure virtual function call" error message when running. Please, help me.
  5. I am trying to implement stochastic rasterization based on the paper "Real-Time Stochastic Rasterization on Conventional GPU Architectures" and the provided pseudo code. But I do not understand how they handle a triangle that moves to or from a position behind the camera ("crossing z = 0" in the paper) in the geometry shader. To handle the "crossing z = 0" case, they build a 2D AABB that contains all vertices behind the camera, then find the intersections between the edges and the near plane and use those intersection points to update the 2D AABB. When building the 2D AABB for the vertices behind the camera, they perform the perspective division without clipping, and I think this yields an incorrect result.

     I have implemented the following geometry shader code using DirectX 11 and HLSL 5.0:

     void intersectNear(float3 start, float3 end, inout float2 minXY, inout float2 maxXY)
     {
         float denom = end.z - start.z;
         if (abs(denom) > 0.0001)
         {
             float a = (nearPlaneZ - start.z) / denom;
             if ((a >= 0.0) && (a < 1.0))
             {
                 // Intersection point in camera space
                 float3 cs = float3(lerp(start.xy, end.xy, a), nearPlaneZ);
                 // Intersection point in screen space
                 float2 ss = project42(mul(float4(cs, 1.0), g_mProj));
                 minXY = min(minXY, ss);
                 maxXY = max(maxXY, ss);
             }
         }
     }

     void ST_GS(triangle VS_OUTPUT input[3], inout TriangleStream<GS_OUTPUT_ST> output)
     {
         ......
         float4 ssP0A = input[0].Prev_hPos; // A0 in clip space
         float4 ssP0B = input[1].Prev_hPos; // B0 in clip space
         float4 ssP0C = input[2].Prev_hPos; // C0 in clip space
         float4 ssP1A = input[0].Curr_hPos; // A1 in clip space
         float4 ssP1B = input[1].Curr_hPos; // B1 in clip space
         float4 ssP1C = input[2].Curr_hPos; // C1 in clip space

         // Scaled depth values in view space
         float minDepth = min6(input[0].Prev_vPos.z, input[1].Prev_vPos.z, input[2].Prev_vPos.z,
                               input[0].Curr_vPos.z, input[1].Curr_vPos.z, input[2].Curr_vPos.z) / far_plane;
         float maxDepth = max6(input[0].Prev_vPos.z, input[1].Prev_vPos.z, input[2].Prev_vPos.z,
                               input[0].Curr_vPos.z, input[1].Curr_vPos.z, input[2].Curr_vPos.z) / far_plane;

         if (maxDepth < near_plane || minDepth > 1.0f)
         {
             return; // Outside the view frustum
         }
         else if (minDepth < 0.0f)
         {
             float2 clipMin = float2(-1.0f, -1.0f);
             float2 clipMax = float2( 1.0f,  1.0f);

             // Make the 2D AABB: grow it to contain all points with z < z_near
             if (input[0].Prev_vPos.z < nearPlaneZ) { float2 v = project42(ssP0A); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }
             if (input[1].Prev_vPos.z < nearPlaneZ) { float2 v = project42(ssP0B); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }
             if (input[2].Prev_vPos.z < nearPlaneZ) { float2 v = project42(ssP0C); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }
             if (input[0].Curr_vPos.z < nearPlaneZ) { float2 v = project42(ssP1A); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }
             if (input[1].Curr_vPos.z < nearPlaneZ) { float2 v = project42(ssP1B); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }
             if (input[2].Curr_vPos.z < nearPlaneZ) { float2 v = project42(ssP1C); clipMin = min(clipMin, v); clipMax = max(clipMax, v); }

             // Update the 2D AABB with the edge / near-plane intersection points
             intersectNear(input[0].Prev_vPos.xyz, input[1].Prev_vPos.xyz, clipMin, clipMax);
             intersectNear(input[1].Prev_vPos.xyz, input[2].Prev_vPos.xyz, clipMin, clipMax);
             intersectNear(input[2].Prev_vPos.xyz, input[0].Prev_vPos.xyz, clipMin, clipMax);
             intersectNear(input[0].Curr_vPos.xyz, input[1].Curr_vPos.xyz, clipMin, clipMax);
             intersectNear(input[1].Curr_vPos.xyz, input[2].Curr_vPos.xyz, clipMin, clipMax);
             intersectNear(input[2].Curr_vPos.xyz, input[0].Curr_vPos.xyz, clipMin, clipMax);
             intersectNear(input[0].Prev_vPos.xyz, input[0].Curr_vPos.xyz, clipMin, clipMax);
             intersectNear(input[1].Prev_vPos.xyz, input[1].Curr_vPos.xyz, clipMin, clipMax);
             intersectNear(input[2].Prev_vPos.xyz, input[2].Curr_vPos.xyz, clipMin, clipMax);

             // Output a quad covering the AABB
             points[0].posH = float4(clipMax.x, clipMin.y, nearPlaneDepth, 1.0);
             points[1].posH = float4(clipMax.x, clipMax.y, nearPlaneDepth, 1.0);
             points[2].posH = float4(clipMin.x, clipMin.y, nearPlaneDepth, 1.0);
             points[3].posH = float4(clipMin.x, clipMax.y, nearPlaneDepth, 1.0);
             output.Append(points[0]);
             output.Append(points[1]);
             output.Append(points[2]);
             output.Append(points[3]);
             output.RestartStrip();
             return;
         }
     }

     I am not sure whether this is a limitation of the paper or whether I misunderstand this stage. Has anyone understood this paper, or this stage in particular? Please help me. Thanks in advance.
  6. Among the DirectX 11.1 features, there is one that allows binding a large constant buffer and specifying, per stage, a range within that buffer to use. But I cannot find any examples of how to configure and use this feature. Does anyone know how to do it? Please help me.
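     For reference, a minimal sketch of how this API is typically used (ID3D11DeviceContext1::VSSetConstantBuffers1); g_pContext and g_pBigCB are placeholder names:

     // The 11.1 context is obtained from the regular immediate context.
     ID3D11DeviceContext1* pContext1 = nullptr;
     g_pContext->QueryInterface(__uuidof(ID3D11DeviceContext1),
                                reinterpret_cast<void**>(&pContext1));
     if (pContext1)
     {
         // Offsets and sizes are measured in shader constants (16 bytes each)
         // and must be multiples of 16 constants (256 bytes); at most 4096
         // constants can be exposed to the shader at once.
         UINT firstConstant = 16;   // start 256 bytes into the big buffer
         UINT numConstants  = 16;   // expose 256 bytes, visible to the shader starting at c0
         pContext1->VSSetConstantBuffers1(0, 1, &g_pBigCB, &firstConstant, &numConstants);
         pContext1->Release();
     }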
  7. First, I render the geometry and generate a velocity texture (1280 x 720 pixels), then I do a post-processing pass to find the maximum velocity in each tile (the velocity texture is divided into 32 x 18 tiles). The maximum velocity of each tile is written to an intermediate texture (32 x 18 pixels), so this is similar to a downsampling pass. To do that, I do the following:

     1. Generate the velocity texture (1280 x 720).
     2. Set the viewport size to 32 x 18 and render a full-screen quad that downsamples the velocity texture from 1280 x 720 to 32 x 18; each pixel of the output texture stores the maximum velocity of one tile of the velocity texture.

     But it seems that I have a problem with the texture coordinates (uv) while downsampling. In the pixel shader I compute uv as follows:

     float texture_width;    // 1280
     float texture_height;   // 720
     float2 texel_size = 1.0f / float2(texture_width, texture_height);
     // Offset by half a texel
     float2 uv_center = uv - texel_size * 0.5f;
     for (uint col = 0; col < tile_size; col++)     // tile_size is 40 (each tile is 40 x 40 pixels)
         for (uint row = 0; row < tile_size; row++)
         {
             float2 sample_pos = uv_center + float2(col, row) * texel_size;
             float2 curr_velocity = texture.Sample(point_sampler, sample_pos).xy;
             // Check and save the maximum velocity
             ...
         }

     The velocity texture is: [image]

     But the maximum velocity texture is: [image]

     Am I doing something wrong with the texture coordinates?
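     For reference, a rough C++ sketch of the setup for step 2 above (bind a 32 x 18 render target and a matching viewport, then draw the full-screen quad); g_pContext, the views, and DrawFullScreenQuad are placeholder names:

     // Bind the 32 x 18 tile-max render target and shrink the viewport to match,
     // so the full-screen quad produces exactly one pixel shader invocation per tile.
     ID3D11RenderTargetView* rtvs[1] = { g_pTileMaxRTV };   // view of a 32 x 18 texture
     g_pContext->OMSetRenderTargets(1, rtvs, nullptr);

     D3D11_VIEWPORT vp = {};
     vp.Width    = 32.0f;
     vp.Height   = 18.0f;
     vp.MinDepth = 0.0f;
     vp.MaxDepth = 1.0f;
     g_pContext->RSSetViewports(1, &vp);

     // Bind the 1280 x 720 velocity texture and draw the quad.
     g_pContext->PSSetShaderResources(0, 1, &g_pVelocitySRV);
     DrawFullScreenQuad(g_pContext);   // hypothetical helper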
  8. Thanks for helping me, guys. It works.
  9. I just use ResolveSubresource to resolve the multisampled render target.
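     For completeness, the resolve is a single call; a minimal sketch with placeholder resource names and format:

     // Resolve the multisampled color target into a single-sample texture.
     g_pContext->ResolveSubresource(pResolvedTex, 0,   // destination, subresource 0
                                    pMSAATex,    0,    // multisampled source, subresource 0
                                    DXGI_FORMAT_R8G8B8A8_UNORM);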
  10. I had the same thought, so I rendered a full-screen quad and checked it with the following simple pixel shader:

      // 4 samples per pixel
      float4 PS(VS_OUTPUT input, uint sampleIndex : SV_SampleIndex) : SV_TARGET
      {
          if (sampleIndex % 2 == 0)
              return float4(1.0f, 0.0f, 0.0f, 1.0f);   // red for even sample indices
          return float4(1.0f, 1.0f, 0.0f, 1.0f);       // yellow for odd sample indices
      }

      With 4 samples per pixel, samples 0 and 2 should be red and samples 1 and 3 should be yellow. But the result is all red.
  11. Thanks, guys, for the replies, but I think that rendering to a multisampled render target and then resolving it is an MSAA technique, not SSAA. Using SV_SampleIndex should make the pixel shader run per sample instead of per pixel, but I do not know how to check that the pixel shader is really running at sample frequency. I have searched on the internet, and people describe SSAA as a two-pass technique: first render to a larger render target (for example, 2x larger in both dimensions), then downsample it to the original size in a second pass. If we use a multisampled render target in the first pass, it is a combination of SSAA and MSAA, isn't it?
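      To make the two-pass idea concrete, a rough C++ sketch of the supersampled first-pass target (2x per dimension, no MSAA); sizes, format, and names are placeholders:

      D3D11_TEXTURE2D_DESC td = {};
      td.Width            = 2 * backbufferWidth;
      td.Height           = 2 * backbufferHeight;
      td.MipLevels        = 1;
      td.ArraySize        = 1;
      td.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
      td.SampleDesc.Count = 1;                     // single sample: pure supersampling
      td.Usage            = D3D11_USAGE_DEFAULT;
      td.BindFlags        = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
      ID3D11Texture2D* pSSTex = nullptr;
      g_pDevice->CreateTexture2D(&td, nullptr, &pSSTex);

      // Render the scene into pSSTex (plus a 2x-sized depth buffer) with a 2x viewport,
      // then, in the second pass, bind the back buffer and the original viewport and
      // draw a full-screen quad that averages each 2 x 2 texel footprint.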
  12. I have googled how to enable supersampling on the GPU and found that it can be done by using SV_SampleIndex as a pixel shader input. However, there is no difference after adding SV_SampleIndex as an input to the pixel shader. Is there any way to check whether supersampling is enabled or not in DirectX 11 and HLSL?
  13. I have also tried to understand the author's C++ code for computing the upper-left ray and the lower-right ray. But I am not sure whether those two rays are in world space or in camera space, because he computes a ray direction and then applies a transformation using a function named ToWorldSpace(), yet that function uses the camera's rotation matrix. It confuses me.
  14. Thank you so much. There is one more thing, about generating a ray from a sample position. I searched on the internet and found a way to generate a camera-space ray using the picking algorithm, which computes a camera-space position from a screen-space position with the camera at the origin. But the original source code does not use this method; the author uses an interpolation instead. The shader code is:

      // Under MSAA, shift away from the pixel center
      // subpixelOffset is a hardcoded sample position
      vec2 myPix = gl_FragCoord.xy - viewportOrigin - vec2(0.5) + (subpixelOffset - vec2(0.5));

      // Compute the fraction of the way in screen space that this pixel
      // is between ulRayDir and lrRayDir.
      vec2 fraction = myPix * viewportSizeSub1Inv;

      // Note: z is the same on both rays, so no need to interpolate it
      direction = vec3(mix(ulRayDir.xy, lrRayDir.xy, fraction), ulRayDir.z);
      origin = vec2(0.0);
      direction = normalize(direction);

      Is there any difference between these two approaches?
  15. Thanks for the reply. iFragCoordBase is used to calculate a sample position; the whole function body is:

      visibilityMask_last = 0;
      int2 iFragCoordBase = int2(input.posH.xy) * int2(MSAA_SAMPLES_CONSUMED_X, MSAA_SAMPLES_CONSUMED_Y)
                          + int2(samplePatternShift & 2, samplePatternShift >> 1) * 64;

      for (int iy = 0; iy < MSAA_SAMPLES_CONSUMED_Y; ++iy)   // in the 4x MSAA case, MSAA_SAMPLES_CONSUMED_Y = 2
      for (int ix = 0; ix < MSAA_SAMPLES_CONSUMED_X; ++ix)   // MSAA_SAMPLES_CONSUMED_X = 2
      {
          int index = ix + iy * MSAA_SAMPLES_CONSUMED_X;               // sample index
          int2 samplePos = (iFragCoordBase + int2(ix, iy)) & int2(127, 127);

          // Load a random time for the current sample from a 2D texture
          float t = g_randomTime.Load(int3(samplePos, 0));

          // Find the position of the moving triangle at time t
          float3 csPositionAT = lerp(csPositionA0, csPositionA1, t);
          float3 csPositionBT = lerp(csPositionB0, csPositionB1, t);
          float3 csPositionCT = lerp(csPositionC0, csPositionC1, t);

          // At the current sample, generate a ray from the camera
          float3 rayDir;
          float2 rayOrigin;
          computeRays(input.posH.xy, samplePosXY[index], rayDir, rayOrigin);

          // Perform the ray-triangle intersection
          float3 weight;
          float distance = intersectTri(rayDir, rayOrigin, csPositionAT, csPositionBT, csPositionCT, weight);
          if (distance > 0.0f)
          {
              csPosition_last = rayDir * distance + float3(rayOrigin, 0.0);
              weight_last = weight;
              t_last = t;
              visibilityMask_last = visibilityMask_last | (1 << index);
              rayDir_last = rayDir;
              rayOrigin_last = rayOrigin;
          }
      }

      The time values are generated randomly in C++ and stored in a 128 x 128 2D texture.
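      Since the random times come from the C++ side, here is a minimal sketch of how such a 128 x 128 time texture could be created; g_pDevice is a placeholder, and the format and RNG choices are assumptions rather than the author's actual code:

      #include <random>
      #include <vector>

      std::vector<float> times(128 * 128);
      std::mt19937 rng(1234);
      std::uniform_real_distribution<float> dist(0.0f, 1.0f);
      for (float& t : times)
          t = dist(rng);   // one random time in [0, 1) per texel

      D3D11_TEXTURE2D_DESC td = {};
      td.Width            = 128;
      td.Height           = 128;
      td.MipLevels        = 1;
      td.ArraySize        = 1;
      td.Format           = DXGI_FORMAT_R32_FLOAT;
      td.SampleDesc.Count = 1;
      td.Usage            = D3D11_USAGE_IMMUTABLE;
      td.BindFlags        = D3D11_BIND_SHADER_RESOURCE;

      D3D11_SUBRESOURCE_DATA init = {};
      init.pSysMem     = times.data();
      init.SysMemPitch = 128 * sizeof(float);

      ID3D11Texture2D* pRandomTimeTex = nullptr;
      g_pDevice->CreateTexture2D(&td, &init, &pRandomTimeTex);
      // ... create a shader resource view and bind it as g_randomTime ...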