
maxest

Member Since 05 Nov 2005

Topics I've Started

[DX9/DX10] Profiling

13 December 2012 - 11:06 AM

I've added some profiling calls to my renderer. Basically, it looks like this:
void CRenderer::beginTimeQuery()
{
#ifdef RENDERER_D3D9
  eventQuery->Issue(D3DISSUE_END);
  while (eventQuery->GetData(NULL, 0, D3DGETDATA_FLUSH) == S_FALSE);
  QueryPerformanceCounter(&beginTime);
#elif RENDERER_D3D10
  timestampDisjointQuery->Begin();
  beginTimestampQuery->End();
#endif
}

double CRenderer::endTimeQuery()
{
#ifdef RENDERER_D3D9
  eventQuery->Issue(D3DISSUE_END);
  while (eventQuery->GetData(NULL, 0, D3DGETDATA_FLUSH) == S_FALSE);
  QueryPerformanceCounter(&endTime);
  return (double)(endTime.QuadPart - beginTime.QuadPart) / (double)timeFrequency.QuadPart;
#elif RENDERER_D3D10
  endTimestampQuery->End();
  timestampDisjointQuery->End();
  uint64 beginTime, endTime;
  D3D10_QUERY_DATA_TIMESTAMP_DISJOINT disjoint;
  while (beginTimestampQuery->GetData(&beginTime, sizeof(uint64), 0) != S_OK);
  while (endTimestampQuery->GetData(&endTime, sizeof(uint64), 0) != S_OK);
  while (timestampDisjointQuery->GetData(&disjoint, sizeof(D3D10_QUERY_DATA_TIMESTAMP_DISJOINT), 0) != S_OK);
  if (!disjoint.Disjoint)
   return (double)(endTime - beginTime) / (double)disjoint.Frequency;
  else
   return -1.0;
#endif
}

Now, I render 2500 cubes, which gives me 230 FPS with both renderers. The problem is that the timings returned by endTimeQuery differ: D3D9 returns 0.003 (seconds) and D3D10 returns 0.0013. The rendering code does not do anything else, and I wrapped the whole rendering loop in the begin/end time block. So where could this discrepancy come from?
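To put the three figures on one scale, here is a minimal sketch of the arithmetic. The helper names `ticksToMilliseconds` and `frameMillisecondsFromFps` are hypothetical and not part of the renderer above. The sketch only assumes that both paths divide a tick difference by a ticks-per-second frequency, as the posted code does.

```cpp
#include <cstdint>

// Hypothetical helper: converts a pair of raw tick values to milliseconds.
// `frequency` is in ticks per second (timeFrequency.QuadPart on the D3D9 path,
// disjoint.Frequency on the D3D10 path). Returns -1.0 when the interval is
// unusable (disjoint flag set, zero frequency, or end before begin).
inline double ticksToMilliseconds(std::uint64_t beginTicks,
                                  std::uint64_t endTicks,
                                  std::uint64_t frequency,
                                  bool disjoint = false)
{
    if (disjoint || frequency == 0 || endTicks < beginTicks)
        return -1.0;
    return 1000.0 * static_cast<double>(endTicks - beginTicks)
                  / static_cast<double>(frequency);
}

// Hypothetical helper: frame duration in milliseconds implied by an FPS value.
inline double frameMillisecondsFromFps(double fps)
{
    return fps > 0.0 ? 1000.0 / fps : -1.0;
}
```

With these, 230 FPS corresponds to roughly 4.35 ms per frame, while the two measurements are 3 ms (D3D9) and 1.3 ms (D3D10). Note that the two paths do not measure the same thing: the D3D9 path reads the CPU clock at two points where the GPU has been drained, whereas the D3D10 path reads timestamps recorded on the GPU timeline.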

Sampling shadow-map inside an if-statement

10 December 2012 - 04:04 PM

I've read this article http://http.develope..._chapter17.html

Basically, in order to speed up shadow rendering we can check whether we are shading a pixel that faces away from the light, and in that case we can skip shadow computation (we know this pixel is in shadow).
So I implemented this. Here comes the code:
float NdotL = saturate(dot(-input.lightDirection, normal));
if (NdotL > 0.0)
{
  diffuse *= NdotL;
  input.shadowMapTexCoord.z += sunLightShadowMapBias;
#ifdef SOFT_SHADOWS
  float shadow = 0.0f;
  for (int i = 0; i < SOFT_SHADOWS_SAMPLES_NUM; i++)
  {
    float4 texCoord = input.shadowMapTexCoord;
    texCoord.xy += sunLightShadowsSoftness * poissonDiskTexCoords[i];
    shadow += tex2Dproj(shadowMap, texCoord).r;
  }
  shadow /= (float)SOFT_SHADOWS_SAMPLES_NUM;
#else
  float shadow = tex2Dproj(shadowMap, input.shadowMapTexCoord).r;
#endif

  output.color += shadow * diffuse;
}

I'm using Cg and compile to D3D9 SM3.0 shaders, and OGL ARB shaders.

There is one problem, however. Under D3D9 the rendering is broken: I get randomly colored, flat-shaded surfaces. My guess would be that the hardware or the driver cannot handle texture lookups inside the if-statement, but I would expect that to be a problem only for mipmapped textures. My shadow map has no mipmaps, so I would expect texture lookups to work inside if-statements.
Moreover, on OpenGL everything works fine, and I get a slight performance boost when the if-statement is enabled.

Any ideas?

Visual C++ generates redundant code for an inline function in Release mode

04 December 2012 - 06:53 AM

Hey guys,

In my cross-renderer I have a function that sets a shader constant:
inline void CRenderer::setVertexShaderConstant_ogl_d3d9(const string& name, const mtx& m)
{
#ifdef RENDERER_OGL
  if (!currentShaderProgram->setMatrix(name, m))
  {
   SAFE_CALL(Renderer_outputMessageFunction)(string("WARNING: There is no uniform with given name or it's not used in the shader"));
   SAFE_CALL(Renderer_outputMessageFunction)("\tVertex shader: " + currentVertexShader->name);
   SAFE_CALL(Renderer_outputMessageFunction)("\tUniform name: " + name);
  }
#elif RENDERER_D3D9
  currentVertexShader->constantTable.setMatrix(D3D9Device, name, m);
#endif
}

Now, in my rendering loop I call this function:
renderer.setVertexShaderConstant_ogl_d3d9("worldTransform", worldTransform);
renderer.setVertexShaderConstant_ogl_d3d9("viewProjTransform", viewProjTransform);

It works fine. The problem is that when I switch to the D3D10 renderer (for which the body of this function is empty), VC++ still generates some redundant code for the calls.

This is assembly when these two lines are commented out:
; 210  :    // renderer.setVertexShaderConstant_ogl_d3d9("worldTransform", worldTransform);
; 211  :    // renderer.setVertexShaderConstant_ogl_d3d9("viewProjTransform", viewProjTransform);

And the assembly when these two functions are called:
; 210  :	 renderer.setVertexShaderConstant_ogl_d3d9("worldTransform", worldTransform);
push OFFSET ??_C@_0P@KOAAPGHC@worldTransform?$AA@
lea ecx, DWORD PTR $T190163[esp+392]
call DWORD PTR __imp_??0?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@PBD@Z
lea ecx, DWORD PTR $T190163[esp+388]
call DWORD PTR __imp_??1?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@XZ
; 211  :	 renderer.setVertexShaderConstant_ogl_d3d9("viewProjTransform", viewProjTransform);
push OFFSET ??_C@_0BC@FADNJIOD@viewProjTransform?$AA@
lea ecx, DWORD PTR $T190164[esp+392]
call DWORD PTR __imp_??0?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@PBD@Z
lea ecx, DWORD PTR $T190164[esp+388]
call DWORD PTR __imp_??1?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@XZ

My intuition would be that if the function is empty for the D3D10 renderer and it is inline, the compiler would simply skip the calls, but it doesn't. Any idea why?
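One possible reading of the listing, offered as an assumption rather than a confirmed diagnosis: the parameter is a `const string&`, so each call with a string literal has to build a temporary `std::string` at the call site, before the (empty) body is ever reached. In the listing, the constructor and destructor are called through `__imp_` import entries, meaning they live in the runtime DLL, so the compiler cannot see their bodies and cannot prove that dropping them is safe. The sketch below uses a hypothetical `CountedName` type as a stand-in for `std::string` to make the temporary observable, and contrasts it with a hypothetical overload that takes a `const char*`.

```cpp
#include <string>

// Hypothetical stand-in for std::string: the constructor has an observable
// side effect (a counter), so we can see when a temporary gets built.
struct CountedName
{
    static int constructed;
    std::string text;
    CountedName(const char* s) : text(s) { ++constructed; }
};
int CountedName::constructed = 0;

// Empty body, like the D3D10 build of the setter. The temporary argument is
// still created by the caller before the function is entered.
inline void setConstantByRef(const CountedName&) {}

// Empty body taking the literal directly: no temporary is needed at all.
inline void setConstantByPtr(const char*) {}

// Number of CountedName objects constructed by one by-reference call.
inline int temporariesForRefCall()
{
    const int before = CountedName::constructed;
    setConstantByRef("worldTransform");
    return CountedName::constructed - before;
}

// Number of CountedName objects constructed by one by-pointer call.
inline int temporariesForPtrCall()
{
    const int before = CountedName::constructed;
    setConstantByPtr("worldTransform");
    return CountedName::constructed - before;
}
```

If this reading is right, passing names as `const char*` (or as precomputed handles) through the empty path would leave nothing for the compiler to emit, at the cost of changing the function's signature.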

FrameBuffer mixing with BackBuffer

12 August 2012 - 07:24 AM

I read here (http://stackoverflow.com/questions/5279123/normal-back-buffer-render-to-depth-texture-in-opengl-fbos) that it is impossible to mix backbuffer with a custom GL framebuffer. I ran into this problem today, but it appears that some sort of mixing does work.

I have a few passes in my engine. The first one is an early-z pass, followed by a pass that writes out normals and depths. Early-z renders to the backbuffer, but the normal-depth pass switches the render target (*). What is interesting is that the normal-depth pass uses the early-z result to shade only the pixels that are visible. In theory that should not work: it is the backbuffer's z-buffer that gets filled in, whereas by then I have already bound a new framebuffer object, which has no z-buffer render target or texture attached at all.

On the other hand, if I skip the early-z pass and leave only the normal-depth pass, which sets the render target, subsequent rendering breaks (note that in this case, as there is no early-z, the normal-depth pass has to write z values itself). So it appears that reading from the backbuffer's z-buffer is possible, but simultaneously writing to a framebuffer's color attachment and to the backbuffer's z-buffer is not. Is that correct?

I am aware that my description might be a bit enigmatic so if necessary, I can describe it in more detail and post more code.

(*) The function looks like this:
void CRenderer::setRenderTarget(const CRenderTarget *renderTarget)
{
  if (renderTarget == NULL)
  {
   glViewport(0, 0, CApplication::getScreenWidth(), CApplication::getScreenHeight());
   glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
  }
  else
  {
   glViewport(0, 0, renderTarget->width, renderTarget->height);
   glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, offScreenFramebuffer);
   glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_2D, renderTarget->texture, 0);
  }
}

Materials Blending à la UDK

16 May 2012 - 01:23 AM

Hey,

I've just stumbled upon an interesting video showing off materials blending in UDK (http://www.3dmotive.com/training/udk/advanced-mesh-paint-with-udk/) and I was wondering how such an effect could be implemented.

I can think of two ways this could be accomplished.
The first is to render the mesh twice and use alpha blending: the second copy would be rendered on top of the first one with alpha blending turned on.
The second is to render the mesh once, with both materials computed in a single shader and lerped at the end. That, however, would require some kind of material generation system.

Which solution do you think UDK uses? Does it render mesh twice, or does it use its internal material system to combine component materials?
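For the final color alone, the two approaches compute the same thing, which the following minimal sketch illustrates. The `Color` type and the function names are hypothetical, the blend is assumed to be the standard source-alpha / inverse-source-alpha mode, and nothing here is a claim about what UDK actually does.

```cpp
#include <cmath>

// Hypothetical RGB color used only for this illustration.
struct Color { float r, g, b; };

// Single-pass approach: both material colors are computed in one shader
// and interpolated by the painted blend weight t in [0, 1].
inline Color lerpColors(const Color& a, const Color& b, float t)
{
    return { a.r + (b.r - a.r) * t,
             a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t };
}

// Two-pass approach: `dst` is what the first (opaque) pass left in the
// framebuffer, `src` is the second pass, blended as src*alpha + dst*(1-alpha).
inline Color alphaBlend(const Color& src, const Color& dst, float alpha)
{
    return { src.r * alpha + dst.r * (1.0f - alpha),
             src.g * alpha + dst.g * (1.0f - alpha),
             src.b * alpha + dst.b * (1.0f - alpha) };
}

// Compares two colors channel by channel within a small tolerance.
inline bool nearlyEqual(const Color& a, const Color& b, float eps = 1e-5f)
{
    return std::fabs(a.r - b.r) < eps &&
           std::fabs(a.g - b.g) < eps &&
           std::fabs(a.b - b.b) < eps;
}
```

For example, `lerpColors(base, top, 0.25f)` and `alphaBlend(top, base, 0.25f)` agree for any `base` and `top`. The approaches differ in cost (two draw calls versus one heavier shader) and in what can be blended: a single shader can interpolate inputs such as normals or specular terms before lighting, while the two-pass version can only mix the finished colors.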
