
DX11 What is the point of using Catmull-Clark subdivision shaders?


I've been checking out demos of Catmull-Clark subdivision implemented with DX11 tessellation; however, I don't understand what exactly the benefit of this technique is. The visual results look identical to the simpler, basic dynamic-LOD tessellation shaders in the samples, yet the Catmull-Clark samples are a LOT heavier on performance. What am I missing?


I'm not that familiar with the samples, but they're probably just implementing "linear" tessellation, where more triangles are added but they don't curve at all to better match the curved surface that's roughly defined by their 'source' triangles. That's useful when you need extra vertices for something like displacement mapping, but not for smoothing out edges.
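To make the distinction concrete, here is a minimal C++ sketch (types and names invented for illustration, not taken from the samples): a linearly tessellated vertex stays in the source triangle's plane, and only the displacement term adds any new shape.

    struct Vec3 { float x, y, z; };

    static Vec3 add(Vec3 a, Vec3 b)  { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3 mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }

    // Barycentric interpolation at (u, v); the third weight is 1 - u - v.
    static Vec3 lerpBary(Vec3 a, Vec3 b, Vec3 c, float u, float v) {
        return add(add(mul(a, 1.0f - u - v), mul(b, u)), mul(c, v));
    }

    // One generated vertex: the position is a plain blend of the corners, so
    // it lies exactly in the original triangle's plane ("linear" tessellation).
    // Only the displacement along the interpolated normal adds new shape.
    Vec3 tessellatedVertex(Vec3 p0, Vec3 p1, Vec3 p2,
                           Vec3 n0, Vec3 n1, Vec3 n2,
                           float u, float v,
                           float height)  // e.g. a height-map sample
    {
        Vec3 p = lerpBary(p0, p1, p2, u, v);  // flat without displacement
        Vec3 n = lerpBary(n0, n1, n2, u, v);  // left unnormalized in this sketch
        return add(p, mul(n, height));
    }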


Catmull-Clark subD surfaces add curvature to the generated "sub-triangles"; e.g. on the Wikipedia page you can see a cube bulge out into a sphere. The artist has control over how/where this "bulging" will occur.

Also, these surfaces and their behaviours are programmed into many 3D modelling packages, so if you implement them in exactly the same way, an artist working in Max/Maya/Blender/Softimage/etc. can tweak their "bulge"/"smooth" parameters to get the kind of shape they want, and then know it's actually going to appear that way in the engine too.

Edited by Hodgman


Actually, the artist has barely any control over where bulging etc. happens. If you look around the net, you'll see that a lot of beginner artists wonder how they can control it. E.g. if you have a cylinder and you tessellate it with Catmull-Clark to make it rounder, you will end up with a capsule shape. Some editing packages add extensions where artists can define hard borders, but most workarounds for the original algorithm are to add two extra borders next to edges you want to preserve to some degree (beveling in 3ds Max), and you still get some smoothing at them.

But that's actually what makes Catmull-Clark so nice, and why artists who have worked with the pure version don't like the tools that extend it. If you have NURBS surfaces or Bezier patches or similar, artists have to tweak them, and if you have an animated mesh, you have to tweak those control points in every keyframe, which is quite a lot of work. Catmull-Clark meshes just work: they mostly deliver the expected result, and there are no control points to skin with the mesh or to adjust. You tessellate an object, it looks nice, you apply a displacement texture, and that's it. And while other algorithms usually get into trouble when the valence of your polygons varies, Catmull-Clark handles those special cases nicely too.

 

I also think you haven't actually seen a DX11 tessellation implementation of Catmull-Clark. The DX11 tessellator hardware cannot really be used for Catmull-Clark, because Catmull-Clark is a recursive algorithm. There are ways to make it non-recursive, but the higher the tessellation factor, the more of the mesh you have to evaluate per patch, so it's not practical beyond some simple shapes. You've probably seen an approximation of Catmull-Clark using e.g. Bezier patches. But those are quite complex and error-prone to implement, and you need to re-run them on every animation step of the mesh to re-create the approximation (at least that's what I read in the papers when I was implementing it).
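For context, the cheap half of such a patch-based approximation is just evaluating bicubic Bezier patches at the tessellated domain points; the hard, error-prone part mentioned above is deriving the control points from the animated control mesh. A minimal evaluation sketch in C++ (illustrative layout, not code from any of the papers):

    struct Vec3 { float x, y, z; };

    static Vec3 madd(Vec3 acc, Vec3 p, float w) {
        return { acc.x + p.x * w, acc.y + p.y * w, acc.z + p.z * w };
    }

    // Cubic Bernstein basis at parameter t.
    static void bernstein3(float t, float b[4]) {
        float s = 1.0f - t;
        b[0] = s * s * s;
        b[1] = 3.0f * s * s * t;
        b[2] = 3.0f * s * t * t;
        b[3] = t * t * t;
    }

    // Evaluate a bicubic Bezier patch (16 control points, row-major) at (u, v).
    // Deriving `cp` from the subdivision control mesh - per patch, and per
    // frame once the mesh animates - is the expensive, fiddly part.
    Vec3 evalBicubicBezier(const Vec3 cp[16], float u, float v) {
        float bu[4], bv[4];
        bernstein3(u, bu);
        bernstein3(v, bv);
        Vec3 p = { 0.0f, 0.0f, 0.0f };
        for (int j = 0; j < 4; ++j)
            for (int i = 0; i < 4; ++i)
                p = madd(p, cp[j * 4 + i], bu[i] * bv[j]);
        return p;
    }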

 

However, it's quite straightforward to implement Catmull-Clark via compute. It's actually a really nice fit for GPUs, since every vertex can be processed independently.
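A rough sketch of that structure (not my actual implementation; the mesh layout is invented for illustration): one subdivision step decomposes into a few passes like the face-point pass below, where every output is independent, so on a GPU each loop iteration becomes one thread. A parallel std::for_each stands in for the compute dispatch here.

    #include <algorithm>
    #include <cstddef>
    #include <execution>
    #include <numeric>
    #include <vector>

    struct Vec3 { float x = 0, y = 0, z = 0; };
    static Vec3 add(Vec3 a, Vec3 b)  { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3 mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }

    struct Face { std::vector<int> verts; };  // vertex indices, any polygon size

    // Face-point pass: each new point is the average of one face's vertices.
    // No output depends on any other output, which is what makes it GPU-friendly.
    std::vector<Vec3> facePoints(const std::vector<Vec3>& pos,
                                 const std::vector<Face>& faces)
    {
        std::vector<Vec3> fp(faces.size());
        std::vector<std::size_t> ids(faces.size());
        std::iota(ids.begin(), ids.end(), std::size_t{0});
        std::for_each(std::execution::par, ids.begin(), ids.end(),
                      [&](std::size_t f) {  // one "thread" per face
            Vec3 sum;
            for (int vi : faces[f].verts)
                sum = add(sum, pos[vi]);
            fp[f] = mul(sum, 1.0f / (float)faces[f].verts.size());
        });
        return fp;
    }
    // The edge-point and vertex-point passes have the same shape; their
    // averaging rules are spelled out further down the thread.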

http://twitpic.com/3ud6cx

:)


Actually, the artist has barely any control over where bulging etc. happens.

I've never modelled anything with Catmull-Clark surfaces -- is the tessellation shape dependent only on the vertex positions and normals, like Phong tessellation?




Actually, the artist has barely any control over where bulging etc. happens.

I've never modelled anything with Catmull-Clark surfaces -- is the tessellation shape dependent only on the vertex positions and normals, like Phong tessellation?

Normals are ignored. The new points are built by averaging neighbouring polygon centers, edge centers, and vertices. The different rules for the subdivided corner points and the edge/polygon centers are simple, but because the process is recursive, it's difficult to accelerate.
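Written out, the interior rules are only a few averages. A small C++ sketch (assuming a closed mesh without creases; boundary and crease rules differ, and gathering the adjacency is left to the caller):

    #include <vector>

    struct Vec3 { float x = 0, y = 0, z = 0; };
    static Vec3 add(Vec3 a, Vec3 b)  { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3 mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }

    // Face point: average of the polygon's vertices (any n-gon). Note that
    // only positions enter anywhere below - normals really are ignored.
    Vec3 facePoint(const std::vector<Vec3>& faceVerts) {
        Vec3 sum;
        for (const Vec3& v : faceVerts) sum = add(sum, v);
        return mul(sum, 1.0f / (float)faceVerts.size());
    }

    // Edge point (interior edge): average of the two endpoints and the two
    // adjacent face points.
    Vec3 edgePoint(Vec3 v0, Vec3 v1, Vec3 leftFacePt, Vec3 rightFacePt) {
        return mul(add(add(v0, v1), add(leftFacePt, rightFacePt)), 0.25f);
    }

    // Moved original vertex of valence n:  (Q + 2R + (n - 3) P) / n, with
    //   Q = average of the n surrounding face points,
    //   R = average of the midpoints of the n incident edges,
    //   P = the old position.
    Vec3 vertexPoint(Vec3 P,
                     const std::vector<Vec3>& adjFacePoints,
                     const std::vector<Vec3>& incidentEdgeMidpoints)
    {
        const float n = (float)adjFacePoints.size();
        Vec3 Q, R;
        for (const Vec3& f : adjFacePoints)         Q = add(Q, f);
        for (const Vec3& m : incidentEdgeMidpoints) R = add(R, m);
        Q = mul(Q, 1.0f / n);
        R = mul(R, 1.0f / n);
        return mul(add(add(Q, mul(R, 2.0f)), mul(P, n - 3.0f)), 1.0f / n);
    }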

I've done a lot of modeling with Catmull-Clark and even wrote my own editor because I was not happy with the crease options in commercial apps.
For modeling organic shapes, Catmull-Clark is the best option. With proper creases it's also a very good alternative to NURBS for things like cars etc., while still being easier to understand.
Cons are: you need to avoid triangles and use regular quad grids wherever possible. A good model ends up with mostly quads, some 5-sided, and a few 6-sided polygons.
Subdividing a typical triangulated mesh makes no sense - you need the original quad-based model to get good results.

The first subdivision step is special: it does the most important work, and it always ends up with a mesh containing only quads (each n-sided polygon is split into n quads).
For good HW acceleration it makes sense to handle that step with its own algorithm, maybe on the CPU.
For the following steps it can make sense to switch to a more hardware-friendly method, like Bezier patches.

If anyone has experience with practical HW acceleration, I would like to hear about it too...
Note that this can be a very good thing, because if you do the skinning on the low-res control mesh, you get MUCH better final high-res skinning! It also saves some work, as you don't need to skin the subdivided geometry.
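A sketch of that ordering (types and layout invented for illustration): skin only the small control cage, then let the unweighted subdivision passes run on the skinned positions, so the dense mesh never needs bone weights of its own.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Vec3 { float x = 0, y = 0, z = 0; };
    struct Mat3x4 { float m[3][4]; };  // one bone's transform, row-major

    static Vec3 xform(const Mat3x4& M, Vec3 p) {
        return { M.m[0][0]*p.x + M.m[0][1]*p.y + M.m[0][2]*p.z + M.m[0][3],
                 M.m[1][0]*p.x + M.m[1][1]*p.y + M.m[1][2]*p.z + M.m[1][3],
                 M.m[2][0]*p.x + M.m[2][1]*p.y + M.m[2][2]*p.z + M.m[2][3] };
    }

    struct Influence { std::uint8_t bone[4]; float weight[4]; };  // 4 bones/vertex

    // Step 1: skin ONLY the low-res control cage (cheap: few vertices).
    std::vector<Vec3> skinCage(const std::vector<Vec3>& restPos,
                               const std::vector<Influence>& skin,
                               const std::vector<Mat3x4>& bones)
    {
        std::vector<Vec3> out(restPos.size());
        for (std::size_t v = 0; v < restPos.size(); ++v) {
            Vec3 acc;
            for (int i = 0; i < 4; ++i) {
                Vec3 p = xform(bones[skin[v].bone[i]], restPos[v]);
                acc.x += p.x * skin[v].weight[i];
                acc.y += p.y * skin[v].weight[i];
                acc.z += p.z * skin[v].weight[i];
            }
            out[v] = acc;
        }
        // Step 2 (elsewhere): run the Catmull-Clark passes on `out`. Because
        // subdivision only averages, the dense result inherits the smooth
        // deformation and needs no per-vertex skinning data at all.
        return out;
    }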

Skinning is where the difference to other tessellation methods shows up most noticeably, because the corner vertices get smoothed too, not just the surface around them. Maybe it's hard for a programmer to see why these surfaces are so good compared to other methods, but with skinning the difference in visual quality is really huge. Trust me :) Edited by JoeJ


Hodgman

JoeJ pretty much hits the spot :)

Just to emphasize it: while only vertex positions are used, and it sounds like you lose a lot of information (e.g. curvature that normals might express), that is actually the really good part of the algorithm. It is very, very simple, you know what to expect, and every implementation leads to the same result. (If you try to move data from one modeling package to another, tessellated surfaces can be a horror, while a Catmull-Clark mesh is basically just an OBJ mesh - no extra features or data.)

 

If anyone has experience with practical HW acceleration, I would like to hear about it too...
Note that this can be a very good thing, because if you do the skinning on the low-res control mesh, you get MUCH better final high-res skinning! It also saves some work, as you don't need to skin the subdivided geometry.

You mean the tessellator on the GPU? I've used it to implement the approximation described in this paper: http://faculty.cs.tamu.edu/schaefer/research/acc.pdf

 

As I said in my first post here, the sad thing comes with animation: I had to evaluate the skinned mesh every frame to regenerate those patches, and making the result leak-free (no cracks between patches) is quite an effort - nothing compared to the simplicity and beauty of Catmull-Clark tessellation.

 

 

Skinning is where the difference to other tessellation methods shows up most noticeably, because the corner vertices get smoothed too, not just the surface around them. Maybe it's hard for a programmer to see why these surfaces are so good compared to other methods, but with skinning the difference in visual quality is really huge. Trust me :)

I totally agree; that's why I made the GPGPU version of it. It works flawlessly with skinned characters, and it's fast even in the (vectorized) CPU version. You can go crazy up to a million vertices, then displace them (also with GPGPU), and it just works. :)

 

Hardware tessellation units are way faster, of course, but even without them you can get to a point where the polygon count exceeds the pixel count by far (while you still have normal maps etc.) and it still runs smoothly on average GPUs.


Thanks for summing it up again, that makes a lot of sense to me now. I'm not really up to date with GPU stuff and had missed that OGL/DX now have their own compute support, so we can avoid choosing between CUDA and OpenCL :)


Thanks for summing it up again, that makes a lot of sense to me now. I'm not really up to date with GPU stuff and had missed that OGL/DX now have their own compute support, so we can avoid choosing between CUDA and OpenCL :)

I've actually implemented it in OpenCL.

I've also written a rasterizer in OpenCL (for this renderer) rather than inter-op with OGL/DX (though I sadly have no Catmull + software-rasterizer screenshot, just http://twitpic.com/40e85b ). Abusing the massive compute power for rasterization actually works quite nicely: you set up 1024 triangles in local memory, then work on them at 8x8-pixel granularity. I think I got 10% to 20% of the theoretical peak hardware rasterization performance in a real-world scenario, and it wasn't even fully optimized - I just stopped when it was fast enough (the rasterizer was only 2 or 3 days of work).
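The 8x8 granularity boils down to evaluating three edge functions per pixel of a tile. A CPU-side C++ sketch of that inner test (everything here is illustrative; the actual kernel batches the 1024 set-up triangles through local memory first):

    #include <cstdint>

    struct Vec2 { float x, y; };

    // Signed-area-style edge function: >= 0 when p is on the inner side of
    // edge a->b, assuming a consistent (counter-clockwise) winding.
    static float edgeFn(Vec2 a, Vec2 b, Vec2 p) {
        return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    }

    // Rasterize one triangle against the 8x8 tile whose top-left pixel is
    // (tileX, tileY). Returns a 64-bit coverage mask, one bit per pixel.
    std::uint64_t rasterizeTile(Vec2 v0, Vec2 v1, Vec2 v2, int tileX, int tileY) {
        std::uint64_t mask = 0;
        for (int y = 0; y < 8; ++y) {
            for (int x = 0; x < 8; ++x) {
                Vec2 p = { tileX + x + 0.5f, tileY + y + 0.5f };  // pixel center
                if (edgeFn(v0, v1, p) >= 0.0f &&
                    edgeFn(v1, v2, p) >= 0.0f &&
                    edgeFn(v2, v0, p) >= 0.0f)
                    mask |= std::uint64_t(1) << (y * 8 + x);
            }
        }
        return mask;
    }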
