mrheisenberg

DX11
What is the point of using Catmull-Clark subdivision shaders?


I've been checking out demos of Catmull-Clark subdivision implemented with DX11 tessellation, but I don't understand what exactly the benefit of this technique is. The visual results look identical to the simpler, basic dynamic-LOD tessellation shaders in the samples, yet the Catmull-Clark samples are a LOT heavier on performance. What am I missing?


I'm not that familiar with the samples, but they're probably just implementing "linear" tessellation, where more triangles are added but the new triangles stay flat: they don't curve at all to better match the curved surface that's roughly defined by their 'source' triangles. This is useful when you need extra vertices for something like displacement mapping, but not for smoothing out edges.
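
To make "linear" concrete, here is a minimal Python sketch (not taken from the DX11 samples; `linear_tessellate` and `plane_distance` are names made up for this illustration). It splits a triangle at its edge midpoints and checks that every generated vertex still lies in the plane of the source triangle, so the silhouette cannot get any rounder.

```python
def midpoint(a, b):
    """Point halfway between a and b."""
    return tuple((x + y) / 2 for x, y in zip(a, b))

def linear_tessellate(tri):
    """Split one triangle into four by inserting the three edge midpoints."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

def plane_distance(p, tri):
    """Unnormalised signed distance of p from the plane through tri."""
    a, b, c = tri
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    normal = (u[1] * v[2] - u[2] * v[1],
              u[2] * v[0] - u[0] * v[2],
              u[0] * v[1] - u[1] * v[0])
    return sum(normal[i] * (p[i] - a[i]) for i in range(3))

source = ((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 2.0))
sub_triangles = linear_tessellate(source)
# Every new vertex is still on the source plane: more triangles, same shape.
flat = all(plane_distance(p, source) == 0 for t in sub_triangles for p in t)
```

The extra vertices only become visible once something else moves them, e.g. a displacement map.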


Catmull-Clark subD surfaces add curvature to the generated "sub triangles", e.g. on the Wikipedia page, you can see a cube bulge out into a sphere. The artist has control over how/where this "bulging" will occur.

Also, these surfaces and their behaviours are programmed into many 3D modelling packages, so if you implement them in the exact same way, then an artist working with Max/Maya/Blender/Softimage/etc can tweak their "bulge"/"smooth" parameters to get the kind of shape that they want, and then know it's actually going to appear that way in the engine too.

Edited by Hodgman

Actually, the artist has barely any control over where the bulging etc. happens. If you look around on the net, you'll see a lot of beginner artists wondering how they can control it. E.g. if you have a cylinder and you tessellate it with Catmull-Clark to make it rounder, you will end up with a capsule shape. Some editing packages add extensions that let the artist define hard borders, but most workarounds for the original algorithm add two borders along the edges you want to preserve to some degree (beveling in 3ds Max), and you still get some smoothing at them.

But that's actually what makes Catmull-Clark so nice, and why artists who have worked with the pure version don't like the tools that extend it. If you have NURBS surfaces or Bezier patches or ..., artists have to tweak them, and if you have an animated mesh, you have to tweak those control points in every keyframe, which makes it quite a lot of work. Catmull-Clark meshes just work: they deliver mostly the expected result, and they have no control points to skin with the mesh or to adjust. You tessellate an object, it looks nice, you apply a displacement texture and that's it. And while other algorithms usually get into trouble when the valence of your polys varies, Catmull-Clark also works nicely in those special cases.

 

I also think you haven't seen a DX11 tessellation implementation of Catmull-Clark. The DX11 tessellator hardware cannot really be used for Catmull-Clark, as Catmull-Clark is a recursive approach. There are ways to make it non-recursive, but the higher the tessellation factor, the more of the mesh you have to evaluate, so it's not doable beyond some simple shapes. You've probably seen some approximation of Catmull-Clark using e.g. Bezier patches. But those are quite complex and error-prone to implement, and you need to run them on every animation step of a mesh to re-create the approximation (at least that's what I read in the papers when I was implementing it).

 

However, it's quite straightforward to implement Catmull-Clark via compute. It's actually really nice for GPUs: you work on every vertex independently, etc.

http://twitpic.com/3ud6cx

:)


Actually, the artist has barely any control over where the bulging etc. happens.

I've never modelled anything with Catmull-Clark surfaces -- is the tessellated shape dependent only on the vertex positions and normals, like Phong tessellation?



Normals are ignored. The new points are built by averaging neighbouring polygon centers, edge centers, vertices... The different rules for subdivided corner points and for edge and polygon centers are simple, but because the process is recursive, it's difficult to accelerate.
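
For reference, the averaging rules can be sketched in a few lines of plain Python. This is a minimal, unoptimized illustration of the standard rules for a closed mesh without creases or boundaries, not code from any of the posters; `catmull_clark_step` and the cube data are made up for the example.

```python
def catmull_clark_step(verts, faces):
    """One Catmull-Clark step on a closed mesh (every edge shared by two faces).

    verts: list of (x, y, z) tuples; faces: list of vertex-index lists.
    Returns (new_verts, new_faces); every new face is a quad.
    """
    def avg(points):
        n = len(points)
        return tuple(sum(p[i] for p in points) / n for i in range(3))

    # 1. Face points: the centroid of each face.
    face_pts = [avg([verts[i] for i in f]) for f in faces]

    # Adjacency: faces around each edge, faces and edges around each vertex.
    edge_faces = {}
    vert_faces = [[] for _ in verts]
    vert_edges = [set() for _ in verts]
    for fi, f in enumerate(faces):
        for k, a in enumerate(f):
            b = f[(k + 1) % len(f)]
            e = frozenset((a, b))
            edge_faces.setdefault(e, []).append(fi)
            vert_faces[a].append(fi)
            vert_edges[a].add(e)
            vert_edges[b].add(e)

    # 2. Edge points: average of both endpoints and both adjacent face points.
    edge_pts = {e: avg([verts[i] for i in e] + [face_pts[fi] for fi in fs])
                for e, fs in edge_faces.items()}

    # 3. Moved corner points: (F + 2R + (n - 3)P) / n, with n the valence,
    #    F the average adjacent face point, R the average adjacent edge midpoint.
    moved = []
    for vi, p in enumerate(verts):
        n = len(vert_faces[vi])
        F = avg([face_pts[fi] for fi in vert_faces[vi]])
        R = avg([avg([verts[i] for i in e]) for e in vert_edges[vi]])
        moved.append(tuple((F[i] + 2 * R[i] + (n - 3) * p[i]) / n
                           for i in range(3)))

    # Assemble: an n-sided face becomes n quads, one per corner.
    face_base = len(moved)
    edge_index = {e: face_base + len(faces) + k for k, e in enumerate(edge_pts)}
    new_verts = moved + face_pts + [edge_pts[e] for e in edge_index]
    new_faces = []
    for fi, f in enumerate(faces):
        for k, v in enumerate(f):
            nxt, prv = f[(k + 1) % len(f)], f[k - 1]
            new_faces.append([v, edge_index[frozenset((v, nxt))],
                              face_base + fi, edge_index[frozenset((prv, v))]])
    return new_verts, new_faces

# Unit cube with corners at +/-1; vertex 7 is (1, 1, 1).
cube_verts = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
cube_faces = [[0, 1, 3, 2], [4, 6, 7, 5], [0, 4, 5, 1],
              [2, 3, 7, 6], [0, 2, 6, 4], [1, 5, 7, 3]]
new_verts, new_faces = catmull_clark_step(cube_verts, cube_faces)
```

Only positions and connectivity go in, no normals. On the cube, the corner at (1, 1, 1) is pulled in to (5/9, 5/9, 5/9), which is the rounding you see in the usual cube-to-sphere pictures.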

I've done a lot of modeling with Catmull-Clark and also made my own editor because I was not happy with the crease options in commercial apps.
For modeling organic shapes, Catmull-Clark is the best option. With proper creases it's also a very good alternative to NURBS for things like cars etc., while still being easier to understand.
Cons: you need to avoid triangles and use regular quad grids whenever possible. A good model will end up with mostly quads, some 5-sided and a few 6-sided polygons.
Subdividing a typical triangulated mesh makes no sense - you need the original quad-based model to get good results.

The first subdivision step is special: it does the most important work and ends up with a mesh containing quads only.
For good HW acceleration it makes sense to do it with its own algorithm, maybe on the CPU.
For the following steps it could make sense to switch to a more hardware-friendly method, like Bezier patches.
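
A quick way to see why the first step is the odd one out is to count elements only. The sketch below assumes a closed mesh; `subdivision_counts` is a hypothetical helper written for this illustration. Each n-sided face turns into n quads on the first step, and from then on every step simply quadruples the face count.

```python
def subdivision_counts(num_verts, num_edges, face_sides, levels):
    """Yield (V, E, F) after each Catmull-Clark level of a closed mesh.

    face_sides lists the number of sides of every face in the control mesh.
    """
    v, e = num_verts, num_edges
    sides = list(face_sides)
    for _ in range(levels):
        corners = sum(sides)        # each n-gon contributes n new quads
        v = v + e + len(sides)      # kept vertices + edge points + face points
        e = 2 * e + corners         # split edges + spokes to the face points
        sides = [4] * corners       # from now on the mesh is quads only
        yield v, e, len(sides)

# A cube (all quads) and a triangular prism (2 triangles + 3 quads).
cube_levels = list(subdivision_counts(8, 12, [4] * 6, 3))
prism_levels = list(subdivision_counts(6, 9, [3, 3, 4, 4, 4], 2))
```

The regular growth after level one is what makes the later steps easier to map to hardware than the first, irregular one.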

If anyone has experience with practical HW acceleration, I would like to hear something about it too...

Note that having no control points to skin can be a very good thing, because if you do the skinning with the low-res control mesh, you get MUCH better final high-res skinning! This also saves some work, as you don't need to skin the subdivided stuff.
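
The ordering can be sketched as follows. This is only an illustration under loud assumptions: the bones are translation-only (real linear blend skinning uses full bone matrices), and `skin`, `face_point`, the weights and the offsets are all made up for the example. The point is that the bone weights live on the few control vertices, and the subdivided points are derived afterwards from the already-skinned positions.

```python
def skin(verts, weights, bone_offsets):
    """Blend per-bone translations into each control vertex (simplified skinning)."""
    skinned = []
    for v, w in zip(verts, weights):
        skinned.append(tuple(
            v[i] + sum(wk * off[i] for wk, off in zip(w, bone_offsets))
            for i in range(3)))
    return skinned

def face_point(verts, face):
    """Centroid of a face: the simplest of the Catmull-Clark averaging rules."""
    n = len(face)
    return tuple(sum(verts[j][i] for j in face) / n for i in range(3))

# One quad of a control mesh; two vertices follow bone 0, two follow bone 1.
control = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
weights = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
bone_offsets = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]   # bone 1 lifts its vertices

skinned = skin(control, weights, bone_offsets)      # 4 vertices skinned
centre = face_point(skinned, [0, 1, 2, 3])          # derived, needs no weights
```

The generated point needs no bone weights of its own; it just ends up halfway between the two bones' influence.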

Skinning is where the difference to other tessellation methods shows up most noticeably, because the corner vertices get smoothed too, not just the surface around them. Maybe it's hard for a programmer to get the point of why they are so good compared to other methods - but with skinning the difference in visual quality is really huge. Trust me :)
Edited by JoeJ

Hodgman

JoeJ pretty much hits the spot :)

Just to emphasize it: while only positions are used and it sounds like you lose a lot of information (e.g. curvature that normals might express), that's actually the really good point of the algorithm. It is very, very simple, you know what to expect, and every implementation will lead to the same result (if you try to get some data from one modeling package to another, tessellated stuff can be a horror, while a Catmull-Clark model is basically just an OBJ mesh, with no extra features/data).

 

If anyone has experience with practical HW acceleration, I would like to hear something about it too...
Note that having no control points to skin can be a very good thing, because if you do the skinning with the low-res control mesh, you get MUCH better final high-res skinning! This also saves some work, as you don't need to skin the subdivided stuff.

You mean the tessellator on the GPU? I've used it to implement the approximation described in this paper: http://faculty.cs.tamu.edu/schaefer/research/acc.pdf

 

As I said in my first post here, the sad part comes with animation: I had to evaluate the skinned mesh every time to generate those patches, and making it leak-free is quite an effort. It's nothing compared to the simplicity and beauty of Catmull-Clark tessellation.

 

 

Skinning is where the difference to other tessellation methods shows up most noticeably, because the corner vertices get smoothed too, not just the surface around them. Maybe it's hard for a programmer to get the point of why they are so good compared to other methods - but with skinning the difference in visual quality is really huge. Trust me :)

I totally agree, that's why I've made the GPGPU version of it. It works flawlessly with skinned characters, it's fast even in the CPU version (vectorized), you can go crazy up to 1 million vertices, then displace them (also with GPGPU), and it just works. :)

 

Hardware tessellation units are way faster, of course, but even without HW, you can get to a point where the polycount exceeds the pixelcount by far (while you still have normalmaps etc) and it's still running smoothly on average GPUs.


Thx for summing up again, that makes a lot of sense to me now. I'm not really up to date with GPU stuff and missed the point that OGL/DX now have their own compute stuff, so we can avoid choosing between CUDA and OpenCL :)


Thx for summing up again, that makes a lot of sense to me now. I'm not really up to date with GPU stuff and missed the point that OGL/DX now have their own compute stuff, so we can avoid choosing between CUDA and OpenCL :)

I've actually implemented it in OpenCL.

I've also written a rasterizer in OpenCL (for this renderer) rather than doing interop with OGL/DX (though sadly I have no Catmull-Clark + software rasterizer screenshot, just http://twitpic.com/40e85b ), and abusing the massive compute power for rasterization actually works quite nicely. You set up 1024 triangles in local memory, then you can work on them at 8x8 pixel granularity. I think I got 10% to 20% of the theoretical peak hardware rasterization performance in a real-world scenario. It wasn't even fully optimized; I just stopped when it was fast enough (it was just 2 or 3 days of work to make the rasterizer).
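
For anyone wondering what "8x8 pixel granularity" could look like, here is a rough CPU-side sketch in Python. It is not the OpenCL kernel described above; it assumes a common edge-function coverage test at pixel centres with counter-clockwise triangles, and `edge`, `rasterize_tile` and `TILE` are made-up names. On a GPU, the batch of triangles would sit in local memory and each work-item would take one pixel of the tile.

```python
TILE = 8  # tile edge length in pixels

def edge(a, b, p):
    """Twice the signed area of (a, b, p); positive when p is left of a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize_tile(tile_x, tile_y, triangles):
    """Return an 8x8 grid with the index of the last triangle covering each pixel."""
    grid = [[None] * TILE for _ in range(TILE)]
    for ti, (a, b, c) in enumerate(triangles):
        if edge(a, b, c) <= 0:
            continue  # skip degenerate and clockwise triangles
        for y in range(TILE):
            for x in range(TILE):
                # Sample at the pixel centre.
                p = (tile_x * TILE + x + 0.5, tile_y * TILE + y + 0.5)
                if edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0:
                    grid[y][x] = ti
    return grid

# One triangle covering the lower-left half of tile (0, 0).
batch = [((0.0, 0.0), (8.0, 0.0), (0.0, 8.0))]
tile = rasterize_tile(0, 0, batch)
covered = sum(cell is not None for row in tile for cell in row)
```

A real rasterizer would add depth testing, fill rules for shared edges and binning of triangles to tiles, all of which are left out here.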


