
AliasBinman

Member
  • Content count

    54
  • Joined

  • Last visited

Community Reputation

855 Good

About AliasBinman

  • Rank
    Member
  1. Apply shadow map on scene

    The matrices for rendering the shadow map and for sampling from it are different because of the viewport vs texture coordinate conventions. When rendering the shadow map, clip-space XY runs from (-1,+1) at the top-left to (+1,-1) at the bottom-right; when sampling, texture coordinates run from (0,0) to (1,1). You need to adjust the shadowVPMatrix by concatenating something like this:

        [ 0.5    0     0    0 ]
        [ 0    -0.5    0    0 ]
        [ 0      0     1    0 ]
        [ 0.5   0.5    0    1 ]
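
    For reference, a minimal C++ sketch of building and concatenating that bias matrix on the CPU, assuming a row-major, row-vector (v * M) convention; the Mul4x4 helper is just a stand-in for whatever math library you use:

        #include <cstring>

        // Texture-space bias: maps clip-space XY in [-1,+1] (Y up) to UV in [0,1]
        // (V down) and leaves depth untouched.
        static const float kShadowBias[4][4] = {
            { 0.5f,  0.0f, 0.0f, 0.0f },
            { 0.0f, -0.5f, 0.0f, 0.0f },
            { 0.0f,  0.0f, 1.0f, 0.0f },
            { 0.5f,  0.5f, 0.0f, 1.0f },
        };

        // out = a * b, so the sampling matrix is Mul4x4(out, shadowViewProj, kShadowBias).
        void Mul4x4(float out[4][4], const float a[4][4], const float b[4][4])
        {
            float r[4][4] = {};
            for (int i = 0; i < 4; ++i)
                for (int j = 0; j < 4; ++j)
                    for (int k = 0; k < 4; ++k)
                        r[i][j] += a[i][k] * b[k][j];
            std::memcpy(out, r, sizeof(r));
        }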
  2. Clustered shading - why deferred?

    There are a few reasons for performance gains. Whether the gains outweigh the losses depends on the situation, so it is not always a net win overall; I presume with their data sets it was.

    Consider that all pixel shading has two phases. The first phase evaluates the material properties of the closest surface for the pixel. The second phase uses those material parameters along with all the lighting to evaluate the BRDF and derive a final color. For forward shading, material evaluation and BRDF evaluation are done in the same shader. For deferred, material evaluation is performed first and the results are written to GBuffers; later passes then read the GBuffers and evaluate the BRDF. Typically material evaluation is cheaper than lighting evaluation.

    If you are rendering at, say, 1920x1080 you have about 2M pixels to evaluate. When you render a scene made of lots of triangles there is a certain amount of inefficiency that causes you to ultimately process more than 2M pixels: dead pixels inside pixel quads, overdraw that still happens with Hi-Z due to quantisation issues, and non-full wavefronts. So for deferred you will evaluate materials more often than strictly needed but light each pixel exactly once, whereas for forward you will both material-evaluate and light each pixel more than once.

    Also, for forward the shaders are bigger and likely to use more registers, which reduces occupancy and makes the GPU more likely to be blocked on memory reads.

    Finally, in the deferred case you can guarantee that wavefronts of pixel data hit only one screen-space tile. This can improve caching and allow some operations to run wavefront-wide rather than per-pixel.
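
    As a back-of-envelope model of that trade-off (purely illustrative C++; the function names and parameters are made up, not anything from the paper):

        // Forward pays material + lighting for every rasterized fragment,
        // overdraw and partial quads included.
        float ForwardCost(float rasterizedFragments, float materialCost, float lightingCost)
        {
            return rasterizedFragments * (materialCost + lightingCost);
        }

        // Deferred pays material per rasterized fragment, but lighting exactly
        // once per screen pixel, read back from the GBuffer.
        float DeferredCost(float rasterizedFragments, float screenPixels,
                           float materialCost, float lightingCost)
        {
            return rasterizedFragments * materialCost + screenPixels * lightingCost;
        }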
  3. DX11 Problem with Diffuse Lighting

    You are not setting the World matrix in the object CB. The transformed normal is probably coming out as (0,0,0)
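
    For reference, a minimal D3D11-style sketch of filling a per-object constant buffer that includes the world matrix; the ObjectCB layout and names here are assumptions about your setup, not your actual code:

        #include <d3d11.h>
        #include <DirectXMath.h>
        using namespace DirectX;

        // Mirrors the cbuffer the vertex shader expects.
        struct ObjectCB
        {
            XMFLOAT4X4 world;           // needed to transform normals as well as positions
            XMFLOAT4X4 worldViewProj;
        };

        void UpdateObjectCB(ID3D11DeviceContext* ctx, ID3D11Buffer* cb,
                            FXMMATRIX world, CXMMATRIX viewProj)
        {
            ObjectCB data;
            // HLSL packs matrices column-major by default, so transpose the
            // row-major DirectXMath matrices before upload.
            XMStoreFloat4x4(&data.world, XMMatrixTranspose(world));
            XMStoreFloat4x4(&data.worldViewProj, XMMatrixTranspose(world * viewProj));
            ctx->UpdateSubresource(cb, 0, nullptr, &data, 0, 0);
        }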
  4. Avoiding huge far clip

    Use an infinite far plane. This is a great paper on improving perspective precision: http://www.geometry.caltech.edu/pubs/UD12.pdf
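
    A sketch of such a projection matrix, assuming D3D-style conventions (row-major, row-vector, left-handed, depth in [0,1]); treat it as illustrative rather than drop-in:

        #include <cmath>

        // Far plane at infinity: depth tends to 1 as view-space z grows, so
        // nothing is clipped for being "too far".
        void PerspectiveInfiniteFarLH(float out[4][4], float fovY, float aspect, float nearZ)
        {
            const float yScale = 1.0f / std::tan(fovY * 0.5f);
            const float xScale = yScale / aspect;

            const float m[4][4] = {
                { xScale, 0.0f,   0.0f,   0.0f },
                { 0.0f,   yScale, 0.0f,   0.0f },
                { 0.0f,   0.0f,   1.0f,   1.0f },   // limit of f/(f-n) as f -> infinity
                { 0.0f,   0.0f,  -nearZ,  0.0f },   // limit of -n*f/(f-n) as f -> infinity
            };
            // Reversed-Z variant (better precision still): set m[2][2] = 0 and
            // m[3][2] = nearZ, clear depth to 0 and use a GREATER depth test.
            for (int i = 0; i < 4; ++i)
                for (int j = 0; j < 4; ++j)
                    out[i][j] = m[i][j];
        }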
  5. What format is the color output/input? You should define it to be a texcoord if it's currently a color.
  6. The banding you are seeing is expected with the FP10 format. The format has a 6-bit mantissa and a 4-bit exponent, so from 0.5 to 1.0 you have 64 unique codes (2^6); 1 / 0.0156 == 64. The banding is quantisation error introduced when storing to the FP10 blue channel. A solution is to add noise or dither at export time to mask the banding artifacts. Here is a great presentation on this: http://loopit.dk/banding_in_games.pdf
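
    A small sketch of the dither idea; the same maths would sit in the shader (or the export step) just before the value is written, and the hash here is a throwaway stand-in for blue noise or whatever the linked slides recommend:

        #include <cstdint>

        // Cheap per-pixel hash in [0,1).
        float Hash01(uint32_t x, uint32_t y)
        {
            uint32_t h = x * 374761393u + y * 668265263u;
            h = (h ^ (h >> 13)) * 1274126177u;
            return float(h & 0x00FFFFFFu) / 16777216.0f;
        }

        // Add +/- half a quantisation step of noise before the store, trading
        // visible banding for fine grain.
        float DitherBeforeStore(float value, uint32_t px, uint32_t py, float stepSize)
        {
            float noise = Hash01(px, py) - 0.5f;   // centred noise in [-0.5, 0.5)
            return value + noise * stepSize;       // e.g. stepSize = 1.0f / 64.0f here
        }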
  7. The best way to think of it is to picture what a pixel looks like when projected into texture space. If the surface is aligned with the camera near plane then the pixel maps to a square in texture space; it could be freely rotated, however, and look like a diamond. For point sampling, the texel closest to this square sample is used. For bilinear, it's a weighted average of the 4 nearest. For mip mapping, a mip level is selected such that the square covers about 1 texel of area. Think about this square projected onto each mip level: as each mip level goes down, each texel effectively doubles in size in each axis.

    For anisotropic filtering, the pixel footprint is elongated and forms a thin rectangle. There is no single mip level which fully represents all of the rectangle. Instead the rectangle can be subdivided into smaller, more square-like pieces; each subdivision has its own texel centre and can be bilinear filtered, and the results averaged to better represent all the texels that touch the rectangle.

    Hope this helps. I know pictures would explain this better; I'm sure there are some good visual explanations out there.
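
    To make the "about 1 texel of area" rule concrete, here is a rough sketch of isotropic mip selection from texture-space derivatives; it is a simplification of what the hardware does, and anisotropic filtering instead takes several samples spread along the footprint's long axis:

        #include <algorithm>
        #include <cmath>

        // dudx etc. are the change in texel coordinates per screen pixel.
        // The chosen mip is the one where the footprint is roughly one texel wide.
        float IsotropicMipLevel(float dudx, float dvdx, float dudy, float dvdy)
        {
            float extentX = std::sqrt(dudx * dudx + dvdx * dvdx);  // footprint along screen x
            float extentY = std::sqrt(dudy * dudy + dvdy * dvdy);  // footprint along screen y
            float footprint = std::max(extentX, extentY);
            return std::max(0.0f, std::log2(footprint));           // each mip doubles texel size
        }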
  8. Per Triangle Culling (GDC Frostbite)

    The point is that the ALU processing capabilities far exceed those of the fixed-function triangle setup and rasterizer. Using compute you can prune the set of triangles up front to get rid of the ones that don't lead to any shaded pixels and would otherwise just be discarded by the rasterizer. It's purely there to get a bit more performance.
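
    A CPU-side sketch of the kind of per-triangle test such a compute pass runs (back-facing / zero-area here); this is not Frostbite's actual code, and a real pass also rejects triangles that miss every sample point or fall entirely outside the frustum:

        // Clip-space vertex positions after the vertex transform (x, y, z, w).
        struct ClipVert { float x, y, z, w; };

        // Returns true if the triangle can be skipped because it is back-facing
        // or degenerate. Valid while all three w values are positive.
        bool CullTriangle(const ClipVert& a, const ClipVert& b, const ClipVert& c)
        {
            // Signed area of the projected triangle, written as a determinant so
            // the divides by w can be avoided.
            float det = (b.x * a.w - a.x * b.w) * (c.y * a.w - a.y * c.w)
                      - (c.x * a.w - a.x * c.w) * (b.y * a.w - a.y * b.w);
            return det <= 0.0f;   // sign convention depends on winding order / handedness
        }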
  9. It's probably worth calculating the distance using a projection onto the view-space forward vector, so:   float dist = dot(billboardpos - cameraPosition, cameraFwd)
  10. Non-linear zoom for 2D

    The following code will do what you want assuming you have a constant update tick.

    float Size = 16.0f;

    then in the update

    const float RateDecay = 0.99f;
    Size = Size * RateDecay;

    If you don't have a fixed timestep you can get the same behaviour with an exponential, so the decay is frame-rate independent: Size = Size * pow(RateDecay, dt / fixedTick)
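
    As a self-contained snippet (the 1/60 tick and the function names are just assumptions for illustration):

        #include <cmath>

        // Fixed-tick version: multiply by a constant decay factor every update.
        void UpdateZoomFixedTick(float& size)
        {
            const float kRateDecay = 0.99f;
            size *= kRateDecay;
        }

        // Variable-timestep version: exponential decay keeps the behaviour
        // frame-rate independent. kFixedTick is the tick the 0.99 was tuned for.
        void UpdateZoomVariable(float& size, float dt)
        {
            const float kRateDecay = 0.99f;
            const float kFixedTick = 1.0f / 60.0f;
            size *= std::pow(kRateDecay, dt / kFixedTick);
        }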
  11. Post Processing Causing Terrible Swimming

    This presentation is a great overview of how to resolve the issues you are probably seeing: http://advances.realtimerendering.com/s2014/sledgehammer/Next-Generation-Post-Processing-in-Call-of-Duty-Advanced-Warfare-v17.pptx
  12. Also post processing effects such as DOF, Motion Blur and bloom look far better if they operate on the HDR data prior to tonemapping.
  13. Tinted bloom?

    Looks like the bloom buffer is tapped twice with a centered radial UV shift, with one tap tinted purple and the other orange, and the two results added together.
  14. Is it worth to use index buffers?

    In almost all cases using index buffers is a win. One nice property of index buffers is that they are guaranteed to be read contiguously, front to back, by the hardware, so it's trivial for it to read ahead and make sure the data is there ahead of time. On some popular GPUs the index buffer read units don't even go through the main memory hierarchy, so they don't pollute the cache. Another question to answer is whether to use indexed tri lists or indexed tri strips.
  15. Imposter Transitioning to Mesh

    I created a simple demo of this using WebGL. The blog post detailing what is happening isn't finished, but in the meantime you can look at the demo and see what I am doing in the ModelWarpVS shader. https://dl.dropboxusercontent.com/u/20691/AAImpostor.html