Member Since 05 May 2012
Offline Last Active Today, 03:23 AM

#5145427 where to start Physical based shading ?

Posted by kalle_h on 08 April 2014 - 11:31 AM

Roughness and glossiness usually describe the same property. Glossiness is just calculated with some formula from smoothness, which is 1 - roughness. In our material pipeline we always talk about roughness, but artists actually author a smoothness map instead. A usual optimization is to store just the spec intensity and calculate the spec color from albedo if the specular intensity is higher than x (0.2 is a good choice).
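A minimal sketch of that optimization, written in Python for readability rather than as shader code. The post does not say how the spec color is derived from albedo, so the mapping below (albedo tinted by intensity above the threshold, plain grey below it) is an assumption, and the function names are hypothetical.

```python
def smoothness_from_roughness(roughness):
    """Smoothness is simply the complement of roughness."""
    return 1.0 - roughness


def specular_color(albedo, spec_intensity, threshold=0.2):
    """Derive an RGB specular color from a single stored intensity.

    Assumption: above the threshold the highlight is tinted by the albedo;
    at or below it the highlight stays an untinted grey.
    """
    if spec_intensity > threshold:
        return tuple(channel * spec_intensity for channel in albedo)
    return (spec_intensity, spec_intensity, spec_intensity)


if __name__ == "__main__":
    print(smoothness_from_roughness(0.25))
    print(specular_color((1.0, 0.5, 0.0), 0.5))
    print(specular_color((1.0, 0.5, 0.0), 0.1))
```

Storing one intensity channel instead of a full RGB specular map is where the saving comes from; the threshold only decides when the albedo is allowed to tint the highlight.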

#5144573 RSM indirect Light flickering

Posted by kalle_h on 05 April 2014 - 09:39 AM

Just try using a binary nDotL test to prevent backside lighting. Then use the standard inverse-square distance falloff. This should be a lot less flickery.
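As a rough sketch of what that could look like for one virtual point light (VPL) from the RSM, assuming plain 3-tuples for vectors. The function and parameter names are hypothetical, and only the receiver-side test described above is applied.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def vpl_contribution(surface_pos, surface_normal, vpl_pos, vpl_flux):
    """Light arriving at a surface point from one virtual point light."""
    to_light = sub(vpl_pos, surface_pos)
    dist_sq = dot(to_light, to_light)
    if dist_sq == 0.0:
        return 0.0  # coincident points: skip instead of dividing by zero
    # Binary nDotL test: only the sign matters, so no normalization is needed
    # and there is no smooth cosine term to fluctuate between frames.
    if dot(surface_normal, to_light) <= 0.0:
        return 0.0  # light is behind the surface
    return vpl_flux / dist_sq  # standard inverse-square falloff


if __name__ == "__main__":
    up = (0.0, 1.0, 0.0)
    print(vpl_contribution((0.0, 0.0, 0.0), up, (0.0, 2.0, 0.0), 8.0))
    print(vpl_contribution((0.0, 0.0, 0.0), up, (0.0, -2.0, 0.0), 8.0))
```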

#5139838 Million Particle System

Posted by kalle_h on 17 March 2014 - 06:46 PM

There are actual use cases for million-particle systems, and they have been feasible for many years already. Here is a five-year-old blog post about how to do it with DX9: http://directtovideo.wordpress.com/2009/10/06/a-thoroughly-modern-particle-system/


With a modern API and clever coding it should not be a problem.

#5139428 I love shader permutations! (not sure if ironic or not)

Posted by kalle_h on 16 March 2014 - 06:44 AM

Kill all permutations that are not really needed. More options only confuse artists and make debugging harder. Also, when you have a minimal number of options, it's easier to optimize them.

#5139114 Debugging Graphics

Posted by kalle_h on 14 March 2014 - 06:38 PM

You will always fail if you try to take more than one step at once. Experience lets you take slightly bigger steps, but the same rule applies. Incremental changes, lots of refactoring, source control, and a naive implementation first: that is my set of rules.


Example: novice graphics programmers often fail at things like reconstructing position from depth without ever first trying to write the full position to a render target. Or they build octree-based frustum culling before they have a brute-force version. They are just trying to take too many steps at once.

#5137965 Same model, different UV's

Posted by kalle_h on 10 March 2014 - 05:04 PM


glMatrixMode — specify which matrix is the current matrix
GL_TEXTURE Applies subsequent matrix operations to the texture matrix stack.

#5137764 Beginner: not performing CPU-wise calculations on objects outside of camera s...

Posted by kalle_h on 10 March 2014 - 04:44 AM

The GPU runs the vertex shader for every object you submit, visible or not. What you are missing is frustum culling.
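A brute-force version of CPU-side frustum culling might look like the sketch below. It assumes each object has a bounding sphere and that the frustum is given as six planes `(nx, ny, nz, d)` with unit normals pointing inward; the names and the data layout are illustrative only.

```python
def sphere_in_frustum(planes, center, radius):
    """True unless the sphere lies completely behind at least one plane."""
    for nx, ny, nz, d in planes:
        signed_distance = nx * center[0] + ny * center[1] + nz * center[2] + d
        if signed_distance < -radius:
            return False
    return True


def cull(objects, planes):
    """Keep only the (name, center, radius) objects that may be visible."""
    return [obj for obj in objects if sphere_in_frustum(planes, obj[1], obj[2])]


if __name__ == "__main__":
    # Stand-in "frustum": the axis-aligned cube [-1, 1]^3 with inward normals.
    planes = [
        (1.0, 0.0, 0.0, 1.0), (-1.0, 0.0, 0.0, 1.0),
        (0.0, 1.0, 0.0, 1.0), (0.0, -1.0, 0.0, 1.0),
        (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, -1.0, 1.0),
    ]
    objects = [
        ("inside", (0.0, 0.0, 0.0), 0.5),
        ("touching", (1.4, 0.0, 0.0), 0.5),
        ("outside", (3.0, 0.0, 0.0), 0.5),
    ]
    print([obj[0] for obj in cull(objects, planes)])
```

Only the objects that survive `cull` would be submitted for drawing, so the vertex shader never runs for the rest.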

#5137199 ASSIMP skinned mesh with DX9 problem

Posted by kalle_h on 07 March 2014 - 03:40 PM

Even though you usually don't want to optimize before it's needed, you should always clean up and simplify all code before starting to add more features. This step is usually the most beneficial for actual learning, because you need to really understand something to make it pretty and simple.


There is a fast and simple technique to calculate the global skeleton pose without recursion: http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/
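The general idea of a flat, non-recursive hierarchy can be sketched as follows, assuming bones are stored so that every parent comes before its children and that `-1` marks a root. This is an illustration of the approach, not the linked article's code; the helper names are made up.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


def compute_global_poses(parents, local_poses):
    """One linear pass: global[i] = global[parent[i]] * local[i]."""
    global_poses = [None] * len(parents)
    for i, parent in enumerate(parents):
        if parent < 0:
            global_poses[i] = local_poses[i]  # root bone
        else:
            # The ordering guarantees the parent's pose is already final.
            assert parent < i, "bones must be sorted parent-first"
            global_poses[i] = mat_mul(global_poses[parent], local_poses[i])
    return global_poses


if __name__ == "__main__":
    parents = [-1, 0, 1, 0]
    local_poses = [translation(1, 0, 0), translation(0, 2, 0),
                   translation(0, 0, 3), translation(5, 0, 0)]
    poses = compute_global_poses(parents, local_poses)
    print([row[3] for row in poses[2][:3]])
```

Because the loop only ever reads poses at lower indices, the whole skeleton is resolved in a single cache-friendly pass over two arrays.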

#5137135 Average luminance (2x downsample) filter kernel

Posted by kalle_h on 07 March 2014 - 09:20 AM

You could go from 5x3 to 1x1 using a simple gather shader. Performance should be even better.

#5135850 Is it possible to render minimap in a different way?

Posted by kalle_h on 02 March 2014 - 09:17 AM

Are you meaning create another scene for rendering minimap? Would that be too complicated? I don't want to create a copy of my game scene so that I can use different materials or geometry. I just want to create one scene, and render the minimap based on it.

A minimap is just a different presentation of that scene, not a different scene itself. For gameplay and UI design reasons it has to be clearly readable. Those are totally different needs than main scene rendering.

#5131111 Any options for affordable ray tracing?

Posted by kalle_h on 13 February 2014 - 02:04 PM



#5125768 Hypothesizing a new lighting method.

Posted by kalle_h on 22 January 2014 - 07:41 PM

I used a technique on the Wii that's very similar to the one mentioned in the OP, to get lots of dynamic lights on its crappy hardware... except, instead of global cube-maps for the scene, each object had its own cube-map, so that directionality and attenuation worked properly (at least on a per-mesh granularity - not very good for large meshes). Also, instead of cube-maps, we used sphere-maps for simplicity.... and you can't just render a light into a single texel, you have to render a large diffused blob of light.

The lighting is obviously much more approximate than doing it traditionally per-pixel -- the surface normal is evaluated per-pixel, but the attenuation and direction to light are evaluated per mesh. This means that for small meshes, it's pretty good, but for large meshes, all lights start to look like directional lights.

The other down-side is that you can't use nice BRDF's like Blinn-Phong or anything...


In general, this is part of a family of techniques known as image based lighting, and yes, it's common to combine a bunch of non-primary lights into a cube-map, etc -- e.g. think of every pixel in your sky-dome as a small directional light. Using it as a replacement for primary lights, across an entire scene, is a bit less popular.

For kinghunt 


I did a very similar trick, but instead of sphere maps I used spherical harmonics. I also calculated an approximate visibility function per object against all other objects. This was so fast that even particles could be light emitters.

#5121715 bandwidth theme and gpu-z info

Posted by kalle_h on 06 January 2014 - 02:53 PM

Maybe I can add something to those general performance and fillrate subtopics:

I ran the fine program named GPU-Z and it gives some info about my cheap GPU, a GT 610 (worth about 50 dollars probably; I paid 50 for it, but I am from Europe, and it seems that this is the worldwide price):

Pixel Fillrate 1.4 GPixels/s (shows the number of pixels that can be rendered to the screen in one second)

Texel Fillrate 5.6 GTexels/s (shows the number of texels that can be processed in one second)

Memory Type DDR3 (please also note that GDDR3 doubles the available bandwidth of previous DDR memory and that GDDR5 doubles the bandwidth of previous GDDR3 memory again)

Bandwidth 8.0 GB/s (shows the effective memory bandwidth available between GPU and graphics memory)

Could someone maybe run GPU-Z on their GPU and post this info? It may help me clear up some bandwidth topics.

Speaking about billions of triangles per second (which I doubt), isn't general drawing limited by those numbers? I mean, if a triangle is described by, say, 80 bytes of data, then 8.0 GB / 80 = only 100 M triangles per second, not billions?

Could someone also tell me why the texel fillrate is 4 times higher than the pixel fillrate? (It seems to me that a pixel and a texel are both just 4 bytes of VRAM.)

Why did you pick 80 bytes per triangle? In a perfect scene the vertex-to-triangle ratio is close to one, and a vertex looks something like this: 12 bytes for position, 4 bytes for UV, 6 bytes for the normal and another 6 for the tangent. Add a 4-byte vertex color and you get 32 bytes for one vertex (which is around the same for one triangle with perfect vertex reuse). Add 6 bytes for indices and you are still at about half of the 80-byte figure.
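The arithmetic can be spelled out as below. It reuses the 8.0 GB/s figure from the quoted GPU-Z output and assumes the 6 index bytes are three 16-bit indices; the result is only a bandwidth-bound upper limit, not a prediction of real throughput.

```python
# Per-vertex layout from the post, in bytes.
POSITION = 12
UV = 4
NORMAL = 6
TANGENT = 6
COLOR = 4
VERTEX_BYTES = POSITION + UV + NORMAL + TANGENT + COLOR

# Assumption: three 16-bit indices per triangle.
INDEX_BYTES = 3 * 2

# With near-perfect vertex reuse, one triangle costs about one vertex
# plus its indices.
BYTES_PER_TRIANGLE = VERTEX_BYTES + INDEX_BYTES

BANDWIDTH_BYTES_PER_SECOND = 8.0e9  # 8.0 GB/s from the quoted GPU-Z output


def triangles_per_second(bytes_per_triangle):
    """Upper bound on triangle rate if memory bandwidth were the only limit."""
    return BANDWIDTH_BYTES_PER_SECOND / bytes_per_triangle


if __name__ == "__main__":
    print(VERTEX_BYTES, BYTES_PER_TRIANGLE)
    print(triangles_per_second(BYTES_PER_TRIANGLE) / 1e6, "M triangles/s")
    print(triangles_per_second(80) / 1e6, "M triangles/s at 80 bytes")
```

At 38 bytes per triangle the bound comes out at roughly 210 million triangles per second, a little over twice the 100 million obtained with the 80-byte estimate.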

#5120990 View Frustum Culling Corner Cases

Posted by kalle_h on 03 January 2014 - 03:18 PM

There is a simple algorithm for correct culling: http://www.iquilezles.org/www/articles/frustumcorrect/frustumcorrect.htm
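A sketch of a two-stage box test in that spirit: first reject a box that is fully behind any frustum plane, then reject it if all eight frustum corners lie beyond one face of the box. The second stage catches large boxes that slip through the plane-only test. The data layout (inward-facing planes as `(nx, ny, nz, d)`, corners as 3-tuples) and the function name are assumptions made for this example.

```python
def box_in_frustum(planes, frustum_corners, box_min, box_max):
    """Conservative AABB-vs-frustum test with the extra corner check."""
    # Stage 1: the box is entirely outside one frustum plane.
    for nx, ny, nz, d in planes:
        outside = 0
        for x in (box_min[0], box_max[0]):
            for y in (box_min[1], box_max[1]):
                for z in (box_min[2], box_max[2]):
                    if nx * x + ny * y + nz * z + d < 0.0:
                        outside += 1
        if outside == 8:
            return False
    # Stage 2: the frustum is entirely outside one face of the box.
    for axis in range(3):
        if all(c[axis] > box_max[axis] for c in frustum_corners):
            return False
        if all(c[axis] < box_min[axis] for c in frustum_corners):
            return False
    return True


if __name__ == "__main__":
    # Simple symmetric frustum looking down +z, near = 1, far = 10.
    planes = [
        (1.0, 0.0, 1.0, 0.0), (-1.0, 0.0, 1.0, 0.0),   # left, right
        (0.0, 1.0, 1.0, 0.0), (0.0, -1.0, 1.0, 0.0),   # bottom, top
        (0.0, 0.0, 1.0, -1.0), (0.0, 0.0, -1.0, 10.0),  # near, far
    ]
    corners = [(sx * z, sy * z, z)
               for z in (1.0, 10.0) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]
    # A large box beside the frustum: it straddles every plane, so only
    # stage 2 rejects it.
    print(box_in_frustum(planes, corners, (15.0, -1.0, 0.0), (40.0, 1.0, 30.0)))
    print(box_in_frustum(planes, corners, (-1.0, -1.0, 2.0), (1.0, 1.0, 3.0)))
```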

#5116326 Tangents for heightmap from limited information

Posted by kalle_h on 11 December 2013 - 04:01 PM


Thanks for the alternative! Your method actually looks faster.

(While we're here, would anyone mind reassuring me that this HLSL code to calculate the normal from the heightfield is actually correct?)

float hxm = Heightmap.Load(int3(iPos.x-1, iPos.y, 0)).r;
float hxp = Heightmap.Load(int3(iPos.x+1, iPos.y, 0)).r;
float hzm = Heightmap.Load(int3(iPos.x, iPos.y-1, 0)).r;
float hzp = Heightmap.Load(int3(iPos.x, iPos.y+1, 0)).r;
float h = Heightmap.Load(int3(iPos.x, iPos.y, 0)).r;

float3 t1 = float3(2, hxp - hxm, 0);
float3 t2 = float3(0, hzp - hzm, 2);
float3 n = cross(t2, t1);
float hxm = Heightmap.Load(int3(iPos.x-1, iPos.y, 0)).r;
float hxp = Heightmap.Load(int3(iPos.x+1, iPos.y, 0)).r;
float hzm = Heightmap.Load(int3(iPos.x, iPos.y-1, 0)).r;
float hzp = Heightmap.Load(int3(iPos.x, iPos.y+1, 0)).r;
float h = Heightmap.Load(int3(iPos.x, iPos.y, 0)).r;

float3 n = 2.0 * float3(hxm - hxp, 2, hzm - hzp); // same vector as cross(t2, t1); the 2.0 * can be dropped if n is normalized afterwards

Optimized version.
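To check the algebra, here is a small Python sketch comparing the cross-product form with its expanded result. Expanding cross(t2, t1) with t1 = (2, hxp - hxm, 0) and t2 = (0, hzp - hzm, 2) gives 2 * (hxm - hxp, 2, hzm - hzp), so the normal leans away from the uphill direction. The function names are hypothetical and the height samples are passed in directly rather than loaded from a texture.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normal_from_tangents(hxm, hxp, hzm, hzp):
    """Unnormalized normal from two central-difference tangents."""
    t1 = (2.0, hxp - hxm, 0.0)  # tangent along +x over two texels
    t2 = (0.0, hzp - hzm, 2.0)  # tangent along +z over two texels
    return cross(t2, t1)


def normal_direct(hxm, hxp, hzm, hzp):
    """The same vector with the cross product expanded by hand."""
    return (2.0 * (hxm - hxp), 4.0, 2.0 * (hzm - hzp))


if __name__ == "__main__":
    samples = (1.0, 3.0, 2.0, 2.5)  # hxm, hxp, hzm, hzp
    print(normal_from_tangents(*samples))
    print(normal_direct(*samples))
```

Both functions return an unnormalized vector; after normalization the constant factor of 2 no longer matters.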