Member Since 29 Mar 2012
Offline Last Active Dec 18 2015 03:29 AM

Topics I've Started

Efficient array shuffling in pure HLSL

29 October 2014 - 04:37 AM

Hi all!


I am looking for an efficient way to shuffle an array in plain HLSL (i.e., create a random sequence where every index is used exactly once).


I've learned so far that the Fisher–Yates shuffle (Knuth shuffle) algorithm would do the job, but I haven't found any HLSL implementations of it. So I tried implementing the algorithm myself, but the naive approach of porting code from a different language like C++ to HLSL produces rather slow results.
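For reference, the Fisher–Yates shuffle mentioned above can be sketched on the CPU as follows. This is only a minimal C++ illustration of the algorithm, not an HLSL implementation and not a statement about shader performance. The xorshift32 generator and the names `xorshift32` and `shuffled_indices` are assumptions made for the example; a shader would need its own random source.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Small xorshift32 generator. Hypothetical stand-in for whatever random
// source is available; it is not taken from the original post.
static uint32_t xorshift32(uint32_t& state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Fisher-Yates: walk from the last slot down to the second one and swap
// slot i-1 with a randomly chosen slot j in [0, i-1]. The result is a
// permutation, so every index appears exactly once.
std::vector<int> shuffled_indices(std::size_t n, uint32_t seed) {
    std::vector<int> idx(n);
    for (std::size_t i = 0; i < n; ++i) {
        idx[i] = static_cast<int>(i);
    }
    uint32_t state = (seed != 0u) ? seed : 1u;  // xorshift must not start at 0
    for (std::size_t i = n; i > 1; --i) {
        // Modulo introduces a slight bias; acceptable for a sketch.
        std::size_t j = xorshift32(state) % i;
        int tmp = idx[i - 1];
        idx[i - 1] = idx[j];
        idx[j] = tmp;
    }
    return idx;
}
```

The loop does exactly n-1 swaps and needs one random number per swap, which is the part that tends to be awkward to port to a shader.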


Any ideas for a really fast way to achieve this?

Loop Compilation - Bad Performance

15 September 2014 - 04:11 AM

Hey all!


I've got issues with the performance (as in, time taken) of the compilation of a relatively simple loop (DX9):

for (int i = 0; i < pointsnum; i++)
{
	Color.rgb = samplepoint(OrigColor, Color, uv, ov, points[i].xy, points[i].z);
	Color.rgb = samplepoint(OrigColor, Color, uv, ov, float2(points[i].y, points[i].x), points[i].z);
}

I'll spare you the details. Basically, the compiler consistently takes around 2.7 seconds with 18 iterations, but 7.8 seconds with 36. I would have expected compilation time to increase linearly with the number of iterations, so doubling the iterations should roughly double the time.


Does the complexity of the called samplepoint function (which itself calls further custom functions) affect compilation time this much? If not, why does the compiler need so much more time for just twice the iteration count?





If I disable one of the two lines in the loop (it doesn't matter which one), compilation time with 36 iterations is the same as with both lines and 18 iterations.


If I split the loop into two loops, each containing one of the two lines, compilation time is the same as with the single loop.


So it's definitely a compilation issue, not a problem with my code in particular!


Any hints would be appreciated.

Intrinsics to improve performance of interpolation / mix functions

04 September 2014 - 09:29 AM

Hello all


In pixel shaders (SM3 to 5) I often do some mixing / interpolation between two or more vectors (e.g. colors) or scalars (e.g. luminances).


A simple example of common code, as used for instance in Gaussian-blur-like implementations, looks like this:


float4 mixColor = (tex2D(colorSampler, uv-blur)+tex2D(colorSampler, uv+blur)) / 2.f;


I would like to improve that, both performance-wise and with respect to instruction limits, by making use of hardware-supported HLSL intrinsic functions, but I'm not sure what would be best here. lerp(), for example? I think I'd basically need the opposite of the mad() command. Are there better ways?
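As a quick sanity check on the lerp() idea, the sketch below shows in plain C++ that averaging two values gives the same result as linear interpolation with a weight of 0.5. The helper `hlsl_lerp` is a hypothetical stand-in that uses the formula documented for the HLSL intrinsic, x*(1-s) + y*s; it is shown for scalars, and the same holds per component for float4. Whether the lerp form actually saves instructions depends on the compiler and target profile, which this example does not measure.

```cpp
// Stand-in for HLSL's lerp(x, y, s), using its documented formula.
// The name hlsl_lerp is made up for this example.
float hlsl_lerp(float x, float y, float s) {
    return x * (1.0f - s) + y * s;
}

// The averaging form used in the post: (a + b) / 2.
float average(float a, float b) {
    return (a + b) / 2.0f;
}
```

With a = 1 and b = 3, both `average(a, b)` and `hlsl_lerp(a, b, 0.5f)` evaluate to 2.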

Convert view space to world space issues

13 August 2014 - 01:31 AM

Hi all!


In a 3D game that I'm writing a (post-process) pixel shader for, I am trying to transform a view-space coordinate to world space. My HLSL code looks roughly like this:


float3 world_pos = mul(view_pos, (float3x3)m_WV) + camera_pos;


This works, but only for certain view angles and camera positions. E.g. when I look "from south" at the position in question, it looks as it should (I mark the position to be transformed on screen as a colored sphere), but when I turn the camera by more than about 20 degrees, or shift the camera position so that I look "from east", the transformed position is completely off.


I must be missing something here, but I don't know what. I've tried normalizing, transposing and some other basic mutations / additions to my code but didn't find any working solution.
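One possible explanation, offered only as a sketch under assumptions: if the 3x3 part of the matrix is the pure rotation that takes world space to view space, then going back from view space to world space needs the inverse rotation (for a pure rotation, the transpose) rather than the same matrix again. Using the forward matrix would look nearly right when the camera rotation is close to identity and drift further off as the camera turns. The C++ below mirrors HLSL's row-vector `mul(v, M)` convention; the types `Vec3` and `Mat3` and all function names are made up for the example, and it does not cover a matrix that also contains a world transform or scaling.

```cpp
struct Vec3 { float x, y, z; };
struct Mat3 { float m[3][3]; };

// Row vector times matrix, matching HLSL's mul(v, M).
Vec3 mul(const Vec3& v, const Mat3& M) {
    return { v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0],
             v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1],
             v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] };
}

Mat3 transpose(const Mat3& M) {
    Mat3 t;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            t.m[r][c] = M.m[c][r];
    return t;
}

// World to view: move the camera to the origin, then rotate.
Vec3 world_to_view(const Vec3& world, const Mat3& rot, const Vec3& cam) {
    Vec3 d = { world.x - cam.x, world.y - cam.y, world.z - cam.z };
    return mul(d, rot);
}

// View to world: undo the rotation with its transpose, then add the
// camera position back. Only valid if rot is a pure rotation.
Vec3 view_to_world(const Vec3& view, const Mat3& rot, const Vec3& cam) {
    Vec3 w = mul(view, transpose(rot));
    return { w.x + cam.x, w.y + cam.y, w.z + cam.z };
}
```

A round trip through `world_to_view` and `view_to_world` returns the original point for any rotation, which makes a convenient test for whichever matrix is actually bound in the shader.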


Any hints?

Convert 2D Post Process targeting Cube Texture / Sky Box

04 August 2014 - 01:55 AM

Hello guys,


I want to apply a post process shader effect (procedural starfield) that I have written to a sky box cube map, which is currently not really possible because the effect is made for pure 2D / screen space purposes. How would I convert or apply it to my sky shader which currently queries its color data using texCUBE()?
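One way to bridge the two, sketched here under assumptions rather than as a known solution for this shader: take the direction vector that would be passed to texCUBE() and convert it to a 2D coordinate, then feed that coordinate to the 2D effect. The C++ below uses an equirectangular (longitude/latitude) mapping; the function name `direction_to_uv`, the axis convention (Y up, +Z at the center of the image), and the `Vec2`/`Vec3` types are all choices made for the example. This mapping stretches the image near the poles, so a procedural starfield would look denser or smeared there unless the effect compensates.

```cpp
#include <cmath>

struct Vec2 { float u, v; };
struct Vec3 { float x, y, z; };

// Maps a direction (need not be normalized, must be non-zero) to
// equirectangular coordinates in [0, 1] x [0, 1].
// u wraps around the Y axis, v runs from the +Y pole (0) to the -Y pole (1).
Vec2 direction_to_uv(const Vec3& dir) {
    const float pi = 3.14159265358979f;
    float len = std::sqrt(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z);
    float y = dir.y / len;
    // Clamp to guard acos against rounding slightly outside [-1, 1].
    if (y > 1.0f) y = 1.0f;
    if (y < -1.0f) y = -1.0f;
    Vec2 uv;
    uv.u = std::atan2(dir.x, dir.z) / (2.0f * pi) + 0.5f;
    uv.v = std::acos(y) / pi;
    return uv;
}
```

The same arithmetic uses only functions that also exist as HLSL intrinsics (atan2, acos, sqrt), so it should translate to a pixel shader fairly directly.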


I'm working with DX9 and pure HLSL, no CPU-side / host application processing possible.


Any ideas?