Member Since 30 Aug 2006
Offline Last Active Today, 09:47 AM

Posts I've Made

In Topic: Sharpen shader performance

Yesterday, 04:32 AM

Is there a tool similar to AMD Shader Analyzer, but for NVIDIA GPUs?


Don't know about NV, but Shader Analyzer is outdated; you may want to look at CodeXL instead.

CodeXL is great: it shows things like ISA code, runtime, occupancy, LDS and register usage, cache hits, stall time due to bandwidth limits, spilled registers, etc.

It's easy to reach conclusions like: "If I could decrease register count by 2 and LDS usage by 0.5 kB, occupancy would rise from 50% to 60%, probably resulting in a 10% speedup".
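To make that kind of reasoning concrete, here is a rough sketch of the occupancy arithmetic. All limits are assumptions for a GCN-like GPU (10 wavefronts per SIMD, 256 vector registers per lane, 4 SIMDs and 64 kB LDS per compute unit), the function names are made up for illustration, and register allocation granularity is ignored. The profiler's numbers are the ones to trust; this only shows why small savings can cross a threshold.

```cpp
#include <algorithm>
#include <cassert>

// Assumed GCN-like hardware limits (illustrative only, check your GPU's docs).
constexpr int    kMaxWavesPerSimd  = 10;   // cap on resident wavefronts per SIMD
constexpr int    kVgprsPerSimdLane = 256;  // vector registers available per lane
constexpr int    kSimdsPerCu       = 4;    // SIMDs per compute unit
constexpr double kLdsPerCuKb       = 64.0; // LDS per compute unit in kB

// Wavefronts per SIMD that fit given the register count per thread.
// Simplification: ignores the hardware's register allocation granularity.
int wavesByRegisters(int vgprsPerThread) {
    assert(vgprsPerThread > 0);
    return std::min(kMaxWavesPerSimd, kVgprsPerSimdLane / vgprsPerThread);
}

// Wavefronts per SIMD that fit given the LDS used by one workgroup.
int wavesByLds(double ldsPerGroupKb, int wavesPerGroup) {
    assert(wavesPerGroup > 0);
    if (ldsPerGroupKb <= 0.0) return kMaxWavesPerSimd; // no LDS used, no limit
    int groupsPerCu = static_cast<int>(kLdsPerCuKb / ldsPerGroupKb);
    return std::min(kMaxWavesPerSimd, groupsPerCu * wavesPerGroup / kSimdsPerCu);
}

// The tighter of the two limits decides how many wavefronts stay resident.
int residentWaves(int vgprsPerThread, double ldsPerGroupKb, int wavesPerGroup) {
    return std::min(wavesByRegisters(vgprsPerThread),
                    wavesByLds(ldsPerGroupKb, wavesPerGroup));
}

// Occupancy as a fraction of the assumed maximum.
double occupancy(int vgprsPerThread, double ldsPerGroupKb, int wavesPerGroup) {
    return static_cast<double>(residentWaves(vgprsPerThread, ldsPerGroupKb, wavesPerGroup))
           / kMaxWavesPerSimd;
}
```

Under these assumptions, a kernel with 44 registers, 3.0 kB LDS and one wavefront per workgroup fits 5 wavefronts per SIMD (50%), while 42 registers and 2.5 kB fit 6 (60%).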

I've used it only for compute, but I assume it's similarly useful for general shaders as well.

In Topic: Sharpen shader performance

28 September 2016 - 02:19 PM

I guess the speedup comes not from the manual unroll itself, but because only then does the compiler become clever enough to replace slow array lookups with constants.
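As an illustration of what "array lookups replaced with constants" could look like, here is a CPU-side sketch of a 3x3 sharpen written both ways. The shader from the topic is not reproduced here, so the weights, the clamp-to-edge fetch and all function names are assumptions. Whether a given GPU compiler unrolls the loop version on its own is something only the generated ISA can tell you.

```cpp
#include <cassert>

// Hypothetical 3x3 sharpen weights: center 5, direct neighbours -1, corners 0.
static const float kWeights[9] = { 0.0f, -1.0f, 0.0f,
                                  -1.0f,  5.0f, -1.0f,
                                   0.0f, -1.0f, 0.0f };

// Clamp-to-edge fetch from a row-major grayscale image.
inline float fetchTexel(const float* img, int w, int h, int x, int y) {
    x = x < 0 ? 0 : (x >= w ? w - 1 : x);
    y = y < 0 ? 0 : (y >= h ? h - 1 : y);
    return img[y * w + x];
}

// Loop version: each weight is read from the array through a computed index.
float sharpenLoop(const float* img, int w, int h, int x, int y) {
    assert(img && w > 0 && h > 0);
    float sum = 0.0f;
    for (int j = -1; j <= 1; ++j)
        for (int i = -1; i <= 1; ++i)
            sum += kWeights[(j + 1) * 3 + (i + 1)] * fetchTexel(img, w, h, x + i, y + j);
    return sum;
}

// Manually unrolled version: weights are literals, and the four
// zero-weight corner taps disappear together with their fetches.
float sharpenUnrolled(const float* img, int w, int h, int x, int y) {
    assert(img && w > 0 && h > 0);
    return 5.0f * fetchTexel(img, w, h, x, y)
         - fetchTexel(img, w, h, x - 1, y)
         - fetchTexel(img, w, h, x + 1, y)
         - fetchTexel(img, w, h, x, y - 1)
         - fetchTexel(img, w, h, x, y + 1);
}
```

Both functions compute the same value; the unrolled one just leaves nothing for the compiler to look up at run time.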


However, I wonder if a compute shader implementation would be faster, e.g. processing 8x8 pixels per invocation.

There would be far fewer texture accesses, and maybe it beats the texture cache.
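A back-of-the-envelope count of the texel loads, as a sketch only: it assumes a square filter of radius 1 (3x3) and that the 8x8 tile plus its border is loaded once and then reused, which is one possible reading of the idea. The function names are made up.

```cpp
#include <cassert>

// Loads when the tile plus a border of `radius` texels is read once and reused.
int tileLoads(int tile, int radius) {
    assert(tile > 0 && radius >= 0);
    int side = tile + 2 * radius;
    return side * side;
}

// Loads when every pixel of the tile samples its full neighbourhood itself.
int perPixelLoads(int tile, int radius) {
    assert(tile > 0 && radius >= 0);
    int taps = (2 * radius + 1) * (2 * radius + 1);
    return tile * tile * taps;
}
```

For an 8x8 tile and radius 1 that is 100 loads instead of 576, roughly 1.6 per pixel instead of 9. How much of that the texture cache already hides is exactly the open question, so this is only an upper bound on the possible saving.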

In Topic: Coding-Style Poll

27 September 2016 - 10:58 AM

Syntax coloring can help people who are used to prefixes, so I would dare to forbid them.


For braces, I sometimes temporarily reformat code if it's very difficult to figure out (I prefer braces on new lines; others prefer just the opposite).

I would only set a convention for function bodies, header files etc. and leave the rest to the implementer's preference.

This saves time: later it's easier to extend the code or fix bugs when the coding style has not had to change since the initial implementation.


I'd also quit that committee :D and do something else instead.

In Topic: Is OpenCL slowly dying?

24 September 2016 - 11:29 PM

I've recently noticed that support for OpenCL looks like it is slowly being dropped in favor of Vulkan (although it might be only in the game industry, as I assume OpenCL is still used in places where rendering is not going to be the thing),


Agreed, but even for non-rendering tasks compute shaders will be preferred, because you can run both asynchronously if you use a graphics API for everything.

Also, NVIDIA's lack of support makes OpenCL impractical, because 1.2 has no indirect dispatch.


I'm working on a large project in both OpenCL and Vulkan (to get some profiler data - CodeXL does not work for Vulkan yet).

Ignoring the dispatch problem and looking only at GPU time on AMD, performance varies by about 20-30%. Sometimes VK wins, sometimes OpenCL.

The next Vulkan will have data sharing, so personally I'm still considering OpenCL as an alternative in extreme cases.

In Topic: Branching in compute kernels

24 September 2016 - 10:19 AM

In my Vulkan implementation, it just crashes after the compute fence times out because of resource locking.

So the shader finishes, and you get the crash afterwards?

Try a vkQueueWaitIdle() after vkQueueSubmit(), to quickly ensure all work has finished and see if it prevents the crash.


Or does the shader never finish? Usually this causes a bluescreen or an unstable system. (Do NOT save your source files in this case - reboot instantly. I've lost some work this way.)

Probably an infinite loop - implement an additional max counter to prevent this and see if the crash goes away.
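A minimal sketch of such a max counter, written as plain C++ so it can be tried on the CPU first. Your kernel isn't shown, so the loop here is a stand-in: it follows index links until a -1 terminator, and a cycle in the data would make it spin forever without the cap. `WalkResult` and `walkChain` are hypothetical names.

```cpp
#include <cassert>

struct WalkResult {
    int  steps;    // links followed before stopping
    bool hitLimit; // true if the safety counter ended the loop
};

// Follows next[] links from `start` until the terminator -1 is reached.
// The extra counter guarantees termination even if the links form a cycle.
WalkResult walkChain(const int* next, int start, int maxSteps) {
    assert(next && maxSteps >= 0);
    int node = start;
    int steps = 0;
    while (node != -1) {
        if (steps >= maxSteps)
            return { steps, true }; // bail out instead of hanging
        node = next[node];
        ++steps;
    }
    return { steps, false };
}
```

In a shader the same idea is one more condition in the loop header. If the crash disappears once the cap is in place, writing the `hitLimit` flag to a debug buffer shows which invocations ran away.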



Can you post a performance comparison if you get it to work?