What is a wavefront?

2 comments, last by Syntac_ 8 years, 3 months ago

Looking at AMD's documentation on the GCN architecture ( http://developer.amd.com/community/blog/2014/05/16/codexl-game-developers-analyze-hlsl-gcn/ ), it is a little confusing what exactly a wavefront is.

It says:

Work is performed on the SIMDs in groups of 64 work-items (i.e. 64 threads) called wavefronts.

Ok, so a thread is a wavefront... but later on it says:

The value in a particular SGPR is shared across all threads in a wavefront.

Ok, so... a thread is not a wavefront, as the previous sentence suggested, but a wavefront can have multiple threads...

There are two other factors that determine the number of simultaneous wavefronts for a shader.

So... a shader has different wavefronts (and wavefronts have threads).

Also, a little bit confusing, it says that:

Each SIMD supports a maximum of 10 simultaneous wavefronts in flight

But this contradicts the 64 mentioned in the first quote...

Can someone explain?

Thanks!

"lots of shoulddas, coulddas, woulddas in the air, thinking about things they shouldda couldda wouldda donne, however all those shoulddas coulddas woulddas ran away when they saw the little did to come"

None of the things you've quoted are contradictory -- the first quote says that a wavefront is 64 threads, not that a wavefront is 1 thread.

A SIMD unit can have up to 10 wavefronts in flight at once. Each wavefront contains 64 threads. Hence a SIMD unit can have up to 640 threads in flight at once (in multiples of 64).
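As a quick check of that arithmetic, here is a minimal Python sketch. The constant and function names are made up for illustration; only the figures 64 and 10 come from the AMD quotes above.

```python
# Figures taken from the quoted AMD article; the names are illustrative only.
WAVEFRONT_SIZE = 64            # threads (work-items) per wavefront
MAX_WAVEFRONTS_PER_SIMD = 10   # upper limit on wavefronts in flight per SIMD


def max_threads_in_flight(wavefronts=MAX_WAVEFRONTS_PER_SIMD,
                          wavefront_size=WAVEFRONT_SIZE):
    """Threads in flight on one SIMD for a given number of resident wavefronts."""
    return wavefronts * wavefront_size


if __name__ == "__main__":
    # Thread counts only ever move in steps of one wavefront (64 threads).
    for n in (1, 2, 10):
        print(n, "wavefront(s) ->", max_threads_in_flight(n), "threads")
```

So the "10" and the "64" measure different things: one counts wavefronts per SIMD, the other counts threads per wavefront.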

The scheduler will take the pixels/vertices that need to be processed, allocate one thread per pixel/vertex, and then try to group up to 64 threads together into a wavefront. That bundle of threads is then given to a SIMD, which runs the shader code.
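The grouping step can be pictured with a small Python sketch. This is only a conceptual model: the real scheduler is hardware, and `group_into_wavefronts` and `wavefronts_needed` are hypothetical names, not any AMD API. It assumes work items are simply chunked in order, with the last wavefront possibly only partly filled.

```python
WAVEFRONT_SIZE = 64  # threads per wavefront, from the quoted AMD article


def wavefronts_needed(num_threads, wavefront_size=WAVEFRONT_SIZE):
    """Number of wavefronts required to cover num_threads (rounded up)."""
    return (num_threads + wavefront_size - 1) // wavefront_size


def group_into_wavefronts(work_items, wavefront_size=WAVEFRONT_SIZE):
    """Chunk a list of work items (one thread each) into wavefront-sized groups."""
    return [work_items[i:i + wavefront_size]
            for i in range(0, len(work_items), wavefront_size)]


if __name__ == "__main__":
    pixels = list(range(100))              # 100 pixels -> 100 threads
    waves = group_into_wavefronts(pixels)
    print(len(waves), [len(w) for w in waves])
```

With 100 pixels this gives two wavefronts, one full (64 threads) and one partly filled (36 threads).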

The number of wavefronts that 'fit' into a SIMD depends on the complexity of the shader code. For simple shaders, you can squeeze 10 wavefronts at a time into a SIMD, but for complex shaders you may only be able to fit one or two wavefronts into a SIMD.

This is because different shaders require different numbers of temporary registers, which are stored in the SIMD's register array. Say the SIMD has 1000 registers in total -- if a shader uses 100 or fewer, then you can fit 10 (or more) "instances" of that shader into the register array. If a shader uses 500 temporary registers, then only two "instances" of that shader will fit into the SIMD - so the SIMD will only accept two concurrent wavefronts.

Each "register" actually contains 64 floats -- which is why this calculation is done for wavefronts and not threads. One register is used by a wavefront to store a value for each of its threads.
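The register-budget reasoning can be written down as a rough Python sketch. It reuses the made-up figure of 1000 registers from the example above (not a real hardware number), applies the limit of 10 wavefronts per SIMD from the AMD quote, and ignores the other limiting factors the AMD article mentions. The function name is hypothetical, and the byte count assumes 32-bit floats.

```python
WAVEFRONT_SIZE = 64            # threads per wavefront
MAX_WAVEFRONTS_PER_SIMD = 10   # limit quoted from the AMD article
BYTES_PER_FLOAT = 4            # assumption: 32-bit floats

# One "register" holds one float per thread in the wavefront.
REGISTER_WIDTH_BYTES = WAVEFRONT_SIZE * BYTES_PER_FLOAT


def wavefronts_that_fit(regs_per_shader, total_regs=1000,
                        hw_limit=MAX_WAVEFRONTS_PER_SIMD):
    """Concurrent wavefronts allowed by the register budget alone.

    total_regs=1000 is the illustrative figure from the post, not real hardware.
    """
    if regs_per_shader <= 0:
        raise ValueError("regs_per_shader must be positive")
    return min(hw_limit, total_regs // regs_per_shader)


if __name__ == "__main__":
    for regs in (50, 100, 500, 600):
        print(regs, "registers ->", wavefronts_that_fit(regs), "wavefront(s)")
    print("bytes per register:", REGISTER_WIDTH_BYTES)
```

Note that a shader using only 50 registers still gets 10 wavefronts in this model, because the per-SIMD limit applies even when more would fit in the register array.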

Oh!... I see... I was misinterpreting the first quote...

Your explanation makes perfect sense now.... thanks Hodgman.

"lots of shoulddas, coulddas, woulddas in the air, thinking about things they shouldda couldda wouldda donne, however all those shoulddas coulddas woulddas ran away when they saw the little did to come"

You'll often hear the number of wavefronts that can run simultaneously for a shader referred to as shader occupancy or wavefront occupancy.

High occupancy is obviously better than low occupancy, so reducing the register count of your shaders on GCN is worthwhile. Doing things like avoiding unrolling loops can help.


However, don't always try to achieve 10 as it can be detrimental in some cases, such as cache thrashing in high-bandwidth shaders.

Useful link.

