Geometry shader output ordering

5 comments, last by Krypt0n 10 years, 2 months ago

Hello everyone,

I have recently come across a technique implemented by NVIDIA to generate the mesh of an implicit surface via geometry shaders.

Given their implementation, they seem to assume that two different geometry shaders executed on the same data will stream out their result in the same order.

Is there any kind of official definition that specifies whether there is a connection between input-order and output-order when executing geometry shaders?

In the given scenario, each geometry shader execution (of either shader) generates exactly one out-vertex from one in-vertex, with no conditions.

Knowing whether this is vendor-specific or actually specified somewhere would help me a great deal.

Cheers


It's not an assumption, it's a required part of the spec. It's actually part of the reason why geometry shaders can be slow, and why you need to specify a max vertex count for your GS.
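To picture why a declared maximum makes ordered output possible, here is a small CPU-side sketch in Python. It is only a conceptual model, not how any particular GPU or driver implements it: `toy_geometry_shader`, `MAX_VERTICES` and `run_ordered` are hypothetical names made up for illustration. Each invocation gets its own slot reserved up front, so the invocations can finish in any order while the result is still read back in input order.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-invocation limit, analogous to a GS max vertex count.
MAX_VERTICES = 3


def toy_geometry_shader(vertex):
    # Hypothetical stand-in for a GS: emits between 0 and MAX_VERTICES values.
    count = vertex % (MAX_VERTICES + 1)
    return [vertex * 10 + i for i in range(count)]


def run_ordered(inputs):
    # One slot per input, reserved before any invocation runs.
    slots = [None] * len(inputs)

    def invoke(index):
        # Each invocation writes only to its own slot, so the order in
        # which the workers finish does not matter.
        slots[index] = toy_geometry_shader(inputs[index])

    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(invoke, range(len(inputs))))

    # Reading the slots back in input order gives output in input order.
    return [v for slot in slots for v in slot]


ordered_result = run_ordered([1, 2, 3, 4, 5])
print(ordered_result)
```

The cost is visible in the model too: every invocation needs room for the worst case, whether it uses it or not.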

Hey, thanks. It's nice to know I can rely on this, although I can see the issue with performance. Makes me wonder why there is no switch (that I know of) to disable this.

Regardless, many thanks!

Sounds like a job for Vendor Extension Man!

...

Vendor Extension Man? Are you there?


"Makes me wonder why there is no switch (that I know of) to disable this."

Largely because it wouldn't work: as soon as you need to output a primitive, you have to have a contiguous set of data to form that primitive. So in the case of a triangle, you need all 3 vertices next to each other in memory for later processing.

If you had 4 instances of a geo-shader each outputting the data for one triangle at the same time, then the order becomes utterly non-deterministic for the later stages, as it could be [1,1,1,2,2,2,3,3,3,4,4,4], or [1,2,2,3,1,4,1,4,3,2,3,4], or any other combination depending on how the memory accesses were ordered.

There is no way for later stages to make sense of that data to build a triangle to process.
You can't add an index buffer, as it would require the same data order guarantee as the current geo-shader setup, and the index skipping would then slow down later stages as they hopped about memory to find the data in question.

So, basically, given what the geo-shader can do there is no sane way to do it.
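The interleaving problem described above can be simulated in a few lines of Python. This is a toy model under stated assumptions: the round-robin schedule is just one hypothetical write order among many, and `emit_triangle`, `interleave_round_robin` and `assemble_triangles` are made-up helper names. Each vertex is labelled with the instance that produced it, and a later stage naively groups every 3 consecutive vertices into a triangle.

```python
from itertools import chain


def emit_triangle(instance_id):
    # Each hypothetical GS instance emits 3 vertices tagged with its own id.
    return [instance_id] * 3


def interleave_round_robin(streams):
    # One possible unsynchronized schedule: the instances take turns
    # writing a single vertex each into the shared output buffer.
    return [v for group in zip(*streams) for v in group]


def assemble_triangles(vertex_stream):
    # A later stage that simply treats every 3 consecutive vertices as a triangle.
    return [tuple(vertex_stream[i:i + 3]) for i in range(0, len(vertex_stream), 3)]


streams = [emit_triangle(i) for i in (1, 2, 3, 4)]

sequential = assemble_triangles(list(chain.from_iterable(streams)))
interleaved = assemble_triangles(interleave_round_robin(streams))

print(sequential)   # every triangle built from a single instance
print(interleaved)  # every triangle mixes vertices from several instances
```

With the ordered stream every triangle comes from one instance; with the interleaved stream none of them do, which is the "no way to make sense of that data" case.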

Well, from a pure CPU perspective, I would cache the result on a per-thread basis and eventually consolidate the results from all the worker threads, thus creating correct primitives but in an arbitrary order. But I guess on a massively parallel machine like a GPU different rules apply, so I am uncertain whether such a strategy would even be possible, let alone effective. And it's moot to really ponder since it is as it is. I am just trying to explain how I came to the assumption that the output was not ordered. Thx again.
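For reference, the CPU-style strategy described in this post might look roughly like the following Python sketch. It says nothing about whether a GPU could do the same; `worker`, `consolidate` and the triangle contents are hypothetical. Each worker collects whole triangles in a private list, and the lists are merged as the workers finish, so every primitive stays intact while the overall order is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def worker(chunk):
    # Per-thread cache: whole triangles are collected locally, so the
    # vertices of one primitive can never be split apart.
    local = []
    for vertex in chunk:
        local.append((vertex, vertex + 0.5, vertex + 1.0))  # hypothetical triangle
    return local


def consolidate(inputs, num_workers=4):
    # Strided split, so each worker sees a non-contiguous part of the input.
    chunks = [inputs[i::num_workers] for i in range(num_workers)]
    merged = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker, chunk) for chunk in chunks]
        # Results are appended in completion order, which is not
        # guaranteed to match input order.
        for future in as_completed(futures):
            merged.extend(future.result())
    return merged


triangles = consolidate(list(range(8)))
print(triangles)
```

The set of triangles is always the same, but two runs may list them in a different sequence, so two shaders run over the same input could no longer be matched up element by element.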

That's why you have to specify the maximum number of vertices you will output.

Geometry shaders can run in parallel; you just end up with buffers of various sizes that need to be merged if you stream out, or you have to live with those 'bubbles', which make it less efficient for the rest of the pipeline, yet it still works.

If there were just one GS running at a time, then the G80 (running shaders at 1.35 GHz) would be faster with geometry shaders than the latest GPU monsters (barely reaching 1 GHz stock).
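One common way to merge variable-sized buffers while keeping input order is an exclusive prefix sum over the per-invocation output counts. The Python sketch below assumes that approach purely for illustration; the post does not say how any actual hardware does the merge, and `merge_with_prefix_sum` is a hypothetical name. The scan gives each invocation its start offset in the packed buffer, so the 'bubbles' disappear and the order is preserved.

```python
from itertools import accumulate


def merge_with_prefix_sum(per_invocation_outputs):
    counts = [len(out) for out in per_invocation_outputs]
    # Exclusive scan: offsets[i] is the number of vertices emitted by
    # all invocations before invocation i.
    offsets = list(accumulate(counts, initial=0))[:-1]

    merged = [None] * sum(counts)
    for out, offset in zip(per_invocation_outputs, offsets):
        # Every invocation copies into its own disjoint range, so these
        # copies could run in parallel without conflicts.
        merged[offset:offset + len(out)] = out
    return offsets, merged


# Hypothetical outputs of four invocations with different vertex counts.
offsets, merged = merge_with_prefix_sum([["a0"], [], ["c0", "c1", "c2"], ["d0", "d1"]])
print(offsets)
print(merged)
```

The scan itself is extra work that an unordered scheme would not need, which fits the point that the ordering guarantee is part of why geometry shaders can be slow.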
