Geometry shader output ordering


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

6 replies to this topic

#1 IronFox   Members   -  Reputation: 148


Posted 05 February 2014 - 06:31 AM

Hello everyone,

 

I have recently come across a technique implemented by NVIDIA to generate the mesh of an implicit surface via geometry shaders.

Given their implementation, they seem to assume that two different geometry shaders executed on the same data will stream out their result in the same order.

Is there any kind of official definition that specifies whether there is a connection between input-order and output-order when executing geometry shaders?

In the given scenario, each geometry shader execution (of either shader) generates exactly one output vertex from one input vertex, with no conditions.

 

Knowing whether this is vendor-specific or actually specified somewhere would help me a great deal.

 

Cheers




#2 MJP   Moderators   -  Reputation: 11751


Posted 05 February 2014 - 04:10 PM

It's not an assumption, it's a required part of the spec. It's actually part of the reason why geometry shaders can be slow, and why you need to specify a max vertex count for your GS.
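A toy CPU-side model of this point, assuming (as a simplification, not a description of any specific GPU) that a fixed-size output slot is reserved per invocation so that results stay in input order no matter which invocation finishes first. `MAX_VERTICES`, `run_invocation`, and `run_all` are hypothetical names made up for this sketch.

```python
import random

# Hypothetical per-invocation limit, standing in for the declared max vertex count.
MAX_VERTICES = 3

def run_invocation(primitive_id):
    """Stand-in for one geometry shader invocation: emits 0..MAX_VERTICES vertices."""
    count = primitive_id % (MAX_VERTICES + 1)
    return [f"p{primitive_id}v{i}" for i in range(count)]

def run_all(num_primitives, seed=0):
    # Reserve one fixed-size slot per input primitive up front.
    slots = [[None] * MAX_VERTICES for _ in range(num_primitives)]
    # Shuffle the execution order to mimic unpredictable parallel scheduling.
    order = list(range(num_primitives))
    random.Random(seed).shuffle(order)
    for pid in order:
        for i, vertex in enumerate(run_invocation(pid)):
            slots[pid][i] = vertex  # always lands in the slot owned by this input
    return slots
```

Whatever the execution order, slot `i` only ever holds the output of input `i`. The unused `None` entries are the price of reserving the worst case for every invocation.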



#3 IronFox   Members   -  Reputation: 148


Posted 07 February 2014 - 03:49 AM

Hey, thanks. It's nice to know I can rely on this, although I can see the issue with performance. Makes me wonder why there is no switch (that I know of) to disable this.

Regardless, many thanks!



#4 TheChubu   Crossbones+   -  Reputation: 4756


Posted 07 February 2014 - 07:07 AM

Sounds like a job for Vendor Extension Man!

 

...

 

Vendor Extension Man? Are you there?




#5 phantom   Moderators   -  Reputation: 7563


Posted 07 February 2014 - 09:24 AM

IronFox wrote: "Makes me wonder why there is no switch (that I know of) to disable this."


Largely because it wouldn't work; as soon as you need to output a primitive you have to have a contiguous set of data to form that primitive. So in the case of a triangle you need all 3 vertices next to each other in memory for later processing.

If you had 4 instances of a geo-shader each outputting the data for one triangle at the same time, then the order becomes utterly non-deterministic for the later stages, as it could be [1,1,1,2,2,2,3,3,3,4,4,4], or [1,2,2,3,1,4,1,4,3,2,3,4], or any other combination depending on how the memory accesses were ordered.

There is no way for later stages to make sense of that data to build a triangle to process.
You can't add an index buffer as it would require the same data order guarantee as the current geo-shader setup, and the index skipping would then slow down later stages as they hopped about memory to find the data in question.

So, basically, given what the geo-shader can do there is no sane way to do it.
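To make the interleaving problem concrete, here is a small sketch in which each number stands for a vertex written by that invocation, and a later stage simply takes every three consecutive vertices as one triangle. The function names are made up for illustration, and the interleaved list is just one possible ordering.

```python
def assemble_triangles(stream):
    """Group a flat vertex stream into triangles of three consecutive vertices."""
    usable = len(stream) - len(stream) % 3  # ignore any incomplete tail
    return [tuple(stream[i:i + 3]) for i in range(0, usable, 3)]

def is_intact(triangle):
    # Intact means all three vertices came from the same invocation.
    return len(set(triangle)) == 1

# Each value is the id of the invocation that wrote the vertex.
ordered = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
interleaved = [1, 2, 2, 3, 1, 4, 1, 4, 3, 2, 3, 4]  # one hypothetical interleaving

ordered_ok = [is_intact(t) for t in assemble_triangles(ordered)]
interleaved_ok = [is_intact(t) for t in assemble_triangles(interleaved)]
```

With the ordered stream every assembled triangle is intact; with this interleaved stream none of them is, because each group of three mixes vertices from different invocations.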

#6 IronFox   Members   -  Reputation: 148


Posted 08 February 2014 - 06:08 AM

Well, from a pure CPU perspective, I would cache the results on a per-thread basis and eventually consolidate the results from all the worker threads, thus creating correct primitives but in an arbitrary order. But I guess on a massively parallel machine like a GPU different rules apply, so I am uncertain whether such a strategy would even be possible, let alone effective. And it's moot to ponder, since it is as it is. I am just trying to explain how I came to the assumption that the output was not ordered. Thanks again.
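A minimal sketch of that CPU-style strategy, using Python's standard `concurrent.futures`. The `worker` and `generate` functions and the vertex naming are hypothetical, chosen only for illustration: each worker builds whole triangles in a private list, and the lists are merged in whatever order the workers finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def worker(chunk):
    """Build complete triangles for this worker's inputs in a thread-local list."""
    local = []
    for pid in chunk:
        local.append((f"p{pid}a", f"p{pid}b", f"p{pid}c"))
    return local

def generate(num_inputs, num_workers=4):
    # Distribute inputs round-robin across the workers.
    chunks = [range(w, num_inputs, num_workers) for w in range(num_workers)]
    merged = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker, chunk) for chunk in chunks]
        # Consolidate in completion order: primitives stay whole,
        # but their overall order is not guaranteed.
        for future in as_completed(futures):
            merged.extend(future.result())
    return merged
```

Every triangle in the result is intact, but two runs may list them in a different order, which is exactly the property that would break the technique from the original question.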



#7 Krypt0n   Crossbones+   -  Reputation: 2672


Posted 08 February 2014 - 04:20 PM

 

That's why you have to specify the maximum number of vertices you will output.

Geometry shaders can run in parallel; you just end up with buffers of various sizes that need to be merged if you stream out, or you have to live with those 'bubbles', which make it less efficient for the rest of the pipeline, yet it still works.
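The merge step can be pictured roughly like this: a sketch assuming the per-invocation output counts are known once the invocations have finished, with an exclusive prefix sum giving each buffer its place in the final stream. `merge_in_input_order` is a made-up helper, not a real API.

```python
from itertools import accumulate

def merge_in_input_order(per_invocation_output):
    """Pack variable-sized per-invocation buffers into one stream, in input order."""
    counts = [len(out) for out in per_invocation_output]
    # Exclusive prefix sum: offsets[i] is where the output of invocation i starts.
    offsets = [0] + list(accumulate(counts))[:-1]
    merged = [None] * sum(counts)
    # Each copy targets a disjoint range, so the copies may happen in any order;
    # iterating in reverse here makes that explicit.
    for i in reversed(range(len(per_invocation_output))):
        out = per_invocation_output[i]
        merged[offsets[i]:offsets[i] + len(out)] = out
    return merged, offsets
```

For example, `merge_in_input_order([["a0", "a1"], [], ["c0", "c1", "c2"], ["d0"]])` packs the four buffers with no gaps while keeping them in input order; skipping this step is what leaves the 'bubbles' in place.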

 

If there were just one GS running at a time, then the G80 (running shaders at 1.35 GHz) would be faster with geometry shaders than the latest GPU monsters (barely reaching 1 GHz stock).





