Speed - Texture Lookups and Structured Buffers


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

10 replies to this topic

#1 Migi0027   Crossbones+   -  Reputation: 2116

Posted 18 April 2014 - 11:17 AM

Hi guys,

 

So currently I'm using a deferred pipeline, storing the normals, depths, etc. in different render targets.

 

Now I'm not proud of it, but every time I render a light (e.g. a point light), I reconstruct the position and normals per pixel and then do some lighting calculations. A single point light pass takes around 500 us.

 

Now for performance I thought it would be better if I packed the gbuffer into a structured buffer, so I did. I construct the gbuffer from the textures in a compute shader, with the buffer bound as a UAV, then access it in the pixel shader for lighting as an SRV. I was expecting an increase in performance, as texture lookups are expensive, but instead rendering a point light took around 1000 us.

 

For the unpacked gbuffer, I have 3 float4 textures, and in the packed gbuffer, the structured buffer consists of 2 float4 and 1 float3.

 

Now my question is: how do texture lookups and structured buffer reads compare in speed? Is it supposed to be like this, or am I doing something wrong?

 

I apologize that I do not have the source code, but at the moment I do not have access to the machine the source code is on. When I get access I'll post it; that should be within a few hours.

 

Thank you for your time

-MIGI0027


Hi! Cuboid Zone
The Rule: Be polite, be professional, but have a plan to steal all their shaders!

#2 MJP   Moderators   -  Reputation: 11737

Posted 18 April 2014 - 11:48 AM

Texture reads are expensive (relatively speaking) because the GPU has to fetch the data from off-chip memory and then wait for that memory to be available. Buffer reads have the same problem, so you're not going to avoid it by switching to buffers. When you're bottlenecked by memory access, the performance will heavily depend on your access patterns with regards to cache. In this regard textures have an advantage, because GPUs usually store textures in a "swizzled" pattern that maps the texels to hardware caches when fetched in a pixel shader. Buffers are typically stored linearly, which won't map as well to pixel shaders.
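
To illustrate the point about swizzled versus linear layouts, here is a small CPU-side sketch in Python (not shader code). It uses a Morton (Z-order) curve as a stand-in for a swizzled layout, which is an assumption: real GPU swizzle patterns are vendor-specific and not documented in this thread. The hypothetical 1024-texel width and the 4x4 pixel block are also made up for the example. It compares how far apart the addresses of a small block of neighbouring pixels end up in each layout.

```python
def linear_index(x, y, width):
    """Row-major offset, the layout a plain buffer would typically use."""
    return y * width + x


def morton_index(x, y):
    """Interleave the bits of x and y (Z-order curve), 16 bits per coordinate."""
    index = 0
    for bit in range(16):
        index |= ((x >> bit) & 1) << (2 * bit)
        index |= ((y >> bit) & 1) << (2 * bit + 1)
    return index


def address_span(indices):
    """Number of elements between the lowest and highest index touched."""
    return max(indices) - min(indices) + 1


WIDTH = 1024  # hypothetical render target width in texels

# A 4x4 block of neighbouring pixels, similar to what gets shaded together.
block = [(x, y) for y in range(4, 8) for x in range(4, 8)]

linear_span = address_span([linear_index(x, y, WIDTH) for x, y in block])
morton_span = address_span([morton_index(x, y) for x, y in block])

print("linear span:", linear_span)  # linear span: 3076
print("morton span:", morton_span)  # morton span: 16
```

In this toy model the 16 pixels occupy 16 contiguous elements in the swizzled layout but are spread over four separate rows (a span of 3076 elements) in the linear one, which is roughly why the same reads can be friendlier to the cache when they come from a texture.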


Edited by MJP, 18 April 2014 - 12:49 PM.


#3 Migi0027   Crossbones+   -  Reputation: 2116

Posted 18 April 2014 - 12:21 PM

Buffer reads have the same problem, so you're going to avoid it by switching to buffers

 

It's probably just me, but please elaborate.

 

By buffer reads, do you mean reading from any kind of buffer, or structured buffers specifically?


Edited by Migi0027, 18 April 2014 - 12:21 PM.

#4 mhagain   Crossbones+   -  Reputation: 8276

Posted 18 April 2014 - 12:42 PM

I would guess that MJP just accidentally omitted the word "not" there.


It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.


#5 MJP   Moderators   -  Reputation: 11737

Posted 18 April 2014 - 12:49 PM

I would guess that MJP just accidentally omitted the word "not" there.

 

Indeed, sorry about that. 



#6 Migi0027   Crossbones+   -  Reputation: 2116

Posted 18 April 2014 - 01:12 PM

So I should avoid using buffers altogether?


#7 Migi0027   Crossbones+   -  Reputation: 2116

Posted 18 April 2014 - 03:47 PM

So I learned that some developers manage to get a point light to render in about 0.5 us, which I find amazing.

 

So out of curiosity, what are your timings for a single point light, directional light, or something similar?


#8 Hodgman   Moderators   -  Reputation: 31799

Posted 18 April 2014 - 04:15 PM

Your buffer method has about double the floats, so double the memory bandwidth of the texture method. The fact that it also ran in around double the time is an indication that your shader is bottlenecked by memory bandwidth.
Try to reduce your memory requirements as much as possible - e.g. 3x8bit normals and 16 or 24 bit depth ;)
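
As a rough illustration of that kind of packing, here is a CPU-side Python sketch (not shader code) that quantizes a unit normal to 3 x 8 bits and a depth value to 24 bits. The helper names, the [0, 1] depth range, and the 48-bit layout are assumptions made for the example; a real renderer would normally pick matching render target formats and let the hardware do the conversion.

```python
def quantize_unorm(value, bits):
    """Map a value in [0, 1] to an unsigned integer code with `bits` bits."""
    max_code = (1 << bits) - 1
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * max_code + 0.5)


def dequantize_unorm(code, bits):
    """Map an unsigned integer code back to [0, 1]."""
    return code / ((1 << bits) - 1)


def pack_normal_depth(normal, depth01):
    """Pack a unit normal (3 x 8 bit) and a [0, 1] depth (24 bit) into 48 bits."""
    nx, ny, nz = (quantize_unorm(c * 0.5 + 0.5, 8) for c in normal)
    depth_code = quantize_unorm(depth01, 24)
    return (nx << 40) | (ny << 32) | (nz << 24) | depth_code


def unpack_normal_depth(packed):
    """Inverse of pack_normal_depth, up to quantization error."""
    normal = tuple(
        dequantize_unorm((packed >> shift) & 0xFF, 8) * 2.0 - 1.0
        for shift in (40, 32, 24)
    )
    depth01 = dequantize_unorm(packed & 0xFFFFFF, 24)
    return normal, depth01


normal_in = (0.0, 0.6, 0.8)  # example unit-length normal
depth_in = 0.37              # example depth, assumed already scaled to [0, 1]

packed = pack_normal_depth(normal_in, depth_in)
normal_out, depth_out = unpack_normal_depth(packed)

PACKED_BYTES = 6        # 3 x 8 bit + 24 bit
FLOAT32_BYTES = 4 * 4   # the same four values as 32-bit floats
print(PACKED_BYTES, "bytes instead of", FLOAT32_BYTES)  # 6 bytes instead of 16
```

The cost is precision: each normal component comes back with an error of at most about 1/255, so whether 8 bits per component is enough depends on the lighting being done.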

The more recent tiled/clustered deferred renderers reduce bandwidth use by shading more than one light at a time, i.e. they'll read the gbuffer, shade 10 lights, add them and return the sum, thus amortizing the gbuffer read and ROP/OM costs.
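
A back-of-the-envelope version of that amortization, as a Python sketch. The 44-byte gbuffer entry comes from the 2 float4 + 1 float3 layout described earlier in the thread (32-bit floats assumed); the 8-byte lighting target, the assumption that every light touches every pixel, and ignoring the reads of the light data itself are simplifications made up for the example.

```python
GBUFFER_BYTES = (4 + 4 + 3) * 4  # 2 float4 + 1 float3, 4 bytes per float
TARGET_BYTES = 8                 # assumed 16-bit-per-channel RGBA lighting target


def bytes_per_pixel_one_light_per_pass(num_lights):
    """Each pass rereads the gbuffer and blends (read + write) into the target."""
    blend_bytes = 2 * TARGET_BYTES
    return num_lights * (GBUFFER_BYTES + blend_bytes)


def bytes_per_pixel_batched():
    """One gbuffer read and one target write, however many lights are summed.

    In this simplified model the cost does not depend on the light count.
    """
    return GBUFFER_BYTES + TARGET_BYTES


lights = 10
separate = bytes_per_pixel_one_light_per_pass(lights)
batched = bytes_per_pixel_batched()
print(separate, batched)  # 600 52
```

Under these assumptions, ten lights cost 600 bytes of traffic per pixel when drawn one per pass and 52 bytes when shaded together, so the gbuffer layout matters far less once the read is shared between lights.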

#9 Migi0027   Crossbones+   -  Reputation: 2116

Posted 19 April 2014 - 04:51 PM

So I managed to compress my gbuffer into 2 float4s, which is good, I hope.

struct GB
{
	float4 ColorXYZ_RoughW;
	float4 NormXY_PostZ_DepthW;
};

So, as millions of people recommend, I store the view-space depth and reconstruct the position later on, but there's a problem: the position isn't being reconstructed correctly. I'm following part 3 of MJP's article on position reconstruction.

 

What I'm currently doing:

Forward GBuffer Rendering:
output.depth = length(input.positionView.xyz);

Directional Light Shading:
   Vertex Shader:
	float3 positionWS = mul(Output.Pos, mul(viewInv, projInv)); // not the fastest, I know...
	Output.ViewRay = positionWS - cameraPosition;

   Pixel Shader:
	float depth = gb.NormXY_PostZ_DepthW.w;
	float3 viewRay = normalize(input.ViewRay);
	float3 positionWS = cameraPosition + viewRay * depth;
	return float4(positionWS, 1); // debugging purposes

And this is the result:

 

[image: wb40h2.png]

 

So I'm not expecting a rescue mission, but maybe you could spot the issue? (If it's that simple.)
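
For reference, the math behind this kind of reconstruction can be checked on the CPU. The Python sketch below is not the shader from this post and does not claim to identify the actual bug. It only shows that storing the distance to the camera (length(positionView.xyz)) and multiplying it by a normalized view ray does recover the world-space position, provided the ray really points from the camera through the pixel. The camera and surface positions are made-up values, and the view-space z variant assumes a camera looking down +Z with no rotation. Two things that might be worth checking in the vertex shader, depending on the matrix convention in use: with row vectors (mul(v, M)), going from clip space to world space applies projInv first and viewInv second, and the float4 result of an inverse projection normally needs a divide by w before it is used as a position.

```python
import math


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(v, s):
    return tuple(x * s for x in v)


def length(v):
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    return scale(v, 1.0 / length(v))


camera_position = (1.0, 2.0, -5.0)  # made-up example values
surface_point = (3.0, 0.5, 4.0)

# G-buffer pass: distance from the camera. This equals length(positionView.xyz)
# because the view transform is a rigid rotation plus translation.
stored_distance = length(sub(surface_point, camera_position))

# Lighting pass: any world-space point on the same pixel's ray will do,
# e.g. the unprojected far-plane position of that pixel.
far_point = add(camera_position, scale(sub(surface_point, camera_position), 7.5))
view_ray = sub(far_point, camera_position)

# Normalized ray times stored distance recovers the position.
reconstructed = add(camera_position, scale(normalize(view_ray), stored_distance))

# Mismatch: if view-space z had been stored instead (camera assumed to look
# down +Z, unrotated), the normalized ray gives the wrong point...
view_z = surface_point[2] - camera_position[2]
mismatched = add(camera_position, scale(normalize(view_ray), view_z))

# ...and the unnormalized ray scaled by view_z / ray_z is what matches it.
from_view_z = add(camera_position, scale(view_ray, view_z / view_ray[2]))

print(length(sub(reconstructed, surface_point)) < 1e-9)
print(length(sub(mismatched, surface_point)) > 0.1)
print(length(sub(from_view_z, surface_point)) < 1e-9)
```

The takeaway is that the stored value and the form of the ray have to agree: a distance goes with a normalized ray, while a view-space z goes with a ray scaled so that its view-space z is 1.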


#10 Migi0027   Crossbones+   -  Reputation: 2116

Posted 21 April 2014 - 03:00 AM

Maybe I'm doing something wrong, as usual.

 

In the vertex shader I have the Output.Pos, which is the screen space pos of the full screen quad.


#11 Migi0027   Crossbones+   -  Reputation: 2116

Posted 21 April 2014 - 06:21 AM

But overall, should I switch back to regular textures instead of buffers?

