Speed - Texture Lookups and Structured Buffers


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

10 replies to this topic

#1 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 18 April 2014 - 11:17 AM

Hi guys,

 

So currently I'm following the deferred pipeline, storing the normals, depths, etc. in different render targets.

 

Now I'm not proud of it, but every time I render a light (a point light, for example), I reconstruct the position and normal per pixel and then do the lighting calculations. A single point light pass takes around 500 us.

 

For performance, I thought it would be better to pack the gbuffer into a structured buffer, so I did. A compute shader builds the packed gbuffer from the textures, writing to the buffer bound as a UAV, and the pixel shader then reads it as an SRV for lighting. I was expecting a performance increase, since texture lookups are expensive, but instead rendering a point light took around 1000 us.

 

For the unpacked gbuffer, I have 3 float4 textures, and in the packed gbuffer, the structured buffer consists of 2 float4 and 1 float3.

 

Now my question is: how do texture lookups and structured buffer reads compare in speed? Is this result expected, or am I doing something wrong?

 

I apologize that I can't post the source code; at the moment I don't have access to the machine it's on. When I get access, I'll post it, which should be within a few hours.

 

Thank you for your time

-MIGI0027


FastCall22: "I want to make the distinction that my laptop is a whore-box that connects to different network"

Blog about... stuff (GDNet, WordPress): www.gamedev.net/blog/1882-the-cuboid-zone/, cuboidzone.wordpress.com/



#2 MJP   Moderators   -  Reputation: 14505


Posted 18 April 2014 - 11:48 AM

Texture reads are expensive (relatively speaking) because the GPU has to fetch the data from off-chip memory and then wait for that data to become available. Buffer reads have the same problem, so you're not going to avoid it by switching to buffers. When you're bottlenecked by memory access, the performance will heavily depend on your access patterns with regard to the cache. In this regard textures have an advantage, because GPUs usually store textures in a "swizzled" pattern that maps the texels well to hardware caches when fetched in a pixel shader. Buffers are typically stored linearly, which won't map as well to pixel shaders.
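
To illustrate the access-pattern point, here is a small CPU-side C++ sketch that compares a linear (row-major) address with a Morton (Z-order) address. Real GPU texture layouts are vendor-specific and are not described in this thread; Morton order is used here only as a common stand-in for a "swizzled" layout, and the function names `linear_index`, `part1by1`, and `morton_index` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Row-major address: vertically adjacent elements are `width` elements apart.
static uint32_t linear_index(uint32_t x, uint32_t y, uint32_t width) {
    return y * width + x;
}

// Spread the low 16 bits of v so that a zero bit sits between each original bit.
static uint32_t part1by1(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Morton (Z-order) address: interleave the bits of x (even bits) and y (odd bits).
// Assumption: coordinates fit in 16 bits each.
static uint32_t morton_index(uint32_t x, uint32_t y) {
    assert(x < 65536u && y < 65536u);
    return part1by1(x) | (part1by1(y) << 1);
}
```

For the 2x2 pixel quad starting at (4, 4), the Morton addresses are four consecutive elements, while in a 1024-wide linear layout the two rows of the quad are 1024 elements apart. That is the kind of locality a pixel shader benefits from.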


Edited by MJP, 18 April 2014 - 12:49 PM.


#3 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 18 April 2014 - 12:21 PM

Buffer reads have the same problem, so you're going to avoid it by switching to buffers

 

It's probably just me, but please elaborate.

 

By buffer reads do you mean just reading from a buffer or structured buffers?


Edited by Migi0027, 18 April 2014 - 12:21 PM.



#4 mhagain   Crossbones+   -  Reputation: 9954


Posted 18 April 2014 - 12:42 PM

I would guess that MJP just accidentally omitted the word "not" there.


It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.


#5 MJP   Moderators   -  Reputation: 14505


Posted 18 April 2014 - 12:49 PM

I would guess that MJP just accidentally omitted the word "not" there.

 

Indeed, sorry about that. 



#6 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 18 April 2014 - 01:12 PM

So should I avoid using buffers altogether?




#7 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 18 April 2014 - 03:47 PM

So I learned that some developers manage to get a point light to render in about 0.5 us, which I find amazing.

 

So out of curiosity, what are your timings for a single light (point, directional, or something similar)?




#8 Hodgman   Moderators   -  Reputation: 42283


Posted 18 April 2014 - 04:15 PM

Your buffer method has about double the floats, so double the memory bandwidth of the texture method. The fact that it also ran in around double the time is an indication that your shader is bottlenecked by memory bandwidth.
Try to reduce your memory requirements as much as possible - e.g. 3x8bit normals and 16 or 24 bit depth ;)
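
As a rough sketch of that packing suggestion, the C++ below stores a unit normal as three 8-bit unorm channels inside one 32-bit word (4 bytes instead of the 16 bytes of a 32-bit float4). It runs on the CPU only to show the arithmetic; in a real renderer an 8-bit render target format would typically do the conversion. The names `Float3`, `to_unorm8`, `from_unorm8`, `pack_normal`, and `unpack_normal` are hypothetical, and leaving the top byte free for another value is an assumption, not something stated in the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Float3 { float x, y, z; };

// Map a component in [-1, 1] to an 8-bit unorm value in [0, 255].
static uint8_t to_unorm8(float v) {
    float c = std::fmin(std::fmax(v, -1.0f), 1.0f);
    return static_cast<uint8_t>(std::lround((c * 0.5f + 0.5f) * 255.0f));
}

// Map an 8-bit unorm value back to [-1, 1].
static float from_unorm8(uint32_t b) {
    return (static_cast<float>(b) / 255.0f) * 2.0f - 1.0f;
}

// Pack the normal into the low 24 bits; the top 8 bits stay zero (free for reuse).
static uint32_t pack_normal(Float3 n) {
    return static_cast<uint32_t>(to_unorm8(n.x))
         | (static_cast<uint32_t>(to_unorm8(n.y)) << 8)
         | (static_cast<uint32_t>(to_unorm8(n.z)) << 16);
}

// Unpack and renormalize, since quantization shortens or lengthens the vector slightly.
static Float3 unpack_normal(uint32_t p) {
    Float3 n = { from_unorm8(p & 0xFFu),
                 from_unorm8((p >> 8) & 0xFFu),
                 from_unorm8((p >> 16) & 0xFFu) };
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}
```

The round trip is lossy: each component lands within roughly 1/255 of its original value, which is usually the trade-off accepted in exchange for a quarter of the bandwidth.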

The more recent tiled/clustered deferred renderers reduce bandwidth by shading more than one light at a time, i.e. they read the gbuffer, shade 10 lights, add them, and return the sum, thus amortizing the gbuffer read and ROP/OM costs.
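
A back-of-the-envelope estimate of that amortization, as a C++ sketch. It assumes the worst case where every light touches every pixel and ignores caches, so it only shows how the gbuffer read traffic scales; the function names and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// G-buffer bytes read when every light is its own pass:
// each pass re-reads the G-buffer for every pixel it covers.
static uint64_t gbuffer_bytes_one_pass_per_light(uint64_t pixels,
                                                 uint64_t bytes_per_pixel,
                                                 uint64_t lights) {
    return pixels * bytes_per_pixel * lights;
}

// G-buffer bytes read when lights are shaded in batches:
// the G-buffer is read once per batch and all lights in the batch reuse it.
static uint64_t gbuffer_bytes_batched(uint64_t pixels,
                                      uint64_t bytes_per_pixel,
                                      uint64_t lights,
                                      uint64_t lights_per_batch) {
    assert(lights_per_batch > 0);
    uint64_t batches = (lights + lights_per_batch - 1) / lights_per_batch;
    return pixels * bytes_per_pixel * batches;
}
```

For example, at 1920x1080 with 32 bytes of gbuffer per pixel and 10 full-screen lights, one pass per light reads 663,552,000 bytes, while a single batch of 10 reads 66,355,200 bytes.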

#9 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 19 April 2014 - 04:51 PM

So I managed to compress my gbuffer into 2 float4s, which is good, I hope.

struct GB
{
	float4 ColorXYZ_RoughW;
	float4 NormXY_PostZ_DepthW;
};

So, as millions of people say, I store a depth value derived from the view-space position and reconstruct the position later on, but there's a problem. The position isn't being reconstructed correctly, even though I'm following part 3 of MJP's article on position reconstruction.

 

What I'm currently doing:

Forward GBuffer Rendering:
output.depth = length(input.positionView.xyz);

Directional Light Shading:
   Vertex Shader:
	float3 positionWS = mul(Output.Pos, mul(viewInv, projInv)); // not the fastest, I know...
	Output.ViewRay = positionWS - cameraPosition;

   Pixel Shader:
	float depth = gb.NormXY_PostZ_DepthW.w;
	float3 viewRay = normalize(input.ViewRay);
	float3 positionWS = cameraPosition + viewRay * depth;
	return float4(positionWS, 1); // debugging purposes

And this is the result:

 

wb40h2.png

 

So I'm not expecting a rescue mission, but maybe you could spot the issue? (If it's that simple.)
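
For reference, here is a CPU-side C++ sketch of the math behind this kind of reconstruction, done in view space so that the camera sits at the origin. It is not taken from the thread or from MJP's article: the `Projection` struct, the left-handed +z-forward convention, and all function names are assumptions for illustration. The stored value is the distance to the camera (as in `length(input.positionView.xyz)` above), and the ray is built directly from the projection's scale terms. If the ray is instead obtained by multiplying with an inverse view-projection matrix, the result has to be divided by its w component before the camera position is subtracted; whether that is the cause of the image above cannot be confirmed from the posted code alone.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static float vec_length(Vec3 v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Hypothetical symmetric perspective projection, reduced to its x/y scale terms.
struct Projection { float xScale, yScale; };

static Projection make_projection(float fovY, float aspect) {
    float yScale = 1.0f / std::tan(fovY * 0.5f);
    return { yScale / aspect, yScale };
}

// View space -> NDC xy. The division by z is the perspective divide (w = z).
static void project_to_ndc(Projection p, Vec3 viewPos, float& ndcX, float& ndcY) {
    assert(viewPos.z > 0.0f);
    ndcX = viewPos.x * p.xScale / viewPos.z;
    ndcY = viewPos.y * p.yScale / viewPos.z;
}

// G-buffer pass: store the distance from the camera, not the view-space z.
static float store_distance(Vec3 viewPos) {
    return vec_length(viewPos);
}

// Lighting pass: build the view ray through the pixel, normalize it,
// and scale it by the stored distance.
static Vec3 reconstruct_view_pos(Projection p, float ndcX, float ndcY, float dist) {
    Vec3 ray = { ndcX / p.xScale, ndcY / p.yScale, 1.0f };
    float s = dist / vec_length(ray);
    return { ray.x * s, ray.y * s, ray.z * s };
}
```

Because the stored value is a distance, the ray must be normalized per pixel, after interpolation. For a world-space result, the same ray would be rotated by the inverse view matrix and the camera position added.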




#10 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 21 April 2014 - 03:00 AM

Maybe I'm doing something wrong, as usual.

 

In the vertex shader I have the Output.Pos, which is the screen space pos of the full screen quad.




#11 Migi0027 (肉コーダ)   Crossbones+   -  Reputation: 3828


Posted 21 April 2014 - 06:21 AM

But overall, should I switch back to regular textures instead of buffers?






