Pixel Shader resolution question

Hi,
I've written quite an expensive pixel shader that takes a long time to run (about 2-3 seconds per frame).

Since my pixel shader only performs very simple transformation operations (iterated about a hundred times), it cannot be optimized much further.

So the only solution I can see is to reduce the number of points for which the pixel shader procedure is called.

Is there a way to control that in Direct3D?
Is there a way to know for which points Direct3D will call my pixel shader procedure?

I'm currently using Direct3D 9 with Pixel Shader 3.0.

Thanks,
Lisur

The pixel shader is called (in the best case) once per pixel per frame. So if you want the pixel shader to be called fewer times per frame, you just have to reduce the render target size (be aware that you will lose quality).

Can you tell us what the pixel shader does? 2-3 seconds per frame is A LOT...

I want my pixel shader to be called only once per frame, but for a smaller number of points, so reducing the quality is exactly what I want.
I'm doing some nasty raymarching (100 samples per pixel).

Can you tell me how to reduce the render target size in D3D9?

Thanks :)


Can you tell me how to reduce the render target size in D3D9?


You can specify the size of the back buffer in the D3DPRESENT_PARAMETERS structure used to create the D3DDevice (via its BackBufferWidth and BackBufferHeight members).
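
To illustrate where that happens, here's a minimal device-creation sketch (not code from this thread; hWnd and the IDirect3D9 pointer d3d are assumed to come from your own initialization):

#include <d3d9.h>

// Halving both back buffer dimensions quarters the number of pixels
// the pixel shader has to process each frame.
D3DPRESENT_PARAMETERS d3dpp;
ZeroMemory(&d3dpp, sizeof(d3dpp));
d3dpp.BackBufferWidth  = 640;
d3dpp.BackBufferHeight = 360;
d3dpp.BackBufferFormat = D3DFMT_X8R8G8B8;
d3dpp.SwapEffect       = D3DSWAPEFFECT_DISCARD;
d3dpp.Windowed         = TRUE;
d3dpp.hDeviceWindow    = hWnd;   // assumed: an existing window handle

IDirect3DDevice9* device = NULL;
d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
                  D3DCREATE_HARDWARE_VERTEXPROCESSING,
                  &d3dpp, &device);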

Hmmm...

So TiagoCosta says the pixel shader is called once for every pixel projected on screen, and Matias Goldberg says it is called once for every texel projected onto the rendered surface.

Can anyone confirm which one is right?

The surface my pixel shader is applied to is a quad (4 vertices) with only position and texture coordinates per vertex.

Thanks,
Lisur

I said IN THE BEST CASE the pixel shader is called once per pixel on screen (that is, once per pixel of the render target you're rendering to). But that is only true if there is no overdraw (for example, when you draw a single fullscreen triangle), so each pixel is only shaded once per frame. In your case, since you're drawing a fullscreen quad, there will be some pixel overdraw along the edge where the two triangles meet.

If you're rendering to the back buffer, you have to reduce the size of the back buffer; on the other hand, if you're rendering to a different render target, you have to reduce the size of that render target.

Whattttt??????????? When did I say that?
Are you rendering a fullscreen quad?

What do you mean by "iterated about a hundred times"?
Do you mean you have something like for(i = 0; i < 100; ++i) in your shader?

And what do you mean by "the only solution is to reduce the number of points for which the pixel shader will be called"?
If you have a render target that is 512x512, it will process 4x fewer pixels than a 1024x1024 render target.

We're just trying to pin down your cryptic post.

Sorry for the misunderstanding :(

Yes, you could say I'm rendering a fullscreen quad (the quad is rendered just in front of the near clipping plane).

What I'm trying to do is implement volumetric smoke rendering similar to chapter 30.3, "Volume Rendering," of http://http.developer.nvidia.com/GPUGems3/gpugems3_ch30.html, except that it has to render only the volumetrics, without the scenery.

So what I do is create a volumetric texture containing the smoke and a huge quad with texture coordinates.

The pixel shader is called when the quad is rendered, and at every point where it is called, I raymarch through the volumetric texture and output the resulting color to the quad surface.
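
For reference, the per-pixel work in a shader like that usually boils down to a loop of this shape (a hand-written ps_3_0 sketch, not the actual shader from this thread; SmokeVolume, RayDirStep and StepCount are made-up names):

sampler3D SmokeVolume : register(s0);  // the 3D smoke texture
float3 RayDirStep;                     // ray direction * step length, set by the app
static const int StepCount = 100;      // the "hundred iterations" mentioned above

float4 main(float3 entry : TEXCOORD0) : COLOR0
{
    float3 pos = entry;   // ray entry point in volume texture space
    float4 result = 0;

    // One tex3D fetch plus a blend per step; with 100 steps per pixel,
    // this loop is where nearly all of the cost comes from.
    for (int i = 0; i < StepCount; ++i)
    {
        float density = tex3D(SmokeVolume, pos).a;
        result += (1.0 - result.a) * density;  // front-to-back accumulation
        pos += RayDirStep;
    }
    return result;
}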

Yes, by iterating I meant a for loop.

So what I want is to reduce the number of points where the pixel shader routine is called.

The proposed methods were reducing the backbuffer dimensions and reducing the texture dimensions, so that's where the misunderstanding happened.

Thanks,
Lisur

After speed-reading that article, I think you should reduce the number of samples you take... Do you lose that much quality by doing so? What's the size of the 3D texture you're sampling? Maybe you can post your shader...

OK, I see. You never mentioned a volumetric texture (or any texture at all), so I could hardly have been referring to downsizing the volume texture.

It's funny that the article is about real-time simulation of 3D fluids, because 2 seconds per frame is not real time.

Reducing the backbuffer's dimensions will greatly improve your performance, since the pixel shader runs fewer times.
However, reducing your volumetric texture's size may actually help too, because of better cache usage. Just try what works best.

By the way, the technique in that article doesn't render directly to the backbuffer. It applies your raymarching shader to a much smaller offscreen render target and then mixes the result with the backbuffer.
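
In D3D9 terms, that offscreen approach looks roughly like the following (my own sketch of the article's idea, with error handling omitted; device, width and height come from your own setup):

// Create a half-resolution render target for the expensive pass.
IDirect3DTexture9* lowResTex = NULL;
device->CreateTexture(width / 2, height / 2, 1, D3DUSAGE_RENDERTARGET,
                      D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &lowResTex, NULL);

IDirect3DSurface9* lowResSurf = NULL;
lowResTex->GetSurfaceLevel(0, &lowResSurf);

// 1) Run the raymarching shader into the small target.
device->SetRenderTarget(0, lowResSurf);
// ... draw the fullscreen quad with the raymarching shader ...

// 2) Switch back to the back buffer and composite the result with a
//    cheap textured quad (bilinear filtering does the upscaling).
IDirect3DSurface9* backBuffer = NULL;
device->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &backBuffer);
device->SetRenderTarget(0, backBuffer);
// ... draw a quad textured with lowResTex, alpha-blended over the scene ...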

Thank you both for your suggestions and patience. You were both very helpful.

2 seconds per frame is a very bad result. Using other volume rendering techniques (slicing), I managed to reach speeds of only a few milliseconds per frame, but they gave very poor quality compared to this method.

The texture is about 256³ in size. When the number of slices is reduced below 100, there is a visible difference in quality.

Texture generation is very fast, so there are no problems with that part. Reading the texture is also fast, and reducing its size also reduces quality.

Reducing the quad to the size of the smoke bounds did make some difference in speed, but not much, since the smoke takes up most of the screen and rays which do not intersect the smoke bounding box are discarded anyway.

Reducing the backbuffer size changed the speed linearly with the number of pixels.

I wanted to implement a high-quality, high-speed volumetric rendering technique, but it seems I will have to find some tradeoff between quality and speed :(

Thanks again,
Lisur

Yeah, this is the drawback of using NVIDIA samples: they aren't created as general solutions to the problem. Their solutions often only work in the context they're used in.
You might be better off running your fluid simulation in a DirectCompute or CUDA kernel and only rendering the results from that simulation. I have seen this done in real time and it looks pretty good.



2 seconds per frame is a very bad result. Using other volume rendering techniques (slicing), I managed to reach speeds of only a few milliseconds per frame, but they gave very poor quality compared to this method.

Slicing is generally how it's done. As for quality, it depends a lot on how you place those slices (and on how many there are). It's not my strongest area, but AFAIK there are algorithms for determining the optimal slice positions and orientations.

This seems to be one of the few cases where dynamic branching might actually help. You could read the pixel position system value and then decide, based on some criterion, whether to do the full calculation for that pixel. That would let you apply a stipple pattern across your fullscreen quad and thus reduce the number of times you perform your calculations (as opposed to the number of times your shader is invoked).
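
A rough ps_3_0 sketch of that stipple idea (hypothetical code, not tested; the raymarching loop is just the same shape as the one discussed earlier in the thread):

sampler3D SmokeVolume : register(s0);
float3 RayDirStep;

float4 Raymarch(float3 entry)
{
    // The expensive 100-step sampling loop.
    float4 result = 0;
    float3 pos = entry;
    for (int i = 0; i < 100; ++i)
    {
        result += (1.0 - result.a) * tex3D(SmokeVolume, pos).a;
        pos += RayDirStep;
    }
    return result;
}

float4 main(float3 entry : TEXCOORD0, float2 screenPos : VPOS) : COLOR0
{
    // VPOS gives the pixel's screen coordinates. Shade only alternate
    // pixels of a checkerboard; a later pass could fill in the skipped
    // pixels from their neighbours.
    if (fmod(screenPos.x + screenPos.y, 2.0) >= 1.0)
        return float4(0, 0, 0, 0);

    return Raymarch(entry);
}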

Also, if you can swing an upgrade to D3D11, DirectCompute would be a perfect match for this application!

I studied the problem in depth and ran some tests; here's what I got:

I used the stencil buffer to project the grid's sides onto the near clipping plane, so the pixel shader routine is never called for a point whose ray does not intersect the smoke. With that, I reached 500 ms per frame (a 4x speedup).
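
For anyone reading along, the render-state side of such a stencil mask looks roughly like this in D3D9 (my own sketch of the idea, not the poster's code):

// Clear the stencil buffer, then mark the pixels covered by the smoke.
device->Clear(0, NULL, D3DCLEAR_STENCIL, 0, 1.0f, 0);

// Pass 1: draw the smoke bounding geometry with color writes disabled,
// writing 1 into the stencil buffer wherever it is rasterized.
device->SetRenderState(D3DRS_STENCILENABLE, TRUE);
device->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
device->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_REPLACE);
device->SetRenderState(D3DRS_STENCILREF, 1);
device->SetRenderState(D3DRS_COLORWRITEENABLE, 0);
// ... draw the smoke bounding box ...

// Pass 2: draw the fullscreen quad with the raymarching shader, but
// only where stencil == 1, so off-smoke pixels never run the shader.
device->SetRenderState(D3DRS_COLORWRITEENABLE, 0xF);
device->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_EQUAL);
device->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_KEEP);
// ... draw the fullscreen quad ...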

A funny thing happened: when I reduce the backbuffer width/height, the timing stays the same.
Another thing that seems suspicious to me is the CPU usage. One of the CPU cores goes up to 90% while the pixel shader is executing (when I use slices, CPU usage is very low).

Is there a chance that Direct3D is actually emulating my pixel shader routine on the CPU? How can I check that?

Thanks,
Lisur

Just to let you know that I found a solution :)

I was compiling the pixel shader on every frame, so compilation took a lot of the time. That was also the reason the CPU usage was high.
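
For reference, the usual D3DX pattern is to compile once at startup and only bind the compiled shader each frame (a sketch with assumed names; "raymarch.hlsl" and "main" are placeholders):

#include <d3dx9.h>   // D3DX utility library (link against d3dx9.lib)

// At init time only: compile the shader once and keep the result.
ID3DXBuffer* code = NULL;
IDirect3DPixelShader9* ps = NULL;
D3DXCompileShaderFromFile(TEXT("raymarch.hlsl"), NULL, NULL,
                          "main", "ps_3_0", 0, &code, NULL, NULL);
device->CreatePixelShader((DWORD*)code->GetBufferPointer(), &ps);
code->Release();

// Per frame: no compilation, just bind the already-compiled shader.
device->SetPixelShader(ps);
// ... draw the fullscreen quad ...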

Now I can run my program at 20-30 FPS without reducing the backbuffer or texture size, which can be considered real-time :D

Thanks again, everyone
