low sample count screen space reflections

RPTD · 2013-08-03T12:02:42

The basic idea behind SSR is clear to me. Although there is little useful information around, the basic idea is to simply march along a ray in either view space or screen space. Personally I do it in screen space as I think this is better, but I can't say for sure given the lack of information. Whatever the case, the common approach seems to be a linear stepping along the ray followed by a bisecting search to refine the result. The bisecting search is clear and, depending on the step size, takes around 5-6 steps for a large screen and a ray running across a large part of the screen.

The problematic part is the step size. I made tests with 20 linear steps (not counting the refinement). For a large screen (1680x1050 as an example) and a moderately long ray bouncing from one side of the screen to the other, let's say 1000 pixels long, this gives a step size of 1000/20 = 50 pixels. This is quite large and steps right across thinner geometry, for example the edges of the boxes in the test-bed I put together, attached below (smaller than 1680x1050 as it's from the editor). Furthermore it leads to incorrect sampling as seen on the right side.

[attachment=17069:test1b.jpg]

Now I've seen other people claiming they use only 16 samples (on the same large screen or larger), even for long rays running across the screen. 16 samples is even less than the 20 I used in the test, which already misses a great deal of geometry. Nobody ever stated, though, how these undersampling issues work out with such a low sample count. In my tests I required 80-100 samples to keep the undersampling issues somewhat at bay (speed is gruesome). So the question is:

1) How can 16 samples for the linear search possibly work without these undersampling issues?

Another issue is stuff like a chair or table resting on the ground. All rays passing underneath would work with an exhaustive search across the entire ray. With the linear test, though, the search goes into the bisecting phase at the first step where the ray crosses geometry like the table or chair. The bisecting search then finds no solution and thus leaks the env-map through. Some others seem not to be affected by this problem, but what happens there? Do they continue stepping along the ray if the bisecting fails? This though would increase the sample count beyond 20+6 and kills the worst case. So the other questions are:

2) When a ray passes underneath geometry at the first linear search hit and the bisecting fails to return a result, what do you do? Continue along the ray with a worse worst-case sample count, or fail out?

3) How do you detect these cases properly in order to fade out? Blurring, or something more intelligent?

Graphics and GPU Programming Programming

Started by RPTD July 30, 2013 12:31 PM

17 comments, last by RPTD 10 years, 8 months ago

Yours3!f

1,534

July 31, 2013 06:51 PM

Actually for a long ray running across the screen it does not matter that much whether you are in screen space or view space. Each translates to the other using a simple calculation. So if you take the start and end point of the ray in view space, translate them into screen space and split the ray into 8 pieces, you end up on average with a pixel block size of 210. The only difference is that when stepping in view space instead of screen space the block size is not uniform (huge block size for the first step, then smaller with each step). So from this point of view I still expect jumps of roughly 210 pixels per ray step, which is more than 10% of the screen size for long rays. So how does it pan out with geometry thinner than roughly 200 pixels on screen?

What do you mean by "non-uniform flow control"? If you don't continue after fine-checking a pixel block, where do you need non-uniform control flow? In this situation you need one initial loop to find the candidate pixel block for the fine-search and then a second loop afterwards doing the fine-search on the candidate block found (no matter if nested or not; on my card nesting drops speed like hell). So the cost is an 8-loop + a 32-loop, hence a 40-loop in total for all pixels in a warp.

Mind stating what card you run this on and what the artifacts look like?

What do you mean by 'long running'? If the ray travelled too far, I simply discard it. There's no reason to display it if the result is not good enough. (see my prev post)

Yes, that is true, 210 pixels, but as I wrote previously my initial step size depends on the view space z value. So if it is close to the camera, then the step size will be smaller. Again reflections for far away objects are simply discarded.

I mean IFs and ELSEs. Video cards don't really like them, especially if a texture lookup depends on them.
My loops are like this:
for c = 0 to 8                  // coarse search
{
    if( beyond depth )
    {
        for d = 0 to 32         // fine search inside the candidate block
        {
            other ifs... (and texture lookups)
        }
        break;                  // no further coarse steps after the fine search
    }
}

but yeah, at max 40 lookups for high quality.
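As a rough check of the 8 + 32 = 40 lookup budget, the loop structure above could be sketched in Python as follows. The `behind` callback and the plain linear fine search are assumptions made for illustration; the "other ifs" of the real shader are not known and are left out.

```python
def two_phase_search(behind, ray_length, coarse_steps=8, fine_steps=32):
    """Coarse search for a candidate block, then a linear fine search in it.
    Returns (hit position or None, number of depth lookups used)."""
    lookups = 0
    block = ray_length / coarse_steps
    for c in range(1, coarse_steps + 1):
        lookups += 1
        if behind(c * block):
            start = (c - 1) * block
            fine = block / fine_steps
            for d in range(1, fine_steps + 1):
                lookups += 1
                t = start + d * fine
                if behind(t):
                    return t, lookups
            # Only reachable if rounding makes the last fine sample differ
            # from the coarse sample; the search does not continue either way.
            return None, lookups
    return None, lookups

# Hypothetical 1680 pixel ray: 8 coarse steps give 210 pixel blocks.
typical = two_phase_search(lambda t: t >= 1000, 1680)
worst = two_phase_search(lambda t: t >= 1680, 1680)
print(typical, worst)
```

The worst case occurs when the hit sits at the very end of the last block: all 8 coarse and all 32 fine lookups are spent, matching the 40 lookups quoted above.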

As I mentioned, I ran it on an AMD A8-4500M APU, which has the 7640G IGP (stock clocks).

see pics for error.

Blog:

http://extremeistan.wordpress.com/

Stuff I wrote:

https://github.com/Yours3lf/libmymath

https://github.com/Yours3lf/linux_gl_fps

https://github.com/Yours3lf/instanced_font_rendering

http://youtu.be/k8PYkihyGXA

https://github.com/scrawl/smaa-opengl

https://github.com/Yours3lf/gl_browser_gui

Follow me on twitter:

https://twitter.com/0martint

RPTD

359

Author

July 31, 2013 07:14 PM

A long-running or long ray is a ray which covers a large distance in screen space (number of pixels). So for example if the tubes had a reflective material, the ray could easily run across the screen from the right border all the way to the left border. In this case the test-ray can easily cover >75% of the screen dimension. These are extreme cases and, as you mentioned, tricky. But since I'm using PBR everything reflects, and doing reflections only for very close geometry shows considerably.

I agree with you that a result should not be shown if it is not good enough. The problem is how to define mathematically when a result is good and when it is not. So far I could not find a robust definition to tell whether the SSR influence for a single pixel should go to 0 or stay at 1 (or a percentage somewhere in between). I find it especially difficult with the missing hits due to undersampling while stepping. In this case pixels with full coverage (I call this influence factor coverage) are right next to those with no coverage, as visible in the above image. I could not find a mathematically reasonable definition for the coverage there.

Judging from the images you provided you only check rays up to something like 5m in length, or am I wrong? I currently allow rays up to the entire frustum range, clamped to the frustum in screen space. This is so to speak the worst case, hence a visible ray until it goes out of view in the distance or off any side of the screen. I heard CryEngine has a maximum length for the rays too, but I could never find any numbers on what this looks like. I tried limiting the length of the test-rays but somehow never got convincing results, since the original problem of missing geometry still applied, just with shorter step sizes.

I could give view space a try though. I think I've seen somebody using the z-coordinate to calculate the step size, but I haven't figured out yet what the logic behind it is (if there is any).

Life's like a Hydra... cut off one problem just to have two more popping out.
Leader and Coder: Project Epsylon | Drag[en]gine Game Engine

Yours3!f

1,534

July 31, 2013 08:13 PM

I defined the distance to be 50 units. This is again an empirical value, and is probably highly dependent on the scene. This is probably not 5 meters, as I don't really know how much that would be in the real world, but something like that. According to Blender the length of the blue curtain is 28 meters, and the length of the vase is 5 meters... I downloaded the original file from CryEngine, and it is like 10x bigger. So based on real-world photos I scaled it down, so that the columns are about 2.2m high in Blender. This way 5 units (or meters now?) seemed to be fine.

I'm looking forward to your view space implementation!

RPTD

359

Author

August 01, 2013 10:18 PM

I gave implementing it a try, but the view-space version is even worse than the screen-space version as far as the broad-phase goes. This I did expect, since I went for screen space to counter exactly this problem. The narrow-phase, though, I had to adjust and this one works better. The coverage calculation, though, is totally horrible and results in punctured geometry worse than before. The marked areas show this problem well.

[attachment=17150:test2.jpg]

So I guess the best solution is screen-space but with a modified narrow-phase calculation. Let's see if this works out.

What is astonishing is that with the modified narrow-phase in the view-space version, coverage actually fades out if samples turn out not to be included in the image. If only the punctured pattern would go away, it would be near optimal given the unstable nature of the SSR algorithm to begin with.

Quat

569

August 01, 2013 10:35 PM

I am working on screen space reflections as well. I also did both screen-space and view-space marching. I would support reflected rays in view space up to 50 meters. Both seemed to have issues, but I felt like screen space had fewer artifacts, and it works well if you are using hardware depth values. However, doing a coarse linear march followed by refinement did not give me very good results. Basically I could not make the coarse step size too large. Depending on the ray direction, the broad phase would miss intersections around the edges/boundaries of objects, resulting in noticeable aliasing. I settled on a 10-pixel broad phase, where the amount of aliasing was acceptable. Then I would loop over the 10 pixels linearly (I didn't feel like a binary chop was worth it for 10 pixels because of the conditional instructions, but I should try it anyway). Next I need to try it at half resolution. I should get satisfactory speed then.
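To put the sample budgets mentioned in this thread side by side, here is a small Python sketch of the worst-case sample count for a ray of a given pixel length. The function name and the simple cost model (one sample per coarse block, then either a full linear walk or a bisection of one block) are assumptions for illustration only.

```python
import math

def worst_case_samples(ray_pixels, block_pixels, binary_refine):
    """Worst-case depth samples: coarse march plus refinement of one block."""
    coarse = math.ceil(ray_pixels / block_pixels)
    if binary_refine:
        fine = math.ceil(math.log2(block_pixels))   # bisection down to ~1 pixel
    else:
        fine = block_pixels                          # walk every pixel in the block
    return coarse + fine

ray = 1000  # the 1000 pixel example ray from the first post
linear_10px = worst_case_samples(ray, 10, binary_refine=False)
binary_10px = worst_case_samples(ray, 10, binary_refine=True)
binary_50px = worst_case_samples(ray, 50, binary_refine=True)
print(linear_10px, binary_10px, binary_50px)
```

Under this model a 10-pixel broad phase is dominated by its 100 coarse samples, so replacing the 10-pixel linear walk with a binary chop saves only a handful of samples, while the 20 + 6 scheme from the first post stays at 26.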

-----Quat

Yours3!f

1,534

August 02, 2013 11:24 AM

Okay, so you were right in saying that they may look really similar. I think that these artifacts could be filtered out by 'checking' some things:
-check if the resulting screen-space (ss) vector is on the screen (fade out by applying some math)
-check if the resulting ray is within the search distance (again fade)
-check if the original view-space (vs) normal and view direction are good (too big angles are bad, again fade)
-check if the resulting vs normal and reflection vector are good (again big angles are bad, fade)
-also there's one more thing: you should check if the raycast even succeeded. I consider it successful if the binary search is launched.
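The checklist above could be combined into a single coverage factor roughly like this. It is a minimal Python sketch under loud assumptions: the fade widths, the thresholds and the way each angle is turned into a cosine are hypothetical, since the post only says "fade out by applying some math".

```python
def clamp01(x):
    return max(0.0, min(1.0, x))

def ramp(x, lo, hi):
    """0 at or below lo, 1 at or above hi, linear in between."""
    return clamp01((x - lo) / (hi - lo))

def screen_edge_fade(uv, border=0.1):
    """uv in [0,1]^2; fades to 0 within `border` of any screen edge (assumed width)."""
    d = min(uv[0], 1.0 - uv[0], uv[1], 1.0 - uv[1])
    return clamp01(d / border)

def distance_fade(travelled, max_distance):
    """1 at the ray origin, 0 at the maximum search distance."""
    return clamp01(1.0 - travelled / max_distance)

def coverage(hit_found, hit_uv, travelled, max_distance,
             cos_view_angle, cos_hit_angle):
    """Multiply all fades; the cosines are assumed to shrink as angles grow."""
    if not hit_found:              # raycast failed: no SSR contribution at all
        return 0.0
    return (screen_edge_fade(hit_uv)
            * distance_fade(travelled, max_distance)
            * ramp(cos_view_angle, 0.0, 0.3)    # hypothetical thresholds
            * ramp(cos_hit_angle, 0.0, 0.3))

center = coverage(True, (0.5, 0.5), 10.0, 50.0, 1.0, 1.0)
at_edge = coverage(True, (1.0, 0.5), 10.0, 50.0, 1.0, 1.0)
missed = coverage(False, (0.5, 0.5), 10.0, 50.0, 1.0, 1.0)
print(center, at_edge, missed)
```

Multiplying the factors means any single failed check drives the coverage to zero. Note that this does not address the punctured pattern from undersampling, where fully covered pixels sit right next to missed ones.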

RPTD

359

Author

August 02, 2013 08:27 PM

I have now done some experimenting by combining the broad-phase stepping from my screen-space version with the narrow-phase stepping from the view-space version. The result is better in the narrow-phase but still not as clean as in the screen-space version. I think though that this problem is due to me currently using depth reconstruction, as I haven't yet switched back to having a full RGBF16 position texture in the gbuffer. Far away, the differences in the depth value are so small that stepping fails to be accurate in the narrow-phase, while in the view-space version the z-difference is precise enough. I'm going to change this once I have the position texture back in the gbuffer.

So right now I would say my screen-space version still wins in terms of overall quality once I have fixed the narrow-phase with using z instead of depth.

Yours3!f

1,534

August 03, 2013 05:40 AM

I used view-space position reconstruction too. It is supposed to be the same quality as a full-blown position buffer. This is why it is called reconstruction.

FYI: I've just tried Call of Juarez: Gunslinger, and they use SSR, and it is much worse than your or my version...

RPTD

359

Author

August 03, 2013 12:02 PM

The problem is the lack of precision. Depth is calculated using a perspective division, and most pixels on screen are not close to the camera, so their depth value is somewhere above 0.9, quickly approaching 1. The range of z-values mapping to the same depth value gets large quickly. With your reconstruction you obtain a sort of middle z-value somewhere in that range. Comparing this with the test ray doesn't do precision any good. So for most of the pixels on screen the depth difference is small although the z-difference is huge. Combine this with stepping (reflectionDir / stepCount), pair it up with 32-bit floats in shaders and their 6-7 digits of precision, and you end up with a precision problem.
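The effect described above can be put into numbers with a short Python sketch. The near/far planes (0.1 and 1000 units), the 0-to-1 depth convention and the depth step of 2^-24 (roughly one step of a 24-bit depth buffer, or of a 32-bit float just below 1) are all assumptions chosen for illustration, not values from this thread.

```python
NEAR, FAR = 0.1, 1000.0   # hypothetical clip planes

def depth_from_z(z):
    """Perspective depth in [0, 1]: 0 at the near plane, 1 at the far plane."""
    return FAR / (FAR - NEAR) * (1.0 - NEAR / z)

def z_span_per_depth_step(z, depth_step=2.0 ** -24):
    """Approximate range of view-space z covered by one depth step at distance z.
    Uses dz/dd = z^2 * (FAR - NEAR) / (FAR * NEAR), the inverse of the slope
    of depth_from_z."""
    return z * z * (FAR - NEAR) / (FAR * NEAR) * depth_step

for z in (1.0, 10.0, 100.0, 500.0):
    print(f"z = {z:6.1f}  depth = {depth_from_z(z):.7f}  "
          f"z span per depth step = {z_span_per_depth_step(z):.6f}")
```

With these assumed planes the depth value already exceeds 0.9 at one unit from the camera, and the z-range covered by a single depth step grows with the square of the distance, which is why a depth comparison that is fine up close becomes too coarse for narrow-phase stepping far away.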
