
400% Raytracing Speed-Up by Re-Projection (Image Warping)


12 replies to this topic

#1 spacerat   Members   -  Reputation: 753


Posted 03 May 2014 - 09:42 PM

Intro: I have been working on this technology for a while, and since real-time raytracing is getting faster (e.g. with the Brigade raytracer), I believe this can be an important contribution to the area, as it might bring raytracing one step closer to being usable for video games.

 

Algorithm: The technology exploits temporal coherence between two consecutive rendered images to speed up ray-casting. The idea is to store the x-, y- and z-coordinates of each pixel in the scene in a coordinate buffer and re-project them into the following frame using the differential view matrix. The resulting image looks like the one below.
 
The method then gathers empty 2x2 pixel blocks on the screen and stores them in an index buffer for raycasting the holes; raycasting single pixels is too inefficient. Small holes that remain after this hole-filling pass are closed by a simple image filter. To improve the overall quality, the method also updates the screen in tiles (8x4) by raycasting an entire tile and overwriting the cache, so the entire cache is refreshed after 32 frames. Further, a triple-buffer system is used: two image caches that are written to alternately, plus one buffer the frame is composed in. This is done because a pixel that is overwritten in one frame often becomes visible again already in the next frame. Therefore, before the hole filling starts, both cache buffers are projected into the main image buffer.
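For illustration, here is a minimal CPU-side sketch of the re-projection scatter step, assuming a coordinate buffer with one world-space position per pixel and the new frame's view-projection matrix. The actual implementation runs as an OpenCL kernel; all types and names below are only illustrative.

#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[16]; };               // row-major 4x4 matrix

static Vec4 transformPoint(const Mat4& M, const Vec3& p) {
    return { M.m[0]*p.x + M.m[1]*p.y + M.m[2]*p.z  + M.m[3],
             M.m[4]*p.x + M.m[5]*p.y + M.m[6]*p.z  + M.m[7],
             M.m[8]*p.x + M.m[9]*p.y + M.m[10]*p.z + M.m[11],
             M.m[12]*p.x+ M.m[13]*p.y+ M.m[14]*p.z + M.m[15] };
}

// Scatter every cached pixel into the new frame; pixels that receive no
// sample keep depth = +infinity and are later collected as holes.
void reproject(const std::vector<Vec3>& coordBuf,      // world position per cached pixel
               const std::vector<uint32_t>& colorBuf,  // cached color per pixel
               const Mat4& newViewProj,                 // view-projection of the new frame
               int width, int height,
               std::vector<uint32_t>& outColor,
               std::vector<float>& outDepth)            // caller initializes to +infinity
{
    for (int i = 0; i < width * height; ++i) {
        Vec4 clip = transformPoint(newViewProj, coordBuf[i]);
        if (clip.w <= 0.0f) continue;                   // behind the camera
        float sx = (clip.x / clip.w * 0.5f + 0.5f) * width;
        float sy = (clip.y / clip.w * 0.5f + 0.5f) * height;
        int px = (int)sx, py = (int)sy;
        if (px < 0 || px >= width || py < 0 || py >= height) continue;
        int dst = py * width + px;
        float depth = clip.w;                           // proportional to view-space depth
        if (depth < outDepth[dst]) {                    // nearest sample wins
            outDepth[dst] = depth;
            outColor[dst] = colorBuf[i];
        }
    }
}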
 
Results: Most of the pixels can be re-used this way, as only a fraction of the frame needs to be raycast. The speed-up is significant, up to 5x the original speed depending on the scene. The implementation is applied to voxel octree raycasting using OpenCL, but it could equally be used for conventional triangle-based raycasting.
 
Limitations: The method also comes with limitations, of course. The speed-up obviously depends on the motion in the scene, and the method is only suitable for primary rays and for pixel properties that remain constant over multiple frames, such as static ambient lighting. Further, during fast motion, the silhouettes of geometry close to the camera tend to lose precision, and geometry in the background does not move as smoothly as if the scene were fully raytraced each frame. Future work might include suitable image filters to avoid these effects.
 
How to overcome the Limitations: 
 
Ideas to solve noisy silhouettes near the camera during fast motion:

1. Suppress unwanted pixels with a filter that analyzes the depth values in a small window around each pixel (see the sketch after this list). In experiments it removed some artifacts, but not all, and it also had quite an impact on performance.

2. (Not fully explored yet) Assign a speed value to each pixel and use it to filter out unwanted pixels.

3. Create a quad-tree-like triangle mesh in screen space from the raycasted result. The idea is to get smoother frame-to-frame coherence with less pixel noise for distant pixels and let the z-buffer handle overlapping pixels. It is sufficient to convert one tile of the raycasted result to a mesh per frame. The problem with this method is that the mesh tiles don't fit together properly, as they are raycasted at different time steps. Using raycasting to fill the resulting holes was not simple, which is why I stopped exploring this method further.
http://www.farpeek.com/papers/IIW/IIW-EG2012.pdf (fig. 5b)

** Untested Ideas **

4. Compute silhouettes based on depth discontinuities and remove pixels crossing them.
5. Somehow do a reverse trace in screen space between two frames and test for intersection.
6. Use splats to rasterize voxels close to the camera so that speckles are covered.
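For illustration, a minimal sketch of the depth-window filter from idea 1: each re-projected pixel is compared against its 3x3 neighborhood, and a pixel that floats far in front of nearly all of its neighbors is treated as a silhouette speckle and flagged as a hole for re-raycasting. The threshold and all names are illustrative assumptions, not the actual implementation.

#include <cstdint>
#include <vector>

void flagDepthOutliers(const std::vector<float>& depth, // re-projected depth buffer
                       int width, int height,
                       float relThreshold,              // e.g. 0.2f = 20% closer than a neighbor
                       std::vector<uint8_t>& isHole)    // 1 = re-raycast this pixel
{
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int   idx = y * width + x;
            float d   = depth[idx];
            int   closerCount = 0;                      // neighbors noticeably behind this pixel
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;
                    float dn = depth[(y + dy) * width + (x + dx)];
                    if (d < dn * (1.0f - relThreshold)) ++closerCount;
                }
            // If the pixel is in front of almost every neighbor, assume it is
            // a stray sample from a fast-moving silhouette and re-raycast it.
            if (closerCount >= 7) isHole[idx] = 1;
        }
    }
}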
 
You can find the full text here, including paper references.
 
[attached image: Clipboard01.png]
 

Edited by spacerat, 19 August 2014 - 06:02 AM.



#2 Aressera   Members   -  Reputation: 1490


Posted 03 May 2014 - 11:31 PM

Interesting idea. I also use coherence to accelerate diffuse sound ray tracing and get a 10x improvement by averaging the ray contributions over several frames.

 

Do you have any ideas for how to improve the visual quality?



#3 spacerat   Members   -  Reputation: 753


Posted 04 May 2014 - 12:20 AM

Sound raytracing for realistic echoing? That's also interesting.

 

For the quality: the upper image is just in 256 colors, so it doesn't look that good.

I just used it for testing. The general method can of course also be applied to 32-bit color.

 

When in motion, the re-projected version does not look as smooth as the fully raycasted version, so more research could be done there, e.g. re-projecting into a higher-resolution frame buffer that is downsampled for the final rendering. Edges also lose some accuracy during fast motion; additional research, such as better image filters, may improve that as well.



#4 Hodgman   Moderators   -  Reputation: 32016


Posted 04 May 2014 - 01:37 AM

There's a bit of research on this technique under the name "real-time reverse reprojection cache". It's even been used in Battlefield 3 (rasterized, not ray-traced though!).
[edit] I should've read your blog first and seen that you'd mentioned the above name.

 

[edit2] Here's the BF3 presentation where they use it to improve the quality of their SSAO calculations: http://dice.se/wp-content/uploads/GDC12_Stable_SSAO_In_BF3_With_STF.pdf


Edited by Hodgman, 04 May 2014 - 07:12 PM.


#5 spacerat   Members   -  Reputation: 753


Posted 04 May 2014 - 02:04 AM

Yes, I know that paper. From what I understood, they just re-use the shading. Also related is an EG paper from a while ago that does iterative image warping ( http://www.farpeek.com/papers/IIW/IIW-EG2012.pdf ); however, they seem to need a stereo image pair to compute the following image.

 

The advantage of the raycasting method is that it can reuse color and position, and further that raycasting allows selectively raycasting only the missing pixels, which is impossible with rasterization.



#6 Frenetic Pony   Members   -  Reputation: 1407


Posted 04 May 2014 - 03:13 AM

Very neat, though I've found that ideas along these lines break down completely when it comes to the important secondary rays, i.e. incoherent bounce rays for GI, ambient occlusion, and reflections. Still, thanks for this; there are use cases for primary raycasting, e.g. virtualized geometry. Hope you can find some clever way to get high-quality motion.



#7 spacerat   Members   -  Reputation: 753


Posted 04 May 2014 - 03:48 AM

Yes, this technology is in general for speeding up primary rays. For secondary rays it depends: for static light sources, you can store the shadow information along with the pixel and then re-use it in the following frame as well.

 

If it's a reflection or refraction, you could do a quick neighbor search in the previous frame to see whether anything can be reused, but most probably this technology won't suit that case well.



#8 MJP   Moderators   -  Reputation: 11832


Posted 04 May 2014 - 10:46 AM

From your brief description this sounds very much like the temporal antialiasing techniques that are commonly used with rasterization. For re-projecting camera movement you really only need depth per pixel, since that's enough to reconstruct position with high precision. However, it's better if you store per-pixel velocity so that you can handle object movement as well (although keep in mind you need to store multiple layers if you want to handle transparency).

Another major issue when doing this for antialiasing is that your reprojection will often fail for various reasons. The pixel you're looking for may have been "covered up" in the last frame, the camera may have cut to a completely different scene, or there might be something rendered that you didn't track in your position/depth/velocity buffer. Those cases require careful filtering that excludes non-relevant samples, which generally means taking a drop in quality for those pixels for at least that one frame. In your case I would imagine that you have to do the same thing, since spiking to 5x the render time for even a single frame would be very bad.
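A minimal sketch of the kind of history-rejection test described above, assuming a per-pixel depth carried over from the previous frame and a freshly reconstructed depth for the current frame. The tolerance value and names are illustrative, not taken from any particular engine.

#include <cmath>

bool historyIsValid(float reprojectedDepth,   // depth carried over from last frame
                    float currentDepth,       // depth of what is actually visible now
                    float relTolerance = 0.02f)
{
    if (reprojectedDepth <= 0.0f || currentDepth <= 0.0f)
        return false;                         // off-screen or untracked geometry
    // Reject if the relative depth difference suggests the old sample was
    // covered up (disocclusion) or belongs to a different surface.
    return std::fabs(reprojectedDepth - currentDepth)
           <= relTolerance * currentDepth;
}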



#9 spacerat   Members   -  Reputation: 753


Posted 04 May 2014 - 03:01 PM

Yes, it's related to temporal antialiasing techniques; however, there the geometry is not reused and needs to be rendered again. For the coordinates I also tried different approaches, but it turned out that using relative coordinates (such as the z-buffer) tends to accumulate error over multiple frames, and the image then becomes inconsistent.

 

Transparency is not yet handled and might indeed need special treatment; it's just for opaque surfaces.

 

To still cache pixels efficiently even when lots of them get covered up / overwritten, I am using a multi-buffer cache. That means the result of a frame is stored alternately in one of two caches, while both caches are projected into the screen at the beginning, before the holes are filled by raycasting. That keeps pixels efficiently, as pixels that are covered up in one frame might already be visible again in the next. In general the caching works well. It is also a relaxed caching scheme, where not every empty pixel gets filled by a ray: the method uses an image filter to fill small holes and only fills holes that are at least 2x2 pixels large (the threshold can be defined) by raycasting.
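A rough CPU-side outline of that frame loop, with hypothetical function and buffer names standing in for the actual OpenCL kernels; it is a sketch of the scheme described above, not the real implementation.

#include <cstdint>
#include <vector>

struct FrameCache {
    std::vector<float>    coord;   // x, y, z per cached pixel
    std::vector<uint32_t> color;   // color per cached pixel
};

// Stand-ins for the OpenCL kernels (bodies omitted in this sketch).
void projectCacheToScreen(const FrameCache&)     { /* scatter re-projection */ }
std::vector<int> gatherEmpty2x2Blocks()          { return {}; /* index buffer of holes */ }
void raycastHoleBlocks(const std::vector<int>&)  { /* trace only the hole blocks */ }
void fillSmallHolesWithImageFilter()             { /* close remaining 1-pixel gaps */ }
void storeScreenIntoCache(FrameCache&)           { /* write back coords and colors */ }

void renderFrame(FrameCache caches[2], int frameIndex)
{
    // 1. Re-project BOTH caches into the frame: a pixel overwritten in one
    //    frame may already be visible again in the next, so neither cache
    //    alone has full coverage.
    projectCacheToScreen(caches[0]);
    projectCacheToScreen(caches[1]);

    // 2. Raycast only where at least a 2x2 block of pixels is still empty;
    //    smaller gaps are cheaper to close with an image filter.
    raycastHoleBlocks(gatherEmpty2x2Blocks());
    fillSmallHolesWithImageFilter();

    // 3. Store the finished frame into the cache that was NOT written last
    //    frame, so the two caches alternate.
    storeScreenIntoCache(caches[frameIndex & 1]);
}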

 

If the camera changes to a different view, then obviously the cache cannot be used and the first frame will render more slowly.


Edited by spacerat, 04 May 2014 - 11:07 PM.


#10 MJP   Moderators   -  Reputation: 11832


Posted 05 May 2014 - 05:01 PM

Nice, sounds cool!



#11 Krypt0n   Crossbones+   -  Reputation: 2686


Posted 06 May 2014 - 04:23 AM

>> Intro: I have been working on this technology for a while, and since real-time raytracing is getting faster (e.g. with the Brigade raytracer), I believe this can be an important contribution to the area, as it might bring raytracing one step closer to being usable for video games.

I had a feeling from watching their demos that they're doing this already (the video quality is very bad and it's hard to see, but the silhouette ghosting artifacts made me think that).
 

>> Algorithm: The technology exploits temporal coherence between two consecutive rendered images to speed up ray-casting. The idea is to store the x-, y- and z-coordinates of each pixel in the scene in a coordinate buffer and re-project them into the following frame using the differential view matrix. The resulting image looks like the one below.

You don't need to store x, y, z; it's enough to have the depth. From that and the screen pixel coordinates you can re-project the position. That's done e.g. in Crysis 2 (called temporal AA), and in Killzone 4 it recently got famous ( http://www.killzone.com/en_GB/blog/news/2014-03-06_regarding-killzone-shadow-fall-and-1080p.html ); old-school CPU-tracer demos have done it as well. This easily becomes memory bound, which is why reducing the fragment size to a minimum is key. In practice it should be barely noticeable compared to the tracing time.
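For illustration, a minimal sketch of that depth-only reconstruction: from the pixel coordinate, its stored depth, and the inverse view-projection matrix of the frame the depth was written in, the world position can be recovered and then re-projected with the new frame's matrix. The depth convention (NDC z in [-1, 1]) and all names are assumptions.

struct Vec3f { float x, y, z; };
struct Vec4f { float x, y, z, w; };
struct Mat4f { float m[16]; };  // row-major 4x4 matrix

static Vec4f mul(const Mat4f& M, const Vec4f& v) {
    return { M.m[0]*v.x + M.m[1]*v.y + M.m[2]*v.z  + M.m[3]*v.w,
             M.m[4]*v.x + M.m[5]*v.y + M.m[6]*v.z  + M.m[7]*v.w,
             M.m[8]*v.x + M.m[9]*v.y + M.m[10]*v.z + M.m[11]*v.w,
             M.m[12]*v.x+ M.m[13]*v.y+ M.m[14]*v.z + M.m[15]*v.w };
}

// Reconstruct the world-space position of pixel (px, py) from its stored
// depth and the inverse view-projection matrix of the previous frame.
Vec3f reconstructWorldPos(int px, int py, float ndcDepth,
                          const Mat4f& invViewProj, int width, int height)
{
    float ndcX = (px + 0.5f) / width  * 2.0f - 1.0f;   // back to NDC...
    float ndcY = (py + 0.5f) / height * 2.0f - 1.0f;
    // ...then through the inverse view-projection and a perspective divide.
    Vec4f world = mul(invViewProj, { ndcX, ndcY, ndcDepth, 1.0f });
    return { world.x / world.w, world.y / world.w, world.z / world.w };
}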

 

>> The method then gathers empty 2x2 pixel blocks on the screen and stores them in an index buffer for raycasting the holes; raycasting single pixels is too inefficient. Small holes that remain after this hole-filling pass are closed by a simple image filter.

In path tracing, the trick is not to re-project only the final pixels (those are anti-aliased etc. and would give you a wrong result anyway); you have to save the originally traced samples (with 10 spp that's 10x the size!) and re-project those, then you get pretty much perfect coverage.
Updates are then done in an interleaved pattern, e.g. replacing 1 out of 10 samples of the re-projection source buffer per frame.
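A tiny sketch of such an interleaved refresh: with N samples per pixel kept in the re-projection source buffer, one sample slot is re-traced per frame, so the whole buffer is refreshed every N frames. The per-pixel stagger is an assumption, used only to avoid refreshing the same slot of every pixel at once.

// Decide whether a given sample slot of a pixel should be re-traced this frame.
bool shouldRetraceSample(int frameIndex, int sampleSlot,
                         int samplesPerPixel,   // e.g. 10 for 10 spp
                         int pixelIndex)
{
    // Round-robin over the sample slots, staggered per pixel.
    return sampleSlot == (frameIndex + pixelIndex) % samplesPerPixel;
}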
 

>> Results: Most of the pixels can be re-used this way, as only a fraction of the frame needs to be raycast. The speed-up is significant, up to 5x the original speed depending on the scene. The implementation is applied to voxel octree raycasting using OpenCL, but it could equally be used for conventional triangle-based raycasting.

5x is also what I've seen for primary rays. If you have enough depth during path tracing, it can get closer to a linear speedup (depending on the spp and the update rate).
 

>> Limitations: The method also comes with limitations, of course. The speed-up obviously depends on the motion in the scene, and the method is only suitable for primary rays and for pixel properties that remain constant over multiple frames, such as static ambient lighting. Further, during fast motion, the silhouettes of geometry close to the camera tend to lose precision, and geometry in the background does not move as smoothly as if the scene were fully raytraced each frame. Future work might include suitable image filters to avoid these effects.

This also works for secondary rays, but it gets a bit more complex: you have to save not only the position but also the 2nd bounce, and recalculate the shading using the BRDF. I think V-Ray is doing that, calling it a 'light cache'.

The trick with motion is that you won't notice the artifacts that much, so you can keep the update rate stable; areas of fast motion will obviously look worse, but you won't notice it that much. The only real problem you can get is similar to stochastic rendering, where a lot of samples fall into the same pixel and you have to use some smart reconstruction filter to figure out which samples really contribute to that pixel and which ones are just 'shining' through and should be rejected. That might not be very noticeable in your video, but if you trace e.g. trees with detailed leaves, you'll have to re-project sky samples and close leaf samples, and there is no easy way to decide which sky pixels to reject and which ones to keep. The best I've seen so far was a guy who used primitive IDs and did some minimal triangulation of samples with the same ID, but the results were far from good.
In Crysis 2, during motion, you'll notice that tree tops look like they get thicker; I guess they simply use a nearest filter to combine sub-samples.


The silhouette issue arises when you work on the final buffer; that's what you can see in the rasterizer versions like Crysis 2 and Killzone 4. Re-projecting the spp buffer (using the proper sub-pixel positions) ends up with no ghosting on silhouettes (aside from the previously mentioned reconstruction issue).

#12 Frenetic Pony   Members   -  Reputation: 1407


Posted 06 May 2014 - 05:07 PM

So much stuff, thanks Krypton!

 

Anything you could link to on caching secondary rays?



#13 spacerat   Members   -  Reputation: 753


Posted 06 May 2014 - 10:40 PM

>>I had a feeling from watching their demos that they're doing this already (the video quality is very bad and it's hard to see, but the silhouette ghosting artifacts made me think that).

 

Yes, a month ago they stated that. However, they don't say whether it's used for primary or secondary rays, or what the speedup was. I believe it's simpler with secondary rays, as you do a reverse projection and can therefore avoid empty holes.

 

>>You don't need to store x, y, z; it's enough to have the depth.

 

I tried using the actual depth from the depth buffer, but that failed due to accumulated errors from frame to frame, as most pixels are re-used for up to 30 frames.

 

>> Crysis 2 (called temporal AA), and in Killzone 4 it recently got famous

 

TXAA (Crysis 2) just reuses the shading, as I understand it, so it is not used for reconstructing the geometry.

The Killzone method sounds more interesting, as they predict pixels using the motion and so can reduce the render resolution.

I wonder if they need to store the motion vectors. Perhaps using the previous view matrices plus depth is sufficient.

It sounds related to the approach used in MPEG compression.

 

>>In path tracing, the trick is not to re-project only the final pixels (those are anti-aliased etc. and would give you a wrong result anyway); you have to save the originally traced samples (with 10 spp that's 10x the size!) and re-project those, then you get pretty much perfect coverage.
>>Updates are then done in an interleaved pattern, e.g. replacing 1 out of 10 samples of the re-projection source buffer per frame.

 
>>This also works for secondary rays, but it gets a bit more complex: you have to save not only the position but also the 2nd bounce, and recalculate the shading using the BRDF. I think V-Ray is doing that, calling it a 'light cache'.
 

Ten samples per pixel sounds pretty memory-consuming, but it would be interesting to hear more details about that method.

 

>>The silhouette issue arises when you work on the final buffer; that's what you can see in the rasterizer versions like Crysis 2 and Killzone 4. Re-projecting the spp buffer (using the proper sub-pixel positions) ends up with no ghosting on silhouettes (aside from the previously mentioned reconstruction issue).

 

Apparently gamers also noticed the quality impact of this method, even though it's only applied to every second pixel. In my case the pixels are re-used far longer, which makes it even more difficult to keep the image consistent. I have also tried increasing the tile size so that every pixel on the screen is raycasted every 4th frame; that reduced the silhouette issue significantly, but obviously also lowered performance. It would be nice to somehow track the silhouettes over several frames so they don't lose accuracy.


Edited by spacerat, 06 May 2014 - 10:46 PM.




