 A Closer Look At Parallax Occlusion Mapping
Is this supposed to be new or something?

I implemented a very similar concept in a co project about 2 years ago, and it was faster than 1 frame every 12 seconds.

User Rating: 1015

Is that a typo, or does this implementation really only run at 1/12 fps?
I had thought that real-time parallax mapping was possible on latest-gen hardware.

User Rating: 1743

The 1/12 FPS was measured using the reference rasterizer according to the article, which is probably why it's so slow.

User Rating: 1535

Quote:
Original post by nilkn
The 1/12 FPS was measured using the reference rasterizer according to the article, which is probably why it's so slow.


Sorry, I am not familiar with DirectX-specific terminology. I take it the reference rasteriser is a software implementation?

User Rating: 1743

Quote:
Original post by swiftcoder
Sorry, I am not familiar with DirectX-specific terminology. I take it the reference rasteriser is a software implementation?

More or less.



User Rating: 1620

Yes, I should have been clearer. The reference rasterizer is a software device. The testing was done on an SM2.0 video card, so the sample had to be run in emulation mode.

To the AP, I certainly did not invent POM. The article is supposed to be an explanation of how it works and to give an indication of several factors that could affect the performance of the algorithm. If you have developed a superior algorithm then I would encourage you to write an article and share the idea, so that everyone can benefit from it.

Hopefully the article helps explain the inner workings of the algorithm and is of use to some of the readers.

Jason Zink :: MVP XNA/DirectX
"Intellectuals solve problems, geniuses prevent them." - Albert Einstein
Check out my game: Lunar Rift :: Dual-Paraboloid Mapping Article :: Parallax Occlusion Mapping Article :: Fast Silhouettes Article
Check out our free online D3D10 book: Programming Vertex, Geometry, and Pixel Shaders

User Rating: 1488

That AP was me. My old team and I implemented what I guess was pretty much the same thing in a project a while ago; I just don't remember anyone calling it 'Parallax Occlusion Mapping'. The project was closed and we moved on to other stuff, so I had forgotten all about it until now.

I'll have to go through your code, but I remember our code was pretty fast, just fast enough for real time. If I can dig the project out, I'll be happy to discuss it.

User Rating: 1015

That would be great - if you get a chance, please either post it or PM me and we can discuss the differences between the algorithms. I look forward to hearing about your method.

Jason Zink :: MVP XNA/DirectX

User Rating: 1488

You could speed up the search for the intersection texture coordinates by going from a linear to a logarithmic search: O(n) -> O(log n).

Right now you search like this, if I understood you correctly (this is for the negative z case only; the positive case should work analogously):
From z=0;
move (dx,dy) along the texture plane and retrieve newz;
if(newz < z)
{
z = newz;
repeat;
}

This takes maxsamples steps in the worst case.


Now my idea:
z=0;
zend = sample(z.xy +(dx,dy)*maxsamples)

znew = sample((z.xy+zend.xy)*0.5)
if(znew > zend)
{
//the intersection must be right of znew
zend = znew;
repeat midpoint sampling on right of znew;
}
else
{
repeat midpoint sampling on left and right of znew
[zend-znew][znew-z]
}



Since your heightmaps usually don't jump up and down every pixel, you will get O(log n) in the average case. That's just like traversing a binary tree.
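To make the sample-count comparison concrete, here is a minimal CPU-side sketch in Python rather than shader code. It uses plain bisection, which is a simplification of the proposal above (it never recurses into both halves), and the height profile, ray setup and function names are all made up for illustration. It assumes a smooth profile that the ray crosses exactly once.

```python
import math

def height(u):
    # Hypothetical smooth (low-frequency) height profile in [0, 1].
    return 0.5 + 0.3 * math.sin(2.0 * math.pi * u)

def gap(t, u_start=0.0, u_span=0.25):
    # Ray height minus surface height at parameter t along the ray.
    # The ray enters at height 1 (t = 0) and reaches height 0 (t = 1).
    ray_height = 1.0 - t
    return ray_height - height(u_start + t * u_span)

def linear_march(max_samples):
    # Fixed-size steps; stop at the first sample at or below the surface.
    for i in range(1, max_samples + 1):
        t = i / max_samples
        if gap(t) <= 0.0:
            return t, i
    return 1.0, max_samples

def bisect(iterations):
    # Halve [lo, hi] each iteration, keeping the ray above the surface
    # at lo and at or below it at hi.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi, iterations

t_lin, n_lin = linear_march(64)
t_bis, n_bis = bisect(6)  # 2**6 == 64, so both resolve t to within 1/64

print("linear:   ", t_lin, "after", n_lin, "samples")
print("bisection:", t_bis, "after", n_bis, "samples")
```

On this particular profile both searches land on the same sample position, with the bisection needing far fewer height lookups. That only holds because the ray crosses the surface once; nothing here says anything about shader-level cost.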

User Rating: 1093

I also considered changing the next-sample selection method. However, I found that a logarithmic step generates more missed heightmap intersections as the view direction comes closer in line with the plane of the POM surface. Since the steps are so large, it is easy to miss the peak of some other feature.

I do think there is a good use for it, depending on the shape of the height map. If the majority of the detail is low frequency, then it could work very well and make things faster. However, for high-frequency heightmaps (which is generally what POM would be used for), more care may be needed when taking large sample steps.

However, you could try it out - the sample .fx file supplied with the article should be pretty easy to modify to get it working the way that you intended. Try it out and let us know how it works out.

One thing to consider, though, is that your method makes every texture fetch a dependent texture read. Using a fixed sampling step allows for non-dependent reads, which is a pretty good speed-up.
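The missed-intersection concern can be illustrated with a small CPU-side Python sketch. Everything in it is hypothetical: a flat floor with one thin, tall spike near the start of the ray, a ray that descends from height 1 to 0, and plain bisection standing in for the midpoint search.

```python
def height(u):
    # Hypothetical high-frequency profile: a thin tall spike on a low floor.
    return 0.95 if 0.10 <= u <= 0.14 else 0.1

def gap(t):
    # Ray height minus surface height; the ray descends from 1 to 0 as
    # the texture coordinate u moves from 0 to 1.
    return (1.0 - t) - height(t)

def linear_march(max_samples):
    # Fixed-size steps; returns the first sample at or below the surface.
    for i in range(1, max_samples + 1):
        t = i / max_samples
        if gap(t) <= 0.0:
            return t
    return 1.0

def bisect(iterations):
    # The first midpoint (t = 0.5) is above the floor, so the half of
    # the ray that contains the spike is discarded immediately.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

t_lin = linear_march(64)
t_bis = bisect(6)

print("linear march hit at t =", t_lin)  # inside the spike
print("bisection hit at t    =", t_bis)  # on the floor, past the spike
```

The linear march stops inside the spike, while the bisection converges on the floor hit much further along the ray. A spike narrower than the fixed step could slip past the linear march as well, so this shows the failure mode rather than proving one method correct.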

Jason Zink :: MVP XNA/DirectX

User Rating: 1488

I can't test it because I'm on Linux with OpenGL only.

User Rating: 1093

Is it an SM 3.0-only technique?

User Rating: 1015

The sample implementation is, but Natalya Tatarchuk has implemented it in SM2.0 as well. You can check her work out at www.ATI.com.

Jason Zink :: MVP XNA/DirectX

User Rating: 1488

I cannot find a SM 2.0 implementation of Parallax Occlusion Mapping anywhere... :(

User Rating: 1015

Hi,

I don't really understand what I need this part for:
float2 dx, dy;
dx = ddx( IN.texcoord );
dy = ddy( IN.texcoord );

//[...]

vCurrSample = tex2Dgrad( Sampler, IN.texcoord + vCurrOffset, dx, dy );
Why can't I simply use a standard tex2D look-up?

And what would be the counterpart in GLSL?

User Rating: 1009

The reason you can't use the regular tex2D is that there is a hidden LOD calculation in that texture lookup. The LOD is based on the screen-space derivative of the texture coordinates. To find that derivative, the hardware takes a first difference, subtracting the texcoords of two side-by-side fragments. This gives it the information needed to calculate which mip level to use.

Since the texture lookup sits in the dynamically branched portion of the fragment shader, two side-by-side fragments can take completely different code paths. Because this caused a big problem for the hardware implementation, it was decided that HLSL would not allow any instruction that requires gradient-type calculations (including tex2D) within the dynamic part of the shader.

So the alternative is to calculate the gradient manually in the shader using ddx and ddy, and then use tex2Dgrad, which lets you pass in the gradient values. I am sure it was added specifically for texture sampling inside dynamic branching.

As for GLSL, unfortunately I have never used it. However, I would be extremely surprised if there wasn't an equivalent instruction. After all, if the hardware supports the instruction in DX, then it probably supports it in OGL as well.
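The hidden LOD calculation described above can be sketched on the CPU in Python. This is a simplified model, not what any particular GPU does: it assumes a square texture, isotropic filtering, and the common log2-of-footprint rule, and the function names are made up for illustration.

```python
import math

def finite_difference(uv_here, uv_neighbor):
    # Screen-space derivative approximated as the difference between
    # the texcoords of two adjacent fragments.
    return (uv_neighbor[0] - uv_here[0], uv_neighbor[1] - uv_here[1])

def mip_level(duv_dx, duv_dy, texture_size):
    # Scale the derivatives to texel units, take the longer footprint
    # axis, and use log2 of it as the mip level (clamped at 0).
    len_x = math.hypot(duv_dx[0] * texture_size, duv_dx[1] * texture_size)
    len_y = math.hypot(duv_dy[0] * texture_size, duv_dy[1] * texture_size)
    rho = max(len_x, len_y)
    return max(0.0, math.log2(rho)) if rho > 0.0 else 0.0

# Interpolated texcoords of a fragment and its right/lower neighbours.
uv = (0.5, 0.5)
uv_right = (0.5 + 1.0 / 64.0, 0.5)
uv_down = (0.5, 0.5 + 1.0 / 64.0)

# Gradients of the un-offset coordinate, as passed to tex2Dgrad.
lod_smooth = mip_level(finite_difference(uv, uv_right),
                       finite_difference(uv, uv_down), 256)

# If the right neighbour had ended up with a different parallax offset,
# differencing the offset coordinates would give a much larger footprint.
uv_right_offset = (uv_right[0] + 0.1, uv_right[1])
lod_divergent = mip_level(finite_difference(uv, uv_right_offset),
                          finite_difference(uv, uv_down), 256)

print("LOD from un-offset gradients:", lod_smooth)
print("LOD from diverging offsets:  ", lod_divergent)
```

The second value shows why differencing the offset coordinates of neighbouring fragments is unreliable once they have marched to different offsets. On the GLSL side, versions 1.30 and later have `textureGrad`, which accepts explicit derivatives in the same way; older drivers may expose a similar function through the `GL_ARB_shader_texture_lod` extension.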

Jason Zink :: MVP XNA/DirectX

User Rating: 1488

Thank you. :)

dFdx(x)/dFdy(x) are the counterparts in GLSL. But I haven't found anything analogous to tex2Dgrad() in GLSL.

User Rating: 1009


Excuse me,
May I ask a question about the performance of your algorithm vs. Natalya's ATI example (the sample in the Microsoft DirectX 9.0 SDK Update (August 2005))?

I tested both .fx files in FX Composer and in my project's engine, and Natalya's is about 30~40 fps faster than yours under the same conditions.

BTW, Natalya's example doesn't seem fast enough to use in a game, so I have been trying to find a way to improve it, but I haven't found a good one yet.

Sorry, my English is poor, so it may take a little time to understand my words.

jrchen@nmi.iii.org.tw

User Rating: 1015

http://www.gamedev.net/columns/hardcore/pom

I read that as hardcore/porn

User Rating: 1015

I've been reading the ATi material on Parallax Occlusion Mapping and I'm not really sure how they compute the soft shadows.

As I understand it, starting from the intersection found within the heightmap profile, they construct a new ray and sample in the direction of the light source. But I don't understand how this is translated into a (soft) shadowing term. If the first step goes through matter, the point is obviously occluded, but how does that become a soft-shadowing term?

thanks

User Rating: 1015

Sorry to dig this up from the dead, but I can't seem to figure out how the author is deriving Ua:

His final derivation:
float Ua = (vLastSample.a - (stepHeight+fStepSize)) /
( fStepSize + (vCurrSample.a - vLastSample.a));


Using his variable names, I end up with:
Ua = (stepHeight - vCurrSample.a) /
((vLastSample.a - vCurrSample.a) - fStepSize)


I start with this:
Ua = ((x4 - x3)(y1 - y3) - (y4 - y3)(x1 - x3)) /
((y4 - y3)(x2 - x1) - (x4 - x3)(y2 - y1))

P1 = newSample
P2 = oldSample

P3 = newHeightField
P4 = oldHeightField

Then I want to find where the lines P1-P2 and P3-P4 intersect, but we only care about the X-axis so we can figure out where the texture offset is.

(x1-x3) = 0 and
(x4-x3) = (x2-x1) = step size along the parallax vector

Ua = (step)(y1 - y3) /
((y4 - y3)(step) - (step)(y2 - y1))

Ua = (y1 - y3) /
((y4 - y3) - (y2 - y1))

y1 = current sampling height
y3 = current height field height
y4 = old height field height

y2-y1 = 1/number of height samples = fStepSize in his example

Ua = (stepHeight - vCurrSample.a) /
((vLastSample.a - vCurrSample.a) - fStepSize)


Can anyone see if I'm doing something wrong, or if there's some algebraic trickery I'm missing to get our two derivations to be the same?
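One way to compare the two expressions is to evaluate them on concrete numbers. The Python sketch below is only a numeric check under the point assignment given above (P1/P2 on the ray, P3/P4 on the height field, x1 = x3); the sample values and function names are made up for illustration.

```python
import math

def line_intersection_ua(p1, p2, p3, p4):
    # General two-line formula: parameter of the intersection along P1->P2.
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    num = (x4 - x3) * (y1 - y3) - (y4 - y3) * (x1 - x3)
    den = (y4 - y3) * (x2 - x1) - (x4 - x3) * (y2 - y1)
    return num / den

def ua_simplified(step_height, curr_sample, last_sample, step_size):
    # The expression derived in this post.
    return (step_height - curr_sample) / ((last_sample - curr_sample) - step_size)

def ua_article(step_height, curr_sample, last_sample, step_size):
    # The expression quoted from the article's shader.
    return (last_sample - (step_height + step_size)) / (
        step_size + (curr_sample - last_sample))

# Made-up values: the ray is below the surface at the current sample
# and above it at the previous one.
step_height, step_size = 0.5, 0.1
curr_sample, last_sample = 0.55, 0.45
dx = 0.02  # horizontal step along the parallax vector; it cancels out

p1 = (0.0, step_height)             # current point on the ray
p2 = (dx, step_height + step_size)  # previous point on the ray
p3 = (0.0, curr_sample)             # current height-field sample
p4 = (dx, last_sample)              # previous height-field sample

ua_general = line_intersection_ua(p1, p2, p3, p4)
ua_post = ua_simplified(step_height, curr_sample, last_sample, step_size)
ua_art = ua_article(step_height, curr_sample, last_sample, step_size)

print("general formula:   ", ua_general)
print("simplified formula:", ua_post)
print("article formula:   ", ua_art)
```

With these inputs the simplified expression agrees with the general formula, and the article's expression comes out exactly 1 lower. That matches the algebra: subtracting 1 from the simplified form and negating numerator and denominator gives the article's form. So the two differ by a constant offset of 1 (the same intersection measured from the other end of the step) rather than by an algebra slip in the derivation above. Which convention the shader actually needs depends on how Ua is then applied to the texture offset.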

User Rating: 1060

After reviewing the old code during an update to D3D10, I have found that there was indeed a bug in the interpolation between the two points. I have rewritten the effect file for D3D10, and am making it available here.

The updated code will work in D3D9 as well; it just needs to be ported back. It is significantly more efficient and produces much higher quality images without as many samples. I also included a 'gridlines' generation technique that can be used to accentuate the contours of the surface. If anyone has any issues with the new file, please let me know!

Jason Zink :: MVP XNA/DirectX

User Rating: 1488

How do you do lighting with the DX9 version?

User Rating: 1015
