I'm developing an application that makes use of DirectX 11's stereoscopic features. As such I am not relying on nVidia's automatic implementation. That means I have to replicate a way to render 2D elements in clip space at a user-defined depth.
According to their 3D Vision Automatic best-practices whitepaper, they suggest scaling the clip-space position by the chosen depth factor in the shader itself:
pClip *= depth/pClip.w
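To make sure I understand that step correctly, here is the scaling expressed as a small Python sketch of the clip-space math (the function name and tuple layout are mine, not from the whitepaper):

```python
def scale_to_depth(p_clip, depth):
    """Scale a clip-space position (x, y, z, w) so that w == depth.

    The NDC position (x/w, y/w, z/w) is unchanged by a uniform scale,
    but the stereo "footer" below then sees w == depth instead of the
    original w, which is what pins the element at the chosen depth.
    """
    x, y, z, w = p_clip
    s = depth / w
    return (x * s, y * s, z * s, w * s)
```

For example, scaling (2, 4, 0.5, 1) to depth 24 gives (48, 96, 12, 24): same point after the perspective divide, but with w forced to 24.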
This works because the shader "footer" they add to the intercepted draw calls does:
pClip.x+= eye * separation * (pClip.w - convergence)
Where eye is a sign that alternates between the left and right draw calls. This produces offset 2D renderings that, when fused, make the 2D elements appear at the chosen apparent depth. Any vertex with .w greater than convergence appears behind the screen plane, and vice versa. I have replicated this functionality (for regular 3D drawing I am using two projection matrices), but since I need these elements to be rendered with an orthographic projection, I am not entirely sure it is correct.
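My replication of that footer amounts to the following (again a Python sketch of the math, not my actual shader; eye is -1 for the left-eye draw call and +1 for the right):

```python
def stereo_offset_x(p_clip, eye, separation, convergence):
    """Apply the nVidia-style stereo shift to a clip-space position.

    eye: -1 for the left-eye draw, +1 for the right-eye draw.
    Vertices with w > convergence get a positive shift for the right
    eye (positive parallax, behind the screen); w < convergence
    shifts the other way (in front of the screen).
    """
    x, y, z, w = p_clip
    return (x + eye * separation * (w - convergence), y, z, w)
```

With separation = 0.015 and convergence = 24, a vertex at w = 30 shifts by ±0.09 in clip space, i.e. positive parallax, as expected for something beyond the convergence plane.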
I am using a value of 1.25" as the interocular distance and 24" as the viewer distance (i.e. the convergence). If I plug those values into the equation above, it produces an offset that is far too large to fuse. I have eyeballed a value of 0.015 that appears to produce correct positioning. So the equation becomes:
pClip.x += ±1 * 0.015 * (pClip.w - 24)
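To illustrate the discrepancy between the physical value and the eyeballed one, here is the per-eye clip-space offset for a hypothetical element placed at w = 48 (twice the convergence; the example depth is my choice):

```python
def offset(separation, w, convergence=24.0):
    # Per-eye clip-space x offset, before the perspective divide.
    return separation * (w - convergence)

# Using the physical interocular (1.25") as the separation:
#   30.0 in clip space -> 0.625 in NDC after dividing by w = 48,
#   i.e. over half the screen half-width per eye: impossible to fuse.
physical = offset(1.25, 48.0)

# Using the eyeballed 0.015:
#   0.36 in clip space -> 0.0075 in NDC: comfortably fusable.
eyeballed = offset(0.015, 48.0)
```

So the eyeballed separation differs from the physical interocular by roughly two orders of magnitude, which suggests the separation term in the formula is not in physical units at all.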
What is the correct math for calculating the offset in clip space so that two orthographic 2D renderings, when fused, appear at a user-chosen apparent screen depth?