GL/VK(/DX) abstraction layer: coordinate origin differences

Hello.

I'm working on an abstraction layer for OpenGL and Vulkan, with the plan to add other APIs in the future, like DirectX 12 and possibly even Metal. I'm coming from OpenGL, but I've spent a huge amount of time reading up on how Vulkan works. Now I want to write a powerful abstraction layer so that we can write our game once and have it run on multiple APIs, letting us support more hardware and OS combinations. I want to write this myself and not use any libraries for it. The baseline target is a minimal feature set of OpenGL 3.3 plus some widely supported extensions to match certain Vulkan features, with further OpenGL extensions enabling the more advanced features that both APIs support.

My ultimate goal is to minimize or even eliminate the amount of API-specific code the user has to write, and in my investigations I found out that Vulkan uses a different NDC depth range (z from 0 to 1) and coordinate origin (upper left) than OpenGL. The NDC z range is a good change, as it allows the reverse-depth trick with a 32-bit float depth buffer and in general gives better precision, so I want to embrace it whenever possible. This is pretty easy to do using either NV_depth_buffer_float or ARB_clip_control, whichever is available; both extensions are supported by AMD and Nvidia. For certain Intel GPUs and very old AMD GPUs that support neither, a simple manual fallback for the user of the abstraction is to modify the projection matrices they use, which is easy with the math library I use, so I consider this a "solved" problem. A rough sketch of the selection logic follows below.
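
Something like this (hasExtension() is a hypothetical helper, and the include assumes a loader such as glad that exposes the extension entry points):

    #include <glad/glad.h>   /* or any loader exposing glClipControl / glDepthRangedNV */

    typedef enum { DEPTH_ZERO_TO_ONE_NATIVE, DEPTH_ZERO_TO_ONE_FALLBACK } DepthMode;

    /* Pick a way to get a [0, 1] NDC depth range, preferring ARB_clip_control. */
    DepthMode setupZeroToOneDepth(void)
    {
        if (hasExtension("GL_ARB_clip_control")) {
            /* Keep GL's lower-left origin for now, but switch NDC z to [0, 1]. */
            glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE);
            return DEPTH_ZERO_TO_ONE_NATIVE;
        }
        if (hasExtension("GL_NV_depth_buffer_float")) {
            /* Unclamped depth range: window z becomes exactly the NDC z produced
               by a D3D/Vulkan-style projection matrix that outputs z in [0, 1]. */
            glDepthRangedNV(-1.0, 1.0);
            return DEPTH_ZERO_TO_ONE_NATIVE;
        }
        /* Neither extension: fall back to the user patching their projection
           matrices to output z in [-1, 1] instead. */
        return DEPTH_ZERO_TO_ONE_FALLBACK;
    }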

The coordinate system origin difference is a much tougher nut to crack. It makes the most sense to go with Vulkan's upper-left origin, as it's the standard in the other APIs as well (DirectX and Metal). I see two possible solutions to the problem, but they either require manual interaction from the user or force me to inject mandatory operations into the shaders/the abstraction, making it slower and more limited. It seems like ARB_clip_control can be used to change the origin, but I'm not sure it covers everything (texture coordinate origin, glViewport() origin, glTex*() function origin, etc.); a sketch of what I believe it does and doesn't cover follows below. Regardless, it's not something I can rely on being supported.
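
From what I understand, this is all it would give me (hasExtension() is the same hypothetical helper as above, and the comment reflects my reading of the extension rather than something I've verified on every driver):

    #include <glad/glad.h>   /* loader exposing glClipControl */

    /* Try to make rendered geometry come out with an upper-left origin. */
    static int tryUpperLeftOrigin(void)
    {
        if (!hasExtension("GL_ARB_clip_control"))
            return 0;
        /* This only changes the clip-space-to-window-space transform: vertex
           positions are flipped so the result has an upper-left origin, and the
           on-screen winding order is reversed as a side effect. Window-space
           state like glViewport()/glScissor() rectangles, texture coordinate
           origins and glTex*()/glReadPixels() addressing are untouched. */
        glClipControl(GL_UPPER_LEFT, GL_ZERO_TO_ONE);
        return 1;
    }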

Solution 1:

Just roll with it. Let OpenGL render everything upside down with its lower-left-corner origin, then flip the image at the end to correct it (a sketch of that final flip follows the lists below). This is a very attractive solution because it adds zero overhead to a lot of functions:

+ glViewport() and glScissor() just work.

+ The matching texture coordinate origin means that render-to-texture sampling cancels out, which means that sampling render targets in GLSL using texture() and texelFetch() both work without any modifications.

+ gl_FragCoord.xy works without any modifications.

+ Possible culling differences due to winding order flips can easily be compensated for in the API.

+ No dependence on the window/framebuffer size.

The only disadvantage, and it's a major disadvantage, is that textures loaded from disk (that weren't rendered to) will be flipped due to the mismatch in origin. There is no simple solution to this:

- Flip all textures loaded from disk. That's a huge amount of CPU (or GPU, if compute shaders are supported) overhead and adds a lot of complexity for precompressed textures: in addition to flipping the rows of blocks, I'd need to go in and manually flip the texel indices inside the blocks of every texture compression format I want to support, and this would have to happen while streaming textures in from disk. We cannot afford to duplicate all our texture assets just to support OpenGL, and the CPU overhead during streaming is way too much for low-end CPUs, so this is not a feasible solution.

- Have the user mark which textures are loaded from disk and which are render targets, and manually flip the y-coordinate before sampling. This could be done with a simple macro injected into the OpenGL GLSL that the user calls on texture coordinates to flip them for texture(), but solving it for texelFetch() requires querying the texture's size with textureSize(), which I think would add a noticeable amount of overhead in the shader. In addition, in some cases the user may want to bind either a preloaded texture or a rendered texture to the same sampler in GLSL, at which point even more overhead would need to be introduced.

- Leave it entirely to the user to solve it by flipping texture coordinates in the vertex data, etc. I would like to avoid this as it requires a lot of effort for the user of the abstraction layer, even though it most likely provides the best performance.
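
The final flip itself is cheap; something along these lines, where fbo, width and height stand in for whatever render-target state the abstraction actually tracks:

    #include <glad/glad.h>

    /* Blit the last render target to the default framebuffer, mirroring it
       vertically by swapping the y coordinates of the source rectangle. */
    void presentFlipped(GLuint fbo, int width, int height)
    {
        glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);    /* default framebuffer */
        glBlitFramebuffer(0, height, width, 0,        /* source: y flipped   */
                          0, 0, width, height,        /* destination: as-is  */
                          GL_COLOR_BUFFER_BIT, GL_NEAREST);
    }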

Solution 2:

Flip everything to perfectly emulate VK/DX/Metal's origin. Pros:

+ Identical result with zero user interaction.

+ No need for manual texture coordinate flipping; the GLSL preprocessor just injects flipping into all calls to texture() and texelFetch() (including their variants).

The cons are essentially everything from Solution 1, PLUS overhead on a lot of CPU-side functions too: glViewport(), glScissor(), etc. now require the window/framebuffer size, and ALL texture fetches would need their coordinates inverted (not just fetches from disk-loaded textures). The injected GLSL would look roughly like the sketch below.
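
Here FLIP_UV and FLIP_TEXEL are names I made up for illustration; the Vulkan backend would define them as pass-throughs:

    /* Prepended to the GLSL source when compiling for the OpenGL backend. */
    static const char *glslFlipPrelude =
        "#define FLIP_UV(uv) vec2((uv).x, 1.0 - (uv).y)\n"
        /* Assumes lod 0; texelFetch() at other lods would need the matching size. */
        "#define FLIP_TEXEL(tex, p) ivec2((p).x, textureSize(tex, 0).y - 1 - (p).y)\n";

    /* Usage inside user GLSL, e.g.:
           texture(colorTex, FLIP_UV(uv));
           texelFetch(colorTex, FLIP_TEXEL(colorTex, coord), 0);
       In Solution 2 the GLSL preprocessor would wrap every texture()/texelFetch()
       call like this automatically. */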

Is there a cleaner solution to all this? =< There must be a lot of OpenGL/DirectX abstractions out there that have to deal with the same issue. How do they do it?


It doesn't seem like textureSize has that much overhead:

https://www.opengl.org/discussion_boards/showthread.php/175998-Cost-of-textureSize

I'd just flip the UVs on access in the shader, then flip the framebuffer when rendering at the end. I very much doubt it makes things as "slow and limited" as you think.

For example, Valve just flips the framebuffer for their OpenGL backend (Steam dev days talk). Unity seems to also just flip the coords on read, here:

https://docs.unity3d.com/Manual/SL-PlatformDifferences.html

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

Thanks, good to know that textureSize doesn't have much overhead. AMD's ancient shader analyzer seemed to equate it with the cost of a texture sample, which seemed a bit weird to me, but I assumed that it was simply more expensive than a uniform read.

Well, if Unity can't come up with a good solution, I guess there's no way to hide the origin difference completely. It's probably best to just expose it to the user and let them deal with it using some simple tools/macros-ish stuff.

