DX12 Subpixel precision and integer coordinates

lipsryme

When implementing subpixel precision in a software rasterizer I've found the following code:

It's just too bad the author doesn't explain any of it. Can someone explain to me how shifting these integer variables gives us 4 bits of subpixel precision, i.e. 16 sub-steps within each pixel? Not to mention it isn't working in my code :/

// 28.4 fixed-point coordinates (iround() rounds a float to the nearest int;
// needs <algorithm> for std::min/std::max below)
const int Y1 = iround(16.0f * v1.y);
const int Y2 = iround(16.0f * v2.y);
const int Y3 = iround(16.0f * v3.y);

const int X1 = iround(16.0f * v1.x);
const int X2 = iround(16.0f * v2.x);
const int X3 = iround(16.0f * v3.x);

// Deltas (used below but missing from the snippet as posted)
const int DX12 = X1 - X2;
const int DX23 = X2 - X3;
const int DX31 = X3 - X1;

const int DY12 = Y1 - Y2;
const int DY23 = Y2 - Y3;
const int DY31 = Y3 - Y1;

// Fixed-point deltas: how much each edge function changes per whole pixel
// (one pixel = 16 fixed-point units, hence << 4)
const int FDX12 = DX12 << 4;
const int FDX23 = DX23 << 4;
const int FDX31 = DX31 << 4;

const int FDY12 = DY12 << 4;
const int FDY23 = DY23 << 4;
const int FDY31 = DY31 << 4;

// Half-edge constants (also used below but missing from the snippet as posted)
int C1 = DY12 * X1 - DX12 * Y1;
int C2 = DY23 * X2 - DX23 * Y2;
int C3 = DY31 * X3 - DX31 * Y3;

// Top-left fill convention bias, so two triangles sharing an edge draw it
// exactly once (the full listing this snippet comes from applies this;
// without it you tend to get cracks/overdraw along shared edges)
if(DY12 < 0 || (DY12 == 0 && DX12 > 0)) C1++;
if(DY23 < 0 || (DY23 == 0 && DX23 > 0)) C2++;
if(DY31 < 0 || (DY31 == 0 && DX31 > 0)) C3++;

// Bounding rectangle in whole pixels; "+ 0xF then >> 4" rounds up (fixed-point ceil)
int minx = (std::min({X1, X2, X3}) + 0xF) >> 4;
int maxx = (std::max({X1, X2, X3}) + 0xF) >> 4;
int miny = (std::min({Y1, Y2, Y3}) + 0xF) >> 4;
int maxy = (std::max({Y1, Y2, Y3}) + 0xF) >> 4;

// Edge functions evaluated at the top-left corner of the bounding rectangle
// (minx/miny are whole pixels, so << 4 converts them back into 28.4)
int CY1 = C1 + DX12 * (miny << 4) - DY12 * (minx << 4);
int CY2 = C2 + DX23 * (miny << 4) - DY23 * (minx << 4);
int CY3 = C3 + DX31 * (miny << 4) - DY31 * (minx << 4);

// pitch = pixels per row of colorBuffer (an assumption; the snippet as posted
// never advanced the buffer, so every scanline was written over row 0)
unsigned int *row = colorBuffer + miny * pitch;

for(int y = miny; y < maxy; y++)
{
    int CX1 = CY1;
    int CX2 = CY2;
    int CX3 = CY3;

    for(int x = minx; x < maxx; x++)
    {
        // Inside the triangle when all three edge functions are positive
        if(CX1 > 0 && CX2 > 0 && CX3 > 0)
        {
            row[x] = 0x00FFFFFF;
        }

        CX1 -= FDY12;
        CX2 -= FDY23;
        CX3 -= FDY31;
    }

    CY1 += FDX12;
    CY2 += FDX23;
    CY3 += FDX31;

    row += pitch; // next scanline
}
Hodgman

That code basically compacts to:

((x + 0xF) >> 4) << 4

...which is strange -- doing a shift down followed by a shift up immediately afterwards.

If I'm not mistaken, that's the same as:

((x + 0xF) & ~0xFU)

i.e. clear the lower 4 bits, while rounding upwards to the nearest multiple of 16.

Given this, and not much context, I would guess that the author is using a 28.4 fixed point format, and this is an implementation of the ceil function?
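A quick sanity check of that reading (a minimal sketch, assuming non-negative 28.4 values):

// (x + 0xF) >> 4 behaves like ceil(x / 16.0): round up to a whole pixel
#include <cassert>

int main()
{
    assert(((0x10 + 0xF) >> 4) == 1); // 1.0    -> pixel 1 (already whole)
    assert(((0x11 + 0xF) >> 4) == 2); // 1.0625 -> pixel 2 (rounded up)
    assert(((0x1F + 0xF) >> 4) == 2); // 1.9375 -> pixel 2
    assert(((0x20 + 0xF) >> 4) == 2); // 2.0    -> pixel 2
    return 0;
}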

Necrolis

Quote (Hodgman): "((x + 0xF) >> 4) << 4 ... that's the same as ((x + 0xF) & ~0xFU), i.e. clear the lower 4 bits, while rounding upwards to the nearest multiple of 16."

Shifts preserve the sign bit, if I'm not mistaken.

EDIT: hmm, on second thought, I didn't read your code correctly. But then again, shifts would be "safe" standard-wise, as opposed to the bit masking on a signed integer.

lipsryme

Sorry, I've added some important bits after your post.

So do I need to round upwards after multiplying by 16.0f? I thought his iround would be the same as a cast to int.
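For reference, here's the difference I mean (a minimal sketch; I'm assuming iround rounds to nearest, since its definition wasn't posted):

// A plain (int) cast truncates toward zero; an iround-style helper
// rounds to nearest. They disagree for fractions of .5 and above.
// ("iround" here is my assumption of what the author's helper does.)
#include <cmath>
#include <cstdio>

static int iround(float f) { return (int)std::lround(f); }

int main()
{
    const float v = 1.99f;
    printf("%d\n", (int)(16.0f * v));  // 31 (31.84 truncated)
    printf("%d\n", iround(16.0f * v)); // 32 (31.84 rounded to nearest)
    return 0;
}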

 

update1: using ceil still doesn't fix my results: http://d.pr/i/qZBx (screen). I basically copy+pasted his code and got the same result.

update2: Let's see if I get this right...

Looking at the bit field, and taking as an example an 8-bit (fewer zeros to write :) ) integer with the value 3:

3 = 0000 0011

Now we multiply this by 16 to move into fixed point:

3 * 16 = 48 = 0011 0000

To get a whole-pixel coordinate back we shift it 4 to the right:

48 >> 4 = 0000 0011 (3 again)

then we do some math/comparison with it, and after that we shift it back to the left to return to fixed point:

3 << 4 = 0011 0000 (48)

Is that correct?
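Here's the same round trip as code, to check myself (a minimal sketch):

// Round trip of the 8-bit example above: pixel 3 <-> fixed-point value 48
#include <cassert>

int main()
{
    int pixel = 3;              // 0000 0011
    int fp    = pixel << 4;     // 0011 0000 = 48, same as pixel * 16
    assert(fp == 48);
    assert((fp >> 4) == pixel); // shifting right recovers the whole pixel
    assert((pixel << 4) == fp); // shifting left returns to fixed point
    return 0;
}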

alvaro

Just a word of caution about shift operators and signed integers: The result of shifting negative values is either implementation defined or undefined, depending on the direction of the shift and the exact standard [of C or C++] being followed. The way this code is written is problematic.

 

If the code can be rewritten using only unsigned integer types, that would be a good thing to do. Otherwise it's probably better to express it using division and multiplication (which the compiler will often turn into shifts for you). I guess if everything else fails, it can be implemented in assembly language.
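If you must keep the right shift on signed values, it can be expressed with division in a well-defined way (a minimal sketch; shiftRight4 is just an illustrative name, not from the code above):

#include <cassert>

// Well-defined substitute for (v >> 4) on a signed int, flooring toward
// negative infinity the way a two's-complement arithmetic shift does.
// (Illustrative helper; ignores INT_MIN for brevity.)
static int shiftRight4(int v)
{
    // v / 16 truncates toward zero, which differs from a flooring shift
    // for negative v, so round negative inputs down explicitly.
    return (v >= 0) ? (v / 16) : -((-v + 15) / 16);
}

int main()
{
    assert(shiftRight4(48)  ==  3);
    assert(shiftRight4(-1)  == -1); // floor(-1/16)  = -1, not 0
    assert(shiftRight4(-17) == -2); // floor(-17/16) = -2, not -1
    return 0;
}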

lipsryme

My problem is rather understanding how the above code (or similar) achieves subpixel precision in the first place.

Using the exact code from his topic for rasterization (the one above) results in artifacts like these: http://d.pr/i/TerH

Without the subpixel precision the result is flawless.

