Mr_Fox

Does 32bit RWTexture3D<uint> (or any 32bit RWTexture<uint> object) support atomic ops?


Hey Guys,

 

Does RWTexture3D<uint> (32-bit) support atomic ops? The MSDN page doesn't list any InterlockedXXX functions for RWTexture3D, but the shader compiler is fine with:

RWTexture3D<uint> tex_uavFuseBlockVol : register(u2);
...
...
[numthreads(THREAD_DIM, THREAD_DIM, THREAD_DIM)]
void main(uint3 u3GID : SV_GroupID, uint3 u3GTID : SV_GroupThreadID,
    uint uGIdx : SV_GroupIndex)
{
    ....
    ....
    InterlockedAdd(tex_uavFuseBlockVol[u3BlockIdx], 1, uOrig);
    ...
    ...
    if (uGIdx == 0 && uThreadGroupIdxInBlock == 0) {
        InterlockedAnd(
            tex_uavFuseBlockVol[u3BlockIdx], ~BLOCKSTATEMASK_UPDATE);
    }
}
...
...

No errors, no warnings.

 

So I didn't pay much attention to it... However, I recently found a bug that seems to point directly at the atomicity of operations on that Texture3D. So I checked the docs again and, as mentioned before, they don't say that RWTexture3D supports atomic ops. However, they do state that:

You can prefix RWTexture3D objects with the storage class globallycoherent. This storage class causes memory barriers and syncs to flush data across the entire GPU such that other groups can see writes. Without this specifier, a memory barrier or sync will flush a UAV only within the current group.
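For illustration, here is a minimal sketch of the kind of situation the quoted passage describes: one thread group reading values that other groups wrote during the same dispatch. All names (g_partial, g_done, g_result, g_numGroups) and the register slots are hypothetical, the counter buffer is assumed to be cleared to zero before the dispatch, and this is only one common pattern under those assumptions, not the only place the keyword applies.

```hlsl
// Hypothetical resources; names and slots are illustrative only.
// globallycoherent asks that writes to this UAV become visible to other
// thread groups once a device memory barrier has been issued.
globallycoherent RWTexture3D<uint> g_partial : register(u0);
RWByteAddressBuffer g_done   : register(u1); // assumed zeroed before the dispatch
RWByteAddressBuffer g_result : register(u2);

cbuffer Params : register(b0)
{
    uint g_numGroups; // number of groups dispatched along X (assumption)
};

[numthreads(1, 1, 1)]
void main(uint3 gid : SV_GroupID)
{
    // Each group publishes one value.
    g_partial[uint3(gid.x, 0, 0)] = gid.x + 1u;

    // Make the write above visible device-wide before signalling completion.
    DeviceMemoryBarrier();

    // Count finished groups; 'finished' receives the value before the add.
    uint finished;
    g_done.InterlockedAdd(0, 1u, finished);

    // The last group to arrive reads what all the other groups wrote.
    if (finished == g_numGroups - 1u)
    {
        uint sum = 0u;
        for (uint i = 0u; i < g_numGroups; ++i)
        {
            sum += g_partial[uint3(i, 0, 0)];
        }
        g_result.Store(0, sum);
    }
}
```

If no group ever reads another group's plain (non-atomic) writes within the same dispatch, the keyword is unlikely to change the result, which would be consistent with it not fixing a bug that has a different cause.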

 

I was wondering whether the keyword 'globallycoherent' would fix my bug, but it didn't, so I'm a little confused about what this keyword actually does and in what kinds of cases we should use it.

 

Also, my shader suggests that for RWTexture3D<uint> (32-bit) InterlockedAdd is working properly (maybe I'm just lucky?), but InterlockedAnd is not...

 

Any ideas, comments, or suggestions are greatly appreciated.

 

Thanks

Edited by Mr_Fox


Atomic ops should all be fine on an R32_UINT texture.
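As a compact sketch of what that looks like (hypothetical names and register slot, and assuming the resource bound to the UAV was created with DXGI_FORMAT_R32_UINT), both atomic forms can be applied directly to an indexed texel:

```hlsl
// Hypothetical names; the UAV is assumed to be backed by an R32_UINT texture.
RWTexture3D<uint> g_volume : register(u0);

// Illustrative flag bit, not taken from the original shader.
static const uint FLAG_UPDATE = 0x80000000u;

[numthreads(4, 4, 4)]
void main(uint3 dtid : SV_DispatchThreadID)
{
    // Atomically increment the texel and fetch the value it held before.
    uint previous;
    InterlockedAdd(g_volume[dtid], 1u, previous);

    // Atomically clear one flag bit without disturbing the other bits.
    InterlockedAnd(g_volume[dtid], ~FLAG_UPDATE);
}
```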

 

How have you verified that the InterlockedAnd isn't working?


"Atomic ops should all be fine on an R32_UINT texture. How have you verified that the InterlockedAnd isn't working?"

 

If that's the case, I may need to dig more into my algorithm... the problem may well be on my side:

At the end of my frame I have a CS that does a bunch of computation and, at the very end, resets one bit of that volume. (I do have other interlocked instructions in the same shader on the same volume without barriers, but that should be OK, right? This does need the ops to be atomic across different thread groups, though...) So at the beginning of the next frame that bit should be 0 for all voxels, but it isn't. Later I added a CS, with UAV barriers, after that pass that just does tex_uavFuseBlockVol[u3BlockIdx] &= ~Flag; and the bug disappeared... Besides, the MSDN page doesn't list atomic ops, so I wanted to check with you guys first before digging further into my implementation...

 

Also, how do I use the globallycoherent keyword, and when do we need it?

 

Thanks 

Edited by Mr_Fox


"How have you verified that the InterlockedAnd isn't working?"

It turned out the problem was caused by not having the debug layer turned on... Please see my new post on that.

 

Thanks 
