Hi! Thanks a lot! This, along with some NVIDIA slides, led me to what I believe you were attempting as well.
Modifying your solution, this is the most elegant version I could come up with for what I believe most people will stumble upon this post for: an atomic add on a float.
RWByteAddressBuffer Accum : register( u0 );
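// Atomically adds `value` to the float stored at byte offset `addr` of Accum.
void interlockedAddFloat(uint addr, float value)
{
    uint comp, orig = Accum.Load(addr);
    [allow_uav_condition] do
    {
        // Write the new sum only if the buffer still holds the bits we last saw (comp).
        // orig receives whatever was actually stored there before the call.
        comp = orig;
        Accum.InterlockedCompareExchange(addr, comp, asuint(asfloat(orig) + value), orig);
    }
    while (orig != comp); // another thread got in first: retry with the fresh value
}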
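
For anyone wondering how to call it, here is a minimal usage sketch. It assumes it is compiled in the same shader file as the snippet above; the input buffer `Values`, the kernel name `SumCS`, and the group size of 64 are all made up for illustration and are not from the original solution. It also assumes the float at byte offset 0 of Accum has been cleared to 0 before the dispatch.

```hlsl
// Hypothetical input buffer (name and register are assumptions).
StructuredBuffer<float> Values : register( t0 );

// Hypothetical kernel: every thread adds one input value to a single float total.
[numthreads(64, 1, 1)]
void SumCS(uint3 id : SV_DispatchThreadID)
{
    // addr is a byte offset into Accum and must be a multiple of 4.
    // Offset 0 is used here as the location of the running total.
    interlockedAddFloat(0, Values[id.x]);
}
```

Keep in mind that the order in which threads win the compare-exchange is not fixed, and float addition is not associative, so the total can differ in the last bits from run to run.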