It does not work that way. You have no guarantee on the order of execution (much less on the order of completion) inside a single draw-call.
It's really simple. Multiple execution units --> race conditions. You can see those units in the GPU block diagrams that show up in every article whenever a new GPU is released.
In my opinion, the only decent way to do order-independent transparency is with D3D11 per-pixel linked lists.
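For anyone unfamiliar with the idea, here is a rough CPU-side sketch of what the resolve pass of a per-pixel fragment list does. It is plain Python rather than HLSL, and the names `Fragment` and `resolve_pixel` are made up for illustration, not part of any D3D API. It assumes non-premultiplied alpha and the usual "over" blend. The point is only that fragments may arrive in any order, because they are sorted per pixel before blending.

```python
from typing import Iterable, Tuple

Color = Tuple[float, float, float]
# Hypothetical fragment record: (view depth, rgb color, alpha).
Fragment = Tuple[float, Color, float]


def resolve_pixel(fragments: Iterable[Fragment], background: Color) -> Color:
    """Blend one pixel's fragments back to front, whatever order they arrived in."""
    color = background
    # Sort farthest first so that nearer fragments are blended on top.
    for _depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        # Standard "over" blend with non-premultiplied alpha.
        color = tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(rgb, color))
    return color
```

On real hardware the list is built in a pixel shader with atomic operations and sorted in a later full-screen pass, so take this only as a model of the logic.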
Primitives are rasterized and written to a render target in the same order in which you submit them. This is part of the DX spec, and is guaranteed by the hardware. In fact, the hardware has to jump through a lot of hoops to maintain this guarantee while still making use of multiple hardware units. This means that if you were able to perfectly sort all primitives in a mesh by depth, you would get perfect transparency. The same goes for multiple instances in a single draw call. The only case that's totally impossible to handle without OIT is the case of intersecting primitives.
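To make the sorting idea concrete, here is a minimal sketch of reordering an index buffer back to front on the CPU before submitting it in a single draw call. It is plain Python, and `view_depth` and `sort_back_to_front` are hypothetical helper names. It sorts by triangle centroid depth along the view direction, which is only an approximation of a "perfect" sort: large or overlapping triangles can still end up in the wrong order, and intersecting ones cannot be fixed this way at all.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[int, int, int]  # three indices into the vertex list


def view_depth(point: Vec3, eye: Vec3, forward: Vec3) -> float:
    """Distance of a point from the eye, measured along the view direction."""
    return sum((point[i] - eye[i]) * forward[i] for i in range(3))


def sort_back_to_front(
    vertices: Sequence[Vec3],
    triangles: Sequence[Triangle],
    eye: Vec3,
    forward: Vec3,
) -> List[int]:
    """Return a flat index list with the farthest triangles first."""

    def centroid_depth(tri: Triangle) -> float:
        # Average depth of the three corners (approximation, see above).
        return sum(view_depth(vertices[i], eye, forward) for i in tri) / 3.0

    ordered = sorted(triangles, key=centroid_depth, reverse=True)
    # Flatten back into an index buffer; submission order is blend order.
    return [index for tri in ordered for index in tri]
```

Because the primitives are written in submission order, the sorted index list is all that has to change; the vertex data and the draw call itself stay the same.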