(dwA << 24) | (dwR << 16) | (dwG << 8) | (dwB << 0)
On a little-endian system, this means that B ends up in the lowest byte (byte 0) and A ends up in the highest byte (byte 3).
What's a little weird is that it's common to show byte order going from right to left. When you show things this way (like in that diagram you made), the byte order direction is consistent with the shift operators, and also with how you write numbers (the most significant digits go on the left). In that notation, R8G8B8A8 would actually be written as "ABGR" byte order. However, this is backwards from the actual order in terms of memory addresses, and DXGI lists components from lowest to highest memory address. So for R8G8B8A8, you want (A << 24) | (B << 16) | (G << 8) | (R << 0), which puts R at byte 0 and A at byte 3 on a little-endian system.
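If you want to convince yourself of the memory layout, here is a minimal sketch that packs the four channels and then copies the result out byte by byte. The names PackR8G8B8A8 and BytesInMemoryOrder are made up for this example and are not part of any D3D or DXGI API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack four 8-bit channels so that R sits in the
// least significant byte and A in the most significant byte.
inline std::uint32_t PackR8G8B8A8(std::uint8_t r, std::uint8_t g,
                                  std::uint8_t b, std::uint8_t a)
{
    return (std::uint32_t(a) << 24) | (std::uint32_t(b) << 16) |
           (std::uint32_t(g) << 8)  | (std::uint32_t(r) << 0);
}

// Hypothetical helper: copy the packed value into out[0..3],
// lowest memory address first.
inline void BytesInMemoryOrder(std::uint32_t packed, std::uint8_t* out)
{
    assert(out != nullptr);
    std::memcpy(out, &packed, sizeof(packed));
}
```

On a little-endian machine, packing r = 0x11, g = 0x22, b = 0x33, a = 0x44 gives the value 0x44332211, and the bytes come out of memory as 0x11, 0x22, 0x33, 0x44, which is the R, G, B, A order that the format name describes. On a big-endian machine the same value would be stored in the reverse order.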
Also, does the + 0.5f in (r * 255.0f + 0.5f) and the like cause rounding upwards, or is it doing something else?
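The + 0.5f is there to implement round-to-nearest rather than rounding upwards: converting a float to an integer drops the fractional part, so adding 0.5 first gives you the nearest integer instead. That is what the hardware uses when converting from FLOAT to UNORM formats.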
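To see the difference the + 0.5f makes, here is a rough sketch of an 8-bit conversion with and without it. FloatToUnorm8 and FloatToUnorm8Truncate are hypothetical names, and the clamp to [0, 1] and the assert on NaN are assumptions of this example rather than a statement of exactly what the hardware does in those cases.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a float to an 8-bit UNORM value using
// round-to-nearest (scale, add 0.5, then let the cast drop the fraction).
inline std::uint8_t FloatToUnorm8(float v)
{
    assert(v == v);            // this sketch does not handle NaN
    if (v < 0.0f) v = 0.0f;    // clamp to the representable range
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(v * 255.0f + 0.5f);
}

// Same conversion without the + 0.5f: the cast simply truncates.
inline std::uint8_t FloatToUnorm8Truncate(float v)
{
    assert(v == v);
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(v * 255.0f);
}
```

For an input of 0.5f, the scaled value is 127.5, so FloatToUnorm8 returns 128 while FloatToUnorm8Truncate returns 127. Without the + 0.5f, only an input of exactly 1.0f would ever reach 255.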