Going straight to 1/4 size can lose bright spots that are only a single pixel in size, because of the 2x2 sample footprint you pointed out. For bloom this can make a huge difference: your highlights can shimmer in and out of existence as that bright pixel moves in screen space. Upsampling at each level also gives a smoother result than going straight from 1/4 back to 1/1.
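To make the shimmer concrete, here is a rough CPU-side sketch in Python rather than real shader code. It assumes a bilinear tap placed at the centre of a block simply averages the 2x2 texels around that point, and all function names (`single_tap_quarter`, `downsample_chain`, and so on) are made up for illustration.

```python
def quad_average(img, x, y):
    """Model one bilinear tap centred between four texels: the mean of the 2x2 quad at (x, y)."""
    return (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0


def downsample_2x(img):
    """Halve the resolution: every source texel contributes to exactly one output texel."""
    return [[quad_average(img, x, y) for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def downsample_chain(img):
    """1/1 -> 1/2 -> 1/4, one 2x2 tap per output texel at each step."""
    return downsample_2x(downsample_2x(img))


def single_tap_quarter(img):
    """1/1 -> 1/4 with a single tap at the centre of each 4x4 block.

    Only the middle 2x2 texels of the block are read; the other 12 are ignored.
    """
    return [[quad_average(img, x + 1, y + 1) for x in range(0, len(img[0]), 4)]
            for y in range(0, len(img), 4)]


def image_with_spot(size, sx, sy, value=1.0):
    """Black square image with one bright texel at (sx, sy)."""
    img = [[0.0] * size for _ in range(size)]
    img[sy][sx] = value
    return img


if __name__ == "__main__":
    # Slide a one-pixel highlight diagonally across the first 4x4 block.
    for pos in range(4):
        img = image_with_spot(8, pos, pos)
        print(pos, single_tap_quarter(img)[0][0], downsample_chain(img)[0][0])
```

In this toy model the single-tap result flips between 0 and 0.25 as the spot moves across the block, while the chained result stays at 0.0625 wherever the spot sits inside it. That flipping is the shimmer.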
That said, depending on your hardware (I'm looking at both main current-gen consoles here, but it probably applies to PC GPUs too), it's faster to go straight to 1/4 size with a shader that samples four times instead, so that the four 2x2 taps together cover the whole 4x4 block. This is because the physical write to texture memory for the intermediate step is itself expensive. Doing this also avoids the precision loss you would otherwise incur by downsampling through an intermediate RGBA8 texture, as the shader does the filtering in 32-bit float.
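As a rough illustration of the precision point, the Python sketch below compares the two routes on a single 4x4 block. It is a toy model, not shader code: channel values are kept as 8-bit codes (0 to 255), a write to an 8-bit target is assumed to round half up, and each bilinear tap is assumed to be a plain 2x2 average. The function names are hypothetical.

```python
import math


def store_8bit(v):
    """Model a write to an 8-bit channel: round half up to a whole code (assumed rounding mode)."""
    return float(math.floor(v + 0.5))


def quad_average(img, x, y):
    """Model one bilinear tap centred between four texels: the mean of the 2x2 quad at (x, y)."""
    return (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0


def downsample_2x_stored(img):
    """One 2x pass whose result is written out to an 8-bit target."""
    return [[store_8bit(quad_average(img, x, y)) for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def two_pass_quarter(img):
    """1/1 -> 1/2 -> 1/4, rounding to 8 bits after each pass."""
    return downsample_2x_stored(downsample_2x_stored(img))


def four_tap_quarter(img):
    """1/1 -> 1/4 in one pass: four taps averaged in float, rounded once on write."""
    out = []
    for y in range(0, len(img), 4):
        row = []
        for x in range(0, len(img[0]), 4):
            taps = [quad_average(img, x + dx, y + dy) for dy in (0, 2) for dx in (0, 2)]
            row.append(store_8bit(sum(taps) / 4.0))
        out.append(row)
    return out


# A dim 4x4 block, values in 8-bit codes: one faint row, the rest black.
block = [
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
exact = sum(map(sum, block)) / 16.0  # true mean of the block

if __name__ == "__main__":
    print("exact mean:", exact)
    print("two passes:", two_pass_quarter(block)[0][0])
    print("four taps: ", four_tap_quarter(block)[0][0])
```

Under these assumptions the intermediate write rounds twice, so the two-pass route ends up further from the true mean of the block than the single four-tap pass, which only rounds on its final write. Real hardware may round differently, so treat the numbers as indicative only.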