There is a reason why the second method (using pointers) is faster than the first one. The processor doesn't have to calculate the memory offset of each member of the structure when using pointers, which makes things run faster.
From personal experience, generally everyone knows that pointers are fast, but few people understand why.
Late to the party, but no, that is not why pointers are faster.
The pointer version of the function is faster only because it accesses and modifies the initial object in place, whereas the non-pointer version copies the entire object twice (once when passed as a parameter by-value, and once when returned by-value).
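To make the difference concrete, here is a minimal sketch. The `Particle` struct and both function names are hypothetical, not taken from the original question; they only illustrate the copy-in/copy-out versus in-place behaviour described above. Note that an optimizing compiler may elide or move some of these copies, so the actual gap depends on the struct size and the compiler settings.

```cpp
#include <cassert>

// Hypothetical structure, large enough that copying it is not free.
struct Particle {
    double position[3];
    double velocity[3];
    double history[256];
};

// By value: the argument is copied into `p`, and the result is handed
// back to the caller by value (a second copy unless the compiler
// elides or moves it). The caller's original object is untouched.
Particle advanceByValue(Particle p, double dt) {
    for (int i = 0; i < 3; ++i) {
        p.position[i] += p.velocity[i] * dt;
    }
    return p;
}

// By pointer: only an address is passed, and the caller's object
// is modified in place. No copy of the structure is made.
void advanceInPlace(Particle* p, double dt) {
    assert(p != nullptr);
    for (int i = 0; i < 3; ++i) {
        p->position[i] += p->velocity[i] * dt;
    }
}
```

Calling `advanceByValue(a, dt)` leaves `a` unchanged and gives you a new object, while `advanceInPlace(&a, dt)` updates `a` itself.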
That's just another reason, not the only one. Two people can both be right with different facts, sir; it doesn't mean one needs to diss the other.
To add to the above solution: it isn't possible to "average" out the colors the way you are trying to, in the colorbar fashion. So you do need a separate buffer for each bar, with its own color value. Of course, getting the color to fill each bar with is easy: simply take the leftmost and rightmost bars' colors and spread the difference across the N bars, using arithmetic progression logic or whatever suits you.
I think you're a bit confused about the /clr option. It works even for an application which doesn't have managed data. You may want to read more about both nullptr and /clr. But anyhow, that isn't the problem at hand.
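The arithmetic progression idea could look roughly like the sketch below. The `Color` struct and the `barColors` function are assumptions made up for illustration, not part of the original code, and the sketch assumes 8-bit RGB channels. Each channel steps evenly from the leftmost color to the rightmost one, so the first and last bars get exactly the two end colors.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit RGB color.
struct Color {
    std::uint8_t r, g, b;
};

// One channel of bar i (0-based) out of n bars, stepping evenly
// from a (leftmost bar) to b (rightmost bar).
std::uint8_t lerpChannel(std::uint8_t a, std::uint8_t b, int i, int n) {
    if (n == 1) {
        return a;  // a single bar simply takes the left color
    }
    int diff = static_cast<int>(b) - static_cast<int>(a);
    return static_cast<std::uint8_t>(static_cast<int>(a) + diff * i / (n - 1));
}

// Returns one color per bar, left to right.
std::vector<Color> barColors(Color left, Color right, int n) {
    assert(n >= 1);
    std::vector<Color> colors;
    colors.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        colors.push_back({lerpChannel(left.r, right.r, i, n),
                          lerpChannel(left.g, right.g, i, n),
                          lerpChannel(left.b, right.b, i, n)});
    }
    return colors;
}
```

You would then fill the buffer of bar `i` with `colors[i]`, whatever drawing API you happen to be using.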
I think you're already quite close to the solution.
Unfortunately, I am unable to try out your code on my setup and check this issue firsthand.