Er, wait. Then again, now that I look at it, you already have:
for( int y = temp->h - 1, y2 = 0; y >= 0 && y2 < sheight; --y, ++y2 )
The idea is that 'y' counts backwards over the rows while 'y2' counts forwards, so the rows get reversed as you copy them across. Apparently (given your results) you're *not* supposed to do that, as a consequence of whatever SDL is doing internally to be BMP-friendly.
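
To make the difference concrete, here's a rough sketch of the two ways of copying rows. It doesn't touch SDL at all: copy_rows is a made-up helper, and I'm assuming the source and destination have the same number of rows and the same pitch (bytes per row), which may not match your actual surfaces. With flip set to true it does what your loop does; with flip set to false it leaves the rows in their original order, which is what your results suggest you want.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not an SDL function: copies `rows` rows of `pitch`
// bytes each from src to dst. Assumes both buffers hold rows * pitch bytes.
// flip == true  : source row index counts down while the destination row
//                 index counts up, so the image is turned upside down
//                 (same effect as the quoted loop).
// flip == false : rows are copied in their original order.
void copy_rows(const unsigned char* src, unsigned char* dst,
               int rows, std::size_t pitch, bool flip)
{
    assert(src != nullptr && dst != nullptr && rows >= 0);
    for (int y2 = 0; y2 < rows; ++y2) {
        const int y = flip ? rows - 1 - y2 : y2;  // source row for this destination row
        std::memcpy(dst + static_cast<std::size_t>(y2) * pitch,
                    src + static_cast<std::size_t>(y) * pitch,
                    pitch);
    }
}
```

If dropping the flip fixes your output, the simplest change to your own loop is probably to start 'y' at 0 and count it upwards alongside 'y2'.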