I'm reviving some of my graphics programming abilities by playing around with some old HW of mine and trying to do graphics programming without any external libraries. I'm stuck on implementing a fill routine that is fast enough to draw overlapping objects. I let the routine figure out which corner is on which side, so the corner coordinates aren't sorted, and they may fall outside the image, in which case they need to be clipped. The result is therefore:
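#include <utility>  // std::swap

void fill(int x0, int y0, int x1, int y1, int color) {
    // Sort the corners so that (x0, y0) is top-left and (x1, y1) is bottom-right.
    if (x1 < x0) std::swap(x0, x1);
    if (y1 < y0) std::swap(y0, y1);
    // Reject rectangles that lie entirely outside the 1024x768 image.
    if (x1 < 0 || y1 < 0 || x0 >= 1024 || y0 >= 768) return;
    // Clamp the remaining corners to the image bounds.
    x0 = x0 < 0 ? 0 : x0;
    x1 = x1 >= 1024 ? 1023 : x1;
    y0 = y0 < 0 ? 0 : y0;
    y1 = y1 >= 768 ? 767 : y1;
    // Rest of code...
}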
The problem is that the inputs arrive in essentially random order and at random positions, so the branches aren't well predicted. I assume there must be some neat tricks for optimizing this chunk of code. I've tried various tricks that compile to SAR for the clamping, but when I convert the code line by line to bit twiddling it gets significantly longer and actually runs slower. I assume this is a well-known routine with an efficient implementation, and that there must be some old tricks for speeding it up significantly?
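For reference, here is a minimal sketch of the branch-free shape I'm after, written with std::min/std::max instead of hand-rolled shifts. Compilers commonly lower these to conditional moves, but that is not guaranteed and I haven't measured it. The ClippedRect struct and the clip_rect name are hypothetical helpers for illustration only, and the 1024x768 size is taken from the routine above.

```cpp
#include <algorithm>

// Hypothetical helper type for illustration; not part of the original routine.
struct ClippedRect {
    int x0, y0, x1, y1;
    bool visible;  // false when the rectangle lies entirely off-screen
};

// Sort and clamp the corners using only min/max, assuming a 1024x768 image.
inline ClippedRect clip_rect(int x0, int y0, int x1, int y1) {
    constexpr int kWidth = 1024;
    constexpr int kHeight = 768;

    // Sorting two values is just a min and a max.
    const int left   = std::min(x0, x1);
    const int right  = std::max(x0, x1);
    const int top    = std::min(y0, y1);
    const int bottom = std::max(y0, y1);

    ClippedRect r;
    r.x0 = std::max(left, 0);
    r.y0 = std::max(top, 0);
    r.x1 = std::min(right, kWidth - 1);
    r.y1 = std::min(bottom, kHeight - 1);

    // A fully off-screen rectangle ends up inverted after clamping,
    // so a single combined test replaces the four-way rejection.
    // Bitwise & is used to avoid short-circuit evaluation.
    r.visible = (r.x0 <= r.x1) & (r.y0 <= r.y1);
    return r;
}
```

With this shape, the only remaining data-dependent branch would be one check of visible before the fill loop, instead of up to ten separate comparisons.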