Never sort a render queue with std::sort(). It is neither stable (the most important factor) nor fast enough, thanks to a function-pointer call on each compare. Create a templated base class that uses the < and == operators within its code, and add these as inlined overloaded operators to the render-queue item class/structure.
First you were saying std::sort was guaranteed to be too slow because it had to use a function pointer.
Only when it is a function object. Function pointers can’t be inlined, and it does work with function pointers: http://www.cplusplus.com/reference/algorithm/sort/
Now you're saying that, because std::sort has the option to use a function pointer, even though you don't have to, it's still too slow. Well, ok, I have the option of creating an insertion sort that calls new for every temp variable to go up against your function pointer std::sort. We'll see who can make the slowest straw-man sort.
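For anyone following along, here is a minimal sketch of the forms being argued about: std::sort with a function pointer, std::sort with a function object, and std::stable_sort using an inline operator<. RenderItem, sortKey, id, lessByKey, LessByKey and the helper functions are all hypothetical names made up for illustration, not anything from the OP's code. Whether a given compiler actually inlines either comparator is not guaranteed, so treat this as something to measure rather than as proof of either side.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical render-queue item; the fields are illustrative only.
struct RenderItem {
    std::uint32_t sortKey;  // assumed: whatever the queue is ordered by
    int           id;       // stands in for the draw-call payload

    // Inline operator<, so the sort needs no comparator argument at all.
    bool operator<(const RenderItem& rhs) const { return sortKey < rhs.sortKey; }
};

// Plain function: passing its name to std::sort passes a function pointer.
bool lessByKey(const RenderItem& a, const RenderItem& b) {
    return a.sortKey < b.sortKey;
}

// Function object: its type is part of the std::sort instantiation,
// which gives the compiler a direct call target to inline.
struct LessByKey {
    bool operator()(const RenderItem& a, const RenderItem& b) const {
        return a.sortKey < b.sortKey;
    }
};

std::vector<RenderItem> sortWithPointer(std::vector<RenderItem> q) {
    std::sort(q.begin(), q.end(), lessByKey);
    return q;
}

std::vector<RenderItem> sortWithFunctor(std::vector<RenderItem> q) {
    std::sort(q.begin(), q.end(), LessByKey{});
    return q;
}

// std::stable_sort keeps items with equal keys in submission order.
std::vector<RenderItem> sortStable(std::vector<RenderItem> q) {
    std::stable_sort(q.begin(), q.end());  // uses RenderItem::operator<
    return q;
}

// Small queue with duplicate keys, so stability is observable.
std::vector<RenderItem> sampleQueue() {
    return { {3, 0}, {1, 1}, {3, 2}, {2, 3}, {1, 4} };
}
```

All three produce the same key order. Only the stable_sort version is guaranteed to keep equal-key items in their original relative order, which is a separate question from how the comparator is passed.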
In my case it was 2.34 times faster in a large scene with about 100 sortable objects, each sorted 3 times for the different passes for shadows and normal rendering.
Without knowing many details about the OP's code, I would never go into an argument claiming that something is definitely too slow just because there's some other version that's 2.34 times faster. Maybe his program has 2.34 times as much time to sort as yours did. Maybe it has ten times as much. Maybe it has one tenth. Generally, I'd expect many more situations where both have plenty of time or neither is fast enough than situations where one fits and the other doesn't.