Many game engines use a draw call sorting approach to create render queues (like this, Bitsquid also uses a similar system).
This works well when each stage (gbuffer pass, lighting) is only executed one time since you just have to set some bits in the sort key that identify the pass.
The problem appears when you have passes that might be executed multiple times like shadow mapping (and real time reflection maps) because it needs to be executed for each light on screen that casts shadows. There is a limited number of bits that identity each stage so you (probably) cannot simply assign each light a unique stage number.
I currently don't include stage bits in the sort key instead I have multiple "buckets" each bucket associated with a stage and render items are put in the buckets. Then the stages are executed in the defined order by rendering the render items inside each bucket, this way I can render the shadow map bucket multiple times.
In the end of the article I linked in the beginning of the topic the author wrote that this multiple buckets approach where draw calls are sorted inside the buckets was used in the PS2-era and nowadays draw calls from multiple stages are sorted together.
What am I missing?