Let's start from the problem I'm having right now: I (in the sense of my system) am bad at transparency (in the sense of blending). What I'm currently doing is the following: there's a limited set of blending operations available in the material definition; if a blending operation is specified, it is looked up to determine whether it's order-dependent.
If it is, the objects using the order-dependent material are rendered back-to-front.
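The current scheme can be sketched roughly like this (all names here are illustrative, not from a real API): a table maps each blend op to an order-dependence flag, and order-dependent objects are sorted back-to-front by distance from the camera.

```python
# Sketch of the current approach: a lookup table says whether a blend op
# is order-dependent, and order-dependent objects are drawn back-to-front.
# Names and the tuple layout are illustrative only.

BLEND_ORDER_DEPENDENT = {
    "opaque": False,
    "additive": False,       # commutative: a + b == b + a
    "multiplicative": False, # commutative as well
    "alpha_over": True,      # classic "over" compositing is order-dependent
}

def render_order(objects, camera_pos):
    """Return objects with order-dependent ones last, sorted back-to-front.

    `objects` is a list of (name, blend_op, world_pos) tuples; depth is
    approximated by squared distance from the camera (a common shortcut).
    """
    def depth(obj):
        px, py, pz = obj[2]
        cx, cy, cz = camera_pos
        return (px - cx) ** 2 + (py - cy) ** 2 + (pz - cz) ** 2

    opaque = [o for o in objects if not BLEND_ORDER_DEPENDENT[o[1]]]
    blended = sorted((o for o in objects if BLEND_ORDER_DEPENDENT[o[1]]),
                     key=depth, reverse=True)  # farthest first
    return opaque + blended
```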
I already have a few problems up to now.
- The blending operations are essentially fixed-pipeline. That's a bit of an anticlimax to me, considering everything else is largely shader-driven. I'd like to hear some opinions on how to make this look "more programmable" than it really is. The main problem would then be figuring out whether a given operation needs to preserve order or not.
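One way I can imagine making it "look programmable" (this is a sketch of an idea, not something the engine does): let the material express the blend as a function of source and destination, then probe that function numerically to see whether folding it over fragments in different orders gives the same result. Passing the probe is evidence, not a proof, so an explicit material flag could still override it.

```python
# Heuristic order-independence probe for a user-supplied blend function
# f(src, dst) -> out. If applying three fragments in every permutation
# always yields the same result, the op behaves commutatively/associatively
# on the sampled values. This is a hypothetical sketch, not a real API.

import itertools

def looks_order_independent(blend, samples=(0.0, 0.25, 0.5, 1.0)):
    """Check that folding `blend` over three fragments in any order gives
    the same framebuffer value, for all sampled fragment/dst combinations."""
    for a, b, c in itertools.product(samples, repeat=3):
        for dst in samples:
            base = blend(c, blend(b, blend(a, dst)))
            for x, y, z in itertools.permutations((a, b, c)):
                if abs(blend(z, blend(y, blend(x, dst))) - base) > 1e-9:
                    return False
    return True

additive = lambda src, dst: src + dst                 # order-independent
over     = lambda src, dst: 0.5 * src + 0.5 * dst     # fixed-alpha "over"
```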
- It is unclear how, for example, frosted windows would have to interact, since they truly have to force a full background render. I know some systems have a specific per-material blend-priority setting, but I can't quite make it work in my head. It seems to me that a full sort would suffice even for some of those effects. The main trouble I see here is, for example, transparent decals on transparent glass windows, but in principle the decals would be closer to the camera and render last.
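As I understand the per-material blend-priority idea, it amounts to a compound sort key, something like the following sketch (names are mine): priority is the major key, depth the minor one, so decals always land after the glass they sit on while depth still orders things within a priority band.

```python
# Hypothetical sketch of per-material blend priority as a compound sort key.

def blended_draw_order(items):
    """`items`: list of (name, priority, view_depth) tuples.
    Lower priority draws first; within a priority band, farthest first
    (back-to-front), hence the negated depth in the key."""
    return sorted(items, key=lambda i: (i[1], -i[2]))
```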
- Because the sort is currently per-object, some objects won't render correctly against themselves. Given back-face culling, the small number of demos I've given so far, and the complete lack of goblets, I've managed to get satisfactory results.
I can live with the first two for now, but I really feel the need to deal with the third, so I've already put in place some of the machinery required to sort object batches. I don't plan to resolve per-triangle sorting for the time being; it seems like way too much work. Goblets would still not render correctly, but objects using multiple materials will, as long as they play nice.
This would potentially introduce the need for a lot of batches. That made me think about the possibility of evaluating overlap on a finer basis. I admit it's probably not completely my idea; I think I got the inspiration from DICE's presentations about software occlusion culling in Frostbite.
So what I was thinking is this:
- Always generate order-dependent geometry as triangles (the user sets a special flag if a material is order-dependent). This might take some minor work, as some assets might be using strips.
- Rasterize z (or "some metric") for the resulting triangles (probably at 1/4 resolution); when a batch overlaps what's already there, put it "on a new layer".
- Merge batches of triangles in the same layer.
- Draw in order.
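A toy version of the layer assignment above, to make the idea concrete: batch footprints are rasterized into a coarse occupancy grid, and a batch that overlaps an existing layer gets pushed above it. Axis-aligned screen rects stand in for the real quarter-res triangle rasterization, and batches are assumed to arrive back-to-front; everything else is an assumption of this sketch.

```python
# Toy layer assignment: each layer keeps a set of occupied grid cells.
# A batch joins the lowest layer it doesn't conflict with; overlapping an
# occupied layer pushes it above that layer. Batches in the same layer are
# implicitly merged (they share one occupancy set and one draw).

def assign_layers(batches):
    """`batches`: list of (name, rect) in back-to-front order, where
    rect = (x0, y0, x1, y1) in grid cells, x1/y1 exclusive.
    Returns {name: layer_index}."""
    layers = []   # one set of occupied (x, y) cells per layer
    result = {}
    for name, (x0, y0, x1, y1) in batches:
        cells = {(x, y) for x in range(x0, x1) for y in range(y0, y1)}
        layer = 0
        for i, occupied in enumerate(layers):
            if cells & occupied:
                layer = i + 1  # overlaps layer i: must draw above it
        if layer == len(layers):
            layers.append(set())
        layers[layer] |= cells
        result[name] = layer
    return result
```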
I expect this to result in a generally low batch count... perhaps too low. I can see this being useful in the future, but it's a distant one, and right now it sounds more like premature optimization than anything else. It's also unclear how this would fit into the system for dynamic geometry.
What's really troublesome is the way geometry would have to get transformed, as nobody says it'll be a standard MVP transform. Because of a previous design decision this is not a problem, but performance will be terrible (probably a very good candidate for multi-threading). As it stands now, the system won't consider opaque geometry, which would result in extra, unnecessary passes, but I don't see transforming the world geometry as a possibility in the long run (I think it would be viable right now).
At this point, I might even try full OIT, whatever it takes!
But there's something else about the software rasterization when it comes to particle systems, which is another system I need to bump up quite a bit!
Since particle systems are good mostly for fuzzy objects, I could somehow instruct the z-resolver to issue "no more than N layers". This would result in more efficient use of blending operations for the interior part of the particle system, which could probably go unnoticed given enough particles. It's, ironically, a very "fuzzy" idea.
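The layer cap could be as simple as the sketch below (a hypothetical helper, assuming the resolver hands back a particle-to-layer map): any particle beyond the cap is folded into the topmost allowed layer, accepting blending artifacts in the fuzzy interior.

```python
# Hypothetical "no more than N layers" clamp for particle batches:
# layers past the cap collapse into the topmost allowed layer, trading
# correctness in the interior for fewer blended passes.

def cap_layers(layer_of, max_layers):
    """`layer_of`: {particle: layer_index}. Clamp indices to max_layers - 1."""
    return {p: min(layer, max_layers - 1) for p, layer in layer_of.items()}
```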
Suggestions and further considerations are welcome.