I really like this question (or set of questions) as it presents a large number of the issues developers in your position face.
From my experience if you are already on the right track, you identified a large bottleneck and improved it.
What you need to avoid it trying to catch/handle every situation in the most efficient way - this can often end up worse! I recommend an approach that doesn't try to solve all of the issues but just the main ones. Often enough the performance of such a solution is plenty good enough for most applications.
If you are optimising your engine for this game and this game alone things might be different, but suppose you want to use the engine for a future game one that has more transparent objects, or transparent objects completely mixed in with opaque ones all over the place? Then a seemingly good solution might end up being a bad one.
In terms of over complexity avoiding you could start with very simple dirty flags for things like children changed or not, rather that trying to work out which exact child was removed or added it might be easier just to know IF one was added or removed and respond accordingly, hope that makes some kind of sense.
You will have to make those calls though based on your needs.
That said some kind of double buffer might work for handling changes (or lack thereof) well i.e. for each frame swap buffers and copy large chunks from the last one where possible when no changes occur otherwise just write in the new data.
Just remember that blasting through more linear data can often be faster than jumping around all over the place trying to avoid unnecessary work due to cache misses anyway.
Look forward to seeing other replies in here