this doesn't actually sound too far off from how I am handling it, except that, for smaller-scale lights, I only create/update the shadow-maps for light-sources within a certain radius, and use delays between updates for each light. if a light gets too far away, its shadow-maps are destroyed.
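as a rough sketch of the radius + delay idea (not my actual code; all the names here are made up for illustration, and the real texture alloc/render/free calls are only hinted at in comments), the per-frame logic could look something like:

```cpp
#include <cmath>
#include <vector>

// All names here are hypothetical; they are not from any particular engine.
struct Vec3 { float x, y, z; };

struct Light {
    Vec3  pos;
    bool  hasShadowMap = false;  // stands in for an allocated cube-map
    float nextUpdate   = 0.0f;   // time (seconds) when the map may be re-rendered
    int   renderCount  = 0;      // how often the map was (re)rendered, for illustration
};

static float dist(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Called once per frame. 'radius' and 'delay' are assumed tuning values.
void updateShadowMaps(std::vector<Light>& lights, const Vec3& camera,
                      float now, float radius, float delay) {
    for (Light& l : lights) {
        if (dist(l.pos, camera) > radius) {
            // too far away: drop the shadow-map (real code would free the texture)
            l.hasShadowMap = false;
            continue;
        }
        if (!l.hasShadowMap) {
            l.hasShadowMap = true;  // real code would allocate the cube-map here
            l.nextUpdate   = now;   // render it right away
        }
        if (now >= l.nextUpdate) {
            l.renderCount++;        // real code would render the 6 cube faces here
            l.nextUpdate = now + delay;
        }
    }
}
```

the delay is what keeps a bunch of nearby lights from all re-rendering their cube-maps every frame; staggering the initial `nextUpdate` values per light would spread the cost out further.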
currently, I am using a bigger 1024x1024 cube-map for the sun, but most smaller lights are using 256x256 cube-maps (I was using 128x128, but made them bigger to increase shadow precision).
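for a sense of what these sizes cost, here is a quick back-of-the-envelope helper. it assumes 4 bytes per texel (say, a 32-bit depth format) and no mipmaps, which is an assumption for the example and not necessarily what any given engine uses:

```cpp
#include <cstddef>

// Approximate memory for one depth cube-map: 6 square faces.
// bytesPerTexel = 4 is an assumed depth format, not a measured value.
constexpr std::size_t cubeMapBytes(std::size_t faceSize,
                                   std::size_t bytesPerTexel = 4) {
    return 6 * faceSize * faceSize * bytesPerTexel;
}

static_assert(cubeMapBytes(128)  ==   393216, "384 KiB");
static_assert(cubeMapBytes(256)  ==  1572864, "1.5 MiB");
static_assert(cubeMapBytes(1024) == 25165824, "24 MiB");
```

so under those assumptions, going from 128x128 to 256x256 halves the texel size but costs 4x the memory per light (and about 4x the fill per update), which is part of the tradeoff.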
this is actually being combined with the use of stencil-shadowing for smaller objects (like characters and random 3D models), mostly because the shadow-maps aren't really all that precise and update slowly, making them a lot better for things like general scenery/terrain and not so much for something like a moving NPC.
currently, shadow-maps are also not used for object-attached dynamic lights, which instead use stencil shadows exclusively (theoretically though, for something like a lantern or projectile, a person could just use a much more rapidly updating shadow-map if needed).
performance? not necessarily great, but it is workable I guess. there are tradeoffs here.
I think I saw something at one point IIRC using a camera-space depth-map and (somehow) making this work with shadow-mapping, but I am not really sure how exactly...
(I don't really remember the specifics, but I think it was based on ray-marching, similar to parallax-occlusion mapping, but I could be wrong here...).
but, yeah, it is a mystery to me; maybe someone has better ideas?...