I'm pretty sure this is possible because only a finite number of sequences can be constructed: each operation is an edge reduction, so a model with N edges can have at most N operations before it's down to a single vertex. (N is a loose upper bound, since a collapse usually takes more than one edge with it.) So we're looking at extracting, from the set U of all possible operations, the sequence S that gives the best visual result for the current camera parameters.
Also, neighboring camera positions will have similar sequences - strings of a hundred ops that may differ at only a couple of indices. So perhaps delta compression would be a way to go. I'm not sure, though... you can't really interpolate an edge-collapse operation. Either you do it or you don't. If you 'half' do it, you're shrinking the edge but not removing it, which seems kinda pointless... though I guess it might be worthwhile to prevent popping.
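To make the delta compression idea concrete, here's a minimal sketch. It assumes each sequence is a list of integer op IDs and that neighboring sequences have the same length. The names encode_delta and apply_delta are made up for illustration. It stores only the positions where a neighbor differs from a base sequence.

```python
from typing import List, Tuple

# Hypothetical representation: (position in sequence, replacement op ID).
Delta = List[Tuple[int, int]]

def encode_delta(base: List[int], other: List[int]) -> Delta:
    """Record only the positions where `other` differs from `base`."""
    if len(base) != len(other):
        raise ValueError("this sketch assumes equal-length sequences")
    return [(i, op) for i, (b, op) in enumerate(zip(base, other)) if b != op]

def apply_delta(base: List[int], delta: Delta) -> List[int]:
    """Rebuild the neighboring sequence from the base plus its delta."""
    out = list(base)
    for i, op in delta:
        out[i] = op
    return out

# Two made-up sequences that differ only by swapping two ops.
base = [4, 0, 7, 2, 9, 1]
neighbor = [4, 0, 2, 7, 9, 1]
delta = encode_delta(base, neighbor)
```

If two camera positions really do differ by only a couple of swaps, the delta is a handful of pairs rather than a full string of ops. Whether that holds in practice is something I'd have to measure.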
Should each string contain an ordering of all available operations? Probably. So for N edges, we're looking at up to N operations, and thus up to N! possible sequences. That's a lot of sequences.
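Just to get a feel for how big N! gets, here's a quick sketch that counts the bits needed to store one ID in the range 0 to N! - 1. The function name bits_for_sequence_id is my own invention. It takes the worst case where every one of the N operations appears in every sequence.

```python
import math

def bits_for_sequence_id(n_ops: int) -> int:
    """Bits needed to store any ID in the range 0 .. n_ops! - 1."""
    largest_id = math.factorial(n_ops) - 1
    # bit_length() is 0 for the value 0, so reserve at least one bit.
    return max(1, largest_id.bit_length())
```

For a string of 100 ops that comes to 525 bits per ID, so the number of sequences is far too large to enumerate, even though a single ID is still cheap to store.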
Though that does suggest that it'd be possible to actually divide the viewing sphere up into regions, each of which is assigned a sequence ID (a number between 0 and N! - 1).
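One way a sequence ID like that could work is to treat each sequence as a permutation of the op indices 0..N-1 and use its lexicographic rank (the factorial number system, a.k.a. Lehmer code). That's a standard technique I'm swapping in here, not anything taken from the progressive mesh papers. The function names are made up.

```python
import math
from typing import List

def permutation_to_id(perm: List[int]) -> int:
    """Lexicographic rank of a permutation of 0..n-1, in 0 .. n! - 1."""
    remaining = sorted(perm)
    rank = 0
    for op in perm:
        idx = remaining.index(op)          # position among ops not yet used
        rank = rank * len(remaining) + idx
        remaining.pop(idx)
    return rank

def id_to_permutation(seq_id: int, n: int) -> List[int]:
    """Inverse of permutation_to_id for permutations of 0..n-1."""
    if not 0 <= seq_id < math.factorial(n):
        raise ValueError("seq_id out of range for n operations")
    digits = []
    for base in range(1, n + 1):           # peel off factoradic digits,
        seq_id, d = divmod(seq_id, base)   # least significant first
        digits.append(d)
    remaining = list(range(n))
    return [remaining.pop(d) for d in reversed(digits)]
```

Both functions are O(N^2) because of the list scans, which is fine for a sketch. The IDs themselves are arbitrary-precision integers, so a region of the viewing sphere would be storing a big number rather than a machine word.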
I think I'd better go and re-read Hugues Hoppe's stuff on the subject. What with him being the chap who invented progressive meshes and all...