Since I am not working with an artist I get my my models from sites like turbosquid. This being the case I can't really place constraints on my models so I need to use fairly robust techniques.
The algorithm essentially comes down to first, creating a unique set of triangle-edges, where each edge has a signed counter variable initialized to 0. You create the shadow volume in two steps. First you render the front facing triangles, and while doing so, you either increment or decrement the counter value of its edges based on a certain condition (specifically, based on whether the edge is directed the same as the rendered triangle).
After all triangles have been processed, the silohuette edges are rendered based on the value of the counter in each edge object.
The exact details are not so important, my concern is a good GPU implementation. The shader would need to keep a list of counters for each edge and update them per-triangle render. Then process only those edges who's final counter value is non-zero. Since they potentially update the same value counter variable, there could be some bottleneck I think.
Is this technique just not practical for GPU optimization?