The example above that uses appliedVelocity was part of my less-stable version which I am not using anymore. The idea comes from "Game Physics Engine Development" by Ian Millington (section 16.1.2 of 2nd ed) and is intended to remove the previous frame's applied velocity (primarily acceleration due to gravity) in order to reduce jitteriness. With position correction integrated, my engine was stable enough that the term wasn't needed anymore, so I've since removed it.
If you still want to use it, my understanding is that you determine it's magnitude along the collision normal (appliedVel dot normal), and subtract it from the amount of desired change in velocity for that frame in order to prevent over-compensation.
Regarding restitution, for my needs, I really don't need much to be "bouncy", so I'm setting the change in vel to be only enough to remove the relative velocity w.r.t the normal (i.e. changeInVel = -magnitudeOfRelativeVelOnNormal). But anyways, the added energy may be coming because of the integration of the position correction. Remember position correction only needs to be applied for a single frame; and so when we apply it as an impulse, it will continue to affect the bodies in question over several frames. To account for this, you should experiment with only applying a fraction of this corrective impulse. This combined with warm starting should be enough. If you dig into the full Box2D (rather than Box2D lite) I believe it handles coefficients of restitution more dynamically.
Something else to consider. If you are caching your contacts across multiple frames or have implemented any type of relaxation to solve your velocity constraints, you'll want to make sure that you update the "velAlongNormal" each time you apply an impulse to either body otherwise you may accumulate error which *could* be causing any extra bounciness.