I've used SAT-GJK combined with EPA (the Expanding Polytope Algorithm) to generate accurate contacts, with success. It's actually not as slow as some would have you believe: I can easily simulate a few thousand objects in a large pile on one thread at 60 fps with a brute-force broadphase. The main bottleneck is usually constraint solving, not collision detection. My implementation handles any type of convex shape, given a support function, and can refine the contact results to be within a user-defined epsilon of the actual contacts.
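For anyone unfamiliar with the term: a support function returns the point of a convex shape that lies farthest along a given direction, and it is the only thing GJK and EPA need to know about a shape. Below is a minimal sketch of the idea. The names (`Vec3`, `Sphere`, `ConvexHull`, `support`, `minkowskiSupport`) are made up for illustration and are not taken from om::physics::collision.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { double x, y, z; };

inline Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(const Vec3& a, double s) { return {a.x * s, a.y * s, a.z * s}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; double radius; };
struct ConvexHull { std::vector<Vec3> vertices; };

// Farthest point of a sphere along direction d.
inline Vec3 support(const Sphere& s, const Vec3& d) {
    const double len = std::sqrt(dot(d, d));
    if (len == 0.0) return s.center;  // degenerate direction: any point is valid
    return s.center + d * (s.radius / len);
}

// Farthest vertex of a convex point set along direction d (linear scan).
inline Vec3 support(const ConvexHull& h, const Vec3& d) {
    assert(!h.vertices.empty());
    Vec3 best = h.vertices[0];
    double bestDot = dot(best, d);
    for (std::size_t i = 1; i < h.vertices.size(); ++i) {
        const double value = dot(h.vertices[i], d);
        if (value > bestDot) { bestDot = value; best = h.vertices[i]; }
    }
    return best;
}

// Support point of the Minkowski difference A - B, which is the set
// GJK and EPA actually operate on.
template <class A, class B>
Vec3 minkowskiSupport(const A& a, const B& b, const Vec3& d) {
    return support(a, d) - support(b, Vec3{-d.x, -d.y, -d.z});
}
```

Any shape that can provide such an overload can be fed to the same solver, which is why one GJK/EPA pair can cover spheres, boxes, hulls, capsules and so on.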
My implementations of these algorithms are pretty well commented and flexible, and have been rewritten several times for performance and clarity. You can find them here (as part of om::physics::collision; sorry, I don't have a direct link). Look for the GJKSolver and EPASolver classes plus their supporting classes.