I think I got similar artifacts when using env-maps. Here's my journal entry about how I handled it in my engine; it might be helpful for you.
I don't think that really helps in my case.
You could look into bent normals or bent cones for sampling the environment, either pre-baked into vertices/textures or computed in screen space. These replace the actual surface normal with a fudged version that points toward the least-occluded direction, which produces more plausible lighting.
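To make the bent-normal idea a bit more concrete, here is a rough offline sketch in Python (not shader code). It assumes a bent normal is the cosine-weighted average of the unoccluded directions in the hemisphere around the surface normal. The `is_occluded` callback is a hypothetical stand-in for whatever ray cast your baker uses, and `fibonacci_sphere` is just one convenient way to get evenly spread sample directions.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit directions on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = k * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def bent_normal(normal, is_occluded, samples=2048):
    """Cosine-weighted average of unoccluded directions around `normal`.

    is_occluded(direction) -> bool is a hypothetical ray-cast callback.
    Falls back to the plain normal if every direction is blocked.
    """
    acc = [0.0, 0.0, 0.0]
    for d in fibonacci_sphere(samples):
        cos_theta = sum(a * b for a, b in zip(normal, d))
        if cos_theta <= 0.0 or is_occluded(d):
            continue  # below the surface, or blocked by geometry
        for i in range(3):
            acc[i] += d[i] * cos_theta
    length = math.sqrt(sum(c * c for c in acc))
    if length < 1e-8:
        return tuple(normal)
    return tuple(c / length for c in acc)

# Example: everything on the +x side of the point is blocked,
# so the bent normal tilts toward -x.
tilted = bent_normal((0.0, 0.0, 1.0), lambda d: d[0] > 0.0)
```

You would then sample the env-map with the bent normal instead of the surface normal. Whether that fixes a given artifact depends heavily on how much the two actually differ on your mesh.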
You could also try baking a visibility map per vertex/texel, stored in a spherical-harmonics basis (or similar), and then using the visibility term to modulate the env-map. This won't be correct if using pre-convolved env-maps, but might be better than nothing...
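A minimal sketch of what I mean by an SH visibility map, in Python for readability rather than as shader code. It assumes only the first two SH bands (4 coefficients per vertex/texel), a binary visibility function, and a hypothetical `is_visible` callback standing in for the baker's ray cast. The function names are made up for illustration.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit directions on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = k * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def sh_basis(d):
    """Real SH basis for bands 0 and 1, evaluated at unit direction d."""
    x, y, z = d
    return (0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x)

def project_visibility(is_visible, samples=1024):
    """Bake step: project a binary visibility function onto 4 SH coefficients.

    is_visible(direction) -> bool is a hypothetical ray-cast callback.
    """
    coeffs = [0.0, 0.0, 0.0, 0.0]
    weight = 4.0 * math.pi / samples  # solid angle per sample
    for d in fibonacci_sphere(samples):
        if is_visible(d):
            for i, b in enumerate(sh_basis(d)):
                coeffs[i] += b * weight
    return coeffs

def eval_visibility(coeffs, d):
    """Runtime step: reconstruct approximate visibility along d, clamped to [0, 1]."""
    v = sum(c * b for c, b in zip(coeffs, sh_basis(d)))
    return min(1.0, max(0.0, v))

def shade(env_color, coeffs, d):
    """Modulate an env-map sample (fetched along d) by the visibility term."""
    v = eval_visibility(coeffs, d)
    return tuple(c * v for c in env_color)

# Example: a point that can only see the upper (+z) hemisphere.
coeffs = project_visibility(lambda d: d[2] > 0.0)
```

With so few coefficients the reconstruction is very blurry and overshoots, which is why the result is clamped. As noted, multiplying a pre-convolved env-map lookup by this term is not mathematically correct, only an approximation.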
I generated a bent normal map in xNormal, and it made nearly no difference. I'd like to know more about using SH visibility maps, but I haven't found much material on it.
You could approximate the head with a sphere in the shader and compute a soft occlusion term from it. It's not perfect, but it could be good enough in most cases, and it's fast, needs minimal storage, and requires no precomputation.
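Roughly what the sphere trick could look like, written in Python for clarity (you would port it to your shader language). This sketch assumes the common cosine-weighted approximation occlusion = max(cos(theta), 0) * r^2 / d^2, which holds when the sphere sits fully above the horizon of the shaded point and is only a rough estimate otherwise. `sphere_occlusion` is an illustrative name, not something from an existing library.

```python
import math

def sphere_occlusion(point, normal, center, radius):
    """Approximate fraction of cosine-weighted sky blocked by a sphere.

    point, normal, center are 3-tuples; normal must be unit length.
    Ignores partial clipping of the sphere by the surface horizon.
    """
    to_center = [c - p for c, p in zip(center, point)]
    dist = math.sqrt(sum(v * v for v in to_center))
    if dist <= radius:
        return 1.0  # point is inside the sphere: treat as fully occluded
    cos_theta = sum(n * v for n, v in zip(normal, to_center)) / dist
    occ = max(cos_theta, 0.0) * (radius * radius) / (dist * dist)
    return min(1.0, occ)

# Example: unit sphere two units above a floor point facing up.
occ = sphere_occlusion((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0), 1.0)
ambient_scale = 1.0 - occ  # multiply the env-map lighting by this
```

It costs a handful of ALU instructions per sphere and needs only a position and a radius per occluder.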
This head is just an example. I would need a solution that would work for all objects, some much more complicated than a head.