Hardware filtering on its own is not going to give you a large penumbra. A large penumbra requires a very large filter kernel, and bilinear is only 2x2. Trilinear and anisotropic only kick in once there's some minification involved (the shadow map resolution is greater than the effective pixel shader resolution), and they are designed to preserve texture detail while avoiding aliasing, not to blur. If you want a large penumbra that "softens" your shadows everywhere, then you should pre-filter your shadow map with a large filter kernel (try 7x7 or so).
If you were using PCF, you could not pre-filter your shadow map, because the depth comparison has to happen before the filtering. You would have to filter in the pixel shader, using however many samples are necessary. Since VSMs are filterable, you can use a separable filter kernel when pre-filtering to reduce the actual number of samples required: a 7x7 kernel takes 7 + 7 = 14 samples per texel instead of 49. In addition, if you use a caching scheme for your shadow maps, then you can amortize the filtering cost across multiple frames by caching the filtered result. With standard shadow maps, you must pay the filtering cost every time you sample the shadow map.
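To make the separable pre-filter concrete, here is a minimal CPU-side sketch in Python. It is only an illustration, not code from my sample app: the function names (`box_blur_1d`, `prefilter_vsm`, and so on) are made up for this post, a plain box kernel with clamp-to-edge addressing is assumed, and a real implementation would run the two passes on the GPU. It stores the two VSM moments (depth and depth squared), blurs each with a horizontal pass followed by a vertical pass, and includes a brute-force 2D blur for comparison.

```python
def clamp(i, lo, hi):
    return max(lo, min(i, hi))

def box_blur_1d(img, radius, horizontal):
    """One pass of a box blur along a single axis, clamp-to-edge."""
    h, w = len(img), len(img[0])
    taps = 2 * radius + 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for k in range(-radius, radius + 1):
                sx = clamp(x + k, 0, w - 1) if horizontal else x
                sy = y if horizontal else clamp(y + k, 0, h - 1)
                total += img[sy][sx]
            out[y][x] = total / taps
    return out

def box_blur_separable(img, radius):
    """Horizontal pass, then vertical pass: 2 * (2r + 1) samples per texel."""
    return box_blur_1d(box_blur_1d(img, radius, True), radius, False)

def box_blur_2d(img, radius):
    """Brute-force reference: (2r + 1)^2 samples per texel."""
    h, w = len(img), len(img[0])
    taps = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for ky in range(-radius, radius + 1):
                for kx in range(-radius, radius + 1):
                    total += img[clamp(y + ky, 0, h - 1)][clamp(x + kx, 0, w - 1)]
            out[y][x] = total / taps
    return out

def prefilter_vsm(depth, radius):
    """Build the two moment channels and blur each one separably."""
    m1 = [row[:] for row in depth]                 # E[d]
    m2 = [[d * d for d in row] for row in depth]   # E[d^2]
    return box_blur_separable(m1, radius), box_blur_separable(m2, radius)

def samples_per_texel(radius):
    """(separable, brute-force) sample counts for a (2r + 1)-wide kernel."""
    taps = 2 * radius + 1
    return 2 * taps, taps * taps
```

With `radius=3` (a 7x7 kernel), the two passes produce the same filtered moments as the brute-force version up to floating-point rounding. This pre-filtering only makes sense because the moments can be averaged before the shadow test; doing the same to a plain depth map for PCF would average depths rather than comparison results.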
Aside from that, VSM also lets you use MSAA and hardware trilinear + anisotropic filtering. MSAA will increase your rasterization resolution to reduce sub-pixel aliasing, but you won't have to pay the cost of filtering or sampling at that increased resolution. As I already mentioned, trilinear and anisotropic will prevent aliasing when there is minification involved. This usually happens when viewing surfaces at a grazing angle. Here are some images from my sample app to show you what I mean:
[attachment=23211:Shadows_PCF.png] [attachment=23210:Shadows_Aniso.png]
The image on the left shows shadows being cast onto a ground plane, using only 2x2 PCF. You can see that as you get farther away from the camera, the shadows just turn into a mess of aliasing artifacts due to extreme undersampling. The image on the right is using EVSM (a variant of VSM) with mipmaps and 16x anisotropic filtering. Notice how the shadows don't have the same aliasing artifacts, and smoothly fade into a mix of shadowed and non-shadowed lighting.
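If it helps to see why filtered moments "fade" instead of aliasing, here is a rough Python sketch of the standard VSM visibility test (Chebyshev's upper bound). Treat it as an assumption-laden illustration: it uses plain VSM rather than the EVSM variant from my screenshots, the helper names and the `min_variance` clamp value are made up for this post, and averaging a list of depths stands in for what mipmapping and anisotropic filtering do in hardware.

```python
def average_moments(depths):
    """Stand-in for hardware filtering: average depth and depth^2 over a footprint."""
    n = len(depths)
    mean = sum(depths) / n
    mean_sq = sum(d * d for d in depths) / n
    return mean, mean_sq

def chebyshev_upper_bound(mean, mean_sq, receiver_depth, min_variance=1e-5):
    """Upper bound on the fraction of the footprint that does NOT occlude the receiver."""
    if receiver_depth <= mean:
        return 1.0  # receiver is in front of the average occluder: fully lit
    # Clamp the variance to avoid dividing by (nearly) zero on flat regions.
    variance = max(mean_sq - mean * mean, min_variance)
    d = receiver_depth - mean
    return variance / (variance + d * d)

# Ground plane at depth 0.75; half of the filtered footprint is covered
# by an occluder at depth 0.25, the other half sees the ground itself.
mean, mean_sq = average_moments([0.25, 0.25, 0.75, 0.75])
half_lit = chebyshev_upper_bound(mean, mean_sq, 0.75)

# Same receiver, but the whole footprint is covered by the occluder.
mean, mean_sq = average_moments([0.25, 0.25, 0.25, 0.25])
fully_shadowed = chebyshev_upper_bound(mean, mean_sq, 0.75)
```

In this toy case `half_lit` comes out at about 0.5 and `fully_shadowed` at close to 0, so a distant pixel whose footprint straddles a shadow edge gets a blended value rather than a noisy pick of one texel or the other.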