How about a two-click mechanism? The first click lets you drag a rectangle of blocks over the XZ-plane; the second click scales that rectangle along the Y-axis. This allows the user to create any box-shaped arrangement of blocks.
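A minimal sketch of how each click could be mapped onto the XZ-plane, assuming you already have a picking ray unprojected from the cursor (the function and parameter names are hypothetical):

```hlsl
// Intersect a picking ray with the y = 0 ground plane (the XZ-plane).
// rayOrigin and rayDir come from unprojecting the mouse cursor; both
// names are hypothetical. Assumes the ray is not parallel to the plane.
float3 RayPlaneXZ(float3 rayOrigin, float3 rayDir)
{
    // Solve rayOrigin.y + t * rayDir.y = 0 for t, then walk the ray.
    float t = -rayOrigin.y / rayDir.y;
    return rayOrigin + t * rayDir;
}
```

Two such points (mouse-down and mouse-up of the first click-drag) span the rectangle; the vertical mouse movement after the second click then sets the block count along Y.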
Given the three orthogonal axes defined by a bone matrix, you could search the underlying mesh for the extreme vertices affected by that bone. If you search in both the positive and negative directions along each axis, you end up with six planes that form an oriented box. You can then transform this box into world space using the same matrices as the bones.
Or you can transform the box by the inverse of its bone matrix if you need an AABB that's positioned at the origin. That would take up less memory, too.
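A minimal compute-shader sketch of that search, assuming the influenced vertices have been gathered into a buffer (all buffer names and the layout are assumptions; in practice you'd likely do this once on the CPU at load time):

```hlsl
// One thread finds the bone-local extents of the vertices a bone
// influences; hypothetical resource names and layout.
StructuredBuffer<float3>   influencedPositions : register(t0); // mesh-space verts
RWStructuredBuffer<float3> boxOut              : register(u0); // [0] = min, [1] = max

cbuffer BoneData : register(b0)
{
    float4x4 invBoneMatrix; // inverse of the bone's bind matrix
    uint     vertexCount;
};

[numthreads(1, 1, 1)]
void main()
{
    float3 boxMin = float3(1e30, 1e30, 1e30);
    float3 boxMax = -boxMin;
    for (uint i = 0; i < vertexCount; ++i)
    {
        // The inverse bone matrix expresses each vertex along the bone's
        // three axes, so a per-component min/max yields the six extreme
        // planes of the oriented box.
        float3 p = mul(invBoneMatrix, float4(influencedPositions[i], 1.0)).xyz;
        boxMin = min(boxMin, p);
        boxMax = max(boxMax, p);
    }
    boxOut[0] = boxMin;
    boxOut[1] = boxMax;
}
```

Transforming the eight corners of (boxMin, boxMax) by the bone's world matrix each frame then gives you the oriented box in world space described above.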
A typical displacement map is sampled like any other texture: using a set of texture coordinates (e.g. from a UV map). You can use adaptive tessellation to create additional vertices that sample this map in between the original vertices.
If you want a one-to-one mapping of 409 displacement values tied to 409 vertices, it makes more sense to store them as a vertex (weight) map and use them as a vertex shader input (alongside position, UVs, etc.).
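A minimal sketch of that second approach, with the displacement bound as an extra per-vertex attribute (the DISPLACEMENT semantic, the constant buffer layout, and displacing along the normal are all assumptions about the specific setup):

```hlsl
cbuffer PerObject : register(b0)
{
    float4x4 worldViewProj;     // assumed constant buffer layout
    float    displacementScale;
};

struct VSInput
{
    float3 position     : POSITION;
    float3 normal       : NORMAL;
    float2 uv           : TEXCOORD0;
    float  displacement : DISPLACEMENT; // one stored weight per vertex
};

float4 main(VSInput input) : SV_Position
{
    // Push the vertex out along its normal by its stored weight,
    // instead of sampling a displacement texture.
    float3 displaced = input.position
                     + input.normal * input.displacement * displacementScale;
    return mul(worldViewProj, float4(displaced, 1.0));
}
```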
Could I, for example, save alpha in normal map textures in the 'Z' channel, which is always sort of the same?
It's not always the same, but its value can be derived from the other two components.
D3D10 added the two-channel BC5_SNORM format, which lets you store the x and y channels in the normalized [-1, 1] range. The z component can then be recalculated in the shader. This format provides high-quality compression for normal maps.
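Since a unit-length normal satisfies x^2 + y^2 + z^2 = 1, the shader can take the positive root (tangent-space normals point out of the surface). A minimal sketch, assuming the map is bound as a two-channel texture; the resource names are hypothetical:

```hlsl
Texture2D<float2> normalMap     : register(t0); // BC5_SNORM: x and y only
SamplerState      linearSampler : register(s0);

float3 SampleNormal(float2 uv)
{
    // BC5_SNORM already delivers x and y in [-1, 1].
    float2 xy = normalMap.Sample(linearSampler, uv);
    // Unit length implies z = sqrt(1 - x^2 - y^2); saturate guards
    // against compression error pushing the sum slightly above 1.
    float z = sqrt(saturate(1.0 - dot(xy, xy)));
    return float3(xy, z);
}
```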
So smooth surfaces do not duplicate vertices at the same position? I figure the normal must then be a blend of the three adjacent face normals?
Yes, and for a cube specifically that would be problematic. If you tie the SV_VertexID semantic to a vertex shader input (D3D10+), you can forgo vertex buffers altogether and construct a cube in its entirety in the vertex shader, based on the index value that's passed in.
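A minimal sketch of that idea, assuming a triangle-list draw of 36 vertices with no buffers bound (the index table and constant buffer layout are assumptions; depending on your projection's handedness you may need to flip the winding or the cull mode):

```hlsl
// Vertex shader that generates a unit cube (36 vertices, triangle list)
// without any vertex or index buffer. The corner index encodes the
// position in its three lowest bits: bit 0 = x, bit 1 = y, bit 2 = z.
static const uint cubeIndices[36] =
{
    0,2,1, 1,2,3,   // -z face
    4,5,6, 5,7,6,   // +z face
    0,1,4, 1,5,4,   // -y face
    2,6,3, 3,6,7,   // +y face
    0,4,2, 2,4,6,   // -x face
    1,3,5, 3,7,5    // +x face
};

cbuffer PerObject : register(b0)
{
    float4x4 worldViewProj; // assumed constant buffer layout
};

float4 main(uint vertexID : SV_VertexID) : SV_Position
{
    uint corner = cubeIndices[vertexID];
    // Decode the three bits of the corner index into a [-1, 1] position.
    float3 pos = float3(corner & 1, (corner >> 1) & 1, (corner >> 2) & 1) * 2.0 - 1.0;
    return mul(worldViewProj, float4(pos, 1.0));
}
```

On the D3D11 side this amounts to binding no input layout or vertex buffers and calling Draw(36, 0) with a triangle-list topology.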