Normally, what takes most of the loading time is:
- I/O: Loading from disk and/or decompressing if it's compressed.
- Creating the D3D resources (be it a vertex buffer, a texture, etc.)
Uploading the data from RAM into VRAM is the least expensive part. It's not free either, but you can load the data from disk into RAM in a background thread and then send it to the GPU from the main thread. You have no way to decrease the cost of creating a D3D resource (even in D3D11); you can, however, preallocate those resources to avoid the stall.
I agree it would be best (and certainly easier!) if you could load from disk into VRAM without any need for locking, but well, that's what we have to live with.
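To make the background-load / main-thread-upload split a bit more concrete, here is a rough sketch. It does not call D3D at all: GpuBufferStub, LoadedBlob, UploadQueue, loadFromDisk, uploadToGpu, pumpUploads and loadAssetsDemo are all made-up names for illustration, the "disk read" just fills a byte array, and the memcpy stands in for wherever your real lock/copy/unlock would go. The only point is the shape: resources are created up front, the loader thread touches RAM only, and the main thread does the copy into an already existing resource.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU resource created ahead of time.
struct GpuBufferStub {
    std::vector<std::uint8_t> storage;  // pretend this is VRAM
    std::size_t used = 0;
    explicit GpuBufferStub(std::size_t capacity) : storage(capacity) {}
};

// Data that has been read (and decompressed) into system RAM.
struct LoadedBlob {
    std::string name;
    std::vector<std::uint8_t> bytes;
};

// Thread-safe hand-off from the loader thread to the main thread.
class UploadQueue {
public:
    void push(LoadedBlob blob) {
        std::lock_guard<std::mutex> lock(mutex_);
        ready_.push(std::move(blob));
    }
    bool tryPop(LoadedBlob& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_.empty()) return false;
        out = std::move(ready_.front());
        ready_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<LoadedBlob> ready_;
};

// Background-thread work: placeholder for the real disk read + decompression.
LoadedBlob loadFromDisk(const std::string& name, std::size_t size) {
    LoadedBlob blob{name, std::vector<std::uint8_t>(size)};
    for (std::size_t i = 0; i < size; ++i)
        blob.bytes[i] = static_cast<std::uint8_t>(i);
    return blob;
}

// Main-thread work: copy into a resource that already exists.
// In a real renderer this is where the lock/copy/unlock would happen.
void uploadToGpu(GpuBufferStub& dst, const LoadedBlob& blob) {
    assert(blob.bytes.size() <= dst.storage.size());
    std::memcpy(dst.storage.data(), blob.bytes.data(), blob.bytes.size());
    dst.used = blob.bytes.size();
}

// Called from the main thread (e.g. once per frame): drain whatever is ready.
std::size_t pumpUploads(UploadQueue& queue, std::vector<GpuBufferStub>& pool,
                        std::size_t& nextFree) {
    std::size_t uploaded = 0;
    LoadedBlob blob;
    while (nextFree < pool.size() && queue.tryPop(blob)) {
        uploadToGpu(pool[nextFree++], blob);
        ++uploaded;
    }
    return uploaded;
}

// Ties it together: preallocate, load in the background, upload on this thread.
std::size_t loadAssetsDemo(std::size_t assetCount) {
    std::vector<GpuBufferStub> pool;
    for (std::size_t i = 0; i < assetCount; ++i)
        pool.emplace_back(1024);  // creation cost is paid here, up front

    UploadQueue queue;
    std::thread loader([&queue, assetCount] {
        for (std::size_t i = 0; i < assetCount; ++i)
            queue.push(loadFromDisk("asset" + std::to_string(i), 256));
    });

    std::size_t nextFree = 0, total = 0;
    while (total < assetCount) {  // a game would do this once per frame instead
        total += pumpUploads(queue, pool, nextFree);
        std::this_thread::yield();
    }
    loader.join();
    return total;
}
```

In a real game you'd call something like pumpUploads once per frame rather than spinning, and probably cap how many bytes you upload per frame so a big batch doesn't cause a hitch of its own.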