Does this mean that if there is not enough VRAM, the VBO will be stored in RAM instead? Or will it just throw a memory error and not store it?
As far as I can guess about a split memory scheme, I would say 'yes', most likely. This is entirely up to what the GPU driver is capable of. If the driver is written to do so, and it is stable on that machine, why not? If the driver can store buffers in one place or the other, then why not in both at the same time?
I can foresee some issues with this. How does the driver know what to put where? What if all the irrelevant models are in GPU RAM and all the currently displayed models are in main system memory? That would not be optimal.
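I don't think a memory error would be thrown at you. OpenGL does not raise errors on its own; it records an error flag (GL_OUT_OF_MEMORY in this case) that you only see if you call glGetError yourself, or if a library that you are using checks it and raises the error for you.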
You might want to check for this on mobile devices, which may have memory restrictions, but on conventional machines you will likely just end up forcing the OS to swap RAM out to virtual memory on the hard drive.
For desktops and laptops, you have not only GPU RAM (500MB?) and system RAM (2GB?) but virtual memory on the hard drive as well (2GB?).
You can get away with loading a whole heck of a lot of stuff. The problem is not running out of memory so much as all the swapping that has to take place behind the scenes. Swapping between CPU and GPU memory isn't so bad, but swapping from system RAM to the HDD is going to lock up your OS for some time. Hopefully not a long time.
It may be best, as a start, to script what is supposed to be shown when the character is at a specific point. This requires the least memory but will require hard-drive access during game-play. Hard-drive access is very slow. For a long time, the common number being passed around was that the hard drive is 200x slower than the 'system-bus-thing-a-ma-bobby'.
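As a rough illustration of that scripting idea, here is a minimal sketch in Python. The zone names, asset file names, and the `plan_transition` helper are all made up for the example; the only point is that a simple table of "zone -> assets" lets you work out what has to come off the disk and what can be released when the character moves.

```python
# Hypothetical zone names and asset file names, for illustration only.
ZONE_ASSETS = {
    "village": {"house.obj", "well.obj", "tree.obj"},
    "forest": {"tree.obj", "rock.obj", "deer.obj"},
    "cave": {"rock.obj", "bat.obj"},
}


def plan_transition(loaded, next_zone, zone_assets=ZONE_ASSETS):
    """Return (to_load, to_unload) for moving into next_zone.

    loaded is the set of asset names currently resident in memory.
    """
    needed = zone_assets[next_zone]
    to_load = needed - loaded      # must be read from the hard drive
    to_unload = loaded - needed    # not used by the new zone
    return to_load, to_unload
```

Assets shared between zones (the tree, in this made-up table) stay in memory, so only the difference between the two zones ever touches the hard drive.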
If I were to put any logic into loading/unloading I would start with the following.
I would only load models from the HDD when the frame rate is high, and I'd leave things as they are when the frame rate drops for other reasons. Instead of loading everything from the next scene all at once, I'd load pieces of the new scene at times when the system is running quickly. This has the potential to minimize hiccups in the game-play due to asset loading.
Dropping assets from memory should be quicker than loading them, unless you happen to be saving them for some reason, so you could do this at almost any time.
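The loading and dropping logic above might be sketched like this in Python. Everything here is an assumption for the sake of the example: the `BudgetedLoader` class, the `load_fn` callback that stands in for reading one asset from the hard drive, the 16.7 ms target frame time (about 60 fps), and the 4 ms of headroom are all placeholders you would tune for your own game.

```python
from collections import deque


class BudgetedLoader:
    """Loads queued assets a few at a time, only on frames with time to spare."""

    def __init__(self, load_fn, target_frame_ms=16.7, headroom_ms=4.0):
        self.load_fn = load_fn                # hypothetical: reads one asset from disk
        self.target_frame_ms = target_frame_ms
        self.headroom_ms = headroom_ms        # spare time required before loading
        self.pending = deque()                # assets waiting to be loaded
        self.loaded = {}                      # name -> loaded asset

    def request(self, name):
        """Queue an asset unless it is already loaded or already queued."""
        if name not in self.loaded and name not in self.pending:
            self.pending.append(name)

    def update(self, last_frame_ms, max_per_frame=1):
        """Call once per frame; returns the names loaded during this call."""
        if last_frame_ms > self.target_frame_ms - self.headroom_ms:
            return []                         # slow frame: leave things as they are
        done = []
        while self.pending and len(done) < max_per_frame:
            name = self.pending.popleft()
            self.loaded[name] = self.load_fn(name)
            done.append(name)
        return done

    def drop(self, name):
        """Release an asset; cheap, so it can be called at almost any time."""
        self.loaded.pop(name, None)
```

You would call `request` for each asset of the upcoming scene ahead of time, then call `update` once per frame with the previous frame's duration. A real game would probably do the disk read on a background thread as well, but the budgeting idea stays the same.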