I'm working with an app that is very greedy with video memory. With one instance running, performance is fine. I can run two instances with little performance loss - that is, each instance runs only slightly slower than a single instance alone. When a third instance is running, the system grinds to a halt. Other tests show the issue is graphics-related, and I'm trying to determine the cause. The symptoms point to thrashing, but can GPUs thrash? I understand there are caches for things like transformed vertices and locally fetched texels, but those deal with small elements, not entire textures. I also think a context switch happens when a different process uses the GPU for rendering, but surely that can't cause the problem I'm seeing. Have any of you experienced thrashing on the GPU?
Note: When all three instances are running, I estimate VRAM usage at ~700 MB, and the computer has a 1 GB video card.
Yes, there is such a thing as GPU thrashing, and not just of VRAM - it also applies to the (GPU-accessible) system memory the GPU can reach. Your app needs a certain number of buffers every frame: some must be in VRAM, some must be in system memory, and some can be in either. You don't have unlimited VRAM, and you don't have unlimited GPU-accessible system memory either, so buffers may get swapped out of either or both into regular system memory, and possibly to disk if you start running out of that too. Your estimate of 700 MB may also be off, since some buffers can live in both VRAM and GPU-accessible system memory at the same time (usually to speed up context switching).
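To see why the cliff from 2 to 3 instances is the classic thrashing signature, here's a toy model (not real driver behavior - eviction policy, buffer count, and sizes are all made-up illustrative numbers): treat VRAM as an LRU-managed pool of buffers and count evictions per frame as instances are added. As long as the combined working set fits, nothing is evicted; the moment it exceeds capacity, a cyclic access pattern under LRU makes nearly every access miss.

```python
from collections import OrderedDict

class LruVram:
    """Toy model: VRAM as an LRU-managed pool of buffers."""
    def __init__(self, capacity_mb):
        self.capacity = capacity_mb
        self.resident = OrderedDict()  # buffer id -> size in MB
        self.used = 0
        self.evictions = 0

    def touch(self, buf, size_mb):
        if buf in self.resident:
            self.resident.move_to_end(buf)  # hit: mark most recently used
            return
        # Miss: evict least-recently-used buffers until the new one fits.
        while self.used + size_mb > self.capacity:
            _, evicted_mb = self.resident.popitem(last=False)
            self.used -= evicted_mb
            self.evictions += 1
        self.resident[buf] = size_mb
        self.used += size_mb

def evictions_per_frame(n_instances, vram_mb=1024,
                        bufs_per_instance=12, buf_mb=30):
    # Each hypothetical instance touches 12 x 30 MB = 360 MB per frame.
    vram = LruVram(vram_mb)
    for frame in range(2):        # frame 0 warms up, frame 1 is measured
        if frame == 1:
            vram.evictions = 0
        for inst in range(n_instances):
            for b in range(bufs_per_instance):
                vram.touch((inst, b), buf_mb)
    return vram.evictions

for n in (1, 2, 3):
    print(n, "instance(s):", evictions_per_frame(n), "evictions/frame")
```

With these numbers, 1 and 2 instances (360 MB and 720 MB) fit in the 1024 MB pool and evict nothing, while 3 instances (1080 MB) overflow it slightly and every single buffer gets evicted and re-fetched each frame - a small overcommit producing a total collapse, which matches your symptoms.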
The other possibility is that you aren't experiencing classical thrashing but excessive memory movement due to other constraints. If your app reads back render targets or other VRAM-resident buffers, for example, they may have to be moved somewhere the CPU can access them. The CPU usually has only a 256 MB window into VRAM, so a buffer outside that window may have to be moved; even a buffer inside the window may be copied to system memory if the driver thinks that will give the lowest latency. Who knows.
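A back-of-envelope sketch of that window effect (the 256 MB figure is from above; the bus rate and buffer placement are assumptions for illustration - real drivers are far less predictable):

```python
# Hypothetical numbers for illustration only; real apertures, buffer
# placement, and driver policy vary by GPU and driver.
CPU_WINDOW_MB = 256          # assumed CPU-visible slice of VRAM
BUS_GB_PER_S = 8.0           # assumed effective copy rate over the bus

def readback_cost_ms(buffer_offset_mb, buffer_mb):
    """Estimate the extra copy time to read back one VRAM buffer.

    If the buffer lies entirely inside the CPU-visible window, the CPU
    can map it directly (modelled as free here); otherwise the driver
    must first copy it to CPU-accessible memory over the bus.
    """
    if buffer_offset_mb + buffer_mb <= CPU_WINDOW_MB:
        return 0.0  # directly mappable in this toy model
    copy_bytes = buffer_mb * 1024 * 1024
    return copy_bytes / (BUS_GB_PER_S * 1024**3) * 1000.0

# The same 32 MB render target, inside vs. outside the window:
print(readback_cost_ms(100, 32))   # inside: 0.0 ms
print(readback_cost_ms(600, 32))   # outside: a copy is required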
And yes, there are context switches on both the CPU and the GPU when rendering from multiple contexts. On GPUs without hardware context switching, the driver has to store and re-emit whatever state is necessary when switching to a new context - that means reprogramming the GPU state and re-uploading any VRAM-only buffers, for example. I doubt this is your problem going from 2 to 3 contexts, unless you're hitting some pathological case.
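One way to see why I doubt it: software context switching costs grow roughly linearly with the number of contexts, so it can't by itself produce the cliff you describe. A toy cost model (every number here is invented for illustration):

```python
# Toy model of software context switching: every switch re-emits GPU
# state and re-uploads some VRAM-only data. All numbers are made up.
STATE_REEMIT_MS = 0.05       # assumed cost to reprogram GPU state
REUPLOAD_MB = 20             # assumed VRAM-only data re-sent per switch
UPLOAD_GB_PER_S = 8.0        # assumed upload rate

def switch_overhead_ms(n_contexts, switches_per_context_per_frame=1):
    """Per-frame overhead if the GPU round-robins between contexts."""
    switches = n_contexts * switches_per_context_per_frame
    upload_ms = REUPLOAD_MB / (UPLOAD_GB_PER_S * 1024) * 1000.0
    return switches * (STATE_REEMIT_MS + upload_ms)

for n in (1, 2, 3):
    print(n, "context(s):", round(switch_overhead_ms(n), 2), "ms/frame")
```

Three contexts cost 1.5x two contexts in this model - a gradual slowdown, not a grind-to-a-halt, which is why thrashing is the better fit for your symptoms.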
Anyway, most of this is just a guess based on the few details you provided and my understanding of such things.