Foo* foo = new Foo[size];
std::vector<Foo*> fooVector;
for(int i=0; i<size; i++)
{
//fill fooVector with &foo
}
Now let's assume that during run-time an event occurs and each time this event occurs a pointer in the vector is being processed. Some operations are done on the value being pointed at by this pointer and then a swap is done. The swap takes the address being pointed at by the current processed pointer and swap it with the address being pointed at by the last pointer in the vector. Note that 'last' here doesn't necessarily mean the last element in the vector, there could be some variable which keeps track of what it considers to be the "last element" in the vector, this is not important though as the point here is that the addresses that the pointers are pointing at are being swapped between the pointers. Again some code to clarify what I mean:
void processPointer(int pointerId)
{
//do some operations on foo object being pointed at with fooVector[pointerId]
//Swap the addresses between two pointers
std::swap(fooVector[pointerId], fooVector[lastElement]);
}
Now as far as I am concerned, the next time the vector is read the pointers are still adjacent in memory so there is no change there. What they are pointing at becomes unordered with time though, the first element in the vector could be pointing at the third element in the array allocated from the heap and the third element in the vector could be pointing at the tenth element in the array and so on. How does this affect cache locality when the values, which the pointers are pointing at, are read when everything is unordered? I realize that the values that are being pointed at still belong to the same continuous block but a cache line doesn't consist of unlimited memory.
To elaborate on the last sentence above I see it this way: say we end up in a situation where fooVector[0] points at foo[3] and fooVector{1] point at foo[10]. Now when I go through my vector, step by step and perform calculations on the values being pointed at then when I process fooVector[0] I could perhaps be reading foo[3], foo[4] and foo[5] (perhaps something before foo[3] as well, I don't honestly know if it looks "behind" it or not) and that might be as much as I could fit in one cache line. When I process fooVector[1] then foo[10] will not be in the cache and thus I would be incurring a cache miss. I realize this entirely depends on the actual size of my objects (a foo object in this case) but this is how I thought of it.
I could be wrong about many things here, I'm fairly new when it comes to considering the cache when coding so I could have assumed many things which are wrong, if so please let me know.