Dealing with Cache misses

Started by
7 comments, last by LonelyStar 18 years, 10 months ago
Hi everybody, I just ran "cachegrind" over my Zertz-AI programm (Zertz is a Board game). Now, I have something like this:

struct TableEntry
{
    long long key;
    int flag;
    int value;
    Move move;
};
#define TABLE_SIZE >huge number<
TableEntry Table[TABLE_SIZE];

int TestPosition(long long key,int alpha, int beta,Move* m)
{
    TableEntry* entry=Table[key % TABLE_SIZE];
    if(entry->key == key)            //<-Very very many cache misses here
    {
       if(entry->flags == EXACT_VALUE) //<-Cache misses here
          return entry->value;         //<-Cache misses here
       if(entry->flags == ALPHA_VALUE)&&(entry->value<=alpha))//<-Cache misses here
          return alpha;
       if ((entry->flags == BETA_VALUE)&&(entry->value>=beta))//<-Cache misses here
          return beta;
       (*m)=entry->move;  //<-Cache misses here
       return MOVE_ONLY;
    }
    return UNKNOWN_POSITION;
}

Now I want to get rid of those cache misses :(. The first one, I guess, I can not deal with beacause I just do not know which entry I will need. But is not there a way to preload the whole entry structure ones I known which one I need? Do you see a way of optimizing this function? (it uses the most CPU time according to gprof). Thanks in advance! Nathan (EDIT: Fixed your broken source tag -- Kylotan.) [Edited by - Kylotan on June 18, 2005 7:27:22 AM]
Advertisement
What's with the pointer/value?

TableEntry Table[TABLE_SIZE];

TableEntry* entry=Table[key % TABLE_SIZE];
________^^^

(Or am I missing something?)
deffer's point is important: which did you get wrong, the array definition or the first line of TestPosition? If you copy the entire TableEntry instead of just getting a pointer to it, it will cut down on the indirection and possibly cache misses.

However, I can't see why it would give you a cache miss when you access the same variable twice (ie. entry->flags). Although following the pointer probably involves going to an uncached place in memory, I can't see why the problem would arise a second time when you do the same thing. That sounds like an erroneous result from cachegrind, unless perhaps EXACT_VALUE, ALPHA_VALUE, and BETA_VALUE are not compile-time constants.

Have you considered inlining TestPosition()? It looks like a good candidate for inlining as it doesn't have any loops, and you pass it a pointer to m only to use that to fill in the original object.
Hi, thanks for your answers.
I got the first line of "TestPosition" wrong, it should be:
TableEntry* entry=&(Table[key % TABLE_SIZE]);

You think it is better to do:

Table entry=Table[key % TABLE_SIZE];

?
I thought that that would be bad, because I would have to copy the whole structure every time ...
Have not tried to inline the function, will try that as soon as I get home.
It's just that this function takes 13% of execution time, while other function which are longer (=I would think they take longer) are taking below 10% of execution time. Do you think operating with a "long long" could be a problem?

In general: Is it possible to do prefatching in C++?
I have a structure for the Game-Board which looks like:

char Board[11][11];

and is used often at many places. And as often, as it is used, it causes Cache-Misses. If I am not misinformed (while I have been misinformed often before), the cache should be more than big enough to be able to hold the "Board" the whole time. Can I tell the compiler to do that?
Thanks!
Nathan
Copying a structure may seem bad, but if cache-misses are your main problem, then it's probably preferable. What is worse - several cache misses like you're getting now, or one cache miss, copying 30 or 40 bytes of data, and then no cache misses? It's something to try, anyway. At this sort of optimisation level it's all about experimentation.

If you're using gcc, look at ' __builtin_prefetch' here. Cache manipulation is not a standard C++ procedure though.

As for your data fitting into the cache... you probably have several levels of cache and if TABLE_SIZE is truly huge then I doubt it would fit into the very fastest (and smallest) level.
Hey thanks! I will experiment with ' __builtin_prefetch'!
About prefetching may Board: I did not mean to prefetch the Table of size "HUGE_NUMBER" but the Board:
char Board[11][11];
which has a size of 11*11=121 bytes. It is a structure which is almost always in use! Should I look for good positions for prefetching it? Or is there another way?
Sorry, I misread what you meant, because you didn't use the board in the code. The prefetch function is really up to you to use properly. I think the idea is that you call it a few instructions before you need the data, so that you can be doing a little local processing while that data is being prefetched in parallel.

The main thing is to achieve locality of reference. If you can get the function to stop dereferencing pointers then you'll have fewer cache misses. You might also want to try looking at the disassembly and seeing what sort of code it creates. Or reordering the branches so that the most likely one is first. Also, I didn't notice you mention this the first time, but yes a long long is likely to be slower than an int on a 32-bit platform. I expect that comparisons are going to be quick but arithmetic will be slow, with division the slowest, and you perform a division of a long long on the first line. I'm not sure why you do that, to be honest. Anyway, if you don't need keys bigger than 2^31, use a plain int.
I suspect he uses 64-bit keys because there are 64 squares.

However, if this is a zobrist hash, then there is absolutely no reason to relate game states to hash key size. If there does seem to be a reason to use keys of this size, then Zobrist isnt being implimented correctly. For example, trying to XOR bit-boards together to make a hash isnt a zobrist hash.

- Rockoon (Joseph Koss)
Mmh ... I use Zobrist Keys as described here: http://www.seanet.com/~brucemo/topics/zobrist.htm[\url]<br>But I am not sure what you mean. My Board is not 64 squares in size. But I do XOR positions together to change my Zobrist key for the new "Board-State". I use 64 bit beacause the site mentioned above said so. If this is all completly wrong, it would be very kind of you to tell what is wrong.<br>Thanks!<br>Nathan

This topic is closed to new replies.

Advertisement