# Programming Challenge 1

This topic is 3772 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Optimize the following code and give detailed reason.
// Initialize iloveseven to an array of 7s
int i, j;
int iloveseven[1024][1024];

for(i = 0; i < 1024; i++) for(j = 0; j < 1024; j++) iloveseven[j][i] = 7;


Hints: obviously the code has no logical problem. Try to think in a remote area of computer programming. EDIT: more hints: paging, int has size of 4 bytes EDIT2: Check answer on page 2 of this thread [Edited by - TimChan on August 15, 2007 8:43:17 PM]

##### Share on other sites
You could pre-increment for a start :P. Though the compiler most likely sorts that out.

##### Share on other sites
Quote:
 Original post by DaveYou could pre-increment for a start :P. Though the compiler most likely sorts that out.

##### Share on other sites
Optimize for what — space? Performance?

First thought (probably not what you're looking for, though; technically, I guess it also doesn't function in exactly the same way in that it isn't possible to overflow):

template <typename T> class FilledContainer{    T fill;public:    FilledContainer(T Fill) : fill(Fill)    {}    T operator[](unsigned long)    {        return fill;    }};// ...FilledContainer<int> foo(7);

##### Share on other sites
iloveseven[i][j] has better cache coherency, since i changes less frequently than j.

TheUnbeliever: That's cute, I like that.

##### Share on other sites
cant you just memset that entire block of memory?

##### Share on other sites
Can you go from i = 0 to i = 1048576 and hop through *iloveseven setting each to 7?

or just iloveseven[1024][1024] = {7}; //can't remember if that works how I'm thinking it does

##### Share on other sites
// Initialize iloveseven to an array of 7sshort i, j;char iLoveseven[1024][1024];for(i = -1; i < 1024; ++i) for(j = -1; j < 1024; ++j) iLoveseven[i][j] = 7;

##### Share on other sites
You are misusing the cache.
Try this:
for(j = 0; j < 1024; j++) for(i = 0; i < 1024; i++) iloveseven[j][i] = 7;

##### Share on other sites
Quote:
 Original post by jpetrieiloveseven[i][j] has better cache coherency, since i changes less frequently than j.TheUnbeliever: That's cute, I like that.

So what exactly cache coherency means? Your answer is correct and expected but I think the explanation is incomplete.

##### Share on other sites
Quote:
 Original post by InsaneBoarder234 *** Source Snippet Removed ***

for(i = -1; i < 1024; ++i) for(j = -1; j < 1024; ++j) iLoveseven[i][j] = 7;

Except that means that you die painfully immediately when you try to access iLoveseven[-1][-1]

##### Share on other sites
Obviously, more constraints on what you're allowed to do to the code to optimize it would have been helpful. Converting iloveseven[1024][1024] into [1024*1024] and using one loop would have been my choice -- solves the cache coherency issue, and I like 1D arrays better anyway (but that array dimension isn't a concern here).

These sorts of things are 99% of the time obnoxious micro-optimizations. Profile, profile, profile.

Quote:
 cant you just memset that entire block of memory?

You'd want to use fill_n or similar, not memset, but yes, that also achieves the solution of improving cache coherency.

Quote:
 or just iloveseven[1024][1024] = {7}; //can't remember if that works how I'm thinking it does

If there are few initializer clauses in an initializer list than there are members in the aggregate, the remaining members are value-initialized (that is, zerod). It won't work here, but if you wanted to zero-initialize the array it would.

Quote:
 So what exactly cache coherency means?

It means you're performing the access in a coherent fashion, in this case linearly stepping across the entire region of memory instead of jumping about in 1024-element chunks. I appreciate that you're from Hong Kong, and English may not be your native language, but the term is generally fairly sufficiently understood. Would you like me to define cache, too?

##### Share on other sites
Quote:
 Original post by TimChanSo what exactly cache coherency means? Your answer is correct and expected but I think the explanation is incomplete.
You should know, you're apparently the one applying for the job and using jpetrie's answers to sleaze past the interview.

##### Share on other sites
you could also unroll the loop to reduce overhead

##### Share on other sites
Quote:
Original post by Ravuya
Quote:
 Original post by TimChanSo what exactly cache coherency means? Your answer is correct and expected but I think the explanation is incomplete.
You should know, you're apparently the one applying for the job and using jpetrie's answers to sleaze past the interview.

hehe

##### Share on other sites
Quote:
 Original post by TheUnbelieverExcept that means that you die painfully immediately when you try to access iLoveseven[-1][-1]

Eeee... I think I read a while ago that ++i incremented i before it was used by the for loop and I just took it as that, it seems I was wrong though!

##### Share on other sites
Quote:
 Original post by jpetrieiloveseven[i][j] has better cache coherency, since i changes less frequently than j.TheUnbeliever: That's cute, I like that.

It took me a couple of looks at iloveseven[i][j] and iloveseven[j][i] to even realize you changed something. Just saw the iloveseven[?][?] and assumed iloveseven[i][j]. *hits self for being stupid and not paying attention*

##### Share on other sites
Quote:
Original post by Ravuya
Quote:
 Original post by TimChanSo what exactly cache coherency means? Your answer is correct and expected but I think the explanation is incomplete.
You should know, you're apparently the one applying for the job and using jpetrie's answers to sleaze past the interview.

LOL Of course no, I'm already a paid web programmer. And I don't think such question would appear in job interview since it is about low-level stuff or OSDev...

##### Share on other sites
Quote:
 Eeee... I think I read a while ago that ++i incremented i before it was used by the for loop and I just took it as that, it seems I was wrong though!

++i behaves the way you think, that's not the problem. The problem is that the loop executes the body first before the increment statement is executed. You can still preincrement the iteration index, you just still want to have the index initialized to 0 to begin with so you don't execute the bodies with i and j equal to -1.

##### Share on other sites
Ah ok I understand. Ah well, ta for clearing that up :)

##### Share on other sites
Quote:
Original post by jpetrie
Quote:
 cant you just memset that entire block of memory?

You'd want to use fill_n or similar, not memset, but yes, that also achieves the solution of improving cache coherency.

Indeed. Optimized for style:

const int size = 1024;// Initialize iloveseven to an array of 7sint iloveseven[size][size];std::fill_n(iloveseven, size * size, 7);

##### Share on other sites
iloveseven.cpp :

int iloveseven[1024][1024] =
{
{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, ..., 7, 7, }
{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, ..., 7, 7, }
{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, ..., 7, 7, }
....
{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, ..., 7, 7, }
};

fin.

GENiloveseven.cpp :

// generate iloveseven.cpp
...

fin.

This is a fast iloveseven filling no ? May we consider that the asm code generate from iloveseven.cpp fill a 1024*1024 memory array fastly ?

##### Share on other sites
Quote:
Original post by Zahlman
Quote:
Original post by jpetrie
Quote:
 cant you just memset that entire block of memory?

You'd want to use fill_n or similar, not memset, but yes, that also achieves the solution of improving cache coherency.

Indeed. Optimized for style:

const int size = 1024;// Initialize iloveseven to an array of 7sint iloveseven[size][size];std::fill_n(iloveseven, size * size, 7);

I don't know, but fill_n could actually have been implemented as a loop (I suspect they use different strategies for different cases).
And then there are the optimizations done by the compiler. Relative small arrays may get inlined like:

array[0] = 7;
array[1] = 7;
array[2] = 7;

##### Share on other sites
Quote:
 Relative small arrays may get inlined like:

Perhaps, but why worry?

Optimizing for what you think the compiler will generate in terms of machine code is fruitless. Unrolling the loop may or may not be done by the compiler, so it shouldn't be considered a reasonable "optimization" to attempt to the C++ code.

The only real sane optimization you can make is the observation that the algorithm in question suffers from poor locality (it doesn't traverse the storage linearly). Worrying about unrolling the loop or other things where you try to cajole the compiler into generating better assembly is problematic because they're not general-purpose solutions, whereas improving the algorithm is.

Besides, it's also worth nothing that if the array is small enough to be trivially manually inlined, it's probably small enough that the cache coherency issue isn't relevant. That applies mainly to large datasets.

##### Share on other sites
Quote:
Original post by jpetrie
Quote:
 Relative small arrays may get inlined like:

Perhaps, but why worry?

I don't. Not in real life, at least.

It's a challenge, I have no idea how sane the solutions have to be. Nor do I know if it should be optimized for size or for speed (I suspect the latter), or whether the optimizations are allowed to be limited to a single platform. I merely indicating that the gains in performance (if taken to the extreme) can be different from what could be expected by reading the code.

##### Share on other sites

This topic is 3772 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Create an account

Register a new account

• ### Forum Statistics

• Total Topics
628682
• Total Posts
2984213

• 11
• 13
• 13
• 9
• 10