# Optimizing this algorithm

## Recommended Posts

MOVSW    132
I have three large arrays. Data in all of them is ordered in ascending order. I'd like to be able to quickly find occurance of a same number in all three arrays. Let's say I start with a basic straightforward algorithm.
int a[sizeA],b[sizeB],c[sizeC];
for(int ai=0;ai<sizeA;ai++)
{
for(int bi=0;bi<sizeB;bi++)
{
for (int ci=0;ci<sizeC;ci++)
{
if((c[ci]==b[bi]) && (b[bi]==a[ai]))
{
cout<<"Match found";
break;
}
}
}
}


This algorithm has complexity of Big O(n^3). Which is just about the worst solution. I can decrease complexity by making this algorithm intelligent by taking advantage of its sort order by moving index pointer of BI and CI depending on the value and sort order. Problem is that my co-worker tells me that I can solve this task with Big O(n) complexity. But I just don't see it. Binary search is out of the question as it has lousy efficiency comparing to O(n). Any ideas to improve efficieny to O(n)?

##### Share on other sites
Krumble    151
I don't quite understand your question, but binary search is O(logn) which is so much better than O(n).

##### Share on other sites
Guest Anonymous Poster
Any searching algorithm will not be O(n). Are there any keys to each item that are used for hashing? Hashing is o(n).

##### Share on other sites
Sneftel    1788
Here's a simpler problem. Solving this problem will make it easy for you to solve your problem.

You have two stacks of cards. Each stack contains lots and lots of cards, each with a number from 1 to a billion. Each stack is sorted from lowest to highest. Your task is to determine if the two stacks have any cards in common.

But there's a complication. You can only see the card on top of each stack at any one time (no rifling through the stack). And once you take a card off the top of a stack, you have to throw it away and never think about it again. In other words, you only get one pass through the stacks.

How do you do this?

##### Share on other sites
MOVSW    132
I'll look into hashing. Looks like i'll have to scrap that idea. It's a non-essential part anyways. Spent to much time on this today already. Got a headache...

##### Share on other sites
kaysik    382
Quote:
 Original post by SneftelHow do you do this?

oooo - pick me pick me *jumps up and down with hand in the air*

##### Share on other sites
Karg    133
Well, Sneftel has the answer for you, but why didn't you just ask your coworker how to do it? If they know how, then it would be a lot faster than posting here and waiting for response.

karg

##### Share on other sites
Ravyne    14300
- Compare the first 3 values for a match.
- Within a loop begin at the beginning of each array.
- Chose the largest result as a baseline and increment the index of the other arrays independantly until their value is greater or equal to the chosen baseline value. Compare for a match after each increment in index.
- repeat until the arrays are exhausted.

OR:

- Within a loop begin at the beginning of each array.
- Compare current three values for match
- Increment the index of the array with the smallest value, but don't overflow it. Establish a Tie-breaking rule for multiple same values.
- Repeat until all arrays are exhausted.

The second method is a cleaner implementation.

[EDIT] - Added the bit about comparing after each increment and put the first comparison before the loop.

##### Share on other sites
MOVSW    132
My coworker is sort of an asshat, i bet this algorithm is probably not even optimizable to O(n). All three arrays have different sets of sorted numbers. There might not even be a number that occurs in all arrays. That's why I need to do this.

##### Share on other sites
kaysik    382
Quote:
 Original post by KargWell, Sneftel has the answer for you, but why didn't you just ask your coworker how to do it? If they know how, then it would be a lot faster than posting here and waiting for response.

where's the fun in that?

##### Share on other sites
HellRaider    122
Quote:
 Original post by MOVSWMy coworker is sort of an asshat, i bet this algorithm is probably not even optimizable to O(n).

How can you bet that if someone has ALREADY explained how to do it in O(n) in this very thread?

##### Share on other sites
MOVSW    132
I guess I took to long to write that post.

##### Share on other sites
Ravyne    14300
As a side-note: you can early-out when the values of the 2 remaining arrays are larger than the value of first array to reach the end.

If you're implimenting it as I stated, go with the second, cleaner one. The first implimentation is actually pretty cracked-out in comparison. It was just the first way that came to mind. I'd drop it like a bad habit in favor of the second.

##### Share on other sites
s_p_oneil    443
Actually, using a binary search on the two inner loops would increase the speed quite a bit. It would be O(a*logb*logc), which is a lot better than O(a*b*c). (You should also make sure the outer loop works with the smallest array.) However, your co-worker is right, there is an even better way to do it.

bool bDone = false;int ai=0, bi=0, ci=0;while(!bDone){  if(a[ai] == b[bi] && b[bi] == c[ci])    break;  else  {    if(a[ai] <= b[bi] && a[ai] <= c[ci])      ai++;    else if(b[bi] <= a[ai] && b[bi] <= c[ci])      bi++;    else if(c[ci] <= a[ai] && c[ci] <= b[bi])      ci++;    if(ai, bi, or ci went out of bounds)      bDone = true;  }}if(!bDone)  cout << "Match found";

I believe that should do the same thing, and it's O(a+b+c). The algorithm takes advantage of the fact that all three arrays are sorted, and it keeps bumping the index that has the smallest value up until all three match or the end of an array is hit. It would also be easy to modify this algorithm to return a list of matches. Instead of breaking out of the loop, just add the three indices to the end of a list.

EDIT: This is the same solution Ravyne posted, but you didn't seem to notice that he had provided the solution.
EDIT: Either that, or I didn't notice what you meant by "I guess I took to long to write that post." ;-)