• Advertisement
Sign in to follow this  

Comparing two arrays

This topic is 4184 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I have two arrays of objects. I need to compare the two arrays to find out which items are contained in both lists, or which items are unique to a list. I decided that the fastest way to do this would be to have both arrays sorted in some order, but this takes time to sort them, so would it be better to just compare unsorted arrays? Any algorithms for comparing arrays that anyone knows about? If I do sort them what is the fastest algorithm to sort numeric data into accending order?

Share this post


Link to post
Share on other sites
Advertisement
You have to do some asymptotic analysis. Of course, if you are working with a small sample set this may not be correct (but in that case speed doesn't matter).

Take for example two balanced trees. Inserting all of the elements takes O(nlog(n)) time for both. Then, continue to remove the root of the first set of elements and search the second tree for that element. This will take O(1) (I think?) time to remove the root and O(log(n)) time to search. If you can't find it in the second tree it is unique to the first set. When the first tree is completely consumed, the remaining elements in the second tree will be unique to the second set.

The running time will be O(1)*(the number of elements in the first set) + O(log(n))*(the number of elements in the first set). (Rememeber that searching terminates when the first tree is empty.)

This is essentially: O(n) + O(nlog(n)), or, more simply, O(nlog(n)).

Your algorithm proposes just blindly checking every element in the first against the second. That will be O(n) time for every search in the second group. This must be performed for every element in the first group, thus, O(n^2) time.

I hope this has shown you how algorithms can speed things up. There may be better approaches, but, I think for now this is pretty reasonable.

J

Share this post


Link to post
Share on other sites
Quote:
Original post by orryx
I have two arrays of objects. I need to compare the two arrays to find out which items are contained in both lists, or which items are unique to a list. I decided that the fastest way to do this would be to have both arrays sorted in some order, but this takes time to sort them, so would it be better to just compare unsorted arrays? Any algorithms for comparing arrays that anyone knows about?


Sorting them will be faster.

Quote:
If I do sort them what is the fastest algorithm to sort numeric data into accending order?


Quicksort is usually very fast. Insertion sort does very well on small or almost sorted arrays. Combine them and you get Introsort, which is typically found in libraries. Shell sort is extremely fast on small arrays. Merge sort is good for large arrays.

As jyk pointed out, in C++, you can sort with std::sort and then compare with std::set_intersection and std::set_difference.

Share this post


Link to post
Share on other sites
Quote:
Original post by Fruny
Merge sort is good for large arrays.


And linked lists :) Actually, merge sort is my personal favourite, because I know how to implement it non-recursively (but can't claim the same about quicksort).

Share this post


Link to post
Share on other sites
Quote:
Original post by orryx
I have two arrays of objects. I need to compare the two arrays to find out which items are contained in both lists, or which items are unique to a list. I decided that the fastest way to do this would be to have both arrays sorted in some order, but this takes time to sort them, so would it be better to just compare unsorted arrays? Any algorithms for comparing arrays that anyone knows about? If I do sort them what is the fastest algorithm to sort numeric data into accending order?
C++ is it?

Yeah okay, at the moment you have two arrays, but if your goal is what you describe, then that isn't ideal. Ideally you'd have two std::sets. If at all possible, do away with the arrays and just put these items straight into sets when they are obtained. If not, then simply read the items from the arrays and insert them into the appropriate sets now.

Then it is a simple matter of performing two calls to set_difference and you have your two resulting sets. If you want them all in one set then you should instead call set_symmetric_difference. This should be pretty good speedwise.
If you need help with std::set, just ask.

I'm assuming you don't have duplicates within a single array to start with.

Share this post


Link to post
Share on other sites
Quote:
Original post by Fruny
Sorting them will be faster.


Although it might not be a possibility (in C++, for instance), a hash table is even faster for this purpose (results in linear-time processing).

Share this post


Link to post
Share on other sites
Quote:
Original post by Fruny
Quicksort is usually very fast. Insertion sort does very well on small or almost sorted arrays. Combine them and you get Introsort, which is typically found in libraries.


Introsort is a quicksort that detects O(n2) behavior and changes to a heap sort if that's the case.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement