Archived

This topic is now archived and is closed to further replies.

P2

Representing sets and subsets

Recommended Posts

I''m toying around with som graphalgorithms and realized that I need some way to represent sets. But I''m not really sure how to code it... I want to be able to do operations like union,intersection, merge,member and so on. First I was just thinking of a bitvector, a big array that is true if the number is a member. member and insert operations should work on constant time. union, intersection and operations like that would work on linear time, n, if the array is n long. This is fine but it might waste quite a lot of memory since the size is proportional to the sets maxsize. A linked list wouldn''t waste so much memory but it''ll slow down many of the operations and might need to be sorted in some way. Are there other ways to do this? How is this usually done? Thanks in advance, P2

Share this post


Link to post
Share on other sites
I would think then that the STL vector class would be a good basis for a set since it is resizable. Create your own class that inherits from the vector class and adds in the methods (set operations) that you need.

Cheers,

Timkin

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
If you''re using STL, might as well use STL''s set

Share this post


Link to post
Share on other sites
If you expect your sets to be dense, arrays of bits (std::vector) are a good way to represent them. If you expect them to be sparce, a red-black tree of numbers (std::set) is better. If you expect them to be just a few elements, a sorted vector of ints is better.

S = "size of your range of numbers"
N = "number of elements in the set"

The time required for each operation (in big-O notation) would be:

Operation vector set vector
Membership test 1 log N log N
Insert 1 log N N
Delete 1 log N N
Negation S S S
Union S (N1+N2)*log(N1+N2)? N1+N2
Intersection S (N1+N2)*log(N1+N2)? N1+N2


The times can be affected by very different constants, so every representation has its niche. You will have to time your code.

If you really like trouble (and speed) you can use several representations, using for each particular subset the most convinient. But then union and intersection have to be implemented for the n^2 possibilities. The times for union would be something like

set 2
---------------------------------------
vector set vector
| vector S N2 N2
set 1 | set N1 (N1+N2)*log(N1+N2)? N1+N2
| vector N1 N1+N2 N1+N2


I am making these tables as I go, so they might be wrong.

Share this post


Link to post
Share on other sites
Aaargggh! Let''s try using "source" instead of "code".


  
Operation vector<bool> set<int> vector<int>
Membership test 1 log N log N
Insert 1 log N N
Delete 1 log N N
Negation S S S
Union S (N1+N2)*log(N1+N2)? N1+N2
Intersection S (N1+N2)*log(N1+N2)? N1+N2



  
set 2
---------------------------------------
vector<bool> set<int> vector<int>
| vector<bool> S N2 N2
set 1 | set<int> N1 (N1+N2)*log(N1+N2)? N1+N2
| vector<int> N1 N1+N2 N1+N2

Share this post


Link to post
Share on other sites