Archived

This topic is now archived and is closed to further replies.

Representing sets and subsets

This topic is 5796 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Recommended Posts

I''m toying around with som graphalgorithms and realized that I need some way to represent sets. But I''m not really sure how to code it... I want to be able to do operations like union,intersection, merge,member and so on. First I was just thinking of a bitvector, a big array that is true if the number is a member. member and insert operations should work on constant time. union, intersection and operations like that would work on linear time, n, if the array is n long. This is fine but it might waste quite a lot of memory since the size is proportional to the sets maxsize. A linked list wouldn''t waste so much memory but it''ll slow down many of the operations and might need to be sorted in some way. Are there other ways to do this? How is this usually done? Thanks in advance, P2

Share on other sites
What language are you coding in?

Share on other sites
I''m coding in c++

/P2

Share on other sites
I would think then that the STL vector class would be a good basis for a set since it is resizable. Create your own class that inherits from the vector class and adds in the methods (set operations) that you need.

Cheers,

Timkin

Share on other sites
If you''re using STL, might as well use STL''s set

Share on other sites
If you expect your sets to be dense, arrays of bits (std::vector) are a good way to represent them. If you expect them to be sparce, a red-black tree of numbers (std::set) is better. If you expect them to be just a few elements, a sorted vector of ints is better.

S = "size of your range of numbers"
N = "number of elements in the set"

The time required for each operation (in big-O notation) would be:
Operation        vector    set      vectorMembership test      1            log N           log NInsert               1            log N             NDelete               1            log N             NNegation             S              S               SUnion                S      (N1+N2)*log(N1+N2)?   N1+N2Intersection         S      (N1+N2)*log(N1+N2)?   N1+N2

The times can be affected by very different constants, so every representation has its niche. You will have to time your code.

If you really like trouble (and speed) you can use several representations, using for each particular subset the most convinient. But then union and intersection have to be implemented for the n^2 possibilities. The times for union would be something like
                                        set 2                       ---------------------------------------                       vector    set      vector      | vector        S             N2             N2set 1 | set            N1     (N1+N2)*log(N1+N2)?  N1+N2      | vector         N1           N1+N2          N1+N2

I am making these tables as I go, so they might be wrong.

Share on other sites
Aaargggh! Let''s try using "source" instead of "code".

  Operation        vector<bool>    set<int>      vector<int>Membership test      1            log N           log NInsert               1            log N             NDelete               1            log N             NNegation             S              S               SUnion                S      (N1+N2)*log(N1+N2)?   N1+N2Intersection         S      (N1+N2)*log(N1+N2)?   N1+N2

                                          set 2                       ---------------------------------------                       vector<bool>    set<int>      vector<int>      | vector<bool>        S             N2             N2set 1 | set<int>            N1     (N1+N2)*log(N1+N2)?  N1+N2      | vector<int>         N1           N1+N2          N1+N2

Share on other sites
Thanks for that very enlightening information alvaro!

Timkin

1. 1
Rutin
25
2. 2
3. 3
JoeJ
19
4. 4
5. 5

• 10
• 11
• 9
• 9
• 10
• Forum Statistics

• Total Topics
631753
• Total Posts
3002098
×