Jump to content
  • Advertisement
Sign in to follow this  
Gage64

Recursively finding all subsets of a set

This topic is 3663 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Given a set (represented as a string), I want to print all it's subsets, and I want to do it recursively. Unfortunately I don't seem to understand recursion all that well. I think the basic idea is that the set of all subsets consists of two groups: the subsets containing the first element, and the ones that don't. So for example, to find all subsets of "ABC", I just need to find all subsets of "BC" and print them twice - once as is and once with A prependded to them. The base case would be the empty set, which only has itself as a subset. I've been able to solve it like this:
void subsets(const string &s, list<string> &subs) {
    if (s == "") {
        subs.push_back("");
        return;
    }

    subsets(s.substr(1), subs);

    int size = subs.size();
    list<string>::iterator it = subs.begin();
    for (int i = 0; i < size; ++i, ++it)
        subs.push_back(s[0] + *it);
}


void print(const list<string> &lst) {
    list<string>::const_iterator it = lst.begin();
    for (; it != lst.end(); ++it)
        cout << *it << endl;
}


void subsets(const string &s) {
    list<string> subs;
    subsets(s, subs);
    print(subs);
}
But using the list feels like I'm missing the point of recursion because the runtime stack already stores intermediate results. I just can't figure out how to use them. Any help would be appreciated. This is not a homework question, but I still prefer that you guide me towards the solution instead of just giving it to me. Thanks in advance.

Share this post


Link to post
Share on other sites
Advertisement
Well the way you are describing it is actually pretty spot on. An exhaustive discovery of subsets of a sequence consisting of an element X, and the rest of the elements, is the set of subsets of the rest of the elements with X both included and not included. So in a conceptual way, you're golden. The idea of recursion in this context though is to push the use of the data deep into the function call tree, rather than pulling the results up to the surface.

Suppose though that you really do just want to print them, and don't really care about using them elsewhere. You don't actually have to construct a list then, since you don't care what an answer is after you have printed it out [you don't need old solutions]. Consider something a bit more like this then:
Discovery(string partialSolution, string remainingSolution)
{
if(remainingSolution has length > 0)
{
//the case of including this element in the subset
Discovery(partialSolution + remainingSolution[0], remainingSolution - first element)
//the case of not including this element in the subset
Discovery(partialSolution, remainingSolution - first element)
}
else
{
//there are no more elements to examine, so we have a subset in some form
print out (partialSolution) // which in this case is in fact one complete subset
}
}
Discovery(string setToCheck){ Discovery(empty string, setToCheck) }
It is a matter of exchanging the base case, which is the tricky part of recusion, and pushing the printing deep into the function call tree rather than pulling the solutions to the surface and printing them there. [clearly this is not C++ code, but it should give you a solid idea of what is going on].

*edit* fixed a sloppy code typo.

[Edited by - Drigovas on November 30, 2008 2:15:03 AM]

Share this post


Link to post
Share on other sites
As Drigovas said, the recursive algorithm you've described is a good one. The biggest advantage of recursion to me is the conceptual clarity with which it expresses algorithms. Unfortunately, in certain languages like C++, implementing recursive strategies, as you've seen here, is often cumbersome and quite the opposite of elegant. If you're not careful, recursion occasionally will also generate processes that are asymptotically inefficient. A decent compiler, however, will optimize tail-recursion (in which the recursive call is the final action taken by the function) so that it behaves as fast as or faster than its iterative counterpart.

Consider, for example, that this algorithm can be written in Scheme as the following wonderful procedure:

(define (subsets s)
(if (null? s)
'()
(let ((rest (subsets (cdr s))))
(append rest (map (lambda (x) (cons (car s) x)) rest)))))


Recursion makes certain algorithms, such as the one above, trivial to understand once you've mastered the following trick. While writing a recursive function, assume the function is already correctly implemented for all inputs of "smaller" size. Don't even try to imagine in your mind the way the compiler or interpreter will unravel all the recursive calls. The meaning of "smaller" depends on context; in this case, in implementing subsets for sets of size n, we assume it works for sets of size n - 1, n - 2, n - 3, ..., 3, 2, 1.

As long as you can show that the recursion will always reduce to some base case, you're done. Often this is trivial because each recursive call reduces the input size; it will then be guaranteed to hit some "minimum" (the meaning again depending on context). In this case, each recursive call reduces the size of the set by one (by breaking off the first element or, in the case of the Scheme code, taking the cdr of the original list).

Share this post


Link to post
Share on other sites
Thank you for replying. I tried implementing your solution but it doesn't seem to work. It only prints the full set. Can you see anything wrong with it?

void subset(string partial, string remaining) {
if (remaining.length() > 0) {
subset(partial + remaining[0], remaining.substr(1));
subset(partial, remaining.substr(1));
} else {
cout << partial << endl;
}
}


int main() {
subset("ABC", "");
}


Quote:
It is a matter of exchanging the base case, which is the tricky part of recusion, and pushing the printing deep into the function call tree rather than pulling the solutions to the surface and printing them there.


I think this is the part that's giving me trouble. Could you please explain this a bit more?

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
Thank you for replying. I tried implementing your solution but it doesn't seem to work. It only prints the full set. Can you see anything wrong with it?
Yeah, I had a typo [that I have now fixed], sorry. On the initial call to Discovery, it would be "Discovery(empty string, setToCheck)" rather than "Discovery(setToCheck, empty string)". Sorry.... been a long day.

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
I think this is the part that's giving me trouble. Could you please explain this a bit more?
Sure. Consider your original code. There is a point at which you are adding the discovered solution to a list that you are then maintaining so that whatever caller on the surface can iterate through it to print things out. In short, you HAVE a solution [a subset of this set] right there, at the point when you add it to the list. Thus, instead of adding it to the list, you can just apply whatever function [print] that you want to apply to the solution right there.

Note also that this approach is more efficient, in that you are not actually storing every solution you come across. To illustrate this in a rather extreme example, suppose you have a sequence that is 32 elements long. A complete listing would then contain 2^32 elements, which is a number that should look mightly familiar considering it is the number of possible addresses on 32 bit machines, and considering individual solutions consume more than one byte, and thus span more than a single address, such a solution can't be stored within 32 bit addressable memory. If you analyze the data you are working on, and do what you want with it, and don't store it for later [by pushing the function to where you actually have the answers]. Thus you store only as much as takes to hold a partial solution, which is effectively 32^2 [a tiny number in comparison].

Share this post


Link to post
Share on other sites
Quote:
Original post by Drigovas
There is a point at which you are adding the discovered solution to a list that you are then maintaining so that whatever caller on the surface can iterate through it to print things out. In short, you HAVE a solution [a subset of this set] right there, at the point when you add it to the list. Thus, instead of adding it to the list, you can just apply whatever function [print] that you want to apply to the solution right there.


But I don't add the solution just to print it later. I add it so that the "higher levels" can use it to construct their solutions. If I just print the solution right there, the higher levels won't work.

It seems like the more complicated recursive functions require additional parameters to store intermediate results. You somehow managed to do that with just a single string while I had to use a list, but I have a feeling that it's not really the same thing and I'm missing the core idea behind your solution...

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
But I don't add the solution just to print it later. I add it so that the "higher levels" can use it to construct their solutions. If I just print the solution right there, the higher levels won't work.

May I suggest implementing your own iterator then?

Something along those lines, but since I've never implemented an iterator myself, I'm sure there's lots of things that could be done better:
#include <iostream>
#include <string>

class Set
{
std::string set;
public:
Set(const std::string& set) : set(set) {}

class iterator : public std::iterator<std::forward_iterator_tag, std::string>
{
std::string set;
std::string count;
public:
iterator(const std::string& set) : set(set), count(set.length() + 1, '0')
{
}

iterator(size_t len) : count(len + 1, '0')
{
count[len] = '1';
}

bool operator==(const iterator& rhs)
{
return count == rhs.count;
}

bool operator!=(const iterator& rhs)
{
return count != rhs.count;
}

std::string operator*() const
{
std::string current;
const size_t len = set.length();
for (size_t i = 0; i < len; ++i)
if (count == '1') current.push_back(set);
return current;
}

const iterator& operator++()
{
const size_t len = count.length();
for (size_t i = 0; i < len; ++i)
{
if (++count == '1') break;
count = '0';
}
return (*this);
}

iterator operator++(int)
{
iterator temp(*this);
++*this;
return temp;
}
};

iterator begin()
{
return iterator(set);
}

iterator end()
{
return iterator(set.length());
}
};

int main()
{
Set test("ABCDE");
for (Set::iterator it = test.begin(), end = test.end(); it != end; ++it)
{
std::cout << *it << std::endl;
}
}


This program outputs


A
B
AB
C
AC
BC
ABC
D
AD
BD
ABD
CD
ACD
BCD
ABCD
E
AE
BE
ABE
CE
ACE
BCE
ABCE
DE
ADE
BDE
ABDE
CDE
ACDE
BCDE
ABCDE


[Edited by - DevFred on November 30, 2008 5:42:18 AM]

Share this post


Link to post
Share on other sites
DevFred: Your code looks interesting, but my goal is to understand how this can be implemented using a simple recursive solution, like the one Drigovas presented.

I'm still far from clear on this, so if anyone has anything to add, I would greatly appreciate it.

Share this post


Link to post
Share on other sites
Quote:
Original post by Gage64
my goal is to understand how this can be implemented using a simple recursive solution

Okay, then I suggest that every time you find a subset, instead of adding that subset to a list, you do something. What that something is must be decided by the client (using a C-style function pointer or a C++-style functor).

#include <iostream>
#include <string>

template<typename Fun>
class FindSubSets
{
std::string set;
Fun callback;

std::string prefix;

void find_recursive(int pos)
{
if (pos == set.length())
{
callback(prefix);
}
else
{
prefix.push_back(set[pos]);
find_recursive(pos + 1);
prefix.resize(prefix.length() - 1);

find_recursive(pos + 1);
}
}

public:
FindSubSets(const std::string& set, Fun callback)
: set(set), callback(callback), prefix()
{
find_recursive(0);
}
};

typedef void (*funptr)(const std::string&);

void output(const std::string& s)
{
std::cout << s << std::endl;
}

struct Hollywood
{
void operator()(const std::string& s)
{
std::cout << s << std::endl;
}
};

int main()
{
std::cout << "via function pointer:" << std::endl;
FindSubSets<funptr>("ABCDE", output);

std::cout << "via functor:" << std::endl;
FindSubSets<Hollywood>("ABCDE", Hollywood());
}


Both solutions produce the following output:

ABCDE
ABCD
ABCE
ABC
ABDE
ABD
ABE
AB
ACDE
ACD
ACE
AC
ADE
AD
AE
A
BCDE
BCD
BCE
BC
BDE
BD
BE
B
CDE
CD
CE
C
DE
D
E


Is that what you are looking for?

[Edited by - DevFred on November 30, 2008 2:52:26 PM]

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!