Recursive merge?

Started by
3 comments, last by JakeM 16 years, 10 months ago
I have a list of integer pairs, such as (2,3) (4,1) (1,2) (9,5) (6,7) (5,6) and I want to place these pairs into buckets, such that two pairs have at least one integer in common, but not all pairs in the bucket have to have a common integer: bucket 1: (2,3) (4,1) (1,2) bucket 2: (9,5) (6,7) (5,6) etc... What would be an efficient way to create these buckets? I was thinking maybe I would have to do something like a recursive merge? Because if I added (2,3) to a new bucket in a linear fashion, then checked to see if (4,1) would fit, I wouldn't be able to add it at that point, because (1,2) hasn't been added yet.
Advertisement
The "2 buckets" seems redundant definition.
So does "but not all pairs in the bucket have to have a common integer:" - is this a requirement? If not, it sounds redundant.

Next is the question of input data. Is it always partitionable into 2 sets? What about (1,2), (3,4), (5,6)? Or (1,1),(1,1),(1,1),(1,1)?

One way could be this:

Put all pairs into set S.
Make 2 empty sets A and B.

Define condition C( S, P ) = pair P matches at least one pair in set S.
algorithm ( T, S ){  let X be first element from S  while ( S not empty AND X exists)  {    move X from S into T    let X be first element in T that satisfies C( T, X )  }}

then fill the two sets:
algorithm ( A, S )algorithm ( B, S )

if S is empty, the solution is valid.
if S isn't empty, or B is empty, then input is degenerate.


general solution is then:
let S be all pairslet B be array of sets.let i = 0;while ( S not empty ) {  algorithm( B, S );  i++;}


If i == 2, the solution is valid, and B[0] and B[1] contain the two sets.

Note: set == bucket
If I treat each pair as an edge of a graph, it sounds like you are looking for an algorithm to find all the connected components of a graph. Draw it out as a graph and you can see that 1,2,3,4 are connected to each other, and 5,6,7,9 are connected to each other. Here's some C/pseudocode code that is based on an algorithm to find all the connected components of a graph. I wrote that the graph is represented by a bunch of pairs in some container, so the pairs would need to be stored in a container that can be searched quickly using either integer as a key. You can use adjacency lists or maybe one of the STL containers can be used (assuming you are using C++), I don't know much about STL. Even if this isn't exactly what you are looking for, maybe it can help. Iif you are using arbitrary integers you should be able to get it working based on this with a little tweaking.

/* Have a graph G = (V,E)   V = verticles   E = eges   n = number of vertices   Graph is represented by a container of integer pairs   that can be searched according to one of the integers in the pair   *//*    Basically the algorithm starts with a vertex, finds all its neighbors,   then when no more can be found, it starts again on a new vertex. If that   vertex has already been found in a previous search, it skips it and goes   on to the next one*//*    int component[] is an array of ints, after the function terminates   all vertices in the same component will have the same number   edit: you can create buckets along the way and fill them*/void connected_components( List vertices[], int component[], int n){    for( int i = 0; i < n; ++i )        component = 0;    int comp_num = 0;    for( int i = 0; i < n; ++i ) {        if( component == 0 ) {            ++comp_num;            // create a new bucket here            depth_first_search(i,comp_num, component);        }    }}void depth_first_search( int vertex, int comp_num, int component[] ){    component[vertex] = comp_num;    //find all pairs that contain vertex    // for each pair (vertex, w) do    // if( component[w] == 0 ) {    //      depth_first_search(w, comp_num, component);    //      edit: here you know you have a pair that is in the same component    //      so you can add (vertex,w) to the bucket    //  }}
Although I vaguely smell homework...

typedef std::pair<int, int> pair;typedef std::vector<pair> pairlist;class Bucket {  pairlist pairs;  std::set<int> common;  bool accept_helper(const pair& p) {    bool found = common.find(p.first) != common.end();    if (found) { common.insert(p.second); }    return found;  }  public:  Bucket(const pair& first) {    pairs.push_back(first);    common.insert(first.first);    common.insert(first.second);  }  bool accept(const pair& p) {    bool accepted = accept_helper(p) ||                     accept_helper(std::make_pair(p.second, p.first));    if (accepted) { pairs.push_back(p); }    return accepted;  }  // So you can get at results  pairlist::iterator begin() { return pairs.begin(); }  pairlist::iterator end() { return pairs.end(); }}std::vector<Bucket> process(const pairlist& pl) {  typedef std::vector<Bucket> bucketlist;  bucketlist result;  for (pairlist::iterator pit = pl.begin(); pit != pl.end(); ++pit) {    pair& p = *pit;    bool placed = false;    for (bucketlist::iterator bit = result.begin(); bit != result.end(); ++bit) {      if (bit->accept(p)) { placed = true; break; }    }    if (!placed) {      result.push_back(Bucket(p));    }  }  return result;}
Quote:Original post by Antheus
Next is the question of input data. Is it always partitionable into 2 sets? What about (1,2), (3,4), (5,6)? Or (1,1),(1,1),(1,1),(1,1)?


nicksterdomus is right, you can treat this as an edge graph, so (1,1) is invalid.
The goal is creating containers of connectivity.
And there's no limit on how many containers are created. As many as necessary.
The max. limit is not set beforehand. They are created as needed.
And there's always a solution, a single pair can be placed by itself if it's not connected.
So you can say in the worst case, if nothing is connected, the number of containers created will
always equal the number of pairs.

Quote:nicksterdomus
If I treat each pair as an edge of a graph, it sounds like you are looking for an algorithm to find all the connected components of a graph.


Correct.


Quote:Zahlman
Although I vaguely smell homework...


Not even close. :)

This topic is closed to new replies.

Advertisement