Jump to content
  • Advertisement
Sign in to follow this  
godecho

Run-time dynamic sort ordering

This topic is 3762 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

The Goal: Come up with a std::sort friendly generic functor that allows for a (run-time) dynamic member-wise comparison of an object. e.g., given:
struct Foo
{
    int a, b, c;
};
vector<Foo> foos; sort(foos, mycomparisonobject()); If foos[0].a == foos[1].a, then compare .b, and if those are equal, compare .c... and so on for any given permutation of a, b, and c. The Motivation: Primarily my own curiosity. I've often wondered how one would sort a database table on any given column, or sort a list of files based on its various attributes (think: a window manager's file view in 'details' mode, where one clicks on a column heading to sort on that attribute). The Request: I'd like a little feedback on the implementation I've come up with below. style, correctness, performance, whatever. The Implementation: I'd recommend jumping down to main first to get a feel for how the SortObject class is used. One quirk you may notice is the Holder class. The functor passed to std::sort is copy constructed for every comparison. That didn't jive well with the heap-allocated members of SortObject, and a deep copy would have been prohibitively expensive given its frequency. I'd definitely be interested in a better solution. I've also added some extras (sort direction, accessor binding, etc) for completeness.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

template <class IdType, class ClassType>
class SortOrder
{
    private:

        template <class Order>
        class Holder
        {
            public:
            
                Holder(const Order&  order)  : order_(order)         {}
                Holder(const Holder& holder) : order_(holder.order_) {}

                inline bool operator () (const typename Order::sort_class_type& lhs,
                                         const typename Order::sort_class_type& rhs) const
                {
                    return order_.compare(lhs, rhs);
                }

            private:

                const Order& order_;
        };


    public:

        typedef ClassType sort_class_type;
        typedef IdType    id_type;

        enum Direction { Ascend = 0, Descend = 1};
        
        SortOrder() {}

        ~SortOrder()
        {
            for(MembersItr itr = members_.begin(); itr != members_.end(); ++itr)
            {
                delete itr->second;
            }
        }
        
        std::vector<IdType> order() const
        {
            std::vector<IdType> ids(members_.size());

            typename std::vector<IdType>::iterator idsItr = ids.begin();
            
            for(MembersCItr memItr = members_.begin(); memItr != members_.end(); ++memItr, ++idsItr)
            {
                *idsItr = memItr->first;
            }
        }
        
        template <class MemberType>
        SortOrder& add(IdType id, const MemberType ClassType::* member, Direction direction = Ascend)
        {
            members_.push_back(std::make_pair(id, new TemplatedVar<MemberType>(member, direction)));
            return *this;
        }

        template <class AccessorType>
        SortOrder& add(IdType id, AccessorType (ClassType::* accessor)() const, Direction direction = Ascend)
        {
            members_.push_back(std::make_pair(id, new TemplatedAccessor<AccessorType>(accessor, direction)));
            return *this;
        }

        void remove(IdType id)
        {
            for(MembersItr itr = members_.begin(); itr != members_.end(); ++itr)
            {
                if(itr->first == id)
                {
                    delete itr->second;
                    itr->second = NULL;
                    std::swap((*itr), *(members_.end()));   // move the one we want to remove to the end
                    members_.pop_back();                    // remove the one we just swapped to the end from the back.
                    return;
                }
            }
        }

        void promote(IdType id)
        {
            for(MembersItr itr = members_.begin(); itr != members_.end(); ++itr)
            {
                if(itr->first == id)
                {
                    std::swap(*(members_.begin()), (*itr));
                    return;
                }
            }
        }

        void toggleDirection(IdType id)
        {
            for(MembersItr itr = members_.begin(); itr != members_.end(); ++itr)
            {
                if(itr->first == id)
                {
                    itr->second->direction_ = ((itr->second->direction_ == Ascend) ? Descend : Ascend);
                }
            }
        }

        bool compare(const ClassType& lhs, const ClassType& rhs) const
        {
            for(MembersCItr itr = members_.begin(); itr != members_.end(); ++itr)
            {
                const GenericVar* var = itr->second;
                
                if(var->compare(lhs, rhs))
                {
                    return true;
                }
                else
                {
                    if(var->equal(lhs, rhs))
                    {
                        continue;
                    }
                    else
                    {
                        return false;
                    }
                }
            }

            return false;
        }

        Holder<SortOrder> operator() () { return Holder<SortOrder>(*this); }

    private:

        class GenericVar
        {
            public:
                
                GenericVar(Direction direction) : direction_(direction) {}
                
                virtual bool compare(const ClassType& lhs, const ClassType& rhs) const
                {
                    return ((direction_ == Ascend) ? lessThan(lhs, rhs) : greaterThan(lhs, rhs));
                }
                
                virtual bool equal      (const ClassType& lhs, const ClassType& rhs) const = 0;
                virtual bool lessThan   (const ClassType& lhs, const ClassType& rhs) const = 0;
                virtual bool greaterThan(const ClassType& lhs, const ClassType& rhs) const = 0;

                Direction direction_;
        };

        template <class MemberType>
        class TemplatedVar : public GenericVar
        {
            public:
                
                typedef MemberType member_type;
                
                TemplatedVar(const MemberType ClassType::* member, Direction direction) : GenericVar(direction), member_(member) {}

                virtual bool equal(const ClassType& lhs, const ClassType& rhs) const
                {
                    return (lhs.*member_ == rhs.*member_);
                }

                virtual bool greaterThan(const ClassType& lhs, const ClassType& rhs) const
                {
                    return (lhs.*member_ > rhs.*member_);
                }

                virtual bool lessThan(const ClassType& lhs, const ClassType& rhs) const
                {
                    return (lhs.*member_ < rhs.*member_);
                }

            private:

                const MemberType ClassType::* member_;
        };

        template <class AccessorType>
        class TemplatedAccessor : public GenericVar
        {
            public:
                
                typedef AccessorType accessor_type;
                
                TemplatedAccessor(AccessorType (ClassType::* accessor)() const, Direction direction) : GenericVar(direction), accessor_(accessor) {}

                virtual bool equal(const ClassType& lhs, const ClassType& rhs) const
                {
                    return ((lhs.*accessor_)() == (rhs.*accessor_)());
                }
                
                virtual bool greaterThan(const ClassType& lhs, const ClassType& rhs) const
                {
                    return ((lhs.*accessor_)() > (rhs.*accessor_)());
                }

                virtual bool lessThan(const ClassType& lhs, const ClassType& rhs) const
                {
                    return ((lhs.*accessor_)() < (rhs.*accessor_)());
                }

            private:

                AccessorType (ClassType::* accessor_)() const;
        };

        typedef std::pair<IdType, GenericVar*>   Member;
        typedef std::vector<Member>              Members;
        typedef typename Members::iterator       MembersItr;
        typedef typename Members::const_iterator MembersCItr;

        SortOrder(const SortOrder& sortOrder);               // Disable: declare, but not define.
        SortOrder& operator = (const SortOrder& sortOrder);  // Disable: declare, but not define.
        
        Members members_;
};

class Foo
{
    public:

        Foo(int a, int b, int c, const std::string& str) : a_(a), b_(b), c_(c), str_(str) {}    

        int a_;
        int b_;
        std::string str_;

        inline int getC() const { return c_; }

    private:

        int c_;
};

std::ostream& operator << (std::ostream& os, const Foo& foo)
{
    return (os << '(' << foo.a_ << ", " << foo.b_ << ", " << foo.getC() << ", " << foo.str_ << ')');
}

int main(int argc, char** argv)
{
    std::vector<Foo> foos;

    foos.push_back(Foo(2, 4, 75, "lamda"));
    foos.push_back(Foo(1, 5,  8, "gamma"));
    foos.push_back(Foo(3, 3, 90, "alpha"));
    foos.push_back(Foo(4, 2,  1, "omega"));
    foos.push_back(Foo(5, 1, 26, "beta"));
    foos.push_back(Foo(1, 1,  1, "beta"));
    
    SortOrder<std::string, Foo> sortOrder;
    sortOrder.add("a",   &Foo::a_)
             .add("b",   &Foo::b_)
             .add("c",   &Foo::getC)
             .add("str", &Foo::str_);

    using std::cin;
    using std::cout;
    using std::endl;
    using std::ostream_iterator;
    
    cout << "Unmodified" << endl;
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;


    cout << "Sorted by default order" << endl;
    sort(foos.begin(), foos.end(), sortOrder());
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;

    
    cout << "Promote \"b\", order is now: b, a, c, str" << endl;
    sortOrder.promote("b");
    sort(foos.begin(), foos.end(), sortOrder());
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;


    cout << "Promote \"str\", order is now: str, b, a, c" << endl;
    sortOrder.promote("str");
    sort(foos.begin(), foos.end(), sortOrder());
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;


    cout << "Promote \"c\", order is now: c, str, b, a" << endl;
    sortOrder.promote("c");
    sort(foos.begin(), foos.end(), sortOrder());
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;

    
    cout << "Change the direction of of sort for \"c\" to descending" << endl;
    sortOrder.toggleDirection("c");
    sort(foos.begin(), foos.end(), sortOrder());
    copy(foos.begin(), foos.end(), ostream_iterator<Foo>(cout, "\n"));  cout << endl;

    cin.get();

    return 1;
}







Sample output:
Unmodified
(2, 4, 75, lamda)
(1, 5, 8, gamma)
(3, 3, 90, alpha)
(4, 2, 1, omega)
(5, 1, 26, beta)
(1, 1, 1, beta)

Sorted by default order
(1, 1, 1, beta)
(1, 5, 8, gamma)
(2, 4, 75, lamda)
(3, 3, 90, alpha)
(4, 2, 1, omega)
(5, 1, 26, beta)

Promote "b", order is now: b, a, c, str
(1, 1, 1, beta)
(5, 1, 26, beta)
(4, 2, 1, omega)
(3, 3, 90, alpha)
(2, 4, 75, lamda)
(1, 5, 8, gamma)

Promote "str", order is now: str, b, a, c
(3, 3, 90, alpha)
(1, 1, 1, beta)
(5, 1, 26, beta)
(1, 5, 8, gamma)
(2, 4, 75, lamda)
(4, 2, 1, omega)

Promote "c", order is now: c, str, b, a
(1, 1, 1, beta)
(4, 2, 1, omega)
(1, 5, 8, gamma)
(5, 1, 26, beta)
(2, 4, 75, lamda)
(3, 3, 90, alpha)

Change the direction of of sort for "c" to descending
(3, 3, 90, alpha)
(2, 4, 75, lamda)
(5, 1, 26, beta)
(1, 5, 8, gamma)
(1, 1, 1, beta)
(4, 2, 1, omega)







[Edited by - godecho on June 24, 2008 9:35:27 AM]

Share this post


Link to post
Share on other sites
Advertisement
I've done something similar before when dealing with databases. Essentially the problem (for us) was that invoking a complex comparison function which has to run through and invoke the appropriate comparison sub-function for a list of fields is damned expensive. Obviously the most straightforward solution would be to dynamically generate the code, but that's really not an option in C/C++ unless you want to go crazy with run-time assemblers or distribute the program with a compiler.

What we did instead was to create an additional comparison key vector to which an initialization pass writes the key fields in-order, together with a pointer back to the original structure. Then this smaller vector can be sorted with a simple memcmp or even through a radix sort.
This works nicely with classic basic database tables, e.g. fixed-length strings, integers and floats. Case-insensitive compares are simply a matter of translating the characters to lower/upper case when forming the key, IEEE floats are dealt with by flipping the sign bit (which also avoids the nastiness of unordered NaNs fucking up the sort) and external references are either folded into the key or written by index if their vector is in-turn sorted, etc. If you can't easily translate a comparison rule for this system then you probably don't have a total order to begin with.
Furthermore the conversion pass can be sped up significantly by letting the normalization callbacks process a whole batch of fields at once.

Granted variable length fields (e.g. strings) would complicate things and it can't be wrapped up nicely for std::sort either, but then we had to support external sorting and storage anyway so it wasn't an issue. Personally I would've opted to just use a ready-made (lightweight) database system but some companies are even more hell-bent on reinventing the wheel then I am.

Share this post


Link to post
Share on other sites
For the interested, on a 1,000,000 element vector<Foo>:

With a < operator, std::sort took ~ 61 seconds
With a traditional functor it took ~ 88 seconds
and with SortOrder it took ~ 162 seconds

Each comparison checked a then b, then c, then str.


For smaller vectors the relative differences between the three methods are roughly the same. Foo's < operator takes about 1/3rd less time than the comparison functor, and SortOrder always a little under double that of the traditional functor.

edit: these numbers were taken from a debug build, and should be ignored. see post below for new numbers.

[Edited by - godecho on June 25, 2008 7:26:23 AM]

Share this post


Link to post
Share on other sites
If the objects that you are sorting have complicated copy constructor semantics then consider that it may be advantageous to build a vector of indexes or pointers and initially sort this in place, instead of the underlying object container. Then do a final iteration to reorder the container according to this sort information vector.

this allows you to limit the n * log (n) complexity to a very lightweight integer or pointer swap operation, and then only perform n real copys for the final ordering pass. (Or avoid it entirely and have future operations just work through the sort vector.)

The downside to this approach is that there is more random access required (through the indirection) and so it is more likely to trash the cache. Its all a bit of a tradeoff as to how expensive the element copy is since quicksort and merge sort achieve performant behavoir in the real world because of good cache locality.

Share this post


Link to post
Share on other sites
Quote:
Original post by chairthrower
If the objects that you are sorting have complicated copy constructor semantics then consider that it may be advantageous to build a vector of indexes or pointers and initially sort this in place, instead of the underlying object container. Then do a final iteration to reorder the container according to this sort information vector.

this allows you to limit the n * log (n) complexity to a very lightweight integer or pointer swap operation, and then only perform n real copys for the final ordering pass. (Or avoid it entirely and have future operations just work through the sort vector.)

The downside to this approach is that there is more random access required (through the indirection) and so it is more likely to trash the cache. Its all a bit of a tradeoff as to how expensive the element copy is since quicksort and merge sort achieve performant behavoir in the real world because of good cache locality.


I'm not sure if you were addressing my issue directly, or speaking about sorting complex objects in general.

If the former is true, then it misses the mark somewhat. My difficulty with copy constructors wasn't for what was being sorted, but the sorting functor itself. When the functor was being copy constructed, it'd make a copy of its internal list of pointers to the heap. Those pointers would be freed when the first copy of the functor went out of scope. Providing my own deep-copy copy constructor for the sorting functor would have been too expensive to do for every comparison (which is what triggers the copy construction). My solution was the Holder wrapper which works, and is only as expensive as copying a reference. The reason for my OP was to get ideas for refining that, and other parts of my solution.

Share this post


Link to post
Share on other sites
fair enough - perhaps i was sort of musing very generally. anyway this would be my good faith attempt at the problem.

the object knows internally the fields that it needs to compare - so encapsulate the comparisons in the object but propagate on the comparison function the possibility for the caller to select which dimension it wants to be compare. i think encapsulation of field comparisons at this level is reasonable and properly belongs to the Foo object rather than the comparison function. the comparator function then only needs to carry around around a vector of indexes that represent the priority ordering for the sort. if you do need to map the field names then do it as a seperate operation with a lookup function on the object - ie field name 'a' maps to fixed field number 0 and adjust the ordering accordingly. the comparison function just holds a reference to the sort order vector because as you note it gets copied around a lot in the sort function.


#include <cassert>
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

struct Foo
{
int a;
int b;
string c;

Foo( int a, int b, string c)
: a( a), b( b), c( c) { }

bool compare( const Foo &f, unsigned dim) const
{
switch( dim)
{
case 0: return a < f.a;
case 1: return b < f.b;
case 2: return c < f.c;
assert( 0);
}
}
};

struct comparator
{
const vector<unsigned> &dims;

comparator( const vector< unsigned> &dims)
: dims( dims) { }

bool operator () ( const Foo &a, const Foo &b) const
{
for( unsigned i = 0; i < dims.size(); ++i) {
if( a.compare( b, dims[ i])) return true;
if( b.compare( a, dims[ i])) return false;
// !(a < b) && !(b < a) means they are equivalent so use next dim to resolve
}
return false;
}
};

main()
{
vector< Foo> v;

v.push_back( Foo( 3, 4, "apple"));
v.push_back( Foo( 3, 6, "yyy"));
v.push_back( Foo( 6, 4, "apple"));
v.push_back( Foo( 3, 4, "zzz"));

vector< unsigned> dims;
dims.push_back( 2);
dims.push_back( 0);
dims.push_back( 1);

sort( v.begin(), v.end(), comparator( dims));

for( int i = 0; i < v.size(); ++i)
cout << v[ i].a << " " << v[ i].b << " " << v[ i].c << endl;


};






Share this post


Link to post
Share on other sites
Quote:
Original post by chairthrower
fair enough - perhaps i was sort of musing very generally. anyway this would be my good faith attempt at the problem.

*snip*


Interesting implementation. Moving the actual comparisons in to the class would allow for less straight forward comparisons (e.g. strcmp vs '<') too. Ideologically though, I dislike the idea of having to modify the original class to include the comparison operation. Perhaps I shouldn't- if one expects the class to be sortable, it's reasonable to expect that class must have operator< defined for it. In my case, I approached the problem with the lesser expectation that the object couldn't be modified but that sort-relevant members would be exposed (via accessor, or public).

Share this post


Link to post
Share on other sites
You don't have to modify the class to supply operator<. You can make it a free function.

On the other hand, you could try to implement your idea as a predicate for std::sort. Actually it's more or less already implemented, but you could try to wrap it in some class to make in faster to write. e.g.


struct Foo {
int a;
int b;
std::string s;
};

bool Foo_Less(const Foo& left, const Foo& right)
{
return std::make_pair(left.a, std::make_pair(left.b, left.s)) < std::make_pair(right.a, std::make_pair(right.b, right.s));
}

std::vector<Foo> foos;

std::sort(foos.begin(), foos.end(), Foo_Less);


Share this post


Link to post
Share on other sites
You could also try something like this.


#include <string>
#include <iostream>
#include <vector>
#include <algorithm>

template <typename T, typename M, typename N>
struct order_info {
M const T:: *member;
N next;

order_info(M const T:: *member, N next) : member(member), next(next) { }

bool operator()(const T& left, const T& right)
{
return (left.*member < right.*member) || (!(right.*member < left.*member) && next(left, right));
}
};

template <typename T, typename M>
struct order_info<T, M, void> {
M const T:: *member;

explicit order_info(M const T:: *member) : member(member) { }

bool operator()(const T& left, const T& right)
{
return (left.*member < right.*member);
}
};

template <typename T, typename M>
order_info<T, M, void> make_order(M const T:: *member)
{
return order_info<T, M, void>(member);
};

template <typename T, typename M, typename M2, typename N>
order_info<T, M, order_info<T, M2, N> > make_order(M const T:: *member, const order_info<T, M2, N>& oi)
{
return order_info<T, M, order_info<T, M2, N> >(member, oi);
};

template <typename T, typename M1, typename M2>
order_info<T, M1, order_info<T, M2, void> > make_order(M1 const T:: *m1, M2 const T:: *m2)
{
return make_order(m1, make_order(m2));
}

template <typename T, typename M1, typename M2, typename M3>
order_info<T, M1, order_info<T, M2, order_info<T, M3, void> > > make_order(M1 const T:: *m1, M2 const T:: *m2, M3 const T:: *m3)
{
return make_order(m1, make_order(m2, make_order(m3)));
}

struct Foo {
int a;
int b;
std::string s;

Foo() { }
Foo(int a, int b, std::string s) : a(a), b(b), s(s) { }
};

int main(int argc, char* argv[])
{
std::vector<Foo> foos;

foos.push_back(Foo(1, 1, "e"));
foos.push_back(Foo(1, 2, "d"));
foos.push_back(Foo(1, 3, "d"));
foos.push_back(Foo(1, 4, "c"));
foos.push_back(Foo(1, 5, "c"));
foos.push_back(Foo(2, 6, "b"));
foos.push_back(Foo(2, 7, "b"));
foos.push_back(Foo(2, 8, "a"));
foos.push_back(Foo(2, 9, "a"));

std::sort(foos.begin(), foos.end(), make_order(&Foo::a, &Foo::s, &Foo::b));
// or
// std::sort(foos.begin(), foos.end(), make_order(&Foo::a, make_order(&Foo::s, make_order(&Foo::b))));

for (std::vector<Foo>::const_iterator it = foos.begin(); it != foos.end(); ++it) {

std::cout << it->a << " " << it->b << " " << it->s << std::endl;
}

return 0;
}


It wasn't tested.

You could try adding support for predicates instead of operator< or make version of make_order with more parameters (or variadic templates in C++0x or ConceptGCC).

EDIT: fixed some bugs. Now it works.

[Edited by - rozz666 on June 24, 2008 3:13:28 PM]

Share this post


Link to post
Share on other sites
Definitely an impressive solution rozz. The only catch to that solution is that it's dynamic at compile time. It wouldn't be possible to change the sort order at run time.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!