Sign in to follow this  
TheComet

Sorting (mathematical) vectors in list

Recommended Posts

Hi,

 

Google is being rather unhelpful, as any search for "sorting vectors" inevitably results in std::sort because the unfortunate name of a dynamic array in C++ is "vector".

 

I wish to sort the following data structure(s) contained withing a list:

struct Vector2
{
   int x;
   int y;
};
struct Vector3
{
   int x;
   int y;
   int z;
};

The more I play around with the implementation of the overloaded comparison operators, the more I'm lead to believe it's simply not possible to perform the check "A<B" where A and B are both Vector2 or Vector3 objects.

 

Here's an example program thingy to play around with:

 

#include <iostream>
#include <vector>
#include <algorithm>
#include <stdlib.h>


template <class T>
class Vector2
{
public:
    Vector2() {}
    Vector2( int x, int y ) : x(x), y(y) {}
    ~Vector2() {}
    int x;
    int y;
    template <class T_> friend inline bool operator==( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
    template <class T_> friend inline bool operator!=( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
    template <class T_> friend inline bool operator< ( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
    template <class T_> friend inline bool operator> ( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
    template <class T_> friend inline bool operator<=( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
    template <class T_> friend inline bool operator>=( const Vector2<T_>& lhs, const Vector2<T_>& rhs );
};


template <class T>
inline bool operator==( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    return ( lhs.x==rhs.x && lhs.y==rhs.y );
}
template <class T>
inline bool operator!=( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    return !operator==( lhs, rhs );
}
template <class T>
inline bool operator<( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    if( lhs.x>=rhs.x ) return false;
    if( lhs.y>=rhs.y ) return false;
    return true;
}
template <class T>
inline bool operator>( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    return operator<( rhs, lhs );
}
template <class T>
inline bool operator<=( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    return !operator>( lhs, rhs );
}
template <class T>
inline bool operator>=( const Vector2<T>& lhs, const Vector2<T>& rhs )
{
    return !operator<( lhs, rhs );
}


int main()
{


        std::vector< Vector2<int> > myListOfVectors;
        for( int n = 0; n != 10; ++n )
                myListOfVectors.push_back( Vector2<int>(rand()%100,rand()%100) );


        std::sort( myListOfVectors.begin(), myListOfVectors.end() );


        for( std::vector< Vector2<int> >::iterator it = myListOfVectors.begin(); it != myListOfVectors.end(); ++it )
                std::cout << it->x << "," << it->y << std::endl;


        return 0;
}
 

Any help?

Edited by TheComet

Share this post


Link to post
Share on other sites

Thanks for the quick reply!

 

For higher dimensions, would this be correct?

template <class T>
inline bool operator<( const Vector3<T>& lhs, const Vector3<T>& rhs )
{
if (lhs.x < rhs.x) return true;
if (lhs.x > rhs.x) return false;
if( lhs.y < rhs.y) return true;
if( lhs.y > rhs.y) return false;
return lhs.z < rhs.z
}

Share this post


Link to post
Share on other sites

Yes, that looks right. Note that while this is a valid comparison for the purposes of sorting, it may not be a particularly meaningful sort. In particular when used with floating point numbers, it may not be useful for search.

Share this post


Link to post
Share on other sites

You could create a sort method that takes in a lambda expression so you can sort differently depending on your case. I can envision a lot of ways to sort vectors (by length, by coordinate, by dot product with another vector, etc.) so you might want to keep your options open.

Share this post


Link to post
Share on other sites

Yes, that looks right. Note that while this is a valid comparison for the purposes of sorting, it may not be a particularly meaningful sort. In particular when used with floating point numbers, it may not be useful for search.

Noted, thanks.

 

The purpose of all this is to decrease a bottleneck I have with quickly searching a grid for the existence of an entity. I decided to insert the integer coordinates of the entities I'm interested in into a sorted list so I can do a binary search on the data (O(log n)) instead of having to traverse it (O(n)). The list has well over a thousand entries, so hope this will be more effective.

 

 

You could create a sort method that takes in a lambda expression so you can sort differently depending on your case. I can envision a lot of ways to sort vectors (by length, by coordinate, by dot product with another vector, etc.) so you might want to keep your options open.

Sorting by length and dot product is ambiguous, for instance: vec2(1,1).length() == vec2(-1,-1).length(). I can imagine this would require a more elaborate search algorithm, as I think one of the requirements of binary search is each entry in the list needs to be unique.

Edited by TheComet

Share this post


Link to post
Share on other sites


Sorting by length and dot product is ambiguous, for instance: vec2(1,1).length() == vec2(-1,-1).length(). I can imagine this would require a more elaborate search algorithm, as I think one of the requirements of binary search is each entry in the list needs to be unique.

 

For your purposes, lexicographic order sounds right, but other purposes other kinds of ordering might be better -- for example, if the question is "does an entity exist at X,Y" lexicographic ordering is good, but if the question was "which entities are located between 3 and 5 units from X, Y" then length (defined as distance from X,Y) is good.

 

In general, binary searches have to deal with ambiguity in some way -- as in the list [1,2,3,3,4,5] -- if the result is 3, which 3? In some cases it may not matter to you which 3 is returned or the one returned by convention happens to be the one you're interested in, but if it does matter to you, you might need to apply further criteria.

Share this post


Link to post
Share on other sites

The purpose of all this is to decrease a bottleneck I have with quickly searching a grid for the existence of an entity. I decided to insert the integer coordinates of the entities I'm interested in into a sorted list so I can do a binary search on the data (O(log n)) instead of having to traverse it (O(n)). The list has well over a thousand entries, so hope this will be more effective.

In that case you may want to consider a hash container rather than a sorted list.

Share this post


Link to post
Share on other sites

 

The purpose of all this is to decrease a bottleneck I have with quickly searching a grid for the existence of an entity. I decided to insert the integer coordinates of the entities I'm interested in into a sorted list so I can do a binary search on the data (O(log n)) instead of having to traverse it (O(n)). The list has well over a thousand entries, so hope this will be more effective.

In that case you may want to consider a hash container rather than a sorted list.

 

 

Actually, OP, could you describe more fully what you're trying to achieve -- including the data you start from and the data you hope to end with?

 

Are you trying to determine whether any entity exists at a given grid location? Whether any entity occupies a subset of a grid? Whether a specific entity occupies a subset of a grid?

Share this post


Link to post
Share on other sites

 

 

The purpose of all this is to decrease a bottleneck I have with quickly searching a grid for the existence of an entity. I decided to insert the integer coordinates of the entities I'm interested in into a sorted list so I can do a binary search on the data (O(log n)) instead of having to traverse it (O(n)). The list has well over a thousand entries, so hope this will be more effective.

In that case you may want to consider a hash container rather than a sorted list.

 

 

Actually, OP, could you describe more fully what you're trying to achieve -- including the data you start from and the data you hope to end with?

 

Are you trying to determine whether any entity exists at a given grid location? Whether any entity occupies a subset of a grid? Whether a specific entity occupies a subset of a grid?

 

Sure.

 

The project is what I'd call a generic implementation of Conway's Game of Life. I say generic because I wish to be able to extend the simulation to an arbitrary dimension (3D, 4D, 5D?) without having to change much. I started out with a 2D implementation using dynamic 2D arrays to store all of the cell states, here's a screen shot of that: http://i.imgur.com/l1m1PSw.png

 

Two things were clear to me at this stage:

  1. Extending it to n-dimensions isn't trivial; I'd have to adapt the dynamic 2D array class to work with higher dimensions
  2. The simulation got slow pretty quickly when I started simulating larger systems

I started development on a new topology (not working yet) where I only store cells that are alive, with a way to search this list very quickly for the existence of alive cells. And that's where I stand now.

 

I've thought about perhaps even using a graph-like implementation on top of this (link neighbour cells to each other so a lookup isn't even necessary in the first place). This idea hasn't been touched yet, though.

 

If anyone is interested, they can clone the code from github: https://github.com/TheComet93/game-of-life

 

The classes of interest are: GenericCellField (libgol/GOLGenericCellField.hxx) and SortedList (libgol/GOLSortedList.hxx). Excuse the messy code and lack of a doxygen file. The code should be more or less documented in the appropriate header files.

# clone and checkout to branch "topic"
$ git clone https://www.github.com/TheComet93/game-of-life.git
$ git checkout topic

# building with gnu makefiles. You can insert whatever IDE you use instead of "gmake"
$ premake gmake
$ cd build
$ make
$ export LD_LIBRARY_PATH=../lib
$ ./../bin/debug/2D

# NOTE: The project "3D" will fail to build. This is OK.
# NOTE2: If the topic branch fails to compile, try switching to "devel" (git checkout devel)

Controls:

  • Right-click and drag to adjust view
  • Scroll to zoom
  • Press space to start/stop simulation
  • Left-click to draw and erase cells (only works when simulation is stopped)
Edited by TheComet

Share this post


Link to post
Share on other sites

I started development on a new topology (not working yet) where I only store cells that are alive, with a way to search this list very quickly for the existence of alive cells. And that's where I stand now.

 

I've thought about perhaps even using a graph-like implementation on top of this (link neighbour cells to each other so a lookup isn't even necessary in the first place). This idea hasn't been touched yet, though.


 

 

Given that you are working on a regular grid, thinking of it as a graph is the way to go, but more specifically as an implicit graph. You can use the vertex locations of the grid as the keys for a hash table. For any given grid vertex you can easily figure out the keys for the neighboring vertices so you don't need to store them at each vertex (although slower, if the container is a hash table the lookup is still O(1)).

 

-Josh

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this