Dijkstra

Started by
6 comments, last by Alberth 8 years, 1 month ago

Guys, I really need help with this Dijkstra, I'm stuck for 1 month on this. I watched almost all the stuff on the Internet. I left the project one week ago, but just now I see that our last lecture was on Dijkstra, BFS and DFS, and ALL IN ONE LECTURE, and was explained as awfully as possible( I watched video of the lecture just now ). I understand it in a mathetical point of view, but I can't translate it into computer code, nor can I implement it in a game. All of the source codes from the Internet want the Cost Matrix as input . In my tile-based game( pacman, ofc ) I have 20x20 tiles, how can I make the cost matrix for all of them, and should I read it from file, or from what. How to actually make the tiles, deal with obstacles.Basically I need to calculate the length to the nearest squares, then compare the costs, but all the costs in my game are either 1 or infinity. Then I need to somehow store all the nodes to point to their parents in some kind of a linked list, then to somehow add the node to finished, so I don't check it again...

I have most trouble with storing these stuff in the proper ADT. I miss something, but I don't know what. This time I don't want pseudocode, I want real code, and without <vector> pls. I want brain-dead simple c++ code for below 80 IQ. Basically I want to understand it as well as this guy who made this program here. https://qiao.github.io/PathFinding.js/visual/

And here is some pseudocode which I can't translate into normal code.


 


Given a graph, G, with edges E of the form (v1, v2) and vertices V, and a

source vertex, s



dist : array of distances from the source to each vertex

prev : array of pointers to preceding vertices

i : loop index

F : list of finished vertices

U : list or heap unfinished vertices



/* Initialization: set every distance to INFINITY until we discover a path */

for i = 0 to |V| - 1

dist[i] = INFINITY

prev[i] = NULL

end



/* The distance from the source to the source is defined to be zero */

dist[s] = 0



/* This loop corresponds to sending out the explorers walking the paths, where

* the step of picking "the vertex, v, with the shortest path to s" corresponds

* to an explorer arriving at an unexplored vertex */



while(F is missing a vertex)

pick the vertex, v, in U with the shortest path to s

add v to F

for each edge of v, (v1, v2)

/* The next step is sometimes given the confusing name "relaxation"

if(dist[v1] + length(v1, v2) < dist[v2])

dist[v2] = dist[v1] + length(v1, v2)

prev[v2] = v1

possibly update U, depending on implementation

end if

end for
end while
Advertisement

Sorry if I'm not quite answering anything. The post is a bit hard to read at the moment.

You don't need a "cost matrix" each node can just have it's own cost that it stores in it's data.


struct Node
{
    int Cost = 1
    Node* _pTop
    Node* _pBottom
    Node* _pRight
    Node* _pLeft
    Node* _pTRcorner // Top Right Corner
    Node* _pTLcorner // Top Left Corner
    Node* _pBRcorner // Bottom Right Corner
    Node* _pBLCorner // Bottom Left Corner
}

Note that you can just assume that the cost for going in any direction is uniformly 1. So you only need one cost.

Now... you need to find a path from point A to point some random X.

Point A is where you are starting at, and your cost is currently infinite. You can set this to -1 if you wish, as your cost can be 0, or it can be all real numbers above zero. But it's not allowed to be -1. Or you can set it to the maximum int.

First you need to check if A is your destination. If it is not, then you add A to a visited list. Or an array if you wish.

Make a second list, which will contain data on the nodes you need to visit, and the paths leading up to them.

We compare this value to all points that you can move to. So starting from A and looking onwards, we see that all directions are definitely better choices than infinity. All directions are equal to 1. But because the cost is uniform, you simply will have a favored direction. It's up to how you coded this on where your path will first check towards.

Once again you check if B is destination. If it is not, we check all paths. We notice that there will be some paths to nodes that were previously in our list. We check the cost and take note that to get to them, the cost is 2, compared to the initial 1. So we ignore those nodes, and update the cost to nodes we have not seen before. In this case they will be 3.

Go back to point A and do the same for each node that you have not visited. Then do the same for those nodes until you find your destination.

If that wall of text confused you, I understand. Pathfinding algorithms are not easy to read on paper. It's best if you see a video live.

The harest part is figuring out a way of holding the lists that works for you.

In my tile-based game( pacman, ofc ) I have 20x20 tiles, how can I make the cost matrix for all of them, and should I read it from file, or from what.


You only really need nodes for the intersections. If you really want the ghosts to change direction outside the intersections then you can just make the pathfinding entity a an additional node and patch it into the edge it's traversing. (That's a bit advanced for now though.) You can count the number of tiles between intersections in order to generate edge weights. Since the map doesn't change during gameplay you can bake the graph and load it from a file, yes. If you store your nodes in an array then you can store their scores in a parallel array so that you don't need to dirty the actual graph data during solving.

To start out with you should use big clunky components that will run slowly but are easy to understand. I'd recommend starting here:


struct Edge {
  int weight;
  size_t destinationIndex;
};

struct Node {
  static const int INFINITY = 999999999; //this value is lazy. use an int max in production code

  std::vector<Edge> edges;
  int score = INFITINY;
}; 

Once you've got it working then you can move on to more elegant implementations.

You may want to roll your own MinHeap for this, since you're studying algos and structures.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Ha, path finding time!

Great smile.png

I read your post, and I think you're going too fast for yourself. You have a mathy description of Dijkstra and want to jump directly to pathfinding over tiles in the game. That means you're getting everything on your plate at the same time:

1. What data structures do I need

2. How do you perform the operations used in the math (I am always fond of phrases like "find pick the vertex, v, in U with the shortest path to s" tongue.png )

3. What the heck is this relaxation thingie?

4. I have different costs.

5. How does a node relate to a tile.

6. What about objects in the tile?

7. What is the "main" loop now, the math just lists things you do.

then you added Internet

8. People use generic cost matrices, templates, generics, factories, and I don't know what, for X, Y, and Z language.

9. People make nice animations (fyi I made such a thing too, and believe me, making such a thing does not help in understanding the algorithm, the GUI code is more complicated)

10. People make videos.

11. People use different terminology for perhaps the same things, or maybe not?

When you spend time on something, and you're just running into walls, you took a too big bite.

If I recognize this happening, I always stop what I am doing (it doesn't work), and completely forget about the problem for some time (a few days).

Then, I take a step back and try to come up with a plan how to do this.

In general, you can either go around the problem (which fails here, as any path finding algorithm is going to be mostly the same), or you go (hopefully) through the problem in smaller steps (since my bite was too big, I need several smaller bites to swallow instead).

It seems the only direction available is to split up the problem.

Vision (or draw) a line from start to end. At the start you have the math description, at the end you have your beautiful solution, blinking fast, with all the bells and whistles you can think of. Next, find the problems you're wrestling with, and place them on the line. When do you need to decide about it, does one problem come before or after another problem?

Can you find other steps that need to be done at some point? Where on the line should they be? If a problem seems to be big, can you break it down in more smaller problems?

(In essence, try to inject some structure in your problem, so you can focus on fewer things at the same time.)

I tend to do this in my head while doing things that don't need deep thought, like walking, doing dishes, showering, in bed, somewhat awake and waiting for the alarm clock to go off, and so on. Other people do this differently, experiment to find out what works for you.

Keep in mind that your line with steps is just a plan, likely it is broken, incomplete, and perhaps just plain nonsense. Change it when you find better ways to do things.

Now back to your problem.

You seem to have major problems with implementation in data structures and/or in tile/game context.

I think you're underestimating the importance of truly understanding the algorithm. If you really understand the algorithm, you can list the operations you must perform as function calls (just the calls, how to implement the function is a more difficult problem to do later). You can write the main loop.

From the operations and decisions you must make, you get ideas how to structure your data. At some point you find all the pieces of puzzle fit, and you get a eureka moment, or as Hannibal Smith used to say "I love it when a plan comes together" (yep, I am getting old tongue.png )

My list with mathy things is usually

1. Understand the math

2. Implement the math

3. Map math onto real problem

I know 3 is waaay too big, but so far you cannot manage 2, and perhaps not even 1. In addition, once you have 1 and 2, you can use the new information to break down 3 in more manageable pieces. We're doing small bites, right? One step at a time.

Ok, about 1.

The point with math is that these things are designed to describe the steps you must do in a compact and unambiguous way, without assuming a certain context, like using a computer to solve it. "find pick the vertex, v, in U with the shortest path to s" says exactly what you must do, it just does not explain how to do it (since this depends on your context).

Secondly, its assumption (Given a graph, G, with edges E of the form (v1, v2) and vertices V, and a source vertex, s) is its world view. [I seem to be missing some cost function here, that got lost somewhere?]

Reading the description, and saying, "I understand every step" is the first item. You seem to have done that.

I then always run the math manually, on a white board or a piece of paper. Take a random graph, from your lesson, the Dijkstra wikipedia page ( https://en.wikipedia.org/wiki/Dijkstra's_algorithm ) or the video above, and DO the algorithm. Don't take shortcuts, simply do each step that it says you must do. (Be a math computer, stupidly do what the 'code' says you must do.)

Play with the algorithm. Try another graph, or other costs. Does it provide correct answers? Can you see patterns in its steps? Can you attach a meaning to set V and U? Can you invent edge cases that would break the 'code'? (Try them!) When does this relaxation take place exactly? What effect does it have? Can you understand why it's there? Does the name make sense?

The goal is to *deeply* understand the math, to 'live' it. Don't try to computerize it yet (a natural tendency for everybody doing programming). You're doing manually what your computer should do. You'll have to explain the computer how to do it. If you don't understand yourself what it should do in every situation, you are not going to be able to explain the algorithm to a computer.

About 2.

You asked for real code, but as a general rule, I don't give solutions to people that are learning. I believe giving a solution defeats the learning goal. Finding and programming a solution yourself is much better. Being able to program it is the proof that you truly understand it. Also, I don't want to take away your Eureka moment when you realize your solution actually works. The feeling you get when you cracked a hard problem is just too good to take away from you.

The goal here is to implement the math. Not your game, but the math. Not efficient, just the math. Not fancy, just the math.

Concentrate on the steps you did in "1". Do everything else with minimal effort to make it work. Take a simple and straight forward data structure, that closely matches the math. Hard-code an example graph into it. (Your aim is coding the algorithm, not making beautiful setup or output code.)

I'd say a list of nodes, like


struct Node {
    int distance;
    bool in_U;
};

/* assuming your example has 10 nodes. */
#define NUM_NODES 10
#define INFTY 1000000

struct Node nodes[10];

void init() {
  /* initialize the nodes[] for the example graph */

  for (int i = 0; i < NUM_NODES; i++) nodes[i].distance = INFTY;

  // etc
}

int cost(int from, int to) {
  if (from == 0 and to == 1) return 1; // path from node 0 to node 1 has cost 1.
  return INFTY;
}

void dijkstra() {
  bool done = false;
  while (!done) {
   // code the step you did in "1".
  }
}

int main() {
  init();
  dijkstra();

  // That's all folks!
  return 0;
}

hope this helps

Guys,
1st: Thanks a lot for the help, really, I had no chance of doing this on my own.
2nd:

I think you're underestimating the importance of truly understanding the algorithm

True, I misunderstood one part of the algorithm. I thought that AFTER I calculate the adjacent edges from the current node, I should choose the lowest cost among THE adjacent nodes, but this was not right, I need to choose the lowest cost among ALL the nodes. And that's why nothing was working as expected.
I'm posting my code for feedback. There is just one more thing to do, and that is to somehow link all the nodes to each other, so I can reconstruct the path, I will try to do it using linked list but maybe later. Now, here's the code:




#include "stdio.h"
#include "conio.h"
 
const int NUM_NODES = 5;
const int INFINITY = 9999;
 
void dijkstra( int NUM_NODES, int rootNode, int cost[10][10], int dist[] )
{
    int i, u, count, finished[10], min;
    for( i = 1; i <= NUM_NODES; i++ )
    {
        finished[ i ] = 0, dist[ i ] = cost[ rootNode ][ i ];
    }
    count = 2;
    while( count <= NUM_NODES )
    {
        min = 99;
        for( i = 1; i <= NUM_NODES; i++ )
        {
            if( dist[ i ] < min && finished[ i ] == 0 )
            {
                min = dist[ i ];
                u = i;
            }
        }
        finished[ u ] = 1;
        count++;
        for( i = 1; i <= NUM_NODES; i++ )
        {
            if( ( dist[ u ] + cost[ u ][ i ] < dist[ i ] ) && finished[ i ] == 0 )
            {
                dist[ i ] = dist[ u ] + cost[ u ][ i ];
            }
        }
    }
}
 
int main()
{
    int v, i, j, cost[10][10], dist[10];
    printf( "\n Enter the cost matrix:\n" );
    for( i = 1; i <= NUM_NODES; i++ )
    {
        for( j = 1; j <= NUM_NODES; j++ )
        {
            scanf( "%d", &cost[ i ][ j ] );
            if( cost[ i ][ j ] == 0 )
            {
                cost[ i ][ j ] = INFINITY;
            }
        }
    }
    printf( "\n Enter the source vertex:" );
    scanf( "%d", &v );
    printf( "\n Enter the destination vertex:" );
    scanf( "%d", &i );
    dijkstra( NUM_NODES, v, cost, dist );
    printf( "\n Shortest path:\n" );
 
    if( i != v )
    {
        printf( "%d->%d, cost = %d\n", v, i, dist[ i ] );
    }
    return 0;
}

The only thing I really don't like is that last 'for' loop in the dijkstra() function, where you calculate the costs. It's very counterintuitive, and it's not very efficient, and it's totally not my style, but that's the best I found. Any suggestions? ( I really want to change it )

Great that you figured it out on your own!

Some feedback:


void dijkstra( int NUM_NODES, int rootNode, int cost[10][10], int dist[] )
// do not re-define global NUM_NODES. All uppercase letters scream to everyone "He, I am a global constant, I won't change!",
// but you sneakily redefine it to a plain integer. Surprises and accidents are waiting to happen.

{
    int i, u, count, finished[10], min;
    for( i = 1; i <= NUM_NODES; i++ )
       // a. It's useful to create variables as local as possible "for (int i = 1; i <= NUM_NODES; i++)" makes "i" local to the loop.
       // b. I'd suggest get used to starting at index 0, and upto, but excluding the upper limit: "i = 0; i < NUM_NODES; i++".
       //    Zero-based is how C is designed to be used. You'll get to hack zero-based code, running into off-by-one errors all the time
       //    Other people working in your code get confused and also get off-by-one errors all the time.
       // c. "finished[10]" and "i <= NUM_NODES"? Always couple upper limits to each other, it will save you a lot of grieve.
       //    C++ doesn't mind writing beyond the array boundary, you however will find seemingly unexplainable changes in random variables
    {
        finished[ i ] = 0, dist[ i ] = cost[ rootNode ][ i ];
        // Don't use the "," operator. Use a ";". Also, use 2 lines, people are bad at finding ";" in the middle of a line of text.
    }
    count = 2;
    while( count <= NUM_NODES )
    {
    // for (int count = 2; count <= NUM_NODES; count++)  ??
    // This will fail however if you add unreachable nodes in your graph.

        min = 99;
        // So what if I use weights >= 100 ?
        // The only proper value here is "INFINITY", which of course also fails for large enough (cumulative) weights.
        // A better solution here is to use MAX_INT rather than my clumsy INFINITY. Sorry for that.
        for( i = 1; i <= NUM_NODES; i++ )
        {
            if( dist[ i ] < min && finished[ i ] == 0 )
            {
                min = dist[ i ];
                u = i;
            }
        }
        finished[ u ] = 1;
        // You blindly assume that "u" has a sane value here. Check that "min" is less than the initial value first.
        // If it is not, you could not find a new node with the smallest distance, which basically means there is
        // no reachable node anymore, thus something like "if (min >= 99) break;" could be applied here.
        // With that condition your entire "count" variable isn't needed anymore (since you process 1 node each iteration
        // eventually you'll run out of nodes, and this condition will trigger, and abort the loop.

        count++;  // Can be removed if you used the "for" above (or the "break" thing).

        for( i = 1; i <= NUM_NODES; i++ )
        {
            if( ( dist[ u ] + cost[ u ][ i ] < dist[ i ] ) && finished[ i ] == 0 )
            {
                dist[ i ] = dist[ u ] + cost[ u ][ i ];
            }
        }
        // This loop looks good to me. You added a new node u, you have to check and update its neighbours distances
        // as preparation for the next iteration. It's a fundamental part of Dijkstra in an arbitrary cost graph.
        // It's inefficient, since the cost array keeps costs for all nodes from every node, if you make separate lists
        // of direct neighbours for each node (and leave out all the "infinite" neighbours!), it gets much better. 
    }
}

Hey, Albert, can you explain again this:

finished[ u ] = 1;

// You blindly assume that "u" has a sane value here. Check that "min" is less than the initial value first.
// If it is not, you could not find a new node with the smallest distance, which basically means there is
// no reachable node anymore, thus something like "if (min >= 99) break;" could be applied here.
// With that condition your entire "count" variable isn't needed anymore (since you process 1 node each iteration
// eventually you'll run out of nodes, and this condition will trigger, and abort the loop.



count++; // Can be removed if you used the "for" above (or the "break" thing).



I need 'min' to be a big number, because when the while() loops for the 2nd time, I need 'min' to be assigned back to the highest possible number. And if I assign min to some element, and if that element is the lowest and it's already visited, I don't really need the 'min' assigned to it, I think it will mess up something( if I even understand my program correctly )
And yeah, I need to find some macro for Thanks Albert - maybe 'tya' ???

And I really hate that count++, but I didn't understand your explanation.

I need 'min' to be a big number, because when the while() loops for the 2nd time, I need 'min' to be assigned back to the highest possible number. And if I assign min to some element, and if that element is the lowest and it's already visited, I don't really need the 'min' assigned to it, I think it will mess up something( if I even understand my program correctly )

This is all correct.

Now consider the case where one node cannot be reached, eg an obstacle in the path or so. (Technically, a graph is just a bunch of nodes and edges, there is no requirement that all nodes must be connected to each other.)

I made a small example with 3 nodes. node 0, 1, and 2. Node 0 is the start, and is connected to node 1 with a cost of 1. Node 2 cannot be reached.

The 'dist' thus looks like:


INFINITY = 9999

dist = [0, 1, INFINITY]

I didn't want to program entire Dijkstra, so I just pre-filled the 'dist' array with the right values.

Now the "min" find code, I don't need to update 'dist', so I concentrate of the 'min' code only:


INFINITY = 9999

dist =     [0, 1, INFINITY]
finished = [0, 0,        0]
u = None # Create 'u' with dummy value

count = 0
while count < 3:
    # Find-min
    min = 99
    for i in range(len(dist)): # Pythoneese for 'for(int i = 0; i < len(dist); i++)'
        if dist[i] < min and finished[i] == 0:
            min = dist[i]
            u = i

    # End of the 'min' & 'u' update loop. I print the found values each iteration.
    print("Iteration {}, min = {}, u = {}".format(count, min, u))

    # Update finished, just as you do.
    finished[u] = 1

    # Update count, so the loop will eventually abort.
    count = count + 1

Running this produces


Iteration 0, min = 0, u = 0
Iteration 1, min = 1, u = 1
Iteration 2, min = 99, u = 1

As you can see, the first 2 iterations (0 and 1) produces expected results. The 3rd one however updates neither 'min' nor 'u'. 'min' keeps the initial assigned 99, and u keeps the value it got from the previous iteration.

The reason is that at the 3rd iteration, finished[0] and finished[1] are set, and for node 2, "dist[2] < min" doesn't hold (9999 < 99? Nope). The computer thus never reaches "min = dist" and "u = i", in the 3rd iteration, and "min" and "u" do not change.

Obviously, this is correct behavior in terms of the algorithm. Node 2 is not reachable, so you cannot find a neighbour from node 0 or node 1 that leads to node 2.

However, in your code, you don't test for this condition, you assume "u" is the new reachable node, and update everything accordingly. In this case running the update on the same node u more than once doesn't harm, it's just wasting cpu cycles. In general however, that doesn't need to be the case.

You can try it in your own code too, add a non-reachable node, and add "printf" statements to see what the code is doing (or use a debugger).

So the point here is that while you have N nodes in your graph, you may need less than N iterations, if some part of the graph cannot be reached.

What to do? The trick is to exploit the fact that "min" does not change while the search is being done. If the computer reaches the "min = ..." statement inside the "if", you know that min becomes smaller. As a result, if min has not changed after the loop finished, you know that "min = ..." never happened (since min would be smaller in that case).

A not changing "min" is thus a detection mechanism for not setting "u", or, there are no reachable nodes any more. That leads to


INFINITY = 9999

dist =     [0, 1, INFINITY]
finished = [0, 0,        0]
u = None # Create 'u' with dummy value

while True:   # No count!
    # Find-min
    min = 99
    for i in range(len(dist)):
        if dist[i] < min and finished[i] == 0:
            min = dist[i]
            u = i

    if min == 99: # min didn't change so there are no new nodes any more, we're done!
        break

    print("Iteration {}, min = {}, u = {}".format(count, min, u))

    finished[u] = 1

One final point here, if all nodes are reachable, this loop will run one more time than you have nodes. The computer doesn't know you have N nodes, you take one away each iteration, so after N iterations it's done. Instead, it will start an N+1 iteration, find no new nodes, and conclude it's done at that point.

This topic is closed to new replies.

Advertisement