I think, if you want to stick to the STL, you should replace the set with a priority queue and see what happens

That's it!

I'm not yet 100% sure i did that correctly, but results match and it's much faster already single threaded and also gives speed up using OMP.

On a six core, priority_queue method takes 0.4 seconds ST and 0.07 seconds MT - just perfect

(set method timings have been 1.4 / 2 seconds)

but rather than using an all-pairs algorithm, you are using a single-source algorithm and repeating it for every pair.

Just to try out if the idea works and Dijkstra was already implemted. There was a note above the code that i already replaced this with a fast method.

I've tried the other minor things you have suggested, but they did not differ measureable in runtime.

I'ts a result of being lazy, paranoid and thinking "compiler should be clever enough to detect this" ;D

I really appreciate your help guys!

In the future I'll look more closely what happens under the hood of STL (use it very rarely).

Here the changes:

EDIT: Fixed a bug - Just saw Pink Horror has already noticed when posting

struct FirstFar { int operator() (const std::pair<int,float> &first, const std::pair<int,float> &second) { return first.second > second.second; } }; static void DijkstraComputePaths2 (int source, const adjacency_list_t &adjacency_list, std::vector<float> &min_distance, std::vector<int> &previous) { int n = adjacency_list.size(); //min_distance.clear(); min_distance.reserve(n); min_distance.resize(n, max_weight); min_distance[source] = 0; //previous.clear(); previous.reserve(n); previous.resize(n, -1); std::priority_queue< std::pair<int,float>, std::vector<std::pair<int,float> >, FirstFar> vertex_queue; vertex_queue.push (std::make_pair(source, min_distance[source])); while (!vertex_queue.empty()) { int u = vertex_queue.top().first; float dist = vertex_queue.top().second; vertex_queue.pop(); // Visit each edge exiting u const std::vector<Neighbour> &neighbors = adjacency_list[u]; for (std::vector<Neighbour>::const_iterator neighbor_iter = neighbors.begin(); neighbor_iter != neighbors.end(); neighbor_iter++) { int v = neighbor_iter->target; float weight = neighbor_iter->weight; float distance_through_u = dist + weight; if (distance_through_u < min_distance[v]) { min_distance[v] = distance_through_u; previous[v] = u; vertex_queue.push(std::make_pair(v, min_distance[v])); } } } }