Optimizing graph triangulation

Started by
2 comments, last by cache_hit 13 years, 7 months ago
EDIT: never mind! Right after writing this, I discovered an optimization that cut my runtime down by a factor of 8. It's at the end of this post if you want to try to figure it out yourself. :)

I've written my own implementation of the S-hull Delaunay triangulation algorithm in pure Python, as part of a procedural mapgen system. The algorithm works, but it's about 700x slower than the reference implementation. Which, granted, the reference implementation is 2800 lines of tightly-implemented and badly-documented C code, while I'm clocking in at 400 lines of Python with a decent level of documentation, so obviously there's going to be some tradeoffs there. But before I start porting my code to Cython (i.e. precompiled Python), I want to make certain I'm not missing any major algorithmic improvements.

According to the profiler, I'm spending the vast majority of my time in one function, makeDelaunay(). This is the function that finds triangle pairs in the graph that do not satisfy the Delaunay condition and flips their shared edge. The profiler says that of 8.706s total CPU time, 7.756s are spent in this function, but practically no time (<.5s) is spent in the functions that it calls. I've looked over the function, and I don't see any clear optimizations I could make to speed things up. But maybe you all have some better ideas?

    ## Given that we're done making a triangulation, make that triangulation    # into a Delaunay triangulation by flipping the shared edge of any two    # adjacent triangles that are not Delaunay.    # ( http://en.wikipedia.org/wiki/Delaunay_triangulation#Visual_Delaunay_definition:_Flipping )    def makeDelaunay(self):        # These are the edges that we know will never need to be flipped, as        # they are on the perimeter of the graph.        hull = self.constructHullFrom(Vector2D(-1, -1), self.nodes)        sortedHull = []        for i, vertex in enumerate(hull):            # Ensure vertices are in a consistent ordering so we can do             # lookups on the hull later.            tmp = [vertex, hull[(i + 1) % len(hull)]]            tmp.sort(sortVectors)            sortedHull.append(tuple(tmp))        sortedHull = set(sortedHull)        edgeQueue = []        ## Add all non-exterior edges to the edge queue.        for sourceNode, targetNodes in self.edges.iteritems():            for targetNode in targetNodes:                tmp = [sourceNode, targetNode]                tmp.sort(sortVectors)                tmp = tuple(tmp)                if tmp not in sortedHull and tmp not in edgeQueue:                    # Edge is interior edge.                    edgeQueue.append(tmp)        # Edges that are currently in the queue, so we can avoid adding        # redundant edges.        queueSet = set(edgeQueue)        while edgeQueue:            (v1, v2) = edgeQueue.pop(0)            queueSet.remove((v1, v2))            n1, n2 = self.getNearestNeighbors(v1, v2)            if not self.isDelaunay(v1, v2, n1, n2):                # Triangles are not Delaunay; flip them.                if v2 in self.edges[v1]:                    self.edges[v1].remove(v2)                if v1 in self.edges[v2]:                    self.edges[v2].remove(v1)                self.edges[n1].add(n2)                self.edges[n2].add(n1)                for vertPair in [(v1, n1), (v1, n2), (v2, n1), (v2, n2)]:                    tmp = list(vertPair)                    tmp.sort(sortVectors)                    tmp = tuple(tmp)                    if tmp not in sortedHull and tmp not in queueSet:                        edgeQueue.append(tmp)                        queueSet.add(tmp)

Some notes:
* isDelaunay() examines the inner angles of the triangle pair, and returns True if the sum of the inner angles is less than pi.
* constructHullFrom() generates a convex hull of the graph; it's also used in other parts of the program.
* self.edges is a dict (hash map) that maps nodes to sets of nodes. In other words, it's an adjacency list.
* sortVectors is a function that orders vectors in an arbitrary but consistent way.

I appreciate any insight you care to share!

EDIT:
So, the optimization that fixed the problem? I'd assumed that most of my time was spent in the "while queue is not empty, pop an edge and examine it" code. Turns out that's not the case! Most of my runtime was spent in preparing the queue to go. In particular, the problematic line was "if tmp not in sortedHull and tmp not in edgeQueue". Of course, edgeQueue is a list, which means that for each new edge I added to the queue, I was examining all the existing edges to see if they were the same edge -- n^2 operations! I simply moved the creation of queueSet up a few lines so that I could do a hash lookup instead of a list lookup, and now things are nice and speedy!
Jetblade: an open-source 2D platforming game in the style of Metroid and Castlevania, with procedurally-generated levels
Advertisement
I'm not familiar with the algorithm, but in the first loop I'm not crazy about the idea of constructing sortedHull first as a list, and then converting it to a set. Why not just have it be a set from the beginning, and keep inserting into it? That saves you one O(n) operation of iterating the hull to convert it to a set.

Granted a single O(n) is probably not the issue when you have so many nested loops, does the profiler tell you any more about which part of the function is using most of the time?

Edit: nvm you found the problem already :) Incidentally, my suggestion probably would have fixed it as well.
A good call, though, and I've made the suggested change. Thanks! I don't think it'll make that big a difference in my runtime, but why be sloppy?

However, your suggested change would not have fixed the problem, since the problem was excessive lookups into edgeQueue, not sortedHull.

Though, come to think, there's no need for edgeQueue to be a list; sets can serve as queues just fine so long as you aren't picky about the order you process the queue in, and I'm not. So I can remove the redundant queueSet variable and save on memory!
Jetblade: an open-source 2D platforming game in the style of Metroid and Castlevania, with procedurally-generated levels
Quote:Original post by Derakon
A good call, though, and I've made the suggested change. Thanks! I don't think it'll make that big a difference in my runtime, but why be sloppy?

However, your suggested change would not have fixed the problem, since the problem was excessive lookups into edgeQueue, not sortedHull.

Though, come to think, there's no need for edgeQueue to be a list; sets can serve as queues just fine so long as you aren't picky about the order you process the queue in, and I'm not. So I can remove the redundant queueSet variable and save on memory!


Well it wouldn't have solved it completely, but it should have made a difference. Because the excessive lookups would have then been happening on a set instead of a list. So an order of complexity's difference. Still not as good as a 0-performance hit though :)

This topic is closed to new replies.

Advertisement