I've written my own implementation of the S-hull Delaunay triangulation algorithm in pure Python, as part of a procedural mapgen system. The algorithm works, but it's about 700x slower than the reference implementation. Which, granted, the reference implementation is 2800 lines of tightly-implemented and badly-documented C code, while I'm clocking in at 400 lines of Python with a decent level of documentation, so obviously there's going to be some tradeoffs there. But before I start porting my code to Cython (i.e. precompiled Python), I want to make certain I'm not missing any major algorithmic improvements.
According to the profiler, I'm spending the vast majority of my time in one function, makeDelaunay(). This is the function that finds triangle pairs in the graph that do not satisfy the Delaunay condition and flips their shared edge. The profiler says that of 8.706s total CPU time, 7.756s are spent in this function, but practically no time (<.5s) is spent in the functions that it calls. I've looked over the function, and I don't see any clear optimizations I could make to speed things up. But maybe you all have some better ideas?
## Given that we're done making a triangulation, make that triangulation # into a Delaunay triangulation by flipping the shared edge of any two # adjacent triangles that are not Delaunay. # ( http://en.wikipedia.org/wiki/Delaunay_triangulation#Visual_Delaunay_definition:_Flipping ) def makeDelaunay(self): # These are the edges that we know will never need to be flipped, as # they are on the perimeter of the graph. hull = self.constructHullFrom(Vector2D(-1, -1), self.nodes) sortedHull = [] for i, vertex in enumerate(hull): # Ensure vertices are in a consistent ordering so we can do # lookups on the hull later. tmp = [vertex, hull[(i + 1) % len(hull)]] tmp.sort(sortVectors) sortedHull.append(tuple(tmp)) sortedHull = set(sortedHull) edgeQueue = [] ## Add all non-exterior edges to the edge queue. for sourceNode, targetNodes in self.edges.iteritems(): for targetNode in targetNodes: tmp = [sourceNode, targetNode] tmp.sort(sortVectors) tmp = tuple(tmp) if tmp not in sortedHull and tmp not in edgeQueue: # Edge is interior edge. edgeQueue.append(tmp) # Edges that are currently in the queue, so we can avoid adding # redundant edges. queueSet = set(edgeQueue) while edgeQueue: (v1, v2) = edgeQueue.pop(0) queueSet.remove((v1, v2)) n1, n2 = self.getNearestNeighbors(v1, v2) if not self.isDelaunay(v1, v2, n1, n2): # Triangles are not Delaunay; flip them. if v2 in self.edges[v1]: self.edges[v1].remove(v2) if v1 in self.edges[v2]: self.edges[v2].remove(v1) self.edges[n1].add(n2) self.edges[n2].add(n1) for vertPair in [(v1, n1), (v1, n2), (v2, n1), (v2, n2)]: tmp = list(vertPair) tmp.sort(sortVectors) tmp = tuple(tmp) if tmp not in sortedHull and tmp not in queueSet: edgeQueue.append(tmp) queueSet.add(tmp)
Some notes:
* isDelaunay() examines the inner angles of the triangle pair, and returns True if the sum of the inner angles is less than pi.
* constructHullFrom() generates a convex hull of the graph; it's also used in other parts of the program.
* self.edges is a dict (hash map) that maps nodes to sets of nodes. In other words, it's an adjacency list.
* sortVectors is a function that orders vectors in an arbitrary but consistent way.
I appreciate any insight you care to share!
EDIT:
So, the optimization that fixed the problem? I'd assumed that most of my time was spent in the "while queue is not empty, pop an edge and examine it" code. Turns out that's not the case! Most of my runtime was spent in preparing the queue to go. In particular, the problematic line was "if tmp not in sortedHull and tmp not in edgeQueue". Of course, edgeQueue is a list, which means that for each new edge I added to the queue, I was examining all the existing edges to see if they were the same edge -- n^2 operations! I simply moved the creation of queueSet up a few lines so that I could do a hash lookup instead of a list lookup, and now things are nice and speedy!