strange performance patterns

Storyyeller · 2012-01-13T19:28:51

I have a python script which does some calculations, and I am trying to figure out how fast it is. I invoke it from the command line and it prints out the time taken after completing. However, there is a mysterious pattern to the timings. When I first run it, it starts out relatively fast. However, subsequent runs become slower and slower. If I leave the computer idle for a while, it becomes fast again. I can't really think of any explanation for this. What is going on?

larsbutler

733

January 12, 2012 09:00 AM

I figured out a way to rewrite my code so it doesn't require getting individual bits at all. Of course, it does mean more work has to be done in other places, but it is still faster over all.

Here's the profiler results after all the optimizations I've done.
...

What did the profile report look like before? Profiling only _after_ an optimization is kind of a useless exercise.

Storyyeller

215

Author

January 12, 2012 04:25 PM

Why? All the old profiling would show is the parts that I already changed.

I trust exceptions about as far as I can throw them.

the_edd

2,109

January 12, 2012 06:54 PM

So do any parts of that profile change significantly when when running your program repeatedly?

http://www.mr-edd.co.uk
http://bitbucket.org/edd

larsbutler

733

January 13, 2012 07:28 PM

I figured out a way to rewrite my code so it doesn't require getting individual bits at all. Of course, it does mean more work has to be done in other places, but it is still faster over all.

Here's the profiler results after all the optimizations I've done.
5209449 function calls (5047367 primitive calls) in 27.757 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.004 0.004 27.757 27.757 <string>:1(<module>) 1 0.000 0.000 0.000 0.000 bisect.py:1(<module>) 1904 0.017 0.000 0.036 0.000 bitset.py:10(bitset) 112424 0.640 0.000 1.129 0.000 bitset.py:3(bitcount) 321 0.008 0.000 0.009 0.000 bitset.py:33(toList) 1 0.000 0.000 0.000 0.000 bitset.py:41(makeKeyMap) 656208 2.846 0.000 4.449 0.000 bitset.py:7(union) 73374/13957 0.587 0.000 21.646 0.002 calculation.py:126(getCostSub2) 582249 1.164 0.000 1.164 0.000 calculation.py:14(isSubset) 31678 2.773 0.000 21.387 0.001 calculation.py:140(getSkeletonConnectLists) 112583 1.627 0.000 6.066 0.000 calculation.py:168(getTouchedChildren) 268969 1.865 0.000 3.028 0.000 calculation.py:17(connectKeyTupIsLessEq) 221132 0.478 0.000 0.478 0.000 calculation.py:173(getTouchSet) 751/109 0.018 0.000 0.118 0.001 calculation.py:186(makeCalcNode) 109 0.002 0.000 0.122 0.001 calculation.py:202(makeCalcRoot) 1 0.001 0.001 0.123 0.123 calculation.py:207(buildCalculationTrees) 321 0.008 0.000 0.077 0.000 calculation.py:22(makeConnectDict_Bitset) 109 0.001 0.000 0.003 0.000 calculation.py:31(__init__) 1983 0.005 0.000 0.005 0.000 calculation.py:36(getCost) 109 0.002 0.000 0.003 0.000 calculation.py:41(__init__) 342 0.032 0.000 25.089 0.073 calculation.py:46(getCost) 86698 1.167 0.000 4.451 0.000 calculation.py:6(insertIncomparableList) 321 0.006 0.000 0.097 0.000 calculation.py:68(__init__) 85023/2223 0.653 0.000 25.051 0.011 calculation.py:78(getCost) 12660/1466 1.070 0.000 25.019 0.017 calculation.py:95(getCostSub) 1453/321 0.022 0.000 0.026 0.000 connect.py:1(connectivityDictSub) 321 0.021 0.000 0.052 0.000 connect.py:18(connectivityDict) 215922 1.427 0.000 1.797 0.000 decomposition.py:13(<lambda>) 1 0.004 0.004 2.441 2.441 decomposition.py:33(carvingDecomposition) 236 0.001 0.000 0.002 0.000 decomposition.py:41(<lambda>) 1 0.006 0.006 0.036 0.036 decomposition.py:52(bitsetizeKeys) 1124 0.007 0.000 0.025 0.000 decomposition.py:53(convertEdge) 106 0.008 0.000 2.433 0.023 decomposition.py:6(greedySplitNode) 5989 0.039 0.000 0.073 0.000 decomposition.py:9(<lambda>) 212 0.001 0.000 0.001 0.000 graph.py:12(removeEdge) 107 0.000 0.000 0.001 0.000 graph.py:15(numNeighbors) 325 0.001 0.000 0.001 0.000 graph.py:4(__init__) 545 0.002 0.000 0.002 0.000 graph.py:8(addEdge) 1 0.003 0.003 0.003 0.003 heapq.py:31(<module>) 3 0.000 0.000 0.000 0.000 ntpath.py:122(splitdrive) 3 0.000 0.000 0.000 0.000 ntpath.py:55(isabs) 1 0.000 0.000 0.000 0.000 ntpath.py:63(join) 1 0.000 0.000 2.601 2.601 rsglib.py:27(__init__) 342 0.005 0.000 25.115 0.073 rsglib.py:40(getCost) 1 0.001 0.001 0.002 0.002 test.py:20(makeGraph) 493 0.007 0.000 0.007 0.000 test.py:31(getSuccessorStates) 1 0.007 0.007 25.145 25.145 test.py:38(aStarSearch) 342 0.010 0.000 25.127 0.073 test.py:41(push) 1 0.005 0.005 27.754 27.754 test.py:57(testLevel) 1 0.001 0.001 0.002 0.002 test.py:9(parseLevel) 220466 0.508 0.000 0.508 0.000 unionfind_bitset.py:18(getContainingSet) 406014 5.661 0.000 11.071 0.000 unionfind_bitset.py:3(unionBitsetLists) 153 0.001 0.000 0.001 0.000 {_heapq.heappop} 342 0.002 0.000 0.002 0.000 {_heapq.heappush} 196397 0.370 0.000 0.370 0.000 {any} 112745 0.250 0.000 0.250 0.000 {bin} 361464 0.640 0.000 0.640 0.000 {len} 2005/1097 0.014 0.000 0.033 0.000 {map} 1 0.000 0.000 0.000 0.000 {method '__enter__' of 'file' objects} 216 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects} 574394 1.092 0.000 1.092 0.000 {method 'append' of 'list' objects} 112424 0.240 0.000 0.240 0.000 {method 'count' of 'str' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 106 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects} 215 0.001 0.000 0.001 0.000 {method 'items' of 'dict' objects} 321 0.001 0.000 0.001 0.000 {method 'iteritems' of 'dict' objects} 431 0.001 0.000 0.001 0.000 {method 'pop' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'readlines' of 'file' objects} 8 0.000 0.000 0.000 0.000 {method 'strip' of 'str' objects} 106 0.001 0.000 0.001 0.000 {method 'union' of 'frozenset' objects} 322 0.001 0.000 0.001 0.000 {method 'values' of 'dict' objects} 15168 0.593 0.000 2.388 0.000 {min} 1 0.000 0.000 0.000 0.000 {open} 65237 0.169 0.000 0.169 0.000 {range} 656208 1.602 0.000 1.602 0.000 {reduce} 7899/1910 0.064 0.000 0.114 0.000 {sorted}

Maybe I wasn't clear enough. What I mean is, showing us the POST-optimization profiling report tells us nothing unless you actually have a PRE-optimization report to compare it to.

If you DO have both reports (but just haven't posted the other one), great! You should now know where your bottle neck is and/or if your 'optimizations' had the desired effect.

strange performance patterns

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

strange performance patterns

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines