Very strange code slowdown

Started by
11 comments, last by Narf the Mouse 12 years ago
I have a very slow piece of code I'm working on. Well, to be more accurate, it does a lot of calculation and has been quite fast at it.

Anyway, it was running at ~0.4 seconds per loop in the test.

I switched to a .Select(Array, Index).Parallel.ForAll() to see if that would speed things up.
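For readers unfamiliar with the pattern being described, here is a minimal sketch of what a "select with index, then run in parallel" loop can look like with PLINQ. The class name, `FillRow`, and the placeholder arithmetic are hypothetical; the thread never shows the poster's actual code, so treat this as an illustration of the general shape only.

```csharp
using System;
using System.Linq;

static class ParallelRowsSketch
{
    // Hypothetical per-row work; stands in for the real noise calculation.
    public static void FillRow(float[] row, int y)
    {
        for (int x = 0; x < row.Length; ++x)
            row[x] = x + y * 57;
    }

    public static float[][] Generate(int width, int height)
    {
        var rows = new float[height][];
        for (int y = 0; y < height; ++y)
            rows[y] = new float[width];

        // Pair each row with its index, then process the pairs in parallel.
        // Each row is written by exactly one work item, so no locking is needed.
        rows.Select((row, y) => new { Row = row, Y = y })
            .AsParallel()
            .ForAll(p => FillRow(p.Row, p.Y));

        return rows;
    }

    static void Main()
    {
        float[][] result = Generate(4, 3);
        Console.WriteLine(result[2][1]); // value at row 2, column 1
    }
}
```

Whether this beats a plain loop depends on how much work each row does relative to the cost of scheduling it, which is worth measuring rather than assuming.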

First results:

~0.22 seconds per loop.

Later results:

~0.33 seconds per loop.

I switched back to the old code, including a mass undo just to make sure it was exactly the same:

~0.69 seconds per loop.

And this wouldn't be the first time the C# compiler has dumped a sudden, inexplicable slowdown on me with this code. Just going to sleep and running it again in the morning resulted in a slowdown.

So, why would adding LINQ code result in such a slowdown, especially after the LINQ code has been removed? How consistent is the compiler about speed optimizations?

Thanks.
In the stack trace, would this mean that the array is getting copied instead of passed as a reference?:
> PerlinNoise.dll!PerlinNoise.ModulatedPerlinNoise.Generate(ref float[][] fillArray = {float[1024][]}) Line 94 C#
(Trying Perlin again as an exercise, now that I know more)
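On the copying question: arrays in C# are reference types, so the array contents are not copied when passed to a method, with or without `ref`. The `ref` keyword only lets the callee reassign the caller's variable. A small sketch, with hypothetical method names, that shows the difference:

```csharp
using System;

static class ArrayPassingSketch
{
    // No 'ref': the method receives a copy of the reference, not of the array.
    public static void WriteThroughReference(float[][] rows)
    {
        rows[0][0] = 1f; // visible to the caller
    }

    // With 'ref': the method can additionally make the caller's variable
    // point at a different array.
    public static void Replace(ref float[][] rows)
    {
        rows = new float[2][];
    }

    static void Main()
    {
        var rows = new float[1024][];
        rows[0] = new float[1024];

        WriteThroughReference(rows);
        Console.WriteLine(rows[0][0]);   // 1

        Replace(ref rows);
        Console.WriteLine(rows.Length);  // 2
    }
}
```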
Try a profiler.

ANTS just finished installing. Can't afford it right now, but I can at least use the free trial for a couple weeks.
1.957% here:

for (x = 0; x < arrayWidth; ++x)

It's probably that I need to set arrayWidth explicitly to the width of the inner array, rather than assume it'll be the same - C# can do some optimizations if it knows that the index variable will not exceed the array, apparently.
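A sketch of the loop shape this idea points at: hoist the inner array into a local and bound the loop by that array's own `Length`. The JIT can often drop the per-access bounds check in this form, though that is an optimization it may or may not apply, not a guarantee. `AddToAll` is a hypothetical stand-in for the real inner loop.

```csharp
using System;

static class RowLoopSketch
{
    // Adds 'value' to every element of every row.
    public static void AddToAll(float[][] rows, float value)
    {
        for (int y = 0; y < rows.Length; ++y)
        {
            float[] row = rows[y];               // hoist the inner array into a local
            for (int x = 0; x < row.Length; ++x) // bound by this row's own length
                row[x] += value;
        }
    }

    static void Main()
    {
        var rows = new[] { new float[3], new float[5] };
        AddToAll(rows, 0.5f);
        Console.WriteLine(rows[1][4]);
    }
}
```

This form also stays correct if the inner arrays turn out to have different widths.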
2.793% is spent on each instance of this instruction, which exists with different +/-'s in four places:

int n = (x - 1) + (y + 1) * 57;

Both x and y are integers.
And the same amount of time here:

float corners = (corner1 + corner2 + corner3 + corner4) * 0.25F;

Which occurs once.
4.469% each for these:

float xFade = fractional_X * fractional_X * fractional_X * (fractional_X * ((fractional_X * 6F) - 15F) + 10F);
float yFade = fractional_Y * fractional_Y * fractional_Y * (fractional_Y * ((fractional_Y * 6F) - 15F) + 10F);

And again, 2.793% each for these:

float i1 = (v1 * (1F - fractional_X)) + (v2 * xFade);
float i2 = (v3 * (1F - fractional_X)) + (v4 * xFade);
float total = (i1 * (1F - fractional_Y)) + (i2 * yFade);

Which are all floats. Aaand...I should really make them all fade values, not the linear fractional.
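A sketch of the change mentioned here, using the fade value for both interpolation weights instead of mixing `1F - fractional_X` with `xFade`. `Fade`, `Lerp`, and `Interpolate` are hypothetical helper names; the corner values `v1`..`v4` and the fractional inputs follow the snippets above.

```csharp
using System;

static class FadeLerpSketch
{
    // Quintic fade curve, same expression as in the profiled code above.
    public static float Fade(float t)
    {
        return t * t * t * (t * ((t * 6F) - 15F) + 10F);
    }

    // Linear blend between a and b; the two weights always sum to 1.
    public static float Lerp(float a, float b, float weight)
    {
        return (a * (1F - weight)) + (b * weight);
    }

    public static float Interpolate(float v1, float v2, float v3, float v4,
                                    float fractionalX, float fractionalY)
    {
        float xFade = Fade(fractionalX);
        float yFade = Fade(fractionalY);
        float i1 = Lerp(v1, v2, xFade);
        float i2 = Lerp(v3, v4, xFade);
        return Lerp(i1, i2, yFade);
    }

    static void Main()
    {
        Console.WriteLine(Interpolate(0F, 1F, 0F, 1F, 0.5F, 0.5F));
    }
}
```

With matching weights, the result equals `v1` exactly at fraction 0 and `v2` exactly at fraction 1, which the mixed version does not guarantee in between.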
3.352% here:

array[x] += (total * amplitude) * oneOvertotalAmplitude;
How are you doing the timing? Some mechanisms for counting elapsed time are not precise enough to measure this kind of thing.

There's also a lot of complication in the way .NET executes; JIT compilation and other factors might affect things. There's also any number of unrelated factors on your machine that can affect benchmarking.


Fun experiment: try running a time-sensitive computation with a duration of many seconds. Then do the same thing with a high-res YouTube video playing in the background. Voila! Instant time warp.
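One way to reduce the noise described above is to time with `System.Diagnostics.Stopwatch`, run the work once untimed so JIT compilation is not measured, and keep the fastest of several runs. This is a minimal sketch; `BestOf` and the square-root workload are hypothetical, and it does not remove interference from other processes, it only makes it less likely to dominate the number.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Runs 'work' once untimed (warm-up), then returns the fastest of
    // 'runs' timed executions, in seconds.
    public static double BestOf(Action work, int runs)
    {
        work(); // warm-up: triggers JIT compilation before timing starts

        double best = double.MaxValue;
        for (int i = 0; i < runs; ++i)
        {
            Stopwatch sw = Stopwatch.StartNew();
            work();
            sw.Stop();
            best = Math.Min(best, sw.Elapsed.TotalSeconds);
        }
        return best;
    }

    static void Main()
    {
        double sink = 0; // keeps the work from being optimized away
        double seconds = BestOf(() =>
        {
            for (int i = 0; i < 1000000; ++i)
                sink += Math.Sqrt(i);
        }, 5);

        Console.WriteLine("High-resolution timer: " + Stopwatch.IsHighResolution);
        Console.WriteLine("Best of 5: " + seconds + " s (sink " + sink + ")");
    }
}
```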

The ANTS performance profiler trial I just installed.

I was thinking that might account for the slowdown - Task Manager has been showing various things being busy. I just wasn't sure how much it would affect things, since I have a dual-core.

Still, I would like to know why those instructions in particular are taking up 33.798% of the program in that profile, if there's any reason aside from "they happen a lot" (1024x1024x5).
Consistently confirmed: After compiling the program, explorer.exe hits 50% CPU and stays there for about a minute.
It's hard to know what is causing performance issues without seeing more code. Here are a few educated guesses:

- One thing that can make a big difference to performance is the location of the data you're dealing with in memory. A contiguous array of objects is usually much quicker than an array of pointers to objects. I think in C# to do that you need the items in the array to be structs instead of classes. This type of issue can also make performance vary randomly depending on how lucky you are with where the allocator puts the data.

- Floating point maths can have a few hidden performance issues. Firstly, if the data ends up with denormalized / NaN / Inf values, the CPU will process them much slower than other values (IIRC over 10x slower in some cases). Secondly, because reordering floating point operations changes the result, the compiler will normally avoid doing it for you. As an example, try these alternate lines:

float corners = (corner1 + corner2 + corner3 + corner4) * 0.25F;
float corners = ((corner1 + corner2) + (corner3 + corner4)) * 0.25F;

The results should be very similar, but performance may not be. The second version reduces the dependency chain by one.

- Also note that in C# running the program via the debugger will normally disable all optimizations. You need to test a release build outside of the debugger.
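To check the floating point guess above, one option is to scan the output for denormal, NaN, or infinite values after a run. This is a diagnostic sketch only; `CountSlowValues` is a hypothetical helper, and the threshold constant is the smallest positive normal single-precision value.

```csharp
using System;

static class FloatAuditSketch
{
    // Smallest positive normal float; nonzero magnitudes below it are denormal.
    const float MinNormal = 1.17549435E-38f;

    // Counts values that CPUs may process on a slow path.
    public static int CountSlowValues(float[][] rows)
    {
        int count = 0;
        foreach (float[] row in rows)
        {
            foreach (float v in row)
            {
                bool denormal = v != 0f && Math.Abs(v) < MinNormal;
                if (float.IsNaN(v) || float.IsInfinity(v) || denormal)
                    ++count;
            }
        }
        return count;
    }

    static void Main()
    {
        var rows = new[]
        {
            new[] { 1f, float.NaN, 0f },
            new[] { float.PositiveInfinity, float.Epsilon }
        };
        Console.WriteLine(CountSlowValues(rows));
    }
}
```

A nonzero count does not prove these values are the cause of a slowdown, but a zero count rules this particular guess out.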

Consistently confirmed: After compiling the program, explorer.exe hits 50% CPU and stays there for about a minute.


Just compiling and not actually running? Do you have a virus scanner that's being a bit hyper or something?
