'External Code' taking up most of my CPU?

Started by
13 comments, last by Adam_42 8 years ago
I've been implementing CPU based voxel ray casting on a BSP tree, in the end a node consists of about 30 voxels, however, the performance is terrible, I ran the program through the Visual Studio Community Edition CPU profiler and got the following output:

http://i.imgur.com/SbZIJCn.png

Since it just says that '[External Code]' is taking most of the CPU time, I'm not sure how to proceed. I can't identify many serious performance issues, and I find it hard to believe that the recursion could be the problem. My Ray-AABB intersection tests don't appear to be too slow either.

My code:
BSP Tree code (yes, I know the filename says Octree):
VoxelOctree.cs

GI_IntersectionTests.cs

GIWorld.cs

Any suggestions on how I should approach figuring out this issue would be appreciated.
The ray casting and tree traversal code is in the RayCast function in VoxelOctree.cs
Advertisement

Do you make a system call or call to a library function in there?

Edit:

Nvm, guess I can look.

EditEdit:

Or not. Where is VoxelTree? Oh, you linked it wrong, but I can just navigate to it. : p

Gonna guess that it's this:


BoundingBox b2 = new BoundingBox(Voxels.Values.ElementAt(i).Position - Vector3.One * vSide * 0.5f, Voxels.Values.ElementAt(i).Position + Vector3.One * vSide * 0.5f);

Can you break that out into a separate function and call it there, then check the profile again?

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.
Sorry about that, just noticed and fixed the link.
EDIT: Ah, a minute too late.

EDIT: The boundingbox class doesn't actually do any calculations on the data, but I'll give your suggestion a try in the morning.

I'm wondering if it's the allocation is why I singled that out. I don't see any other external calls in that function, but I'm kind of sleepy as well.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

BoundingBox is your code, so I would expect it to show up in your call stack if it were the problem. And it's a struct, so it shouldn't be causing any heap allocations.

I would focus maybe on the dictionary values enumeration. Isn't there a setting that will show you symbol information for system calls? I know there is for viewing the callstack in the debugger, I don't know if there is for the performance stuff...

You're calling ElementAt four times, it seems you should be able to just call it once and store the result.

I don't see anything obvious in that function that would be causing any heap allocations (and possible performance issues), but .net can be sneaky that way.

I'm tempted to guess the allocation as well. Calling "new" in C# is faster than C++'s global heap, but still not exactly speedy.

It looks like the allocator is being called a lot. Every time you have Voxels !=null, you loop through Voxels.Values.Count calls to the allocator to make a new one. How many voxel values do you have? How many times do you call RayCast? Multiply the two, that's about how many times you do the allocation.

My hunch is that you are hammering the allocator and constructor with many millions of calls for what looks like about 10 seconds.

Ok, it looks like (from my own tests) calling ElementAt on the Dictionary's Values collection is really slow. I'm guessing it is an order-n operation that needs to iterate through items one by one, since the larger the index the longer it takes.

You probably want to use foreach instead (and hopefully the ValueCollection's iterator is a value type, to avoid unnecessary heap allocation)

In the little test I did, for a Dictionary of 10,000 items, this:

for (int i = 0; i < blobs.Values.Count; i++)
{
value += blobs.Values.ElementAt(i).a;
}
was more than a 1000x slower than this:
foreach (Blob blob in blobs.Values)
{
value += blob.a;
}

BoundingBox is your code, so I would expect it to show up in your call stack if it were the problem. And it's a struct, so it shouldn't be causing any heap allocations.


There is a common misconception that structs must live on the stack rather than the heap.

C# makes a distinction between "reference types" and "value types", but that does not mean doesn't affect memory allocations or heap allocations. C# works very hard to avoid specifying exactly where things live. The data stack and the heap are both implementation details for the language. The specification is vague about the heap, allowing for there to be many heaps, and for heaps to be garbage collecting or not,

The C# specification notes that "structs are value types and do not require heap allocation", but that does not mean they require stack allocation.


Structs are created on the heap all the time in C#. Structs can be placed on the stack, but they also can be placed on the heap. Structs that are boxed, structs that are a field of a class, structs not closed in a lambda, structs in an iterator or foreach block, these are all probably on the heap. A raw variable may be on the stack or the heap.

The runtime has even more options. A struct's members can be pulled out into registers during JIT or ngen compilation. Unlike C++, these details don't matter as much in C#. The distinction between "reference types" and "value types" is generally sufficient.

Agreed that struct doesn't necessarily mean it's allocated on the stack, but in the OP's usage scenario it will.

You basically have:


    public struct BoundingBox : IEquatable<BoundingBox>
    {
        public Vector3 Min;
        public Vector3 Max;

        public BoundingBox(Vector3 min, Vector3 max)
        {
            this.Min = min;
            this.Max = max;
        }
    }

and then in the function:


    BoundingBox b2 = new BoundingBox(Voxels.Values.ElementAt(i).Position - Vector3.One * vSide * 0.5f, Voxels.Values.ElementAt(i).Position + Vector3.One * vSide * 0.5f);

In this case, the new is not allocating on the heap. I looked to see somewhere that b2 might be boxed (say, by an implicit cast to IEquatable<BoundingBox> ), and couldn't find any.

Ok, it looks like (from my own tests) calling ElementAt on the Dictionary's Values collection is really slow. I'm guessing it is an order-n operation that needs to iterate through items one by one, since the larger the index the longer it takes.

You probably want to use foreach instead (and hopefully the ValueCollection's iterator is a value type, to avoid unnecessary heap allocation)

In the little test I did, for a Dictionary of 10,000 items, this:

for (int i = 0; i < blobs.Values.Count; i++)
{
value += blobs.Values.ElementAt(i).a;
}
was more than a 1000x slower than this:
foreach (Blob blob in blobs.Values)
{
value += blob.a;
}

Just a FYI:

Dictionary, List, possibly a few others (except for Collection<T>) all have struct enumerators and explicitly implement the GetEnumerator() so the GetEnumerator that actually will be called in that foreach will return the struct enumerator and not do any boxing.

With that said though, ElementAt is an extension method for IEnumerable<T> so the value collection is going to be treated as an IEnumerable<T> and the explicitly implemented GetEnumerator will be called, which means the struct enumerator will be boxed.

This topic is closed to new replies.

Advertisement