Improving performance of .NET

I wrote a simple raytracer in C++ and then rewrote it in C#, but the C# version runs about 3 times slower and I can't figure out why. Both projects were built in release mode, and the structure of the code is as close between the two projects as I could get it. I used std::vector in the C++ version for my lists and ArrayList in the C# version. My Vector class (math vectors, that is) used inlined operator overloading in the C++ version; in the C# version I implemented it as a struct with static operator overloads.

When I ran the C# version through the profiler, it seemed to spend about 1/3 of its time in mscorlib. In my code, the most time seems to be taken up during the return from the overloaded operators in the Vector struct.

Anyone know some things to try to speed this up? I wouldn't think it should be 3 times slower, so I can only guess that there is something I'm doing wrong.

Thanks,
Phil

[Edited to fix typo.] [edited by - AliasNotFound on April 10, 2004 6:06:19 PM]

I'd check for excessive boxing/unboxing in your code. That seems to be a frequent cause of bad performance for C++ code moved to C#. For example, what kind of objects are you putting in your ArrayList?

Raytracers are very floating point intensive applications.

.NET is surprisingly fast, but falls flat on its face when faced with floating point math. Microsoft has said that floating point optimizations were cut from the 1.0 release of .NET due to time constraints and have been promised for version 2.0, which is due soon (they say, and I hope).

You said you used a struct for your vector class, which is a good performance trick unless you are storing your vectors in those ArrayList objects you mentioned. If you are, then (un)boxing is killing your performance. Look it up :-) The trick is to use typed C# arrays (which you will have to resize manually), which do not suffer from boxing costs. Generics (also coming in .NET 2.0) should solve this problem as well.

You said a lot of time was spent in mscorlib, can you tell us what functions specifically? Or post a text version of your profiler results? I could give you a better analysis and optimization tips.

-SniperBoB-

quote:
Original post by GnuVince
Have you tried profiling your code?


quote:

When I ran the C# version through the profiler, it seems to spend about 1/3 of its time in mscorlib. In my code the most time seems to be taken up during the return from the overloaded operators in the Vector struct. Anyone know some things to try to speed this up? I wouldn't think it should be 3 times slower so I can only guess that there is something I'm doing wrong.



Read the post before replying

I really recommend this book, as it tells you important facts about the framework that can help you improve the performance and overall effectiveness of your programs.

Regards,
Andre (VizOne) Loker

quote:
Original post by SnprBoB86

You said a lot of time was spent in mscorlib, can you tell us what functions specifically? Or post a text version of your profiler results? I could give you a better analysis and optimization tips.

-SniperBoB-


It seems to spend the most time in WaitForSingleObject and ArrayListEnumeratorSimple.MoveNext.

The only items I am putting into the array lists are the light and world objects (spheres, planes, etc.). But of course they all get iterated over during each call to my trace and shade methods. I don't believe there is any boxing/unboxing going on. My Vector objects are never put into an array list. There does seem to be an inordinate amount of time spent returning from the overloaded operator functions, though.

If you put anything that's a value type (integers, structs, etc.) into an ArrayList, then boxing/unboxing happens. Perhaps you already knew this, but it's worth mentioning. If you're spending a lot of your time in MoveNext, then the iterator cost of ArrayList is eating you; you should try using typed arrays instead (you can have typed arrays of class instances, too, which translates to arrays of instance pointers in C++).
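
For anyone who wants to see the difference in code, here is a minimal sketch of the two storage styles. Vec3 and BoxingSketch are made-up names for illustration, not the original poster's types, and the sketch only shows where boxing happens; it makes no claim about how large the cost is in any particular program.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for a math vector struct (illustrative only).
public struct Vec3
{
    public float X, Y, Z;

    public Vec3(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }
}

public static class BoxingSketch
{
    // ArrayList stores System.Object, so every struct added is boxed
    // (copied into a new heap object) and every read is unboxed by a cast.
    public static float SumXFromArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(new Vec3(i, 0f, 0f));   // boxing allocation here
        }

        float sum = 0f;
        for (int i = 0; i < list.Count; i++)
        {
            sum += ((Vec3)list[i]).X;        // unboxing copy here
        }
        return sum;
    }

    // A typed array stores the structs inline, so nothing is boxed.
    public static float SumXFromTypedArray(int count)
    {
        Vec3[] items = new Vec3[count];
        for (int i = 0; i < count; i++)
        {
            items[i] = new Vec3(i, 0f, 0f);
        }

        float sum = 0f;
        for (int i = 0; i < items.Length; i++)
        {
            sum += items[i].X;
        }
        return sum;
    }
}
```

Both methods return the same sum; only the first one allocates a heap object per element. Class instances (spheres, planes, lights) are already references, so putting those into an ArrayList does not box anything, although each read still needs a cast.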

Well, I replaced the ArrayLists with typed arrays and got about a 30% improvement. It's still twice as slow as the C++ version. I wonder how much of the deficiency is due to that floating point problem that was mentioned earlier?

Guest Anonymous Poster
quote:
Original post by AliasNotFound
It seems to spend the most time in WaitForSingleObject...
Have you investigated this further? Does your application use multiple threads? Is your app CPU bound? Are you by any chance running the C# version under a debugger (that would prevent all JIT optimizations)?

I did the same thing, with similar results. It turned out that my Vector3 struct didn't have any of its calls inlined. The .NET JIT is fairly conservative in what it inlines. You might want to check that these functions are being inlined, and if not, do the expansion yourself in the time-critical areas.
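
To make "do the expansion yourself" concrete, here is a rough sketch. The Vector3 layout and the InlineSketch helper are assumptions made up for illustration; whether the JIT actually inlines the operator version depends on the runtime, so treat this as the shape of the workaround rather than a guaranteed win.

```csharp
using System;

// Hypothetical vector struct with static operator overloads (illustrative only).
public struct Vector3
{
    public float X, Y, Z;

    public Vector3(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }

    public static Vector3 operator +(Vector3 a, Vector3 b)
    {
        return new Vector3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }

    public static Vector3 operator *(Vector3 v, float s)
    {
        return new Vector3(v.X * s, v.Y * s, v.Z * s);
    }
}

public static class InlineSketch
{
    // Operator version: two calls, each returning a temporary struct by value
    // if the JIT decides not to inline them.
    public static Vector3 PointOnRay(Vector3 origin, Vector3 direction, float t)
    {
        return origin + direction * t;
    }

    // Hand-expanded version: the same arithmetic written per component,
    // so there are no operator calls left for the JIT to inline or skip.
    public static Vector3 PointOnRayExpanded(Vector3 origin, Vector3 direction, float t)
    {
        Vector3 p;
        p.X = origin.X + direction.X * t;
        p.Y = origin.Y + direction.Y * t;
        p.Z = origin.Z + direction.Z * t;
        return p;
    }
}
```

The expanded form is uglier, so it probably only makes sense in the innermost loops that a profiler has already flagged.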

Is your program multithreaded? WaitForSingleObject sounds like it's the result of synchronized access.

quote:
Original post by sjelkjd
I did the same thing, with similar results. It turned out that my Vector3 struct didn't have any of its calls inlined. The .net JIT is fairly conservative in what it inlines. You might want to check that these functions are being inlined, and if not, do the expansion yourself in the time critical areas.

The JIT can only inline static methods, non-virtual methods, and operators.

Operators are practically guaranteed to be inlined, and non-virtuals are iffy.

quote:
Is your program multithreaded? WaitForSingleObject sounds like it's the result of synchronized access.

This sounds like the biggest hit. Did you use the 'lock' keyword?

quote:
Original post by ggs
quote:
Original post by sjelkjd
I did the same thing, with similar results. It turned out that my Vector3 struct didn't have any of its calls inlined. The .net JIT is fairly conservative in what it inlines. You might want to check that these functions are being inlined, and if not, do the expansion yourself in the time critical areas.

The JIT can only inline static methods, non-virtual methods, and operators.


Yes, and it won't inline more than 32 bytes of IL, and it tends to give up if non-trivial control logic is present as well. And your statement is not exactly true: it could inline virtual methods in a sealed class if they are invoked through a variable of that type.
Anyway, the best way to find out is to check the generated assembly code.

quote:
Original post by sjelkjd
Yes, and it won't inline more than 32 bytes of IL, and tends to give up if non-trivial control logic is present as well.

I also expect JITed native code isn't shared across process boundaries.

But it's early days yet for the .NET framework from Microsoft.

quote:
And your statement is not exactly true - it could inline virtual methods in a sealed class if invoked from a variable of that type.

That would be because a virtual method in a sealed class is the equivalent of a non-virtual method anyway, just with an extra level of indirection which never changes.

Under v2.0 (aka the next version) of the framework, the JIT can apparently inline virtual functions and interface functions as required. It also allows delegates to be up to 50 times faster than in the current implementation.

It is no surprise that Microsoft has licensed Sun's Java technology (see the recent $1.6 billion settlement between Sun & Microsoft). Since everything is virtual under Java, Sun has put a lot of research into JIT optimizations of virtual methods & interfaces. The license should allow MS access to Sun's research on JIT optimizations.

The .NET JIT is incredibly dumb in its current state, whereas JVM JITs are typically fairly smart. And the JVM JITs need to be, since they are working with an architecture which has fundamental performance problems unless you use some clever optimization strategies.



[edited by - ggs on April 12, 2004 7:52:07 AM]

Guest Anonymous Poster
Please stay on topic and don't turn this thread into a war.

A lot of time inside ArrayListEnumeratorSimple.MoveNext implies you are using foreach, which is currently unoptimized.

The addition of the yield keyword to .NET 2.0 will make foreach faster than regular for loops (because internally it can skip the bounds checking), but until then use for (int i = 0; i < array.Length; i++).

Does anyone know if

Dim i As Integer
Dim what As Type
For i = 0 To x - 1
    what = whatArray(i)
    'whatever
Next

or

Type what;
for (int i = 0; i < x; i++) {
    what = whatArray[i];
    //whatever
}

is faster than

Dim what As Type
For Each what In whatArray
    'whatever
Next

or the C# foreach?

DrGUI: See my post right before yours!

Personally, I use foreach in non-performance-critical code. But in a game engine, you need to avoid allocating a ton of array enumerators. So I use a standard for loop with a //USEFOREACH comment so I can upgrade to a real foreach once C# 2.0 comes out.

I did a little timing demo comparing looping with for-each and an indexer in VB.NET. The app wasn't complex or wholly scientific... I'm sure there were issues with my method, but the results were pretty consistent.

I tried:
* small lists (10 items)
* large lists (10000 items)
* boxed and non-boxed types (ints, strings, test class)
* ArrayList and CollectionBase

With little variation, using the indexer was 2x faster. It may take 30 minutes to create this app for yourself... but it's a good learning experience. Of course the real trick is to re-run all your test apps once the next version comes out...
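
A test like this could be sketched in C# roughly as follows. This is not the VB.NET app described above; the class and method names are invented for illustration, and it assumes System.Diagnostics.Stopwatch, which only arrived with .NET 2.0 (on 1.x you would need another timer such as Environment.TickCount). It also makes no promise about which loop wins on a given runtime.

```csharp
using System;
using System.Collections;
using System.Diagnostics;

// Hypothetical micro-benchmark helper (names are illustrative only).
public static class LoopTimingSketch
{
    // Fills an ArrayList with boxed ints 0..count-1.
    public static ArrayList BuildList(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list;
    }

    // foreach goes through the ArrayList enumerator (MoveNext/Current).
    public static long SumWithForEach(ArrayList list)
    {
        long sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;
        }
        return sum;
    }

    // The indexer version reads each element directly by position.
    public static long SumWithIndexer(ArrayList list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += (int)list[i];
        }
        return sum;
    }

    // Runs one of the two loops 'repeats' times and returns elapsed timer ticks.
    public static long TimeTicks(ArrayList list, bool useForEach, int repeats)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++)
        {
            if (useForEach)
            {
                SumWithForEach(list);
            }
            else
            {
                SumWithIndexer(list);
            }
        }
        watch.Stop();
        return watch.ElapsedTicks;
    }
}
```

To try it, build a list with BuildList(10) or BuildList(10000), call TimeTicks once with true and once with false, and compare the two numbers over several runs, outside the debugger and in a release build.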


Epolevne
