BlueHabu

Managed vs. Native Heap Speed Comparison


I have created two projects: one is native C++, the other is managed C++/CLI. Both programs allocate 1GB of data, write zeros to it one byte at a time, add 1 to each byte, and then delete the buffer. I was going to add a timer but did not want to clutter the code if I could tell which is faster by the seat of my pants (which I can). I figured I would put this out there and see what others thought.

http://dev.bluehabu.com/Downloads/Allocate.zip

What I am wondering is whether there is a small optimization that can be done to the C++/CLI code to make it faster. I know I can make the C++ code faster using asm tricks, but that is not the point. I wanted to see whether similar code runs slower or faster depending on whether it is managed or unmanaged. For me the managed code seems noticeably slower, but instead of starting yet another C#/C++ holy war I thought I would put these projects here and have people try them out and see which is faster.

FYI: Project Allocate2 is native; Allocate3 is managed. If you change the buffer size to something larger, let me know what you get it up to. Allocate3 threw an exception when the buffer was too big.

My important machine stats: AMD 4600 X2 AM2, Nvidia 590 chipset, 2GB of 4-4-4-12 Corsair DDR2-800. Compiler is VC8.

My motivation is to see the performance cost of managed code relative to C++. I have been working on a graphics engine for about 6 months now, and before I get too far I wanted to see what kind of performance I could get with managed code when using large portions of system memory. Thanks to all who take a look at my crappy code. Project files are included for VS2005.
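Since the projects were posted without a timer, here is a minimal sketch of how the native version of the test could be timed instead of judged by feel. This is not the code from Allocate.zip: the function name `time_fill_and_increment` is hypothetical, and `std::chrono` requires a newer compiler than VC8.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical helper (not from Allocate.zip): allocates `size` bytes on the
// native heap, writes zeros one byte at a time, adds 1 to each byte, frees the
// buffer, and returns the elapsed wall-clock time in milliseconds.
// If `last_value` is non-null it receives the final value of the last byte,
// which gives the loops an observable result.
double time_fill_and_increment(std::size_t size, unsigned char* last_value = nullptr)
{
    assert(size > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();

    std::unique_ptr<unsigned char[]> buffer(new unsigned char[size]);
    for (std::size_t i = 0; i < size; ++i) {
        buffer[i] = 0;              // first pass: zero every byte
    }
    for (std::size_t i = 0; i < size; ++i) {
        ++buffer[i];                // second pass: add 1 to every byte
    }
    const unsigned char result = buffer[size - 1];
    buffer.reset();                 // delete the buffer inside the timed region

    const auto stop = clock::now();
    if (last_value != nullptr) {
        *last_value = result;
    }
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_fill_and_increment(std::size_t(1) << 30)` would correspond to the 1GB case described above. An optimizing compiler may turn the zeroing loop into a block fill, so the measured time probably reflects memory traffic more than per-byte loop overhead.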

Lucky for us developers no real world project does that sort of thing.

edit: Seriously man, with all due respect, what is the point of ridiculous 'speed' comparisons using code that is in no way representative of real-world usage? I could pull code out of my ass that shows any result I want and call it a speed comparison, but you should know that it's worthless to compare stuff that no one in their right mind would ever do in a real program.

Also, ignoring the arbitrary silliness of the benchmark, if you replace the first loop with Array.Initialize (managed) or remove it altogether from both, they run at pretty much the same speed (for me).
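To illustrate dropping the first loop on the native side: managed arrays are zero-initialized by the runtime when they are allocated, so an explicit zeroing pass repeats work that has already been done. A rough native analogue, assuming a `std::vector` stands in for the raw buffer (the function name `make_incremented_buffer` is hypothetical), might look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: std::vector value-initializes its elements, so the
// buffer starts out zeroed without an explicit byte-by-byte loop. Only the
// "add 1 to each byte" pass remains.
std::vector<unsigned char> make_incremented_buffer(std::size_t size)
{
    assert(size > 0);
    std::vector<unsigned char> buffer(size);  // every byte starts at 0
    for (auto& byte : buffer) {
        ++byte;                               // single remaining pass
    }
    return buffer;
}
```

With the zeroing pass removed from both versions, what is left to compare is essentially one linear sweep over the memory, which may explain why the two run at about the same speed.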

I never called it a benchmarking tool, nor did I say it represents the real world. I am testing the speed of two different implementations of the heap under this condition.

Many modern games use more than 1GB of system memory, and they will have page faults. This test exaggerates that case, which makes detecting any performance difference easier. The fact that games will one day be multithreaded will cause the introduction of more "copies" of data as programmers try to reduce the number of dependencies. So finer-grained use of large portions of memory is a valid test IMHO. There is also the fact that moving to a 64-bit OS will increase the number of page tables that need to be walked to resolve a logical address to a physical one. However, I believe SP2 with the disable execute bit already uses more page tables to fit the extra bit.
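To put a rough number on the page-fault argument: on a system with demand paging, a freshly allocated buffer typically faults once per page the first time that page is touched. A minimal sketch, assuming 4KB pages and a buffer that starts on a page boundary (both helper names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Number of pages a buffer of `buffer_bytes` covers, assuming it starts on a
// page boundary. The 4096-byte default page size is an assumption.
constexpr std::size_t pages_spanned(std::size_t buffer_bytes,
                                    std::size_t page_bytes = 4096)
{
    return (buffer_bytes + page_bytes - 1) / page_bytes;
}

// Writes one byte at the start of every page-sized stride of the buffer and
// returns how many writes were made. On a freshly allocated buffer each of
// these writes is a candidate first-touch page fault.
std::size_t touch_each_page(unsigned char* buffer, std::size_t buffer_bytes,
                            std::size_t page_bytes = 4096)
{
    assert(buffer != nullptr && page_bytes > 0);
    std::size_t touched = 0;
    for (std::size_t offset = 0; offset < buffer_bytes; offset += page_bytes) {
        buffer[offset] = 1;
        ++touched;
    }
    return touched;
}
```

Under these assumptions a 1GB buffer covers 262,144 pages, so the first pass over it can take up to that many first-touch faults regardless of whether the heap is managed or native.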

There are many benchmarks that do not directly test real-world situations. That fact does not invalidate them. I also have LZMA compression done in both C++/CLI and C++, and sometimes it needs more than 600MB using native code and 630MB in managed. The managed code is slower in this case as well (after profiling and fixing all the possible hotspots), but no one is getting the source to that.

Telastyn, thanks. I tried that beforehand and noticed the byte code was smaller when I used Init(). I also noticed that when I added my MMX memset, the performance advantage went back to native.

Quote:
Original post by DrEvil
but you should know that it's worthless to compare stuff that no one in their right mind would ever do in a real program.


I am assuming you think 3DMark/Super PI/Prime/Sandra/HDTach/POV-Ray benchmarks are "worthless" because "real world" programmers don't push machine components like these benchmarks do. Is my assumption correct?

Being a professional game developer using C#, C++/CLI and C++, and having ported some tools and noticed slowdowns for no reason other than being managed code, I thought it worthwhile to get the gamedev community involved with this small, simple test I had written. That was my mistake. I also thought it a bad idea to make a more elaborate test that could be dissected and have every line argued about.

What I see most here at gamedev recently is the ease with which “Use C#” is thrown around. Not once have I noticed any post that even attempted to evaluate the performance cost of doing so, nor any post that compared the actual amount of code written after all the try/catch blocks have been added. I read way more than I post, and I have noticed that both of you are proponents of C#. What quantitative evaluations have you done to warrant your strong positive opinions of managed code? Are any of these evaluations public so that we can see them?

Thanks for your time.

Yes, for that specific test, today native is faster. Tomorrow - who knows.
There's a lot more to a language/environment than just speed. Most of us already accept the fact that .NET is simply going to be a little slower than native code for many things. Proving this is not necessary.

There are so many other factors to consider between the two. Sure, there will be some people who are primarily interested in raw speed, but most people consider a number of other things higher priority. No memory leaks, no buffer overruns, etc. These things have at least as much importance.

Guest Anonymous Poster
Programmers get themselves into a rut when they stop considering alternative (but functionally equivalent) methodology.

Obviously managed code has separate memory allocation issues that need to be addressed with algorithms tailored to it. You aren't doing that at all.

What you seem to be doing is measuring the cost of rigidly sticking to algorithms well suited to customized memory management under conditions where you don't have such control.

Your performance measurement is irrevocably biased, and I suspect that the slowdowns you claim are "for no other reason than using managed code" have a different reason: That you tried to port things as simply as possible rather than as intelligently as possible.
