XNA/.Net and a love/hate relationship

posted in Not dead...
Published November 30, 2008
I have a problem with XNA/.Net.
It's not a huge problem but it is one which lurks around the back of my mind and generally bothers me. It's that old bugbear of performance.

Take C++: while it is a large and complicated language, it has had many man-hours thrown at it to get compilers to the state they are in today. At the same time it offers, via compiler-dependent extensions, ways of getting at important parts of the CPU, namely the SSE instructions of modern processors.
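To make the C++ side concrete, here is a minimal sketch of the kind of access I mean, assuming a compiler that ships the `<xmmintrin.h>` SSE intrinsics header (MSVC, GCC and Clang all do on x86). The function name `add_floats`, the `HAVE_SSE` macro and the scalar fallback are made up for illustration, not taken from any particular library.

```cpp
#include <cstddef>

// Use the SSE intrinsics only when the compiler says they are available.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, count).
void add_floats(const float* a, const float* b, float* out, std::size_t count)
{
    std::size_t i = 0;
#if HAVE_SSE
    // Packed path: each _mm_add_ps adds four floats with one instruction.
    for (; i + 4 <= count; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar path: handles the tail, or everything when SSE is unavailable.
    for (; i < count; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The packed loop is the part that C# code on .Net has no direct way to express; the scalar loop is roughly what you are left with.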

.Net, and thus by extension XNA, doesn't allow this directly (well, x64 generated code apparently uses SSE instructions instead of the FPU, but they are still scalar in nature). Sure, in the PC world you could write some C++ code and then hook it into your .Net application to access these things, but you've then got the overhead of getting to and from the unmanaged code, and of course doing this is impossible on the XB360.

Here we hit my main issue: the lack of VMX access on the XB360. The CPUs have a lot of power if the VMX is used right, and their scalar floating point units are comparatively very weak, so not having access to the VMX is a real handicap.

The other thing is the way the JIT works. On Windows you can pre-JIT software; the process takes a while, but it can result in better performance and better code. Surprisingly, afaik the XB360 has nothing like this. I say surprisingly because, given the known nature of the XB360, you would have thought a pre-JIT step would be easier to do there; it's not like the application is going to wake up one day and find itself in x64 mode instead of x86.

Don't get me wrong, I do love working with C# and the ability to run my code on the XB360 will be nice once I get something to run on it [smile]. These things just... bother me. Granted, a large part of the reason they bother me is that at work I have access to a raw XB360; however, even if I did write code for it, I don't see it getting on XBLA any time soon.

I hope that in time we'll get some decent XNA libs which will allow access to the SSE/VMX instructions for vector and matrix maths functionality. Maybe I should suggest it to someone...

Comments

benryves
The Mono guys are working on Mono.SIMD - hopefully we'll see something from Microsoft in the not-too-distant future.
November 30, 2008 03:20 PM
matt_j
Would you happen to have any sources saying there's no JIT on the 360? How much of a speed penalty does that end up being, I wonder.
November 30, 2008 09:05 PM
_the_phantom_
I think you misunderstood me; there is a JIT on the 360 and it works just like the Windows version. What is missing is something like ngen which can be used to pre-JIT/compile the exe so that instead of doing a runtime compile the code is compiled, optimised and then cached on disk ready to be used on the next run. Paint.Net is a good example of an application which does this on install.
December 01, 2008 04:09 AM
Dragon88
I understand your point, but I am a little curious what kind of performance penalties you're seeing. Is it a big problem? Sounds to me like it's actually not that bad of a problem, if you approach it creatively. As I understand it the JIT is called on each method/segment of code the first time it's run, so what you would want to do to get the effect of pre-JITing would be to make sure every function is called before gameplay begins. My approach to that would be to have a brief demo play when the game starts, which exercises the engine a bit, and gets everything all JITed so it'll be smooth when the player starts playing.

Thoughts?
December 01, 2008 05:55 PM
_the_phantom_
The main issue is that the JITer is very time constrained; it has to compile and optimise the code as fast as possible which rules out all manner of optimisations simply because it doesn't have the time.

An offline process, however, isn't time constrained and thus would be able to perform deeper/better optimisation than would be allowed at run time. So it would have the time to consider code paths and to do things such as auto-vectorisation to take advantage of the SSE/VMX instructions of the x86/x64 and XB360 hardware. (afaik only the Intel Professional C++ compiler auto-vectorises right now; other C++ compilers use intrinsics to get at the features, which while requiring brain work is still better than nowt.)

This is no small thing either; the XB360 has 128 VMX registers (per core iirc) and most of its floating point power is wrapped up in these.

For most games the lack of vectorisation isn't going to be a huge problem, I grant you, but it just irks me that there is all that "power" lying about which we can't get to. Floating point performance IS a known bad spot for .Net applications and by extension XNA; it's just more likely to show up on the XNA side, as games have a tendency to use floating point functionality more than your average business app. Also, the latter tends not to demand 30/60fps updates [wink]
December 02, 2008 04:15 AM
Dragon88
How many business apps run on an XB360? I'm sure they've thought of this in some form and fashion, as they're certainly pushing XNA and C#, and the XB360 is one of their main targets. How heavily does your application utilize SSE when pre-JITed on a PC?

Worth noting that register count != power, but I understand what you're saying.

Wish I could be of more help, but I'm a relative noob to C#, and I've never used XNA at all.
December 02, 2008 04:21 PM
_the_phantom_
Well, yes, business apps are a large part of why the JIT is fine in most cases. As you say if they intend to push it for the XB360 and beyond this is tech they will probably work out.

Right now the app is purely theoretical, and even the PC version doesn't produce auto-vectorised code. That is another feature which needs to be added and is probably being looked into, although again it will probably be a pre-compile/pre-JIT step as there simply isn't enough time to work it out at run time; well, unless the C# compiler did more of the heavy lifting, but that would require an MSIL upgrade as well.

Just to clarify my point: the floating point power is wrapped up in the VMX registers; the fact there are 128 of them is just another matter. 128*4 = 512 floating point numbers on chip at once; even allowing for src, src, dest that's still potentially 336 floats knocking around, and physics could do a fair amount with that. All that said, at the recent GameFest talk we were told that only about 5% of code makes use of the VMX instructions, so while it would be nice and it does irk me, it's not what I would call a 'deal breaker'.
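For anyone checking the arithmetic, here is a small sketch of one way the numbers work out. It assumes the 336 figure comes from splitting the 128 registers into (src, src, dest) triples and counting only the two source registers of each; that reading, and every name below, is an assumption made for illustration.

```cpp
// Figures quoted above: 128 VMX registers, each holding 4 floats.
constexpr int kRegisters = 128;
constexpr int kFloatsPerRegister = 4;

// Every register full of floats.
constexpr int kTotalFloats = kRegisters * kFloatsPerRegister;

// Assumed reading of "src, src, dest": group the registers in threes
// (integer division leaves two registers unused) and count the sources.
constexpr int kTriples = kRegisters / 3;
constexpr int kSourceFloats = kTriples * 2 * kFloatsPerRegister;

static_assert(kTotalFloats == 512, "128 * 4 floats on chip at once");
static_assert(kTriples == 42, "42 complete (src, src, dest) groups");
static_assert(kSourceFloats == 336, "42 * 2 * 4 source floats");
```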

The end result is pretty much what was said in the XNA Performance talk from GDC08 which I listened to recently; C#/JIT is probably at the level of around C++ compilers of the mid-90s. Getting there but sometimes it's lacking and things like inlining are a bit hit and miss so expecting vectorisation is a bit much anyways [grin]
December 02, 2008 06:17 PM
Dragon88
TBH, that's probably all they really need. The low-hanging fruit has probably already been plucked; the remaining optimizations are going to be the things that give you an extra couple of percent but take a lot of work. I guess I haven't ever benchmarked, but I doubt a '90s era C++ compiler was THAT much slower than one minted yesterday. The inefficiency of your algorithms has a far bigger impact on speed than your compiler's minute optimizations (the ones that aren't low-hanging) ever will.

That said, C# is an entirely different beast, and it most certainly does need the optimizations, but I think they're probably near the point of diminishing returns. But that's what I said when everyone was going gaga about quad core too. Look how many people listened then.
December 02, 2008 08:14 PM