C# .Net Open Source

22 comments, last by Stainless 9 years, 4 months ago
Maybe Unity will finally be able to upgrade everything to .Net 4.6/C# 5.0!

I don't like Mono, but I might try C# once it's fully integrated in Ubuntu. If the speed is comparable to C++ and it doesn't pose install hassles for users, I might use it; otherwise I'll simply stick with C++. I'm curious whether Unity3D will move to Microsoft's C#, since they didn't update to the latest Mono.

 

Unity3D has already started using C# as its scripting language and, in conjunction with MS, has developed a Visual Studio plugin that will interface with it for writing your scripts while in Unity.

It's not going to take many existing Java jobs away from places where Java is already in place; the only reason such a migration might make sense is if a company was having trouble finding Java people (or had an abundance of C# people) in their local area. For new jobs where neither Java nor C# is already in place, C# will be more attractive now -- to be perfectly blunt, C# is a better language than Java, full-stop. The only advantage Java has really had is that it had been more open and had gotten a head-start, especially on non-Microsoft platforms. Through Mono, C# has already been an option in many places, but people are wary of Mono for fear of it not being "official" or for fear of Microsoft one day coming after them. Those concerns are now moot.

The core of .NET is open, but not everything, so you won't see total compatibility for every .NET desktop application overnight. What you will see, eventually, is the open-source core being pulled into and drawn from by projects like Mono or Unity. As a result, those projects will have an easier time maintaining parity with language features, and will have more time to work on the things that aren't part of the open-source core. The runtime, and effectively the languages, are all part of that core though -- I think it's just parts of the platform libraries that aren't open yet.


Poor cache awareness in the application code, however, might hurt the performance more, but then again, if you don't pay attention to this in C++ you will have a similar slowdown.

It's true, but the design of managed languages and the CLR gives you less control over the very precise behavior of memory use. Cache-aware C# runs better than non-cache-aware C#, but will likely never run as well as cache-aware C or C++, and it still lacks truly deterministic resource reclamation, which is also a hindrance to performance-tuned C#.


Unity3D has already started using C# as its scripting language and, in conjunction with MS, has developed a Visual Studio plugin that will interface with it for writing your scripts while in Unity.

Actually, Microsoft bought a company called SyntaxTree who already made and sold a plugin called UnityVS. Those folks are now working as part of Microsoft, together with the Visual Studio folks, to offer a better product. On top of that, the product, now called Visual Studio Tools for Unity, has been made free, and there's now VS2013 Community, which supports such plugins in a free version of Visual Studio. VS Community and VSTU are part of a general trend of making tools more accessible.


C# will be more attractive now -- to be perfectly blunt, C# is a better language than Java, full-stop.


So much this. I have worked with both languages on and off ever since 2004. I am currently doing Java work.

C# is already attractive and I strongly prefer it over Java. It just feels better. Mono has some quirks but they are not significantly worse than JVM quirks, and just like in Java land, you can learn the quirks and work around them.

Opening the MS implementation means they can hopefully fix so many of those little quirks like the foreach() overhead and nested iterator serialization concerns.

Microsoft bought a company called SyntaxTree who already made and sold a plugin called UnityVS

I love UnityVS. When I got the email from Microsoft that announced they had acquired UnityVS and would be contacting the purchasers with additional information I was very concerned.

Was it going to be Microsoft embrace/extend/extinguish, or a case of promoting it to world class software?

Sadly the jury is still out on this.

If there was an easy fix to get away from the 'stop the world' garbage collector in there

This is by design of the language, for both C# and Java. Stop the world very rarely happens when you follow a few simple development rules.

If you follow a few simple practices (and sometimes a few complex practices) this is actually an amazing feature of the languages.

Let me explain:

First, there is a background-priority task (an "invisible" process, or an aspect of the virtual machine) that runs the collection algorithms. There are multiple algorithms with different behaviors.

Concurrent Mark Sweep (CMS) is found in both modern JVMs and Mono, via the -XX:+UseConcMarkSweepGC option in Java or the --gc=sgen flag in Mono. As background work, while your program is busy loading data or sending graphics commands over the bus or otherwise has some spare time on any processor (which is quite often), it runs a concurrent mark algorithm. This travels the live objects and marks them as reachable, then it runs a sweep to find and release unreachable memory. It takes a concurrent lock on the memory block, but since this is a low-priority process that runs on idle it is effectively free, although there is an extremely small but measurable memory cost to pay for the concurrent marking.

Then there is a second version called Mark/Sweep/Compact (MSC) that does the same thing, but does invoke some rare "stop the world" behavior. In addition to the mark and sweep, there is a compact phase that effectively defragments your memory. While it does stop the world, it also typically runs as a low-priority task so it only hits on idle. When enough objects are ready to move to a different generation of the GC (in Mono the generations are 'Nursery', 'Major Heap', in Java they are "Young Collection" and "Old Space Collection") the threads referencing the memory are paused, a small chunk of memory is migrated from one location to another transparently to the application, and the threads are resumed.

In normal operation, CMS quietly runs in the background, sweeping up objects you no longer need. Critically: THIS IS A TASK RUN DURING IDLE. You do not see it, and you do not pay any significant cost for it. Occasionally, when there is a lot of accumulated 'old' stuff, the "compact" algorithm will pause the few threads referencing something and move it between generations, or move it to defragment memory. Critically: THIS IS ALSO A TASK RUN DURING IDLE, but with the caveat that it can potentially stick around a few microseconds into when something would normally get scheduled.

Also note that there is a secondary memory pool for long-term objects. This includes things like models, textures, and audio. You don't want those moved around, so you specifically load them into the older-school style of managed memory; I'll come back to this in the language comparison below.

In very exceptional operation, at times when you are thrashing memory or leaking resources or hitting other stupid bugs, you will run out of memory in one of the pools. In that case the compact is forced to run immediately. This stops the world in the middle of the action and blocks until the memory is scanned. This is the bad one that people complain about.
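To make that concrete, here is a small C# sketch (the sizes and loop count are arbitrary) that watches how objects get promoted through the generations and how often each generation has been collected:

```csharp
using System;

class GcGenerationsDemo
{
    static void Main()
    {
        // A longer-lived allocation: each time it survives a collection it is
        // promoted, heading toward the oldest generation (gen 2 on the .NET
        // CLR, the major heap in Mono's sgen collector).
        var longLived = new byte[64 * 1024];

        // Short-lived garbage: these die young and are reclaimed by the cheap
        // nursery / gen 0 collections.
        for (int i = 0; i < 100000; i++)
        {
            var temp = new byte[256];
            GC.KeepAlive(temp);
        }

        Console.WriteLine("longLived is currently in generation " +
                          GC.GetGeneration(longLived));
        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
        Console.WriteLine("Gen 1 collections so far: " + GC.CollectionCount(1));
        Console.WriteLine("Gen 2 collections so far: " + GC.CollectionCount(2));
    }
}
```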

So let's compare how this works with the C++ model.

In C++ you typically free memory in destructors and also free memory as a task in the middle of other processing. Most game studios have semi-strict rules about when you can allocate memory and (with various degrees of enforcement) only permit you to allocate/release memory during times like level loading or specially designated transitions. The reason for this restriction is that collection (calling delete or free) takes place immediately, and also because if you leave things on the heap it can badly fragment memory. Memory fragmentation is a lesser concern on the PC thanks to virtual memory remapping, but it is still a minor problem on PC and a major concern on devices. The processing is not delayed until idle; the heap modifications take place during the busy section of your code. In the roughly 11 major engines I have worked with zero of them displaced the heap processing to a low priority process. In every engine it was entirely up to the developers to not fragment memory, and sadly very often there would be a few bits here and there that would get missed, and every game shipped with a slowly fragmenting memory space.

So with C++ you get to pick and choose when GC runs, but in my experience this universally means that GC runs at the worst possible time, it runs when the system is under load. With modern versions of both Java and C#, you do not get to choose when GC runs but it is almost always at the best possible time, it runs when the system is idle. On rare occasions it will slip into when you should not be idle, consuming on the order of 1/10,000 of your frame time. On extremely rare occasions (typically caused by bad/prohibited/buggy practices) it will unexpectedly run when the system is under load, exactly like C++ except not under your control. Historically we log those extremely rare occasions and flag them for investigation and floggings.

One of the great practices I learned years ago was the use of threads for organizational management. THIS IS NOT A BEGINNERS ARCHITECTURE AND DOING IT WRONG CAN BE FATAL TO PERFORMANCE, do it with extreme caution and profiling. Effectively, inside the simulator each game object's specific allocations live only within that thread, and cross-thread data is very carefully designed to live in a lock-free environment. Typically the engine has all its allocations moved into the old generation early on, so when the "compact" algorithm is triggered it normally is triggered by game objects and only locks game objects. Each thread gets awoken (very carefully to avoid the stampede of elephants problem) within the pool and given the opportunity to update. So when the "compact" algorithm runs and needs to stop the world, the world it stops is only the single thread among many that are queued. The other simulator threads continue to execute. Since so many other threads are simultaneously scheduled the algorithm's lock has no net effect on performance. You may have several hundred pending tasks and a few get blocked, so the OS immediately picks up the next unblocked task without pause.

For a commercial environment where you can ensure those bad/prohibited/buggy practices are minimized, the Java/C# model is far and away better for performance. In C++ you pay the cost during heavy load. In C#/Java you pay the cost during idle, stop-the-world conditions in practice do not stop the world at all, and releasing memory is approximately zero cost.

For those still working in C++-only code on a platform that supports managed code (C++/CLI), you can use the .NET libraries to take advantage of GC-managed objects, and I strongly suggest you investigate doing so. It is not hard to change from * to ^ notation on the objects, and done well it can give some big benefits for managing object lifetimes.

That is way more than I intended to write, but it should cover it well enough.

tl;dr for this section: The modern GC algorithms in C# and Java are amazingly good, with a default model that completely exceeds the C++ end-of-life memory management model.

Microsoft bought a company called SyntaxTree who already made and sold a plugin called UnityVS


I love UnityVS. When I got the email from Microsoft that announced they had acquired UnityVS and would be contacting the purchasers with additional information I was very concerned.

Was it going to be Microsoft embrace/extend/extinguish, or a case of promoting it to world class software?

Sadly the jury is still out on this.


While the jury might be out, I'd be less concerned about this outcome now than a few years ago; the opening up of a free VS with plugin support, combined with them acquiring the plugin, would seem to indicate a desire to support Unity on their platforms (and tempt people towards non-Unity tools going forward) so people keep using their platforms - provide a good experience on Windows for Unity dev and people will keep using and releasing on Windows.
(On a related note VS2015 is being specifically tested with UE4 to improve the development experience there too - so it seems MS are keen to support engine users via their tools.)

Same goes for the Android debugging support - with that one announcement my love for MS just went up 10,000 fold because frankly trying to do anything on Android from Windows right now remains a complete cluster-fuck and I have no faith that Google will do anything useful about it any time soon.

I've been saying this for a while but recent actions have made it clearer; this isn't the old MS. There is a large shift happening, from giving things away for free to admitting that other OSes exist beyond Windows, so I suspect a lot of the old habits are going to die too.

If you follow a few simple practices (and sometimes a few complex practices) this ["no way to get away from the 'stop the world' garbage collector"] is actually an amazing feature of the languages.

The GC may be amazing, but why is barring you from having any control an amazing feature? Wouldn't it be nice if you could choose to opt in to specifying the sizes of the different heaps, hinting at good times to run the different phases, specifying runtime limits, providing your own background threads instead of automatically getting them, etc? Would it harm anything to allow devs to opt in to that stuff? Do the amazing features require the GC to disallow these kinds of hints?

With modern versions of both Java and C# ... On rare occasions [when GC runs at the wrong time, it consumes] on the order of 1/10,000 of your frame time.

16.667ms / 10000 = 1.7 microseconds
Having seen GCs eat up anywhere from 1-8ms per frame in the past (when running on a background low-priority thread), claims of 1µs worst-case GC times sound pretty unbelievable -- the dozen cache misses involved in a fairly minimal GC cycle would alone cost that much time!

I know C# has come a long way, but magic of that scale is justifiably going to be met with some skepticism.
Combine that skepticism with the huge cost involved in converting an engine over to use a GC as its core memory management system, and you've still got a lot of resistance to accepting them.
Also, it's often impossible to do an apples-to-apples comparison, because the semantics used by the initial allocation strategies and the final GC strategy end up being completely different, making it hard to do a valid real-world head-to-head too...

while your program has some spare time on any processor (which is quite often)

Whether it's quite often or not entirely depends on the game. If you're CPU bound, then the processor might never be idle. In that case, instead of releasing your per-frame allocations every frame, they'll build up until some magical threshold out of your control is triggered, causing a frame-time hitch as the GC finally runs in that odd frame.

Also when a thread goes idle, the system knows that it's now safe to run the GC... but the system cannot possibly know how long it will be idle for. The programmer does know that information though! The programmer may know that the thread will idle for 1 microsecond at frame-schedule point A, but then for 1 millisecond at point B.
The system sees both of those checkpoints as equal "idle" events and so starts doing a GC pass at point A. The programmer sees them as having completely different impacts on the frame's critical path (and thus frame-time) and can explicitly choose which one is best, potentially decreasing their critical path.

In C++ ... collection (calling delete or free) takes place immediately ... this universally means that GC runs at the worst possible time, it runs when the system is under load.

I assume here we're just dealing with the cost in updating the allocator's management structures -- e.g. merging the allocation back into the global heap / the cost of the C free function / etc?

In most engines I've used recently, when a thread is about to idle, it first checks in with the job/task system to see if there's any useful work for it to do instead of idling. It would be fairly simple to have free push the pointer into a thread-local pending list, which kicks a job to actually free that list of pointers once some threshold is reached.
I might give it a go :D Something like this for a quick attempt, I guess.
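Very roughly, and only as an illustration with made-up names (in C# the batch would hold blocks from Marshal.AllocHGlobal rather than pointers destined for free(), but the shape is the same):

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

// Illustrative sketch: instead of paying for every release on the hot path,
// park the pointers on a thread-local list and release the whole batch as a
// background job once a threshold is reached.
static class DeferredFree
{
    private const int Threshold = 256;

    [ThreadStatic] private static List<IntPtr> pending;

    public static void Free(IntPtr nativeBlock)
    {
        if (pending == null)
            pending = new List<IntPtr>(Threshold);

        pending.Add(nativeBlock);

        if (pending.Count >= Threshold)
        {
            List<IntPtr> batch = pending;
            pending = new List<IntPtr>(Threshold);

            // A real engine would hand this to its own job system; Task.Run
            // stands in for "some worker picks it up when it has spare time".
            Task.Run(() =>
            {
                foreach (IntPtr p in batch)
                    Marshal.FreeHGlobal(p);   // only valid for AllocHGlobal blocks
            });
        }
    }
}
```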

However, the cost of freeing an allocation in a C++ engine is completely different to the (amortized) cost of freeing an allocation with a GC.
There's no standard practice for handling memory allocation in C++ -- the 'standard' might be something like shared_ptr, etc... but I've rarely seen that typical approach make its way into game engines.
The whole time I've been working on console games (PS2->PS4), we've used stack allocators and pools as the front-line allocation solutions.

Instead of having one stack (the call stack) with a lifetime of the current program scope, you make a whole bunch of them with different lifetimes. Instead of having the one scope, defined by the program counter, you make a whole bunch of custom scopes for each stack to manage the sub-lifetimes within them. You can then use RAII to tie those sub-lifetimes into the lifetimes of other objects (which might eventually lead back to a regular call-stack lifetime).
Allocating an object from a stack is equivalent to incrementing a pointer -- basically free! Allocating N objects is the exact same cost.
Allocating an object from a pool is almost as free -- popping an item from the front of a linked list. Allocating N objects is (N * almost_free).
Freeing any number of objects from a stack is free; it's just overwriting the cursor pointer with an earlier value.
Freeing an object from a pool is just pushing it to the front of the linked list.
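A toy version of the stack allocator, to show just how little work is involved (a real engine version would be C++ handing out raw pointers from one big block, but the operations are identical):

```csharp
using System;

// Toy linear ("stack") allocator: allocation bumps a cursor, and releasing an
// entire scope is a single assignment that rewinds the cursor. There is no
// per-object free and no per-object book-keeping.
class StackAllocator
{
    private readonly byte[] buffer;
    private int cursor;

    public StackAllocator(int sizeInBytes)
    {
        buffer = new byte[sizeInBytes];
    }

    public ArraySegment<byte> Allocate(int size)
    {
        if (cursor + size > buffer.Length)
            throw new OutOfMemoryException("Linear allocator exhausted");
        var block = new ArraySegment<byte>(buffer, cursor, size);
        cursor += size;                 // "allocation" is just this increment
        return block;
    }

    // A scope marker is just the current cursor position...
    public int Mark()
    {
        return cursor;
    }

    // ...and freeing everything allocated since that marker is one store.
    public void Release(int mark)
    {
        cursor = mark;
    }
}
```

Per-frame allocations, for example, just record Mark() at the start of the frame and call Release(mark) at the end.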

Also, while we're talking about these kinds of job systems -- the thread-pool threads are very often 'going idle' but then popping work from the job queue instead of sleeping. It's pretty ridiculous to claim that these jobs are free because they're running on an otherwise 'idle' thread. Some games I've seen recently have a huge percentage of their processing workload inside these kinds of jobs. It's still vitally important to know how many ms each of these 'free' jobs is taking.

In the roughly 11 major engines I have worked with zero of them displaced the heap processing to a low priority process.

The low priority thread is there to automatically decide a good 'idle' time for the task to run. The engines I've worked with recently usually have a fixed pool of normal priority threads, but which can pop jobs of different priorities from a central scheduler. The other option is the programmer can explicitly schedule the ideal point in the frame for this work to occur.

I find it hard to believe that most professional engines aren't doing this at least in some form...?
e.g.
When managing allocations of GPU-RAM, you can't free them as soon as the CPU orphans them, because the GPU might still be reading that data due to it being a frame or more behind -- the standard solution I've seen is to push these pointers into a queue to be released in N frames' time, when it's guaranteed that the GPU is finished with them.
At the start of each CPU-frame, it bulk releases a list of GPU-RAM allocations from N frames earlier.
Bulk-releasing GPU-RAM allocations is especially nice, because GPU-RAM heaps usually have a very compact structure (instead of keeping their book-keeping data in scattered headers before each actual allocation, like many CPU-RAM heaps do) which can potentially fit entirely into L1.
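A sketch of that deferred-release queue, assuming a fixed number of frames in flight (the handle type and the bulk-release callback are placeholders for whatever the GPU heap actually exposes):

```csharp
using System;
using System.Collections.Generic;

// Sketch of the "free in N frames" queue described above. The handles here
// are opaque GPU allocation handles from whatever graphics API is in use.
class DeferredGpuReleaseQueue
{
    private const int FramesInFlight = 3;   // assumption: triple buffering
    private readonly List<ulong>[] ring = new List<ulong>[FramesInFlight];
    private int frameIndex;

    public DeferredGpuReleaseQueue()
    {
        for (int i = 0; i < FramesInFlight; i++)
            ring[i] = new List<ulong>();
    }

    // Called when the CPU orphans an allocation; the GPU may still be using it.
    public void QueueRelease(ulong gpuAllocationHandle)
    {
        ring[frameIndex].Add(gpuAllocationHandle);
    }

    // Called once at the start of each CPU frame. Everything queued
    // FramesInFlight frames ago is now guaranteed unused by the GPU.
    public void BeginFrame(Action<List<ulong>> bulkRelease)
    {
        frameIndex = (frameIndex + 1) % FramesInFlight;
        var dead = ring[frameIndex];
        bulkRelease(dead);   // one bulk call into the GPU heap's book-keeping
        dead.Clear();
    }
}
```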

Also, when using smaller, local memory allocators instead of global malloc/free everywhere, you've got thread safety to deal with. Instead of the slow/general-purpose solution of making your allocators all thread-safe (lock-free / surrounded by a mutex / etc), you'll often use a similar strategy to the above, where you batch up 'dead' resources (potentially using wait-free queues across many threads) and then free them in bulk on the thread that owns the allocator.
e.g. a job that's running on an SPU might output a list of Entity handles that can be released. That output buffer forms an input to another job that actually performs the updates on the allocator's internal structures to release those Entities.

One engine I used recently implemented something similar to the Actor model, allowing typical bullshit style C++ OOP code to run concurrently (and 100% deterministically) across any number of threads. This used typical reference counting (strong and weak pointers) but in a wait-free fashion for performance (instead of atomic counters, an array of counters equal in size to the thread pool size). Whenever a ref-counter was decremented, the object was pushed into a "potentially garbage" list. Later in the frame schedule where it was provable that the Actors weren't being touched, a series of jobs would run that would aggregate the ref counters and find Actors who had actually been decremented to zero references, and then push them into another queue for actual deletion.

Lastly, even if you just drop in something like tcmalloc to replace the default malloc/free, it does similar work internally, where pointers are cached in small thread-local queues before eventually being merged back into the global heap in batches.

When enough objects are ready to move to a different generation of the GC (in Mono the generations are 'Nursery', 'Major Heap', in Java they are "Young Collection" and "Old Space Collection") the threads referencing the memory are paused, a small chunk of memory is migrated from one location to another transparently to the application, and the threads are resumed.

Isn't it nicer to just put the data in the right place to begin with?
It's fairly normal in my experience to pre-create a bunch of specialized allocators for different purposes and lifetimes. Objects that persist throughout a whole level are allocated from one source, objects in one zone of the level from another, objects existing for the life of a function from another (the call-stack), objects for the life of a frame from another, etc...
Often, we would allocate large blocks of memory that correspond to geographical regions within the game world itself, and then create a stack allocator that uses that large block for storing objects with the same lifespan as that region. If short-lived objects exist within the region, you can create a long-lived pool of those short-lived objects within the stack (within the one large block).
When the region is no longer required, that entire huge multi-MB block is returned to a pool in one single free operation, which takes a few CPU cycles (pushing a single pointer into a linked list). Even if this work occurs immediately, which you say is a weakness of most C++ schemes, that's still basically free versus the cost of tracking the thousands of objects within that region with a GC...

On extremely rare occasions (typically caused by bad/prohibited/buggy practices) it will unexpectedly run when the system is under load, exactly like C++ except not under your control.

So no - the above C++ allocation schemes don't sound exactly like a GC at all :P

:)

C++ memory handling is a completely different subject. You could write a garbage collector for your C++ project, but I don't advise it.

We typically have four different memory managers in our C++ engines, all for different situations, and it's up to the coders to manage their use themselves. Yes, it does mean I have to yell at people when they do something stupid. Yes, it does mean that a single bad check-in can break the entire game, but it's still preferable to a garbage collector.

Also, I have seen cases where the C# garbage collector can hang for several seconds, and in one case it actually crashed the game, but in all of these cases it was bad programming that caused the issue, not a fault in C#.

Anyway, going back to the OP's original questions:

1) Will it change programming?

Of course not. We will still be sat in front of a monitor, typing on a keyboard, swearing at a compiler; .NET won't change that.

2) Will it hurt Java?

Yes, and so it should. Java is a pile of ....<<3 megabytes of expletives removed by filter>> Several years of my life have been wasted writing Java VMs, so I know how it works, and I wish it had been strangled at birth.

3) Eclipse et al.

Will continue. People like what they know and are hard to persuade to change. Hell, I am sure there are people who still miss Netscape. I still have a copy of vi on my machines. Nothing Microsoft does will change that.


The GC may be amazing, but why is barring you from having any control an amazing feature? Wouldn't it be nice if you could choose to opt in to specifying the sizes of the different heaps, hinting at good times to run the different phases, specifying runtime limits, providing your own background threads instead of automatically getting them, etc? Would it harm anything to allow devs to opt in to that stuff? Do the amazing features require the GC to disallow these kinds of hints?

You can do a lot of that now. You've always been able to hint that a new collection should be run (which you might want to do right after loading a new level, say). Newer versions of .NET let you disable the GC and turn it back on again for sections of your code, so you could leave it disabled for your main loop and then flip it back on again during level load. There are different levels to this feature as well; you can set it to *never* run, or set it to "low latency", where it almost never runs unless you get critically close to running out of memory. You can also manually compact the LOH, letting you choose good times to reduce fragmentation.
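For reference, a few of those knobs as they appear in C# (availability depends on the runtime version; GC.TryStartNoGCRegion, for instance, only showed up in .NET 4.6):

```csharp
using System;
using System.Runtime;

class GcControlExamples
{
    static void Main()
    {
        // Ask for fewer, less intrusive blocking collections during gameplay.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;

        // .NET 4.6+: forbid collections entirely for a critical section, as
        // long as we stay under the budget promised up front (64 MB here).
        if (GC.TryStartNoGCRegion(64 * 1024 * 1024))
        {
            try
            {
                RunMainLoopSlice();   // placeholder for your own game-loop work
            }
            finally
            {
                // Throws if the region already ended because the budget was blown.
                GC.EndNoGCRegion();
            }
        }

        // A good moment (a level load, say) to compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }

    static void RunMainLoopSlice() { /* placeholder */ }
}
```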

If you want even more control, like taking full control of thread scheduling of the GC or setting size limits, you can host the CLR, similar to how Unity works. There are a crazy amount of knobs to tweak there.

Of course, the simplest advice that avoids all of this is what it has always been in both the managed and native worlds: during the main loop of your game, don't heap allocate. Not necessarily easy, but simple to understand. It's certainly easier to do in C++, but also doable in C# (in fact, it was almost a hard requirement that you do that for Xbox Arcade XNA games, since the Xbox's GC was pretty crappy). Unlike in some other managed languages that will remain unnamed, the .NET CLR supports value types, so you can with just a bit of effort cut down heavily on the amount of garbage you're generating.

For the times you absolutely need heap allocations but really need to avoid the managed heap, you can always just *allocate native memory* anyway! There's nothing stopping you from malloc-ing some native memory blocks and doing your work there. I do this pretty commonly in my own projects for certain classes of memory where I need explicit control over lifetime or alignment.
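A minimal sketch of that (error handling and alignment left out):

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch: a block of unmanaged memory the GC never sees. Lifetime and
// cleanup are entirely in your hands, which is exactly the point.
class NativeScratchBuffer : IDisposable
{
    public IntPtr Pointer { get; private set; }
    public int SizeInBytes { get; private set; }

    public NativeScratchBuffer(int sizeInBytes)
    {
        SizeInBytes = sizeInBytes;
        Pointer = Marshal.AllocHGlobal(sizeInBytes);
    }

    public void Dispose()
    {
        if (Pointer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(Pointer);
            Pointer = IntPtr.Zero;
        }
    }
}

// Usage: fill it via unsafe pointers or Marshal.Copy, free it when you decide.
// using (var scratch = new NativeScratchBuffer(16 * 1024 * 1024)) { ... }
```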

Mike Popoloski | Journal | SlimDX
One thing that hasn't been considered in all this GC discussion is that the GC only knows about memory.

That's it.

The C++ "RAII" idiom actually can clean up any resource immediately. And this is incredibly important. It is, in fact, so important that .NET has the IDisposable idiom to perform RAII-like tasks in its GC world.

Sure, I may not mind if my 6-core desktop with 32GB of RAM leaves a few data structures around and cleans them up with a background thread later, but I do want it to not keep a file open forever because the GC has decided that it doesn't need to run yet.

And that doesn't even get into the issues that GC in general has on memory-limited devices like phones, tablets, and consoles. In fact, there are special versions of .NET that run on some of these devices with a non-generational GC, because they can't afford all the extra memory a generational GC requires. And that kind of GC is most certainly not "invisible" to your program.

Well-written GCs (like .NET's) are great for memory on non-memory constrained systems. But they never mean you don't have to care about memory. And they don't do anything for you with non-memory resources.

Indeed, you still have to play nice with the GC. For example, do not generate tons of medium-lifetime objects. Short-lived objects are more or less free, and large allocations are also more or less free (from a processing perspective), but medium-lifetime objects tend to trigger the more expensive collections, which notably affects framerate etc. And those do take their time, so you need to avoid getting there.

Use value types (aka structs) where appropriate. A giant array of structs is still only a single GC reference, unless they have reference members of course. They are also more cache friendly.

You can use pools in C# as well. A lot of people seem to forget about this. Granted, they are a bit messier to use since you have to preallocate the actual objects you will reuse, but still.
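A bare-bones pool in C# looks just like its C++ counterpart (Bullet here is a made-up example type); all the instances are allocated up front, so steady-state gameplay produces no garbage:

```csharp
using System.Collections.Generic;

// Bare-bones object pool. Instances are preallocated and recycled, so
// spawning and despawning during gameplay never touches the GC heap.
class Bullet
{
    public float X, Y, VelX, VelY;
    public bool Active;
}

class BulletPool
{
    private readonly Stack<Bullet> inactive = new Stack<Bullet>();

    public BulletPool(int capacity)
    {
        for (int i = 0; i < capacity; i++)
            inactive.Push(new Bullet());
    }

    public Bullet Spawn(float x, float y, float velX, float velY)
    {
        // Reuse a preallocated bullet instead of newing one mid-frame.
        var b = inactive.Count > 0 ? inactive.Pop() : new Bullet();
        b.X = x; b.Y = y; b.VelX = velX; b.VelY = velY; b.Active = true;
        return b;
    }

    public void Despawn(Bullet b)
    {
        b.Active = false;
        inactive.Push(b);
    }
}
```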

Be aware of which language features cause memory allocations (aka garbage) and avoid using those in tight situations: foreach over some collection types, yield, params/variable-argument methods, string concatenation, and so on. If you have ReSharper (you should), there's a plugin which highlights heap allocations.
