Using whitelisting to run untrusted C# code safely


Hi there! I've been working on a proof-of-concept for a game-maker idea I've had for a while. It boils down to running user-written, untrusted C# code in a safe way. I've gone down the path of AppDomains and sandboxes, using Roslyn to build code on the fly and running the code in a separate process. I have a working implementation up and running, but I've hit some snags.
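
For reference, the sandboxed-AppDomain setup I'm describing boils down to something like this (a minimal sketch; the domain name, path, and permission set are simplified placeholders):

```csharp
using System;
using System.Security;
using System.Security.Permissions;

static class Sandbox
{
    // Create a domain that may only execute code: no file, network,
    // or UI permissions are granted to assemblies loaded into it.
    public static AppDomain Create(string applicationBase)
    {
        var permissions = new PermissionSet(PermissionState.None);
        permissions.AddPermission(
            new SecurityPermission(SecurityPermissionFlag.Execution));

        var setup = new AppDomainSetup { ApplicationBase = applicationBase };

        return AppDomain.CreateDomain("Untrusted", null, setup, permissions);
    }
}
```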

My biggest issue is that it seems like Microsoft has given up on sandboxing code. See https://msdn.microsoft.com/en-us/library/bb763046%28v=vs.110%29.aspx. They added the "Caution" box a few months back, including this gem: "We advise against loading and executing code of unknown origins without putting alternative security measures in place." To me, it feels like they've deprecated the whole thing.

There is also the issue that AppDomain sandboxing isn't very well supported across platforms. There's no support in Mono. I had hopes for a fix from the CoreCLR, but then I found this: https://github.com/dotnet/coreclr/issues/642 - so no luck there.

So! I've started exploring whitelisting as a security measure instead. I haven't figured out how big a part of the .NET library I need to include yet, but it feels like I mainly need collections and some reflection stuff (probably limited to messing with public fields). I think I can do all this by examining the code with Roslyn and not allowing namespaces/classes that aren't explicitly listed.
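
Here's a rough sketch of the whitelist pass I have in mind; the allowedNamespaces set and the way violations are reported are just illustrative:

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

static class WhitelistCheck
{
    // Walk every node, resolve it to a symbol, and flag any member access
    // on a type whose containing namespace is not on the whitelist.
    public static IEnumerable<string> FindViolations(
        SyntaxTree tree, SemanticModel model, ISet<string> allowedNamespaces)
    {
        foreach (var node in tree.GetRoot().DescendantNodes())
        {
            var type = model.GetSymbolInfo(node).Symbol?.ContainingType;
            var ns = type?.ContainingNamespace?.ToDisplayString();

            if (ns != null && !allowedNamespaces.Contains(ns))
                yield return $"{ns}.{type.Name} used at {node.GetLocation()}";
        }
    }
}
```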

I'm comparing my approach with Unity, which does more or less the same thing, i.e. exposing only a safe subset of the framework. In their case it's an actual stripped-down version of Mono (if I've understood it right), but it seems to me the results would be pretty much the same if I get this right.

TLDR:

If you have experience with this kind of problem, would you say this is a safe approach? Am I missing something big and obvious here?


With reflection, you can circumvent such whitelists.
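
For example, something as small as this sidesteps a purely name-based check, because the dangerous name only ever exists at runtime (the path is just an illustration):

```csharp
using System;

class Demo
{
    static void Main()
    {
        // The source never mentions System.IO by name, yet it reaches
        // File.Delete at runtime. The type name is assembled from string
        // fragments, so even scanning string literals would miss it.
        var typeName = "System." + "IO.File";
        var fileType = Type.GetType(typeName + ", mscorlib");
        var delete = fileType.GetMethod("Delete", new[] { typeof(string) });
        delete.Invoke(null, new object[] { @"C:\important.dat" });
    }
}
```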

In general, it is not wise to trust user-written code at all. It is unrealistic to assume that one could take into account all possible attack vectors.

In "game makers", the user usually produces just data (or very primitive logic as in LBP), and STILL some users are able to hack the systems via said data.

Case in point: I played Mario Maker two days ago and stumbled upon a level named something like "this will crash the game" - which it indeed did. It is not far-fetched to think that such a level could then execute arbitrary code by using the level data as the injection vector, even though the MM level system was presumably not designed to run user code by any means.

Case in point 2: by trivially editing some save data of certain Wii games, one used to be able to cause a buffer overflow which could then be used to run arbitrary code, including overwriting the system firmware with a custom one loaded from USB or SD. The trivial edit? Change a save file name to be just slightly longer than the buffer allocated for it. The vulnerable games, which I won't mention here, were not particularly obscure either.

Niko Suni

I have little experience with C#, but I have read a bit about sandboxing in the past. From what I've read, sandboxing works better in some languages than in others (and not at all in a few).

Lua, for example, should work pretty well, because the language constructs available can't reasonably be used to break out of the sandbox or sabotage it. You have to be careful about which 'standard library' functions you give the user, though (metatable manipulation and the raw* functions are an obvious red flag, but the Lua wiki has a page discussing sandboxing).

"Reasonably" is a relative term :)

Niko Suni

Let's say it like this: I'm pretty certain you cannot break the sandbox without giving a malicious user access to the Lua standard library. A malicious user can still make the Lua VM run out of memory, though, and the host program has to handle that correctly (mishandled error conditions that rarely occur are, after all, a very popular exploit vector). A malicious user can also deadlock a thread unless extra precautions are taken.
Whatever API you inject into Lua (and you have to inject something, unless you just want Lua as a glorified .ini file) must also handle being called in malicious ways.

I don't think you can get much better than Lua for sandboxing purposes, though. Out of the box it gives you nothing dangerous, and you can control exactly which piece of user code sees what. Many other languages have something like 'import <feature>' or access to global objects that you cannot block without jumping through a lot of extra hoops.
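
If the host application is C#, as in the OP's case, a hosted interpreter such as MoonSharp makes this concrete. A minimal sketch, assuming the MoonSharp package; the log function is a made-up example API:

```csharp
using System;
using MoonSharp.Interpreter;

static class LuaHost
{
    public static void RunUserScript(string code)
    {
        // Hard sandbox preset: no io, os, load, or debug modules.
        var script = new Script(CoreModules.Preset_HardSandbox);

        // Inject only the API surface you are prepared to defend.
        script.Globals["log"] = (Action<string>)(msg => Console.WriteLine(msg));

        script.DoString(code);
    }
}
```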

I have some major philosophical problems with the title: I don't see how you can run untrusted code safely in any way.

Either you trust the code, in which case it is by definition safe enough to run, or you don't trust the code, in which case it is by definition unsafe to run. (The latter doesn't have to stop you from running it; it mostly means a shift towards expecting malicious behavior by default.)

People have been trying to keep crackers out of systems by doing everything you can imagine, and a lot more you haven't thought of yet, and in the past 25-30 years they have failed. And that is for systems you can only access remotely, over a network cable. Here you're letting code run on a CPU inside the machine itself! I believe you're fooling yourself if you think that can be anywhere near safe when you don't trust the code.

If you want to make the C# code useful, you must give it some power, which means it is exploitable, even if it's just DoS-ing the local system. You can make it difficult to do harm, but in the end, a malicious person with enough motivation or knowledge cannot be stopped.

"I have some major philosophical problems with the title, I don't see how you can run untrusted code safely in any way. [...]"

I strongly object to this line of thought. Let's look at StarCraft (the original) and Warcraft 3. A player could go online and browse for games to play with other people. While both games came with premade maps, they also shipped with powerful map editors. When you entered a game with a map you did not have, you downloaded it automatically (and a thriving ecosystem of sites existed where users could upload, comment on, and download maps). Apart from pure map data, maps could also contain a significant amount of scripting (after all, that's where the whole Defense of the Ancients concept came from: a Warcraft 3 map that ended up being popular).
That worked because the map scripts were properly sandboxed, and to my knowledge there was never an exploit where the risk to a player was more than "it's not fun".

Granted, the OP's initial idea of using C# is probably not feasible (but I already wrote about that and talked about alternatives). Care must be taken to properly sandbox things but it has been done in the past and, especially if you only need a limited scope, it is doable.

Thanks for the replies so far! I should have explained my situation a bit more. It's about the same as BitMaster's example of WC3 maps: I want to use C# for scripting-type work. Even when limited, I expect it to be very useful. Some points for context:

  • Users will download mods as code and compile+run them locally. There's no downloading/running of arbitrary .exes or other files. I can examine the code thoroughly before running it.
  • I'll examine the actual semantic model of the code through Roslyn, not match raw code strings.
  • Disallowing the unsafe keyword should avoid problems with buffer overruns, etc. (Well, unless I've missed something, which is why I'm posting this!) See the compile sketch after this list.
  • Crashing isn't an issue. I can't help it if a mod crashes the sandbox process, but at least it won't bring down the entire application. I imagine mods that crash the game won't be that popular.
  • Allowing reflection isn't a requirement.
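
To make that concrete, my compile step looks roughly like this (a sketch with placeholder names; the real version also runs the whitelist pass over the semantic model before emitting anything):

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

static class ModCompiler
{
    public static EmitResult Compile(string source, string outputPath)
    {
        var tree = CSharpSyntaxTree.ParseText(source);

        var options = new CSharpCompilationOptions(
            OutputKind.DynamicallyLinkedLibrary,
            allowUnsafe: false);               // 'unsafe' blocks fail to compile

        var compilation = CSharpCompilation.Create(
            "UserMod",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            options);

        // The whitelist pass runs over compilation.GetSemanticModel(tree) here.
        return compilation.Emit(outputPath);
    }
}
```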

I'm interested to hear specific ideas/examples for how you'd be able to attack this setup, given the constraints I mentioned above. I know it's tricky to guarantee that something is secure, but at the same time I can't come up with a single concrete example in my setup where this would be an actual problem. If you'd like, consider it a challenge! :)

Side note: I use C# instead of Lua because I prefer the language, and I'm hoping to ride the XNA/Unity wave a bit. I can use Roslyn for real-time compiling, giving error messages, providing IntelliSense-like tooling, etc. It also lets me use a common framework for my engine code and mod code. Basically, it saves me a *ton* of work, which makes this a feasible (more or less...) project for me.

Have you considered using a form of ECMAScript or JavaScript as your scripting language?

JavaScript has been proven to be a reasonable language to sandbox: it is used on websites the world over, and JavaScript exploits that allow execution of unsafe code or intrusion into other sandboxes are extremely rare.

In the grand scheme of things, for the complexity and popularity of the language it is quite secure, and there are liberally licensed engine libraries for it...
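
For example, an embeddable .NET engine such as Jint lets the host register exactly what scripts can see. A minimal sketch, assuming the Jint package; the log callback and the limits are arbitrary choices:

```csharp
using System;
using Jint;

static class JsHost
{
    public static void RunUserScript(string code)
    {
        var engine = new Engine(options => options
            .TimeoutInterval(TimeSpan.FromSeconds(1))   // stop runaway loops
            .LimitRecursion(64));                       // cap call depth

        // Scripts only see what the host explicitly registers.
        engine.SetValue("log", new Action<string>(Console.WriteLine));
        engine.Execute(code);
    }
}
```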

Roslyn cannot distinguish between good and malicious intent. Something like encrypting all the files in "My Documents" is business as usual as far as the system is concerned, but the majority of users certainly don't want a game doing that.

The overarching problem is that it is impossible to recognise harmful code in advance. And when all of your users get hacked by a malicious content pack that slipped past your inspection, it largely becomes your problem.

Niko Suni

