Rhalin

Sandboxing embedded scripting and other security concerns


I'm looking to add a scripting language to an application library (not specific to game dev, but possibly useful for it). I've been going over the options with some of the other developers on the team; suggestions have ranged from writing our own to using Perl, PHP, Python, etc. to avoid "reinventing the wheel" syndrome. We're having trouble, though, finding a single language that suits our needs, which are:

* Easy to learn and use for those who may not have programmed much before.
* An interpreter that can act as an "instance" object, can exist in multiple instances, and is thread-safe.
* Easy to expose C++ classes and objects in, with minimal hair pulling.
* Last but not least, access to some kind of DOM-compliant XML library for manipulating XML data trees.

Lua got tossed out fairly quickly because most of the people on the team weren't happy with how it fulfilled the "easy to learn and use" requirement. Python became our next choice, and our belief is that we can overcome the "instance" obstacle with a combination of Python's thread features and namespaces (Boost libraries are not an option), and use SWIG to handle the glue code to expose objects. It also seems to have access to XML libraries.

The major problem is that we're unfamiliar with Python in general, and we have some concerns regarding the libraries Python has access to. Basically, when you embed Python, is there any way to deny scripts that run in your application access to certain library functions and/or directories? This system is going to be used in a library where virtually anonymous users can send other users scripts to execute. If Python can't be sandboxed very easily, do you have any other suggestions?

Thanks ahead of time, and sorry for the long post!

I can't give you a complete answer, since I don't know one.

First of all, it is possible to eliminate access to a built-in function. For example, look at this:

C:\> python
>>> file = 0
>>> x = file("Hi there", "r")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: 'int' object is not callable

So that makes it harder to do file I/O. Of course, it may be difficult to patch all the holes, but it should be possible. Another hole is the "import" statement. However, I believe it is possible to "hook" the import statement and change its behavior. I think that if you combine those two mechanisms, you can do effective sandboxing.
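As a rough illustration of hooking imports, here is a minimal sketch. It assumes Python 3 (the transcripts in this thread are Python 2), and the names ALLOWED_MODULES, guarded_import and run_guarded are made up for the example. It relies on the fact that the import statement looks up __import__ in the builtins of the namespace the script runs in.

```python
import builtins

# Hypothetical whitelist; anything not listed here cannot be imported.
ALLOWED_MODULES = {"math"}


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Replacement for __import__ that only lets whitelisted modules through."""
    root = name.split(".")[0]
    if level != 0 or root not in ALLOWED_MODULES:
        raise ImportError("import of %r is not permitted" % name)
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_guarded(source):
    """Execute source with a reduced set of builtins and the import hook."""
    safe_builtins = {"__import__": guarded_import, "len": len, "range": range}
    namespace = {"__builtins__": safe_builtins}
    exec(source, namespace)
    return namespace


if __name__ == "__main__":
    ns = run_guarded("import math\nroot = math.sqrt(16)")
    print(ns["root"])
    try:
        run_guarded("import os")
    except ImportError as exc:
        print("blocked:", exc)
```

Note that a whitelisted module can still expose other modules through its attributes, so the whitelist has to be chosen with care; this is a sketch of the mechanism, not a complete sandbox.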

lol! Seems like most of our questions about Python have had the answer "it's impossible" (such as instancing). But we've pressed forward trying to make it work, because it seems like a very handy and powerful language. The only problem is that its power makes it a security nightmare, it seems. It's probably "possible", but we'd spend less time writing our own Python interpreter than sandboxing the standard one ;)

I am open to suggestions though, if you're aware of any other scripting languages that might take better to sandboxing. The DOM XML isn't -that- big of a deal as long as the language supports objects (I've got a custom class that glues Xerces DOM to another scripting language (QSA) that I can convert).

Thanks for reaffirming my suspicions though ;)

You could do a sort of sandboxing by restricting the language itself: First, make use of the 'codeop' module to break the input script into individual "statements". Then eval() each of them, rather than exec'ing them. This will throw things like def's and classes out the window, and not even allow for control flow - but it will still give enough power to let people call functions in your "API", and possibly do other configuration stuff. If you want to let up a bit more, you could allow expressions of the form "{identifier} = {expression}" (and parse them with the 're' module).
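To illustrate the expression-only idea, here is a minimal sketch that assumes Python 3 and uses the built-in compile() in "eval" mode instead of codeop; the helper name is_plain_expression is made up for the example. Anything that is not a single expression (def, class, import, loops, plain assignment) fails to compile in that mode.

```python
def is_plain_expression(source):
    """Return True if source compiles as a single Python expression."""
    try:
        compile(source, "<script>", "eval")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    samples = ["add(1, 2)", "x = 5", "def f(): pass", "import os", "while True: pass"]
    for line in samples:
        print(repr(line), "->", is_plain_expression(line))
```

Lambdas and comprehensions are still expressions, so this limits scripts rather than removing every form of computation.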

In order to be sure of the sandboxing, you'll want to provide the user with a dedicated namespace for both globals and locals - and also remove access to the built-ins, if you don't want people creating files and so on:

>>> eval("file('whatever.txt', 'w')", {"__builtins__":None}, {})
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in ?
    eval("file('whatever.txt', 'w')", {"__builtins__":None}, {})
  File "<string>", line 0, in ?
NameError: name 'file' is not defined
>>> eval("__builtins__.file('whatever.txt', 'w')", {"__builtins__":None}, {})
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in ?
    eval("__builtins__.file('whatever.txt', 'w')", {"__builtins__":None}, {})
  File "<string>", line 0, in ?
AttributeError: 'NoneType' object has no attribute 'file'
>>> eval("globals().keys()", {}, {})
['__builtins__']
>>> eval("locals().keys()", {}, {})
[]
>>> eval("globals(), locals()", {"__builtins__":None}, {})
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in ?
    eval("globals(), locals()", {"__builtins__":None}, {})
  File "<string>", line 0, in ?
NameError: name 'globals' is not defined
# Based on that thread that was linked
>>> eval("# coding: utf7\n"
"+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-()[12]('whatever.txt', 'w')")
<open file 'whatever.txt', mode 'w' at 0x00CF8EE0>
>>> eval("# coding: utf7\n"
"+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-()[12]('whatever.txt', 'w')", {"__builtins__":None}, {})
Traceback (most recent call last):
  File "<pyshell#7>", line 2, in ?
    "+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-()[12]('whatever.txt', 'w')", {"__builtins__":None}, {})
  File "<string>", line 0, in ?
NameError: name 'object' is not defined
# Proof that you can still do SOMETHING ;)
>>> eval("234", {"__builtins__":None}, {})
234
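One caveat worth checking before relying on this: the NameError above only shows that the name 'object' is gone. The minimal sketch below assumes Python 3, uses an empty dict in place of None for the builtins, and uses a made-up function name. It suggests that the same list of classes is still reachable through attribute access on a literal, so emptying the builtins is probably not sufficient on its own.

```python
def classes_reachable_from_literal():
    """Reach object.__subclasses__() without using any name at all."""
    empty_globals = {"__builtins__": {}}
    # () is a tuple literal; its class's base is object, no name lookup needed.
    return eval("().__class__.__base__.__subclasses__()", empty_globals, {})


if __name__ == "__main__":
    classes = classes_reachable_from_literal()
    print("classes reachable:", len(classes))
```

Rejecting scripts that contain double-underscore attribute names before evaluating them is one possible mitigation, though that is an assumption to test rather than a guarantee.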



Of course, you would make named objects for the dicts, so that their contents would persist across calls:


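global_mocker = {"__builtins__": None}
# Add functions to the global_mocker to expose an API
local_mocker = {}
# 'script' is the list of statements; parse_assignment() is a helper you
# would write to split "name = expr" into (name, expr), or (None, expr)
for statement in script:
    (varname, expression) = parse_assignment(statement)
    result = eval(expression, global_mocker, local_mocker)
    if varname:
        local_mocker[varname] = result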

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Since you have access to the Python sources, I think (though I haven't looked into it yet) it should be easy to #define I/O functions like fopen in those sources so that they go only through your own I/O manager, which can do whatever you want:
* restrict access of the IO
* allow reading in zip files
* be compliant with your Game FileSystem.
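The suggestion above is a C-level change to the interpreter sources. As a Python-level analog only (not the #define approach itself), here is a minimal sketch of an I/O manager that confines scripts to read-only access under one directory. It assumes Python 3, and the name make_restricted_open is hypothetical.

```python
import os
import tempfile


def make_restricted_open(root):
    """Build an open() replacement that only reads files under root."""
    root = os.path.realpath(root)

    def restricted_open(path, mode="r"):
        if mode not in ("r", "rb"):
            raise PermissionError("scripts may only read files")
        full = os.path.realpath(os.path.join(root, path))
        try:
            inside = os.path.commonpath([root, full]) == root
        except ValueError:  # e.g. paths on different drives on Windows
            inside = False
        if not inside:
            raise PermissionError("path is outside the sandbox: %r" % path)
        return open(full, mode)

    return restricted_open


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as sandbox:
        with open(os.path.join(sandbox, "data.txt"), "w") as handle:
            handle.write("hello")
        script_globals = {"__builtins__": {"open": make_restricted_open(sandbox)}}
        # The script only sees the restricted open(), not the real one.
        print(eval("open('data.txt').read()", script_globals, {}))
```

Unlike patching fopen in the interpreter, this only covers code that goes through the open() you hand to the script, so it depends on the script having no other route to the real built-ins.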


It's on my (huge) to-do list, but it's not listed as an insanely risky operation...

Hope it helps.
