# Security

## 29 posts in this topic

Just a little question about the bytecode. How "close to the system" is the bytecode really? All pointers etc. are generated at linking, right? The reason I'm asking is that I'm thinking about a system in my game engine where the server could send scripts to the client. If this is sent as source I'm pretty sure I could prevent malicious code (simply by not registering potentially dangerous stuff for access by the scripts). I'm a bit worried about sending bytecode though (which would be nicer in many ways). If the bytecode can be constructed in such a way that it starts accessing stuff it shouldn't, it might be a pretty big problem. So, basically the question is... uhm... all of the above, formulated as a question somehow. :) /Anders Stenberg

##### Share on other sites

It is for example possible to insert the following bytecode:

    SET4 0 // value
    SET4 0 // address
    WRT4   // *address = value

The virtual machine would gladly execute this code.

The bytecode generated by the compiler shouldn't contain any hardcoded addresses though. So if you can somehow guarantee that nobody tampers with the bytecode you should be able to pass the bytecode from the server to the client (assuming the engines have been configured with the exact same functions, in the exact same order).

I don't think I will be able to make the VM somehow validate the bytecode so that it doesn't do anything it shouldn't. If you have any suggestions on that I would be very interested in hearing them.

##### Share on other sites
Quote:
 Original post by WitchLord

    SET4 0 // value
    SET4 0 // address
    WRT4   // *address = value

 The virtual machine would gladly execute this code.

But would the bytecode deserializer (I haven't really checked out the saving/loading of bytecode) allow such a construct to be generated? I thought you saved some metadata about what all the pointers should point to, and relinked them to pointers that make sense in the current environment when loading? (I have to admit I haven't looked much at the bytecode stuff at all, so I might not make sense at all. :)

##### Share on other sites
Actually, when the bytecode is saved it's just a direct copy of the in-memory bytecode, with some extras so that the module can be rebuilt after loading (function declarations etc.).

The only thing that guarantees that the bytecode doesn't do anything bad is the compiler. The VM verifies that an object pointer isn't null when accessing methods or properties, and it also verifies that no division by zero is made. That's about all the verification that is done after compilation. Any more and the performance would get really bad.

It would be easy to verify that

    SET4 0 // value
    SET4 0 // address
    WRT4   // *address = value

isn't executed. But it would also be very easy to hide this code with a few more instructions:

    SET4 0 -- Store 0 in a variable
    PSF  1
    WRT4
    POP  1

    SET4 0 -- Write 0 to an address stored in a variable
    PSF  1
    RD4
    WRT4
    POP  1

Both of these sequences are perfectly normal and could be found in correct code. But if they are run together as-is, they have the same effect as the previous example.

The only way to make the bytecode safe is to have the application register the memory ranges that the VM should allow access to and then have the VM verify each instruction that accesses memory. As you can imagine this would be extremely inefficient.

##### Share on other sites
I still don't really understand how the bytecode could be just saved and loaded right off. Mustn't _all_ pointers _always_ be resolved when loaded? I mean, how could a stored pointer possibly make sense between two runs of a program?

##### Share on other sites
I would assume the 'pointers' in the bytecode are relative to the AngelScript stack, not the system. So as long as the stack is the same size, it's all there.

You said you're sending code from the Server to the Client - if some client messes with it and runs it, they are screwing themselves. If someone writes a malicious server, that's altogether different. And, of course, it would be rather easy for the client to cheat.

##### Share on other sites
It's really only global variables that would be accessed through memory pointers, the rest use offsets from the function stack frame or stack pointer. I solved the access to global variables by indexing the list of variable declarations, it's slightly slower than direct memory access but it simplifies other things. The VM doesn't verify out-of-bounds access to these lists though as it relies on the compiler for that.


##### Share on other sites
It could and it's probably a good idea, but it would only protect against incompatible engine configurations. It still wouldn't solve the other issue above with malicious code.

##### Share on other sites
If the loader/linker/whatever knows which indices make sense, shouldn't it be able to bail out if the indices in the bytecode are way off? Or maybe the bytecode is too low level to easily know what's legal and what's not?

##### Share on other sites
Quote:
 Or maybe the bytecodes are too low level to easily know what's legal and not?

That's exactly it. The byte codes are too low level. The instructions that use indices can be checked, but other instructions like write, read, etc work with address pointers directly. A malicious code sequence can put an address on the stack with SET4 and at any time in the future call WRT4 on it.

If I can think of some way to make the bytecode secure (without losing too much performance) I will do it.

##### Share on other sites
How does Lua manage security?

[Edited by - EddHead on September 15, 2004 12:02:16 AM]

##### Share on other sites
I read the reference manual for Lua, and from what I can see Lua doesn't allow saving/loading of compiled byte code, which effectively eliminates the security problem we are discussing here.

Note, the security risk is most notable when sending the compiled byte code between machines. If the compiled byte code is installed on the client machine together with the rest of the program then the bytecode poses no greater security risk than the application itself. A hacker with malicious intent could manipulate the application directly if he has access to it. So if you do not intend to pass byte code between machines I don't think you have to worry too much about the security.

If you do intend to pass compiled byte code between machines, you need to set up a protocol that prevents hackers from tampering with the data stream. SSL has proven to be a good way to do this.

In either case, if I can think of some way to make the bytecode safer I will implement it.


##### Share on other sites
Security is really just narrowing down who can successfully circumvent any safety measures that are put in place.

* AngelScript as source - anyone with Notepad and some basic programming knowledge may be dangerous.

* AngelScript as byte code - someone with a binary file editor and knowledge of AngelScript's byte code may be dangerous.

* AngelScript as byte code in a compressed and/or encrypted file - someone with a memory debugger and knowledge of AngelScript's byte code may be dangerous.

The last should be handled by the application, as the time spent on compression/encryption is a trade-off with speed.


##### Share on other sites
Well, my idea involves the server sending bytecode to the client. What I'd want to avoid is someone setting up a server that sends stuff that screws up the client's computer. (I guess a lockup or so would be okay, but if it can create code that destroys data on the client or something like that, it'd be a lot worse.)

I could send just source, but bytecode would be so much neater since it's more compact and at least a bit obfuscated.

##### Share on other sites
If someone really wants to alter the "script" it'll happen anyhow, so I think serving the source files is better. It would allow for conditional compilation, and the (AFAIK huge) problems of invalid pointers would be avoided.

After the script/binary has been loaded it's in memory and can be read back and/or altered. No point in trying to prevent it.

##### Share on other sites
I agree with Kurioes. You really ought to send the source files, not the bytecode. That way the compiler can do its job and prevent access to memory the script shouldn't mess with.

You can use compression to decrease the size of the files. I suggest you take a look at zlib for this, as it is a free library (same license as AngelScript).

Never rely on obfuscation to hide code; use encryption instead. TEA is a really simple algorithm that can be implemented in only a few lines of code (I have an example on my site). The algorithm is quite secure (at least as secure as secret-key encryption can be) and uses 128-bit keys. The algorithm is also quite fast, and shouldn't slow down the download very much.


##### Share on other sites
Quote:
 Original post by Mad Dugan

 Security is really just narrowing down who can successfully circumvent any safety measures that are put in place.

 * AngelScript as source - Anyone with notepad and some basic programming knowledge may be dangerous.
 * AngelScript as byte code - Someone with a binary file editor and knowledgeable with AngelScript's byte code may be dangerous
 * AngelScript as byte code in compressed and/or encrypted file - Someone with a memory debugger and knowledgeable with AngelScript's byte code may be dangerous.

 The last should be handled by the application as the time for compression/encryption is a trade off with speed.

 Mad

I only agree with the last two points.

The scripts can only do what the application allows them to do. If the application doesn't register any functions for accessing memory, then the script can't do that (of course, that would make for a pretty useless script [wink]). If the script's interface to the application is secure, then the script is also secure.

##### Share on other sites
Quote:
 Original post by WitchLord

 You can use compression to decrease the size of the files. I suggest you take a look at zlib for this, as it is a free library (same license as AngelScript).

Yeah, well it's not really a problem anyway. It's not like a script will be sent every frame or something. :)

Quote:
 Never rely on obfuscation to hide code, use encryption instead. TEA is a really simple algorithm that can be implemented with only a few lines of code (I have an example on my site). The algorithm is very secure (at least as secure as private key encryption can be) and uses 128bit keys. The algorithm is also quite fast, and shouldn't slow down the download very much.

Well, in this case encryption is just another form of obfuscation. If the client can read it, then no matter how tough the encryption is, so can the guy using the client, one way or another. It wasn't meant as some bulletproof protection, just a way of not swinging the door wide open.

This isn't about cheat protection or anything like that. The idea was just a system somewhat like the mutators in Unreal Tournament, where the server can have modifications that are automatically downloaded and used. I've got no idea how extensive or secure the mutator system is. I think they probably just send script code. (I haven't looked into it.)

Uhm. Okay. Now I actually have looked into it, and yes, it just sends script code as far as I can see.

Anyway, my point was that my idea doesn't fall apart if bytecode can't be sent. I'm just exploring the possibilities. :)

##### Share on other sites

As another solution to the problem, you could digitally sign the compiled bytecode. That way the client can verify whether anyone has altered the code, and refuse to run it if so. This would only be a viable solution if you can keep the private key secure. If you plan on having public servers then the private key isn't secure, so the digital signature cannot be trusted.

##### Share on other sites
Quote:
 Original post by WitchLord

 As another solution to the problem you could digitally sign the compiled bytecode. That way the client can verify if anyone has altered the code and not run it. This would only be a viable solution if you can keep the private key secure. If you plan on having public servers then the private key isn't secure so the digital signature cannot be trusted.

This doesn't help if the code is malicious at the source, at the server. (I.e. someone has set up a server just to spread a plague. :)

##### Share on other sites
Would the following pseudo code help?

    ...
    case BC_WRT4:
        a = *stackPointer;
        stackPointer++;
        d = *stackPointer;
        // Check the pointer
        if( (asDWORD)a < LOWEST_ADDRESS_OF_ALL_DECLARED_VARIABLES )
        {
            SetInternalException(TXT_NO_EXECUTE_ACCESS?);
            return;
        }
        *(asDWORD*)a = d;
        programCounter += bcSize[BC_WRT4];
        return;
    ...

If you cannot trust anyone (server operator, client, or a 3rd-party eavesdropper), then scripts probably need to be sent as source (protects against a vicious server) and encrypted with a checksum (protects against network listeners). Finally, realize that you can never trust the client, as it will always have all the information needed to access the byte code; you just need to make sure it isn't trivially easy.

In a server/client game scenario: having clients run scripts offloads the server from having to send the state changes generated by said scripts; it is not a replacement for offloading all of the work. The server should run state-changing scripts to maintain the client's current state, sending snapshots so the client can auto-correct.


##### Share on other sites
Quote:
 Original post by WitchLord

 I read the reference manual for Lua, and from what I can see Lua doesn't allow saving/loading of compiled byte code, which effectively eliminates the security problem we are discussing here.
Lua does allow that. Functions can be dumped as bytecode, and all functions that load scripts will work on bytecode as well as source code.

Lua has two major tricks up its sleeve WRT malicious code.

The first is that the debug library allows you to preverify code, in the same way that the Java VM does. This obviously doesn't detect malicious intent, but makes sure that nothing is writing to bizarre, out-of-range stack addresses.

The other thing is that, as a VM with only stack and table operation opcodes, it doesn't really have the "features" necessary to enable code to break out of the sandbox. Scripts can crash, of course, but only within the confines of the VM execution.

##### Share on other sites
Quote:
 Original post by Dentoid

 This doesn't help if the code is malicious at the source, at the server. (I.e. someone has set up a server just to spread a plague. :)

That's why I said that it wouldn't work with public servers, i.e. servers that anyone can install and run. If you only have private servers then the private key can be kept secure, which would make it incredibly difficult for an impostor to pose as a valid server.

##### Share on other sites

Your pseudo code doesn't really help. I can't just set a lower limit for the variables. It is quite possible that there exists memory between the lowest accessible address and the highest accessible address that should not be accessed.

Sneftel:

Oh, ok. I didn't see that functionality when skimming through the manual.

I was just wondering how Java does it. Thanks for letting us know.

It seems the only solution to make the bytecode secure is to follow Lua's example. Just like the script can't manipulate pointers, the bytecode can't be allowed to either. I will analyze this to see if it is possible to do something like that, but I have a feeling that it won't be an easy task. I don't want to limit the interface that I have with C++ applications.

Maybe it is possible to have the VM analyze the bytecode when loading it, to see what operations it is doing, and then decide whether it is trying to access anything it shouldn't. It should be possible to verify if any absolute addresses are being read from or written to. And the instructions that push the address of a stack location onto the stack (for future manipulation) can be checked so that the location is within the stack frame of the function. In this pass I can also verify that all function ids are valid, and global variable indices as well.

One potential problem is the instruction ADDOFF, which adds an offset to a pointer. How should the VM know the allowed limit to ADDOFF? It can't. I'll have to think about how this can be worked around.

I will start working on a function like this for a future version of AngelScript. Even if it might not be able to verify everything at start I will be able to analyze the weaknesses better, and either remove them or find a way to protect them.

