Python for data-driven design?

Started by
1 comment, last by Sky Warden 9 years ago

Hi, everyone. I'm working on a game project in C++ using SFML as a learning exercise. A while ago I read about data-driven design (or is it called data-driven programming?) on this forum, and I think it's a really neat idea, so I gave it a try using jsoncpp. I've got it working, but I'm curious about using Python for the data side. I've also heard that Python is often used alongside other programming languages like C++. I couldn't find a satisfying article about data-driven design, so I hope I can get some advice here.

I've seen it done in a video game series called Mount&Blade, where the modules (mods) are written in Python but the game itself is written in C++. What are the advantages of using Python files over JSON? And what are the risks? Since Python is a scripting language, I figure it would be more dangerous to use.

Also, if you don't want your data files to be open to the users, how do you pack or encrypt them? Any advice on optimizing data-driven design would be very welcome too. Thanks in advance.


What are the advantages of using Python files over JSON? And what are the risks? Since Python is a scripting language, I figure it would be more dangerous to use.


Python is a scripting language that can encode logic and behavior while JSON is a data format for hierarchical information storage. They're utterly different things. You'd use a scripting language when you want extension of behavior and logic and you'd use a data format like JSON when you want to tweak and modify existing behavior and logic via simple data definitions. For instance, do your users need to define whole new AI modules for different enemies, or do they merely need to provide input values like Aggression to a single core unified AI module?
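To make the "input values into a single AI module" case concrete, here is a minimal sketch. The enemy names, the `aggression` field, and the `choose_action` function are all made up for illustration; the point is only that the logic is fixed in code and the data merely tunes it.

```python
import json

# Hypothetical enemy definitions; field names are illustrative only.
ENEMY_DATA = """
{
  "orc_warrior": {"aggression": 0.8},
  "orc_scout":   {"aggression": 0.3}
}
"""

def choose_action(aggression, health_fraction):
    """The one built-in AI routine. Data only moves its threshold."""
    # A more aggressive enemy keeps fighting at lower health.
    flee_below = 1.0 - aggression
    return "flee" if health_fraction < flee_below else "attack"

enemies = json.loads(ENEMY_DATA)
for name, stats in enemies.items():
    print(name, choose_action(stats["aggression"], health_fraction=0.5))
```

A modder can change the numbers freely, but cannot make an enemy do anything `choose_action` doesn't already know how to do.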

You can define data in code and have a Python-based data-driven extension system for a C++ game. A Python module could call a SetAggression function for an enemy. That would still be data-driven; it'd just be somewhat harder to use, as it's difficult to create a nice non-programmer-oriented GUI for editing enemy behavior if the GUI has to parse and generate scripting code (though it's possible for some limited use cases).
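A rough sketch of "data defined in code" might look like the following. The `Enemy` class here is only a stand-in for whatever object the C++ side would expose through a binding layer, and `set_aggression`, `spawn`, and `configure_enemies` are hypothetical names, not a real engine API.

```python
class Enemy:
    """Stand-in for an engine object that a C++ binding would expose."""

    def __init__(self, name):
        self.name = name
        self.aggression = 0.5

    def set_aggression(self, value):
        # Clamp so a script cannot push the value out of range.
        self.aggression = max(0.0, min(1.0, value))

registry = {}

def spawn(name):
    """Hypothetical engine call that creates and registers an enemy."""
    registry[name] = Enemy(name)
    return registry[name]

def configure_enemies(spawn):
    """What a mod script could contain: plain data, expressed as calls."""
    spawn("orc_warrior").set_aggression(0.8)
    spawn("orc_scout").set_aggression(0.3)

configure_enemies(spawn)
```

Nothing in `configure_enemies` is really logic, which is why this still counts as data-driven, but a GUI editor would have to read and write Python source to edit it.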

You can define limited logic via pure data in something like JSON to allow lots of customization of a C++ game without any real scripting language, but the result is often more trouble than it's worth. You need code that can interpret the data, you need a format to describe logical preconditions and sequences of events, and you need tools and documentation to allow users to produce data in that format which the interpreter can then consume. You often will find you've just created a slow, over-complicated, and difficult to use scripting language that's still not able to solve every need of the designers and you'd have been better off just integrating a scripting language.
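For a feel of where that road leads, here is a tiny sketch of logic expressed as pure data plus the interpreter it needs. The rule format (`if`/`then` with a three-element condition) is invented for this example and is not any standard.

```python
import operator

# Comparison operators the rule format is allowed to use.
OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}

# Hypothetical rules as they might be loaded from a JSON file.
RULES = [
    {"if": ["health", "<", 0.25], "then": "flee"},
    {"if": ["distance", ">", 10.0], "then": "approach"},
]

def evaluate(rules, state, default="attack"):
    """Return the action of the first rule whose condition holds."""
    for rule in rules:
        field, op, value = rule["if"]
        if OPS[op](state[field], value):
            return rule["then"]
    return default

print(evaluate(RULES, {"health": 0.1, "distance": 20.0}))
```

This works for simple preconditions, but as soon as designers ask for "and", "or", variables, or sequences of events, the format starts growing into a small scripting language of its own.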

Use data-driven methods to control relatively fixed logic, e.g. as inputs into an algorithm. Use a scripting language when you want to allow users to change the algorithm itself.

And of course you can use both. You can use data for most things but then still allow custom scripts to override the built-in behavior when needed. You could store the data in one type of file (e.g. some JSON format) and then allow that data to reference script files for specific overrides. For instance, the JSON for enemies might set values like AI.Aggression for most enemies but then have an AI.Script for one or two enemies that points at a script file; the engine would use the script if specified or the default AI (with the given input values) otherwise.
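A sketch of that hybrid, under some assumptions: the `AI.Aggression` and `AI.Script` keys follow the example above, while the script path, the `choose_action` entry point, and the in-memory `SCRIPT_FILES` dictionary (standing in for files on disk) are made up so the example is self-contained.

```python
import json

# Stand-in for script files on disk, keyed by the path the data refers to.
SCRIPT_FILES = {
    "scripts/boss_ai.py": (
        "def choose_action(ai, health_fraction):\n"
        "    return 'enrage' if health_fraction < 0.5 else 'attack'\n"
    ),
}

ENEMY_DATA = """
{
  "orc_warrior": {"AI": {"Aggression": 0.8}},
  "orc_boss":    {"AI": {"Aggression": 1.0, "Script": "scripts/boss_ai.py"}}
}
"""

def default_ai(ai, health_fraction):
    """Built-in behavior, tuned only by the Aggression value."""
    return "flee" if health_fraction < 1.0 - ai["Aggression"] else "attack"

def load_ai(ai):
    """Use the referenced script if one is given, else the default AI."""
    script_path = ai.get("Script")
    if script_path is None:
        return default_ai
    namespace = {}
    exec(SCRIPT_FILES[script_path], namespace)  # runs the mod's code
    return namespace["choose_action"]

def decide(enemy, health_fraction):
    return load_ai(enemy["AI"])(enemy["AI"], health_fraction)

enemies = json.loads(ENEMY_DATA)
print(decide(enemies["orc_warrior"], 0.1))
print(decide(enemies["orc_boss"], 0.1))
```

Note that the `exec` line is exactly where the "more dangerous" part comes in: the script can do anything the interpreter allows, whereas the JSON values can only ever be read as numbers.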

Since Python is a scripting language, I figure it would be more dangerous to use.


Certainly. That can be a trade-off. If you don't need users to define custom logic (e.g. you have a very fixed game design and you know exactly what the game needs to do) then you might be better off avoiding a scripting language. On the other hand, a more flexible game is all but impossible to define ahead of time entirely in the core C++ code and it can be a huge boon to allow scripting.

Recall that most "real" games are made by many dozens or even many hundreds of people. Most of those developers will not be competent C++ programmers, but may still need to define custom logic; for instance, gameplay designers, level designers in some genres, technical artists, etc. In those cases, it can be better to give the users an easy scripting language rather than trying to force them to use data-driven techniques for everything, since they'll no longer have to wait for a trained engineer to implement a new C++ feature for every little new behavior.

If you're building out an engine for a team of developers, the absolute most important thing to focus on is tools. Artists, designers, QA, producers, even other engineers all have myriad tasks that don't need or warrant writing and debugging lots of code. That's where data-driven design comes in; the designer shouldn't need to write and debug new code just to make an Orc Scout that moves 10% faster than the Orc Warriors. On the other hand, a QA engineer may well need to write new code that hooks into gameplay systems in order to build an automated test suite for the game; that code is often very test-specific and has little reason to be hard-coded into the game itself, so a well-integrated scripting language is often the best tool for that use case.
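The Orc Scout case could be handled with nothing more than a data entry that derives from another one. In this sketch the `extends` and `speed_scale` keys, the stat names, and the numbers are all assumptions made for illustration.

```python
import json

# Hypothetical definitions: the scout reuses the warrior and scales speed.
ENEMY_DATA = """
{
  "orc_warrior": {"speed": 2.0, "health": 100},
  "orc_scout":   {"extends": "orc_warrior", "speed_scale": 1.1}
}
"""

def resolve(name, definitions):
    """Flatten an entry by merging it over the entry it extends."""
    entry = dict(definitions[name])
    parent = entry.pop("extends", None)
    base = resolve(parent, definitions) if parent else {}
    scale = entry.pop("speed_scale", 1.0)
    merged = {**base, **entry}
    if "speed" in merged:
        merged["speed"] *= scale
    return merged

definitions = json.loads(ENEMY_DATA)
scout = resolve("orc_scout", definitions)
print(scout)
```

The designer edits two lines of data and never touches the `resolve` code, which is the whole point.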

Also, if you don't want your data files to be open to the users, how do you pack or encrypt them?


Don't worry about this. Your only concern should be making the data fast to load and small in size; doing this will often involve converting it to a custom binary format and storing it inside a compressed pack file, which is why most game data appears to be protected or encrypted to the average novice game developer or modder. The files aren't being intentionally protected; they're just being incidentally obfuscated in the quest to optimize the game's load times and download size.
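As a rough idea of what a compressed pack file can look like, here is a toy sketch using only `struct`, `zlib`, and `json` from the standard library. The layout (a `PAK0` magic tag, an index length, then a compressed index plus payload) is invented for this example and is not any real game's format.

```python
import json
import struct
import zlib

MAGIC = b"PAK0"                   # made-up file tag
HEADER = struct.Struct("<4sI")    # magic, length of the index in bytes

def pack(files):
    """Bundle {name: bytes} into one compressed archive."""
    index = {}
    payload = bytearray()
    for name, data in files.items():
        index[name] = (len(payload), len(data))  # offset, length
        payload += data
    index_bytes = json.dumps(index).encode("utf-8")
    body = zlib.compress(index_bytes + bytes(payload))
    return HEADER.pack(MAGIC, len(index_bytes)) + body

def unpack(archive):
    """Inverse of pack(); returns {name: bytes}."""
    magic, index_len = HEADER.unpack_from(archive, 0)
    if magic != MAGIC:
        raise ValueError("not a pack file")
    body = zlib.decompress(archive[HEADER.size:])
    index = json.loads(body[:index_len])
    payload = body[index_len:]
    return {name: payload[off:off + size] for name, (off, size) in index.items()}

files = {
    "enemies.json": b'{"orc_warrior": {"aggression": 0.8}}',
    "items.json": b'{"sword": {"damage": 5}}',
}
archive = pack(files)
restored = unpack(archive)
```

Opened in a text editor, `archive` looks like noise, yet nothing here is encryption; anyone who knows the layout can call `unpack`.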

Going out of your way to make it harder for users to mod your game does nothing but make your game less popular. Online games keep all of the critical data and scripts on a server that the user has no ability to mod. For single-player games or the client of multi-player games, though, modding should at least be tolerated if not outright encouraged; a heavy modding community for a game is a sign of popularity and longevity.

Sean Middleditch – Game Systems Engineer – Join my team!

Wow. That's a very informative post, Sean. Thanks.


You'd use a scripting language when you want extension of behavior and logic and you'd use a data format like JSON when you want to tweak and modify existing behavior and logic via simple data definitions. For instance, do your users need to define whole new AI modules for different enemies, or do they merely need to provide input values like Aggression to a single core unified AI module?

Ah, I see. So a scripting language is used to extend the core game. I can imagine a lot of things to try. I don't think I will use one in my current project, or at least not yet, but I will definitely take a deeper look at it.


Going out of your way to make it harder for users to mod your game does nothing but make your game less popular. Online games keep all of the critical data and scripts on a server that the user has no ability to mod. For single-player games or the client of multi-player games, though, modding should at least be tolerated if not outright encouraged; a heavy modding community for a game is a sign of popularity and longevity.

Yeah. That game I mentioned is famous mostly because of its modding community. Modding did come to mind when I was designing this game, but I asked because I think there might be some data that needs to be hidden from the users. The modding I plan would be limited to making new entities like monsters or items, so JSON would be enough.
