Designing a Robust Input Handling System for Games



(Note that this post is now also available as an article-proper.)
One of the more common questions asked on the GDNet technical forums relates to handling input in a game engine. Typically, there are a few core issues that pretty much every game faces, and unlike many areas of game implementation, input is one where we can more or less build a one-size-fits-all framework. Fortunately, this is a pretty straightforward thing to do, and it is even fairly easy to port across APIs and platforms.

For the purposes of this article, we'll focus on the platform-independent side of things. Getting input from the underlying hardware/OS is up to you; for Windows, Raw Input is pretty much the de facto standard; XInput might be useful if you want joystick/controller support. For other platforms, research and select the API or library of your choice.

Got a way to get pure input data? Good. Let's take a look at the overall input system architecture.

Designing a Robust Input Handling System
We have a few goals here which should make this system (or ones based on it) applicable for pretty much any game, from a simple 2D platformer to an RTS to a 3D shooter:

  • Performance is important; input lag is a bad thing.
  • It should be easy to have new systems tap into the input stream.
  • The system must be very flexible and capable of handling a wide variety of game situations.
  • Configurability (input mapping) is essential for modern games.

    Thankfully, we can hit all of these targets with fairly minimal effort.

    We will divide the system into three layers:

    1. Raw input gathering from the OS/etc.
    2. Input mapping and dispatch to the correct high-level handlers
    3. High level handler code

    The first layer we have already decided to gloss over; its specifics aren't terribly important. What matters is that you have a way to pump pure input data into the second layer, which is where most of the interesting stuff happens. Finally, the third layer will implement your specific game's responses to the input it receives.

    The central concept of this system is the input context. A context defines what inputs are available for the player at a given time. For instance, you may have a different context for when a game menu is open versus when the game is actually being played; or different modes might require different contexts. Think of games like Final Fantasy where you have a clear division between moving around the game world and combat, or the Battlefield series where you get a different set of controls when flying a helicopter versus when running around on the ground.

    Contexts consist of three different types of input:

    1. Actions
    2. States
    3. Ranges

    An action is a single-time thing, like casting a spell or opening a door; if the player just holds the button down, the action should happen only once, typically either when the button is first pressed or when it is finally released. "Key repeat" should not affect actions.

    States are similar, but designed for continuous activities, like running or shooting. A state is a simple binary flag: either the state is on, or it's off. When the state is active, the corresponding game action is performed; when it is not active, the action is not performed. Simple as that. Other good examples of states include things like scrolling through menus.
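One way to keep key repeat away from actions is to remember, per button, whether it was already down on the previous report. The sketch below assumes the first layer can tell us "button N is down/up"; `ButtonTracker`, `ButtonEvent`, and the integer `RawButton` ID are hypothetical names for illustration, not part of any particular API.

```cpp
#include <unordered_map>

// Hypothetical raw button ID; the real values come from your first layer.
using RawButton = int;

struct ButtonEvent {
    bool pressedThisFrame;   // went from up to down: fire actions once
    bool held;               // is down right now: drive states
    bool releasedThisFrame;  // went from down to up
};

class ButtonTracker {
public:
    // Call once per raw report, including OS key-repeat reports.
    ButtonEvent update(RawButton button, bool isDown) {
        bool wasDown = down_[button];  // defaults to false for unseen buttons
        down_[button] = isDown;
        return ButtonEvent{isDown && !wasDown, isDown, !isDown && wasDown};
    }

private:
    std::unordered_map<RawButton, bool> down_;
};
```

A repeated "down" report for a button that is already down yields `held` but not `pressedThisFrame`, so an action bound to the press fires only once while a state stays active.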

    Finally, a range is an input that can have a number value associated with it. For simplicity, we will assume that ranges can have any value; however, it is common to define them in normalized spans, e.g. 0 to 1, or -1 to 1. We'll see more about the specifics of range values later. Ranges are most useful for dealing with analog input, such as joysticks, analog controller thumbsticks, and mice.

    Input Mapping
    The next feature we'll look at is input mapping. Simply put, this is the process of going from a raw input datum to an action, state, or range. In terms of implementation, input mapping is very simple: each context defines an input map. For many games, this map can be as straightforward as a C++ map object (aka a dictionary or table in other languages). The goal is simply to take an identified type of hardware input and convert it to the final type of input.

    One twist here is that we might need to handle things like key-repeat, joysticks, and so on. It is especially important to have a mapping layer that can handle ranges intelligently, if we need normalized range values in the high-level game logic (and I strongly recommend using normalized values anywhere possible). So an input mapper is really a set of code that can convert raw input IDs to high-level context-dependent IDs, and optionally do some normalization for range values.
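As a rough illustration of the normalization step, here is a minimal linear remap from a raw device span to a normalized span. The struct and field names are hypothetical, and clamping out-of-range values is a choice made for this sketch rather than a requirement of the design.

```cpp
#include <algorithm>

// Describes how one raw axis maps onto a normalized range
// (field names are hypothetical).
struct RangeConverter {
    double rawMin, rawMax;        // e.g. -100 to 100 from the device
    double mappedMin, mappedMax;  // e.g. -1 to 1 for game logic

    double convert(double raw) const {
        // Clamp first so out-of-range hardware values cannot escape the span.
        double clamped = std::max(rawMin, std::min(rawMax, raw));
        double t = (clamped - rawMin) / (rawMax - rawMin);  // 0..1
        return mappedMin + t * (mappedMax - mappedMin);
    }
};
```

With `RangeConverter{-100.0, 100.0, -1.0, 1.0}`, a raw value of 50 converts to 0.5, and anything beyond 100 is pinned to 1.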

    Remember that we need to handle the situation where different contexts provide different available actions; this means that each context needs to have its own input map. There is a one-to-one relationship between contexts and input maps, so it makes sense to implement them as a single class or group of functions.
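A context that owns its own input map might look something like the following. This is only a sketch: the enumerators and the `bindAction`/`mapAction` style of interface are made up for the example, and a real version would also carry range mappings.

```cpp
#include <map>

// Hypothetical IDs; in a real project these would live in shared headers.
enum class RawButton { Q, Space, MouseLeft };
enum class Action { CastSpell, OpenDoor };
enum class State { Run, Fire };

class InputContext {
public:
    void bindAction(RawButton button, Action action) { actions_[button] = action; }
    void bindState(RawButton button, State state) { states_[button] = state; }

    // Each lookup returns true and fills `out` only if this context has a
    // binding for the raw button; otherwise `out` is left untouched.
    bool mapAction(RawButton button, Action& out) const {
        auto it = actions_.find(button);
        if (it == actions_.end()) return false;
        out = it->second;
        return true;
    }
    bool mapState(RawButton button, State& out) const {
        auto it = states_.find(button);
        if (it == states_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<RawButton, Action> actions_;  // the context's own input map
    std::map<RawButton, State> states_;
};
```

Because the maps live inside the context, two contexts can bind the same raw button to entirely different high-level IDs without knowing about each other.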

    There are two basic options for dispatching input: callbacks, and polling. In the callback method, every time some input occurs, we call special functions which handle that input. In the polling method, code is responsible for asking the input management system each frame for what inputs are occurring, and then reacting accordingly.

    For this system, we will favor a callback-based approach. In some situations it may make more sense to use polling, but if you're writing game code for those scenarios, chances are you don't need any advice on how to build your input system ;-)

    The basic design looks like this:

    • Every frame, raw input is obtained from the OS/hardware
    • The currently active contexts are evaluated, and input mapping is performed
    • Once a list of actions, states, and ranges is obtained, we package this up into a special data structure and invoke the appropriate callbacks

      Note that we specifically might want to allow more than one context to be valid at once; this is often useful for cases where basic activities (running around) are always available to the player, but specific activities need to be restricted based on the current scenario (what weapons I'm carrying, perhaps).

      I recommend implementing this as a simple ordered list: each context in the list is given the raw input for the frame. If the context can validly map that raw input to an action, state, or range, it does so; otherwise, it passes on to the next context in the list. This can be done effectively using something like a Chain of Responsibility pattern. This allows us to prioritize certain contexts to make sure they always get first crack at mapping input, in case the same raw input might be valid in multiple active contexts. Generally, the more specific the context, the higher priority it should carry.
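The ordered-list idea can be sketched in a few lines. Here contexts are reduced to a plain table of bindings plus an "active" flag; the `Context` struct, the integer IDs, and the `mapThroughContexts` function are all hypothetical simplifications.

```cpp
#include <map>
#include <vector>

using RawButton = int;  // hypothetical raw input ID
using MappedId = int;   // hypothetical high-level action/state ID

struct Context {
    std::map<RawButton, MappedId> bindings;
    bool active = true;
};

// Contexts are stored highest priority first. The first active context that
// has a binding for the raw button wins; later contexts never see it.
bool mapThroughContexts(const std::vector<Context>& contexts,
                        RawButton button, MappedId& out) {
    for (const Context& context : contexts) {
        if (!context.active) continue;
        auto it = context.bindings.find(button);
        if (it != context.bindings.end()) {
            out = it->second;
            return true;
        }
    }
    return false;  // no active context wanted this input
}
```

Putting a "vehicle" context ahead of an "on foot" context, for example, lets the vehicle bindings override any shared buttons while everything else falls through to the general controls.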

      The other half of this scenario is the callback system. Again there are several ways to approach this, but in my experience, the most powerful and flexible method is to simply register a set of general callbacks that are given input every frame (or whenever input is available). Again, a chain of responsibility works well here: certain callbacks might want first crack at handling the mapped input. This is again useful for special situations like debug modes or chat windows.

      Have the input mapper wrap up all of its mapped inputs into a simple data structure: one list of valid actions, one list of valid states, and one list of valid ranges and their current values. Then pass this data on to each callback in turn. If a callback handles a piece of input, it should generally remove it from the data structure so that further callbacks don't issue duplicate commands. (For instance, suppose the M key is handled by two registered callbacks; if both callbacks respond to the key, then two things will happen every time the player presses the M key! Oops! So if the first callback to handle the key "eats" it from the list, then we don't have to worry, and we can use a simple priority system to make sure that the most sensible callback gets dibs on the input.)
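To make the "eat the input" idea concrete, here is one possible shape for the data structure and a prioritized callback list. The IDs, the `eatAction`/`eatState` helpers, and the `CallbackTable` class are assumptions made for this sketch; the convention that lower priority numbers run first is arbitrary.

```cpp
#include <functional>
#include <map>
#include <set>
#include <utility>

// Hypothetical high-level IDs.
enum class Action { ToggleMap, OpenChat };
enum class State { Run };
enum class Range { LookX };

struct MappedInput {
    std::set<Action> actions;
    std::set<State> states;
    std::map<Range, double> ranges;

    // "Eat" helpers: report whether the input was present, and remove it so
    // lower-priority callbacks do not react to it again.
    bool eatAction(Action a) { return actions.erase(a) > 0; }
    bool eatState(State s) { return states.erase(s) > 0; }
};

using InputCallback = std::function<void(MappedInput&)>;

class CallbackTable {
public:
    // Lower priority numbers run first (an arbitrary choice for this sketch).
    void add(int priority, InputCallback callback) {
        callbacks_.insert(std::make_pair(priority, std::move(callback)));
    }

    void dispatch(MappedInput& input) const {
        for (const auto& entry : callbacks_) entry.second(input);
    }

private:
    std::multimap<int, InputCallback> callbacks_;
};
```

If a chat-window callback registered at priority 0 eats `Action::ToggleMap`, a gameplay callback at priority 10 never sees it, which avoids the "two things happen on M" problem.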

      High Level Handling
      Once the input is available, we simply need to act on it. For actions and states, this is just a matter of having our callbacks investigate the data list and take action appropriately. Ranges are similar but slightly more complex in that we have to turn the input value into something useful. For things like joysticks, this is easy: use a normalized -1 to 1 value and just multiply that by your sensitivity factor, and poof, you have a mapped range of input. (Try using a logistic S-curve or other interpolator for better results than just multiplication.) For mice, you can use the value to tell you how far to move the cursor/camera, again possibly by using a scaling factor for sensitivity purposes.
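For the curious, here is one way the two approaches to range handling could be written. The rescaled logistic function below is just one reading of "S-curve"; the `steepness` parameter and both function names are hypothetical, and other interpolators would work equally well.

```cpp
#include <cmath>

// Plain multiplication, as described in the text.
double applySensitivity(double normalized, double sensitivity) {
    return normalized * sensitivity;
}

// Rescaled logistic curve: maps -1..1 onto -1..1 and passes through 0.
// `steepness` is a hypothetical tuning knob; it must be greater than zero.
double logisticCurve(double x, double steepness) {
    auto sigmoid = [steepness](double v) {
        return 1.0 / (1.0 + std::exp(-steepness * v));
    };
    // Shift so that 0 maps to 0, then scale so that 1 maps to 1.
    return (sigmoid(x) - 0.5) / (sigmoid(1.0) - 0.5);
}
```

The curve can be applied to the normalized value before the sensitivity multiplier, e.g. `applySensitivity(logisticCurve(stick, 4.0), sensitivity)`. With this particular rescaling, small stick deflections are amplified and the response flattens toward the edges; whether that feels right is something to tune per game.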

      The specifics of this third layer are really up to your game's design and your imagination.

      Putting Everything Together
      So, let's recap the basic flow of data through the system:

      1. The first layer gathers raw input data from the hardware, and optionally normalizes ranged inputs
      2. The second layer examines what game contexts are active, and maps the raw inputs into high-level actions, states, and ranges. These are then passed on to a series of callbacks
      3. The third layer receives the callbacks and processes the input in priority order, performing game activity as needed

      That's all there is to it!

      A Word on Data Driven Designs
      So far I've been vague as to how all this is actually coded. One option is certainly to hard-code everything: in context A, key Q corresponds to action 7, and so on. A far better option is to make everything data driven. In this approach, we write code once that can be used to handle any context and any input mapping scheme, and then feed it data from a simple file to tell it what contexts exist, and how the mappings work.

      The basic layout I typically use looks something like this:

      • rawinputconstants.h (a code file) specifies a series of ID codes, usually in an enumeration, corresponding to each raw input (from hardware) that we might handle. These are divided up into "buttons" and "axes." Buttons can map to states or actions, and axes always map to ranges.
      • inputconstants.h (a code file) specifies another set of ID codes, this time defining each action, state, and range available in the game.
      • contexts.xml (a data file) specifies each context in the game, and provides a list of what inputs are valid in each individual context.
      • inputmap.xml (a data file) carries one section per context. Each context section lists out what raw input IDs are mapped to what high-level action/state/range IDs. This file also holds sensitivity configurations for ranged inputs.
      • inputranges.xml (a data file) lists each range ID, its raw value range (say, -100 to 100), and how to map this onto a normalized internal value range (such as -1 to 1).
      • A code class called RangeConversions loads inputranges.xml and handles converting a raw value to a mapped value.
      • A code class called InputContext encapsulates all of the functionality of mapping a single context worth of inputs from raw to high-level IDs, including ranges. Sensitivity configurations are applied here. This class basically just exists to act on the data from inputmap.xml.
      • A code class called InputMapper encapsulates the process of holding a list of valid (active) InputContexts. Input is passed into this class from the first-layer code, and out into the third-layer code.
      • A code class (usually a POD struct in C++ versions of the system) called MappedInput holds a list of all the input mapped in the current frame, as covered above.
      • Each frame (or whenever input is available), the first layer of input code takes all of the available input and packs it into an InputMapper object. Once this is finished, it calls InputMapper.Dispatch() and the InputMapper then calls InputContext.MapInput() for each active context and input. Once the final list of mapped input is compiled into a MappedInput object, the MappedInput is passed into each registered callback, and the high-level game code gets a chance to react to the input.
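Here is a stripped-down sketch of how those classes could fit together for button-to-action mapping only. The class and method names `InputMapper`, `InputContext`, `MappedInput`, `Dispatch`, and `MapInput` come from the layout above, but every signature, the integer IDs, and the helper methods (`Bind`, `PushContext`, `AddCallback`, `SetRawButtonPressed`) are assumptions for illustration. Loading the XML files, states, and ranges are left out.

```cpp
#include <functional>
#include <map>
#include <set>
#include <utility>
#include <vector>

using RawButton = int;  // hypothetical raw ID
using ActionId = int;   // hypothetical high-level ID

struct MappedInput {
    std::set<ActionId> actions;
};

class InputContext {
public:
    void Bind(RawButton button, ActionId action) { bindings_[button] = action; }

    // Returns true and fills `out` if this context maps the button.
    bool MapInput(RawButton button, ActionId& out) const {
        auto it = bindings_.find(button);
        if (it == bindings_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<RawButton, ActionId> bindings_;
};

class InputMapper {
public:
    using Callback = std::function<void(MappedInput&)>;

    void PushContext(InputContext context) { contexts_.push_back(std::move(context)); }
    void AddCallback(Callback callback) { callbacks_.push_back(std::move(callback)); }

    // Layer one calls this for each button pressed this frame.
    void SetRawButtonPressed(RawButton button) { pressed_.push_back(button); }

    // Maps the frame's raw input, hands it to the callbacks, then resets.
    void Dispatch() {
        MappedInput mapped;
        for (RawButton button : pressed_) {
            for (const InputContext& context : contexts_) {
                ActionId action = 0;
                if (context.MapInput(button, action)) {
                    mapped.actions.insert(action);
                    break;  // first (highest-priority) context wins
                }
            }
        }
        for (const Callback& callback : callbacks_) callback(mapped);
        pressed_.clear();
    }

private:
    std::vector<InputContext> contexts_;  // front = highest priority
    std::vector<Callback> callbacks_;     // called in registration order
    std::vector<RawButton> pressed_;
};
```

The first layer would call `SetRawButtonPressed` as raw events arrive and then `Dispatch` once per frame; the third layer only ever sees `MappedInput`.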

      And there you have it! Complete, end-to-end input handling. The system is fast, easily extended to handle new game functionality, easily configurable, and simple to use.

      Go forth and code some games!

      If you'd like to see an example of how this works in action, check out the Input Mapping Demo at my Google Code repository.


Recommended Comments

That was a good read - it was all fairly logical, but it is good to see it all listed out. Do you have any interest in putting together some sample code for it?

This type of abstraction layer would be useful to add to the Hieroglyph 3 engine, since at the moment it uses a fairly rudimentary event system (not much more than a wrapper for the Win32 callback). If you have an implementation lying around, and have the interest to add it to an open source project, then let me know!

Unfortunately the (good) implementations I've done in the past are all in shipped titles, and as such I can't share them :-)

But I'd be happy to rewrite one if I get some spare time; it's pretty easy to do and shouldn't take terribly long. Shoot me a PM and we'll work out the details.

Nice! This is exactly the system I am implementing right now. One thing I was wondering is whether to actually list "ALL game actions" in some configuration file, as this may lead to difficulties when adding/removing actions as the game is developed. The main structure of 3 layers, though, is EXACTLY what I designed myself. Excellent read! Would love to see a code sample.

Nice stuff, I based my input code on this.

I don't know if it's just my design, but I ended up doing away with Actions. When I created callbacks to the inputs, I have an onPressed and onReleased and I can easily have a state callback work like an action by only implementing the onPressed or onReleased call and not both.

So in my input system, there is only Range, and State.

If you think about it, even Range and State are the same. State is just Range but with either 0 or 1, with no in between.

Have you found a way to unify polling and callbacks? I'm working on an input conversion subsystem to handle joystick/pedal input as well as keyboard and mouse, and to make the mouse act like a simple joystick should a real joystick not be available.

This is going to take some time to think on and was hoping to get some advice/tips.

Nice article to get together the basics, thank you.

One thing I would change is to not map one single raw input to an action, but a RawInput structure which specifies the exact state of the raw input so the action is triggered. For example: Trigger the Copy action if CTRL and C are down.

I'm so confused about the series of callbacks.

Is it a one-to-one relationship between contexts and callbacks?

like menu context to menu callback versus fighting context to fighting callback?

First, I thought I was sending a comment until that horrible moment when I realized I PM'd instead. My apologies ApochPiQ.


Second, I realize that this is exactly what I have been looking for and the post was well done. However, I am unable to follow the off-site code example. Some 'Stupid Action' gets mapped where? I'd love to follow a 'chain-of-custody', for lack of better words, through the different pages of code.

I know this is a fairly old thread but there is something I don't quite understand about the structure of the Context files.

In a Context file Buttons are mapped to actions like so...
3 0
which means that Button 3 is mapped to Action 0. The numbers later get translated to expressive enums.
Now my question is: is "Button" supposed to refer only to keyboard buttons, or to buttons from an arbitrary input device like a gamepad, joystick, and so on? If the latter is the case, wouldn't there be the problem that any button that produces a certain number code always maps to the same Action, which means that arbitrary layouts for more than one input device are not possible?
