Demo and Explanation of Third Person Camera

I got a message from someone asking for more information about the camera system I briefly described in the previous entry, so rather than responding with an explanation via message I'll just dump it here. TL;DR version: At the end of this post is a link to a small third-person ARPG foundational demo to play with, done with Urho3D+Lua, that shows the camera I describe here in action.

Long and windy version: Here is the file camera.lua, directly from my scripts directory:


It's written in Lua for Urho3D, of course, but I think it's clear enough that anyone ought to be able to adapt it as needed to other engines.

First of all, it's implemented as an Urho3D ScriptObject component. Urho3D is a component-based engine, where components (such as cameras, input handlers, AI controllers, animated models, etc...) are added to Nodes, which act as transformation hierarchies + component containers. If you want an object to have a bit of static geometry, you add a StaticModel component. Want it to emit noise, you add a SoundSource. Want it to have a scripted AI system, you add a ScriptObject with a reference to the script class to drive it. The camera presented here is an example of the latter: a script object module that is added to a Node in order to implement the camera behavior.

Urho3D has an event-passing backbone that is central to the behavior of the camera. Each frame an event, Update, is sent to all objects that subscribe to listen for it. This Update event passes along a time step parameter that can be used to advance things by the specified amount of time. Additionally, custom events can be sent by other objects, and subscribed to by any interested parties.

When a ScriptObject component is created, a script hook provided in the script class called Start() is called, if it exists. I use this hook to set default data for how the camera is to be configured. Currently, the Start() hook looks like this:

function ThirdPersonCamera:Start()
	self.cellsize=128           -- Orthographic on-screen size of 1 unit
	self.pitch=30               -- 30 degrees for standard 2:1 tile ratio
	self.yaw=45                 -- 45 degrees for standard axonometric projections
	self.follow=10              -- Target zoom distance
	self.minfollow=1            -- Closest zoom distance for perspective modes
	self.maxfollow=20           -- Furthest zoom distance for perspective modes
	self.clipdist=60            -- Far clip distance
	self.clipcamera=true        -- Cause camera to clip when view is obstructed by world geometry
	self.springtrack=true       -- Use a spring function for location tracking to smooth camera translation
	                            -- Set to false to lock camera tightly to target.
	self.allowspin=true         -- Camera yaw angle can be adjusted via MOUSEB_MIDDLE + Mouse move in X
	self.allowpitch=true        -- Camera pitch can be adjusted via MOUSEB_MIDDLE + Mouse move in Y
	self.allowzoom=true         -- Camera can be zoomed via Mouse wheel
	self.orthographic=false     -- Orthographic projection
	self.curfollow=self.follow  -- Current zoom distance (internal use only)
	self.followvel=0            -- Zoom movement velocity (internal use only)
	self.pos=Vector3(0,0,0)     -- Vars used for location spring tracking (internal use only)
	-- (remaining internal spring and shake state omitted here)
end

There are code comments beside each parameter to indicate their use, but I'll give a quick rundown. The first parameter is cellsize, which only has significance if the camera is an orthographic projection. Orthographic projections have no perspective; i.e., things don't get larger as they get nearer the camera. A thing 10 units away appears exactly as large on screen as a thing 1000 units away. Typical orthographic projections map the screen resolution 1:1 to the world unit size; i.e., if the resolution is 1280x720, the screen represents a view of the game world that is 1280 units wide and 720 units tall. For most applications, this really isn't suitable. It is far more convenient to be able to work in units, rather than tens of units or hundreds of units, as would be necessary for the default ortho projections. That is, if you want to use the system of 1 unit = 1 meter, then if you try to draw an object 1 meter tall in a default orthographic projection, it will appear on screen as only a single pixel tall, since 1 unit = 1 pixel.

So the cellsize is useful to scale the screen resolution in order to determine a proper orthographic projection box. It indicates how many pixels on screen a single unit "covers". That is, if you set cellsize=128, then that 1 meter object you drew will now be 128 pixels tall on screen.
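To make the arithmetic concrete, here is a minimal sketch in plain Lua of how cellsize turns a screen resolution into a view size in world units. The helper name ortho_view_size is my own for illustration and is not part of camera.lua.

```lua
-- Hypothetical helper (not from camera.lua): world-space size of an
-- orthographic view for a given screen resolution and cellsize.
local function ortho_view_size(screen_w, screen_h, cellsize)
	-- cellsize is the number of pixels covered by one world unit, so
	-- dividing the resolution by it gives the view size in world units.
	return screen_w / cellsize, screen_h / cellsize
end

local w, h = ortho_view_size(1280, 720, 128)
print(w, h)
```

At 1280x720 with cellsize=128 the view spans 10 units across and 5.625 units up, instead of the 1280 by 720 units of the default 1:1 mapping.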

This kind of scaling is especially important if you are using physics, for example. Most physics libraries like to work in real-world units (velocities expressed as meters per second, masses expressed as kilograms, etc...) and have "optimal" ranges within which the simulation is most stable. For example, a particular library might indicate it works best with objects ranging in size from, say, 0.1 meter up to 1000 meters. Values beyond these ranges can introduce instability or inaccuracy in the simulation. So it is important that your objects themselves occupy the correct numerical ranges. It would not be practical for you to use a default orthographic projection, make your 1-meter object be 128 units tall, and expect the physics simulation to handle that well. (Well, it might, but I wouldn't really count on it.) So the cellsize scaling parameter is highly useful for orthographic projections.

The next parameter is pitch. This parameter indicates the starting pitch, or angle above the horizon, of the camera. I set it to 30 degrees out of old habit; a 30 degree pitch, coupled with the 45 degree yaw denoted by the next parameter (yaw) and an orthographic projection, provides the very common 2:1 tile ratio "isometric" projection that you see in old classics such as Diablo and Diablo 2.

yaw, of course, denotes the spin around the vertical axis. 45 is standard for an isometric.

follow indicates the starting zoom level, or distance of the camera away from the point at which it is aimed. This value, when zooming is allowed, is clamped by the values of minfollow and maxfollow. That is, these two values indicate the nearest and the furthest that the camera can be from the target point. Note that follow, or zoom, has no meaning for an orthographic projection, outside of providing location for near/far clipping. So in ortho projections, it's usually best to disable zoom and set the camera at a reasonable fixed follow value.

clipdist indicates the distance of the far clipping plane.

clipcamera is the flag that controls whether or not the camera should "clip" to occluding objects. If you have ever played World of Warcraft, for example, you have seen that the game provides a freely rotatable camera; you can alter the pitch and yaw by holding the mouse button and moving the mouse, to move the camera around. It always stays centered on the player, and if a solid object such as a tree or a rock comes between the camera and the target, the camera is moved forward until the thing is no longer occluding it. Once the camera moves so that it is no longer being occluded, it smoothly swoops back out to either its set location, or the location of the nearest occlusion again, if any. By setting this clipcamera flag to true, you enable this behavior. Each frame, the camera update method will perform a geometry raycast from the target location toward the camera. It collects all possible intersections and iterates over them, examining each one for a particular flag, solid. (This is specified as a boolean in the Node's user vars structure, an Urho3D-specific means of per-Node arbitrary data.) If any solid Node is encountered, the intersection point is calculated, a small bias value is subtracted, and the camera zoom distance is clamped to this value.
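The clamping step at the end of that process can be sketched in plain Lua as below. This is only an illustration under my own assumptions: the hit list is a hand-made table standing in for the engine's raycast results, and the names clip_follow and bias are hypothetical rather than taken from camera.lua.

```lua
-- hits: array of {distance=number, solid=boolean}, standing in for what a
-- raycast from the target toward the camera might report (assumed shape).
local function clip_follow(hits, follow, bias)
	local nearest = follow
	for _, hit in ipairs(hits) do
		-- Only Nodes flagged solid are allowed to pull the camera in.
		if hit.solid and hit.distance - bias < nearest then
			nearest = hit.distance - bias
		end
	end
	-- Never let the bias push the camera past the target itself.
	if nearest < 0 then nearest = 0 end
	return nearest
end

local hits = {
	{distance = 3.0, solid = false}, -- non-solid object, ignored
	{distance = 6.0, solid = true},  -- wall between target and camera
	{distance = 8.5, solid = true},
}
print(clip_follow(hits, 10, 0.25))
```

With these sample hits the camera is pulled in to 5.75 units: the nearest solid hit at 6.0, minus the 0.25 bias.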

springtrack is a parameter that determines whether or not the camera softly tracks the target. When the flag is false, the camera locks tightly onto the target, moving exactly as the target moves with no deviation. When the flag is true, the camera is attached to the target by a simple spring function. As the target moves, the camera will lag a little bit behind. The spring has a tendency to dampen and soften any sudden movements. If the camera changes targets, the hard lock will cause an instant jump from one target to the other. The soft track will cause the camera to swoop off across the map, drawn by the spring that is suddenly stretched by having the new target be further away, closing in on the target and slowing, or damping, as it draws near.

The hard lock is suitable for games like Diablo, where the player is always controlling a single unit. The soft lock is suitable for party-based games, such as my own Goblinson Crusoe, where the camera is constantly switching from unit to unit, and a hard camera switch can be very disorienting.

allowspin, allowpitch and allowzoom are boolean flags that indicate whether or not the camera's spin (yaw), pitch or zoom levels can be adjusted. If allowspin is true, then you can hold the middle mouse button down and move the mouse left to right to spin the camera around the vertical axis, centered upon the target. If allowpitch is true, then up and down mouse movements while the mouse button is held cause the pitch, or the angle above the horizon, to change, clamped between 0 (camera at full horizontal) and 89 degrees. The allowzoom flag allows you to use the mouse wheel to zoom in or out (adjusting the follow parameter, subject to the constraints of the clipcamera flag, of course). The zoom is smooth, attached to a spring function to avoid jarring translations. Camera clipping to occluding objects is instantaneous; however, the return to zoom level is smooth. These parameters default to true, but if you wish to disable any of them set them to false.

orthographic, of course, indicates whether the projection is to be orthographic or perspective.

The final parameters are for internal use only; they are the various internal data used by the spring functions that drive zooming and soft tracking, as well as the parameters that control camera shaking.

When the script object is created, these members are set to their default values. After the component is created, any of the parameters can be overridden by setting them to new values. Once you have set the values as desired, you call the Finalize() method of the script class:


This is where the transform hierarchy is created. The script object is created into a Node object, and any script object class can access its node through the self.node member. The hierarchy consists of three Nodes, in addition to the root Node to which the component is added. The root level Node is used for positioning the camera; whatever location this Node sits at, that is where the camera will point. Move the root level Node, and you move the camera view. This Node is also used for applying the camera spin, or yaw. The remaining Nodes are created as a chain of children from this root Node. The first Node in the chain is the ShakeNode. This is the Node to which any camera shaking movement is applied. (More about the shake in a little bit.) The next Node is called the AngleNode; it is where the camera pitch is applied. The final Node is called the CameraNode. This is where the actual Camera component is added, and is also where the camera zoom translation is applied.

The way the hierarchy works, then, is that the CameraNode is translated along its local Z-axis according to the value of follow. The AngleNode is rotated around its local X-axis according to the value of pitch. The ShakeNode sits at identity for the most part, unless camera shake is being applied. The root Node is positioned wherever in the world it needs to look, and is rotated around the Y-axis according to the value of yaw. By splitting it up into a hierarchy of Nodes like this, you avoid the math of concatenating it all into a single transform manually; the engine will handle the concatenations. Once the Node hierarchy is set up, the camera is set to a viewport in the Renderer (an internal Urho3D requirement) and any orthographic settings are set based on the orthographic flag and the value of cellsize.
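For anyone porting this to an engine without a node hierarchy, the concatenation the engine does for you can be written out by hand. The sketch below is plain Lua and rests on assumptions of mine: Y is up, the camera is pulled back to (0, 0, -follow) in local space, and rotations use the usual X-then-Y rotation formulas. The sign conventions in your engine may differ, and camera_offset is a hypothetical name.

```lua
-- Offset of the camera from the point it looks at, given yaw and pitch in
-- degrees and the follow distance. Shake is ignored (ShakeNode at identity).
local function camera_offset(yaw_deg, pitch_deg, follow)
	local yaw, pitch = math.rad(yaw_deg), math.rad(pitch_deg)
	-- Start from the CameraNode translation (0, 0, -follow), then apply the
	-- AngleNode pitch about X: the camera rises above the horizon.
	local y = follow * math.sin(pitch)
	local z = -follow * math.cos(pitch)
	-- Apply the root Node yaw about Y: the camera swings around the target.
	local x = z * math.sin(yaw)
	z = z * math.cos(yaw)
	return x, y, z
end

-- Defaults from Start(): yaw 45, pitch 30, follow 10.
print(camera_offset(45, 30, 10))
```

Adding this offset to the root Node's world position gives the camera's world position; the camera then looks back at the root Node.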

After the Nodes are created and initialized, there comes a block of code where events are subscribed to. As I mentioned earlier, Urho3D has a central event-passing system. The camera subscribes to listen to the engine's Update event, as well as listening for a list of custom user events. These user events are generated externally to the camera. For example, if any other object sends a SetCameraPosition event, with a Vector3 data packet containing the position at which the camera should look, the camera will "hear" that event and respond accordingly. If another object sends a ShakeCamera event, with requisite data packets for shake magnitude, speed and damping, then the camera will listen for that as well, and apply camera shaking as requested. These events provide the interface by which objects can interact with the camera.

Nestled in there are a couple of useful custom events, RequestMouseRay and RequestMouseGround. These utilities will calculate a ray corresponding to the location of the mouse, and a Vector3 corresponding to the intersection of the mouse ray with the ground plane, respectively. The Ray query is especially useful for picking and targeting. A couple of other requests in there let external objects query for the position and rotation of the camera. The rotation query is especially useful, for example, with WASD-style player controllers that do over-the-shoulder camera following, like WoW. There, the player's transform often mirrors the yaw of the camera, so it is necessary to query the yaw and set the avatar yaw to match.
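The ground intersection itself is a standard ray/plane test. Here is a minimal plain-Lua sketch, assuming a horizontal ground plane at a fixed height and rays given as plain tables; the function name ray_ground and the table layout are illustrative assumptions, not the handler from camera.lua.

```lua
-- Intersect a ray with the horizontal plane y = ground_y.
-- origin and dir are tables {x=, y=, z=}. Returns nil when the ray is
-- parallel to the plane or the plane lies behind the ray origin.
local function ray_ground(origin, dir, ground_y)
	if dir.y == 0 then return nil end
	-- Solve origin.y + t * dir.y == ground_y for t.
	local t = (ground_y - origin.y) / dir.y
	if t < 0 then return nil end
	return {
		x = origin.x + dir.x * t,
		y = ground_y,
		z = origin.z + dir.z * t,
	}
end

local hit = ray_ground({x = 0, y = 10, z = 0}, {x = 1, y = -1, z = 0}, 0)
print(hit.x, hit.y, hit.z)
```

The direction does not need to be normalized for this test, since only the ratio of its components matters for where the ray meets the plane.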

Now, the event subscription phase includes specifying class methods used to handle the events being subscribed. Many of them are "simple" getters/setters, maybe with a little computation involved as with the mouse Ray and ground intersection routines. I'm not really going to go over all of those too deeply. The real interesting stuff takes place in HandleUpdate(), which is called every frame:


HandleUpdate starts out by obtaining the TimeStep parameter from the eventData that is passed to the method. All events can accept packets of arbitrary data, and the eventData is how it is done. eventData is an instance of the in-engine data type, VariantMap, which is a map of Variants keyed on a string. (Or, rather, string hash, as internally all strings are hashed down to an integer. However, the Lua bindings deal with strings on the outside.) A Variant can hold some arbitrary piece of data, be it a float, a vector of floats, a boolean, a string, or whatever. (Variants are a common paradigm in computer science. See, for example, boost::any.) By setting your data to named fields in a VariantMap and sending it along in a SendEvent() call, you can hand off whatever data you require.

The update method first does the calculation required for camera shake. Camera shake is specified by a magnitude, a speed and a damping factor. (These are set in the HandleShakeCamera method.) Speed determines how quickly the ShakeNode translates during the shake, magnitude determines how far it translates, and damping determines how quickly the movement damps down to 0. By specifying the correct parameters in a ShakeCamera event, you can generate shakes from large, swoopy, long-term earthquake-like motions to short, snappy, hard little thumps. ShakeCamera events can be generated, for example, in spell payloads, to provide a visceral feedback for explosions and the like. Never underestimate the gritty feel of a good camera shake to add "juice" to your fireball spells.
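One way such a shake could be computed is sketched below in plain Lua. To be clear, this is my own guess at a workable formula and not the code from camera.lua: the offset oscillates at a rate set by speed, is scaled by magnitude, and magnitude decays linearly at the damping rate. The function name, the state table layout and the 1.3 frequency ratio are all arbitrary choices for illustration.

```lua
-- shake: {magnitude=, speed=, damping=, time=} (hypothetical state table).
-- Returns the x/y offset to apply to the shake node this frame.
local function shake_step(shake, dt)
	shake.time = shake.time + dt * shake.speed
	-- Two out-of-step oscillations so the motion does not trace a circle.
	local ox = math.sin(shake.time) * shake.magnitude
	local oy = math.cos(shake.time * 1.3) * shake.magnitude
	-- Damp the magnitude toward 0; the shake stops once it gets there.
	shake.magnitude = math.max(0, shake.magnitude - shake.damping * dt)
	return ox, oy
end

local shake = {magnitude = 1, speed = 20, damping = 2, time = 0}
local first_x, first_y = shake_step(shake, 1 / 60)
print(first_x, first_y)
```

A large magnitude with low speed and low damping gives the slow earthquake feel; a small magnitude with high speed and high damping gives a short thump.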

The next segment handles zoom based on mouse wheel input. It first detects whether zoom is enabled, and ensures that zoom is not applied if the cursor is over any UI elements, since some of those elements might use the mouse wheel to scroll and it would be inappropriate to also adjust the camera zoom while scrolling a list. Then it simply adjusts the value of follow based on the amount of mouse wheel movement, clamps it to the minfollow,maxfollow range and carries on.
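As a minimal sketch of that block in plain Lua: the function name, the over_ui argument and the choice that a positive wheel value zooms in are assumptions of mine for illustration, not taken from camera.lua.

```lua
-- Adjust the target follow distance from mouse wheel movement.
-- over_ui: true when the cursor is over a UI element (zoom is skipped).
local function apply_wheel_zoom(follow, wheel, minfollow, maxfollow, over_ui)
	if over_ui then return follow end
	-- Assumed convention: wheel up (positive) moves the camera closer.
	follow = follow - wheel
	-- Clamp to the [minfollow, maxfollow] range.
	return math.max(minfollow, math.min(maxfollow, follow))
end

print(apply_wheel_zoom(10, 3, 1, 20, false))
```

The result is only the target distance; the smoothing toward it happens separately in the spring step.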

Next, the allowspin and allowpitch flags are queried. If either is permitted and the middle mouse button is held down, a code block is entered that hides the mouse cursor during camera adjustment, queries the motion of the mouse, and adjusts the values of pitch and yaw accordingly, clamping pitch to its range to keep camera weirdness from occurring. The else block of the conditional re-shows the mouse cursor.
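Stripped of the engine input calls, the adjustment might look like the plain-Lua sketch below. The function name and the sensitivity parameter are hypothetical; the 0 to 89 degree pitch range is the one described earlier in this post.

```lua
-- cam: table with yaw, pitch, allowspin and allowpitch fields.
-- dx, dy: mouse movement this frame; sensitivity: degrees per unit of
-- mouse movement (an assumed tuning parameter).
local function apply_mouse_look(cam, dx, dy, sensitivity)
	if cam.allowspin then
		cam.yaw = cam.yaw + dx * sensitivity
	end
	if cam.allowpitch then
		local pitch = cam.pitch + dy * sensitivity
		-- Clamp between horizontal (0) and just short of straight down (89).
		cam.pitch = math.max(0, math.min(89, pitch))
	end
end

local cam = {yaw = 45, pitch = 30, allowspin = true, allowpitch = true}
apply_mouse_look(cam, 10, 100, 1)
print(cam.yaw, cam.pitch)
```

Stopping at 89 rather than 90 degrees keeps the camera from looking exactly straight down, where the yaw direction becomes ambiguous.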

Next, the method SpringFollow() is called, given the timestep parameter. SpringFollow is where the spring function that smooths out the movement of camera zoom is applied. It calculates the value of a parameter, curfollow, that denotes the actual distance of the camera, as calculated by the spring equation. If curfollow is not equal to the follow parameter set in the zoom block, then an acceleration is calculated, a velocity is calculated from that, a damping is applied, and a new position is determined that draws the camera zoom nearer to its target goal.
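The sequence described here (acceleration, then velocity, then damping, then position) can be sketched as a small damped spring step in plain Lua. The function name and the stiffness and damping constants are assumptions for illustration; the actual SpringFollow() may use different constants or a different damping form.

```lua
-- Advance a one-dimensional damped spring by dt seconds.
-- cur/vel: current value and velocity; target: value being chased.
-- stiffness: pull strength; damping: per-step velocity multiplier (0..1).
local function spring_step(cur, vel, target, dt, stiffness, damping)
	-- Acceleration is proportional to how far we are from the target.
	local accel = (target - cur) * stiffness
	-- Integrate velocity, then bleed some of it off so the motion settles.
	vel = (vel + accel * dt) * damping
	-- Integrate position with the damped velocity.
	cur = cur + vel * dt
	return cur, vel
end

-- Zoom from 20 toward 10 over ten seconds of 60 Hz frames.
local curfollow, followvel = 20, 0
for frame = 1, 600 do
	curfollow, followvel = spring_step(curfollow, followvel, 10, 1 / 60, 8, 0.9)
end
print(curfollow)
```

Note that a per-step damping multiplier like this one makes the feel depend on frame rate; it is fine for a sketch, but a fixed timestep or a time-scaled damping term would be more robust.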

After the zoom spring is calculated, the value of the clipcamera flag is checked, and if true a code block is entered that performs the ray picking described earlier, testing against objects marked solid. If a solid intersection is found, and the distance to the intersection is less than the value of curfollow, then curfollow is clamped to that distance.

Next, the shake factors calculated earlier are applied as a translation to the ShakeNode. I do it here rather than earlier, since I don't want the shake to affect the clipping done in the clipcamera block. It can cause the camera to go a little haywire during camera shake otherwise.

The next block determines whether to apply a soft track to the camera position, and applies a similar spring function to the Node's position if so. Otherwise, it just sets the position to the last position set in the event handler for SetCameraPosition event.

Finally, the pitch and yaw factors are set to their corresponding nodes, and the zoom is set as a Z-Axis translation on the camera node. The next time a render update happens, the Node transformations are marked dirty and recalculated, providing the proper transformation for the camera.

In order to use the camera, you instance the script object into a Node, set any parameter overrides, and call Finalize:

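local cameranode=scene:CreateChild()
local camera=cameranode:CreateScriptObject("ThirdPersonCamera")
-- Override any defaults here (for example, camera.springtrack=false), then finalize
camera:Finalize()
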
And as easy as that, a camera object is created and placed in the scene, waiting to be remotely controlled by another object using a camera controller. At the end of the camera.lua file is an example of a simple camera controller.


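function CameraControl:Start()
	self.offset=0.5         -- Height offset so the camera aims at the target's head rather than its feet
	self.vm=VariantMap()    -- Reused event data packet for SetCameraPosition
end

function CameraControl:TransformChanged()
	-- Called whenever the Node's transform changes; forward the new position to the camera
	local x,y,z=self.node:GetWorldPositionXYZ()
	self.vm:SetFloat("x", x)
	self.vm:SetFloat("y", y+self.offset)
	self.vm:SetFloat("z", z)
	self.node:SendEvent("SetCameraPosition", self.vm)
end
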
This is about the simplest you can make a camera controller, suitable for a hard-tracking camera on a single unit, such as a Diablo-alike. The controller allows specifying an offset parameter (set to default 0.5 in Start) that allows for pointing the camera at the target object's head rather than at its feet as would occur with an offset of 0. In Urho3D, any time a Node's transform is changed, a script hook on any script objects called TransformChanged is called to alert the script object of that fact. The camera controller uses TransformChanged() to bundle together the Node's position into a VariantMap and pass it along in a call to SendEvent, which sends the SetCameraPosition event. Remember that the camera object hooks this event, so any controller sending it will affect the camera's location. A simple controller like this, then, is not really suitable for a game with multiple potential camera targets. In more complicated games like that, such as Goblinson Crusoe, each controller needs to have an active flag, and some means for specifying which camera controller is active at any given time. It might also need additional functionality for handling hard camera position setting in the case of soft camera tracking (for example, to avoid a camera spring at game start, when the camera location defaults to (0,0,0) and thus has to spring to follow the specified object which is elsewhere in the world.)

To use a camera controller, you simply instance the script object into the object being followed:

After that, any time the object's transformation is changed, an event will be fired off to control the tracking location of the camera.

For games such as RTS games, where you don't directly track any single object, instead of adding a camera controller to any of the units, you would add it to a specially scripted Node that could be moved around using mouse controls and/or keyboard controls. For example, you might pan the camera view by clicking and dragging, or clicking directly on a mini map, or by moving the mouse pointer to the edge of the screen. All of this would be implemented in the camera controller, which would then package up and send off a SetCameraPosition event to move the camera as required.

Get the demo application here (Google Drive)
Here is the link to a demonstration of the camera setup. It sets up a simple maze, a player avatar and a handful of wandering dudes. A few bits have been added to the camera to facilitate on-the-fly changes to the camera flags. On-screen text describes the toggle controls for the camera flags.

To run: Open a command line, navigate to the root of the project, and execute Urho3DANL.exe Scripts/main.lua. You can pass any of the command line options supported by the Urho3DPlayer executable detailed here. Alternatively, you can just execute the batch file makeithappen.bat. The executable is a 64-bit Windows exe; if you are on a 32-bit system, I also packaged a 32-bit exe, Urho3DANL_32.exe, that can be used instead. Controls are simple: point and click to move. The demo includes a very quick Recast/Detour navmesh-based path following point-and-click system, similar to what you might see in a typical third-person RPG. It's a tad bit glitchy; sometimes you or the dudes might get stuck. I'm still in the process of figuring out the system, and there are some fixes and updates I would like to make to it eventually, but for the purpose of this demo it works well enough. (Watch out; currently, it is not possible to flag any geometry as non-walkable, so sometimes you or the dudes might get spawned on top of a rock. Nothing you can do about it but exit and run again, at the moment.)

If you press Space, a camera shake impulse will be sent to the camera so you can see it in action. Additionally, the player and dudes include the ghost materials detailed in the previous journal entry for drawing colored silhouettes when the objects are occluded. You can see how the ghosts as implemented are kind of a glitchy, finicky thing, though; very far or very near zooms and orthographic projection, particularly, are problematic.

When yaw/pitch are enabled you can use the middle mouse button to move the camera around. When zoom is enabled, the mouse wheel zooms in and out.

If you want to do some further experimentation, open up the file Data/Scripts/levelgenerator.lua to tweak the various objects generated into the map. One edit in particular can be illustrative: under the levelobjects.player definition, you can see a line commented out that reads --{Type="ScriptObject", Classname="WASDController"}. Right above it is the line {Type="ScriptObject", Classname="PlayerController"}. If you comment out the PlayerController line instead, and uncomment the WASDController, you get a very rudimentary WASD player movement controller, in lieu of the point-and-click controller. This controller is particularly useful with yaw/pitch/zoom/clipcamera enabled, for WoW-style movement. There are all sorts of other things in there you can fiddle with as well.

If you have any comments, issues or questions, let me know. If you make something cool from this, let me know as well. I'm always on the lookout for cool stuff.

Feb 27 2014 11:25 PM

Thank you for sharing!

Mar 01 2014 12:14 PM

Woot, thanks man!
