Artist Ex Machina

Published May 26, 2006
The first word that comes to mind is "enlightening."


Actually, the first thing that comes to mind is more of a phrase, and its words average about four letters apiece. I won't repeat it here, as I suspect even the veteran sailors among us would be made slightly uncomfortable.


(Now I'm out of hyperbole and melodrama, so I'll just tell it straight. OK, maybe with a little bit of exaggeration. You know, screw it - 99% of what follows is probably total lies. But at least they'll hopefully be entertaining lies.)


So this cutscene system is getting larger. And by "getting larger" I mean that I am more or less resigned to dwelling in a cave for several millennia, doing nothing but writing code and subsisting on recycled guano, just to get the outline of the functionality done. It should only take a couple of lesser eternities to actually produce a working alpha.

It isn't really as bad as I'm trying to make it sound, but it is rather severe. Originally, I'd understood the system as having two main parts: a sort of generic rendering service that draws pretty movie clips on the screen, and a semi-intelligent decision system that picks out different interesting things to draw, and feeds them into the rendering chain. I arrived here at the company office proud of my accomplishment (I'd finished a viewable prototype of the render chain, loading a dynamic scene from XML files) and was ready for my parade in the streets, complete with fountains of beer and numerous women with... shall we say... highly friendly dispositions. Well, I would have settled for a "cool, you're on schedule, that's great. What's next?"

Instead you'd think I came in and announced that I was going to kill everyone by suffocating them in my rectum. There was that kind of awkward silence and shuffling-of-feet going on. Although, to be fair, I was asked to give the presentation rather suddenly. I'd dismantled the actual live-demo part in the airport during a layover on the trip over, so all I really had to show was stuff that had been in the project wiki for almost two months, and the demo XML file. I'd also been roughly awakened from my jetlag-induced nap no more than an hour before. I'm not entirely sure if the response was due to my presentation's utter lack of content, or if I inadvertently used a term like "schweinhund" along the way. I may never find out.


In any case, as it happens, the expectations for what this system will do have grown quite a lot since the last team discussion we had (at least, the last one I was involved in). The "semi-intelligent" side of things - the part that picks out cool stuff to show on the screen - has inflated a bit. I would have been ready to handle, say, Data from Star Trek, or perhaps even the HAL 9000. Unfortunately, things are slightly out of hand. Suffice it to say that by the time this thing works according to the current spec, it will set new standards for measuring just how smart something is. In fact, if you randomly pick a sequence of 10 lines of code from the (hypothetical) finished code, those ten lines will - by themselves - be smarter than Chuck Norris. I think it is now clear just how far things have gone.


The system is now going to need at least three tiers on top of the existing (and basically done) rendering framework. The first is a sort of cameraman that is responsible for composing individual shots, given a list of subjects (a ship, a planet, and that thing over there that's blowing up) and some priorities (explosions cool, planets pretty, ship not so important just now). This will be challenging enough simply because it requires a lot of logic that is typically approached subjectively by the artists - which is to say, there are no written rules for how to make a particular shot "look good."
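Just to give a flavor of the problem, here is the dumbest possible cameraman sketch I can think of. Every name and number below is invented for this post (the real code is NDA'd anyway): aim at the priority-weighted centroid of the subjects, then pull back far enough to frame the most distant one.

```cpp
// Hypothetical sketch only - the real system looks nothing like this.
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

struct Subject {
    Vec3  position;
    float priority;   // explosions high, planets medium, ships low
};

struct Shot {
    Vec3 lookAt;
    Vec3 cameraPos;
};

Shot ComposeShot(const std::vector<Subject>& subjects)
{
    Shot shot = { {0, 0, 0}, {0, 0, 0} };

    // Aim at the priority-weighted centroid of the subjects.
    float totalWeight = 0;
    for (const Subject& s : subjects) {
        shot.lookAt.x += s.position.x * s.priority;
        shot.lookAt.y += s.position.y * s.priority;
        shot.lookAt.z += s.position.z * s.priority;
        totalWeight   += s.priority;
    }
    if (totalWeight <= 0)
        return shot;                    // nothing worth looking at
    shot.lookAt.x /= totalWeight;
    shot.lookAt.y /= totalWeight;
    shot.lookAt.z /= totalWeight;

    // Distance to the farthest subject decides how far we pull back.
    float spread = 0;
    for (const Subject& s : subjects) {
        float dx = s.position.x - shot.lookAt.x;
        float dy = s.position.y - shot.lookAt.y;
        float dz = s.position.z - shot.lookAt.z;
        spread = std::max(spread, std::sqrt(dx*dx + dy*dy + dz*dz));
    }
    shot.cameraPos = { shot.lookAt.x,
                       shot.lookAt.y + spread * 0.5f,
                       shot.lookAt.z - spread * 2.5f };   // framing fudge factor
    return shot;
}
```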

It doesn't stop there, though! The next layer up is a sort of director, who will go over a scene and pick out groups of interesting subjects. The director then gives the subjects to the cameraman logic and says "make a nice shot showing this stuff over here." After the cameraman decides on a shot, he generates some parameters for the render pipeline, which in turn boils things down into atomic operations in the actual 3D engine itself.

Overwhelmed yet? We're not done. The final layer is an evolution of the "event watcher" system that I'd originally thought was all the smartness needed by this thing. This top layer observes the entire universe and acts as a sort of agent for the player, picking out hotspots of "important things" and telling the director logic to show them.
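In interface terms, my mental model of the stack looks something like the following. To be clear, these types are pure speculation for the sake of the post, not the actual design:

```cpp
// Speculative shape of the three tiers; every name here is invented.
#include <vector>

struct Subject  { /* ship, planet, explosion... */ };
struct Shot     { /* parameters handed to the render pipeline */ };
struct Scene    { /* a local neighborhood of subjects */ };
struct Universe { /* the entire simulation */ };

// Tier 1: the cameraman composes one shot from subjects + priorities.
class ICameraman {
public:
    virtual ~ICameraman() {}
    virtual Shot ComposeShot(const std::vector<Subject>& subjects) = 0;
};

// Tier 2: the director picks interesting groups out of a scene and
// asks the cameraman to frame each group.
class IDirector {
public:
    virtual ~IDirector() {}
    virtual std::vector<Shot> DirectScene(const Scene& scene,
                                          ICameraman& cameraman) = 0;
};

// Tier 3: the event watcher scans the whole universe for hotspots
// and hands them to the director on the player's behalf.
class IEventWatcher {
public:
    virtual ~IEventWatcher() {}
    virtual std::vector<Scene> FindHotspots(const Universe& universe) = 0;
};
```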


To make this all the more interesting, each layer has to be accessible directly, without touching the upper layers. So we need the power to manually control a scene (which is now basically finished), or to manually control which shots are displayed, or to manually specify an environment in which the director should find interesting stuff to show.
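Reusing the invented interfaces from the sketch above, the three entry points stack roughly like so (RenderPipeline is an equally hypothetical stand-in for the existing render chain):

```cpp
// Reuses the invented ICameraman/IDirector/IEventWatcher types from
// the earlier sketch; RenderPipeline is a made-up stand-in too.
struct RenderPipeline {
    void PlayScene(const char* sceneXmlPath) { /* load XML, render */ }
};

// Layer 0: drive the render chain directly from a scene description.
void ManualScene(RenderPipeline& pipeline, const char* sceneXmlPath)
{
    pipeline.PlayScene(sceneXmlPath);
}

// Layer 1: hand-pick the subjects, let the cameraman frame them.
void ManualShots(ICameraman& cameraman, const std::vector<Subject>& subjects)
{
    Shot shot = cameraman.ComposeShot(subjects);
    // ...feed the shot parameters into the pipeline...
}

// Layers 2+3: point the watcher at the universe and stand back.
void FullyAutomatic(IEventWatcher& watcher, IDirector& director,
                    ICameraman& cameraman, const Universe& universe)
{
    for (const Scene& scene : watcher.FindHotspots(universe))
        director.DirectScene(scene, cameraman);
}
```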

Thankfully, I've got a very solid base design that will be able to handle all this data, and stacking the different systems will work out. It actually fits nicely into the horizontally-stratified abstraction/"mini dialect" concept that I've burbled about at great length. The only downside is that all the people interested in the system itself are also people who don't spend their time contemplating software architecture, so it's very hard to convince them that A) all the work I've done on the base rendering chain is actually necessary, and B) it's better to build this from the bottom up as a general system, instead of just hacking in a bunch of special-case code to finish the tech demo they want to see.


Interestingly, the artists themselves don't really see this as feasible, for the most part. Most of the pressure is coming down from On High to reduce the amount of work needed to generate cutscenes, plus of course the obvious bonus that once we have this technology, we can use it throughout the game for various cool things. The term "clever code" has been abused enough to make a two-dollar hooker feel positively loved. I've started privately referring to this entire escapade as the "Artist ex Machina." I deeply fear that the code may need to be more clever than I am, which makes it a bit hard to write - and debug. (There's some famous quote about that somewhere, but I'm too lazy to dig it up.)

All of this stuff might be possible, maybe, but it's going to take a heck of a lot of work, and even after all that, we have no guarantees that the results will look good. So no pressure on me, really; we don't have an entire product design riding on this, or anything. That's definitely not a big load to carry out of bed every morning. Nope.

But then again, hell, this is what I live for. I was perpetually pissed at my last job because I never really had any major challenges. I'm just getting what I wished for... now I have to find out if that wish was a dumb mistake [smile]


There's plenty of other interesting little tidbits of team culture and personality coming to light, but I'll have to bore you with those at a later time, because right now I should probably be working [grin]

Comments

Ysaneya
Haha, cool post. It does sound like a real challenge. I'm planning to do something similar in a new Infinity prototype in the coming months: a planet generator, where you change some parameters (heightmap generator, texturing type, atmosphere, vegetation type, etc.) at the high level, and the prototype then automatically generates "random" camera views of some interesting places on the planet. The whole problem lies in defining what's "interesting" and how to evaluate whether a given place qualifies. I can already imagine the problems I'll have, and your system sounds much more ambitious :) Good luck.
May 26, 2006 06:25 AM
takingsometime
Cinematography sucks. This is a bit of a rant, but hopefully it'll be useful [smile].

It sounds like you're on the right track, as the multi-level architecture appears to be the 'standard' method for approaching the problem. I'm not sure if it'll help, but here is a list of papers that I used while doing my PhD that may be in some way related to what you're doing. Thesis bibliography.

I could be speaking out of my ass (in fact I'm pretty sure I am), but I have thought about such systems a little bit.

I personally build everything on top of my constraint-based camera system, but this should work regardless of how the camera is implemented. Each 'atomic' property is encoded as a Profile, which dictates what sort of visualization to use. Each profile can make use of any constraints it wishes (e.g. Height, distance, size in viewport, etc. are all valid constraints). Each profile is then treated as a state, so I have a profile for an establishing shot, a profile for two-shots, etc.
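In rough code, a profile boils down to something like this (all names and numbers invented for illustration):

```cpp
// Rough illustration of the profile idea - every name is made up.
#include <string>
#include <vector>

// A single soft requirement on the camera, weighted so a solver can
// trade constraints off against each other.
struct Constraint {
    enum Kind { Height, Distance, SizeInViewport } kind;
    float target;    // desired value (e.g. a distance of 50 units)
    float weight;    // how strongly this profile cares about it
};

// A Profile bundles the constraints that define one kind of shot.
struct Profile {
    std::string name;
    std::vector<Constraint> constraints;
};

Profile MakeEstablishingShot()
{
    Profile p;
    p.name = "establishing";
    p.constraints.push_back({ Constraint::Distance,       50.0f, 1.0f });
    p.constraints.push_back({ Constraint::Height,         10.0f, 0.5f });
    p.constraints.push_back({ Constraint::SizeInViewport,  0.2f, 0.8f });
    return p;
}
```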

To keep things simple, I just encode higher-level knowledge into a state machine (similar to He et al., except mine isn't hierarchical). Since most cinematography follows a given sequence (i.e. show the establishing shot first, then over-the-shoulder to the speaking actor, etc.), a state machine is often sufficient to achieve cinematics. The cinematics are usually general enough that they can be applied to arbitrary scenes. To keep them general, you should avoid making too many transitions based on time. Transitions on who is talking, action starting, etc. are much more reliable across multiple scenes.
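A bare-bones sketch of what I mean, with invented states and events:

```cpp
// Shot-sequencing state machine keyed on scene events rather than
// timers; all identifiers are illustrative.
#include <map>
#include <string>
#include <utility>

enum class SceneEvent { SceneStart, ActorSpeaks, ActionStarts, ActionEnds };

class ShotStateMachine {
public:
    ShotStateMachine() : current_("establishing")
    {
        // Transition table: (current profile, event) -> next profile.
        transitions_[{ "establishing",      SceneEvent::ActorSpeaks  }] = "over-the-shoulder";
        transitions_[{ "over-the-shoulder", SceneEvent::ActionStarts }] = "action-close-up";
        transitions_[{ "action-close-up",   SceneEvent::ActionEnds   }] = "establishing";
    }

    // Returns the profile (shot type) to use after this event fires.
    const std::string& OnEvent(SceneEvent e)
    {
        auto it = transitions_.find({ current_, e });
        if (it != transitions_.end())
            current_ = it->second;   // no matching transition: hold the shot
        return current_;
    }

private:
    std::string current_;
    std::map<std::pair<std::string, SceneEvent>, std::string> transitions_;
};
```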

I am in the middle of building a graphical editor that allows you to create arbitrary cinematics (as a state machine), with customised transitions. It's not finished yet, but I'll be posting it through my journal when it's ready/usable.

I think one of the articles at GDC this year talked about encoding cinematography as a Markov Model, but I haven't really attempted to track down the information or the paper. Marc Christie also has a new cinematography paper (from EuroGraphics?) that I haven't read, and Arnav Jhala has one in this year's AIIDE. They might provide some more ideas/reassurance.

A weighted system should work well in this case.

Basically, each 'event' that occurs has a pre-defined weight assigned by the artists. Events such as people dying, explosions, etc. are given higher initial weights. The weights could be drawn from a set (low, medium, high), or just be arbitrary numerical values.

As time progresses, the weights decay (either linearly or otherwise), giving events that occur at the same time but are less interesting, yet still pertinent (such as a ship landing), a chance of being selected. Each time an event restarts, its weight is reset, allowing the event to be visualized by the system again.
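In sketch form, with placeholder numbers and linear decay chosen purely for simplicity:

```cpp
// Decaying-weight event selector; all values are placeholders.
#include <algorithm>
#include <vector>

struct GameEvent {
    float baseWeight;     // assigned by artists: explosion high, landing low
    float currentWeight;  // decays as the event ages
    float age;            // seconds since the event (re)started
};

// Age every event and decay its weight; exponential works just as well.
void Tick(std::vector<GameEvent>& events, float dt, float decayPerSecond)
{
    for (GameEvent& e : events) {
        e.age += dt;
        e.currentWeight = std::max(0.0f, e.baseWeight - decayPerSecond * e.age);
    }
}

// Restarting an event resets its weight so it can win selection again.
void Restart(GameEvent& e)
{
    e.age = 0.0f;
    e.currentWeight = e.baseWeight;
}

// Pick the event to visualize next: highest current weight wins.
GameEvent* SelectEvent(std::vector<GameEvent>& events)
{
    GameEvent* best = nullptr;
    for (GameEvent& e : events)
        if (!best || e.currentWeight > best->currentWeight)
            best = &e;
    return best;
}
```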

For different effects, I would encode several state machines, each with a given 'mood'. Angry/action scenes have choppier cuts, and often use closer shots. If you don't feel like storing multiple state machines, you could probably parameterise (word?) the state machine, allowing the mood to be defined later. I have yet to see an approach for automatically generating/deriving the mood of a scene, although Nicolas Halper talks about a possible metric in his PhD thesis (Downloadable from here). This wasn't really my area of research, so such solutions might exist.

I've often contemplated associating cinematic information with the audio being processed. When an explosion sound effect is played, you derive how important it is (its weight), and check whether it's close enough to the current action to be worth showing. This method doesn't have to be based on audio, so the distance-to-weight metric can still be applied.
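The falloff itself could be as simple as this (linear, again just for illustration):

```cpp
// One possible distance-to-weight falloff; linear purely for simplicity.
float EffectiveWeight(float baseWeight, float distanceToAction, float falloffRadius)
{
    if (distanceToAction >= falloffRadius)
        return 0.0f;                  // too far from the action to cut to
    return baseWeight * (1.0f - distanceToAction / falloffRadius);
}
```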

As you've noted in your post, I'd be pretty wary of adding too much intelligence to the process. It's a shit to implement, often sucks from a computational stand-point, and can be a real bastard to debug. Then you have the added hassle of measuring its effectiveness based on its artistic merits.

I don't expect you to use what I've suggested (or share how you're going about it), but hopefully it's given you a few things to ponder. These are just some cobbled together ideas, but I'm happy to discuss this further if you feel the need/desire [smile].
May 26, 2006 09:08 PM
ApochPiQ
Awesome! That looks like some very handy reading.

I have to confess that, for all the myriad things I've goofed around with in my time, I've never touched cinematography or even really much to do with camera systems. So this is all totally foreign territory for me. It seems like some of this stuff can be done algorithmically, but when it comes down to things like shot composition, there doesn't even seem to be a consensus among artists as to the best subjective approach - let alone anything that can be codified into an objective system.

My thought right now is to try and sort out some "characters" - for each input constraint on a shot, have a sort of flexible range of possible values. Different combinations of various ranges will lead to different "feels" of the final shot, and if done nicely might even end up looking like work by different directors in an actual film sequence. I'm not sure if I'm being overly optimistic about mathematizing all of this, but it sounds good in my head at least [smile]
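In napkin-sketch form, with every value invented on the spot, a "character" might be nothing more than a bag of ranges:

```cpp
// Napkin sketch of the "characters" idea: a director personality is
// just a set of allowed ranges, one per shot constraint. All numbers
// are invented for illustration.
#include <cstdlib>
#include <string>

struct Range {
    float min, max;
    float Pick() const    // sample somewhere inside the allowed range
    {
        return min + (max - min) * (std::rand() / static_cast<float>(RAND_MAX));
    }
};

struct DirectorCharacter {
    std::string name;
    Range shotDistance;   // how far from the subjects
    Range cutLength;      // seconds between cuts
    Range cameraHeight;   // above/below the subject plane
};

// Two very different "feels" from the same underlying machinery.
const DirectorCharacter kFrenetic = { "frenetic", {  5, 15 }, { 0.5f,  2 }, {  0,  5 } };
const DirectorCharacter kStately  = { "stately",  { 40, 90 }, {    4, 10 }, { 10, 30 } };
```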


Anyways, thanks a heap for the links. I might drop you a PM and pick your brain later if I run into any terribly hard questions.

As for releasing details... due to NDA and such I probably won't be able to do much more than drop vague hints, but we'll see.
May 27, 2006 04:18 AM