New take on Conversation in Video Games

32 comments, last by GalvonicBond 8 years, 7 months ago

I've been looking around but I can't find anything that even remotely touches on an idea I've had, so I think that I might actually have had a completely original idea, which is interesting since I think it's a rather simple idea.

Let me begin with what I know of the current generation of conversation engines inside of games. There are 2 types of conversations, as I see them.

1: No player input at all. Usually used in cut scenes and on-the-move talking. It's good because the player can actually be playing the game while the conversation goes on, so it doesn't break the flow of the game. Some reactive dialogue would be nice for when the player does something stupid, like trying to hit the NPC, to make it feel more alive.

2: Dialogue trees. Recent examples include a good many RPGs, like Mass Effect and the Telltale games. This is good because it lets the player make a choice about how their character acts, even if it's just the illusion of choice, though it would be nice if the dialogue changed more things so that the player is incentivized to actually roleplay a little more.

But I have come up with a different kind of system, and I have no idea whether it would work or how much work it would be to implement.

First we start with an absolutely huge cache of words, phrases, and paragraphs.

Then, we get the NPC or PC to choose what they want to use based on a kind of finite state machine like so.

state 1: Relationship to the one the character is speaking to.

state 2: Current disposition toward the one the character is speaking to.

state 3: Current mood of the character.

state 4: perceived disposition of the one the character is speaking to.

state 5: perceived mood of the one the character is speaking to.

state 6: Topic of discussion.

Given these 6 states, you make the NPC choose their dialogue. Now, it would be fairly easy to add this to an NPC engine, let NPCs dynamically choose their dialogue, and provide the player with a dialogue tree. But I propose that we give the player the exact same state machine that we give the NPC.
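To make the selection step concrete, here is a minimal Python sketch of one way the six values could pick a line from the cache. Everything in it is an assumption for illustration: the names `SpeakerContext`, `CACHE`, and `choose_line`, the sample lines, and the rule that the most specific matching line wins.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpeakerContext:
    relationship: str           # relationship to the listener, e.g. "friend"
    disposition: str            # how the speaker feels toward the listener
    mood: str                   # the speaker's own mood
    perceived_disposition: str  # what the speaker thinks the listener feels
    perceived_mood: str         # what the speaker thinks the listener's mood is
    topic: str

# Hypothetical cache: each line lists the context values it requires.
# Fields a line does not mention are treated as "don't care".
CACHE = [
    ({"topic": "weather"}, "Strange weather lately."),
    ({"topic": "weather", "mood": "angry"}, "This rain can rot for all I care."),
    ({"topic": "weather", "mood": "happy", "relationship": "friend"},
     "Lovely day for a walk, isn't it, old friend?"),
]

def choose_line(ctx, cache=CACHE):
    """Return the matching line with the most requirements, or None."""
    best = None
    for requirements, text in cache:
        if all(getattr(ctx, key) == value for key, value in requirements.items()):
            if best is None or len(requirements) > len(best[0]):
                best = (requirements, text)
    return best[1] if best else None
```

The same function could serve the player and the NPC: the player's UI would only change which values go into the context.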

This would allow us to make conversation a part of the game rather than something that happens in cut scenes or scripted dialogue scenes. We tell the player about the relationship, we give them the ability to change their disposition and mood within reason, we let their perception stats determine how they perceive the mood and disposition of the other character, and give them a few topics they can cycle through based on their knowledge and the character.

Then the conversation would flow like it does in real life, one topic flowing into the next and with the conversation able to dynamically break and reform based on player action.

Can you envision how such a system would work? What are the pros and cons that you can see, just from reading about it? I would like some feedback on this system to help me refine the idea before I start prototyping it.


Those aren't the only 2 types of conversation.

3. Text-only games like MUDs and the King's Quest series commonly featured a vocabulary of key words, which the player used as their main method of interacting with the game, including talking to NPCs. They were commonly limited to 3-word sentences, because the longer a sentence is, the more difficulty an Eliza-type (or CleverBot-type) AI has parsing it in any sane way. In some cases the player must pick a sentence paradigm, which is like a mad-lib, then fill in the blanks in the sentence, either by typing or from a drop-down menu. The downsides of this system include making the player remember keywords and emphasizing the fact that the NPCs aren't really human.

4. Some RPGs, instead of giving the player specific dialogue choices, give the player a standard array of actions, like: agree, disagree, ignore, intimidate/taunt, charm/joke, buy/sell/bribe, ask for more info, answer truthfully, lie, etc. The game then supplies the player's dialogue to fit the choice and the situation, and understands the player's intent without needing to interpret anything, allowing the NPCs to always react correctly to the player's intent. This system can combine well with a system keeping track of which NPCs (and the player) know a piece of information (including false info). I think the Erasmatron engine was an example of this type.


Well, what you describe is not really new. But it is nice to see another soul thinking about how to replace the simplistic dialog tree mechanics. So don't get me wrong if most of the following sounds like critique. :)

a) There are more dialog mechanics besides the two you mentioned. The group of "branching dialogs" consists of the dialog tree but also of the "hub and spoke" variant; there are even planner-like mechanics. See e.g. this and this article on Gamasutra.

b) An FSM is by definition a collection of states with conditional directed transitions between them. In an FSM just 1 state is active at a time (as long as you do not mean an HFSM). Implementing an FSM with those 6 degrees of freedom would become practically impossible very fast. What you describe seems to me to be something different, more like some kind of decision tree.

c) A "state" you've forgotten is the need / goal of the speaker. This is very important, because a person is willing to ignore mood and whatnot if doing so helps achieve a goal. It must even be possible to lie. All this may be important to drive the story, too.

d) Having the possibility to change the topic is nice not only for the player but also for the NPC, because it allows an NPC to steer the dialog in a direction that matches the story. But care must be taken that an NPC does not rapidly hop from one topic to another. When the player does so, the other party should probably get angry (i.e. change its mood). See e.g. Emily Short's articles about that problem. (BTW: Emily has many interesting things to say, although she's more associated with IF than with video games.)

e) Besides the mechanics, the UI also plays a big role. The player needs to recognize whether a sentence may endear or insult the other party. Remember that not all players understand the language in which they are playing well enough to recognize nuances. It is also nicer for the player not to have to read 4 times 5 lines of text just to decide which answer to give. There is an article about dialog UIs on Gamasutra (unfortunately I cannot find it ATM). The more influence a dialog system has, the more important flawless controls become.

f) Regarding the effect of stats on dialogs and the possibility to change stats due to dialog, you may want to read the PDF about SimDialog available here.

I think a state machine or decision tree is too limited an approach. If each variable other than topic has 4 states, you're already at 1,024 edges or nodes (4^5). Obviously, you wouldn't write out a variation for every single possibility, but even at 1% of variations, you're talking about 10 edges, and that'll get very complicated very quickly. I think you'd drive yourself crazy managing them.

My idea would be to set the preconditions and results of each snippet. Something like this:

Action: "Lament about your upbringing"

Text: "Let me tell you about...(blah blah blah blah blah)"

Mood: NOT (Angry OR Happy)

Relationship: Friend OR Lover OR coworker

Childhoods: Active

Importance: 80

Repeatable: Never

Result: Mood = Sad, Childhoods += 2, Orphanage += 3, Parents += 1, Profession += 1

Action is the choice you'd display to the player, and text is what would get displayed when it was selected. The next three variables set when this statement makes sense (what's your mood and relationship, and is childhoods an active topic). Importance is for filtering which options to give when many are possible. Repeatable is how soon you could return here, e.g., once per conversation (you'd probably need something a little fancier to keep from retelling the story with a different mood).

Results would change variables. Some set the appropriate tone: the states you discussed. The others set reasonable transitions: if the conversation is about being an orphan, the other conversant could talk about their childhood or parents, or talk about what you've since become. Basically any side road off the given text. I included values for these, on the idea that a topic decays over the course of a conversation. You can double back to talk about the orphanage more up until a point, and then we've moved on.
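A rough Python sketch of the precondition/result idea follows. The class names, the field names, and the decay rule (every active topic loses 1 each time a line is spoken, before the new boosts are added) are assumptions chosen for illustration, and "Repeatable: Never" is modeled simply as a set of already-used actions.

```python
from dataclasses import dataclass, field

@dataclass
class Snippet:
    action: str          # label shown to the player
    text: str            # line displayed when the action is selected
    blocked_moods: set   # moods in which the snippet makes no sense
    relationships: set   # relationships in which it is allowed
    topic: str           # topic that must currently be active
    importance: int      # used to filter when many snippets qualify
    set_mood: str        # result: mood after speaking
    topic_boosts: dict = field(default_factory=dict)  # result: topic -> boost

@dataclass
class Conversation:
    mood: str
    relationship: str
    topics: dict = field(default_factory=dict)  # active topic -> weight
    used: set = field(default_factory=set)      # actions already spoken

    def available(self, snippets, limit=3):
        """Snippets whose preconditions hold, most important first."""
        ok = [s for s in snippets
              if s.action not in self.used
              and self.mood not in s.blocked_moods
              and self.relationship in s.relationships
              and self.topics.get(s.topic, 0) > 0]
        return sorted(ok, key=lambda s: s.importance, reverse=True)[:limit]

    def say(self, snippet):
        """Decay active topics by 1, then apply the snippet's results."""
        self.used.add(snippet.action)
        self.topics = {t: w - 1 for t, w in self.topics.items() if w > 1}
        self.mood = snippet.set_mood
        for topic, boost in snippet.topic_boosts.items():
            self.topics[topic] = self.topics.get(topic, 0) + boost
        return snippet.text

lament = Snippet(
    action="Lament about your upbringing",
    text="Let me tell you about...",
    blocked_moods={"angry", "happy"},
    relationships={"friend", "lover", "coworker"},
    topic="childhoods",
    importance=80,
    set_mood="sad",
    topic_boosts={"childhoods": 2, "orphanage": 3, "parents": 1, "profession": 1},
)
```

With this layout a writer only adds `Snippet` entries; which ones surface at any moment falls out of the conversation's current variables.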

This way you can just write content without getting too caught up in mapping all the ways it could flow together. Maybe there's a murdered_parents variable. Once this variable is activated (by getting them to admit it, having someone else tell you, or putting together clues), you can go accuse them without worrying too much about how you got there or exactly when you make the accusation (maybe you wait for just the right moment). This feels more natural to me as a writer: choose a scene, write it out, move on (rather than writing variant after variant and constantly interrupting to map flow).

That said, this is a rather ad hoc method, and it would be very easy to leave dangling story lines or branches that just can't be reached or conversations that tend to peter out. I think you'd need some code doing smart analysis to help you find what content needs writing, and a good content generation tool to keep track of variables.

Anyways, just a brainstorm, hope it helps.

sunandshadow, I'm afraid I have to disagree with your examples. To me, both appear to be dialogue trees where the choices are merely obfuscated by the system. The text-only games simply let you add a couple of words here or there, or let you choose your dialogue option by typing it in, while the standard choices in the RPGs you mention simply map standard options to specific dialogue options. Neither would work all that well for what I'm thinking of, unfortunately.

The difference is that most, if not all, of the dialogue choices in those systems would be scripted specifically for that situation, while I want a system that dynamically allocates phrases and words from a bank in such a way that it seems consistent and logical without having to script it out specifically.

I don't want to design a slightly different phrase depending on what mood someone is in when they deliver a message. I want that phrase/word to be chosen by the system itself.

haegarr, I like some of the ideas you put forth here. A goal is a nice touch, as are topic-changing segues and topics the NPC would like to avoid. Topics could be highlighted in a way that indicates what your character THINKS would happen if they said it. This would not always be entirely correct, as the player/character will not always have all the information, just like in real life.

Polama, your approach is interesting, and believe me, I know the difficulty of writing all the possible content. But Notch didn't design each and every block, each vista and ocean when he made Minecraft, did he? What I propose is to write words and phrases and let the system choose how they fit together based on the rules we give it.

I propose there be 2 databases. 1 database for tags, and 1 for actual words and phrases.

Each word and phrase will have tags associated with it. So, say, KILL would be tagged with attack, anger, enemy and HELP would be tagged with assist, (positive emotion), friend/ally. These tags and the system that reads them would have to be refined, but you get the idea.

Then the phrase "%person% %question% %action% %character%." could be informed by Relationship: Boss->Subordinate, Disposition: Giving order, Mood: Calm, Topic: attack, enemy. This would make the phrase "I order you to kill Caren."
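As a minimal sketch of how the two databases might cooperate, the Python below fills the template by picking, for each slot, the word-bank entry that shares the most tags with the current context. The word bank contents, the `slot` field, and the "most shared tags wins" rule are all assumptions made up for this example.

```python
import re

# Hypothetical word bank: each entry fills one slot and carries tags.
WORD_BANK = [
    {"slot": "person", "text": "I", "tags": {"self"}},
    {"slot": "question", "text": "order you to", "tags": {"boss", "giving order"}},
    {"slot": "question", "text": "beg you to", "tags": {"subordinate", "pleading"}},
    {"slot": "action", "text": "kill", "tags": {"attack", "anger", "enemy"}},
    {"slot": "action", "text": "help", "tags": {"assist", "friend"}},
]

def pick(slot, wanted_tags):
    """Return the text for `slot` that shares the most tags with the context."""
    candidates = [e for e in WORD_BANK if e["slot"] == slot]
    return max(candidates, key=lambda e: len(e["tags"] & wanted_tags))["text"]

def fill(template, wanted_tags, names):
    """Replace each %slot% with a given name or with a word-bank pick."""
    def replace(match):
        slot = match.group(1)
        return names[slot] if slot in names else pick(slot, wanted_tags)
    return re.sub(r"%(\w+)%", replace, template)

# Tags derived (by hand here) from relationship, disposition, mood, and topic.
context_tags = {"self", "boss", "giving order", "calm", "attack", "enemy"}
print(fill("%person% %question% %action% %character%.", context_tags,
           {"character": "Caren"}))
# prints: I order you to kill Caren.
```

Swapping the context tags for something like {"self", "subordinate", "pleading", "assist", "friend"} would turn the same template into "I beg you to help Caren."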

But who would make all the content? Who would tag all the words? Well, people, of course. People are willing to help other people most of the time, especially if helping is a game. So over time a huge database of words and tags could be created and plugged into the system, which would be optimized to do the sort of word choosing quickly.

Thoughts?

There's a company, Narrative Science, working on this sort of thing. You can feed them a log of a baseball game and they generate an article about it. It's a hard problem, though. It's hard not to get lots of very similar, simple sentences. "Caren stole my crown. I want my crown back. I order you to kill Caren. I order you to retrieve my crown." Fine, acceptable prose.

But as a human, I'd write something like "My investigators have been scouring the town, and an adventurer named Caren left the night of the awful theft. My crown, god, the symbol of my power! There's enough unrest as it is. I've got everyone I can fanning out; I want you to head West towards Adventureshire and see if you can learn anything of this 'Caren'. Get me my crown back. I want this kept as quiet as possible so don't bother bringing her back, just eliminate her. And I know those rebels have been spreading dirty words about my generosity, but I promise, if you succeed you'll have more treasure than you'll know what to do with."

I think generative content is an interesting problem to attack, but there are so many ways to construct a sentence, and so much below the surface weaving sentences together. And so much knowledge! What can be held? Lit on fire? Comforted? Consider "an adventurer named Caren". We could do "a beautiful woman named Caren", "an orphan named Caren", "a level 3 wizard named Caren", "some jerk named Caren", "Henry's sister, Caren,". Each would be appropriate for some piece of text, but not all would fit well or make sense in the boss's command.

So at the end of the day, I wouldn't expect anything better than poorly written text. That would be pretty amazing, though. It's a very cool problem to make progress on.

One thing to try is using deep learning. I have no idea how or why, but that seems to generate human-like results when it works =)

Another interesting point, Polama! What if we systematized paragraphs as well as full conversations on top of individual sentences, with lots of back-checking? Keep doing this, iterate on the problem fast enough, and you can get something that works acceptably with enough testing, especially when the testers have the ability to directly affect the script using the controls.

But what tests would we do that could be specific enough for a computer to do quickly, but vague enough that it works on a language problem?

Using some sort of grammar for paragraph layout should work. There's a continuum. At one extreme you're writing extremely generic patterns, working very hard on them, on your representation of the world, and on artistic quality, and hoping they'll work for anything. At the other extreme, you're writing out the content directly. In the middle you have madlib-style patterns, small modular pieces that go together, etc.

A very generic paragraph might be: What? Who? When? Where? How? "You must retrieve my crown. Caren has it. Do this now! She was last seen in Adventureshire. Go there and kill Caren."

You can also insert "tell me more" sentences anywhere. "You must retrieve my crown. It is gold with a ruby inset. Ruby is a red gem exported by Dwarvia. Dwarvia is an unfortunately named elven settlement."
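The generic pattern and the optional "tell me more" sentences can be sketched in a few lines of Python. The dictionaries and the `build_paragraph` function are hypothetical names for illustration; the sentences are the ones from the example above.

```python
# Hypothetical fact sheet for one quest; keys mirror the generic pattern.
QUEST = {
    "what": "You must retrieve my crown.",
    "who": "Caren has it.",
    "when": "Do this now!",
    "where": "She was last seen in Adventureshire.",
    "how": "Go there and kill Caren.",
}

# Optional elaborations, keyed by the slot they follow.
DETAILS = {
    "what": ["It is gold with a ruby inset."],
}

PATTERN = ["what", "who", "when", "where", "how"]

def build_paragraph(facts, details=None, verbose=False):
    """Emit one sentence per slot in PATTERN, skipping slots with no fact."""
    sentences = []
    for slot in PATTERN:
        if slot in facts:
            sentences.append(facts[slot])
            if verbose and details:
                # "Tell me more" sentences go right after the slot they expand.
                sentences.extend(details.get(slot, []))
    return " ".join(sentences)
```

A more specific grammar, such as the love proclamation, would just be a different `PATTERN` list with its own slots.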

But for higher quality content, you'd want more specificity. A love proclamation might be [statement about the current situation. Statement that they like a positive quality about you in the current situation. Positive shared memory. Positive shared memory. "I've been meaning to tell you, but reason I haven't." Proclamation of love.] Even more specifically, you might have different grammars to handle telling a person being kidnapped you love them, telling a person after a big battle you love them, telling a person on a romantic date you love them...

For content, you can have a generic "Oh no, (I've lost my [object] | my [object] was [destroyed])!": "Oh no, I've lost my sword! Oh no, my cat was killed! Oh no, my hair was burned off!" Or you can have a separate sentence or two for a lost pet versus a lost weapon. And for more specificity, you can have content for a murdered beloved pet of a gruff warrior, or a dead wizard's familiar that was brand new to the job. But the nice thing is that you could fill in the levels of detail over time: if there's nothing else written, it's just "oh no, I've lost my eyeball". If there's a long rant written up for eyeball-eaten-by-spider, you can pull that.
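This "most specific content wins, generic pattern as a fallback" lookup might be sketched as follows in Python. The `CONTENT` table, its (object, cause) keys, and the sample lines are assumptions for illustration only.

```python
# Hypothetical content table keyed by (object, cause); None is a wildcard.
CONTENT = {
    ("eyeball", "eaten by spider"): "That spider ate my eye! My EYE!",
    ("cat", None): "Oh no, my poor cat is gone!",
    (None, None): "Oh no, I've lost my {object}!",
}

def lament(obj, cause=None):
    """Try the most specific key first, then fall back to generic patterns."""
    for key in [(obj, cause), (obj, None), (None, None)]:
        if key in CONTENT:
            return CONTENT[key].format(object=obj)
    raise KeyError("no generic fallback defined")
```

Writers can then add more specific entries over time without touching the lookup: `lament("eyeball")` uses the generic pattern until someone writes the spider rant for `lament("eyeball", "eaten by spider")`.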

For testing, you could have something surreptitious for marking errors within the game: just right-click on a sentence and choose "ungrammatical, awkward, inappropriate, etc." Have it send that info back to a server to collect. When examining the errors, I'd suggest having the engine generate a bunch of results with small tweaks, so you can say which other things display the issue. Was it a bad pattern and all this is broken? Is it specifically that a dragon liver is not a good wedding present? Would it be a good birthday present though? No, not a present at all?

Anyways, fun stuff to think about. I do think it is plausible, but I do think you'll need lots and lots of fairly specialized logic and content.

Cool. Reminds me somewhat of the dialogue system in Glass Rose.

When I first read this topic, I immediately recalled a video that I found so many years ago (but forgot the name of it). Today, I happened to come across it again. Check this out:

[media]https:[/media]

Edit:

I like the visible word bank approach. It alleviates much of the frustration that is often involved in "guess the verb" systems.

