Designing Intelligent Artificial Intelligence

slayemin

 

Below is my preliminary draft design for the AI system within Spellbound. I'm slowly migrating away from scripted expert systems towards a more dynamic and fluid AI system based on machine learning and neural networks. I may be crazy to attempt this, but I find this topic fascinating. I ended up having a mild existential crisis as a result of this. Let me know what you think or if I'm missing something.

Artificial Intelligence:

Objectives:
Spellbound is going to be a large open world with many different types of characters, each with different motives and behaviors. We want this open world to feel alive, as if the characters within the world are inhabitants. If we went with pre-scripted behavioral patterns, the characters would be unable to learn and adapt to changes in their environment. It would also be very labor intensive to write specific AI routines for each character. Ideally, we just give every character a self-adapting brain and let them loose to figure out the rest for themselves. 

Core Premise: (very dense, take a minute to soak this in)
Intelligence is not a fixed intrinsic property of creatures. Intelligence is an emergent property which results directly from the neural topology of a biological brain. True sentience can be created if the neural topology of an intelligent being is replicated with data structures and the correct intelligence model. If intelligence is an emergent property, and emergent properties are simple rule sets working together, then creating intelligence is a matter of discovering the simple rule sets.

Design:
Each character has its own individual Artificial Neural Network (ANN). This is a weighted graph which uses reinforcement learning. Throughout the character's lifespan, the graph will become more weighted towards rewarding actions and away from displeasurable ones. Any time an action causes a displeasure to go away or brings a pleasure, that neural pathway will be reinforced. If a neural pathway has not been used in a long time, we reduce its weight. Over time, the creature will learn.

A SIMPLE ANN is just a single cluster of connected neurons. Each neuron is a “node” which is connected to nearby neurons. Each neuron receives inputs and generates outputs. The neural outputs always fire and activate a connected neuron. When a neuron receives enough inputs, it fires and activates downstream neurons in turn. So, a SIMPLE ANN receives inputs and generates outputs which are a reaction to those inputs. At the end of each neural cycle, we have to give response feedback to the ANN. If the neural response was positive, we strengthen the neural pathway by increasing the neural connection weights. If the response was negative, we decrease the weights of the pathway. With enough trial runs, we will find the neural pathway for the given inputs which creates the most positive outcome.
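
To make the feedback step concrete, here is a minimal sketch of how pathway weights could be strengthened, weakened, and decayed. Everything in it is an assumption for illustration: the `SimpleANN` class, the default weight of 1.0, the learning rate, and the decay amount are made up, and this is a plain weighted graph rather than a real neural network library.

```python
from collections import defaultdict


class SimpleANN:
    """Toy weighted graph: each connection is a (source, target) pair with a weight."""

    def __init__(self, learning_rate=0.1, decay=0.01):
        # Hypothetical defaults: every new connection starts at weight 1.0.
        self.weights = defaultdict(lambda: 1.0)
        self.learning_rate = learning_rate
        self.decay = decay

    def reinforce(self, pathway, feedback):
        """Adjust every connection along a pathway.

        pathway  -- list of neuron names in the order they fired
        feedback -- positive for pleasure, negative for displeasure
        """
        for edge in zip(pathway, pathway[1:]):
            updated = self.weights[edge] + self.learning_rate * feedback
            self.weights[edge] = max(0.0, updated)  # weights never go negative

    def decay_unused(self, used_edges):
        """Reduce the weight of every known connection that did not fire this cycle."""
        for edge in list(self.weights):
            if edge not in used_edges:
                self.weights[edge] = max(0.0, self.weights[edge] - self.decay)


ann = SimpleANN()
ann.reinforce(["see_food", "walk", "eat"], feedback=+1.0)   # eating felt good
ann.reinforce(["see_food", "sleep"], feedback=-1.0)         # sleeping hungry felt bad
ann.decay_unused(used_edges={("see_food", "walk"), ("walk", "eat")})
```

After these three calls the "walk then eat" connections sit above the default weight and the "sleep" connection sits below it, which is the whole idea: repeated feedback slowly tilts the graph towards rewarding pathways.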

The SIMPLE ANN can be considered a single cluster. It can be abstracted into a single node for the purposes of creating a higher layer of connected node networks. When we have multiple source inputs feeding into our neural network cluster and each node is running its optimal neural pathway for the given input, we get complex unscripted behavior. A brain is just a very large collection of layered neural nodes connected to each other. We’ll call this our “Artificial Brain” (AB).

Motivation, motivators (rule sets):
-All creatures have a “desired state” they want to achieve and maintain. Think about food. When you have eaten and are full, your state is at an optimally desired state. When time passes, you become increasingly hungry. Being just a teensy bit hungry may not be enough to compel you to change your current behavior, but as time goes on and your hunger increases, your motivation to eat increases until it supersedes the motives for all other actions. We can create a few very simple rules to create complex, emergent behavior.
    Rule 1: Every creature has a desired state they are trying to achieve and maintain. Some desired states may be unachievable (i.e., infinite wealth).
    Rule 2: States are changed by performing actions. Actions may change one or more states at once (one to many relationship).
    Rule 3: “Motive” is created by a delta between current state (CS) and desired state (DS). The greater the delta between CS and DS, the more powerful the motive is. (Is this a linear graph or an exponential graph?)
    Rule 4: “Relief” is the sum of all reductions in the delta between CS and DS which an action provides.
    Rule 5: A creature can have multiple competing motives. The creature will choose the action which provides the greatest amount of relief.
    Rule 6: Some actions are a means to an end and can be chained together (action chains). If you’re hungry and the food is 50 feet away from you, you can’t just start eating. You first must move to the food to get within interaction radius, then eat it.
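
Rules 3 through 5 can be sketched in a few lines. This is only an illustration under stated assumptions: the motive curve is linear (the open question in Rule 3 is not answered here), states are plain numbers, and the state names, action names, and effect values are all invented for the example.

```python
def motive(current, desired):
    """Rule 3: motive grows with the gap between current and desired state (linear here)."""
    return abs(desired - current)


def relief(state, desired, effects):
    """Rule 4: total reduction in motive across every state an action changes."""
    total = 0.0
    for name, change in effects.items():
        before = motive(state[name], desired[name])
        after = motive(state[name] + change, desired[name])
        total += before - after  # negative if the action makes this state worse
    return total


def choose_action(state, desired, actions):
    """Rule 5: pick the action which provides the greatest relief."""
    return max(actions, key=lambda name: relief(state, desired, actions[name]))


# Hypothetical creature: very hungry and fairly poor.
state = {"hunger": 80, "wealth": 10}
desired = {"hunger": 0, "wealth": 100}
actions = {
    "eat": {"hunger": -60},
    "work": {"wealth": +20, "hunger": +10},  # one action, many states (Rule 2)
    "rest": {},
}
best = choose_action(state, desired, actions)
print(best)  # eat
```

Swapping the linear `motive` for an exponential curve would make extreme needs dominate much harder, which is one way to experiment with the question raised in Rule 3.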

Q: How do we create an action chain?
Q: How do we know that the action chain will result in relief?
A: We generally know what desired result we want, so we work backwards. What action causes the desired result (DR)? Action G does (learned from experience). How do we perform Action G? We have to perform Action D, which causes Action G. How do we cause Action D? We perform Action A, which causes Action D. Therefore, G<-D<-A, so we should do A->D->G->DR. Back propagation may be the contemporary approach to changing graph weights, but it's backwards.
Q: How does long term planning work?
Q: What is a conceptual idea? How can it be represented?
A: A conceptual idea is a set of nodes which is abstracted to become a single node?
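
The "work backwards" answer above can be sketched as a simple backward walk over learned cause-and-effect links. The function name, the `causes` lookup table, and the `available` set are hypothetical; a real system would have to learn the table from experience and cope with several possible causes per result.

```python
def build_action_chain(goal, causes, available):
    """Walk backwards from the goal until we reach an action we can perform right now.

    goal      -- the desired result (DR)
    causes    -- maps a result to the action known, from experience, to cause it
    available -- actions the creature can perform immediately
    Returns the actions in the order they should be performed, or None.
    """
    chain = []
    target = goal
    seen = set()
    while target not in available:
        if target in seen or target not in causes:
            return None  # loop detected, or no known way to reach the goal
        seen.add(target)
        target = causes[target]
        chain.append(target)
    chain.reverse()  # we discovered G, D, A; we must perform A, D, G
    return chain


causes = {"DR": "G", "G": "D", "D": "A"}
chain = build_action_chain("DR", causes, available={"A"})
print(chain)  # ['A', 'D', 'G']
```

This only tells us that a chain exists. Whether the chain is expected to bring relief is a separate question, and would still need the relief estimate from the rules above.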


Motivators: (Why we do the things we do)
    Hunger
    Body Temperature
    Wealth
    Knowledge
    Power
    Social Validation
    Sex
    Love/Compassion
    Anger/Hatred
    Pain Relief
    Fear
    Virtues, Vices & Ethics
Notice that all of these motivators are actually psychological motivators. That means they happen in the head of the agent rather than being physical motivators. You can be physically hungry, but psychologically, you can ignore the pains of hunger. The psychological thresholds would be different per agent. Therefore, all of these motivators belong in the “brain” of the character rather than all being attributes of an agent's physical body. Hunger and body temperature would be physical attributes, but they would also be “psychological tolerances”.

Psychological Tolerances:

{motivator} => 0 [------------|-----------o----|----] 100
                 A            B           C    D    E

A - This is the lowest possible bound for the motivator.
B - This is the lower threshold point for the motivator. If the current state falls below this value, the desired state begins to affect actions.
C - This is the current state of the motivator.
D - This is the upper threshold point for the motivator. If the current state exceeds this value, the desired state begins to affect actions.
E - This is the highest possible bound for the motivator.

The A & E bounds values are fixed and universal.
The B and D threshold values vary by creature. Where you place them can make huge differences in behavior.
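
As a rough sketch, a tolerance could be stored as three numbers per motivator, with the fixed A and E bounds shared by everyone. The `Tolerance` class, the `pressure` method, and the example threshold values are assumptions made up for illustration.

```python
from dataclasses import dataclass

LOWER_BOUND = 0.0    # A: fixed and universal
UPPER_BOUND = 100.0  # E: fixed and universal


@dataclass
class Tolerance:
    lower: float    # B: lower threshold, varies by creature
    upper: float    # D: upper threshold, varies by creature
    current: float  # C: current state

    def pressure(self):
        """How far the current state sits outside the comfortable B..D band (0 inside it)."""
        value = min(max(self.current, LOWER_BOUND), UPPER_BOUND)
        if value < self.lower:
            return self.lower - value
        if value > self.upper:
            return value - self.upper
        return 0.0


# Same hunger level, very different tolerances (values are invented).
zombie_hunger = Tolerance(lower=0.0, upper=10.0, current=70.0)
monk_hunger = Tolerance(lower=0.0, upper=90.0, current=70.0)
print(zombie_hunger.pressure())  # 60.0
print(monk_hunger.pressure())    # 0.0
```

Moving only the B and D values is enough to turn the same hunger reading into an overriding urge for one creature and a non-issue for another.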

Psychological Profiles:
We can assign a class of creatures a list of psychological tolerances and assign their current state to some preset values. The behavioral decisions and subsequent actions will be driven by the psychological profile based upon the actions which create the sum of most psychological relief. The psychological profile will be the inputs into an artificial neural network, and the outputs will be the range of actions which can be performed by the agent. Ideally, the psychological profile state will drive the ANN, which drives actions, which changes the state of the psychological profile, which creates a feedback loop of reinforcement learning.

 

Final Result:
We do not program scripted behaviors; we assign psychological profiles and lists of actions. Characters will have psychological states which drive their behavioral patterns. Simply by tweaking the psychological desires of a creature, we can create emergent behavior resembling intelligence. A zombie would always be hungry, and feasting on flesh would provide only temporary relief. A goblin would have a strong compulsion for wealth, so it would be very motivated to perform actions which ultimately result in gold. Rather than spending lots of time writing expert-system-style AI, we create a machine-learning type of AI.

Challenges:
I have never created a working artificial neural network type of AI. 
 

Experimental research and development:

The following notes are crazy talk which may or may not be feasible. They may need more investigation to measure their merit as viable approaches to AI.

Learning by Observation:
Our intelligent character doesn’t necessarily have to perform an action themselves to learn about its consequences (reward vs regret). If they watch another character perform an action and receive a reward, the intelligent character creates a connection between an action and consequence. 
    
Exploration Learning:
A very important component of getting a simple ANN to work efficiently is getting the neurons to find and establish new connections with other neurons. If we have a neural connection topology which always results in a negative response, we’ll want to generate a new connection at random to a nearby neuron.

Exploration Scheduling:
When all other paths are terrible, the new path becomes better and we “try it out” because there’s nothing better. If the new pathway happens to result in a positive outcome, suddenly it gets much stronger. This is how our simple ANN discovers new unscripted behaviors.

The danger is that we will have a sub-optimal behavior pattern which generates some results, but they’re not the best results. We’d use the same neural pathway over and over again because it is a well travelled path.

Exploration Rewards:
In order to encourage exploring different untravelled paths, we gradually increase the “novelty” reward value for taking that pathway. If traveling this pathway results in a large reward, the pathway is highly rewarded and may become the most travelled path.
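
One way to picture the novelty reward is as a bonus which grows every cycle a pathway goes unused and resets when the pathway is taken. The sketch below assumes a made-up `PathwayStats` class, a linear bonus, and invented pathway values; it does not update a pathway's learned value after trying it, which a fuller version would need to do.

```python
class PathwayStats:
    """Tracks a learned value and an idle counter for each candidate pathway."""

    def __init__(self, novelty_rate=0.08):
        self.value = {}  # learned reward estimate per pathway
        self.idle = {}   # cycles since the pathway was last taken
        self.novelty_rate = novelty_rate

    def add(self, path, value):
        self.value[path] = value
        self.idle[path] = 0

    def score(self, path):
        """Learned value plus a novelty bonus which grows while the path is unused."""
        return self.value[path] + self.novelty_rate * self.idle[path]

    def choose(self):
        return max(self.value, key=self.score)

    def tick(self, taken):
        """Reset the idle counter of the path we took; age all the others."""
        for path in self.idle:
            self.idle[path] = 0 if path == taken else self.idle[path] + 1


stats = PathwayStats()
stats.add("well_travelled", 1.0)
stats.add("untried", 0.5)

history = []
for _ in range(10):
    path = stats.choose()
    history.append(path)
    stats.tick(path)
```

In this run the well-travelled path wins for a while, then the accumulated novelty bonus tips the choice to the untried path once, after which its bonus resets. If trying it had produced a large reward, we would raise its stored value and it could become the new default.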


Dynamic Deep Learning:
On occasion, we’ll also want to create new neurons at random and connect them to at least one other nearby downstream neuron. If a neuron is not connected to any other neurons, it becomes an “island” and must die. When we follow a neural pathway, we are looking at two costs: The connection weight and the path weight. We always choose the shortest path with the least weight. Rarely used pathways will have their weight decrease over a long period of time. If a path weight reaches zero, we break the connection and our brain “forgets” the neural connection.
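
The forgetting half of this idea is easy to sketch: decay unused connections, delete the ones which hit zero, and see which neurons are left with no connections at all. The function name, the dictionary layout, and the decay amount are all assumptions; growing new random neurons is left out.

```python
def decay_and_prune(weights, used, decay=0.1):
    """Weaken unused connections, forget dead ones, and report surviving neurons.

    weights -- dict mapping (source, target) to a connection weight (modified in place)
    used    -- set of connections which fired this cycle
    Returns the set of neurons still attached to at least one connection.
    """
    for edge in list(weights):
        if edge not in used:
            weights[edge] -= decay
        if weights[edge] <= 0:
            del weights[edge]  # the brain "forgets" this connection
    # Any neuron not in this set has become an island and can be removed.
    return {neuron for edge in weights for neuron in edge}


weights = {("a", "b"): 1.0, ("b", "c"): 0.1, ("c", "d"): 0.05}
alive = decay_and_prune(weights, used={("a", "b")})
print(sorted(alive))  # ['a', 'b']
```

Here neurons c and d lose their last connections and drop out of the returned set, so the caller can delete them as islands.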
        
Evolutionary & Inherited Learning:
It takes a lot of effort for a neural pathway to become developed. We will want to speed up the development. If a child is born to two parents, those parents will rapidly develop the neural pathways of the child by sharing their own pathways. This is one way to "teach". Thus, children will think very much like their parents do. Characters will also share their knowledge with one another. In order for knowledge to spread, it must be interesting enough to be spread. So, a character will generally share the most interesting knowledge they have.

Network Training & Evolutionary Inheritance:
An untrained ANN results in an uninteresting character. So, we have to have at least a trained base preset for a brain. This is consistent with biological brains because our brains have been pre-configured through evolutionary processes and come pre-wired with certain regions of the brain being universally responsible for processing certain input types. The training method will be rudimentary at first, to get something at least passable, and it can be done as a part of the development process.
When we release the game to the public, the creatures are still going to be training. The creatures which had the most “success” will become a part of the next generation. These brain configurations can be stored on a central database somewhere in the cloud. When a player begins a new game, we download the most recent generation of brain configurations. Each newly instanced character may have a chance to have a random mutation. When the game completes, if there were any particular brains which were more successful than the current strain, we select it for “breeding” with other successful strains so that the next generation is an amalgamation of the most successful previous generations. We’ll probably begin to see some divergence and brain species over time?
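
The select, breed, and mutate loop described above might look roughly like this. It is a sketch under heavy assumptions: a "brain" is reduced to a flat list of weights, "success" is whatever the `fitness` function says it is, the top half of the population become parents, and the cloud database is ignored entirely.

```python
import random


def next_generation(population, fitness, size, mutation_rate=0.05, rng=None):
    """Breed a new generation from the most successful brains.

    population -- list of brains, each a list of float weights (at least two brains)
    fitness    -- function mapping a brain to a success score
    size       -- number of children to produce
    """
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: max(2, len(ranked) // 2)]  # keep the top half as breeders
    children = []
    while len(children) < size:
        mother, father = rng.sample(parents, 2)
        # Crossover: each weight is copied from one parent at random.
        child = [rng.choice(pair) for pair in zip(mother, father)]
        # Mutation: occasionally nudge a weight by a small random amount.
        child = [
            w + rng.gauss(0, 0.1) if rng.random() < mutation_rate else w
            for w in child
        ]
        children.append(child)
    return children


population = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
children = next_generation(population, fitness=sum, size=6, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when trying to work out why one strain of brain took over.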

Predisposition towards Behavior Patterns via bias:        
Characters will also have slight predispositions which are assigned at birth. 50% of their predisposition is innate to their creature class. 25% is genetically passed down by parents. 25% is randomly chosen. A predisposition causes some pleasures and displeasures to be more or less intense. This will skew the weightings of a developing ANN a bit more heavily to favor particular actions. This is what will create a variety in interests between characters, and will ultimately lead to a variety in personalities. We can create very different behavior patterns in our AB’s by tweaking the amount of pleasure and displeasure various outputs generate for our creature. The brain of a goblin could derive much more pleasure from getting gold, so it will have strong neural pathways which result in getting gold.
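
The 50% / 25% / 25% split could be computed as a weighted blend. In this sketch a predisposition is a multiplier on how intense a pleasure feels (1.0 being neutral), the two parents are simply averaged, and the random part is drawn from 0.0 to 2.0. All three of those choices are assumptions, as are the function and parameter names.

```python
import random


def predisposition(class_bias, mother_bias, father_bias, rng=None):
    """Blend innate, inherited, and random bias for each motivator.

    Each argument maps a motivator name to a pleasure multiplier (1.0 is neutral).
    """
    rng = rng or random.Random()
    result = {}
    for motivator, innate in class_bias.items():
        inherited = (mother_bias[motivator] + father_bias[motivator]) / 2
        noise = rng.uniform(0.0, 2.0)  # hypothetical range for the random share
        result[motivator] = 0.5 * innate + 0.25 * inherited + 0.25 * noise
    return result


goblin_class = {"wealth": 2.0}  # goblins as a class love gold
mother = {"wealth": 1.0}
father = {"wealth": 2.0}
child = predisposition(goblin_class, mother, father, rng=random.Random(42))
```

The resulting multiplier would then scale the feedback a creature receives, so a gold-loving goblin reinforces its gold-seeking pathways faster than its siblings do.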

AI will be able to interact with interactable objects. An interactable object has a list of ways it can be interacted with. Interactable objects can be used to interact with other interactable objects. Characters are considered to be interactable objects. The AI has a sense of ownership for various objects. When it loses an object, it is a displeasurable feeling. When they gain an object, it is a pleasurable feeling. Stealing from an AI will cause it to be unhappy and it will learn about theft and begin trying to avoid it. Giving a gift to an AI makes it very happy. Trading one object for another will transfer ownership of objects. There is no "intrinsic value" to an object. The value of an object is based on how much the AI wants it compared to how much it wants the other object in question.
        
Learning through Socialization:
AIs will socialize with each other. This is the primary mechanism for knowledge transfer. They will generally tell each other about recent events or interests, choosing to talk about the most interesting events first. If an AI doesn't find a conversation very interesting, it will stop the conversation and leave (terminating condition). If a threat is nearby, the AI will be very interested in it and will share it with nearby AIs. If a player has hurt or killed one of the townsfolk, all of the nearby townsfolk will be very upset and may attack the player on sight. If enough players attack the townsfolk, the townsfolk AI will start to associate all players with negative feelings and may attack a player on sight even if they didn't do anything to aggravate the townsfolk AI.
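
A conversation like this can be sketched as a loop which shares events from most to least interesting and stops as soon as the listener is bored. The function name, the 0 to 1 interest scale, the boredom threshold, and the sample events are all invented for the example.

```python
def converse(speaker_memories, listener_interest, boredom_threshold=0.3):
    """Share events in order of interest until the listener gets bored.

    speaker_memories  -- list of (event, interest) pairs known to the speaker
    listener_interest -- function mapping an event to the listener's interest (0..1)
    Returns the events the listener actually heard.
    """
    learned = []
    ranked = sorted(speaker_memories, key=lambda memory: memory[1], reverse=True)
    for event, _ in ranked:
        if listener_interest(event) < boredom_threshold:
            break  # terminating condition: the listener walks away
        learned.append(event)
    return learned


memories = [("weather", 0.2), ("wolf near town", 0.9), ("gold found", 0.6)]
interest = {"wolf near town": 0.95, "gold found": 0.5, "weather": 0.1}.get
learned = converse(memories, interest)
print(learned)  # ['wolf near town', 'gold found']
```

Because threats score high for both speaker and listener, news of the wolf spreads first, which matches the behavior described above.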
 



3 Comments



What is it that you're hoping that an AI like this adds to Spellbound?

The mindless undead that attack the player... well they're mindless, so they don't really need an AI, right? For an AI like this to be applied to an end boss or other main character that may or may not be adversarial, I would think that you would run the risk of not providing a consistent experience for players when the boss is encountered.  Which then would leave assorted NPCs like villagers to be given the AI, the main purpose of which, from a game play perspective, I would think would be to generate side quests which I would expect there to be simpler ways to go about generating.

 


Interesting.

You've managed to reinvent several concepts, including drive reduction theory, mirror neurons (sort of), neural Darwinism, and NEAT (NeuroEvolution of Augmenting Topologies). I'm quite impressed.

With that having been said, I'm afraid kseh is correct. Adding neural net-based AI of this kind of complexity wouldn't contribute much to your game. Additionally, depending on how enemies are processed, the size of each neural net, and the specific implementation, you may find that using this kind of AI will simply bog down the user's computer compared to the relatively computationally cheap expert system.

AI stuff is super fun to geek out about, as you've seen, but much of it is also very impractical. In your previous blog post, you had concerns over whether anyone would care. I think the answer is that AI enthusiasts might care, but your players won't. There's an article written by some Halo developers on Halo's AI. They originally had an elaborate AI scheme utilizing fuzzy logic, but playtesting found that players consistently failed to notice and/or take advantage of the NPC's behaviors.

I think you're on the right track with abstracting actions into "abilities" to simplify AI development. For more intelligent enemies, using some system to establish preferences seems like a good idea, but I would abstain from neural nets. Perhaps you should use an expert system template and just define preference values for each enemy type?

Edited by Size_A5


Yeah, I'm a novice at AI and have not spent a lot of time studying it formally. That's probably why I reinvent AI concepts familiar to AI developers. Currently, my developer attitude is, "What does it take to ship right now?" mixed with "How do I avoid painting myself into a corner?"

I'm currently modifying my expert system to use abilities, but structuring my abilities system to be something that can be treated as nodes in a graph network if I ever want to transition to an ANN. The underlying reasoning for this is that eventually my list of characters is going to be pretty large and complicated, and as I add in more characters, the scope and complexity increases. I'll need to have a strategy for reducing the developer work load and being able to adapt behaviors to game design changes without completely refactoring my expert systems AI code. What I wrote above is a rough outline for a direction I can eventually go in.

I'm thinking that this may be a bit of a waste of time right now, but I've convinced myself that there is something truly magical about having the illusion of an intelligent creature interacting with you in virtual reality. A part of that magic comes from being surprised by the actions and behaviors of a creature. The less scripted and more novel the behavior seems, the more amazing it is. If eventually we have lots of AI systems doing complex behavior to "live" in the virtual world and the players' actions are a big contributing factor in the behavior of the world's characters, then the replay value and player engagement increase by several orders of magnitude. Players can have really different gameplay experiences when they do a "good" playthrough vs. an "evil" playthrough, and everything in between. I think the variety in consequences within the game makes the moral choices really interesting and becomes a way for players to explore their own nature/hearts within a consequence-free world, and then they take those learned lessons back to real life. A flexible/adaptive AI system would be a necessary component to exploring the long-term consequences of moral decisions within the framework of a game. Hopefully, the end result would be that virtuous actions are always better.

