Constructing a Learning AI

Dan Violet Sagmiller (He/Him), Principal Engineer
2x MS MVP in Unity
2x game dev books
2x Unite speaker
2x Lynda courses
2x college instructor
2 decades experience
As game developers, we quickly learn that Artificial Intelligence (AI) doesn’t need to be so tough.  To reference Space Ghost: “Moltar, I have a giant brain that is able to reduce any complex machine into a simple yes or no answer.”  Despite the humor of the conversation, he pegged the nature of AI: turn anything into a simple yes or no answer.  A simple AI might be, “If the player is to the left, move the enemy ship to the left.”  That works for interception or chase, but it is not a learning AI.  A learning AI must be able to decide to go right for reasons it was not specifically programmed for in the first place.

A learning AI is important for giving non-player characters (NPCs) a wider range of responses to the player, without requiring developers or AI designers to program a response for every possible combination of options in the game.  (A game with 10 spells, 10 items and 10 effects has 10 × 10 × 10 = 1,000 combinations to program for.)

I’ll cover two different types of learning AI: reactionary and meditative.

Reactionary Learning

Reactionary learning is where the game learns immediately.  It still depends on pre-programmed choices, but those choices can depend on historical data.  For example, in Mortal Kombat, say the player keeps winning with one particular move as a first strike when approaching.  After this happens a few times, the AI should change its response to a different approach, particularly if one is a known counter to the player’s move of choice.

Reactionary learning requires that the game’s AI not only pays attention to the present state of things but also keeps a history of certain past conditions.  Normally the AI might choose a basic punch or kick as the player approaches, but it could be extended to look at the player’s last 3 action choices after they started moving towards the NPC.  If there is a majority, e.g. 2 or 3 of the same move type, presume that move is a favorite and select a choice that counters it.  For example, if 2 of the last 3 attacks were low kicks, the AI may choose to jump instead.
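The majority check above can be sketched in a few lines of Python.  This is only an illustration: the move names, the `COUNTERS` table and the `ApproachHistory` class are made up for the example, and it assumes the AI waits until it has seen a full window of 3 moves before guessing at a favorite.

```python
from collections import Counter, deque

# Hypothetical counter table: player move -> NPC response that beats it.
COUNTERS = {"low_kick": "jump", "high_punch": "duck", "fireball": "block"}
DEFAULT_RESPONSE = "basic_punch"


class ApproachHistory:
    """Remembers the player's last few opening moves after an approach."""

    def __init__(self, window=3):
        self.moves = deque(maxlen=window)  # old moves fall off automatically

    def record(self, move):
        self.moves.append(move)

    def favorite(self):
        """Return the move used in a strict majority of a full window, else None."""
        if len(self.moves) < self.moves.maxlen:
            return None  # not enough history yet
        move, count = Counter(self.moves).most_common(1)[0]
        return move if count * 2 > len(self.moves) else None


def choose_response(history):
    """Counter the player's favorite move, or fall back to the default attack."""
    return COUNTERS.get(history.favorite(), DEFAULT_RESPONSE)
```

Recording "low_kick", "high_punch", "low_kick" and then calling `choose_response` returns "jump".  If the next three moves are all different, the AI goes back to its default.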

Meditative Learning AI

Unlike reactionary learning, meditative learning isn’t processed live with the game; its results are not used immediately.  In a client-only game (no server), the game might still analyze larger amounts of data, but typically at level completion, while saving, or continuously in a separate low-priority thread.  The point of this learning is that it can look at a lot more data and engineer new tactics to use in gameplay.  This is not limited to the choice of actions; it also covers which environment variables to pay attention to.

To start out, we’ll break the game up into abilities and metrics.  Abilities are what an NPC can do.  Metrics are the data points it listens to.  (Please note that metrics can also include reactionary data, but for simplicity I’ll leave that option out.)

Abilities might include
Attack with ranged weapon
Attack with melee weapon
Change weapon
Heal myself
Run away

You can see that the abilities don’t specify which weapon; they only distinguish ranged from melee.  These are the choices an NPC might have.  Each can have simple requirements that determine whether an NPC can use it at all.  For instance, an NPC who only has a dagger cannot have “Attack with ranged weapon”, nor could it have “Change weapon” as an option.

Metrics might include
Last change in opponent health from this ability (the impact the ability had on the opponent’s health the last time it was used)
Current opponent health
Distance of opponent
Current health
Average health change per action

The metrics are simply points of data.  Give each ability a handful of metrics to pay attention to, limited to the ones you think are obviously related to the ability most of the time.  The more processing a metric requires, the fewer of them an ability can afford.  For instance, one metric might be the “total possible strength of any spell the opponent could cast”.  That metric would have to look at the opponent’s mana, go through every spell they could cast for that amount of mana, and then work out the damage each might do, including checking the NPC’s armor types, effects, etc.  “Opponent HP”, by contrast, takes considerably less computation.

For each metric on an ability, you need to supply a percentage for how much that metric’s value matters.  The combined total should stay under a limit, say 70% (or some range).  The AI will use the remaining percentage to make stuff up and see how it works.
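One way to hold this in code is a small record per ability that maps metric names to percentages and reports how much is left for the AI to experiment with.  This is a minimal sketch: the `Ability` class, its field names and the specific percentages are hypothetical, and it treats the 70% cap as inclusive, which is an assumption.

```python
from dataclasses import dataclass, field

# Assumed cap, in percentage points, on what the designer may assign.
MAX_DESIGNED_PERCENT = 70


@dataclass
class Ability:
    name: str
    # metric name -> percentage of the decision the designer gives it
    metric_percent: dict = field(default_factory=dict)

    def designed_percent(self):
        """Total percentage the designer has assigned by hand."""
        return sum(self.metric_percent.values())

    def free_percent(self):
        """Percentage left over for the AI to fill with its own experiments."""
        return 100 - self.designed_percent()

    def validate(self):
        """Reject abilities whose hand-assigned metrics exceed the cap."""
        if self.designed_percent() > MAX_DESIGNED_PERCENT:
            raise ValueError(
                f"{self.name}: designer metrics total "
                f"{self.designed_percent()}%, above the "
                f"{MAX_DESIGNED_PERCENT}% cap"
            )
```

For example, `Ability("Heal myself", {"Current health": 50, "Current opponent health": 15})` passes validation and leaves 35% free.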

Creating an AI for an NPC

First, design your NPC.  Perhaps an Orc, with a dagger and a single heal potion in its inventory.  Second, select what you think might be its most common abilities, perhaps “Attack with melee weapon” and “Heal”.  Finally, you need to establish a percentage of preference for certain things.  For instance, it will choose “Attack with melee weapon” at 40% and “Heal” at 30%.  This leaves another 30% undefined.  The AI will use this for its learning, similar to the metrics we established per ability earlier.

Optionally, you might also give it X coins for spending on items, or upgrading spells, for the NPC to use.  A completely AI driven character would have 0 abilities by default, and simply be given a larger amount of coins to spend on items and spells.

Now that the general setup is made, the AI needs a way to study different setups.  This part I usually refer to as a learning engine.

Learning Engine

A learning engine is where fake battles take place that experiment with different possibilities and analyze the outcomes.  The engine creates variations of the NPC you designed (typically using a database to store all the data).  In the example, the Orc still had 30% of its actions unassigned.  The engine randomly selects a handful of other abilities to fill that share.  Perhaps it chooses 30% for “Run away”, or 10% for “Run away” and 20% for “Heal”.  Since the Orc already had “Heal” at 30%, the latter would give “Heal” a 50% chance of being chosen.  The engine may also assign only 10% and leave the other 20% unused.  All of these choices are made at random; the engine holds no preferences for what to pick.  If any two generated NPCs are identical, it replaces one with a new random selection.
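The variation step could look something like the following sketch.  It assumes the free percentage is handed out in 10-point chunks and that each chunk may also be left unused; the function names, the chunk size and the retry limit are all choices made for the example, and a plain in-memory set stands in for the database.

```python
import random


def make_variation(base, available, rng, step=10):
    """Randomly hand out the unassigned percentage in `step`-sized chunks."""
    prefs = dict(base)
    free = 100 - sum(prefs.values())
    while free >= step:
        # None means "leave this chunk unused"; no option is preferred.
        choice = rng.choice(list(available) + [None])
        if choice is not None:
            prefs[choice] = prefs.get(choice, 0) + step
        free -= step
    return prefs


def make_population(base, available, size, seed=0):
    """Build `size` distinct variations, redrawing any identical NPC."""
    rng = random.Random(seed)
    seen = set()
    population = []
    attempts = 0
    while len(population) < size and attempts < size * 100:
        attempts += 1
        prefs = make_variation(base, available, rng)
        key = tuple(sorted(prefs.items()))
        if key in seen:
            continue  # identical to an earlier NPC, so draw again
        seen.add(key)
        population.append(prefs)
    return population
```

Starting from `{"Attack with melee weapon": 40, "Heal": 30}`, one draw might add 10% to “Run away” and 20% to “Heal”, giving “Heal” 50% in that variation.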

It does the same thing with the metrics each ability pays attention to, using the available percentage to add other random metrics.  For instance, perhaps it will add “Does opponent have [X]” and check whether the player has a “water flask” or a “shield”.  The water flask will probably never matter, but the AI will experiment anyway.

If the AI has coins to spend, it would choose extra features/items prior to applying percentages, and then choose from the increased list of available abilities.  

Next, it takes the NPCs it generated and pits each one against every other at least X times in a simulated, no-graphics battle (X could be 3 or 1,000).  Each gets a score at the end of the battle based on how many hit points it started with versus how many it ended with, the same for magic points, the time it took to win, etc.  The AIs are then ranked to see which came out best, second best, and so on.  AIs that lost 100% of the time are established as pretty dumb and discarded.  AIs that won 100% of the time should be flagged for review, to see whether they exploited a weakness in your game balancing.
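A rough sketch of that round-robin follows.  To stay short it ranks by plain win rate rather than by a score built from hit points, magic points and time, and `simulate` is a hypothetical stand-in for whatever headless battle function your game provides.

```python
from itertools import combinations


def run_tournament(names, simulate, rounds=3):
    """Pit every candidate AI against every other one `rounds` times.

    `simulate(a, b)` is assumed to run one no-graphics battle and return
    the name of the winner.
    """
    if len(names) < 2:
        raise ValueError("need at least two AIs to hold a tournament")
    wins = {n: 0 for n in names}
    fights = {n: 0 for n in names}
    for a, b in combinations(names, 2):
        for _ in range(rounds):
            wins[simulate(a, b)] += 1
            fights[a] += 1
            fights[b] += 1
    rates = {n: wins[n] / fights[n] for n in names}
    ranking = sorted(names, key=lambda n: rates[n], reverse=True)
    kept = [n for n in ranking if rates[n] > 0.0]      # never-winners are dropped
    review = [n for n in ranking if rates[n] == 1.0]   # possible balance exploit
    return kept, review, rates
```

`kept` is the ranked list, best first, with the AIs that never won removed; `review` holds any AI that won every battle.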

When a player enters an Orc dungeon, there could be X areas in the tunnel.  As the player progresses, the game engine selects the AI to use (the set of abilities and the metrics to pay attention to) from the ranked list of Orc AIs, making it tougher the farther you proceed.
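Mapping dungeon areas onto the ranked list can be as simple as spreading the areas evenly across it.  The function below is one possible sketch; its name and the even spacing are assumptions, and a real game might cluster or randomize the picks instead.

```python
def ai_for_area(ranked_best_first, area_index, area_count):
    """Pick an AI for a dungeon area: early areas get the lowest-ranked AIs."""
    if not ranked_best_first or not 0 <= area_index < area_count:
        raise ValueError("need at least one AI and a valid area index")
    weakest_first = list(reversed(ranked_best_first))
    # Spread the areas evenly from the weakest AI to the strongest.
    position = area_index * (len(weakest_first) - 1) // max(area_count - 1, 1)
    return weakest_first[position]
```

With five ranked Orc AIs and three areas, the first area gets the weakest AI, the middle area gets the third-ranked one, and the last area gets the best.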

Player’s Learning Engine

The player’s learning engine operates almost the same way, except that you have to design a few player characters representing where a player would be at certain levels.  For instance, select your starting level character, then one or two variations of where they would be at level 2, then a few more variations at level 3, etc.

Each player option then gets pitted against the AIs established earlier in a variety of battles.  If a level 1 character never beats any of the AIs, you probably have a balancing issue: either Orcs shouldn’t be found in level 1 areas, or the Orcs’ abilities should be reduced.  (The learning engine shouldn’t pit player vs. player unless you expect that as a common feature in your game.)

Typically, for an average level 1 player, you would want simpler AIs that win only about 10% of the time against a level 1 character right at the beginning.  Then, as they progress and before they level up, the player would meet AIs that tend to win about 40% of the time, and so on.  The player could select a difficulty level, which would simply limit the AI choices to ones that tend to win less.
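Choosing AIs by how often they beat a given player build might look like this sketch.  It assumes the learning engine has already produced a win percentage per AI against that build; the function name, the whole-number percentages and the tolerance are illustrative.

```python
def pick_by_win_rate(ai_win_percent, target, tolerance=10):
    """Return AI names whose win percentage is near `target`, closest first.

    `ai_win_percent` maps AI name -> percentage of simulated battles it won
    against one particular player build.
    """
    close = [
        (abs(pct - target), name)
        for name, pct in ai_win_percent.items()
        if abs(pct - target) <= tolerance
    ]
    return [name for _, name in sorted(close)]
```

Given `{"orc_a": 12, "orc_b": 38, "orc_c": 45, "orc_d": 80}`, a target of 10 selects only `orc_a` for the opening areas, while a target of 40 selects `orc_b` and then `orc_c`.  A difficulty setting could simply lower the targets.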


So this basically turns the AI for your game into a statistical analysis tool, so you don’t have to do all the work of determining AIs manually.  As your game and AIs progress, you can tweak the AI metrics and abilities to pay attention to features that players use more often.

That was a recap, but there is still more.  The next items give some ideas of how to bend and apply this to your game.

Automatic Balancing

Balancing metrics is pretty easy.  

Character deals X damage, defends X attacks, wields X powers, etc…  

Metrics are simple to compare from NPC to NPC or NPC to character.  The challenge is getting the AI to change over time.  Players eventually find weaknesses in the AIs and learn to exploit them to gain abilities faster.  Guides get posted and the masses apply them, producing an unbalanced MMORPG where 95% of players are one class type or all prefer one specific weapon.

To resolve this, metrics can be recorded live in the game, particularly about a specific player type at a particular level against particular AIs.  Suppose one set-up wins far more often than any other, e.g. a thief wins 98% of the time while the wizard class wins only 80% of the time.  Common characters of both types can be run through the player learning engine to find AIs that help balance that out.  Then start increasing how often that particular player set-up runs into that AI type.  The monster can stay the same, i.e. each player would still run into an Orc, but a wizard would see an AI tuned to wizards, while a thief would see AIs tuned for thieves.
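The live recording side can start out very small.  The sketch below tallies wins per player class and reports classes that sit well above the average of all classes; the `WinTracker` class and the 5-point margin are assumptions for the example, and a real game would also key the tallies by level and by AI.

```python
from collections import defaultdict


class WinTracker:
    """Tallies live battle outcomes per player class."""

    def __init__(self):
        self._wins = defaultdict(int)
        self._fights = defaultdict(int)

    def record(self, player_class, player_won):
        self._fights[player_class] += 1
        self._wins[player_class] += int(player_won)

    def win_rates(self):
        return {c: self._wins[c] / self._fights[c] for c in self._fights}

    def overperformers(self, margin=0.05):
        """Classes whose win rate beats the all-class average by more than `margin`."""
        rates = self.win_rates()
        if not rates:
            return []
        mean = sum(rates.values()) / len(rates)
        return sorted(c for c, r in rates.items() if r - mean > margin)
```

With a thief at 98% and a wizard at 80%, `overperformers()` returns the thief, which is the cue to run common thief builds through the player learning engine and raise how often they meet the AIs tuned against them.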

Basic Application

Even if you don’t take the time to build a learning engine to try all these battles out, you could still benefit from building your AI out of abilities that respond to particular metrics.  Aside from being a good way to separate things out, it lets you apply a learning engine later without having to rewrite your AI.

Naturally, I’ve not included every detail, such as the fact that some metrics need limits (e.g. a health metric would watch for health dropping below a certain level, and the learning engine may experiment with that level), or the fact that abilities change if you change your equipped items mid-battle.  The point of this article is to give you a general idea of how to apply these ideas and get started.

Comments (1)

Most Valuable Expert 2013

An excellent article on the basics of AI design for both the player and NPCs

