AI 101 – Part 2: Actions, Rewards and Video Games

In Part 1, we took a small detour away from what people expected of this series – which is arguably not the best way to begin but hey, this is my show.  We spent some time looking at how people think: through understanding the rewards that we can achieve by doing things, we establish a series of beliefs, desires and intentions that ultimately drive our decision-making processes.

Now, you may be wondering what any of this has to do with video games, or artificial intelligence for that matter.  The truth is that all of these issues are really important to both areas.  So let’s start looking at this in more detail.

Impact Upon Video Games

When we consider video games, we can address this from two different perspectives: how we play video games, and how video games are designed.

Playing Video Games

Cast your mind back, if you can, to the first time you ever played a video game.  Can you remember what that video game was?  Do you remember what the experience was like?

And it was then that the art of leaning your body to make the avatar move faster was born.

For most of us – i.e. me – remembering the first time you played a video game can prove difficult, given it may have been a long time ago.  Perhaps you were more focussed on the activity of playing the game, rather than what you were experiencing at that point in time.  Video games, much like board games and indeed real life, rely on us understanding them in order to be good at them.  This typically requires us to achieve the following:

  • A basic understanding of how we can cause change using the user interface peripheral: joystick, mouse, keyboard, gamepad, touchscreen etc.
  • An understanding of what impact those changes have in the game world: can certain rules be established based upon the rewards we gain?


By developing an understanding of a video game through its reward structures, we can build knowledge that aids in becoming better at playing the game.  If we consider classic platformers such as Super Mario Bros., Mega Man (X) and Sonic the Hedgehog, we quickly establish some basic rules about how the game works through simple interaction and observation.  By pressing certain buttons, we begin to move either left or right.  In the case of the examples given, more of the game world is shown to us as we move right.  We learn that a certain button will result in the avatar jumping.  In time, we learn that specific interactions between the avatar and artefacts in the game world help establish the rules of play.

We land on spikes and Mega Man (X) dies. This is sufficient for us to learn from the experience.

For example, we are often quick to learn that certain interactions in games will kill our character instantly, such as Mario falling down a gap in the map or Mega Man and Sonic landing on spikes.  These actions result in a strong negative reward.  These negative rewards have far-reaching consequences, resulting in the player having to return to a checkpoint or start the level from the beginning.  In classic (read: old) video games, this was often even more perilous, since it would trigger the loss of a life and potentially result in ‘Game Over’.  As such, we are quick to learn that the actions that result in these effects are not something we wish to repeat.

Yeah that's right, keep smiling jackass.  I'm coming to get you.  Once I beat all the levels I already completed to reach you.

So in short: we rely on our ability to understand how actions dictate rewards in video games.  Through a learning process, we slowly come to understand which actions are best to take at a given point in time, allowing us to excel at that game.
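To make that learning process a little more concrete, here is a minimal sketch of how experience could be turned into a preference for certain actions.  Everything in it is an assumption made for illustration: the situations, the actions, the reward numbers and the function names are all hypothetical, and it is certainly not how any of the games mentioned work internally.  It simply keeps a running estimate of how rewarding each action has been in each situation, then picks the action with the best estimate.

```python
from collections import defaultdict

# Hypothetical actions available to the player in any situation.
ACTIONS = ["walk", "jump"]


def update_value(values, situation, action, reward, learning_rate=0.5):
    """Nudge the estimated value of (situation, action) towards the observed reward."""
    key = (situation, action)
    values[key] += learning_rate * (reward - values[key])


def best_action(values, situation, actions):
    """Pick the action with the highest estimated value in this situation."""
    return max(actions, key=lambda action: values[(situation, action)])


# Assumed experiences as (situation, action, reward) triples.
# Dying is a strong negative reward; making progress is a small positive one.
experiences = [
    ("gap_ahead", "walk", -10.0),   # walked into the gap and lost a life
    ("gap_ahead", "jump", 1.0),     # cleared the gap
    ("flat_ground", "walk", 1.0),   # made progress to the right
    ("flat_ground", "jump", 0.0),   # jumped on the spot, nothing gained
]

values = defaultdict(float)  # every estimate starts at 0.0
for situation, action, reward in experiences:
    update_value(values, situation, action, reward)

print(best_action(values, "gap_ahead", ACTIONS))    # jump
print(best_action(values, "flat_ground", ACTIONS))  # walk
```

A single painful fall is enough to push the estimate for walking into a gap well below that of jumping, which mirrors how quickly we learn not to repeat those actions.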

Note: In addition to learning how to play an existing game, we are often quick to transfer this knowledge to similar, yet different problems. This makes sense, since a lot of this knowledge is applicable in different games. Earlier I mentioned three platforming series: Super Mario Bros., Mega Man and Sonic the Hedgehog, all of which rely on similar mechanics. Once we can play one, we can apply our knowledge to figure out how to play the others. This is still a big challenge faced by AI researchers, with new competitions being established to address it. I have written about the challenges and my experiences with this in the General Video Game AI Competition.

Designing Video Games

While the understanding of rewards in the context of actions is useful for players, it is critical for developers: if this relationship is not evident, it may impede a player’s desire to continue playing.

Establishing the Rules of Play

It is important to remember that the rules of play that we take time to learn are already established when the game is released.  It’s part of the development process and game developers are responsible for ensuring these rules are well established.  This relies on two important elements:

  • The rules of the world are robust and consistent.  If not, cheats or glitches can emerge in the game.
  • The rules of the world are expressed such that a player can learn them through play.

The latter is arguably the more important of the two, given that players must be able to infer the rules of play.  This allows us not only to play as discussed earlier, but also to identify when things do not go as planned.  This may be due to the game changing the mechanics of play or because something in the game is inherently broken.

Fighting Psycho Mantis in Metal Gear Solid is one of the best examples of how game design is deliberately manipulated to confuse the player.

Changing the mechanics of play can prove dangerous, unless it is conveyed clearly to the player why this has happened.  A good (and weird) example of this being achieved in game design is the infamous Psycho Mantis battle in Metal Gear Solid for the PlayStation.  The protagonist Solid Snake must fight Psycho Mantis: a psychic and telepathic assassin.  The character explains that he can read your ‘thoughts’, in this case input from your game pad.  As a result, he can predict your every move, and the player needs to change the input port used by their controller.  In addition, Psycho Mantis causes a number of effects to occur on screen, leading you to believe that the game is actually suffering from fatal errors.  While jarring at first, it is very well done, given that it works in conjunction with the game’s narrative.

Had this not been made clear to the player courtesy of the story, we would have assumed the game suffered from a significant number of bugs. This is how glitches in games are recognised: they deviate from our expectations of how the game will play.

Maintaining Players’ Interest

Designers consider how reward structures are established within games, given that they are required to maintain your interest in the game.  One element that proves useful to appreciate from a design perspective is the notion of ‘compulsion loops’, a theory that has not been around for a particularly long time (read the Gamasutra article by John Hopson from Bungie on his research into the area).  The idea is that the brain begins to associate patterns of behaviour with rewards.  Certain design tropes within video games can be classified as either short-, medium- or long-term compulsion loops.  These classifications are based primarily on how much reward is given for an activity, but also on the frequency with which rewards are given.

Short-Term Loops

Short-term loops are often focused on keeping you preoccupied with tasks that would be mundane, were it not for the instant gratification we receive for completing them.  Arguably the most recognisable, and best, example can be found in Super Mario Bros.: collecting coins.  When a coin is collected it provides a nice chime, a counter is increased and you feel compelled to grab the next one.  It requires very little effort on your end, but it feels gratifying and it keeps you preoccupied with collecting more coins.  It also, interestingly, helps build towards a medium-term compulsion loop.

Collecting coins provides instant gratification, but also provides deferred reward in 1-Ups.

Medium-Term Loops

This is where the designer hopes to maintain your interest: short-term loops can only sustain you for so long.  Sooner or later you want to look for the next interesting thing.  Now we want the long-term loops to focus on outstanding achievements.  So you need something to bridge the gap.  Medium-term loops provide a context for everything the short-term loops are achieving, to the point that the player begins to understand what the larger goal of the game is that needs completing.

Once again, the Mario series is a great example of this.  Firstly, we can return to the coins.  Upon earning enough coins, we gain a 1-Up.  We understand the importance and value of an extra life in the context of the game given the aforementioned deaths we encounter during play.  It gives purpose to the short-term loop of collecting coins and is ultimately rather useful.

Provided as reference. You can see that Level 1-1 is not very long! Image credit goes to Ian Albert.

In addition, Mario levels are traditionally not very long.  So we discretise the experience of playing Mario into chunks of gameplay that provide deferred gratification upon completing them.  You are then compelled to complete another level.  In time we also realise that the game is broken up into ‘worlds’ each four levels long.  So both the gratification and challenge increase as we complete each world, knowing that ultimately you will reach the last level of play.
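The way the coin and level loops feed into one another can be sketched as a pair of counters.  This is only an illustration under stated assumptions: the class and method names are hypothetical, the starting number of lives is an assumption, and the only figures drawn from Super Mario Bros. are the 100 coins needed for a 1-Up and the four levels per world mentioned above.

```python
from dataclasses import dataclass

COINS_PER_EXTRA_LIFE = 100  # Super Mario Bros. awards a 1-Up for 100 coins
LEVELS_PER_WORLD = 4        # each world is four levels long


@dataclass
class PlayerProgress:
    """Hypothetical record of a player's progress through the loops."""
    coins: int = 0
    lives: int = 3  # assumed starting value, for illustration only
    levels_completed: int = 0

    def collect_coin(self) -> bool:
        """Short-term loop: one coin, instant feedback.

        Returns True when this coin also paid out the medium-term reward (a 1-Up).
        """
        self.coins += 1
        if self.coins == COINS_PER_EXTRA_LIFE:
            self.coins = 0
            self.lives += 1
            return True
        return False

    def complete_level(self) -> bool:
        """Medium-term loop: finish a level. Returns True when a world is finished."""
        self.levels_completed += 1
        return self.levels_completed % LEVELS_PER_WORLD == 0

    @property
    def current_stage(self) -> str:
        """The stage the player is about to play, written as 'world-level'."""
        world = self.levels_completed // LEVELS_PER_WORLD + 1
        level = self.levels_completed % LEVELS_PER_WORLD + 1
        return f"{world}-{level}"


progress = PlayerProgress()
extra_lives = sum(progress.collect_coin() for _ in range(250))
print(extra_lives, progress.coins, progress.lives)  # 2 50 5

for _ in range(LEVELS_PER_WORLD):
    progress.complete_level()
print(progress.current_stage)  # 2-1
```

Each call to collect_coin is the short-term loop, while the occasional 1-Up and the completed world are the deferred, medium-term rewards that give those small actions a purpose.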

Long-Term Loops

Everything up until now has built up to this moment.  The long-term loop is the ultimate purpose of everything we have done so far.  It is the culmination of all rewards achieved to date, such that you feel a tremendous sense of accomplishment in reaching this point.  It is the most defining aspect of playing that game.  Typically, this means you have defeated the game.  However, it is important that the game does not ever fulfil this loop: if anything it isn’t really a loop, it’s merely a long journey that never completes itself.

Prepare to Plan?
The ultimate Long-Term compulsion loop?

To my mind, one of the strongest experiences of this kind is From Software’s Dark Souls, where the player is forced through a myriad of gruelling challenges until they defeat the final boss.  It is an empowering moment that almost feels like the loop will close, until the game provides the New Game+ option: daring you to beat the game again, with even more difficult opponents.  For many people this is a no-brainer: you feel compelled to beat the game again despite the increased challenge you will be faced with.  Primarily, this is because you know you can do it: you did it before and you can and will do it again.


Part 2 has shown how the ideas of rewards, decisions, actions and beliefs are important not only when we play video games, but also when we design them.  This is building towards the all-important point: the impact they have upon artificial intelligence.

Everything we have discussed thus far is important for us to consider for an AI which is either a character within a game, or designed to play the game itself.  In Part 3, we look at how all of this is taken on board as we look at the concept of intelligent agents.

Related Reading

Peter Collier – Compulsion Loops in the Short, Medium and Long-term.


Berke, J.D., and Hyman, S.E. (2000) “Addiction, dopamine, and the molecular mechanisms of memory.” Neuron 25(3): 515-532.

Written by: Tommy Thompson

Tommy is the writer and producer of AI and Games. He's a senior lecturer in computer science and researcher in artificial intelligence with applications in video games. He's also an indie video game developer with Table Flip Games. Because y'know... fella's gotta keep himself busy.