Teaching Robots to Play: Why Science Loves Gaming

This article and video are a reproduction of a talk originally given during Rezzed Sessions of EGX 2015 at the NEC in Birmingham, UK in September 2015.

One of the most interesting aspects of human behaviour – and by extension of how we apply intelligence as a species – is the relationship we have with games.  In principle, games are tasks or challenges designed to stimulate the brain.  Yet while our brains already deal with demanding and often stressful mental tasks as part of our daily lives, we still make time for games and the notion of play.  Games are designed to keep us challenged, but not to be so taxing that they become too demanding or stressful for regular consumption.  It is certainly interesting to observe the relationship between humans and the notion of play as a means of relaxation.

Games as Scientific Problems

There are a number of reasons why games prove to be an interesting domain in which to explore scientific problems and, more importantly, AI challenges.  Perhaps unsurprisingly, these are largely the same reasons that humans already embrace them.

Structures of Play

One major reason that (good) games prove so effective is that they act as frameworks for reward through structured activity.  Games define loops of behaviour that, if completed and repeated successfully, will reward the player through a number of means.  Some of these rewards are mild in nature and may be purely cosmetic, whereas others give the player a sense of progression.  These interactions continue to increase in scale, but help maintain a player’s interest and momentum until the long-term and explicit reward is achieved.

The Super Mario Bros. series is one of the finest examples of how reward frameworks can be used to drive and maintain players’ interest.  Reward interactions and the loops of behaviour required to release them are often referred to as compulsion loops (Kim, 2014), whereby we maintain a player’s interest by ensuring a reward within an abstract and relative time-frame.  Short-term loops are often the result of simple interactions that may be largely cosmetic, but help maintain a player’s engagement.  The interaction and response from the collection of coins in Super Mario Bros. may seem simple, but the use of counters and sound effects provides positive reinforcement to users that their actions not only make sense, but also work towards their long-term goals.  This subsequently scales to medium-term compulsion loops conceptualised through levels: every level in the Super Mario games celebrates the fact that the player has completed it.  Returning to coin collection, continued adoption of the short-term loop rewards you with extra lives.  This medium-term loop now reinforces a player’s continued adoption of the short-term loop and gives not only context, but a real quantifiable reason to continue doing it.  This scales further into long-term loops of activity, as levels are grouped into worlds, each requiring the defeat of a boss enemy, with closure achieved through the defeat of Bowser in the eighth and final world.
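
The way these loops feed into one another can be sketched in a few lines of code.  This is only an illustration and not how any Mario game is actually implemented: the class name, the starting lives, the 100-coin threshold and the four-levels-per-world grouping are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed values for illustration only.
COINS_PER_EXTRA_LIFE = 100
LEVELS_PER_WORLD = 4


@dataclass
class PlayerProgress:
    """Hypothetical tracker for nested compulsion loops."""
    coins: int = 0
    lives: int = 3
    levels_cleared: int = 0

    def collect_coin(self) -> bool:
        """Short-term loop: every coin bumps the counter.

        Returns True when the medium-term reward (an extra life) fires.
        """
        self.coins += 1
        if self.coins >= COINS_PER_EXTRA_LIFE:
            self.coins -= COINS_PER_EXTRA_LIFE
            self.lives += 1
            return True
        return False

    def clear_level(self) -> bool:
        """Medium-term loop: returns True when a whole world is finished,
        which is where the long-term loop (the boss fight) would begin."""
        self.levels_cleared += 1
        return self.levels_cleared % LEVELS_PER_WORLD == 0
```

Each call to `collect_coin` is the small, frequent reward; only occasionally does it trigger the larger one, which is what gives the short-term loop its quantifiable purpose.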

Ultimately, the point here is that we have a confined system within which intelligent decisions can be made: we can quantify their value as well as identify their position within the roadmap of future actions you can take in order to win that game.  Conversely, good games are able to quickly point out bad interactions you’re making and thus reinforce to you that doing certain things is bad.  While Mario does a good job of this, it is arguably his competitor Sonic the Hedgehog who signifies this even better, with significantly exaggerated behaviour not only when rings are lost (thus losing all progress on that medium-term loop) but also when losing lives.

‘Fun’ Suggests Complexity

The next major factor that helps establish games as valid scientific problems is that they are fun.  While fun is a largely subjective notion, there is evidence to suggest that the level of challenge involved must meet a certain threshold in order for it to be interesting in the eyes of humans (Viglietta, 2012).  When we consider this from a computer science perspective, it suggests that such games are at least NP-hard (non-deterministic polynomial-time hard), which in short means they are non-trivial problems.  In computer science, NP-hard problems are ones for which no efficient general solution is known, so they typically require some intelligent algorithmic process in order to be solved in a reasonable amount of time.

If we were to look at the range of games out there, from online FPS games such as Call of Duty to racing games like Forza and even smaller and functionally simpler games such as Flappy Bird, we can begin to recognise a large range of problems that are not only sufficiently difficult as video games, but carry a range of equally interesting decision problems in their own right.

AI Playing Games

Despite this assertion of a certain level of computationally-defined difficulty, we would not paint all games as the same.  Games can carry a variety of problem traits that make them interesting for autonomous systems to try and solve.  These traits can change between genres of games and even releases within the same game series.  They not only result in games exhibiting particular artificial intelligence problems, but also begin to necessitate the use of certain AI techniques and methodologies that suit that particular type of problem.  Contrary to popular opinion, AI is not some black-box design that will work in any and all circumstances.  AI systems and the techniques used to build them are typically specialist in nature, focussing on very particular types of problems.  Only now, after over 50 years of research in this area, are we seriously looking at the challenges of building general intelligence systems, which we will discuss later.

Properties of Play

There are several properties of a game that we will typically consider when trying to figure out how best to approach the problem.  There are three that prove to be rather important:

  • Accessible Knowledge
    Just how much do we know about the game we are playing at any point in time?  This can be both a blessing and a curse depending on how much we actually know.  In some games, we may not know everything about the current state of the game.  This is typically the case in card games, ranging from Texas Hold’em to Hearthstone: we don’t know what cards the other player is holding, but can make some educated guesses that ultimately guide our decision making.  Conversely, we exploit this imperfect information ourselves, given that the opponent does not know what hands we might play.

    But this doesn’t mean that knowing everything about the world will help us that much either.  One of the best examples of this can be seen in fighting games such as Street Fighter, Mortal Kombat and Killer Instinct.  In each case we can see the whole state of the world: where the enemy player is, how much health or energy they have and the time remaining in that round.  Despite this, the number of possible actions that can be executed in that state leads to a large number of possible future states (also known as successor states).  The number of successor states reachable from a given state is known as its branching factor.  When it is large, even thinking three or four moves ahead forces us to filter out decisions that we don’t think the opposing player will make, given that the number of possibilities is massive.


  • Predicting the Unpredictable
    One vital aspect of gaming is being able to see things before they actually happen, allowing us to make quick decisions and react to changes in the world.  When playing a platforming game, we quickly learn the minutiae of the movement mechanics, meaning we can predict whether we can make certain jumps in particular circumstances and quickly adapt to survive.

    Predictability can also come in really handy for dealing with enemies: learning the behaviour patterns of bosses in games such as Dark Souls or Titan Souls is key to knowing when to attack and when to fall back and defend.  But sometimes our model of that predictability is broken and that makes things so much harder for us.  One of the best examples of this can be found in Pac-Man, where the original ghosts are deterministic in nature, meaning we can learn, in time, what an enemy will do at any given moment.  However, in the sequel Ms. Pac-Man, the ghosts are able to take random moves at junctions if they wish.  This results in a non-deterministic system that we can no longer predict safely, making the game significantly more difficult.


  • The Players, The Enemies, The Actors
    Just how many characters are in this game and making (semi-)intelligent decisions?  This ties back to not only the complexity issue, but also the branching factor discussed earlier.  The branching factor is influenced not just by how many actions you can make at any given state/frame of that game, but also by the actions that any other character can make in that world.  The number of unique configurations of the game world can explode at an exponential rate once you have multiple characters that can all do different things at once.  We need to figure out a) which information is useful to us, b) what we can ignore and c) how we ensure that the space of all potential game configurations is tractable, meaning that an AI can actually search it to find answers.

    To give you an example, StarCraft has been classified as at least an EXPTIME problem, due to the whole range of actions you can make at any time as the player, but also all the potential things that can happen to both your forces and the opponent’s in every frame.  This means that the problem scales at an exponential rate based upon particular items (units, resources, objectives) within the game space.  If we were to imagine that StarCraft is exponential as a problem with respect to the number of units on the map, it would scale horrendously: a solution that takes 3 seconds with only 1 unit on the map would scale to the point that it would take 110 years with 20 units.  That is not a sound strategy at all.
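
The growth described in the properties above is easy to check with some back-of-the-envelope arithmetic.  The sketch below rests on simplifying assumptions: every actor has the same number of actions, every state has the same branching factor, and the StarCraft figures follow a hypothetical cost model of 3^n seconds for n units.  That model is only inferred here because it happens to match both numbers quoted (3 seconds for 1 unit, roughly 110 years for 20); all function names are made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def joint_branching_factor(actions_per_actor: int, actors: int) -> int:
    """Joint moves available in one step if every actor acts at once.

    Assumes each actor independently picks one of the same number of actions.
    """
    return actions_per_actor ** actors


def sequences_after(branching_factor: int, depth: int) -> int:
    """Number of distinct move sequences when looking `depth` moves ahead,
    assuming a uniform branching factor at every state."""
    return branching_factor ** depth


def solve_time_seconds(units: int, base: float = 3.0) -> float:
    """Hypothetical cost model: solving takes base ** units seconds."""
    return base ** units


if __name__ == "__main__":
    # Ten possible actions per state, thinking four moves ahead.
    print(sequences_after(10, 4))
    # Five actions each for three characters acting simultaneously.
    print(joint_branching_factor(5, 3))
    # One unit versus twenty units under the assumed 3 ** n model.
    print(solve_time_seconds(1))
    print(solve_time_seconds(20) / SECONDS_PER_YEAR)
```

Under that assumed model, 20 units cost 3^20 (about 3.5 billion) seconds, which is roughly 110.5 years.  This is why filtering out unlikely moves matters far more than raw computing speed.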




Despite all this foreboding and gloom, there is still an awful lot to celebrate!  AI research in games kicked off in full swing in the mid-2000s, with a number of big projects bringing the community together, as well as solving some interesting challenges.

The Pac-Man AI competition was largely successful for AI research and methods.
  • The Pac-Man AI Competition: One of the first game competitions of its kind in which researchers pitted their own AI-driven Pac-Man against their research peers.  This was focussed on the more difficult Ms. Pac-Man due to the non-determinism.  This led to a body of research in areas such as neuro-evolution (where AI uses artificial brain-like data structures to process information in real-time) as well as reinforcement learning (where AI learns from its mistakes).


  • The Mario AI Competition: The second major problem area to gather attention was AI that can play Mario, which turned out to be a lot easier than we originally envisaged.  In fact, the more interesting problem was not whether AI could play Mario, but whether it could build Mario levels, leading to a sudden surge in research in procedural content generation.


  • The 2K Bot-Prize: Something of an interesting challenge: to create AI that can play Unreal Tournament 2004.  However, unlike most challenges this was not about trying to beat the game or be the most effective at it, but whether you could fool other humans into believing that the bot was not an AI.  This is an example of the Turing Test: in which you build AI that can tackle tasks we would expect of a human, but design it such that it cannot be distinguished from humans when observed.  This competition ran for several years until a winner was found that was able to fool the judges into believing it was actually a human player.


The Big Challenges Ahead

Despite this, there is still a lot of work to do and many challenges yet to be solved.  We break down some of the bigger talking points, as well as point to some interesting reading material for you to check out if you’re interested.

Can we crack the challenge of RTS games and craft AI that can defeat human StarCraft players?


  • Procedural Content Generation: PCG has become a big talking point in the academic community for a number of reasons.  It is perhaps not considered AI as such in the wider discussion, but generative systems are making intelligent decisions to craft artefacts.  What makes this an even bigger task is that evaluating this content is highly subjective.  Unlike many other AI problems such as robotics, scheduling and even playing games, we cannot wholly evaluate the quality of the final output.  In a robotics problem we can evaluate against the expected behaviour or even how well the robot works in specific circumstances.  With generated content, however, while we can evaluate whether it adheres to specific functional requirements, we might struggle to assess its more aesthetic and subjective aspects.  So while we can quantify whether a gun can actually hurt an enemy or if a level is playable, it is much harder to establish whether that gun is interesting or if that level is fun.
To That Sect: a game crafted by the automated game system ANGELINA.
  • Player Modelling: While we typically think of AI as attempting to solve problems on its own, it can often learn from existing data.  Research fields such as machine learning often rely on real-world data, or generated approximations of that data, for the AI to learn from.  While we would typically use data about the problem space, we can just as easily use data gathered from watching players in games, either to replicate their behaviour in an AI, or to understand their behaviour well enough for an AI to be able to ‘think like them’.  This ties heavily into the world of player analytics, where data is gathered to learn how players play games.  Arguably the most well-known example of this in games is the ‘Drivatar’ system from the Forza series.  However, it can be seen in a variety of genres, from the ‘Shadow’ mode in Killer Instinct to the ‘Director’ AI of games such as Left 4 Dead and the Far Cry series.


  • General Intelligence: One of the most exciting fields in AI right now is general intelligence.  The reason for this is that, in actuality, AI systems are typically specialist in nature: i.e. they are very good at one thing and one thing alone.  This is contrary to a lot of science-fiction: SkyNet in The Terminator or Shodan in System Shock, for example, are largely omnipotent and can solve any problem placed in front of them.  The specialist nature of real systems can be seen when developing AI that can play/solve particular games: while we can write an AI that can play Pac-Man, it cannot play Super Mario Bros. and vice versa.  This is an issue that spreads far beyond games and into larger real-world problems.  General Intelligence is the challenge of building AI that can solve any problem you give it, which is far more in line with the original aspirations of AI from the mid-20th century.  This is now a big problem, with research departments at universities as well as big tech companies aiming to solve it.
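
Returning to the procedural content generation point above, the functional half of the evaluation is the part that is straightforward to automate.  As a minimal sketch, assume a hypothetical generator emits levels as text grids where `S` is the start, `G` is the goal and `#` is a wall; a breadth-first search can then confirm whether the level is playable at all.  The grid format and the function name are assumptions for this example and are not taken from any particular PCG system.

```python
from collections import deque
from typing import Sequence


def is_playable(level: Sequence[str]) -> bool:
    """Return True if the goal 'G' can be reached from the start 'S'.

    Movement is four-directional and '#' cells are impassable.
    """
    start = None
    for r, row in enumerate(level):
        for c, cell in enumerate(row):
            if cell == "S":
                start = (r, c)
    if start is None:
        return False  # no start cell means nothing to play

    queue = deque([start])
    seen = {start}
    while queue:
        r, c = queue.popleft()
        if level[r][c] == "G":
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < len(level) and 0 <= nc < len(level[nr])
            if in_bounds and level[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


if __name__ == "__main__":
    print(is_playable(["S..", ".#.", "..G"]))
    print(is_playable(["S#.", "##.", "..G"]))
```

A check like this answers whether a level can be finished; it says nothing about whether the level is fun, which is precisely the subjective gap described above.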

As games become increasingly complex, so do the artificially intelligent systems that seek to learn from them.  We are fortunate in that gaming is such a vibrant and creative field, given it provides a continuous body of complex and interesting problem spaces to be working within.  In our own way, science loves gaming for our own selfish reasons: it offers complex problem spaces that require reactive and long-term decision-making systems to handle some of the most dynamic and multi-faceted domains outside of the real world itself.  Though to be honest, science is into games for pretty much the same reasons as everyone else: we’re here to have fun!

Written by: Tommy Thompson

Tommy is the writer and producer of AI and Games. He's a senior lecturer in computer science and researcher in artificial intelligence with applications in video games. He's also an indie video game developer with Table Flip Games. Because y'know... fella's gotta keep himself busy.
