Previously in part 6 of AI 101, we discussed some fundamental principles of how to frame a problem, such as a video game, which we explore in more detail in parts 7 through 9.
- States: snapshots of the world that tell us about everything relevant in the problem we are currently looking at.
- Actions: actions that can be taken by a player/AI that make some change to the world.
- State Transitions: the relationship between actions and states, whereby committing actions causes the current state to change.
- Search Fundamentals: states that are crucial for any search process to take place. These include the initial state, where the player/AI finds itself at the very beginning of the search, and the goal state(s): one or more states that satisfy criteria we have set.
These are the foundations on which a search algorithm is built. Provided we know how to model the world (states), have means by which to change it (actions) and a representation of causal change (state transitions), we can begin to search from our initial state to a goal state.
Now, we won't jump straight into AI-driven search; instead, we focus on a couple of simple approaches that let you understand the structure and means by which we conduct search. These techniques are typically referred to as uninformed search: while they do exhibit some structure, they are not intelligent in how they explore the search space. At the end of this piece, we will explain what must be considered for an algorithm to be intelligent.
The Structure of Search
If we consider the state-transition system we have previously discussed, this allows us to begin to appreciate how states are connected to one another. We could consider it as a graph-like structure: a mesh where nodes are states and edges are the actions that allow us to move between them. However, that would not be entirely accurate: an edge between two states can represent not only a variety of unique actions for that transition, but also actions that depend on the direction in which the transition takes place.
For example, if we had two states representing being on the first and second floor of a building, the edge between them could express walking up stairs or going up in an elevator. However, that assumes the transition was from the first to the second floor. That same edge also means going down stairs (which is a rather different action) and going down in an elevator.
Of course, we could also represent each of these as unique edges, but this quickly becomes cumbersome, and the mesh would begin to look messy. So instead, we defer to a more practical approach: the use of a tree.
Tree Search & Terminology
Trees are how we often envisage AI search processes, given they provide a nice, top-down visualisation of the state space. We can see as we move down through the tree that we are moving from one state to another. In addition, this resolves the previous problem by being unidirectional: the tree is only ever drawn downwards, never upwards. This means that if we have an action that allows us to backtrack, we can either a) draw the same state again further down the tree or b) simply ignore these transitions if we don’t think they’re necessary.
The diagram above shows this quite effectively, with the top of the tree expanding outwards. We can then see how many successor states a node has by looking one level down in the tree. This introduces some key terminology we need to remember:
- The root of the tree – the node at the top of the diagram – is the initial state we are searching from.
- Nodes are linked to other nodes via edges (except those on the bottom row). These edges represent actions, each connecting a node to a successor state. Given we can commit a range of different actions in each state, we have to consider the branching factor: the number of child nodes a given node is connected to.
- Successor states are often referred to as children, given they are further down in the tree.
- We need to consider the row or layer of the tree we are looking at. This is typically referred to as the search depth, which identifies how far down into the search tree we have travelled.
- While we search the tree, we need to identify the subsequent states we have decided to visit next. This collection of future states, known as the agenda, is stored in some sort of data structure. As we will see, the choice of data structure actually has a big impact on how the algorithms work.
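To make this terminology concrete, here is a minimal sketch in Python; the tree shape and node names are invented purely for illustration, with each node mapping to its list of children:

```python
# A hypothetical search tree: each node maps to its list of children.
tree = {
    "A": ["B", "C", "D"],   # root "A" has a branching factor of 3
    "B": ["E", "F"],        # "B" has a branching factor of 2
    "C": [],
    "D": ["G"],
    "E": [], "F": [], "G": [],
}

def depth(tree, node, root="A"):
    """Search depth: how many edges separate a node from the root."""
    if node == root:
        return 0
    for parent, children in tree.items():
        if node in children:
            # The node sits one level below its parent.
            return depth(tree, parent, root) + 1

print(depth(tree, "G"))  # "G" is two levels below the root -> 2
```

Here the root is `"A"`, its children `"B"`, `"C"` and `"D"` sit at depth 1, and the agenda would hold whichever of these nodes we have queued up to visit next.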
There is also the question of how we assess the quality of a search algorithm. This is typically broken down into a number of key metrics and properties: whether the algorithm is complete (guaranteed to find a solution if one exists), whether it is optimal (guaranteed to find the best solution), and its time and space complexity (how long it takes to run and how much memory it requires).
So with our shiny new tree, how do we navigate it? Well, let's get to those uninformed search techniques I've been babbling on about. These are not overly complex; their names are actually largely reflective of how the algorithms work.
Breadth-First Search (BFS) is a simple approach in that it explores the search space by looking in each layer of the search tree. Referring back to our description of terms above, we check each layer of the tree – the breadth of the search space – to see whether the goal state is anywhere in this layer. A diagram showing the order in which nodes would be traversed from the sample tree shown earlier is provided below.
In the event that a given state is not the goal, its child states are added to the end of the agenda, thus treating the agenda like a queue. What this means is that we visit every node in a layer of the tree, from the leftmost to the rightmost child, and only afterwards do we move down one level in the tree.
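A minimal sketch of breadth-first search might look like the following; the example tree and goal are invented for illustration, and the key detail is that children join the back of the queue:

```python
from collections import deque

def bfs(tree, root, goal):
    """Visit nodes layer by layer until the goal is found."""
    agenda = deque([root])    # the agenda, used as a FIFO queue
    order = []                # record of the traversal order
    while agenda:
        state = agenda.popleft()            # take from the FRONT
        order.append(state)
        if state == goal:
            return order
        agenda.extend(tree.get(state, []))  # children go to the BACK
    return None               # goal not in the tree

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
        "D": [], "E": [], "F": []}
print(bfs(tree, "A", "E"))  # ['A', 'B', 'C', 'D', 'E'] -- layer by layer
```

Notice the traversal finishes the whole second layer (`"B"`, `"C"`) before touching the third, exactly as described above.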
The alternative is known as depth-first search, where instead of going across the breadth of the search space, we dive deep into a particular branch of the tree. We visit the first child of the root and check if it's the goal. If it isn't, we then check the first child of that child and repeat the process until we either find the goal or hit a dead end. If we hit a dead end, we back up to the last state we visited that still has children to check. This results in the traversal shown below, where we burrow deep into one segment of the tree, then backtrack to areas we haven't visited yet.
Depth-first search is, in terms of the actual software, remarkably similar to breadth-first search. In fact, it's the exact same algorithm. The only difference is that the agenda is stored as a stack rather than a queue. A stack ensures we next look at the children of the last state we visited, whereas the queue ensures we visit states in the order in which we discovered them.
Now, both of these methods share some particularly useful traits. Firstly, both are complete, in that each will search the entirety of the search space (with a small caveat, which we will discuss in a moment). Breadth-first search is also optimal, in that it will find the shortest path to the solution. Depth-first search is not, given it burrows deep into the search space and may find a long path to the goal before a shorter one. Despite being seemingly effective methods of searching the space, however, both suffer from a number of problems:
- They’re not particularly efficient: neither of these methods attempts to prioritise areas of the tree given some metric or quantifier. As a result, while they are complete, neither is particularly efficient in terms of either the time taken or the memory required to execute them.
- Both can repeat themselves: one issue neither of these algorithms (in their simplest form) handles is that your search space may have loops. State A leads to B, which leads to C, which leads back to B again. This can trap depth-first search in particular, because the algorithm never checks whether it is in a state it has visited already, while breadth-first search will waste time revisiting previously explored states.
For a search to be intelligent, the structure of how it searches must be malleable, changing with respect to information gained from the problem. In the simplest of cases, this is achieved by understanding cost: how ‘expensive’ an action can be. What this action cost is can vary and is largely dependent on the problem: for navigation we would consider the distance travelled as cost, but we could also consider the likes of time taken or energy consumption.
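As a small sketch of the idea, costs can be attached to actions; the floor-and-elevator numbers below are invented for illustration, echoing the building example from earlier:

```python
# Hypothetical action costs (in arbitrary effort units) for moving
# between two floors -- the values here are invented for illustration.
costs = {
    ("floor1", "floor2", "take_stairs"):   5.0,
    ("floor1", "floor2", "take_elevator"): 2.0,
}

# A cost-aware search would prefer the cheapest applicable action.
best_action = min(costs, key=costs.get)
print(best_action)  # ('floor1', 'floor2', 'take_elevator')
```

An uninformed search ignores these numbers entirely; an informed one uses them to decide which branch of the tree deserves attention first.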
In addition, another major problem is that right now, not only do we happily visit states we have already been to, but we are also prone to getting stuck inside the tree if there are loops. A simple depth-first search is notorious for this, given it can potentially loop infinitely should a state's first successor be itself courtesy of a particular action.
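A common remedy, sketched below with an invented looping state space, is to keep a set of visited states so that nothing is ever added to the agenda twice; this is often called graph search rather than tree search:

```python
from collections import deque

def bfs_no_repeats(graph, root, goal):
    """Breadth-first search that skips states it has already seen,
    so loops in the state space cannot trap it."""
    agenda = deque([root])
    visited = {root}
    while agenda:
        state = agenda.popleft()
        if state == goal:
            return True
        for child in graph.get(state, []):
            if child not in visited:     # never revisit a state
                visited.add(child)
                agenda.append(child)
    return False

# A looping state space: B -> C -> B would trap a naive search.
graph = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}
print(bfs_no_repeats(graph, "A", "D"))  # True, despite the B/C loop
```

The same guard works for depth-first search by swapping the queue for a stack, at the price of storing every state seen so far in memory.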
We will return to these issues and how to address them as we introduce informed search basics in part 8.