Subtitles section Play video
[WHISTLE] Hello, and welcome to a coding challenge, Tic, Tac,
Toe.
Oh, with the Minimax algorithm.
That's why I'm here, because I made this other coding
challenge, Tic-Tac-Toe, where I created a kind of a big--
it was kind of a mess if I'm being perfectly honest.
But I made a working version of the Tic-Tac-Toe game
that just played with two players picking random spots.
So since then, you can check one of my live streams
if you want to find where I did this.
I made some adjustments to it so that I could, as a human being,
play the game.
So right now I'm going to play the random computer picker.
I'm going to go here.
And then I'm going to block X. And then I'm going to go here.
And ha, ha, I win.
O wins.
So what I have the adjustment that I
made is that I added a mouse pressed function
where I find where did I click?
And I put my human variable, which
is the letter o onto the board.
And then I called next turn where next turn
picks a random spot in the board and makes
that the AI spot or X's spot.
So the whole point of this video is
for me to implement something called
Minimax, which is an algorithm, a search algorithm, if you
will, to find the optimal next move for the AI.
I mean, it could find it for myself.
And then I could implement it.
But the idea is I want this player, the computer player
to beat me at the game or at least
tie to play a perfect game.
If you want to learn more about the Minimax algorithm,
I would suggest two resources that you
can look at that I actually looked at before beginning
this coding challenge.
I haven't programmed this before.
But I watched this video by Sebastian Lague that
explains the Minimax algorithm, also something called
alpha-beta pruning, which I'm not going to implement
but could be a good exercise, next step for you.
And I also found this article on the geeksforgeeks.org website,
which is three-part series about the Minimax algorithm
and how to apply it to Tic-Tac-Toe.
So those resources are probably your best bet
for learning about the theory of the Minimax algorithm
and how it can be applied to a wide variety of scenarios.
But I'm going to look at the specifics of Tic-Tac-Toe, which
is a great starting example, because it's
a simple enough game that I can actually diagram out
all of the possibilities, which even if that's computationally
possible, it's very hard to diagram.
And there's lots of scenarios where you couldn't even compute
all of the possible outcomes.
And chess is an example of that.
So Minimax is typically visualized as a tree.
So the root of the tree is the current state
of the game board.
[MUSIC PLAYING]
So I've drawn a configuration here of the Tic-Tac-Toe board
midway through the game because I
don't want to start at the beginning
where there's so many possibilities
that I couldn't diagram.
So right now it's X's turn.
And X has three possible moves.
So I could express that in a tree diagram as move one,
move two, and move three.
So let's take a look at what I mean by that.
[MUSIC PLAYING]
I'm going to use a blue marker so you can see the new moves.
So let's say one possible move is X going here.
Let me diagram out the next two possible moves.
[MUSIC PLAYING]
Another possible move is X going here.
[MUSIC PLAYING]
And another possible move is X going here.
So we need to look at each of these
and determine is the game over, or is it continuing?
So only in this case is the game over.
And in this case, the game is over, and X has one.
So I'm going to mark this green for the state of X
having won the game.
Now, it's O's turn.
And for each of these spots, O could
make two possible options.
[MUSIC PLAYING]
O could go here, or O could go here.
Oh could go here, or O could go here.
And look at these.
In this case and this case, O has won.
Remember, we're trying to solve for the optimal move
for X. X is making the first turn.
So this means I can mark these were O wins as red, as
like, those are bad outcomes.
So while these are not terminal states,
there's only one move possible for X. So I could draw an arrow
and put that down here.
But ultimately I can just consider this
as X has to go here.
So these, you could consider as terminal and in which case
X will win.
Now, you see that I visualized all the possible outcomes
of the Tic-Tac-Toe game based on this starting configuration.
So how does X pick which path to go down
where we can see here that the goal is X to win the game
to get to here, here, or there.
How does it do that?
And the way that it does that--
by the way, spoiler alert.
This is the move it should pick.
It should just win instantly.
But the idea of the Minimax algorithm
is if we could score every possible outcome,
those scores can ripple back up the tree
and help X decide which way to go to find the highest
possible score.
The nice thing about Tic-Tac-Toe is
there's not a range of scores.
The endpoints are either you won, you lost, or you tied.
I could say if X wins, it's a score of plus 1 .
If O wins, it's a score of negative 1.
And if it's a tie, it's a score of 0.
So in other words, we can give this a score of plus 1.
This a score of plus 1 .
This a score of negative 1.
This a score of negative 1, and this a score of plus 1.
But we've got an issue.
I'm trying to pick between this option, this option,
and this option.
These two don't have a score.
How do I pick their score?
Well, I can pick their score based on which one of these
would be picked.
But who's picking these?
This is why the algorithm is called Minimax,
because we have two players each trying
to make an optimal score.
X is what's known as the maximizing player.
So to go from here to here, it's X's turn.
And X is maximizing.
But here, it's now O's turn.
O wants to win.
X can assume that O is going to play their optimal move.
Now, you could have a situation where your player doesn't
play optimally.
And that's something you might have to consider
in terms of game theory.
But in this case of the algorithm,
we can assume that O is going to make the best possible move.
I mean, if it makes a worse move, that's only better for us
anyway so it's all going to work out.
So O is what's referred to as the minimizing player--
the minimizing player.
So O has the option of picking negative 1 or plus 1,
negative 1 or plus 1.
Which one is O going to pick as the minimizing player?
The option to win so we can then take the best outcome for O,
the minimizing outcome for O and pass that back up
here's the score and the same thing here.
Now, X is the maximizing player.
Which one of these is it going to pick--
negative 1, plus 1, or negative 1.
These are negative 1, because even
though it could win here, if X goes this way,
O is going to win.
If X goes this way, O is going to win if O plays optimally.
So this is the way to go.
This scenario that I picked might not
be the best demonstration of the idea.
In fact, there's no ties here.
There's only two turns being played.
So maybe what you might want to do is pause this video