Placeholder Image

Subtitles section Play video

  • [WHISTLE] Hello, and welcome to a coding challenge, Tic, Tac,

  • Toe.

  • Oh, with the Minimax algorithm.

  • That's why I'm here, because I made this other coding

  • challenge, Tic-Tac-Toe, where I created a kind of a big--

  • it was kind of a mess if I'm being perfectly honest.

  • But I made a working version of the Tic-Tac-Toe game

  • that just played with two players picking random spots.

  • So since then, you can check one of my live streams

  • if you want to find where I did this.

  • I made some adjustments to it so that I could, as a human being,

  • play the game.

  • So right now I'm going to play the random computer picker.

  • I'm going to go here.

  • And then I'm going to block X. And then I'm going to go here.

  • And ha, ha, I win.

  • O wins.

  • So what I have the adjustment that I

  • made is that I added a mouse pressed function

  • where I find where did I click?

  • And I put my human variable, which

  • is the letter o onto the board.

  • And then I called next turn where next turn

  • picks a random spot in the board and makes

  • that the AI spot or X's spot.

  • So the whole point of this video is

  • for me to implement something called

  • Minimax, which is an algorithm, a search algorithm, if you

  • will, to find the optimal next move for the AI.

  • I mean, it could find it for myself.

  • And then I could implement it.

  • But the idea is I want this player, the computer player

  • to beat me at the game or at least

  • tie to play a perfect game.

  • If you want to learn more about the Minimax algorithm,

  • I would suggest two resources that you

  • can look at that I actually looked at before beginning

  • this coding challenge.

  • I haven't programmed this before.

  • But I watched this video by Sebastian Lague that

  • explains the Minimax algorithm, also something called

  • alpha-beta pruning, which I'm not going to implement

  • but could be a good exercise, next step for you.

  • And I also found this article on the geeksforgeeks.org website,

  • which is three-part series about the Minimax algorithm

  • and how to apply it to Tic-Tac-Toe.

  • So those resources are probably your best bet

  • for learning about the theory of the Minimax algorithm

  • and how it can be applied to a wide variety of scenarios.

  • But I'm going to look at the specifics of Tic-Tac-Toe, which

  • is a great starting example, because it's

  • a simple enough game that I can actually diagram out

  • all of the possibilities, which even if that's computationally

  • possible, it's very hard to diagram.

  • And there's lots of scenarios where you couldn't even compute

  • all of the possible outcomes.

  • And chess is an example of that.

  • So Minimax is typically visualized as a tree.

  • So the root of the tree is the current state

  • of the game board.

  • [MUSIC PLAYING]

  • So I've drawn a configuration here of the Tic-Tac-Toe board

  • midway through the game because I

  • don't want to start at the beginning

  • where there's so many possibilities

  • that I couldn't diagram.

  • So right now it's X's turn.

  • And X has three possible moves.

  • So I could express that in a tree diagram as move one,

  • move two, and move three.

  • So let's take a look at what I mean by that.

  • [MUSIC PLAYING]

  • I'm going to use a blue marker so you can see the new moves.

  • So let's say one possible move is X going here.

  • Let me diagram out the next two possible moves.

  • [MUSIC PLAYING]

  • Another possible move is X going here.

  • [MUSIC PLAYING]

  • And another possible move is X going here.

  • So we need to look at each of these

  • and determine is the game over, or is it continuing?

  • So only in this case is the game over.

  • And in this case, the game is over, and X has one.

  • So I'm going to mark this green for the state of X

  • having won the game.

  • Now, it's O's turn.

  • And for each of these spots, O could

  • make two possible options.

  • [MUSIC PLAYING]

  • O could go here, or O could go here.

  • Oh could go here, or O could go here.

  • And look at these.

  • In this case and this case, O has won.

  • Remember, we're trying to solve for the optimal move

  • for X. X is making the first turn.

  • So this means I can mark these were O wins as red, as

  • like, those are bad outcomes.

  • So while these are not terminal states,

  • there's only one move possible for X. So I could draw an arrow

  • and put that down here.

  • But ultimately I can just consider this

  • as X has to go here.

  • So these, you could consider as terminal and in which case

  • X will win.

  • Now, you see that I visualized all the possible outcomes

  • of the Tic-Tac-Toe game based on this starting configuration.

  • So how does X pick which path to go down

  • where we can see here that the goal is X to win the game

  • to get to here, here, or there.

  • How does it do that?

  • And the way that it does that--

  • by the way, spoiler alert.

  • This is the move it should pick.

  • It should just win instantly.

  • But the idea of the Minimax algorithm

  • is if we could score every possible outcome,

  • those scores can ripple back up the tree

  • and help X decide which way to go to find the highest

  • possible score.

  • The nice thing about Tic-Tac-Toe is

  • there's not a range of scores.

  • The endpoints are either you won, you lost, or you tied.

  • I could say if X wins, it's a score of plus 1 .

  • If O wins, it's a score of negative 1.

  • And if it's a tie, it's a score of 0.

  • So in other words, we can give this a score of plus 1.

  • This a score of plus 1 .

  • This a score of negative 1.

  • This a score of negative 1, and this a score of plus 1.

  • But we've got an issue.

  • I'm trying to pick between this option, this option,

  • and this option.

  • These two don't have a score.

  • How do I pick their score?

  • Well, I can pick their score based on which one of these

  • would be picked.

  • But who's picking these?

  • This is why the algorithm is called Minimax,

  • because we have two players each trying

  • to make an optimal score.

  • X is what's known as the maximizing player.

  • So to go from here to here, it's X's turn.

  • And X is maximizing.

  • But here, it's now O's turn.

  • O wants to win.

  • X can assume that O is going to play their optimal move.

  • Now, you could have a situation where your player doesn't

  • play optimally.

  • And that's something you might have to consider

  • in terms of game theory.

  • But in this case of the algorithm,

  • we can assume that O is going to make the best possible move.

  • I mean, if it makes a worse move, that's only better for us

  • anyway so it's all going to work out.

  • So O is what's referred to as the minimizing player--

  • the minimizing player.

  • So O has the option of picking negative 1 or plus 1,

  • negative 1 or plus 1.

  • Which one is O going to pick as the minimizing player?

  • The option to win so we can then take the best outcome for O,

  • the minimizing outcome for O and pass that back up

  • here's the score and the same thing here.

  • Now, X is the maximizing player.

  • Which one of these is it going to pick--

  • negative 1, plus 1, or negative 1.

  • These are negative 1, because even

  • though it could win here, if X goes this way,

  • O is going to win.

  • If X goes this way, O is going to win if O plays optimally.

  • So this is the way to go.

  • This scenario that I picked might not

  • be the best demonstration of the idea.

  • In fact, there's no ties here.

  • There's only two turns being played.

  • So maybe what you might want to do is pause this video