
  • Jabril: John Green Bot, are you serious?!

  • I made this game and you beat my high score?

  • John-Green-bot: Pizza!

  • Jabril: So John Green Bot is pretty good at Pizza Jump, but what about this new game we made, TrashBlaster?

  • John-Green-bot: Hey, that's me!

  • Jabril: Yeah, let's see what you've got.

  • John-Green-bot: That's not fair, Jabril!!

  • Jabril: It's okay, John Green Bot, we've got you covered.

  • Today we're gonna design and build an AI program to help you play this game like a pro.

  • INTRO

  • Hey, I'm Jabril and welcome to Crash Course AI!

  • Last time, we talked about some of the ways that AI systems learn to play games.

  • I've been playing video games for as long as I can remember.

  • They're fun, challenging, and tell interesting stories where the player gets to jump on goombas

  • or build cities or cross the road or flap a bird.

  • But games are also a great way to test AI techniques because they usually involve simpler

  • worlds than the one we live in.

  • Plus, games involve things that humans are often pretty good at like strategy, planning,

  • coordination, deception, reflexes, and intuition.

  • Recently, AIs have become good at some tough games, like Go or Starcraft II.

  • So our goal today is to build an AI to play a video game that our writing team and friends

  • at Thought Cafe designed called TrashBlaster!

  • The player's goal in TrashBlaster is to swim through the ocean as a little virtual

  • John-Green-bot, and destroy pieces of trash.

  • But we have to be careful, because if John-Green-bot touches a piece of trash, then he loses and

  • the game restarts.

  • Like in previous labs, we'll be writing all of our code using a language called Python

  • in a tool called Google Colaboratory.

  • And as you watch this video, you can follow along with the code in your browser from the

  • link we put in the description.

  • In these Colaboratory files, there's some regular text explaining what we're trying

  • to do, and pieces of code that you can run by pushing the play button.

  • These pieces of code build on each other, so keep in mind that we have to run them in

  • order from top to bottom, otherwise we might get an error.

  • To actually run the code and experiment with changing it, you'll have to either click

  • “Open in Playground” at the top of the page or open the File menu and click

  • “Save a Copy to Drive”.

  • And just an FYI: you'll need a Google account for this.

  • So to create this game-playing AI system, first, we need to build the game and set up

  • everything like the rules and graphics.

  • Second, we'll need to think about how to create a TrashBlaster AI model that can play

  • the game and learn to get better.

  • And third, we'll need to train the model and evaluate how well it works.

  • Without a game, we can't do anything.

  • So we've got to start by generating all the pieces of one.

  • To start, we're going to need to fill up our toolbox by importing some helpful libraries,

  • such as PyGame.

  • Steps 1.1 and 1.2 load the libraries, and step 1.3 saves the game so we can watch

  • it later.

  • This might take a second to download.

  • The basic building blocks of any game are different objects that interact with each other.

  • There's usually something or someone the player controls and enemies that you battle

  • -- all these objects and their interactions with one another need to be defined in the

  • code.

  • So to make TrashBlaster, we need to define three objects and what they do: a blaster,

  • a hero, and trash to destroy.

  • The blaster is what actually destroys the trash, so we're going to load an image that

  • looks like a laser-ball and set some properties.

  • How far does it go, what direction does it fly, and what happens to the blast when it

  • hits a piece of trash?

  • Our hero is John-Green-bot, so now we've got to load his image, and define

  • properties like how fast he can swim and how a blast appears when he uses his blaster.

  • And we need to load an image for the trash pieces, and then code how they

  • move and what happens if they get hit by a blast, like, for example, total destruction

  • or splitting into 2 smaller pieces.

  • Finally, all these objects are floating in the ocean, so we need a piece of code to generate

  • the background.

  • The shape of this game's ocean is toroidal, which means it wraps around, and if any object

  • flies off the screen to the right, then it will immediately appear on the far left side.
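A toroidal world like this comes down to a couple of lines of modular arithmetic. Here's a minimal sketch, not the lab's actual code; the names `WIDTH`, `HEIGHT`, and `wrap` are made up for illustration:

```python
# Hypothetical sketch of toroidal wrap-around: positions are taken
# modulo the screen size, so an object leaving one edge immediately
# reappears on the opposite edge.
WIDTH, HEIGHT = 800, 600

def wrap(x, y):
    """Return (x, y) wrapped onto the toroidal screen."""
    return x % WIDTH, y % HEIGHT
```

So an object at x = 810 reappears at x = 10, and one that drifts past the left edge pops back in on the right.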

  • Every game needs some way to track how the player's doing, so we'll show the score too.

  • Now that we have all the pieces in place, we can actually build the game and decide

  • how everything interacts.

  • The key to how everything fits together is the run function.

  • It's a loop of checking whether the game is over; moving all the objects; updating

  • the game; checking whether our hero is okay; and making new trash.

  • As long as our hero hasn't bumped into any trash, the game continues.
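The loop just described can be sketched as a skeleton, with each step reduced to a stand-in counter so it runs on its own. The `Game` class and its method names are assumptions here, not the lab's PyGame code:

```python
# Illustrative skeleton of the run loop: check game over, move objects,
# check whether the hero is okay, and make new trash, until a collision.
class Game:
    def __init__(self):
        self.running = True
        self.frames = 0
        self.trash_count = 0

    def move_objects(self):
        pass                      # stand-in: move hero, blasts, trash

    def hero_is_okay(self):
        return self.frames < 50   # stand-in: collision check with trash

    def spawn_trash(self):
        self.trash_count += 1     # stand-in: add new trash over time

    def run(self):
        # Loop until the hero bumps into a piece of trash.
        while self.running:
            self.move_objects()
            self.frames += 1
            if not self.hero_is_okay():
                self.running = False  # hero touched trash: game over
            else:
                self.spawn_trash()
        return self.frames
```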

  • That's pretty much it for the game mechanics.

  • We've created a hero, a blaster, trash, and a scoreboard, and code that controls their

  • interactions.

  • Step 2 is modeling the AI's brain so John-Green-bot can play!

  • And for that, we can turn back to our old friend the neural network.

  • When I play games, I try to watch for the biggest threat because I don't want to lose.

  • So let's program John-Green-bot to use a similar strategy.

  • For his neural network's input layer, let's consider the 5 pieces of trash that are closest

  • to his avatar.

  • (And remember, the closest trash might actually be on the other side of the screen!)

  • Really, we want John-Green-bot to pay attention to where the trash is and where it's going.

  • So we want the X and Y positions relative to the hero, the X and Y velocities relative

  • to the hero, and the size of each piece of trash.

  • That's 5 inputs for 5 pieces of trash, so our input layer is going to have 25 nodes.
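Building that 25-value input vector might look like the following sketch. The dictionary layout for the hero and trash is hypothetical, and for simplicity this uses plain Euclidean distance rather than the toroidal distance the real game would need:

```python
# Sketch: 5 features for each of the 5 nearest trash pieces = 25 inputs.
def input_features(hero, trash_list):
    # Sort trash by squared distance to the hero and keep the 5 nearest.
    nearest = sorted(
        trash_list,
        key=lambda t: (t["x"] - hero["x"]) ** 2 + (t["y"] - hero["y"]) ** 2,
    )[:5]
    features = []
    for t in nearest:
        features += [
            t["x"] - hero["x"],    # relative X position
            t["y"] - hero["y"],    # relative Y position
            t["vx"] - hero["vx"],  # relative X velocity
            t["vy"] - hero["vy"],  # relative Y velocity
            t["size"],             # size of the trash piece
        ]
    return features  # 5 features x 5 pieces = 25 input values
```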

  • For the hidden layers, let's start small and create 2 layers with 15 nodes each.

  • This is just a guess, so we can change it later if we want.

  • Because the output of this neural network is gameplay, we want the output nodes to be

  • connected to the movement of the hero and shooting blasts.

  • So there will be 5 nodes total: an X and Y for movement, an X and Y direction for aiming

  • the blaster, and whether or not to fire the blaster.

  • To start, the weights of the neural network are initialized to 0, so the first time John-Green-bot

  • plays, he basically sits there and does nothing.
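In plain Python, the network shape described above (25 inputs, two hidden layers of 15 nodes, 5 outputs) with all-zero weights could be sketched like this. The hard-clip activation is an assumption, not necessarily what the lab uses:

```python
# Sketch of the 25 -> 15 -> 15 -> 5 network with zero-initialized weights.
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# One weight matrix per layer transition: 25->15, 15->15, 15->5.
weights = [zeros(25, 15), zeros(15, 15), zeros(15, 5)]

def forward(x, weights):
    for w in weights:
        # Matrix-vector product, then squash each value into [-1, 1].
        x = [sum(xi * w[i][j] for i, xi in enumerate(x))
             for j in range(len(w[0]))]
        x = [max(-1.0, min(1.0, v)) for v in x]  # hard-clip activation
    return x
```

With every weight at zero, all 5 outputs are zero no matter the input, which is exactly why the untrained bot just sits there.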

  • To train his brain with regular supervised learning, we'd normally say what the best

  • action is at each timestep.

  • But because losing TrashBlaster depends on lots of collective actions and mistakes, not

  • just one key moment, supervised learning might not be the right approach for us.

  • Instead, we'll use reinforcement learning strategies to train John-Green-bot based on

  • all the moves he makes from the beginning to the end of a game, and we'll evolve

  • a better AI using a genetic algorithm, which is commonly referred to as a GA.

  • To start, we'll create some number of John-Green-bots with empty brains

  • (let's say 200), and we'll have them play TrashBlaster.

  • They're all pretty terrible, but because of luck,

  • some will probably be a little bit less terrible.

  • In biological evolution, parents pass on most of their characteristics to their offspring

  • when they reproduce.

  • But the new generation may have some small differences, or mutations.

  • To replicate this, we'll use code to take the 100 highest-scoring John-Green-bots and

  • clone each of them as our reproduction step.

  • Then, we'll slightly and randomly change the weights in those 100 cloned neural networks,

  • which is our mutation step.

  • Right now, we'll program a 5% chance that any given weight will be mutated, and randomly

  • choose how much that weight mutates (so it could be barely any change or a huge one).

  • And you could experiment with this if you like.
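One minimal way to write that mutation step, assuming the weights are stored as nested lists. A uniform perturbation is an assumption here; the lab may draw mutations differently:

```python
import random

# Sketch of the mutation step: each weight independently has a `rate`
# chance (5% by default) of being nudged by a random amount in
# [-scale, scale].
def mutate(weights, rate=0.05, scale=1.0, rng=random):
    return [
        [w + rng.uniform(-scale, scale) if rng.random() < rate else w
         for w in row]
        for row in weights
    ]
```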

  • Mutation affects how much the AI changes overall, so it's a little bit like the learning rate

  • that we talked about in previous episodes.

  • We have to try and balance steadily improving each generation with making big changes that

  • might be really helpful (or harmful).

  • After we've created these 100 mutant John-Green-bots, we'll combine them with the 100 unmutated

  • original models (just in case the mutations were harmful) and have them all play the game.

  • Then we evaluate, clone, and mutate them over and over again.

  • Over time, the genetic algorithm usually makes AI that are gradually better at whatever they're

  • being asked to do, like play TrashBlaster.

  • This is because models with better mutations will be more likely to score high and reproduce

  • in the future.
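Putting the evaluate-clone-mutate cycle together, one generation of a genetic algorithm like this can be sketched as follows. `play_game` is a stand-in fitness function; in the lab it would run a full TrashBlaster match:

```python
import random

# Sketch of one GA generation: score everyone, keep the top half
# (100 of 200), clone and mutate the survivors, then combine the
# 100 originals with the 100 mutants.
def play_game(genome):
    return sum(genome)  # placeholder fitness, not a real game

def next_generation(population, rng=random):
    scored = sorted(population, key=play_game, reverse=True)
    survivors = scored[: len(population) // 2]        # top-scoring half
    mutants = [
        [g + rng.uniform(-0.1, 0.1) if rng.random() < 0.05 else g
         for g in genome]
        for genome in survivors                       # clone + mutate
    ]
    return survivors + mutants                        # old + mutant models
```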

  • ALL of this stuff, from building John-Green-bot's neural network to defining mutation for our

  • genetic algorithm, is in this section of code.

  • After setting up all that, we have to write code to carefully define what doing “better”

  • at the game means.

  • Destroying a bunch of trash?

  • Staying alive for a long time?

  • Avoiding off-target blaster shots?

  • Together, these decisions about what “better” means define an AI model's fitness.

  • Programming this function is pretty much the most important part of this lab, because how

  • we define fitness will affect how John-Green-bot's AI will evolve.

  • If we don't carefully balance our fitness function, his AI could end up doing some pretty

  • weird things.

  • For example, we could just define fitness as how long the player stays alive, but then

  • John-Green-bot's AI might play “TrashAvoider” and dodge trash instead of TrashBlaster and

  • destroy trash.

  • But if we define the fitness to only be related to how many trash pieces are destroyed, we

  • might get a wild hero that's constantly blasting.

  • So, for now, I'm going to try a fitness function that keeps the player alive and blasts

  • trash.

  • We'll define the fitness as +1 for every second that John-Green-bot stays alive, and

  • +10 for every piece of trash that is zapped.

  • But it's not as fun if the AI just blasts everywhere, so let's also add a penalty

  • of -2 for every blast he fires.
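Written out, the fitness rule above is a one-liner (the function name is ours, but the values match the ones just described):

```python
# Fitness: +1 per second alive, +10 per trash piece zapped,
# -2 per blast fired.
def fitness(seconds_alive, trash_destroyed, blasts_fired):
    return seconds_alive * 1 + trash_destroyed * 10 - blasts_fired * 2
```

So a bot that survives 10 seconds, destroys 3 pieces of trash, and fires 5 blasts scores 10 + 30 - 10 = 30.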

  • The fitness for each John-Green-bot AI will be updated continuously as he plays the game,

  • and it'll be shown on the scoreboard we created earlier.

  • You can take some time to play around with this fitness function and watch how John-Green-bot's

  • AI can learn and evolve differently.

  • Finally, we can move on to Step 3 and actually train John-Green-bot's AI to blast some trash!

  • So first, we need to start up our game.

  • And to kick off the genetic algorithm, we have to define how many randomly-wired John-Green-bot

  • models we want in our starting population.

  • Let's stick with 200 for now.

  • If we waited for each John-Green-bot model to start, play, and lose the game, this

  • training process could take DAYS.

  • But because our computer can multitask, we can use a multiprocessing package to make

  • all 200 AI models play separate games at the same time, which will be MUCH faster.
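With Python's standard `multiprocessing` module, evaluating many models at once can be sketched like this; `play_one_game` is a stand-in for an actual TrashBlaster match:

```python
from multiprocessing import Pool

# Sketch of parallel evaluation: each worker process scores models
# at the same time instead of one after another.
def play_one_game(genome):
    return sum(genome)  # placeholder: would return the model's fitness

if __name__ == "__main__":
    population = [[i, i + 1] for i in range(8)]  # 8 stand-in "models"
    with Pool(2) as pool:
        # pool.map farms the games out across worker processes.
        scores = pool.map(play_one_game, population)
    print(scores)  # one fitness score per model
```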

  • And this is all part of the training.

  • This is where we'll code in the details of the genetic algorithm, like sorting John-Green-bots

  • by their fitness and choosing which ones will reproduce.

  • Now that we have the 100 John-Green-bots that we want to reproduce, this code will clone

  • and mutate them so we have that combined group of 100 old and 100 mutant AI models.

  • Then, we can run 200 more games for these 200 John-Green-bots.

  • It just takes a few seconds to go through them all thanks to that last chunk of code.

  • And we can see how well they do!

  • The average score of the AI models that we picked to reproduce is almost twice as high

  • as the overall average.

  • Which is good!

  • It means that John-Green-bot is learning something.

  • We can even watch a replay of the best AI.

  • Uh... even the best isn't very exciting right now.

  • We can see the fitness function changing as time passes, but the hero's just sitting

  • there not getting hit and shooting forward - we want John-Green-bot to actually play,

  • not just sit still and get lucky.

  • We can also see a visual representation of this specific neural network, where higher weights

  • are represented by the redness of the connections.

  • It's tough to interpret what exactly this diagram means, but we can keep it in mind

  • as we keep training John-Green-bot.

  • Genetic algorithms take time to evolve a good model.

  • So let's change the number of iterations in the loop in STEP 3.3, and run the training

  • step 10 times to repeatedly copy, mutate, and test the fitness

  • of these AI models.

  • Okay, now I've trained 10 more iterations.

  • And if I view a replay of the last game, we can see that John-Green-bot is doing a little

  • better.

  • He's moving around a little and actually sort of aiming.

  • If we keep training, one model might get lucky, destroy a bunch of trash, earn a high

  • fitness, and get copied and mutated to make future generations even better.

  • But John-Green-bot needs lots of iterations to get really good at TrashBlaster.

  • You might consider changing the number of iterations to 50 or 100 per click,

  • which might take a while.

  • Now here's an example of a game after 15,600 training iterations; just look at John-Green-bot

  • swimming and blasting trash like a pro.

  • And all this was done using a genetic algorithm, raw luck, and a carefully crafted fitness function.

  • Genetic algorithms tend to work pretty well on small problems like getting good at TrashBlaster.

  • When the problems get bigger, the random mutations of genetic algorithms are sometimes, well,

  • too random to create consistently good results.

  • So part of the reason this works so well is because John-Green-bot's neural network

  • is pretty tiny compared to many AIs created for industrial-sized problems.

  • But still, it's fun to experiment with AI and games like TrashBlaster.

  • For example, you can try to change the values of the fitness function and see how John-Green-bot's

  • AI evolves differently.

  • Or you could change how the neural network gets mutated, like by messing with the structure

  • instead of the weights.

  • Or you could change how much the run function loops per second, from 5 times a second to

  • 10 or 20, and give John-Green-bot superhuman reflexes.

  • You can download the clip of your AI playing TrashBlaster by looking for game_animation.gif

  • in the file browser on the left-hand side of the Colaboratory file.

  • You can also download source code from Github to run on your own computer if you want to

  • experiment (we'll leave a link in the description).

  • And next time, we'll start shifting away from games and learn about other ways that

  • humans and AI can work together in teams. See ya then.

  • Crash Course AI is produced in association with PBS Digital Studios.

  • If you want to help keep Crash Course free for everyone, forever, you can join our community

  • on Patreon.

  • And if you want to learn more about genetics and evolution check out Crash Course Biology.
