  • Creating an AI Musician with JavaScript

  • Thomas Drach

  • KATIE: Hello? There we go. That's me. Good

  • morning. Whoo! Is everybody ready? Yeah. We have a really, really cool talk to start the

  • day with. Thomas Drach is here. And when I, you know, he gave me some fun facts. But actually

  • we had a really fascinating conversation just now that I'm gonna share with you. So, back

  • in the '90s, the first two CDs that I ever bought. The first was DJ Jazzy Jeff and the

  • Fresh Prince, the rapper, parents don't understand, which was amazing. And I bought a Rick Astley CD

  • with a famous song on it that people use to Rickroll each other. And Thomas said his first

  • tape was NSYNC. Clearly his musical taste is much better than mine. Let's give it up for

  • Thomas Drach. [ Applause ]

  • THOMAS: Good morning. Thanks for being here early for this talk. I want to give a huge

  • thanks to JSConf for having me. They have been awesome. So grateful to be here. I also

  • want to thank any open source contributors. Because I feel like I'm using cheat codes sometimes,

  • just like using someone else's code. If you're anything like me, I may have squirmed a little

  • bit at this AI acronym. And that's good. We're going through that today. But just your willingness

  • to be here shows you're open to the ideas and pushing the boundaries. I think that's

  • commendable. Thanks for being here. Okay. My name's Thomas Drach like we talked about.

  • Thomasdrach on Twitter if you want to bug me there, please do. I consider myself a designer

  • and a bit of a hacker. Not in the sense of this awesome movie. But in the writing bad

  • code sense. I'm really good at that. I have a little design studio called Subtract where

  • I make what I hope are useful products. One of which we'll go through today. And I have

  • a product called Cleverstack too if you want to check that out. So, I want to start with

  • a man called Paul Thomas. He had a little garage in Phoenix called the Thomas Brothers

  • Garage. Today people might call him an entrepreneur or a founder. But he was actually an inventor.

  • He filed for over a hundred patents and I'm still trying to track most of them down. I

  • found about a dozen of them. Most reminded me of the Rube Goldberg machines. I don't

  • know if you have seen these before. They're simple machines that create this weird trap-like

  • mechanism. All the patents I found from him were complex machines like that. Like

  • nothing you would actually use. It's kind of hard to see here. But the name of this

  • one is panel manufacturing method. And it's basically a patent for this giant machine

  • that creates these like brick and concrete slabs, and then they were supposed to ship them

  • to job sites for people to build houses with them. Kind of made me sad a little bit

  • because it just seemed like a crazy person was documenting and patenting all of these

  • things. But the machines did get built. This is the one that we were just talking about.

  • The manufacturing machine to make the giant slabs. And they actually had to design and

  • patent these, like, semitrucks to ship them to job sites. They had these weird little

  • trucks that would like transport like these pallets of bricks everywhere. And clearly

  • they didn't invent the screw or the lever or the rack and pinion or these things, but

  • they combined something to make something new and useful. And it's especially interesting

  • to me because this man was my great grandfather and my namesake. Some of you might be familiar

  • with this Henry Ford quote. The funny thing is, Henry Ford never said this. I did research

  • and the first time it was attributed to him was in 1999 in the Cruise Industry News Quarterly.

  • And other people started using it and now people say it on stages like this. Sometimes

  • it's paired with like the Steve Jobs quote and kind of creates this genius complex

  • of, they didn't know what they want. We have to show them or whatever. But I think there's

  • something that people miss about this quote in particular. I think it resonates for a

  • reason. But my interpretation of this quote is, big progress isn't necessarily just like

  • an iteration of the last thing, but it's like a mutation of something that happened before.

  • Maybe a little bit like this. We could accidentally combine a few unrelated things to find something

  • new. This is Tim Berners-Lee talking about inventing the Internet. He said I just had

  • to take the hypertext idea and connect it to the TCP and DNS ideas and, ta-da, I

  • had the World Wide Web. There's an old LinkedIn, his profile page, it just said like web developer.

  • But he goes on this interview, and I recommend listening or reading interviews from him.

  • He goes on just to attribute all these other inventions and says if these didn't happen,

  • the Internet, at least I wouldn't have created it at that time. I don't know what would have

  • happened. So, the definition of mutation in the dictionary is the changing of structure,

  • resulting in a variant form that may be transmitted to subsequent generations. Hendrix famously

  • took right handed guitars, flipped them upside down and then eventually changed music. And

  • he did so in part because, before that, Les Paul invented the electric guitar. They

  • didn't invent it to invent it, but because they wanted the acoustic guitar louder. And

  • Grace Hopper is one of the inventors of what we now call programming. A big reason we're

  • here today. And she started with knobs and switches on the Mark I. All of these were

  • mutations. Like it was different enough. Hopper's was a mutation. And I think AI is a bit

  • of a mutation, at least how we talk about it today. There's much more data, advances

  • in machine learning, compute power thanks to Moore's Law. And it kind of created the

  • opportunity for something like AI to work. This is what I get when I search "AI" on Google.

  • I don't know about you. But this isn't very helpful for me. So, I'll ask a little bit

  • different question today. I want to ask, what are intelligent machines? We might be able

  • to define this. Just intelligence + machines. So, let's define intelligence. This is a quote;

  • I'm just going to read it really quick. "People generally distrust the concept of machines

  • that approach (and thus why not pass?) our own human intelligence." I think a lot of people

  • feel like this today. And this quote was actually written in 1970 in the book called The Architecture

  • Machine, actually by the person who founded the MIT Media Lab. And it goes on to say that

  • machines must be aware of their context in order to be intelligent. So, you can't have

  • like a machine without using the context, interacting with the world. It's not intelligent

  • in that case. There's no lack of context in the new Tesla roadsters. So, for our purposes,

  • I'm just gonna say intelligence means using context. So, now we can define machines.

  • This should be pretty easy. We go to the dictionary and find a mechanically, electronically, or

  • electrically operated device for a task. Sounds good to me. Okay. So, with intelligence and

  • machines defined, I would like to introduce you to the concept of somewhat intelligent

  • machines. And this is what we're gonna build today. And this is just something that uses

  • context and rapidly completes something that a human could not. And we're gonna do all

  • of it in JavaScript. So, this is the actual machine instrument, musician, AI, whatever

  • you want to call it. This is what we're gonna build today. I'm going to walk through how

  • to generate drumbeats using pre-trained machine learning models, APIs, libraries, stuff like

  • that. We're going to piece it together. And I find it a little bit hard for me to follow

  • tiny code, so it's gonna be a little pseudocode-y. Like I said, the first thing we needed

  • were a couple libraries. We're going to use Magenta. If you haven't heard of Magenta already,

  • please check it out. It's incredible. A couple of people have talked about it already here

  • at JSConf. And then we're going to use Tone, which is actually a dependency of Magenta, which

  • gives us an easier to code interface for musical stuff. All right. Let's play some drums. This

  • is what the data structure for the drums will look like. You can set up a step sequence,

  • but this is a step sequence in Magenta. There's a pitch, there's an attribute that tells it it's

  • a sample-based pitch, not a tonal keyboard-like thing. And there's quantization info.

  • There's a method that does that for you so you don't have to worry about it. Okay. So,

  • all we need to play that note sequence is two lines of code. We're going to create a

  • new instance of the Magenta music player. And I'm going to call player.start on that.

  • And we're gonna get something like this. [drumbeats]
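
A minimal sketch of the drum data structure described above, written as a plain JavaScript object in the shape Magenta.js uses for quantized note sequences (pitch, a drum flag, quantization info). The helper name makeDrumSequence and the specific pattern are assumptions for illustration, as are the General MIDI drum pitches (36 = kick, 38 = snare, 42 = closed hi-hat); the talk does not show the exact values. Playback needs a browser with @magenta/music loaded, so the two player lines appear only as a comment.

```javascript
// Hypothetical helper (not part of Magenta): builds a quantized drum
// note sequence from a map of MIDI drum pitch -> steps on which it hits.
function makeDrumSequence(hitsByPitch, totalSteps = 16, stepsPerQuarter = 4) {
  const notes = [];
  for (const [pitch, steps] of Object.entries(hitsByPitch)) {
    for (const step of steps) {
      notes.push({
        pitch: Number(pitch),        // which drum sample to trigger
        quantizedStartStep: step,    // position on the step grid
        quantizedEndStep: step + 1,  // drum hits last one step
        isDrum: true,                // sample-based, not a tonal pitch
      });
    }
  }
  // Order the hits by time so the sequence reads left to right.
  notes.sort((a, b) => a.quantizedStartStep - b.quantizedStartStep);
  return {
    notes,
    quantizationInfo: { stepsPerQuarter },
    totalQuantizedSteps: totalSteps,
  };
}

// Assumed pattern: kick on steps 0 and 8, snare on 4 and 12, hi-hat on
// every other step of a 16-step bar.
const drumPattern = makeDrumSequence({
  36: [0, 8],
  38: [4, 12],
  42: [0, 2, 4, 6, 8, 10, 12, 14],
});

// In a browser with @magenta/music available as `mm`, the two lines
// mentioned in the talk would look roughly like this:
//   const player = new mm.Player();
//   player.start(drumPattern);
```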

  • This is just our basic pattern that we plugged in. Right? It's not that exciting. We kind

  • of want something a little bit more like this. Like feed it in and we'll get something better

  • in the material. All right. But in order to do that, let's do a super quick ML crash course.

  • I am not the one to go in depth about this. But let's all get on the same page. Okay.

  • So, we usually write functions something like this. We want something, we put something

  • in, we want something back. Machine learning, it's a little bit more abstract, right? We

  • don't know abstractly like what we, how we would get there. Like, is this image a dog?

  • Here's the image. I don't know. Some of you might see

  • [ Laughter ] memes like this. This is one of my favorites.

  • Chicken fingers and the goldendoodles, I think. I don't know how dogs and cats became like

  • the hello world of machine learning. But I'm not mad about it. So, here's what we call

  • training data. If it was training data, it was probably labeled. So, this is a dog. I

  • probably should have said like fried chicken. That's not an actual chicken. So, you feed

  • all that to the machine. The machine says all these are dogs. They have this weird odd

  • thing on their face. We call that a feature. That feature to us looks like a nose. The

  • machine goes, okay, there's a nose. It's probably a dog. So, we feed them the image and it's

  • gonna guess, dog. All of these are just like probabilities. For our purposes, we want to

  • give it some drums and we want some better drums in turn. So, that's where Magenta comes

  • in. Magenta has a couple different models available. All of these are super cool and

  • it seems like they're coming out with more every week, every month. So, there's a MusicRNN

  • model, a MusicVAE and a Piano Genie. Right now the Piano Genie is a VAE as well. Just

  • quick, RNN stands for recurrent neural network, a bunch of nodes. It's like one of those.

  • But it loops through itself. And a VAE is a variational autoencoder. If you're familiar

  • with encoding and decoding, it works similarly to that. For our purposes, we're going to

  • use the MusicRNN models just in the context of Magenta. They have a little bit better

  • support for like individual instruments like drums. And this is kind of what that might

  • look like. So, if you have nodes on the network, you have it looping through itself and you

  • have an in and an out. For us, we're going to put in our initial drum beat and we're

  • going to expect a generated drum beat in return. Okay. So, we picked our MusicRNN models. This

  • is what the actual checkpoint is. So, this is like a pre trained model. Trained with

  • millions of drumbeats and it has a sense of what drumbeats are. There's a kick on one,

  • there's a snare on two or something like that. So, these are the three lines of code that

  • we need to generate a new drumbeat. So, you just create a new instance of our MusicRNN

  • model, the checkpoint that we had. We initialize the model, it loads itself up. And then we

  • call this method continueSequence. We feed it in our note sequence, feed the number of

  • steps which is kind of arbitrary. Could be 16, 32 or whatever. And then we feed it a

  • number from zero to two. We'll go over the temperature a little bit later. So, after

  • we do that, we just get a sample in return and we play the same way we played the other

  • one. So, this is what that looked like. This is gonna be a generated beat with a temperature

  • of 1.5. [drumbeat]
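
The three lines described above can be sketched roughly as follows. This assumes the MusicRNN class from @magenta/music and its initialize and continueSequence methods; the checkpoint URL is an assumption (the talk does not name it), and the wrapper function generateBeat is a hypothetical name. The Magenta module is passed in as a parameter so the function carries no hidden dependencies.

```javascript
// Assumed location of Magenta's hosted drum checkpoint; the talk does
// not say which checkpoint URL was used.
const DRUMS_CHECKPOINT =
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/drum_kit_rnn';

// Hypothetical wrapper. `mm` is the @magenta/music module (or the global
// it defines when loaded from a script tag); `seed` is a quantized drum
// note sequence like the one the talk starts from.
async function generateBeat(mm, seed, steps = 32, temperature = 1.5) {
  const rnn = new mm.MusicRNN(DRUMS_CHECKPOINT);          // 1. create the model
  await rnn.initialize();                                 // 2. load the checkpoint
  return rnn.continueSequence(seed, steps, temperature);  // 3. generate new steps
}

// Browser usage, played the same way as the original pattern:
//   const beat = await generateBeat(mm, drumPattern);
//   new mm.Player().start(beat);
```

Calling it again with the same seed would be expected to give a different beat each time, since the model samples its output rather than looking it up.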

  • And if you generate it again, it's going to come up with new beats that we've never heard.

  • Cool. So, yeah. [ Applause ] All right. That was cool. But

  • it was a little bit of a black box. So, I want to go through what happens when we call that

  • continueSequence. We call it here in the three lines of code. All we're

  • gonna do here is what's happening behind the scenes is we're gonna convert the note sequence

  • which is that drum thing. We're going to convert it to a Tensor. And then we're going to encode

  • the tensor to match the model, the checkpoint that you have. If you're wondering what a

  • tensor is, you probably already know. If you remember school math, there's scalars and vectors.

  • A tensor has three dimensions. That's why you hear the word shape when talking about

  • machine learning. And especially TensorFlow. These are all... but the last is a tensor. And

  • then there's an internal method called sampleRNN. The inputs go into the TensorFlow library

  • and it generates the next notes. If you want to get into the nitty gritty, TensorFlow.js

  • is a great place to actually get your hands dirty there. It helps me to visualize like

  • this. Once more, the continueSequence. And the note here, the noteSequence, convert it

  • to a tensor, encode it for the model, call the sampleRNN and get the new drums. I told you we were gonna talk about

  • temperature. It's interesting to me because it's one of the few inputs we have available.

  • You could retrain it with different drumbeats, which is kind of cool. But temperature is

  • like the level of entropy in the system. So, the lower the temperature, the more predictable

  • result we're gonna get. The higher, the less predictable it will be. So, just as an example,

  • I'll drop it down here to 0.2. Sounds like really similar to the original

  • drumbeat. And if we keep generating it, it's pretty much like the same, right? So, now

  • we're gonna try cranking it up to 1.5. [Drumbeats]
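
To make the temperature idea concrete, here is a small sketch of the general technique of temperature-scaled softmax, using the two values from the demo (0.2 and 1.5). It illustrates the concept only and is not taken from Magenta's source; the function name and the example scores are made up for this illustration.

```javascript
// Turns raw model scores (logits) into probabilities, after dividing
// them by the temperature. Low temperature sharpens the distribution;
// high temperature flattens it.
function softmaxWithTemperature(logits, temperature) {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Made-up scores for three possible next drum hits.
const logits = [2.0, 1.0, 0.1];

const cool = softmaxWithTemperature(logits, 0.2); // top choice gets ~0.99
const hot = softmaxWithTemperature(logits, 1.5);  // top choice gets ~0.56

console.log(cool.map((p) => p.toFixed(2)));
console.log(hot.map((p) => p.toFixed(2)));
```

At 0.2 nearly all the probability lands on the most likely hit, which fits the generated beat sounding almost like the original. At 1.5 the less likely hits get a real chance, so the result is less predictable.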

  • So, a little bit more exciting, for sure. This is the temperature I like. More fun.

  • And after we do that, we just have like a little demo button here. It will generate

  • a new file. And then sometimes what I will do is I will drop it in this, GarageBand,

  • and use it as a musician for my band. If you're wondering why there's no audio right now,

  • it's because we're not judging my music skills today. We're talking about JavaScript. Okay.

  • So, that was cool. It was like almost somewhat intelligent. I wanted to take it one step

  • further. So, I wanted to give the machine a little bit of motivation with applause.

  • So, depending on how much you applaud it, the machine would then generate a new temperature.

  • Here we go. The machine would generate a new temperature based on like the average amplitude

  • of a couple seconds period of time. I wanted more context to have a better definition of

  • our somewhat intelligent machine. So, I literally injected more context into it. So, this is

  • pretty simple. I'm just getting the user's microphone. And I have this little method

  • here called analyze sound. I'm going to use createScriptProcessor and just take the

  • average volume over a couple seconds. Okay. And against my better judgment, we're gonna

  • do a live demo. Okay. So, this is the drumbeat. That's the normal drum beat we programmed

  • in. Then we can generate one. Drop it down a little bit. So, this is like pretty cool.

  • Generating a new one every time. Okay. Now I need your help. So, I've created

  • this little perform feature. When I click the button, it's going to wait for applause

  • for a couple seconds and then it's gonna take that average amplitude over that period of

  • time, decide on what temperature to play, and then generate the beat based on that.

  • I promise I'm not trying to manufacture applause for myself. Maybe a little bit. Okay. So,

  • let's try this. On the count of three, be like nice. But loud. I'm... on the count of

  • three, start applauding. I'm going to hit the button right after you start applauding

  • and then we'll see what happens. Live demos always work. So, this should be great.

  • On the count of three, one, two, three. [ Applause ]
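
A hedged sketch of the applause step: average the microphone amplitude over the listening window, then map it onto a temperature. The function names, the loudness ceiling of 0.5, and the floor of 0.1 (to avoid a temperature of exactly zero) are assumptions for illustration; the talk only says the result runs from about zero to 2. The microphone wiring uses standard Web Audio calls and is shown as a comment because it needs a browser.

```javascript
// Mean absolute value of a block of audio samples (each in [-1, 1]).
function averageAmplitude(samples) {
  if (samples.length === 0) return 0;
  let total = 0;
  for (const s of samples) total += Math.abs(s);
  return total / samples.length;
}

// Linearly maps an average amplitude onto a temperature range.
// `loudest` is an assumed amplitude that counts as full applause.
function amplitudeToTemperature(avg, loudest = 0.5, minTemp = 0.1, maxTemp = 2.0) {
  const ratio = Math.min(Math.max(avg / loudest, 0), 1); // clamp to [0, 1]
  return minTemp + ratio * (maxTemp - minTemp);
}

// Browser wiring (not run here). createScriptProcessor is deprecated in
// favor of AudioWorklet, but it is the call named in the talk:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const ctx = new AudioContext();
//   const source = ctx.createMediaStreamSource(stream);
//   const processor = ctx.createScriptProcessor(2048, 1, 1);
//   processor.onaudioprocess = (e) => {
//     const avg = averageAmplitude(e.inputBuffer.getChannelData(0));
//     // collect `avg` for a couple of seconds, then average again
//   };
//   source.connect(processor);
//   processor.connect(ctx.destination);

const quietRoom = amplitudeToTemperature(averageAmplitude([0.01, -0.02, 0.01]));
const loudRoom = amplitudeToTemperature(averageAmplitude([0.6, -0.7, 0.5]));
console.log(quietRoom.toFixed(2), loudRoom.toFixed(2));
```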

  • Yeah! All right. So, that's that. So, it actually goes from like zero to 2. It goes up pretty...

  • it still is morning. But you're being considerate. I'm fine with that. Cool. So, that's that.

  • That is our somewhat intelligent machine. So, did we use context? I think so. We put

  • in our drumbeat. We took applause. We told it the steps we wanted. It definitely rapidly

  • completed something that we couldn't do on our own, right? We can generate like a dozen

  • or so drumbeats just in a couple seconds. So, I think we did it. Other people have created

  • some really cool things. This is called the Neural Arpeggiator. Usually you play a couple notes,

  • an arpeggio, bounce back and forth. But this will take the temperature into effect. It

  • uses the improvRNN model from Magenta. I really like it. The Magenta team created

  • kind of like what we just did, but inside of Ableton. If you use it, you can do what

  • we just did, and generate right inside. And the Flaming Lips actually created this thing

  • called the... the Flaming Lips and Magenta created this thing called Fruit Genie.

  • And it was fruit, but it would say like orange and it would feed the model. And then they

  • created like these giant pool toy type things that had sensors on them and then threw them

  • into the audience and asked people to feed it into the same model and create this, like,

  • melody. This is a little clip of what that looked like.

  • >> Written this song especially for tonight's occasion.

  • THOMAS: So, they threw out these things into the audience. And people... and you could hear

  • it in the melody like cycle back. There's not like...

  • >> Apple

  • THOMAS: So, all of these things, all the stuff

  • we just talked about. All of it was just Tone.js and Magenta and we created our own as well.

  • We used a couple other previous inventions, sure. But that was kind of the point, right?

  • Combining these simple machines to kind of create something more complex. We didn't reinvent

  • the wheel, by any means. We didn't have to. We just created something a little bit smarter

  • than it was before with the tools that we had at our disposal. I think we can keep doing

  • this. We can keep like flipping our tools and creating things that are new and useful

  • for people and helpful and interesting. And hopefully the inventions that we piece together,

  • the sum will be greater than its parts. This is such an exciting time to be building stuff.

  • And I can't wait to see what we all build next. So, thank you.

  • [ Applause ]

  • KATIE: Wow. Oh, my gosh, all right, I'm gonna

  • gush for a second about the Flaming Lips. They're one of my favorite bands. I've seen

  • them live four or five times. If you haven't seen them, even if you don't particularly

  • love their music, it's an amazing experience. You should go and do it. I'm going to stop

  • gushing about the Flaming Lips and now I'm going to gush about Thomas. That was really

  • cool and I really love his message that, you know, like he's not some kind of crazy genius.

  • He's just like a person who is really into music and really wanted to try something cool.

  • And that we all could do this with JavaScript. It's like amazing, right? Anyway, so, coming

  • up next we have Sophia Shoemaker, who is going to talk about building a PWA that had to work

  • off the grid in an African country which I can't remember which one, but we need to be

  • back here at 10:30 for that. So, you have a couple minutes to go out and switch rooms

  • if you want. But you shouldn't. You should stay here. All right. Thanks, everybody.

  • [ Applause ]
