
  • All right.

  • So this is simultaneously really impressive and really frightening.

  • And it's hitting me in ways that I didn't really expect.

  • So, do you remember Will Smith eating spaghetti?

  • Do you remember when this was what AI-generated videos looked like?

  • Remember when we said, "OK, this AI stuff is cool and all, but clearly there's a long way to go before there's any need for concern"?

  • Well, welcome to the future, people, because this is also an AI-generated video.

  • And so is this. Completely synthesized out of thin air by computers.

  • This one too.

  • This is not real. Absolutely ridiculous, how far we've come in literally one year.

  • This does feel like another ChatGPT or DALL-E moment for AI, and maybe I'm overreacting because, OK, I'm a video creator.

  • So an AI that's actually doing my job, maybe that feels a little more threatening.

  • So I'm particularly impressed by it.

  • But also this stuff is really good.

  • So today, Sam Altman and OpenAI announced a new model called Sora, and it can generate video clips of up to one minute from just text input.

  • So the same way DALL-E was able to understand our text input and turn it into a photorealistic or stylized image or whatever you want.

  • Same thing with Sora.

  • But now since it's videos, it also needs to "understand" how all these things like reflections and textures and materials and physics all interact with each other over time to make a reasonable looking video.

  • And of course, right away, there's a bunch of examples on their website that are crazy.

  • Now, before I show you these, I just need you to keep this in mind.

  • You're about to watch a bunch of AI-generated videos and you know that you're about to watch a bunch of AI-generated content.

  • So your brain is already looking for this stuff, and it's not perfect.

  • You will find imperfections but not everybody who sees AI-generated content on the internet knows to be looking for that.

  • So also keep that in mind.

  • This is also the worst that this technology is going to be from here on out.

  • So, OK, here's one of the videos. There's no audio to any of these clips, but the prompt for this one is "a stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage."

  • She wears a black leather jacket, a long red dress and black boots.

  • This video is already miles ahead of where we were.

  • It has accurate lighting, it has materials, it has skin tones, movements, even has reflections all over the place.

  • Now, of course, if you look at it for more than about 10 seconds very closely, there are lots of giveaways like this dude in the background kind of looks like he's gliding in a weird way.

  • The frame rate of the reflections in the water is for some reason lower than the rest of the video.

  • The camera movement overall is just a bit inconsistent and it just, I don't know, it just kind of feels a little bit off. But then again, this is where we were one year ago.

  • So just keep that in the back of your head for all this.

  • OK. How about this one?

  • This is another one which has a long prompt about a camera following behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road.

  • This is also, again, really good.

  • It kind of looks a little more video gamey because of how rock solid the drone footage is but clearly very usable.

  • Here's another one: a litter of golden retriever puppies playing in the snow, their heads popping in and out of the snow, covered in it.

  • It's so good.

  • It feels like the physics of the fur and the ears and everything with the snow flying around in slow motion is incredible.

  • I've looked through all the sample videos on OpenAI's website and clearly these are the hand picked best ones that they chose to share where they just put in some text and then get a video and don't modify it.

  • But there's really impressive stuff in there.

  • Some of it has humans, some of it doesn't, some of it is more realistic feeling like the truck driving one.

  • But some of them are more video gamey or more stylized.

  • A lot of it is slow motion.

  • I just have to say how insanely fast these models are improving is genuinely like, that's the shocking part.

  • Like I remember not even that many months ago, DALL-E 3 was really, really high-end, and you could always still find something off about it.

  • Like, especially if you asked it for something like a photorealistic image of a human, something about the hands or the ears would always just be a little bit off, never mind the physics.

  • But even this video here is crazy at first glance. The prompt for this AI-generated video is a young man in his twenties is sitting on a piece of a cloud in the sky reading a book.

  • This one feels like 90% of the way there for me.

  • Like it's beyond the Uncanny Valley of like Apple's Personas which are actually based on humans.

  • This is a made-up person.

  • I mean, his eyes are kind of weird and the motion of the pages in the book is kinda odd, and yeah, obviously he's in a cloud and that's a giveaway. But the lighting and the shadows and the skin tones, and then all the realism of the textures on the shirt and the way the shirt and the pants move and the hair, they're all really impressive.

  • And then for this one, they typed in a movie trailer featuring the adventures of a 30-year-old spaceman wearing a red wool-knitted motorcycle helmet.

  • Blue sky, salt desert, cinematic style, shot on 35-millimeter film. And with the close-ups of his face, the fabric on the helmet, the film grain through every shot, and the cinematic style, this is one of the most convincing AI-generated videos I've ever seen.

  • Minus maybe the weird physics of that dude walking kind of in fast motion.

  • So Sam Altman, if you follow him on Twitter, he's going through a whole bunch more of like people's requests and posting a bunch more generated videos.

  • And so if you want to check out his profile, you can see those.

  • But here's the thing about these AI-generated videos, as good as they've gotten at this point.

  • They can and will pass as real videos to people who are not looking for AI-generated videos.

  • Now that is obviously insanely sketchy during an election year in the US and also terrifying for a bunch of other internet related reasons.

  • But it's also perfect for stock footage.

  • Like there are already all kinds of presentations and advertisements and PowerPoints that are in need of oddly specific stock videos, and these AI-generated videos are already good enough to 100% pass for that purpose.

  • Like look at this one, this one with the waves at Big Sur, this drone shot.

  • Honestly, if I saw this on Twitter, I wouldn't even think twice.

  • I'd be like, "Oh nice drone shot, dude."

  • Wouldn't even think about AI if I wasn't pixel peeping at like the way the water was moving like this, this is a totally usable video in an ad for some California-based product.

  • And that has all sorts of implications for the drone pilot who no longer needs to be hired, for all the photographers and videographers whose footage no longer needs to be licensed to show up in that ad that's being made.

  • It's already that good.

  • There's other stuff like this wall of TVs, which would be a totally expensive and difficult thing to shoot with a camera and all these old expensive props.

  • But if you can just generate it this well, with reflections and the environment and everything else around it, I mean, why do it any other way?

  • It's also very capable of historical themed footage.

  • So this is supposed to be California during the gold rush.

  • It's AI-generated but it could totally pass for the opening scene in an old western with the right music over it.

  • How long until an entire ad, every single shot is completely generated with AI or what about an entire YouTube video or an entire movie?

  • I'm tempted to say, like, we're a long way away from that because, you know, this still clearly has flaws, and there's no sound, and there's a long way to go with the prompt engineering to iron these things out.

  • But then again, the spaghetti was like a year ago.

  • Now, I actually like that OpenAI, on their website, shows some of the downfalls of this particular model too, because who would know better than the people who have been using it?

  • This is a very private tool, by the way, right now, it's in super limited access.

  • So it's in the hands of red teamers, which basically means people testing it, pushing the limits, trying to break it and a few trusted creators.

  • But they have found plenty of weird edge stuff, like this clip here of a bunch of gray wolf pups. It looks normal at first, but then it's pretty clear that something's kind of off with the way they're just kind of appearing out of nowhere and walking through each other.

  • That's kind of weird.

  • Or this clip of a guy running on a treadmill, which I mean, I don't really have to say much more about why this one's weird.

  • But this is my favorite one.

  • So again, just try to put yourself in the mind of someone who's not expecting AI. You're just scrolling through Facebook or Twitter or something, right?

  • So you just see this video.

  • So, first, I just want you to watch this clip as if it's just a stock video you found of a grandma celebrating her birthday, and just try to think, "I wonder what birthday she's celebrating."

  • Right?

  • I don't know. How old do you think she is?

  • 60? 65?

  • Maybe it's the big 70. She seems to really like that cake.

  • Now, did you see it?

  • Did you catch that?

  • I'm gonna play it again.

  • But this time watch the video knowing that AI-generated photos and videos have trouble accurately doing hands.

  • I'll play it again, and now it feels super obvious. Like, every time you watch it, watch a different set of hands; it gets weirder and weirder.

  • You can watch it like five times and there's dead giveaway after dead giveaway.

  • Not even mentioning the weird inconsistencies with the direction of the wind on the candles.

  • But even as I'm saying, all that, even as it's coming out of my mouth, I can't help but remember that 12 months ago, we were critiquing this.

  • So what does this all mean?

  • Well, I mean, there's what it means now and there's what it means for the future.

  • Now, Sora, this thing that they've made, is clearly a really impressive video-generation AI tool that is both going to fool people and also be very useful.

  • There's also a watermark in the bottom corner of every video generated by it.

  • So if you see one of those videos and ideally it hasn't been cropped out, then that's at least a pretty clear indicator that it's AI-generated.

  • It's a Sora video.

  • But I also do think they're going to have to be very careful with this.

  • They're gonna have a whole bunch of safety stuff to keep in mind.

  • I think they'll probably have to be even safer than DALL-E.

  • Like you shouldn't be able to generate people's likenesses.

  • Like, you shouldn't be able to make a politician look like they're doing something on video, especially this year. You probably won't be able to make Will Smith eating spaghetti.

  • But it also definitely means stock video generation is absolutely going to put a dent in video licensing.

  • Like I can basically guarantee that.

  • Like, logistically, why would anyone making something pay for footage of a house in the cliffs when they can generate one for free or for a small subscription price?

  • Like that is the real scary part of what this tool implies.

  • But in the future, it gets pretty existential, man.

  • I mean, OK, if this is trained on all videos that have ever been made by humans, then surely it can't be innovative or creative in ways that humans haven't already been, right?

  • I don't know, either way, I'll have all the links below for all the Sora stuff, for OpenAI stuff.

  • And I guess I'll talk to you next year when we look back and go, "Remember that first version of Sora and how bad those wolf pups looked when they spawned out of nowhere?"

  • Just remember, this is the worst that this technology is going to be from here on out.

  • Thanks for watching.

  • Catch you in the next one.

  • Peace.
