Hi, I'm John Carmack.
I'm the Chief Technology Officer for Oculus.
I work on virtual reality.
I've been challenged today to talk about one concept at five levels of increasing complexity.
So we're going to be talking about reality and virtual reality, what the technology allows us to do today, what it may allow us to do in the future, and whether that should even be our goal to approximate reality.
So do you know what virtual reality is?
Yes, it's simple.
It's like a video game, except it feels like you're in the video game.
That's actually a really good description.
The idea is that if you've got a system here that can make you see whatever we want you to see, then we can make you believe that you're anywhere else, like on top of a mountain, or in a dungeon, or under the ocean.
Or in Minecraft.
Yeah, or in Minecraft.
When you look at a TV on the wall there, like a picture of a mountain or something, how can you tell that it's not just a window and there's something else behind it?
Because it doesn't always look quite right.
If you have a static picture of a person on a screen and you move around like this, it's not really changing.
And it's interesting.
Those are the things we have to figure out ways around to fool you in virtual reality.
We need to figure out when you look at something in reality, how can you tell whether it's real or not?
Have you been to a 3D movie where you put on the little glasses?
Um, yeah.
So the trick for that is, if you're ever at a theater and you take off the glasses and look at the screen, you'll see it's blurry, because there are actually two pictures being shown at the same time.
And what those little glasses do is they let one eye see one picture and the other eye see a different picture.
So then your eyes can say, oh, it looks like I'm seeing right through the screen, or something is floating out in front of it.
In the VR headsets, what we do is there's actually either two screens or one screen split in half so that it draws a different picture, a completely different picture for each eye.
And we make sure that each eye can only see the picture it intended to.
And that's what can make things feel like they've got this real depth to them, that it's something that you could reach out and touch.
And it doesn't feel like a flat TV screen.
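A minimal sketch of that per-eye idea, assuming a generic 64 mm interpupillary distance rather than any particular headset's optics: each eye renders the same scene from a slightly shifted position, and the two images are what give the sense of depth.

```cpp
// Sketch of per-eye stereo rendering: each eye gets its own viewpoint offset
// by half the interpupillary distance (IPD), so the two images differ
// slightly and the brain fuses them into depth.
// The IPD and head position are illustrative assumptions.
#include <cstdio>

struct Vec3 { float x, y, z; };

// Shift the camera position sideways for one eye.
Vec3 eyePosition(Vec3 head, float ipdMeters, bool leftEye) {
    float half = ipdMeters * 0.5f;
    return { head.x + (leftEye ? -half : +half), head.y, head.z };
}

int main() {
    Vec3 head = { 0.0f, 1.7f, 0.0f };  // head position in meters
    float ipd = 0.064f;                 // assumed average IPD

    Vec3 left  = eyePosition(head, ipd, true);
    Vec3 right = eyePosition(head, ipd, false);

    // In a real renderer, each eye would render the full scene from its own
    // position into its half of the display (or its own screen).
    std::printf("left eye  at (%.3f, %.3f, %.3f)\n", left.x, left.y, left.z);
    std::printf("right eye at (%.3f, %.3f, %.3f)\n", right.x, right.y, right.z);
    return 0;
}
```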
Try to look over there and concentrate on his face.
Can you see me waving my hand without turning your eyes?
No.
All right.
So at some point, you can probably see it right now.
All right.
So without moving your eyes, this is kind of hard, tell me how many fingers I'm holding up.
Three?
You can just barely tell.
And you see that even in real life: if you hold out your hand and you focus on your hand, then your foot will be blurry, because there's that difference there.
And you can change between that.
Like, you can then focus on your foot and your hand gets blurry because your eyes can see different amounts of detail in different places.
So we're hoping in the future that hardware can be like that, where we can make a display that puts lots of detail right where you're looking.
And every time you look someplace else, it moves the detail over there.
So we don't need to render a hundred times as much as we've got right now.
Figuring out where you're looking is a pretty hard problem.
The way we go about this is by taking a camera, looking at people's eyes, and then trying to figure out: is the eye looking over here, or up here?
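A rough sketch of that foveated idea, with made-up angle thresholds rather than values from any shipping headset: detail falls off with angular distance from wherever the eye tracker says you are looking.

```cpp
// Sketch of foveated rendering: render full detail only near where the eye
// tracker says the user is looking, and progressively less toward the
// periphery. The thresholds below are assumptions for illustration.
#include <cstdio>

// Fraction of full resolution to use for a pixel that is 'angleDeg' away
// from the current gaze direction.
float detailScale(float angleDeg) {
    if (angleDeg < 5.0f)  return 1.0f;   // fovea: full detail
    if (angleDeg < 20.0f) return 0.5f;   // near periphery: half detail
    return 0.25f;                         // far periphery: quarter detail
}

int main() {
    const float anglesDeg[] = { 2.0f, 10.0f, 40.0f };
    for (float a : anglesDeg)
        std::printf("%5.1f deg from gaze -> %.2fx resolution\n", a, detailScale(a));
    return 0;
}
```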
And we're working hard on stuff like this right now so that we hopefully can have a virtual reality that's as detailed and realistic feeling as the reality that we've actually got around us.
But it's going to be a long time before we get to where we can really fool people.
So do you have a basic sense of what latency is?
Yeah.
So my understanding of what latency is, is it's basically the time delay between the rendering at different points.
So it's basically a delay and it happens in all parts of the system.
Monitors can be a big one.
Consumer televisions can often have 50 milliseconds or more of latency just in the TV part.
And then you've got the processing in the computer and all of these add up to the total latency.
The latency is, in my opinion, the most important part of VR, because if you have that offset, your body is no longer immersed and you get that motion sickness, which can pull a lot of people out of the experience.
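A small sketch of how those stages stack up into a total motion-to-photon number; the per-stage millisecond values here are illustrative assumptions, not measurements from any particular system.

```cpp
// Sketch of how motion-to-photon latency accumulates: each stage of the
// pipeline adds its own delay, and the total is what the user feels.
// Stage values are illustrative, not measurements.
#include <cstdio>

int main() {
    struct Stage { const char* name; double ms; };
    const Stage pipeline[] = {
        { "head tracker sampling", 1.0 },
        { "game / simulation",     8.0 },
        { "GPU rendering",        11.0 },
        { "display scanout",       8.0 },
        { "panel response",        4.0 },
    };

    double total = 0.0;
    for (const Stage& s : pipeline) {
        total += s.ms;
        std::printf("%-22s %5.1f ms\n", s.name, s.ms);
    }
    std::printf("total motion-to-photon %5.1f ms\n", total);
    return 0;
}
```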
Games that feel really good have got that sense that it happens instantly.
It really is a testament to this kind of technology and how it's developing, and how over time you're going to be able to pack more pixel density into those displays and it's going to get a lot more immersive.
Like 30 years ago, you had desktop PCs, and you spent whatever on that.
But there was always this idea, well, you can spend a million dollars and buy a supercomputer and it's going to be a lot faster.
And that's not really true today.
For scalar processing, when you just do one thing after another, a high-end overclocked, cooled gaming PC is about the fastest thing in the world.
It is within a very small delta.
Yeah, for some things, you'll get some IBM POWER system that might be a little bit faster, but not a whole lot.
So if I'm looking at this and saying, we need to be five times faster, what do you do?
You can't just say, make each thing faster.
As a developer, you're making a trade-off.
I can put my effort into making this more efficient or making it more fun and more fun usually wins out for very good reasons.
So there's some good judgment and trickery that goes into the design of things.
You can always design a game that will just not work well.
I mean, in the old days, games had to be so precisely designed.
Nowadays, you've got a ton more freedom.
You really can.
It's just the hardware.
Old games were running off of 8-bit hardware, and that's all the data you could have.
Any crazy idea you do now, you can probably make a pretty good video game out of it, which is a wonderful, wonderful thing.
Yeah, a lot of freedom.
But VR makes you have to give up a little bit of that freedom.
You have to not do so many crazy things so that you can wind up having it be as responsive and high quality as it needs to be.
That's very interesting.
You know, an interesting topic is what are the limits to what we can do with virtual reality, where I'm pretty pleased with what we have today, what we can show people and say, virtual reality, it's cool.
People get an amazing response from it.
But we're still clearly a very, very long ways from reality.
That kind of harks back to realism in our history and how realism was a response to romanticism.
And realism was meant to capture the mundane, everyday lives of individuals and not idealize any of their activities in any way.
And I think that that's really important for virtual reality.
I think it's kind of like a rite of passage for any kind of art technology to go through.
Mostly in VR, we talk about the display and optics, the visual side of things, but we should at least tick off the other senses.
And haptics is an interesting one: virtual reality really doesn't have that aspect of touching things.
You can move your hands around, you could do everything, but it's a disconnected experience because, you know, you don't have the actual solidity there.
And I am pessimistic about progress in haptics technology.
About almost all other areas, I'm an optimist.
I'm excited about what's coming up, but I don't have any brilliant vision about how we're going to revolutionize haptics and make it feel like we're touching the things in the virtual world.
So I've tried some of the demos. At VRLA, there's one that has waves, like audio waves, I believe, that come up, and you can put your hands through them and feel the waves whenever you're supposed to be feeling bubbles or any kind of force field or something.
And those are pretty interesting.
I've seen some pretty interesting things that you could do with audio.
You can cut down a lot of the storage, I guess, and the power that you would need in order to power a huge scene. You can just mimic the sounds of those scenes actually being there and then not actually build them out.
For example, a professor at USC would play the sound of a train driving by without ever actually rendering the train, and you'd feel like you're deeply immersed in this world without having to have such an expensive scene built around you.
So I think those are pretty significant.
And that is one potential quality improvement that's still on the horizon: when we do spatialization, we use the HRTF, the head-related transfer function, to make it sound like the sound is coming from a particular place.
But usually we just use one generic, here-is-your-average-human HRTF.
And it's possible that, of course, if you are right in the average, then it's perfect for you, but there's always people off to the extremes that it doesn't do a very good job at.
And there may be better ways to allow people to sample their own perfect HRTF, which can improve the audio experiences a lot.
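As a rough illustration of what an HRTF encodes, here is a toy interaural time and level difference calculation; a real HRTF is a measured pair of filters per direction, and the head radius and attenuation below are assumed round numbers, not anything from a production audio pipeline.

```cpp
// Very crude stand-in for what an HRTF captures: a sound off to one side
// reaches the far ear slightly later and slightly quieter than the near ear.
// Head radius, attenuation, and azimuth are assumed values for illustration.
#include <cmath>
#include <cstdio>

int main() {
    const double kPi = 3.14159265358979323846;
    const double headRadius   = 0.0875;  // meters, assumed average
    const double speedOfSound = 343.0;   // m/s
    double azimuthDeg = 45.0;            // source 45 degrees to the right

    double az = azimuthDeg * kPi / 180.0;
    // Woodworth-style interaural time difference approximation.
    double itd = (headRadius / speedOfSound) * (az + std::sin(az));
    // Toy interaural level difference: far ear attenuated up to ~6 dB.
    double ildDb = 6.0 * std::sin(az);

    std::printf("interaural time difference: %.3f ms\n", itd * 1000.0);
    std::printf("interaural level difference: %.1f dB\n", ildDb);
    return 0;
}
```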
It all comes down to all these trade-offs.
With display and with resolution, it's one of those things where if people have one bad experience, it kind of overrules everything else.
It's really difficult to build trust with people who haven't done VR before, but it's easy to break all that trust whenever they have a bad experience.
There was a huge concern about that at Oculus, and the term internally that went around was poisoning the well.
They were very, very concerned.
I mean, for a long time, there was a fight about whether Gear VR should even be done, because the worry was that if we let a product go out, like Gear VR, that didn't have those things, that if somebody saw it, and it was bad, it made them sick, it made their eyes hurt, then they would be like, I'm never going to try VR again.
I tried it that time, and it was terrible.
And there were legitimate arguments about whether it was even a good idea to do that.
And it turned out that, yes, it's obviously better to have all of those things, but you can still do something that's valuable for the user without it.
It's weird being at the beginning of a medium like this.
Right, I'm very excited to see how filmmakers tackle creating content in these things, especially if they're already experienced in a traditional medium.
Mostly today, I've been talking a lot about what can we do, what's possible, what we think might be possible in the next couple years.
But really, at the professional level, it's more a question of wisdom: what should we be doing?
That's one of the things we're trying to figure out is from an artist and storytelling perspective, what are the things that will make this meaningfully different from what we're used to, like a television on our wall.
And we've been finding a lot of things, aspects of virtual reality, that very much do that, in my opinion.
Things that allow you to feel presence, first and foremost, where you get lost and you have to remind yourself, this isn't actually happening.
And things that ultimately allow you to embody other characters, things where you can actually change your own self-perception and play with neuroplasticity and teach yourself things that are bizarre and unique.
As an engineer, of course, I love quantifiable things.
I like saying, here's my 18-millisecond motion-to-photon latency, here's my angular resolution, and I'm improving, I'm doing the color space right.
But you can look not too far back, where we had Blu-ray discs at this amazing resolution, but more people wanted to watch YouTube videos at really bad early-internet video quality. If you deliver value to people, then these objective quantities may not be the most important thing.
And while I'm certainly pushing as hard as we can on lots of these things that make the experience better in potentially every way, or maybe just for videos or the different things, I don't think that it's necessary.
I've commented that usually my favorite titles on mobile that are fully synthetic are the ones that don't even try.
They just go light mapped, flat shaded, and I think it's a lovely aesthetic.
You don't wind up fighting all of the aliasing, while some other titles say, we're going to be high tech with our specular bump maps with roughness, and then you've got aliasing everywhere, you can't hold frame rate, and it's all problematic.
Meanwhile, some of these clearly synthetic worlds, where it's nothing but cartoony, flat shaded things with lighting, look and feel good, and you can buy that you're in that place, and you want to know what's around that monolith over there.
We did a project called Life of Us, which was exactly that mindset.
We're like, let's embrace low poly aesthetic and just simple vertex shading, and we end up realizing you can embody these various creatures and transform yourself, and when you do that with co-presence of another creature, another human, it makes for a totally magical journey.
You don't even think for a second.
You actually dismiss the whole idea of photorealism and embrace that reality for what it is.
I think it actually helps put you at ease a little bit.
The end goal, of course, is reality: in computer graphics, people have chased photorealism for a long time, and basically we've achieved it.
If you're willing to throw enough discrete path-traced rays at things, you can generate photorealistic views.
We understand the light really well.
Of course, it still takes a half hour per frame like it always has, or more, to render the different things.
It's an understood problem, and given infinite computing power, we could be doing that in virtual reality.
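A back-of-the-envelope sketch of that gap, with assumed sample counts and per-ray costs rather than real profiling numbers: multiply pixels by samples and per-sample time, and compare against the roughly 11 ms frame budget at 90 Hz.

```cpp
// Back-of-the-envelope for why path-traced photorealism is far from VR frame
// budgets: pixels x samples per pixel x per-sample cost, versus ~1/90 s.
// All the per-sample numbers are assumptions for illustration.
#include <cstdio>

int main() {
    double pixels          = 2160.0 * 1200.0 * 2.0; // both eyes, assumed panel
    double samplesPerPixel = 1000.0;                // assumed for a clean image
    double nsPerSample     = 100.0;                 // assumed path-trace cost

    double frameSeconds  = pixels * samplesPerPixel * nsPerSample * 1e-9;
    double budgetSeconds = 1.0 / 90.0;              // 90 Hz VR frame budget

    std::printf("path-traced frame: %.1f s\n", frameSeconds);
    std::printf("VR frame budget:   %.4f s\n", budgetSeconds);
    std::printf("shortfall:         %.0fx\n", frameSeconds / budgetSeconds);
    return 0;
}
```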
However, a point that I've made to people in recent years is that we are running out of Moore's Law.
Maybe we'll see some wonderful breakthrough in quantum structures or whatever, but if we just wind up following the path that we're on, we're going to get double and quadruple, but we're not going to get 50 times more powerful than we are right now.
We will run into atomic limits on our fabrication.
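For a sense of scale, getting to 50 times faster takes about log2(50), roughly 5.6, successive doublings; a tiny sketch of that arithmetic:

```cpp
// Quick arithmetic behind "double and quadruple, but not 50x": a 50x speedup
// needs about log2(50) ~ 5.6 doublings, which is a lot to ask if process
// improvements are tapering off.
#include <cmath>
#include <cstdio>

int main() {
    double target = 50.0;
    double doublings = std::log2(target);
    std::printf("%.0fx faster requires about %.1f doublings\n", target, doublings);
    return 0;
}
```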
Given that, I'm trying to tell people to start buying back into optimization.
Start buying back into thinking a little bit more creatively, because you can't just wait.
It's not going to get fixed to those highest degrees just by waiting for computing to advance.
If we want this to be something used by a billion people, then we need it to be lighter, cheaper, more comfortable.
There are constantly novel UX innovations, like Google Earth: the way they eliminate your periphery as you zoom in and move, and also give the user the action of deciding where they're going and looking.
You're constantly seeing people come up with ways to break from the paradigms of actual reality and introduce mechanisms that work very well.
There's tons of opportunities for the synthetic case where you want to be able to have your synthetic fantasy world where everybody is a creature that's created by the computer and simulated reasonably.
We still don't simulate people well.
That's a hard problem.
We've been beating our heads against it for a long time.
I do think we're making progress, and I wouldn't bet on that being solved in 10 years, but maybe 20 years, because it is going to take a lot of AI.
It's going to take a lot of machine learning where it's not going to be a matter of us dissecting all the micro expressions that people do.
It's going to be, let's take every YouTube video ever made and run it through some enormous learner that's going to figure out how to make people look realistic.
Yeah, absolutely.
There are these crucial thresholds where you pass a technological hurdle and all of a sudden that unlocks a whole world of creative potential.
But I think to your point very much, we need to solve the actual human and social challenges and turn those into opportunities to figure out how this technology fits into our lives.
I'm still a believer the magic's out there.
We haven't found it yet, so somebody's going to happen upon the formula.
I feel like I've felt little pockets of magic.
Yeah, exactly.
Oh, you can imagine the utility behind this for creating a world, or you can imagine the power of a story that would be told to you in this context.
A lot of it, I think, is just picking those, putting them together in a meaningful way, and then crafting something that's really bigger and intentional.
But I've now been joining with real people in virtual reality.
I think we also have different levels of connection, like the audio side is leaps and bounds ahead in terms of the nuance of personality and humanity.
So when I hear people laughing and joking and really enjoying something, that connection comes through, and soon we'll get even more of it.
We're in macro gesture land where I can wave and say thumbs up.
But when we get into micro gestures and actually getting a sense of facial reactions and other things, I think then we'll have a really incredibly rewarding time spending time with each other.
So VR right now is pretty amazing.
When you look at it, it's things that you haven't seen, but we are just getting started.
The next five years, both technologically and creatively, are going to really take this medium someplace that you've never imagined.
