MALE SPEAKER: This is my attempt to increase the sartorial quotient of Google, and it hasn't worked at all. On the other hand-- well, I noticed you have a coat on, that's true. Greg Chesson gets two points for showing up with a coat.
It's a real pleasure to introduce Bruce Schatz to you. I've known Bruce for rather a long time. My first introduction to him came as we both began getting excited about digital libraries and the possibility of accumulating enormous amounts of information in digital form that could be worked on, manipulated by, and processed through software that we hope would augment our brain power. So Bruce has been in the information game for longer than he's actually willing to admit, I suspect. He's currently at the University of Illinois, Champaign-Urbana. As you will remember, that's also the area where the National Center for Supercomputing Applications is located. Bruce was around at the time when Marc Andreessen was doing work on the first browsers, the Mosaic versions of the browsers derived from Tim Berners-Lee's work.
Actually, the one thing that Bruce may not realize he gets credit for is teaching me how to pronounce Caenorhabditis elegans. I looked at it before and I couldn't figure it out, and maybe I didn't even say it right this time. But this is a tiny little worm that consists of 50 cells. It was the first living organism that we actually completely sequenced the genome for. Then we got interested in understanding how the genome actually reflects itself as this little worm develops from a single fertilized cell. So Bruce introduced me to the idea of collecting everything that was known about that particular organism and turning it into a database that one could manipulate and use in order to carry out research. Well, let me just explain a little bit more about his background and then turn this over to him, because you're here not to listen to his bio, but to listen to what he has to say.
He's currently director of something called CANIS-- C-A-N-I-S. I thought it had to do with dogs until I re-read it. It says Community Architectures for Network Information Systems.
BRUCE SCHATZ: That's why they let me in the building.
MALE SPEAKER: I'm sorry.
BRUCE SCHATZ: That's why they let me in the building.
MALE SPEAKER: Because along with the other canines that are here. It's at the University of Illinois, Champaign-Urbana, and he's been working on federating all the world's knowledge, just like we are, by building pioneer research systems in industrial and academic settings. He's really done a lot of work over a period of 25 or 30 years in this domain. The title of the talk uses the term telesophy, which he introduced as a project at Bellcore in the 1980s. Later on, he worked at UIUC on something called DeLIver, D-E-L-I-V-E-R, and now more recently on semantics. That's the reason that I asked him to come here. He's working on something called BeeSpace, which is spelled B-E-E, as in the little buzzing organism. This is an attempt, as I understand it-- but I'm going to learn more-- to take a concept space and organize it in such a way that we can assist people in thinking through and understanding more deeply what we know about that particular organism. So this is a deep dive into a semantic problem.
So I'm not going to bore you with any more biographical material, except to say that Bruce has about nine million slides to go through, so please set your modems at 50 gigabits per second because he's going to have to go that fast to get through all of it. I've asked him to leave some time at the end for questions. I already have one queued up. So Bruce, with that rather quick introduction, let me thank you for coming out to join us at Google and turn this over to you to teach us about semantics.
BRUCE SCHATZ: Thank you. I have one here, so you can just turn yours off. Thank you.
I was asked to give a talk about semantics, which I supposedly know something about. So this is going to be a talk that's both broad and deep at the same time, and it's going to try to do something big and grand, and also try to do something deep that you can take away with you. So that may mean that it fails completely and does none of those, or maybe it does all of those. I've actually been giving this talk for 25 years and-- now, of course, it doesn't work. Am I not pointing it in the right place? I'm pushing it but it's not going. Oh, there it goes. OK, sorry. Can you flip it back there? Sorry about that. Small technical difficulty, but the man behind the curtain is fixing it.
So I gave this talk first more than 20 years ago in the hot Silicon Valley research lab that all the grad students wanted to go to, which was called Xerox PARC. I think a few people actually have heard of Xerox PARC. It sort of still exists now. We went down completely? There we go. Thank you very much. I was pushing this idea that you could federate and search through all the world's knowledge, and the uniform reaction was, boy, that would be great, but it's not possible. And I said, no, you're wrong. Here, I'll show you a system that searches across multiple sources and goes across networks, and does pictures and text and follows links, and I'll explain how each piece works. Then they said, that's great, but not in our lifetime.
Well, 10 years later there was Mosaic and the web. And 20 years later I'm delighted to be here, and all of you have actually done it. You've done all the world's knowledge to some degree. What I want to talk about is how far along you are and what you need to do before you take over the rest of the world and I die, which is another 20 years. So what's going to happen in the next 20 years? The main thing I'm going to say is a lot's happened on tele, but not too much on sophy.
So you're halfway to the hive mind, and since I'm working on honey bees, at the end you will see a picture of honey bees and hear something about hive minds, but it will be very short. Basically, if you look at Google's mission, the mission is doing a lot about access and organization of all the world's knowledge. Actually, to the degree that's possible, you do an excellent job at that. However, you do almost nothing about the next stages, which are usually called analysis and synthesis: solving actual problems, looking at things in different places, combining stuff and sharing it. And that's because if you look at the graph of research over the years, we're sort of here, and you're doing commercially what was done in the research area about 10 years ago, but you're not doing this stuff yet. So the telesophy system was about here. Mosaic was about here. Those are the things, searching across many sources like what I showed, that were really working pretty well in research labs with 1,000 people. They weren't working with 100 million. But if Google's going to survive 10 more years, you're going to have to do whatever research systems do here. So pay attention. This doesn't work with students. With students I have to say I'm going to fail you at the end. But you have a real reason, a monetary reason, and a moral reason to actually pay attention.
So back to the outline. I'm going to talk about different ways to think about doing all the world's knowledge, and how to go through all the levels. I'm going to do all the levels and sort of say you are here, and then I'm going to concentrate on the next set of things that you haven't quite got to. The two particular things I'm going to talk about are scalable semantics and concept navigation, which probably don't mean anything to you now, but if I do my job right, 45 minutes, actually now 10 of them are up,