SPEAKER 1: Remco was previously working at Boeing, and he got his degree from Brown University. He will tell us about how to simplify urban models with 3D buildings while still preserving legibility. Thanks, Emil.

REMCO CHANG: Thank you guys for coming here. It's a great privilege to be at Google. I'm here today to talk about a broader research question that I've been working on, and the idea is trying to understand urban environments through the use of something called urban legibility, which I'll talk about in a little bit. This talk is going to be broken down into two parts. The first half is on simplification of urban models while maintaining urban legibility. This is a talk that I gave at SIGGRAPH this summer, so I apologize to the people who were there; it will be an almost verbatim repeat. The second half will be discussion and future work, where I'll talk about some of the things we've been working on, as well as some of the things we would like to work on in the future.

Before I start, I'd just like to talk a little bit about what we're doing at the visualization center at Charlotte. One thing we've been really interested in is this idea of knowledge visualization. To give you an example of what we mean by knowledge visualization, imagine you have a scene full of labels: there's lots of text, and the labels overlap each other. Now, if you can somehow take this scene and extract some sort of higher-level knowledge, either from the scene or from the user, then it is theoretically possible to focus your resources on the things you want the user to focus on. In other words, you want to minimize the resources you're using, but maximize the information you're giving to the user. And a resource can really be anything: it can be CPU time, it could be the number of polygons. In this particular case, it's really just the number of pixels you're using on the screen.

To give you an idea of what we consider a great knowledge visualization paper, here's something done by Agrawala and Stolte, who are both at Stanford with Pat Hanrahan's group. Their paper is on rendering effective route maps. Here's an example of directions that would be given by MapQuest. You see that this direction is physically accurate; it shows you exactly how to get from point A to point B. But that's about all it is; you don't really get a lot of information out of it. Whereas, typically, if you ask somebody to give you directions, this is the kind of thing people would draw for you. This hand-sketched map is not at all physically accurate: it compresses highway 101 into a very short segment, and instead puts the emphasis on how you get onto highway 101, or 110, and how to get off. In their paper, which was published at SIGGRAPH 2001, they were able to mimic what humans typically do, and in this case really show off the important information for the task of giving people directions with maps.

So we want to take this idea of knowledge visualization and apply it to urban models. Here's a model of a city in China. And the question is, what is the knowledge in this scene? What is it that we want to preserve and highlight? To answer that, we turn to this idea of urban legibility.
Urban legibility is a term made famous by Kevin Lynch in his 1960 book, The Image of the City. What he did for this book was go around the city of Boston and ask local residents to sketch out, with just a pen, their immediate surroundings. What he got was a stack of the images you see on the right here, where people simply sketched out: this is where I live, this is a big road near me, and so on. He took this stack of sketched images and categorized the important things into five groups: paths, which are highways, railroads, roads, canals; edges, which are shorelines or boundaries; districts, such as industrial or residential districts; nodes, which you can think of as areas where lots of activity converges, for example Times Square in New York City; and then landmarks. A landmark can really be anything: a tree, a signpost, a big building. It's whatever people use to navigate in an urban environment.

Kevin Lynch defined this idea of urban legibility as "the ease with which a city's parts may be recognized and can be organized into a coherent pattern." That's kind of a mouthful, but to me, what it really says is: if you can somehow deconstruct a city into these urban legibility elements, you can still organize the city into a coherent pattern.

The use of urban legibility in computer science goes back a little ways. Ruth Dalton, in her 2002 paper, chronicles the history of what people in computer science have done with urban legibility, and it breaks down into two groups. One group tries to justify whether or not the idea of urban legibility actually makes sense. What they did was try to figure out whether these elements are actually important to human navigation. What they found, interestingly enough, is that paths, edges, and districts are very important to navigation, but landmarks are questionable: some groups think they are very useful, others say they are totally useless. The one element that's missing here is nodes. People have not really been able to successfully quantify what a node is, so there hasn't been as much research on whether nodes are helpful at all. The other group of researchers simply uses urban legibility, in particular in graphics and visualization. Most notably, Ingram and Benford have a whole series of papers where they use urban legibility for navigating abstract data spaces.

So the question is, why did we decide to use urban legibility? To give you an idea, here we take an original model; these are a bunch of buildings in our Atlanta data set, looked at from a top-down view. This is what you would get if you use a traditional simplification method such as QSlim, and I'm assuming people know what QSlim is. What you see is that a lot of the buildings get decimated to the point where they don't really look like buildings anymore. Our approach is a little bit different: we take an aggregated approach, and this is what you get. And if we apply a texture map onto our model, this is what you end up with.
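[To make the contrast with per-building decimation concrete, here is a minimal sketch of what an "aggregated" simplification might look like. This is not the algorithm from the talk; the grid-cell grouping, the bounding-box merge, and the cell size are all illustrative assumptions, meant only to show the idea of merging neighboring buildings into blocks rather than decimating each building individually.]

```python
# Illustrative sketch only: merge nearby building footprints into one block
# instead of decimating each building. NOT the paper's actual algorithm.

from collections import defaultdict

def aggregate_buildings(buildings, cell_size=100.0):
    """buildings: list of (min_x, min_y, max_x, max_y, height) boxes.
    Groups buildings by a coarse grid cell and merges each group into one box."""
    groups = defaultdict(list)
    for b in buildings:
        min_x, min_y, max_x, max_y, _ = b
        cx = int(((min_x + max_x) / 2) // cell_size)  # grid cell of the centroid
        cy = int(((min_y + max_y) / 2) // cell_size)
        groups[(cx, cy)].append(b)

    merged = []
    for group in groups.values():
        # The merged block covers all footprints in the group...
        min_x = min(b[0] for b in group)
        min_y = min(b[1] for b in group)
        max_x = max(b[2] for b in group)
        max_y = max(b[3] for b in group)
        # ...and keeps the tallest height so prominent buildings still read.
        height = max(b[4] for b in group)
        merged.append((min_x, min_y, max_x, max_y, height))
    return merged

if __name__ == "__main__":
    city = [(0, 0, 20, 20, 30), (25, 5, 45, 25, 60), (300, 300, 330, 340, 12)]
    print(aggregate_buildings(city))  # two blocks instead of three buildings
```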
It's actually really interesting that when we take these four models, put them in a fly-through scene as a test scenario, and measure how many pixels differ from the original model, this is the graph that we get. You don't have to look at it carefully, but the important thing I want to pick out is that all of these models end up with very, very similar differences in terms of pixel error. So even though you look at these four models and say, well, they look very different to me, if you measure them purely quantitatively using pixel error, they come out very similar. What that really tells us is that we can't just use pixel error as the driving force behind simplification of urban models; we have to use something a little bit different, some higher-level information.

To put it simply, our goal for this project is to create simplified urban models that retain the image of the city from any view angle and distance. As an example of what we get, you see the original model on the left. The middle image shows the model reduced to 45% of the polygons, and the last one is 18%. You see a little bit of a dimming effect as it goes from the original to fewer polygons, but the important thing to notice is that the important features in the city are retained. For example, the road here is still kept, the city square area is kept, and you pretty much still get the sense that this is the same city you're looking at, even though there's only 18% of the polygons in the scene.

I'm just going to run the application really quickly, and hopefully nothing goes wrong. OK. This is using the Chinese city data set, and this is running live. As you can see, I can look around and move to different places. And here-- hold on one second. This is where the demo goes wrong. OK. So I'm just going to start zooming out from this view.

AUDIENCE: Can you mention how you got that geometry in the [UNINTELLIGIBLE]? Is that made out [UNINTELLIGIBLE PHRASE]?

REMCO CHANG: Those textures are totally fake.

AUDIENCE: [UNINTELLIGIBLE PHRASE]

REMCO CHANG: The geometry is actually real. What we got was the original footprint information, and approximate height information in terms of the number of stories, or number of floors, per building. We estimated that each story is about three meters. So the geometry is basically an extrusion of the footprints. It's not real in the sense of true 3D models, but the footprints and the positions are absolutely correct.

AUDIENCE: Do you [UNINTELLIGIBLE] the fact that [UNINTELLIGIBLE] you get repeated texture patterns?

REMCO CHANG: There is definitely some, but I'll talk about that in a little bit. Yes, sir?

AUDIENCE: What kind of specification [UNINTELLIGIBLE PHRASE]?

REMCO CHANG: As it turns out-- I'll get into that a little bit later, too-- this is--

AUDIENCE: [UNINTELLIGIBLE PHRASE]

REMCO CHANG: Oh. OK, sorry. The question was what kind of hardware I'm running this on, and the honest truth is, I have no idea. What I do know is that this is a fairly state-of-the-art laptop from Dell. But as it turns out-- I'll explain this a little bit later-- the bottleneck is actually not in the graphics card. It's actually in my crappy code, where I'm not transferring data fast enough.
It's the pipeline that's actually the bottleneck right now. But that's just my fault. I wrote some crappy code. So here I'm just zooming out from that particular viewpoint.
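[As an aside on the geometry described above: since the buildings are footprints extruded by roughly three meters per story, here is a minimal sketch of what such an extrusion might look like. The polygon format, vertex ordering, and function name are assumptions made for illustration; the talk only states that footprints were extruded using an estimated three meters per floor.]

```python
# A minimal sketch of extruding a building footprint into a 3D prism,
# assuming (as stated in the talk) roughly three meters per story.
# The mesh layout (vertex list + faces) is an illustrative assumption,
# not the format used in the actual system.

METERS_PER_STORY = 3.0

def extrude_footprint(footprint, stories):
    """footprint: list of (x, y) vertices in order; returns (vertices, faces)."""
    height = stories * METERS_PER_STORY
    n = len(footprint)

    # Bottom ring of vertices at ground level, then the top ring at roof height.
    vertices = ([(x, y, 0.0) for x, y in footprint]
                + [(x, y, height) for x, y in footprint])

    faces = []
    # One wall quad per footprint edge, connecting the bottom ring to the top ring.
    for i in range(n):
        j = (i + 1) % n
        faces.append((i, j, n + j, n + i))
    # Roof polygon made of the top-ring vertices.
    faces.append(tuple(range(n, 2 * n)))
    return vertices, faces

if __name__ == "__main__":
    # A 20 m x 10 m rectangular footprint, 5 stories tall (about 15 m).
    verts, faces = extrude_footprint([(0, 0), (20, 0), (20, 10), (0, 10)], stories=5)
    print(len(verts), "vertices,", len(faces), "faces")
```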