Subtitles section Play video Print subtitles The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. JULIAN SHUN: Hi, good afternoon, everyone. So today, we're going to be talking about graph optimizations. And as a reminder, on Thursday, we're going to have a guest lecture by Professor Johnson of the MIT Math Department. And he'll be talking about performance of high-level languages. So please be sure to attend the guest lecture on Thursday. So here's an outline of what I'm going to be talking about today. So we're first going to remind ourselves what a graph is. And then we're going to talk about various ways to represent a graph in memory. And then we'll talk about how to implement an efficient breadth-first search algorithm, both serially and also in parallel. And then I'll talk about how to use graph compression and graph reordering to improve the locality of graph algorithms. So first of all, what is a graph? So a graph contains vertices and edges, where vertices represent certain objects of interest, and edges between objects model relationships between the two objects. For example, you can have a social network, where the people are represented as vertices and edges between people mean that they're friends with each other. The edges in this graph don't have to be bi-directional. So you could have a one-way relationship. For example, if you're looking at the Twitter network, Alice could follow Bob, but Bob doesn't necessarily have to follow Alice back. The graph also doesn't have to be connected. So here, this graph here is connected. But, for example, there could be some people who don't like to talk to other people. And then they're just off in their own component. You can also use graphs to model protein networks, where the vertices are proteins, and edges between vertices means that there's some sort of interaction between the proteins. So this is useful in computational biology. As I said, edges can be directed, so their relationship can go one way or both ways. In this graph here, we have some directed edges and then also some edges that are directed in both directions. So here, John follows Alice. Alice follows Peter. And then Alice follows Bob, and Bob also follows Alice. If you use a graph to represent the world wide web, then the vertices would be websites, and then the edges would denote that there is a hyperlink from one website to another. And again, the edges here don't have to be bi-directional because website A could have a link to website B. But website B doesn't necessarily have to have a link back. Edges can also be weighted. So you can have a weight on the edge that denotes the strength of the relationship or some sort of distance measure corresponding to that relationship. So here, I have an example where I am using a graph to represent cities. And the edges between cities have a weight that corresponds to the distance between the two cities. And if I want to find the quickest way to get from city A to city B, then I would be interested in finding the shortest path from A to B in this graph here. Here's another example, where the edge weights now are the costs of a direct flight from city A to city B. And here the edges are directed. So, for example, this says that there's a flight from San Francisco to LA for $45. And if I want to find the cheapest way to get from one city to another city, then, again, I would try to find the shortest path in this graph from city A to city B. Vertices and edges can also have metadata on them, and they can also have types. So, for example, here's the Google Knowledge Graph, which represents all the knowledge on the internet that Google knows about. And here, the nodes have metadata on them. So, for example, the node corresponding to da Vinci is labeled with his date of birth and date of death. And the vertices also have a color corresponding to the type of knowledge that they refer to. So you can see that some of these nodes are blue, some of them are red, some of them are green, and some of them have other things on them. So in general, graphs can have types and metadata on both the vertices as well as the edges. Let's look at some more applications of graphs. So graphs are very useful for implementing queries on social networks. So here are some examples of queries that you might want to ask on a social network. So, for example, you might be interested in finding all of your friends who went to the same high school as you on Facebook. So that can be implemented using a graph algorithm. You might also be interested in finding all of the common friends you have with somebody else-- again, a graph algorithm. And a social network service might run a graph algorithm to recommend people that you might know and want to become friends with. And they might use a graph algorithm to recommend certain products that you might be interested in. So these are all examples of social network queries. And there are many other queries that you might be interested in running on a social network. And many of them can be implemented using graph algorithms. Another important application is clustering. So here, the goal is to find groups of vertices in a graph that are well-connected internally and poorly-connected externally. So in this image here, each blob of vertices of the same color corresponds to a cluster. And you can see that inside a cluster, there are a lot of edges going among the vertices. And between clusters, there are relatively fewer edges. And some applications of clustering include community detection and social networks. So here, you might be interested in finding groups of people with similar interests or hobbies. You can also use clustering to detect fraudulent websites on the internet. You can use it for clustering documents. So you would cluster documents that have similar text together. And clustering is often used for unsupervised learning and machine learning applications. Another application is connectomics. So connectomics is the study of the structure, the network structure of the brain. And here, the vertices correspond to neurons. And edges between two vertices means that there's some sort of interaction between the two neurons. And recently, there's been a lot of work on trying to do high-performance connectomics. And some of this work has been going on here at MIT by Professor Charles Leiserson and Professor Nir Shavit's research group. So recently, this has been a very hot area. Graphs are also used in computer vision-- for example, in image segmentation. So here, you want to segment your image into the distinct objects that appear in the image. And you can construct a graph by representing the pixels as vertices. And then you would place an edge between every pair of neighboring pixels with a weight that corresponds to their similarity. And then you would run some sort of minimum cost cut algorithm to partition your graph into the different objects that appear in the image. So there are many other applications. And I'm not going to have time to go through all of them today. But here's just a flavor of some of the applications of graphs. So any questions so far?