[MUSIC PLAYING]
BRIAN YU: OK, let's get started. Welcome, everyone, to the final day of CS50 Beyond. And the goal for today is going to be to take a look at things at a bit of a higher level. There is going to be less code in today's lecture. The focus of today is on two main topics-- security and scalability-- which are both important once you've written all this code for your web application and you're ready to deploy it so that people can actually use it. What are the sorts of considerations you need to bear in mind? What are the security considerations in making sure that wherever you're hosting the application, the application itself is secure and that your users are secure from potential vulnerabilities or potential threats? And also, from a scalability perspective, we've been designing applications that so far probably only you or a couple other people have been using. But what sorts of things do you need to think about as your applications begin to scale, as more and more people begin to use them, and you have to begin to think about this idea of multiple people trying to use the same application at the same time? So a number of different considerations come about there. We'll show a couple of code examples. But the main idea of this is going to be high level, just thinking abstractly, sort of trying to design the product, trying to design the project, trying to figure out how exactly we need to be adjusting our application to make sure that it's secure and to make sure that it's scalable. So we'll go ahead and start with security. And on the topic of security, we're going to look at a number of different security considerations as we move all throughout the week, from the beginning of the week until the end of the week, thinking about the types of security implications that come about.
And so one of the first things we introduced in the class was Git, the version control tool that we were using to keep track of different versions of our code in order to manage different branches of our code, so on and so forth. And so there are a couple of important security considerations to be aware of with regard to Git. You all probably created GitHub repositories over the course of this week, maybe for the first time. And GitHub repositories by default are public. And this is in the spirit of the idea of open source software, the idea that anyone can see the code. Anyone can contribute to the code. And that, of course, comes with its trade-offs. On one hand, everyone being able to see the code certainly means that anyone can help you find and identify bugs. But it also means that anyone on the internet can see the code, look for potential vulnerabilities, and then potentially take advantage of those vulnerabilities. So definitely, trade-offs, costs, and benefits that come along with open source software. And another thing just to be aware of, we mentioned this earlier in the week, but your Git commit history is going to store the entire history of any of the commits that you have made, as the name might imply. And so if you make a commit and you do something you shouldn't have done, for instance-- you make a commit that accidentally includes database credentials inside of the commit somewhere or includes a password inside of the commit somewhere-- you can later remove those credentials and make another commit. But the credentials are still there inside of the history. If you go back, you could still find the credentials if you had access to the entire Git repository and could go back and find that point in Git's history. So what are the potential solutions if you do something like this, accidentally expose credentials at some point in the repository and then remove them? What could you do? Yeah?
AUDIENCE: Change the credentials.
BRIAN YU: Certainly. Changing the credentials is something you should almost definitely do. Change the password. It's not enough just to remove them and make another commit. And there's also something you can do, rewriting the Git history with a tool such as git filter-repo, where you can effectively purge the commit from history, sort of overwrite history, so to speak, in order to replace that, as well. But even then, if it's been online on GitHub, who knows who may have been able to access the credentials? So definitely always a good idea to change those, as well. On the first day, we also took a look at HTML. We were designing basic HTML pages. And there are a number of security vulnerabilities you could create just with HTML alone. Perhaps one of the most basic is just the idea that the contents of a link can differ from where the link takes you to. This is probably a pretty obvious point, since you often have text that links you to a particular page. But this can often be misleading and is commonly used in phishing email attacks, for instance, whereby you have a link that takes you to URL one, but by default, it shows you URL two, which can be misleading, for sure. Or I can have a situation where-- let's go into link.html-- I have a link that presumably takes me to google.com. But if I click on google.com, it could take me anywhere else-- to some other site, for instance. And the way that it does that is quite simply by just having a link that takes you to one URL, but the contents of that link are something different or something else entirely. And so that alone is something to be aware of. But that problem is compounded when you consider the idea that even though your server-side code-- application code you write in Python and Flask, for instance-- you can keep secret from your users, HTML code is not kept secret from users. Any user can see HTML and do whatever they want with it.
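The mismatch between a link's visible text and its real destination, described above, can be checked mechanically. What follows is only a minimal sketch in Python, not something shown in the lecture: the names LinkAuditor, host_of, and misleading_links are hypothetical, example.com stands in for "some other site", and the domain-matching rule is a deliberately crude assumption.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Crude assumption: text "looks like" a URL if it is an optional scheme,
# a dotted host name, and an optional path, with no spaces.
URL_LIKE = re.compile(r"(https?://)?[\w.-]+\.[a-z]{2,}(/\S*)?", re.IGNORECASE)


class LinkAuditor(HTMLParser):
    """Collect (visible text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def host_of(value):
    """Return the host if value looks like a URL or bare domain, else ''."""
    value = value.strip()
    if not URL_LIKE.fullmatch(value):
        return ""
    if "//" not in value:
        value = "//" + value  # lets urlparse treat 'google.com' as a host
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def misleading_links(html):
    """Return links whose visible text names a different host than the href."""
    auditor = LinkAuditor()
    auditor.feed(html)
    flagged = []
    for text, href in auditor.links:
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            flagged.append((text, href))
    return flagged


if __name__ == "__main__":
    page = '<a href="https://example.com/login">google.com</a>'
    print(misleading_links(page))
```

A link whose text is ordinary wording, such as "Click here", is not flagged by this sketch, because there is no displayed host to compare against; it only catches the case where the text itself claims to be a particular site.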
And so on the first day, you may have been trying to take a look at an HTML page and try and replicate it using your own HTML and CSS, for example. The simplest way to do something like that would just be to copy the source code. So I could go to bankofamerica.com, for instance, Control-Click on the page, view the page source, and all right. Here's all the HTML on Bank of America's home page. I could copy that, create a new file, and call it bank.html. Paste the contents of it in here. Go ahead and save that. And now, open up bank.html. And now, I've got a page that basically looks like Bank of America's website. And now, I could go in. I could modify the links, change where Sign In takes you to, make it take you to somewhere else entirely. And so these are potential threats, vulnerabilities, to be aware of on the internet that are quite easy to actually do. So this is less about designing your own web applications and more about the types of security concerns to be aware of when you're using web applications. So let's keep moving forward in the week-- yeah, question?
AUDIENCE: Can you copy JavaScript source code in the same way?
BRIAN YU: Yes. Any JavaScript code that is on the client, you can access and you can modify. You can change variables and so on and so forth. And this is actually a pretty easy thing to do. So if I go to, I don't know, The New York Times website, for instance, and I look at the source code there-- let me go ahead and inspect the element, and I'll try and hover over a main headline. OK. This is the name of a CSS class. You could access any JavaScript. You can also run any JavaScript in the console arbitrarily. So I could say, all right, document.querySelectorAll-- let's get everything with that CSS class. Or maybe it's just the first one, because it's two CSS classes. All right. Great. I'll take the first one, set its innerHTML to be, like, welcome to CS50 Beyond.
And you can play around with websites in order to mess around, change them. So all of the JavaScript, the CSS classes, all of that, is accessible to anyone who is using the page, for example. Other questions before I go on? Yeah.
AUDIENCE: Any thoughts on JavaScript obfuscation?
BRIAN YU: JavaScript obfuscation-- certainly something you can do. So since JavaScript is available to anyone who has access to the web page, there are programs called JavaScript obfuscators that basically take plain old JavaScript and convert it into something that's still JavaScript but that's very difficult for any human to decipher. It changes variable names and does a bunch of tricks in JavaScript so that it still executes the exact same way but looks quite obscure. Definitely something you can do.
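To make the renaming idea concrete, here is a toy sketch of just one of those tricks, changing variable names, written in Python rather than JavaScript so it can be run without a browser. It is an illustration under stated assumptions, not how any real JavaScript obfuscator is implemented: the names Renamer, local_names, obfuscate, and the total_price sample are all hypothetical, and only parameters and assigned variables are renamed.

```python
import ast  # ast.unparse requires Python 3.9 or newer

# Hypothetical sample code to obfuscate.
SOURCE = '''
def total_price(unit_price, quantity, tax_rate):
    subtotal = unit_price * quantity
    return subtotal + subtotal * tax_rate
'''


class Renamer(ast.NodeTransformer):
    """Rewrite chosen identifiers to meaningless replacements."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Function parameters.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def local_names(tree):
    """List parameter names and assigned variable names, in first-seen order."""
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            candidate = node.arg
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            candidate = node.id
        else:
            continue
        if candidate not in names:
            names.append(candidate)
    return names


def obfuscate(source):
    """Return equivalent source with parameters and variables renamed.

    Function names are left alone so callers still work; calls that pass
    keyword arguments are not handled by this toy.
    """
    tree = ast.parse(source)
    mapping = {name: f"_{i}" for i, name in enumerate(local_names(tree))}
    return ast.unparse(Renamer(mapping).visit(tree))


if __name__ == "__main__":
    print(obfuscate(SOURCE))
```

The renamed function computes the same result as the original, which is the point made above: the code is harder to read, but anyone who has it can still run it and study what it does.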