  • [MUSIC PLAYING]

  • BRIAN YU: OK, let's get started.

  • Welcome, everyone, to the final day of CS50 Beyond.

  • And the goal for today is going to be to take a look at things

  • at a bit of a higher level.

  • There is going to be less code in today's lecture.

  • The focus of today is on two main topics--

  • security and scalability-- which are both important as you

  • begin to think about, you're writing all this code for your web application.

  • You're ready to deploy it so that people can actually use it.

  • What are the sorts of considerations you need to bear in mind?

  • What are the security considerations in making

  • sure that wherever you're hosting the application, you and the application

  • itself are secure and that your users are secure from potential vulnerabilities

  • or potential threats?

  • And also, from a scalability perspective,

  • we've been designing applications that so far probably only you

  • or a couple other people have been using.

  • But what sorts of things do you need to think about

  • as your applications begin to scale, as more and more people begin to use it,

  • and you have to begin to think about this idea of multiple people trying

  • to use the same application at the same time?

  • So a number of different considerations come about there.

  • We'll show a couple of code examples.

  • But the main idea of this is going to be high level, just thinking abstractly,

  • sort of trying to design the product, trying to design the project,

  • trying to figure out how exactly we need to be adjusting our application

  • to make sure that it's secure and to make sure that it's scalable.

  • So we'll go ahead and start with security.

  • And on the topic of security, we're going

  • to look at a number of different security considerations

  • as we move all throughout the week, from the beginning of the week

  • until the end of the week, thinking about the types of security

  • implications that come about.

  • And so one of the first things we introduced in the class was Git,

  • the version control tool that we were using

  • to keep track of different versions of our code

  • in order to manage different branches of our code, so on and so forth.

  • And so a couple of important security considerations to be aware of with

  • regard to Git.

  • You all probably created GitHub repositories

  • over the course of this week, maybe for the first time.

  • And GitHub repositories by default are public.

  • And this is in the spirit of the idea of open source software, the idea

  • that anyone can see the code.

  • Anyone can contribute to the code.

  • And that, of course, comes with its trade-offs.

  • On one hand, everyone being able to see the code certainly

  • means that anyone can help you to find bugs and identify bugs.

  • But it also means that anyone on the internet can see the code,

  • look for potential vulnerabilities, and then

  • potentially take advantage of those vulnerabilities.

  • So definitely, trade-offs, costs, and benefits that

  • come along with open source software.

  • And another thing just to be aware of, we mentioned this earlier in the week,

  • but your Git commit history is going to store the entire history of any

  • of the commits that you have made, as the name might imply.

  • And so if you make a commit and you do something

  • you shouldn't have done, for instance-- you make a commit that accidentally

  • includes database credentials inside of the commit somewhere

  • or includes a password inside of the commit

  • somewhere-- you can later on remove those credentials

  • and make another commit and remove the credentials.

  • But the credentials are still there inside of the history.

  • If you go back, you could still find the credentials

  • if you had access to the entire Git repository

  • and could go back and find that point in Git's history.

  • So what are the potential solutions for if you do something like this,

  • accidentally expose credentials at some point in the repository

  • and then remove them?

  • What could you do?

  • Yeah?

  • AUDIENCE: Change the credentials.

  • BRIAN YU: Certainly.

  • Changing the credentials, something you should almost definitely do.

  • Change the password.

  • It's not enough just to remove them and make another commit.

  • And there's also something you can do, rewriting the Git history, where

  • you can effectively purge the commit from the history, sort of overwrite history,

  • so to speak, in order to replace that, as well.

  • But even that, if it's been online on GitHub,

  • who knows who may have been able to access the credentials?

  • So definitely always a good idea to remove those, as well.
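
To make the point about commit history concrete, here is a minimal sketch in Python. It is a toy model, not how Git stores data internally: the `Commit` class, the file names, and the password `hunter2` are all hypothetical and exist only for illustration. It shows that a secret deleted in a later commit is absent from the latest snapshot but still present in an earlier one.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    """Toy stand-in for a Git commit: a message plus a full file snapshot."""
    message: str
    files: dict  # maps file path -> file contents at this commit


def commits_containing(history, secret):
    """Return the messages of every commit whose snapshot contains `secret`."""
    return [
        commit.message
        for commit in history
        if any(secret in contents for contents in commit.files.values())
    ]


# Hypothetical three-commit history: the password is added, then "removed".
SECRET = "hunter2"  # made-up password, for illustration only
history = [
    Commit("Initial commit", {"app.py": "print('hello')\n"}),
    Commit("Add database config", {
        "app.py": "print('hello')\n",
        "config.py": f"DB_PASSWORD = '{SECRET}'\n",
    }),
    Commit("Remove password", {
        "app.py": "print('hello')\n",
        "config.py": "DB_PASSWORD = None\n",
    }),
]

# The latest snapshot looks clean...
latest = history[-1]
print(any(SECRET in text for text in latest.files.values()))  # False

# ...but the secret is still reachable by walking back through history.
print(commits_containing(history, SECRET))  # ['Add database config']
```

Anyone who can read the whole repository can do the equivalent of the last line, which is why changing the exposed credentials matters more than deleting them from the current files.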

  • On the first day, we also took a look at HTML.

  • We were designing basic HTML pages.

  • And there are a number of security vulnerabilities

  • you could create just with HTML alone.

  • Perhaps one of the most basic is just the idea that the contents of a link

  • can differ from where the link takes you to.

  • That's probably a pretty obvious point where you often

  • have text that links you to a particular page.

  • But this can often be misleading and is commonly

  • used in phishing email attacks, for instance,

  • whereby you have a link that takes you to URL one,

  • but by default, it shows you URL two, which can be misleading, for sure.

  • Or I can have situations where I could--

  • let's go into link.html--

  • I have a link that presumably takes me to google.com.

  • But if I click on google.com, it could take me anywhere else--

  • to some other site, for instance.

  • And the way that it does that is quite simply by just

  • having a link that takes you to a URL, but the contents of that link

  • are something different or something else entirely.

  • And so that alone is something to be aware of.
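
The mismatch described above, where the visible text of a link names one site while the link itself goes to another, can be checked mechanically. The following is a rough sketch using only Python's standard library. The names `LinkAuditor`, `host_of`, and `misleading_links`, and the sample page with its `example.com` destination, are assumptions made for illustration and are not taken from the lecture's link.html. The check is deliberately simple: it only flags links whose visible text itself looks like a domain.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAuditor(HTMLParser):
    """Collect (visible text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (text, href) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def host_of(value):
    """Best-effort hostname for text such as 'google.com' or a full URL."""
    if "//" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as a host
    host = (urlparse(value).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def misleading_links(html):
    """Return links whose visible text names a different host than the href."""
    auditor = LinkAuditor()
    auditor.feed(html)
    flagged = []
    for text, href in auditor.links:
        looks_like_url = "." in text and " " not in text
        if looks_like_url and host_of(text) != host_of(href):
            flagged.append((text, href))
    return flagged


# Hypothetical page: the first link claims to be google.com but is not.
page = """
<a href="https://example.com/login">google.com</a>
<a href="https://www.google.com/">google.com</a>
<a href="https://example.org/">Read more</a>
"""
print(misleading_links(page))
# [('google.com', 'https://example.com/login')]
```

A check like this catches only the most blatant case. A link whose text is ordinary wording such as "Sign In" gives no hostname to compare against, so looking at the real destination before clicking is still necessary.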

  • But that problem is compounded when you consider the idea

  • that even though your server-side code-- application code

  • you write in Python and Flask, for instance--

  • you can keep secret from your users, HTML code is not

  • kept secret from users.

  • Any users can see HTML and do whatever they want with it.

  • And so on the first day, you may have been

  • trying to take a look at an HTML page and try and replicate it

  • using your own HTML and CSS, for example.

  • The simplest way to do something like that

  • would just be to copy the source code.

  • So I could go to bankofamerica.com, for instance, Control-Click on the page,

  • view the page source, and all right.

  • Here's all the HTML on Bank of America's home page.

  • I could copy that, create a new file, and call it bank.html.

  • Paste the contents of it in here.

  • Go ahead and save that.

  • And now, open up bank.html.

  • And now, I've got a page that basically looks like Bank of America's website.

  • And now, I could go in.

  • I could modify the links, change where Sign In takes you to,

  • make it take you to somewhere else entirely.

  • And so these are potential threats, vulnerabilities,

  • to be aware of on the internet that are quite easy to actually do.

  • So this is less about when you're designing your own web applications

  • but, when you're using web applications, the types of security

  • concerns to definitely be aware of.

  • So let's keep moving forward in the week-- yeah, question?

  • AUDIENCE: Can you copy JavaScript source code in the same way?

  • BRIAN YU: Yes.

  • Any JavaScript code that is on the client, you can access

  • and you can modify.

  • You can change variables and so on and so forth.

  • And this is actually a pretty easy thing to do.

  • So if I go to like, I don't know, The New York Times website, for instance,

  • and I look at the source code there--

  • let me go ahead and inspect the element, and I'll

  • try and hover over a main headline.

  • OK.

  • This is the name of a CSS class.

  • You could access any JavaScript.

  • You can also run any JavaScript in the console arbitrarily.

  • So I could say, all right, document.querySelectorAll, let's

  • get everything with that CSS class.

  • Or maybe it's just the first one, because it's two CSS classes.

  • All right.

  • Great.

  • I'll take the first one, set its inner HTML to be,

  • like, welcome to CS50 Beyond.

  • And you can play around with websites in order to mess around, change them.

  • So all of the JavaScript, CSS classes, all of that,

  • is accessible to anyone who is using the page, for example.

  • Other questions before I go on?

  • Yeah.

  • AUDIENCE: Any thoughts on JavaScript obfuscation?

  • BRIAN YU: JavaScript obfuscation-- certainly something you can do.

  • So since JavaScript is available to anyone who has access to the web page,

  • there are programs called JavaScript obfuscators

  • that basically take plain old looking JavaScript

  • and convert it into something that's still JavaScript

  • but that's very difficult for any human to decipher.

  • It changes variable names and does a bunch of tricks in JavaScript

  • so that it still executes the exact same way but looks quite obscure.

  • Definitely something you can do.