  • [MUSIC PLAYING]

  • DAVID MALAN: So today we're going to talk

  • about challenges at this crucial intersection of law and technology.

  • And the goal at the end of today is not to have provided you with more answers,

  • but hopefully to have generated more questions about what this intersection is

  • and where we're going to go forward.

  • Because at this intersection lie a lot of really interesting and challenging

  • problems that are at the forefront of what we're doing.

  • And you, as a practitioner, may be someone

  • who is asked to confront and contend with and provide resolutions

  • for some of these problems.

  • This lecture's going to be divided, roughly, into two parts.

  • In the first part, we're going to discuss

  • trust, whether we can trust the software that we receive

  • and what implications that might have for software

  • that's transmitted over the internet.

  • And in the second part, we're going to talk about regulatory challenges that

  • might be faced.

  • As new emergent technologies come into play,

  • how is the law prepared, or is the law prepared

  • to contend with those challenges?

  • But let's start by talking about this idea of a trust model,

  • trust model being a computational term for, basically,

  • do we trust something that we're receiving over the internet?

  • Do we trust that software is what it says it is?

  • Do we trust that a provider is providing a service in the way they describe,

  • or are they doing other things behind the scenes?

  • Now, as part of this lecture, there are a number of supplementary

  • reading materials that we've incorporated and that we're going to draw on

  • quite a bit throughout the course of today.

  • And the first of those is a paper called "Reflections on Trusting Trust."

  • This is arguably one of the most famous papers in computer science.

  • It was written in 1984 by Ken Thompson.

  • Ken Thompson was one of the inventors of the Unix operating

  • system, on which Linux was modeled and from which,

  • by way of BSD Unix, macOS is descended.

  • And so he's quite a well-known figure in the computer science community.

  • And he wrote this paper to accept an award called the Turing Award, again,

  • one of the most famous awards in computer science.

  • And in it, he's trying to highlight the problem of trust in software.

  • And he begins by discussing a computer

  • program that can reproduce itself.

  • We typically refer to this as what's called a quine in computer science.

  • But the idea is can you write a simple program that reproduces itself?

  • And we won't go through that exercise here.

  • But Thompson shows us that, yes, it is relatively trivial actually

  • to write programs that do this.
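
For the curious, here is one classic shape such a program can take in C. This is an illustrative sketch, not the exact program from Thompson's paper; note that even adding a comment inside it would break the self-reproduction, which is why the code is deliberately bare:

    #include <stdio.h>
    char *s = "#include <stdio.h>%cchar *s = %c%s%c;%cint main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }%c";
    int main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }

Compiled and run, this prints its own source code exactly: the string s is both the program's data and, via %s, part of its own output, with the character codes 10 and 34 standing in for the newline and double-quote characters.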

  • But what does this then lead to?

  • So the next step of the process that Thompson discusses,

  • stage two in this paper, is how do you teach a computer

  • to teach itself something?

  • And he uses the idea of a compiler.

  • Recall that we use compilers in some programming languages

  • to turn source code, the human-like syntax

  • that we understand-- languages like C, for example,

  • are written in source code.

  • And they need to be compiled, or transformed,

  • into zeros and ones, machine code, because computers only

  • understand these zeros and ones.

  • They don't understand the human-like syntax

  • that we're familiar with as programmers when we are writing our code.
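
As a quick refresher on what that looks like in practice, here is a minimal session on a Unix-like machine; the compiler name cc and the file name hello.c are just for illustration:

    $ cat hello.c
    #include <stdio.h>

    int main(void)
    {
        printf("hello, world\n");    /* human-readable source code */
        return 0;
    }
    $ cc -o hello hello.c    # compile: translate the source into machine code
    $ ./hello                # run the resulting binary of zeros and ones
    hello, world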

  • And what Thompson is suggesting that we can do

  • is we can teach the compiler, the program that

  • actually takes the source code and transforms it into zeros and ones,

  • to compile itself.

  • And he starts out by doing this by introducing a new character

  • for the compiler to understand.

  • The analogy is drawn to the newline character, which we

  • type when we reach the end of a line.

  • We want to go down and back to the beginning of a new one.

  • We enter the newline character.

  • There are other characters that were not initially envisioned

  • as part of the C compiler.

  • And one of those is vertical tab, which basically

  • allows you to jump down several lines without necessarily resetting back

  • to the beginning of the line as newline would.

  • And so Thompson goes through the process,

  • that I won't expound on here because it's

  • covered in the paper, of how to teach the compiler what

  • this new character, this vertical tab, means.
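
The heart of that process looks something like the sketch below, modeled loosely on the figures in the paper rather than copied from it; the helper next(), which returns the next character of the source being compiled, is an assumption of this sketch:

    int next(void);    /* assumed helper: returns the next source character */

    /* Inside the compiler: interpret an escape sequence in the source text. */
    int escape(void)
    {
        int c = next();
        if (c != '\\')
            return c;          /* an ordinary character */
        c = next();
        if (c == '\\')
            return '\\';
        if (c == 'n')
            return '\n';
        if (c == 'v')
            return 11;         /* vertical tab: we must spell out ASCII 11,
                                  because the compiler compiling this source
                                  does not yet understand '\v' */
        return c;
    }

Note the chicken-and-egg detail in the last case: the first version of this code has to write the raw ASCII value 11, because the existing compiler binary does not yet know what '\v' means.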

  • He shows us that we can write code in the C programming language

  • and then have the compiler compile that code into zeros and ones that

  • create something called a binary, a program

  • that a computer can execute and understand.

  • And then we can use that new compiler

  • that we've just created to compile other C programs.

  • Which means that once we've taught the computer how

  • to understand what this vertical tab character is,

  • it then can propagate into any other C program that we write.

  • The computer is learning, effectively, a new thing to interpret,

  • and it can then interpret that in every other program.
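
Once that first binary exists, the source can be rewritten in the new, self-referential style, and the already-taught binary will compile it without complaint; hypothetically:

    if (c == 'v')
        return '\v';    /* legal now: the compiled binary already knows that
                           '\v' means ASCII 11, even though the value 11 no
                           longer appears anywhere in the source code */

At this point the knowledge of what '\v' means lives only in the binary, which is exactly the property the next stage exploits.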

  • But then Thompson leads us into stage three,

  • which is, what if that's not all the computer or the compiler does?

  • What if, instead of just adding that vertical tab character

  • whenever we encountered it, we also secretly, as part of the source code,

  • inserted a bug into the code, such that now, whenever we compile the code

  • and we encounter that backslash v, that vertical tab character,

  • we're not only putting that into the code

  • so that the computer can understand and parse this backslash

  • v, the character that it never knew about before,

  • but we've also surreptitiously hidden a bug in the code?

  • And again, Thompson goes into great detail

  • about exactly how that can be done and exactly what steps we can then

  • take to make it look like that was never there.

  • We can change the source code, modify it,

  • and make it look like we never had a bug in there,

  • even though it is now propagating into all of the source code

  • we ever write or we ever compile going forward.

  • We've created a way to surreptitiously hide bugs in our code.
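
In outline, that stage-three Trojan horse looks something like the following sketch. The helper names match(), emit_backdoor(), emit_self_reproducer(), and compile_normally() are hypothetical stand-ins for this illustration, not real compiler internals:

    /* Hypothetical helpers assumed by this sketch: */
    int  match(char *source, char *pattern);
    void emit_backdoor(char *source);
    void emit_self_reproducer(char *source);
    void compile_normally(char *source);

    /* A sketch of Thompson's self-hiding compiler Trojan horse. */
    void compile(char *source)
    {
        if (match(source, "pattern recognizing the login program")) {
            emit_backdoor(source);        /* miscompile login so it also
                                             accepts a secret password */
            return;
        }
        if (match(source, "pattern recognizing the compiler itself")) {
            emit_self_reproducer(source); /* reinsert both Trojans into any
                                             freshly compiled compiler, so the
                                             bug survives even after the source
                                             has been cleaned up */
            return;
        }
        compile_normally(source);         /* everything else is untouched */
    }

The second branch is the crucial one: it is what lets the bug be deleted from the source code entirely while continuing to propagate through every future compiler binary.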

  • And the conclusion that Thompson draws is: is it ever

  • possible to trust software that was written by anyone else?

  • In this course we've talked about some of the tools that are available

  • to programmers that would allow them to go back in time-- for example,

  • we've discussed GitHub on several occasions--

  • and see prior versions of code.

  • In the 1980s, when this paper was written,

  • that wasn't necessarily possible.

  • It was relatively easy to hide source code changes so that the untrained eye

  • wouldn't know about them.

  • Code was not shared via the internet.

  • Code was shared via floppy disks or hard disks that were being

  • passed between people who needed them.

  • And so there was no easy way to verify that code that was written by somebody

  • else was actually trustworthy.

  • Now, again, this paper came out 35-plus years ago.

  • And it came out around the time that the Computer Fraud and Abuse

  • Act, which we've also previously discussed,

  • was being drafted and run through Congress.

  • Did lawmakers heed the advice of Ken Thompson?

  • Do we still today trust that our programs that we receive

  • or that we write are free of bugs?

  • Is there a way for us to verify that?

  • What should happen if code is found to be buggy?

  • What if it's unintentionally buggy?

  • What if it's maliciously buggy?

  • Do we have a way to challenge things like that?

  • Do we have a way to prosecute those kinds of cases

  • if the bug creates some sort of catastrophic failure in some business?

  • Not exactly.

  • The challenge of figuring out whether or not we should trust software

  • is something that we have to contend with every day.

  • And there's no bright-line answer for exactly how to do so.

  • Now let's turn to perhaps a more modern interpretation of this idea

  • and take a look at the Samsung Smart TV policy.

  • So this was a bit of news a few years ago,

  • that Samsung was recording or was capturing voice commands

  • so people could make use of their television without needing a remote.

  • You could say something like, television,

  • please turn the volume up, or television, change the channel.

  • But it turned out that when Samsung was collecting this information,

  • they were transmitting it to a third party, a third-party language

  • processor, which would ostensibly be taking the commands it heard

  • and feeding them into its own database to improve the quality of understanding

  • what these commands were.

  • So it would hear--

  • let's say thousands of people use this brand of television.

  • It would take the thousands of people's voices all making the same command,

  • feed it into its algorithm to process this command, and hopefully try

  • and come up with a better or more comprehensive understanding of what

  • that command meant to avoid the mistake where I say one thing,

  • and the TV does something else because it misinterprets what I say.

  • If you take a look at Samsung's policy, it says things like the device

  • will collect IP addresses, cookies, your hardware and software configuration--

  • that is, the settings that you have put onto your television-- and your browser information.

  • Some of these TVs, these smart TVs, have web browsers built into them.

  • And so you may be also sharing information about your history

  • and so on.

  • Is this necessarily a bad thing?

  • When it became a news story it was