Subtitles section Play video Print subtitles The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. PROFESSOR: OK. So yeah, problem set 2. There's a lot of work done. Congratulations, all the workers. 16 trillion hashes performed. How can we prove that? So this is a personal gripe that I hear a lot. That people say that proof of work doesn't scale. And that really bugs me because sometimes I think I sort of know what they're talking about, and they mean like, bitcoin doesn't scale, or like, these block chains have poor scalability properties, which sure, sure. They definitely do. But proof of work, it scales perfectly, in a theoretical sense. There's nothing that can scale better. You can prove an arbitrary amount of work in 0 of 1. Right? So in this case, well, how big were these headers? They were less than 100 bytes, right? 100 characters. And with that space, you can prove actually the entire work that happened over the entire problem set. So yeah. block chains, whole bunch of scalability problems. There's complex systems, all sorts of scaling issues. But proven work itself, in this pure form, scales really great. OK. So question. Not super intuitive, but how do you prove all the work ever done throughout the entire problem set in one line? Does anyone have any intuition about this how do you prove all the work from all 1,800 blocks with just one piece of data? AUDIENCE: Well you know that for each block, 2 to 33, work had to go into it. So you just need to know the number of blocks produced times the-- PROFESSOR: OK. Yeah, so the thing is how do I prove the number of blocks without showing all of them, right? So OK, it's a weird trick question. Andrew-- I think-- I remember Andrew Miller, who's now a professor at somewhere, Cornell? Who was not, during the time, wrote about this initially in the bitcoin forums. What you do is you just show the luckiest block, and I have not yet defined luckiest block. But in this case, it was mined by turtle. This is the block, the previous block reference was of 0065a2 turtle 1654244 and the hash of that block is 000c49a414 blah, blah, blah. So anything interesting or novel about this particular block that you can see? It's not-- AUDIENCE: [INAUDIBLE] PROFESSOR: What? AUDIENCE: It's better than... PROFESSOR: It's better. There's more work. So we didn't count the things as having more or less work just by a sum number of zeros. We just looked at the threshold, did it have enough 0 bits and accepted or rejected. But in this case, these 4 bytes, right? You needed 4 bytes and then 1 extra bit. You needed 33 bits. So these 4 are always worth 0. But then in c and red, there's another extra byte that's all 0s. And another extra half a byte that's all 0s. And then a c means for that nibble, that the highest bit was 1, right? So you've got this red stuff. There's a byte and a half extra-- or almost a byte and a half extra. So, yeah. So what you can do for a compact proof work. So if you look at this again, there's 4 green bytes, byte and a half is red, right? So that's 5 and 1/2 bytes, so 44 bits. And 2 to the 44 is 17 trillion, right, if you do out 17 something, which is what we expect, that's our proof, right? We did 16 trillion hashes for our calculation, and it shows up here. And another way to look at it is we needed 33 bits for a valid block. We have 44 bits here. That's 11 bits extra. That's 11 bits of being lucky. And so that's 11 bits to the 11, that's 2,048, which is pretty close to the 1,862 that we actually performed, right? So in this case, we're a little lucky. We might have only had 2 of the 10 and then we could only prove that we'd done like 8 trillion work instead of 16 trillion work. So there's probabilities here, but this is an interesting property that actually does work. You can prove all the work, to some approximation usually within a factor of 2, that the system has ever done just with 1 block header. So, yeah. This is fun because another way to look at it is that you've got this metagame, where for every block found, take the, I'm doing a hash and I need to find a hash with a lot of 0s to prove that I've done work. And you go a level deeper and say, OK, I'm finding a block. And I want to prove that I found an even better block than everyone else, right? The entry for admission here is find a valid block. And then from those blocks, since it's a uniform distribution with 1s and 0s, you're going to have this tail-end of like, happened to have lots of 0s that you didn't need in the beginning. And that can prove all the work ever done. There's a really interesting paper called, HyperLogLog, that uses this for non-bitcoin applications that uses it for set counting. Where, like on a website, you want to see how many unique visitors you've gotten or something. And you can store that in 0 of 1 space because you could keep track of OK, let me keep track of every IP address that's ever visited or something like that, or cookie. But instead, you hash them. See if the hash starts with a bunch of 0s or any arbitrary character, and then just store the lowest. And then, every time someone visits, hash it, compare. If it's lower, replace. If not, ignore. And then you have a very compact indication of how many visitors have visited. Anyway, that's like a super high-level view of it. But if you're interested in this stuff, HyperLogLog is a paper. It builds off of some other things. It has nothing to do with bitcoin, other than this property, where you've got this random function and you see how many 0s are in it. But I think these are cool, and so to me, this is a fun-- this is not used in bitcoin, right? In bitcoin, you actually download all the headers, but people have written papers about how you could use it. If long term the headers are big, you could prove. Have some checkpoint, where look, I proved all the previous work and now I build from there. Any questions about this idea? Yes? AUDIENCE: It's not really a proof. It's just a probability weighted. PROFESSOR: Yes, but the proof of work itself is the same. Not of real proof because you might have gotten lucky. So finding a block, it's like, well that's not a proof. There could be luck involved, there could be probability, but it's the exact same luck and probability that the underlying proof of work uses. So there's no further reduction in security or certainty because of this. Not really. And so I remember talking about this with someone a few years ago and saying, yeah proof of work is a misnomer. It's not really a proof, right? Maybe it's an argument of work or some probabilistic argument of work. How could you make it more certain as a proof? There's a bunch of ways. One way would be to have multiple nonces, where instead of just finding one nonce that satisfies it, you have to find several replaceable nonces and then iterate through them. That would be a much more certain proof. It would remove the idea of probability, to some extent. It would also completely break the system in a way that's like fairly unintuitive, and so I always was sort of joking like, I should make an altcoin, where you've got multiple nonces, and you could be like, yes, for more security and then people would buy it, but it completely breaks. The completely broken incentives and system, maybe I'll get to-- I'll let you guys think about it til Wednesday, and then draw it out and be like, why wouldn't this work? It's fun. It breaks in subtle but bad ways. This is talking about proof of work optimization. So if you look at this slide anyway, you've got these headers or blocks or whatever we're mining. So you've got kezike17, tomriddle, Thalita, all these people mining. It's interesting to see what people use. Some people use looks like base64, some people just use a decimal number, all sorts of different things. Who knows. There was also a lot of invalid blocks with valid work submitted, where there was four or five different things in my spaces. So there's been count, but actually a lot more work was done, and that's not even counting the human work of doing all these assignments. So sending this over the wire, or storing it on disk, some inefficiencies may jump out at you. What do you think you could do to make this more efficient or compress it or? Yeah? AUDIENCE: [INAUDIBLE] PROFESSOR: OK. That's an easy one. Oh, this doesn't work when I gave you the slides because you can just look at the next slide. Whoops, OK. Never mind. Yeah, so the first 8 characters are always going to be 0s by definition. If it's not, it's an invalid block so don't even bother sending it. So the first characters are 0, don't send them over the wire. Just have that implied and just start at the ninth character. That saves, well, really you'd have the serializing binary, so it only saves 4 bytes. In our case, it would say 8 bytes. That's cool. And then, also the entire previous hash. When you're sending a list of 5 here, like here, in order for it to be valid, this has to be the hash of the line above it. So just don't send the whole thing, right? Just send name nonce. The hash is also computable from anyone receiving it. That takes almost all of this space and we could compress this entire blockchain to just the first line, and that's the only full line we need. After that, it's just named nonce named nonce, and we get rid of 70-something percent of this base. That's pretty cool, right? Yeah this kind of header optimization is also very much possible in bitcoin, and it's not implemented. It's not done. If you want to, you could program it, change bitcoin, make a pull request. I think-- I mean, we've discussed it, and people are like, yeah that'd be cool but, no one's done it. So if you want to leave your mark and be like, I'm a bitcoin core contributor, and I optimized the proof of work propagation system or something