
  • [MUSIC PLAYING]

  • DAVID MALAN: Recall that an algorithm is just step-by-step instructions

  • for solving some problem.

  • Not unlike this problem here wherein I sought Mike Smith among the whole phone

  • book of names and numbers.

  • But up until now, we've really only focused

  • on those step-by-step instructions and not

  • so much on how the data we are searching is stored.

  • Of course, in this version of that problem, it's stored here on paper,

  • but in the digital world, it's of course not going to be paper, but 0's and 1's.

  • But it's one thing to say that the numbers and maybe even the names

  • are stored ultimately as 0's and 1's, but where and how exactly?

  • There are all those transistors, and they're flipping on and off,

  • but with respect to each other, are those numbers laid out left to right,

  • top to bottom, are they all over the place?

  • Let's actually take a look at that question now

  • and consider how a computer leverages what

  • are called data structures to facilitate implementation of algorithms.

  • Indeed, how you lay out a computer's data inside of its memory

  • has non-trivial impacts on the performance or efficiency

  • of your algorithms, whereas the algorithm itself

  • can be correct as we've seen, but not necessarily efficient logically.

  • Both space and the representation underneath the hood of your data

  • can also make a significant impact.

  • But let's simplify the world first.

  • And rather than focus on, say, a whole phone book of names and numbers,

  • let's focus just on numbers, and much smaller numbers that aren't even

  • phone numbers, but just integers, and only save seven of them at a time.

  • And I've hidden these seven numbers, if you will,

  • behind these seven yellow doors.

  • And so by knocking and opening, one of these doors

  • will reveal one number at a time.

  • And the goal at hand, though, is to find a very specific number,

  • just like I sought one specific phone number before, this time I want

  • to find the number 50 specifically.

  • Well, where to begin?

  • I'll go with the one closest to me and knock, knock, knock--

  • 15 is the number.

  • So a little bit low.

  • Let's proceed from there to see 23.

  • We seem to be getting closer.

  • Let's open this door next and-- oh, we seem to have veered down smaller,

  • so I'm a little confused.

  • But I have four doors left to check.

  • So 50 is not there and 50 is not there and 50 is, in fact, there.

  • So not bad.

  • Within just six steps, have I found the number in question.

  • But of course, to be fair, there were only seven doors.

  • So if we generalize that to say that there were n doors, where n is just

  • a number, well, that was roughly n doors I had to open

  • just to find the one that I sought.
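
Here's a minimal sketch in Python of that door-by-door approach, so-called linear search (the function name and this exact arrangement of numbers are illustrative, not from the lecture):

```python
def linear_search(values, target):
    """Check each value left to right; return its index, or None if absent."""
    for i in range(len(values)):
        if values[i] == target:
            return i
    return None

# One possible arrangement of the seven doors: six steps to reach 50,
# with the 4 left over behind the last unopened door.
doors = [15, 23, 16, 8, 42, 50, 4]
print(linear_search(doors, 50))  # 5, found after checking six doors
```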

  • So could I have done better?

  • You know, my instincts like yours were perhaps to start at the left

  • and move to the right, and we seem to be on a good path initially.

  • We went from 15 to 23, and then darn it if 16

  • didn't throw a wrench in the works, because I expected it,

  • perhaps naively, to be bigger and bigger as I moved right.

  • But honestly, had I not told you anything-- and indeed I did not-- then

  • you wouldn't have known anything about these numbers other than maybe

  • the number 50 is actually there.

  • I told you nothing as to the magnitude or the size

  • of any of the other numbers, let alone the order,

  • but in the world of the phone book, of course,

  • we were able to take for granted that those names were

  • sorted by the phone company for us-- from left to right, from A to Z.

  • But in this case, if your data is just added to the computer's memory one

  • at a time in no particular order, the onus

  • is on you, the programmer or algorithm, to find that number

  • you're interested in nonetheless.

  • Now what was left here?

  • Indeed, 4, which is even smaller than 50.

  • So these seven doors were by design randomly assigned a number.

  • And so you could do no better.

  • I might have gotten lucky.

  • I might not have gone with my initial instincts

  • and touch the number 15 at left.

  • I might have, effectively blinded, gone and touched 50 and just gotten lucky,

  • and then it would have been just one step.

  • But there's only a one in seven chance I would have been correct so quickly,

  • so that's not really an algorithm that I could

  • reproduce with the same efficiency again and again.

  • So how can I do better?

  • And how does the phone company enable us to do better?

  • Well they, of course, put in a huge amount of effort upfront

  • to sort all of those names and associated numbers from left

  • to right, from A to Z. And so that's a huge leg up for us,

  • because then I can assume I can do divide and conquer or so-called binary

  • search, dividing that phone book in two as

  • implied by "bi" in "binary," halving the problem again and again.

  • But someone's got to do that work for us, be it the phone company

  • or perhaps me with these numbers.

  • So let's take one more stab at this problem,

  • this time presuming that the seven doors in question

  • do, in fact, have the numbers behind them sorted from left to right,

  • small to big.

  • So where to find the number 50 now?

  • I have seven doors behind which are those same numbers,

  • but this time they are sorted from left to right.

  • And no skipping ahead thinking that, well, I remember all the other numbers,

  • so I know immediately where 50 is.

  • Let's assume for the moment that we don't know anything

  • about the other numbers other than the fact that they are sorted.

  • Well, my inclination is not to start at the left with this first door,

  • much like my inclination ultimately with that phone book was not to start

  • with the first page, but the middle.

  • And indeed, I'm going to go here to the middle of these doors and--

  • 16.

  • Not quite the one I want.

  • But if the doors are sorted now, I know that that number 50 is not to the left,

  • and so I'm going to go to the right.

  • Where do I go to the right?

  • Well, I have three doors left, I'm going to follow the same algorithm

  • and open that door in the middle and-- oh, so close.

  • I found only, if you will, the meaning of life.

  • So 42, though, is not the number I care about,

  • but I do know something about 50-- it's bigger than 42.

  • And so now, it's quite simply the case that-- aha, 50 is there,

  • it's going to be in that last number.

  • So whereas before it took me up to six steps to find the number 50,

  • and only then by luck did I find it where

  • it was because it was just randomly placed,

  • now I spent 1, 2, 3 steps in total, which is, of course, fewer than six.

  • And as these numbers of doors grow in size

  • and I have hundreds or thousands of doors,

  • surely it will be the case just like the phone book that halving this problem

  • again and again is going to get me to my answer

  • if it's there in logarithmic instead of linear time, so to speak.

  • But what's key to the success of this algorithm-- binary search--

  • is that the doors are not only sorted, but they are back-to-back-to-back.

  • Now I have the luxury of feet and I can move back and forth

  • among these numbers, but even my steps take me some amount of time and energy.

  • But fortunately, each such step just takes one unit of energy, if you will,

  • and I can immediately jump wherever I would like one step at a time.

  • But a computer is purely electronic, and in the context of memory,

  • doesn't actually need to take any steps.

  • Electronically a computer can jump to any location in memory

  • instantly in so-called constant time.

  • So what might take me several steps takes it just one.

  • And so that's an advantage a computer has and it's just one of the reasons

  • why they are so much faster than us at solving so many problems.

  • But the key ingredient to laying out the data for a computer to solve

  • your problems quickly is that you need to put your data back-to-back-to-back.

  • Because a computer at the end of the day,

  • yes, stores only 0's and 1's, but those 0's and 1's are generally

  • treated in units of, say, eight--

  • 8 bits per byte.

  • But those bytes, when storing numbers like this,

  • need those numbers to be back-to-back-to-back and not just

  • jumbled all over the place.

  • Because it needs to be the case that the computer is

  • allowed to do the simplest of arithmetic to figure out where to look.

  • Even I in my head am sort of doing a bit of math figuring out,

  • well where's the middle?

  • Even though among few doors you can pretty much eyeball it quickly.

  • But a computer's going to have to do a bit of arithmetic, so what

  • is that math?

  • Well if I have 1, 2, 3, 4, 5, 6, 7 doors initially,

  • and I want to find the middle one, I'm actually just going to do what?

  • 7 divided by 2, which gives me 3 and 1/2-- that's

  • not an integer that's that useful for counting doors,

  • so let's just round it down to 3.

  • So 7 divided by 2 is 3.5 rounded down to 3 suggests

  • mathematically that the number of the door that's in the middle of my doors

  • should be the one known as 3.

  • Now recall that a computer generally starts

  • counting at 0, because all bits being off represents 0 in decimal,

  • and so this is door 0, 1, 2, 3, 4, 5, 6.

  • So there's still seven doors, but the first is 0 and the last is called 6.

  • So if I'm looking for number 3, that's 0, 1, 2, 3.

  • And indeed, that's why I jumped to the middle of these doors,

  • because I went very specifically to location 3.

  • Now why did I jump to 42 next?

  • Of course, that was in the middle of the three remaining doors,

  • but how would a computer know mathematically where to go,

  • whereas we can just rather eyeball it here?

  • Well if you've got 3 doors divided by 2, that gives me, of course, 1.5--

  • let's round that down to 1.

  • So if we now re-number these doors, it's 0, 1, 2,

  • because these are the only three doors that exist, well door 1 is 0, 1--

  • the 42, and that's how a computer would know to jump right to 42.

  • Of course, with just one door left, it's pretty simple.

  • You needn't even do any of that math if there's just one,

  • and so we can immediately access that in constant time.
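
That middle-then-middle-of-the-middle arithmetic can be sketched in Python as follows (an iterative rendering of the idea; the names are illustrative):

```python
def binary_search(values, target):
    """Search a sorted list by repeatedly halving; return an index or None."""
    low, high = 0, len(values) - 1
    while low <= high:
        mid = (low + high) // 2   # integer division rounds down: 7 doors -> index 3
        if values[mid] == target:
            return mid
        elif values[mid] < target:
            low = mid + 1         # target must be to the right
        else:
            high = mid - 1        # target must be to the left
    return None

doors = [4, 8, 15, 16, 23, 42, 50]  # sorted and back-to-back
print(binary_search(doors, 50))     # opens 16, then 42, then 50: three steps
```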

  • In other words, even though my human feet are taking a bit of energy

  • to get from one door to another, a computer has the leg-up, so to speak,

  • of getting to these doors even quicker, because all it has to do

  • is a little bit of division, maybe some rounding,

  • and then jump exactly to that position in memory.

  • And that is what we call constant time, but it presupposes, again,

  • that the data is laid out back-to-back-to-back so that every one

  • of these numbers is an equal distance away from every other.

  • Because otherwise if you were to do this math

  • and come up with the numbers 3 or 1, you

  • have to be able to know where you're jumping in memory,

  • because that number 42 can't be down here, it has to be numerically in order

  • exactly where you expect.

  • And so in computer science and in programming, this kind of arrangement,

  • where you have doors-- or really data-- back-to-back-to-back, is

  • known as what's called an array.

  • An array is a contiguous block of memory wherein values are stored

  • back-to-back-to-back-to-back-- from left to right conceptually,

  • although of course, direction has less meaning once you're inside

  • of a computer.

  • Now it is thanks to these arrays that we were able to search,

  • even something like a phone book so quickly.

  • After all, you can imagine in the physical world,

  • a phone book isn't all that unlike an array,

  • albeit a more arcane version here, because its pages are indeed

  • back-to-back-to-back-to-back from left to right, which is wonderful.

  • And you'll recall when we searched a phone book,

  • we were already able to describe the efficiency via which

  • we were able to search it-- via each of those three algorithms.

  • One page at a time, two pages at a time, and then one-half

  • of the remaining problem at a time.

  • Well it turns out that there's a direct connection even

  • to the simplification of that same problem.

  • If I have n doors and I search them from left to right, that of course

  • might take me as many as six, seven total steps-- or n-- if the number I'm seeking

  • is all the way at the end.

  • I could have gone two doors at a time, although that really

  • would have gone off the rails with the randomly ordered numbers,

  • because there would have been no logic to just going left to right twice as

  • fast, because I would be missing every other element, never knowing

  • when to go back.

  • And in the case of binary search, my last algorithm where

  • I started in the middle and found 16, and then

  • started in the middle of that middle and found 42, and then

  • started in the middle of the middle and found my last number,

  • binary search is quite akin to what we did by tearing that problem in half

  • and in half.

  • So how did we describe the efficiency of that algorithm last time?

  • Well we proposed that my first algorithm was linear, this straight line in red

  • represented here by the label n, because for every page in the phone book,

  • in the worst case you might need one extra step to find someone

  • like Mike Smith.

  • And indeed, in the case of these doors, if there's just one more door added,

  • you might need one more step to find that number 50 or any other.

  • Now I could, once those doors are sorted,

  • go through them twice as fast, looking two doors at a time,

  • and if I go too far and find, say, 51, I could double-back and fix that mistake.

  • But what I ultimately did was divide and conquer.

  • Starting in the middle, and then the middle of the middle,

  • and the middle of the middle of the middle,

  • and that's what gives me this performance.

  • This so-called logarithmic time-- log base 2 of n--

  • which if nothing else means that we have a different shape fundamentally

  • to the performance of this algorithm.

  • It grows so much more slowly in time even as the problem gets really big.

  • And even off the screen here, imagine that even as n gets huge,

  • that green line would not seem to be going very high

  • even as the red and yellow ones do.

  • So in computer science, there are actually

  • formal labels we can apply to this sort of methodology of analyzing algorithms.

  • When you talk about upper bounds on just how much time an algorithm takes,

  • you might say this-- big O, quite literally.

  • That an algorithm is in a big O of some formula.

  • For instance, among the formulas it might be are these here--

  • n squared, or n log n, or n, or log n, or 1.

  • Which is to say you can represent the running time somewhat simply,

  • mathematically, using n-- or really any other placeholder-- as

  • a variable that represents the size of the problem in question.

  • So for instance, in the case of linear search,

  • when I'm searching that phone book left to right

  • or searching these doors left to right, in the worst case,

  • it might take me as many as n steps to find Mike or that 50,

  • and so we would say that that linear algorithm is

  • in big O of n, which is just a fancier way of saying quite simply that it's

  • indeed linear in time.

  • But sometimes I might get lucky, and indeed in the best case,

  • I might find Mike or 50 or anything else much faster,

  • and computer scientists also have ways of expressing lower bounds

  • on the running times of algorithms.

  • Whereby in the best case, perhaps, an algorithm

  • might take at least this much time.

  • And we use a capitalized omega to express that notion of a lower bound,

  • whereas again, a big O represents an upper bound on the same.

  • So we can use these same formulas, because depending on the algorithm,

  • it might indeed take n squared steps or just 1 or constant number thereof,

  • but we can consider even linear search as having a lower bound,

  • because in the best case, maybe Mike or maybe 50

  • or any other input to the problem just so

  • happens to be at the very beginning of that book or those doors.

  • And so in the best case, a lower bound on the running time of linear search

  • might indeed be omega of 1 because you might just

  • get lucky and take one step or two or three

  • or terribly few, but independent of the number n.

  • And so there, we might express this lower bound as well.

  • Now meanwhile there's one more Greek symbol

  • here, theta, capitalized, which represents a coincidence of upper

  • and lower bounds.

  • Whereby if it happens to be the case for some algorithm

  • that you have an upper bound and a lower bound that are the same,

  • you can equivalently say not both of those statements, but quite simply

  • that the algorithm is in theta of some formula.
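
In this notation, the searches seen so far can be summarized as follows (a compact restatement of the bounds the lecture has established):

$$\text{linear search} \in O(n) \text{ and } \Omega(1); \qquad \text{binary search} \in O(\log n) \text{ and } \Omega(1)$$

$$\text{an algorithm that always touches all } n \text{ elements} \in \Theta(n), \text{ since its } O \text{ and } \Omega \text{ bounds coincide}$$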

  • Now suffice it to say, this green line is good.

  • Indeed, any time we achieve logarithmic time instead of, say, linear time, we

  • have made an improvement.

  • But what did we presuppose?

  • Well, we presupposed in both the case of the phone book

  • and in the case of those doors that they were sorted in advance for us.

  • By me in the case of the doors and by the phone

  • company in the case of the book.

  • But what did it cost me and what did it cost

  • them to sort all of those numbers and names

  • just to enable us ultimately to search logarithmically?

  • Well let's consider that in the context of, again, some numbers, this time

  • some numbers that I myself can move around.

  • Here we have eight cups, and on these eight cups are eight numbers from 1

  • through 8.

  • And they're indeed sorted from smallest to largest, though I could equivalently

  • do this problem from largest to smallest so long as we all

  • agree what the goal is.

  • Well let me go ahead and just randomly shuffle some of these cups

  • so that not everything is in order anymore,

  • and indeed now they're fairly jumbled, and indeed not in the order I want,

  • so some work needs to be done.

  • Now why might they arrive in this order?

  • Well in the case of the phone book, certainly new people

  • are moving into a town every day, and so they're coming in not themselves

  • in alphabetical order, but seemingly random,

  • and it's up to the phone company to slot them

  • into the right place in a phone book for the sake of next year's print.

  • And the same thing with those doors.

  • Were I to add more and more numbers behind those doors,

  • I'd need to decide where to put them, and they're not necessarily

  • going to arrive from my input source in the order I want.

  • So here, then, I have some randomly-ordered data,

  • how do I go about sorting it quickly?

  • Well, let's take a look at the first problem I see.

  • 2 and 1 are out of order, so let me just go ahead and swap, so to speak,

  • those two.

  • I've now improved the state of my cups,

  • and I've made some progress still, but 2 and 6

  • seem OK even though maybe there should be some cups in between.

  • So let's look at the next pair now.

  • We have 6 and 5, which definitely are out of order, so let's switch those.

  • 6 and 4 are the same, out of order.

  • 6 and 3, just as much.

  • 6 and 8 are not quite back-to-back, but there's probably

  • going to be a number in-between, but they are at least in the right order,

  • because 6, of course, is less than 8.

  • And then lastly we have 8 and 7.

  • Let's swap those here and done--

  • or are we not?

  • Well I've made improvements with every such swap, but some of these cups

  • still remain out of order.

  • Now these two are all set.

  • 2 and 5 are as well, even though ultimately we

  • might need some numbers between them, but 4 and 5 are indeed out of order.

  • 3 and 5 just as much.

  • 6 and 5 are OK, 7 and 6 are OK, and 8 and 7 as well.

  • So we're almost done there, but I do see some glitches.

  • So let's again compare all of these cups pairwise--

  • 1, 2; 2, 4-- oops, 4, 3, let's swap that.

  • Let's keep going just to be safe.

  • 4, 5; 5, 6; 6, 7; 7, 8.

  • And by way of this process, just comparing cups back-to-back,

  • we can fix any mistakes we see.

  • Just for good measure, let me do this once more.

  • 1, 2; 2, 3; 3, 4; 4, 5; 5, 6; 6, 7; 7, 8.

  • Now this time that I've gone all the way from left to right checking

  • that every cup is in order, I can safely conclude that these cups are sorted.

  • After all, if I just went from left to right and did no work,

  • why would I presume that if I do that same algorithm again,

  • I'd make any changes?

  • I wouldn't, so I can quit at this point.

  • So that's all fine and good, but perhaps we could have sorted these differently.

  • That felt a little tedious and I felt like I was doing a lot of work.

  • What if I just try to select the cups I want rather than deal

  • with two cups at a time?

  • Let's go ahead and randomly shuffle these again in any old order,

  • making sure to perturb what was otherwise left to right.

  • And here we have now another random assortment of cups.

  • But you know what I'm going to do this time?

  • I'm just going to select the smallest I see.

  • 2 is already pretty small, so I'll start as before on the left.

  • So let's now check the other cups to see if there's something smaller that I

  • might prefer to be in this location.

  • 3, 1-- ooh, 1 is better, I'm going to make mental note of this one.

  • 5, 8, 7, 6, 4-- all right, so 1 would seem to be the smallest number.

  • So I'm going to go ahead and put this where it belongs,

  • which is right here at the side.

  • There's really no room for it, but you know what?

  • These were randomly ordered, so let me just go ahead

  • and evict whatever's there, too, and put 1 in its place.

  • Now to be fair, I might have messed things up a little bit,

  • but no more so than I might have when I received these numbers randomly.

  • In fact, I might even get lucky-- by evicting a cup,

  • I might end up putting it in the right place so it all washes out in the end.

  • Now let's go ahead and select the next smallest number,

  • but not bother looking at that first one anymore.

  • So 3 is pretty small, so I'll keep that in mind.

  • 2 is even smaller, so I'll forget about 3 and now remember 2.

  • 5 is bigger, 8 and 7 and 6 and 4--

  • all right, 2 now seems to be the next smallest number I can select.

  • I know it belongs there, but 3's already there, so let's evict 3

  • and there you go, I got lucky.

  • Now I have 1 and 2 in the right place.

  • Let's again select the next smallest number.

  • I see 3 here, and again, I don't necessarily know as a computer

  • if I'm only looking at one number at a time

  • if there are, in fact, anything smaller to its side.

  • So let's check-- 5, 8, 7, 6, 4-- nope.

  • So 3 I shall select, and I got lucky, I'll leave it alone.

  • How about the next smallest number?

  • 5 is pretty small, but-- 8, 7, 6-- 4 is even smaller.

  • Let's select this one, put it in its place, evicting the 5

  • and putting it where there's room.

  • 8 is not that small, but it's all I know now.

  • But ooh-- 7 is smaller, I'll remember this.

  • 6 is even smaller, I'll remember that, and it feels

  • like I'm creating some work for myself.

  • 5 is the next smallest, 8's in the way.

  • We'll evict 8 and put 5 right there.

  • 7 is pretty small, but 6 is even smaller, and still smaller than 8,

  • so let's pick up 6, evict 7, and put 7 where 6 was.

  • Now for good measure, we're obviously done, but I as the computer

  • don't know that yet if I'm just looking at one of these cups or, if you will,

  • doors at a time.

  • 7's pretty small, 8 is no smaller, so 7 I've selected

  • to stay right there in its place.

  • 8 as well, by that same logic, is now in its right place.

  • So it turns out that these two algorithms

  • that I concocted along the way actually do have some formal semantics.

  • In fact, in computer science, we'd call the first

  • of those algorithms that thing here, bubble sort.

  • Because in fact, as you compare two cups side-by-side and swap them on occasion

  • in order to fix transpositions, well, your largest numbers

  • would seem to be bubbling their way up to the top,

  • or equivalently, the smallest ones down to the end, and so bubble sort

  • is the formal name for that algorithm.

  • How might I express this more succinctly than my voice over there?

  • Well let me propose this pseudocode.

  • There's no one way to describe this or any algorithm,

  • but this was as few English words as I could come up with and still

  • be pretty precise.

  • So repeat until no swaps the following-- for i from 0 to n minus 2,

  • if the i-th and i-th plus 1 elements are out of order, swap them.

  • Now why this lingo?

  • Well computational thinking is all about expressing yourself

  • very methodically, very clearly, and ultimately

  • defining, say, some variables or terms that you'll need in your arguments.

  • And so here what I've done is adopt a convention.

  • I'm using i to represent an integer--

  • some sort of counter--

  • to represent the index of each of my cups or doors or pages.

  • And here, we are adopting the convention, too,

  • of starting to count from 0.

  • And so if I want to start looking at the first cup, a.k.a.

  • 0, I want to keep looking up, up to the cup called n minus 2,

  • because if my first cup is cup 0, and this is then 1, 2, 3, 4, 5, 6, 7,

  • indeed the cup is labeled 8, but it's in position 7.

  • And so this position more generally, if there are n cups, would be n minus 1.

  • So bubble sort is telling me to start at 0 and then look up to n minus 2,

  • because in the next line of code, I'm supposed

  • to compare the i-th element and the i-th plus 1, so to speak.

  • So I don't want to look all the way to the end,

  • I want to look one shy to the end, because I know in looking at pairs,

  • I'm looking at this one as well as the one to its right, a.k.a.

  • i plus 1.

  • So the algorithm ultimately is just saying,

  • as you repeat that process again and again until there are no swaps, just

  • as I proposed, you're swapping any two cups that with respect to each other

  • are out of order.
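
That pseudocode translates nearly line for line into Python (a sketch; the function and variable names are illustrative):

```python
def bubble_sort(values):
    """Repeat until no swaps: for i from 0 to n-2, swap out-of-order neighbors."""
    n = len(values)
    swapped = True
    while swapped:              # "repeat until no swaps"
        swapped = False
        for i in range(n - 1):  # i from 0 to n-2, so values[i+1] always exists
            if values[i] > values[i + 1]:
                values[i], values[i + 1] = values[i + 1], values[i]
                swapped = True

cups = [2, 1, 6, 5, 4, 3, 8, 7]  # the shuffled cups from the demonstration
bubble_sort(cups)
print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]
```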

  • And so this, too, is an example more generally of solving small local problems

  • and achieving ultimately a global result, if you will.

  • Because with each swap of those cups, I'm improving the quality of my data.

  • And each swap in and of itself doesn't necessarily solve the big picture,

  • but together, when we aggregate all of those smaller solutions,

  • we assemble the final result.

  • Now what about that second algorithm, wherein

  • I started again with some random cups, and then that time I

  • selected one at a time the number I actually wanted in place?

  • I first sought out the smallest.

  • I found that to be 1 and I put it all the way there on the left.

  • And I then sought out the next smallest number,

  • which after checking the remaining cups, I determined was 2.

  • And so I put 2 second in place.

  • And then I repeated that process again and again,

  • not necessarily knowing in advance from anyone what numbers I'd find.

  • Because I checked each and every remaining cup,

  • I was able to conclude safely that I had indeed found the next smallest element.

  • And so that algorithm, too, has a name--

  • selection sort.

  • And I might describe it in pseudocode similar

  • in structure but with different logic ultimately.

  • Let me propose that we do for i from 0 to n minus 1,

  • where again, n is the number of cups, and 0 is by convention my first cup,

  • and n minus 1, therefore, is my last.

  • And what I then want to do is find the smallest element between the i-th

  • element and the last--

  • at n minus 1.

  • That is, find the smallest element between wherever you've begun

  • and that last element, n minus 1.

  • And then if-- when you've found that smallest element,

  • you swap it with the i-th element.

  • And that's why I was picking up one cup and another

  • and swapping them in place-- evicting one and putting one where it belongs.

  • And you do this again and again and again,

  • because each time you're incrementing i by 1.

  • So whereas the first iteration of this loop will start here all the way left,

  • the second iteration will start here, and the third iteration

  • will start here.

  • And so the amount of problem to be solved

  • is steadily decreasing until I have 1 and then 0 cups left.
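
That pseudocode, too, maps directly onto Python (again a sketch with illustrative names):

```python
def selection_sort(values):
    """For i from 0 to n-1: find the smallest element from i onward, swap it into place."""
    n = len(values)
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):  # scan only the unsorted remainder
            if values[j] < values[smallest]:
                smallest = j       # remember the smallest seen so far
        # the "eviction": swap the smallest into position i
        values[i], values[smallest] = values[smallest], values[i]

cups = [2, 3, 1, 5, 8, 7, 6, 4]  # the reshuffled cups from the demonstration
selection_sort(cups)
print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]
```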

  • Now it certainly took some work to sort those n cups,

  • but how much work did it take?

  • Well in the case of bubble sort, what was I doing on each pass

  • through these cups?

  • Well I was comparing and then potentially swapping

  • each adjacent pair of cups, and then repeating myself again and again.

  • Well if we have here n cups, how many pairs

  • can you create which you then consider swapping?

  • Well if I have 8 cups, I can seemingly make 1, 2, 3, 4, 5, 6,

  • 7 pairs out of those 8 cups, so more generally n minus 1.

  • So on each pass here, it would seem that I'm making n minus 1 comparisons.

  • Now how many passes do I need to ultimately make?

  • It would seem to be roughly n, because in the worst case,

  • these cups might be completely out of order.

  • Which is to say, I might indeed do n things n minus 1 times,

  • and if you multiply that out, I'm going to get some factor of n squared.

  • But what about selection sort, wherein I instead looked through all of the cups,

  • selecting first the smallest, and then repeating

  • that process for the next smallest still?

  • Well in that case, I started with n cups,

  • and I might need to look at all n, and then

  • once I found that, I might instead look at n minus 1.

  • So there, too, I seem to be summing something like n plus n minus 1

  • plus n minus 2 and so forth, so let's see

  • if we can't now summarize this as well.

  • Well let me propose more mathematically, that, say, with selection sort,

  • what we've done is this.

  • In looking for that smallest cup, I had to make n minus 1 comparisons.

  • Because as I identified the smallest cup I'd yet seen,

  • I compared it to no more than n minus 1 others.

  • Now if the first selection of a cup took me n minus 1 steps but then it's done,

  • the next selection of the next smallest cup would

  • have taken me only n minus 2 steps.

  • And if you continue that logic with each pass,

  • you have to do a little bit less work until you're left with just one

  • very last cup at the end, such as 8.

  • So what does this actually sum to?

  • Well you might not remember or see it at first glance,

  • but it turns out, particularly if you look at one of those charts

  • at the back of a textbook, that this summation or series actually

  • aggregates to n times n minus 1, all divided by 2.

  • Now this you can perhaps multiply out a bit more readily as

  • in n squared minus n all divided by 2.

  • And if we factor that out, we can now get n squared

  • divided by 2 minus n divided by 2.
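
In symbols, the series just described is

$$(n-1) + (n-2) + \cdots + 1 \;=\; \frac{n(n-1)}{2} \;=\; \frac{n^2}{2} - \frac{n}{2}.$$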

  • Now which of these terms, n squared divided by 2 or n divided by 2,

  • tends to dominate the other?

  • That is to say, as n gets larger and larger,

  • which of these mathematical expressions has the biggest effect

  • on the number of steps?

  • Well surely it's n squared, albeit divided by 2, because as n gets large,

  • n squared is certainly larger than n.

  • And so what a computer scientist here would typically do

  • is just ignore those lower-order terms, so to speak.

  • And they would say, with a figurative or literal wave of the hand,

  • that this algorithm is on the order of n squared.

  • That isn't to say it's precisely that many steps,

  • but rather as n gets really large, it is pretty much

  • that n squared term that really matters the most.

  • Now this is not a form of proof, but rather a proof by example, if you will,

  • but let's see if I can't convince you with a single example numerically

  • of the impact of that square.

  • Well if we start again with n squared over 2 minus n over 2 and say n

  • is maybe 1 million initially-- so not eight cups, not 1,000 pages in a book,

  • but 1 million numbers or any other elements.

  • What does this actually sum to?

  • Well 1 million squared divided by 2 minus 1 million divided by 2

  • happens to be 500 billion minus 500,000, which of course is 499,999,500,000.
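
That arithmetic is easy to double-check in Python:

```python
n = 1_000_000
print(n**2 // 2 - n // 2)  # 499999500000, i.e., 499,999,500,000
```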

  • Now I daresay that is pretty darn close to big O of n squared.

  • Why?

  • Well if n squared alone were, say, 1 trillion, then

  • halving it leaves roughly 500 billion, and that's still pretty close.

  • Now in real terms, that does not equal the same number of steps,

  • but it gives us a general sense it's on the order of this many steps,

  • because if we plugged in larger and larger values for n,

  • that difference would not even be as extreme.

  • Well why don't we take a look now at these algorithms in a different form

  • altogether without the physical limitation of me as the computer?

  • Pictured here is, if you will, an array of numbers, but pictured graphically.

  • Wherein we have vertical bars, and the taller

  • the bar, the bigger the number it represents.

  • So big bar is big number, small bar is small number,

  • but they're clearly, therefore, unsorted.

  • Via these algorithms we've seen, bubble sort and selection sort,

  • what does it actually look like to sort so many elements?

  • Let's take a look.

  • In this tool, I'll proceed to choose my first algorithm,

  • which shall be, say, bubble sort.

  • And you'll see rather slowly that this algorithm is indeed comparing

  • pairwise elements, and if--

  • and only if they're out of order, swapping them again and again.

  • Now to be fair, this quickly gets tedious,

  • so let me increase the animation speed here.

  • And now you can rather see that bubbling up of the largest.

  • Previously it was my 8 and my 7 and 6.

  • Here we have 99, 98, 97, but indeed, those tallest bars

  • are making their way up.

  • So let's turn our attention next to this other algorithm, selection sort,

  • to see if it looks or perhaps feels rather different.

  • Here now we have selection sort each time

  • going through the entire list looking for the smallest possible element.

  • Highlighted in red for just a moment here is

  • 9, because we have not yet-- oh, now-- found

  • a smaller element, now 2, and now 1.

  • And we'll continue looking through the rest of the numbers just

  • to be sure we don't find something smaller, and once we do,

  • 1 goes into place.

  • And then we repeat that process, but we do fewer steps now,

  • because whereas there are n total bars, we don't need to look at the leftmost

  • now because it's sorted, we only need look at n minus 1.

  • So this process again will repeat.

  • We found 2.

  • We're just double-checking that there's not something smaller,

  • and now 2 is in its place.

  • Now we humans, of course, have the advantage

  • of having an aerial view, if you will, of all this data.

  • And certainly a computer could remember more than just

  • the smallest number it's recently seen.

  • Why not for efficiency remember the two smallest numbers?

  • The three smallest numbers?

  • The four smallest numbers?

  • That's fine, but that argument is quickly devolving into--

  • just remember all the original numbers.

  • And so yes, you could perhaps save some time,

  • but it sounds like you're asking for more and more space

  • with which to remember the answers to those questions.

  • Now this, too, would seem to be taking us all day.

  • Even if we down here increase the animation speed,

  • it now is selecting those elements a bit faster and faster,

  • but there's still so much work to be done.

  • Indeed, these comparison-based sorts that are comparing things again

  • and again and then redoing that work in some form to improve the problem still

  • just tend to end up on the order of--

  • bingo, of n squared.

  • Which is to say that n squared or something quadratic

  • tends to be rather slow.

  • And this is quite a contrast to our logarithmic time before,

  • but that logarithm thus far was for searching, not sorting.

  • So let's compare these two now side by side,

  • albeit with a different tool that presents the same information

  • graphically sideways.

  • Here again we have bars, and small bar is small number,

  • and big bar is big number, but here, they've simply been rotated 90 degrees.

  • On the left here we have selection sort, on the right here bubble sort,

  • both of whose bars are randomly ordered so that neither

  • has an edge necessarily over the other.

  • Let's go ahead and play all and see what happens here.

  • And you'll see that indeed, bubbles bubbling up

  • and selection is improving its selections as we go.

  • Bubble would seem to have won because selection's got a bit more work,

  • but there, too, it's pretty close to a tie.

  • So can we do better?

  • Well it turns out we can, so long as we use a bit more of that intuition

  • we had when we started thinking computationally

  • and we divided and conquered.

  • In other words, why not, given n doors or n cups or n pages,

  • why don't we divide and conquer that problem again and again?

  • In other words, in the context of the cups,

  • why don't I simply sort for you the left half and then the right half,

  • and then with two sorted halves, just interweave them for you together.

  • That would seem to be a little different from walking back and forth

  • and back and forth and swapping elements again and again.

  • Just do a little bit of work here, a little bit

  • more now, and then reassemble your total work.

  • Now of course, if I simply say, I'll sort this left half,

  • what does it mean to sort this left half?

  • Well, I dare say this left half can be divided

  • into a left half of the left half, thereby making the problem smaller.

  • So somehow or other, we could leverage that intuition of binary search,

  • but apply it to sort.

  • It's not going to be in the end quite as fast as binary search,

  • because with sort, you have to deal with all of the elements,

  • you can't simply tear half of the problem

  • away because you'd be leaving half of your elements unsorted.

  • But it turns out there's many algorithms that

  • are faster than selection and bubble sort, and one of those

  • is called merge sort.

  • And merge sort leverages precisely this intuition of dividing

  • a problem in half and in half, and to be fair, touching all of those halves

  • ultimately, but doing it in a way that's more efficient and less

  • comparison-based than bubble sort and selection sort themselves.
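
The lecture doesn't show merge sort's code, but a minimal recursive sketch in Python looks like this (the names are illustrative):

```python
def merge_sort(values):
    """Sort by halving: sort each half, then interweave (merge) the sorted halves."""
    if len(values) <= 1:                # one element has no halves; nothing to do
        return values
    mid = len(values) // 2
    left = merge_sort(values[:mid])     # sort the left half...
    right = merge_sort(values[mid:])    # ...and the right half
    merged = []                         # the extra space merge sort pays for
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # whichever half has elements remaining
    merged.extend(right[j:])
    return merged

print(merge_sort([2, 1, 6, 5, 4, 3, 8, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```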

  • So let me go ahead and play all now with these three sets of bars

  • and see just which one wins now.

  • And after just a moment, there's nothing more

  • to say-- merge sort has already won, if you will-- only now has bubble

  • finished, and now selection.

  • And perhaps this was a fluke-- to be fair,

  • these numbers are random, maybe merge sort got lucky.

  • Let's go ahead and play the test once more with other numbers.

  • And indeed it again is done.

  • Let me play it a third and final time, but notice the pattern

  • now that emerges with merge sort.

  • You can see if you look closely the actual halving again and again.

  • And indeed, it seems that half of the list gets sorted,

  • and then you reassemble it at the very end.

  • And indeed, let's zoom in on this algorithm

  • now and look specifically at merge sort alone.

  • Here we have merge sort, and highlighted in colors

  • as we do work is exactly the elements you're sorting again and again.

  • The reason so few of these bars are being

  • looked at at a time is because, again, logically or recursively, if you will,

  • we are sorting first the left half--

  • But no, the left half of the left half.

  • But no, the left half of the left half of the left half and so

  • forth, and what this really boils down to

  • ultimately is sorting eventually individual elements.

  • But if I hand you one element and I say, please sort this,

  • it has no halves, so your work is done-- you don't need to do a thing.

  • But then if you have two halves, each of size 1,

  • there might indeed be work to be done there,

  • because if one is smaller than the other or one is larger than the other,

  • you do need to interleave those for me to merge them.

  • And that's exactly what merge sort's doing here.

  • Allow me to increase the animation speed and you'll see as we go,

  • that half of the list is getting sorted at a time.

  • It's not perfect and it's not perfectly smooth,

  • because half of the other elements are set aside below,

  • but now we're re-merging the two halves.

  • And that was fast, and it finished faster

  • indeed than bubble and selection sort would have,

  • but there was a price being paid.

  • If you think back to our vertical visualization of bubble sort

  • and selection sort, they were doing all of their work in place.

  • Merge sort seemed to be getting a little greedy on us, if you will,

  • in that it was temporarily putting some of those bars down here,

  • effectively using twice as much space as those first two algorithms, selection

  • and bubble.

  • And indeed, that's where merge sort gets its edge fundamentally.

  • It's not just a better algorithm, per se, and better thought-out,

  • but it actually additionally consumes more resources-- not time, but space.

  • By using twice as much space-- not just the top half of the screen,

  • but the bottom--

  • can merge sort temporarily put some of its work over here,

  • continue doing some other work, and then reassemble them together.

  • Both selection sort and bubble sort did not have that advantage.

  • They had to do everything in place, which

  • is why we had to swap so many things so many times.

  • We had far fewer spots in which to work on that table.

  • But with merge sort, spend a bit more space,

  • and you can reduce that amount of time.

  • Now all of these algorithms assume that our data is back-to-back-to-back--

  • that is, stored in an array.

  • And that's great, because that's exactly how a computer is so inclined

  • to store data inherently.

  • For instance, pictured here is a stick of memory of RAM--

  • Random Access Memory.

  • And indeed, albeit a bit of a misnomer, that R in RAM-- random-- actually

  • means that a computer can jump instantly, in constant time,

  • to a specific byte.

  • And that's so important when we want to jump

  • around our data, our cups, or our pages in order to get at data instantly,

  • if you will.

  • And the reason it is so conducive to laying out information back-to-back,

  • contiguously, in memory becomes apparent if we consider one of these black

  • chips on this DIMM-- or Dual In-line Memory Module--

  • of which we have here, if you will,

  • an artist's rendition at hand.

  • That artist's rendition might propose that if you

  • have some number of bytes in this chip, say 1 billion for 1 gigabyte,

  • it certainly stands to reason that we humans could

  • number those bytes from 0 on up--

  • from 0 to 1 billion, roughly speaking.

  • And so the top left one here might be 0, the next one might be 1,

  • the next one thereafter should be 2, and so we can

  • number each and every one of our bytes.

  • And so when you store a number on a cup or a number behind a door,

  • that amounts to just writing those numbers inside of each of these boxes.

  • And each is next to the other, and so with simple arithmetic,

  • a bit of division and rounding, you can

  • jump instantly to any one of these addresses.

  • There are no moving parts here to do any work like my human feet might

  • have to do in our real world.

  • Rather the computer can jump instantly to that so-called address

  • or index of the array.
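
That "simple arithmetic" is, conceptually, just one multiplication and one addition (a sketch of the idea, not actual Python memory management):

```python
# If an array starts at some base address and every element occupies the same
# number of bytes, the address of element i is a single multiply-and-add away.
base_address = 0       # hypothetical address of element 0
element_size = 4       # e.g., a 4-byte integer

def address_of(i):
    return base_address + i * element_size

print(address_of(3))   # jump straight to the middle of 7 elements: byte 12
```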

  • Now what can we do when we have a canvas that

  • allows us to layout memory in this way?

  • We can represent any number of types.

  • Indeed in Python, there are all sorts of types of data.

  • For instance, bool for a Boolean value and float for a floating point

  • value, a real number with a decimal.

  • An int for an integer and str for a string.

  • Each of those is laid out in memory in some particular way that's

  • conducive to accessing it efficiently.

  • But that's precisely why, too, we've run into issues

  • when using something like a float, because if you decide a priori to use

  • only so many bytes, the bytes to the left and to the right,

  • above and below, might end up getting used by other parts of your program.

  • And so if you've only asked for, say, 32 or 64 bits or 4 or 8 bytes,

  • because you're then going to be surrounded by other data,

  • that floating point value or some other can only be ultimately so precise.

  • Because ultimately yes, we're operating in bits,

  • but those bits are physically laid out in some order.
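
You can see that fixed-width imprecision directly in Python:

```python
print(0.1 + 0.2)         # 0.30000000000000004: only so many bits of precision
print(0.1 + 0.2 == 0.3)  # False
```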

  • So with that said, what are the options via which we can paint on this canvas?

  • Surely it would be nice if we could store

  • data not necessarily always back-to-back in this way,

  • but we can create more sophisticated data structures

  • so as to support not only these types here, but also ones like these.

  • Dict in Python for dictionary, otherwise known as a hash table.

  • And list for a sort of array that can grow and shrink, and range

  • for a range of values.

  • Set for a collection of values that contain

  • no duplicates, and tuples, something like x, y or latitude, longitude.
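
In Python syntax, those higher-level types look like this (the values are illustrative):

```python
phone_book = {"Mike Smith": "555-0100"}  # dict: a hash table of key-value pairs
numbers = [4, 8, 15, 16, 23, 42]         # list: an array that can grow and shrink
indices = range(7)                       # range: the values 0 through 6
unique = {4, 8, 15}                      # set: duplicates are discarded
point = (42.37, -71.12)                  # tuple: e.g., latitude, longitude
```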

  • These concepts, surely, would be nice to have

  • accessible to us in higher-level contexts like Python,

  • but if at the end of the day all we have is bytes of memory back-to-back,

  • we need some layers of abstraction on top of that memory

  • so as to implement these more sophisticated structures.

  • So we'll take a look at a few in particular-- int

  • and str and dict and list-- because all of those

  • somehow need to be built on top of these lower-level principles of memory.

  • So how might this work and what problems might we solve?

  • Let's now use the board as my canvas, drawing on it

  • that same grid of rows and columns in order

  • to divide this screen into that many bytes.

  • And I'll go ahead and divide this board into these squares, each one of which

  • represents an individual byte, and each of those bytes, of course,

  • has some number associated with it.

  • That number is not the number inside of that box, per se, not the bits

  • that compose it, but rather just metadata-- an index

  • or address that exists implicitly, but is not actually stored.

  • This then might be index 0 or address 0, this

  • might be 1, this 2, this 3, this one 4, this one 5.

  • And if we, for the artist's sake, move to the next row,

  • we might call this 6 and this 7, and so forth.

  • Now suppose we want to store some actual values in this memory,

  • well let's go ahead and do just that.

  • We might store the actual number 4 here, followed by 8,

  • followed by 15 and 16, perhaps followed by 23, and then 42.

  • And so we have some random numbers inside of this memory,

  • and because those numbers are back-to-back,

  • we can call this an array of size 6.

  • Its first index is 0, its last index is 5,

  • and between there are six total values.

  • Now what can we do if we're ready to add a seventh number to this list?

  • Well, we could certainly put it right here

  • because this is the next appropriate location,

  • but it depends whether that spot is still available.

  • Because the way a computer typically works

  • is that when you're writing a program, you

  • need to decide in advance how much memory you want.

  • And you tell the computer by way of the operating system,

  • be it Windows or macOS, Linux, or something else,

  • how many bytes of memory you would like to allocate to your particular problem.

  • And if I only had the foresight to say, I

  • would like 6 bytes in which to store 6 numbers,

  • the operating system might have handed me that back and said,

  • fine, here you go, but the operating system thereafter

  • might have proceeded to allocate subsequent adjacent bytes, like 6

  • and 7, to some other aspect of your program.

  • Which is to say, you might have painted yourself into a bit of a corner

  • by only in code asking the operating system for just those initial 6 bytes.

  • You instead might have wanted to ask for more bytes

  • so as to allow yourself this room to grow,

  • but if you didn't do that in code, you might just be unlucky.

  • But that's the price you pay for an array.

  • You have this wonderfully efficient ability

  • to search it randomly, if you will, which

  • is to say instantly via arithmetic.

  • You can jump to the beginning or the end or even the middle,

  • as we've seen, by just doing perhaps some addition, subtraction, division,

  • and rounding, and that gets you ultimately right where

  • you want to go in some constant and very few number of steps.

  • But unfortunately, because you wanted all of that memory

  • back-to-back-to-back, it's up to you to decide how much of it you want.

  • And if the operating system has already allocated bytes 6 and 7

  • and elsewhere on the board to other parts of the program,

  • you might be faced with the decision as to just say, no,

  • I cannot accept any more data, or you might say, OK, operating system,

  • what if I don't mind where I am in memory-- and you probably don't--

  • but I would like you to find me more bytes somewhere else?

  • Rather like going from a one-bedroom to a two-bedroom apartment

  • so that you have more room, you might physically have to pack your bags

  • and go somewhere else.

  • Unfortunately, just like in the real world, that's not without cost.

  • You need to pack those bags and physically move, which takes time,

  • and so will it take you and the operating system some time

  • to relocate every one of your values.

  • So sure, there might be plenty of space down here below on multiple rows

  • and even not pictured, but it's going to take a non-zero amount of time

  • to relocate that 4 and 8 and 15 and that 16 and 23 and 42 to new locations.
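
Python's own lists hide this cost, but the relocation can be sketched explicitly with fixed-size blocks (illustrative, to show the linear-time copy):

```python
# Outgrowing a fixed-size block of memory: allocate a bigger block elsewhere
# and copy every element over, one by one -- a linear-time move.
old = [4, 8, 15, 16, 23, 42]    # the original six values
new = [None] * (len(old) + 1)   # a bigger block somewhere else in memory
for i in range(len(old)):       # n copy steps: the cost of "moving apartments"
    new[i] = old[i]
new[len(old)] = 50              # finally, room for the seventh number
print(new)                      # [4, 8, 15, 16, 23, 42, 50]
```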

  • That might be your only option if you want to support more data,

  • and indeed, most programs would want to-- it would be an unfortunate situation

  • if you had to tell your user or boss, I'm sorry, I ran out of space.

  • And that's certainly foolish if you actually do have more space--

  • it's just not right there next to you.

  • So with an array, you have the ability physically

  • to perform very sophisticated, very efficient algorithms such as we've

  • seen-- binary search and bubble sort and selection sort

  • and merge sort, and do so in quite fast time.

  • Even though selection sort and bubble sort were big O of n squared,

  • merge sort was actually n times log n, which is

  • slower than log n alone, but faster than n squared.

  • But they all presuppose that you had random access to elements

  • arithmetically via their indexes or addresses, and to do so,

  • you can use your computer's memory with arrays,

  • but you need to commit up front to some size.

  • All right, fine.

  • Let's not ask the operating system for 6 bytes initially, let's say, give me 7

  • because I'm going to leave one of them blank.

  • Now of course, that might buy you some runway, so to speak,

  • so that you can accommodate a seventh element if and when it arrives,

  • but what about an eighth?

  • Well, you could ask the operating system from the get-go, don't get me

  • 6 bytes of space, but give me 8 or give me 16 or give me 100.

  • But at that point, you're starting to get a little greedy,

  • and you're starting to ask for more memory than you might actually

  • need anytime soon, and that, too, is unfortunate,

  • because now you're being wasteful.

  • Your computer, of course, only has a finite amount of space,

  • and if you're asking for more of it than you actually need,

  • that memory, by definition, is unavailable to other parts

  • of your program and perhaps even others.

  • And so your computer ultimately might not

  • be able to get as much work done because it's been holding off to the side

  • just some empty space.

  • Empty parking spaces you've reserved for yourself or empty seats

  • at a table that might potentially go unused, it's just wasteful.

  • And hardware costs money.

  • And hardware enables you to solve problems.

  • And with less hardware available, you can solve fewer problems at hand,

  • and so that, too, doesn't feel like a perfect solution.

  • So again, we have this series of trade-offs-- it depends

  • on what's most important to you-- time or space or money or development

  • or any number of other scarce resources.

  • So what can we do instead as opposed to an array?

  • How do we go about getting the dynamism that we so clearly want here?

  • Wouldn't it be nice if we could grow these data

  • structures, and better yet, even shrink them?

  • If I no longer need some of these numbers,

  • I'm going to give you back that memory so that I can use it elsewhere

  • for more compelling purposes.

  • Well it turns out that in computer science,

  • programmers can create even fancier data structures

  • but at a higher level of abstraction.

  • It turns out, we could start making lists out of our values.

  • In fact, suppose I wanted to add some number to the screen, but, for instance,

  • these two spots were blocked off by something else.

  • But you know what?

  • I do know there's some room elsewhere on the screen,

  • it just happens to be available here.

  • And so if I want to put the number 50 in my list of values,

  • I might just have to say, I don't care where you put it,

  • go ahead and put it right there.

  • Well where is there?

  • Well if we continue this indexing-- this is 6 and 7 and 8 and 9, 10, 11, 12,

  • 13, 14, and 15, if 50 happens to end up by chance at location 15

  • because it's the first byte available, because not only these two, but maybe

  • even all of these are taken for some other reason--

  • ever since you asked for your first six, that's OK,

  • so long as you can somehow link your original data to the new.

  • And pictorially here, I might be inclined just to say, you know what?

  • Let me just leave a little breadcrumb, so to speak, and say that after the 42,

  • I should actually go down here and follow this arrow.

  • Sort of Chutes and Ladders style, if you will.

  • Now that's fine and you can do that-- after all, at the end of the day,

  • computers will do what you want, and if you

  • can write the code to implement this idea,

  • it will, in fact, remember that value.

  • But how do we achieve this?

  • Here, too, you have to come back to the fundamental definition

  • of what your computer is doing and how.

  • It's just got that chip of memory, and those bytes back-to-back,

  • such as those pictured here.

  • So this is all you get-- there is no arrow feature inside of a computer.

  • You have to implement that notion yourself.

  • So how can you go about doing that?

  • Well, you can implement this concept of an arrow,

  • but you need to implement it ultimately at a lower level or trust

  • that someone else will for you.

  • Well, as best I can tell, I do know that my first several elements happened

  • to be back-to-back from 4 on up to 42 in locations 0 through 5.

  • Because those are contiguous, I get my random access

  • and I can immediately jump from beginning to middle to end.

  • This 50 and anything after it needs to be handled a little better.

  • If I want to implement this arrow, the only possible way

  • seems to be to somehow remember that the next element after 42

  • is at location 15.

  • And that location, a.k.a.

  • address or index, just has to be something I remember.

  • Unfortunately I don't have quite enough room left to remember that.

  • What I really want to do is not store this arrow, but by the way,

  • parenthetically go ahead and store the number 15--

  • not as the index of that cell, but as the next address

  • that should be followed.

  • The catch, though, is that I've not left myself enough room.

  • I've made mental note in parentheses here

  • that we've got to solve this a bit better.
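
To make this low-level idea concrete, here is a minimal sketch-- in Python, purely for illustration, with a made-up layout of my own-- that simulates memory as a flat list in which each node occupies two adjacent slots, a value followed by the index of the next node:

    # Simulate a chip of memory as a flat list of slots (a sketch, not
    # how any real machine or language actually lays this out).
    memory = [None] * 20

    # Each node occupies two adjacent slots: a value, then the index
    # ("address") of the next node; -1 marks the dead end.
    memory[0], memory[1] = 4, 2      # value 4 lives at index 0
    memory[2], memory[3] = 8, 15     # value 8 points far away...
    memory[15], memory[16] = 50, -1  # ...to 50, which ends the list

    index = 0                        # all we remember: the first node
    while index != -1:
        print(memory[index])         # prints 4, 8, 50
        index = memory[index + 1]    # follow the stored "arrow"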

  • So let's start over for the moment, and no longer worry

  • about this very low level, because it's too messy at some point.

  • It's like talking in 0's and 1's--

  • I don't want to talk in bytes in this way.

  • So let's take things up in abstraction level,

  • if you will, and just agree to agree that you can store values in memory,

  • and those values can be data, like numbers you want--

  • 4, 8, 15, 16, 23, 42, and now 50.

  • And you can also store somehow the addresses or indexes--

  • locations of those values.

  • It's just up to you how to use this canvas.

  • So let's do that and clear the screen and now start

  • to build a higher-level concept.

  • Not an array, but something we'll call a linked list.

  • Now what is a linked list?

  • A linked list is a data structure that's a higher-level abstraction

  • on top of what ultimately is just chunks of memory, or bytes.

  • But this linked list shall enable me to store more and more values

  • and even remove them simply by linking them together.

  • So here, let me go ahead and represent those same values starting with 4,

  • followed by 8 and 15, and then 16 and 23, and finally, 42.

  • And now eventually I'm going to want to store 50, but I've run out of room.

  • That's fine, though-- I'm going to go ahead and write 50 wherever there's space.

  • But now let's not worry about that grid of rows and columns of memory.

  • Let's just stipulate that yes, that's actually there,

  • but it's not useful to operate at that level.

  • Much like it's not useful to continually talk in terms of 0's and 1's.

  • So let me go ahead and wrap these values with a higher-level idea

  • called a node or just a box.

  • And this box is going to store for us each of these values.

  • Here I have 4, here I have 8 and 15, here I have 16,

  • I have 23, and finally, 42.

  • And then when it comes time to add 50 to the mix,

  • it, too, will come in this box.

  • Now what is this box?

  • It's just an artist's rendition of the underlying bytes,

  • but now I have the ability to draw a prettier picture, if you will,

  • that somehow interlinks these boxes together.

  • Indeed, what I ultimately want to remember

  • is that 4 comes first and 42 comes last-- but then wait, if I add 50,

  • it shall now come last.

  • So we could do this as an artist quite simply with those arrows pointing

  • each box to the next, implying that the next element in the list,

  • whether it's next door or far away, happens to be at the end of that arrow.

  • But what are those arrows?

  • Those are not something that you can represent in a computer

  • if at the end of the day all you have are blocks of memory and in them bytes.

  • If all you have are bytes-- and, therefore, patterns

  • of 0's and 1's-- whatever you store in the computer

  • must be representable with those 0's and 1's, and among the easiest things

  • to represent, we know already, are numbers, like the indexes or addresses

  • of these nodes.

  • So for instance, depending on where these nodes are in memory,

  • we can simply check that address and store it as well.

  • So for instance, if the 4 still happens to be at address 0,

  • and this time the 8 is at address 4, the 15 at address 8, the 16 at

  • address 12, the 23 at address 16, and the 42 at address 20-- just by chance

  • back-to-back, 4 bytes

  • or 32 bits apart-- well, 50 might be some distance away.

  • Maybe it's actually at location 100, that's OK.

  • We can still do this.

  • Because if we use part of this node, part of each box

  • to implement those actual arrows, we can actually store all the information

  • we need to know how to get from one box to another.

  • For instance, to get from 4 to the next element,

  • you're going to want to coincidentally go to not number 4, but address 4.

  • And if you want to go from value 8 to the next value, 15,

  • you're going to want to go to address 8.

  • And if you want to go from 15 to 16, the next address

  • is going to be 12, followed by 16, followed by 20.

  • And herein lies the magic--

  • if you want to get from 42 to that newest element that's

  • just elsewhere at address 100, that's what gets associated with 42's node.

  • As for 50, it's the dead end.

  • There's nothing more there, so we might simply

  • draw a line through that box saying, eh, just

  • store all 0 bits there, or some other equivalent convention.

  • So there's so many numbers now on the screen,

  • but to be fair, that's all that's going on inside of a computer--

  • just storing of these bytes.

  • But now we can stipulate that, OK, I can somehow

  • store the location of each node in memory using its index or address.

  • It's just frankly not all that pleasant to stare at these values,

  • I'd much rather look at and draw the arrows graphically,

  • thereby representing the same idea of these pointers, if you will,

  • a term of art in some languages that allows me to remember

  • which element goes to which.

  • And what is the upside of all of this new complexity?

  • Well now we have the ability to string together all of these nodes.

  • And frankly, if we wanted to remove one of these elements

  • from the list, that's fine-- we can simply snip it out.

  • And we can simply update what the arrow is pointing to,

  • and equivalently, we can update the next address in that node.

  • And we can certainly add to this list by drawing more nodes here or perhaps

  • over here and just link them with arrows conceptually, or more specifically,

  • by changing that dead end to the address of the next element.

  • And so we can create the idea of the abstraction of a list using

  • just this canvas of memory.
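
Here is a minimal sketch of such a linked list in Python-- the Node and LinkedList names are mine, not anything standard-- where each node stores a value plus a reference standing in for the arrow:

    class Node:
        """One box: a value plus the 'arrow' to the next box."""
        def __init__(self, value):
            self.value = value
            self.next = None          # None is the dead end

    class LinkedList:
        def __init__(self):
            self.first = None         # all we remember is the first node

        def append(self, value):
            """Grow dynamically: walk to the end, change the dead end."""
            node = Node(value)
            if self.first is None:
                self.first = node
                return
            current = self.first
            while current.next is not None:
                current = current.next
            current.next = node

        def remove(self, value):
            """Shrink: snip a node out by re-pointing the arrow around it."""
            previous, current = None, self.first
            while current is not None:
                if current.value == value:
                    if previous is None:
                        self.first = current.next
                    else:
                        previous.next = current.next
                    return
                previous, current = current, current.next

    numbers = LinkedList()
    for n in [4, 8, 15, 16, 23, 42, 50]:
        numbers.append(n)
    numbers.remove(16)                # no shifting, no reallocation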

  • But not all is good here.

  • We've surely paid a price, right?

  • Surely we couldn't get dynamism for addition and removal

  • and updating of a list without paying some price

  • for this dynamic growth-- this ability to store as many more elements

  • as we want without having to tell the operating system from the get-go how

  • many elements we expect.

  • And indeed, while we're lucky at first, perhaps,

  • if we know from the get-go we need at least six values here,

  • they might be a consistent distance apart--

  • 4 bytes or 32 bits.

  • And so I could do arithmetic on some of these nodes,

  • but that is no longer, unfortunately, a guarantee of this structure.

  • Whereas arrays do guarantee you random access, linked lists do not.

  • And linked lists instead require that you traverse them in linear time

  • from the first element potentially all the way to the last.

  • There is no way to jump to the middle element,

  • because frankly, if I do that math as before, 100 bytes away is the last,

  • so 100 divided by 2 is 50-- rounding down, keeping me at 50,

  • puts me somewhere over here, and that's not right.

  • The middle element is earlier, but that's

  • because there's now no support for random access or instant arithmetic

  • access to elements like the first, last, or middle.

  • All we'll remember now for the linked list is that first element,

  • and from there, we have to follow all of those breadcrumbs.

  • So that might be too high of a price to pay.

  • And moreover, there's overhead now, because I'm not storing

  • for every node one value, but two--

  • the value or data I care about, and the address or metadata that lets me

  • get to the next node.

  • So I'm using twice as much space there,

  • at least when storing numbers, but at least

  • I'm getting that dynamic support for growth.

  • So again, it depends on that trade-off and what is less costly to you.

  • But never fear.

  • This is just another problem to solve.

  • To be clear, we'd like to retain the dynamism that something

  • like a linked list offers-- the ability to grow and even shrink that data

  • structure over time without having to decide a priori just how much memory we

  • want.

  • But at the moment we've lost the ability to search it quickly, as with something

  • like binary search.

  • So wouldn't it be nice if we could get both properties together?

  • The ability to grow and shrink as well as to search fast?

  • Well I daresay we can if we're just a bit more

  • clever about how we draw on our canvas.

  • Again, let's stipulate that we can certainly

  • store values anywhere in memory and somehow stitch them together

  • using addresses.

  • Now those addresses, otherwise known as pointers,

  • we no longer need to draw, because frankly, they're now just a distraction.

  • It suffices to know we can draw them pictorially as with some arrows,

  • so let's do just that.

  • Let me go ahead now and draw those values,

  • say 16 up here followed by my 8 and 15, as well as my 4.

  • Over here, well I draw that 42 and my 23,

  • and now it remains for me to somehow link these together.

  • Since I don't need to leave room for those actual addresses,

  • it suffices now to just draw arrows.

  • I'll go ahead and draw just a box around 16 and 8, as well as my 4 and my 15,

  • as well as my 23 and my 42.

  • Now how should I go about linking them?

  • Well let me propose that we no longer link just from left to right,

  • but rather assemble more of a hierarchy here with 16 pointing at 8,

  • and 16 also pointing at 42.

  • And 42, meanwhile, pointing at 23 with 8 pointing at 4 as well as 15.

  • Now why have I done it this way?

  • Well, by including these arrows, sometimes two per node,

  • have I stitched together a two-dimensional data

  • structure, if you will?

  • Now this again surely could be mapped to that lower level of memory

  • just by jotting down the addresses that each of these arrows represents,

  • but I like thinking at this level of abstraction

  • because I now can think in more sophisticated form about how

  • I might lay out my data.

  • So what properties do I now get from this structure?

  • Well, dynamism was the first goal at hand,

  • and how might I go about adding a new value?

  • Say it's 50 that I'd like to add to this structure.

  • Well, if I look at the top here, 16, it's already

  • got two arrows, so it's full, but I know 50 is bigger than 16,

  • so let's start to apply that logic and say 50

  • shall definitely go down to the right.

  • Unfortunately, 42 already has one arrow off it, but there is room for more,

  • and it turns out that 50 is, in fact, greater than 42.

  • So you know what?

  • I'm just going to slot 50 right there and draw 42's second arrow to 50.

  • And what picture seems to be emerging here?

  • It's perhaps reminiscent of a family tree of sorts.

  • Indeed, with parents and children, or a tree more generally with roots.

  • Now whereas in our human world, trees tend to grow up,

  • these trees in computer science tend to grow down.

  • But henceforth, let's call this 16 our root,

  • and to its left is its left child, to its right is its right child, or more

  • generally, a whole left subtree and a whole right subtree.

  • Because indeed, starting at 42, we have another tree of sorts.

  • Rooted at 42 is a child called 23, and another child called 50.

  • So in this case, where each of the nodes in our structure--

  • otherwise known in computer science as a tree-- has zero, one, or two

  • children, you can create that second dimension,

  • and you can preserve not only the ability

  • to add data dynamically, like 50, but also,

  • crucially, now gain back that ability to search.

  • After all, if I'm asked now the question,

  • is the number 15 in this structure?

  • Well let me check for you.

  • Starting at 16, which is where this structure begins, just like a linked

  • list starts conceptually at the left, I'll

  • check if 16 is the value you want-- it's not, it's too big,

  • but I do know that 15, if it's here, it's to the left.

  • Now 8, of course, is not the value you want either,

  • but 8 is smaller than 15, so I'll now go to the right.

  • And indeed, sure enough, I now find 15.

  • And it only took me one, two steps, not n, to find it,

  • because through this second dimension I'm able to lift up some of those nodes

  • rather than draw them just down as a straight line,

  • or, as in the linked list, all the way from left to right.

  • With the second dimension I can now organize things more tightly.

  • And notice the key characteristics of this tree.

  • It is what's generally known, indeed, as a binary search tree.

  • Not only because it's a tree that lends itself to search,

  • but also because each of the nodes has no more than two, or "bi," children--

  • zero, one, or two.

  • And notice that to the left of the 16 is not only the value

  • 8, but every number that can be reached to the left of 16 happens to be,

  • by design, less than 16.

  • And that's how we found 15.

  • Moreover to the right of 16, every value is greater than 16,

  • just as we have here.

  • And that definition can be applied so-called recursively.

  • You can make that claim about every node in this tree at any level,

  • because here, at 42, every node to its left, albeit just one, is less.

  • Every node to its right, albeit just one, is indeed more.

  • So long as we bring to bear on our data the same sort of intuition

  • we brought to our phone book, we can achieve these same properties

  • and goals-- this efficiency of logarithmic time.

  • Log base 2 of n-- big O of that-- is indeed how long it might take us

  • to find or insert some value.
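
A minimal sketch of such a binary search tree in Python-- again with names of my own choosing-- where both insertion and search simply go left for smaller values and right for larger ones:

    class TreeNode:
        def __init__(self, value):
            self.value = value
            self.left = None          # subtree of smaller values
            self.right = None         # subtree of larger values

    def insert(root, value):
        if root is None:
            return TreeNode(value)    # found an empty spot
        if value < root.value:
            root.left = insert(root.left, value)
        else:
            root.right = insert(root.right, value)
        return root

    def contains(root, value):
        if root is None:
            return False              # fell off the tree: not here
        if value == root.value:
            return True
        if value < root.value:
            return contains(root.left, value)   # go left, toward smaller
        return contains(root.right, value)      # go right, toward larger

    root = None
    for n in [16, 8, 42, 4, 15, 23, 50]:
        root = insert(root, n)
    print(contains(root, 15))         # True, in just two steps from 16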

  • Now to be fair, there are some prices paid here.

  • If I'm not careful, a data structure like this

  • could actually devolve into a linked list

  • if I just keep adding, by coincidence or intent,

  • bigger and bigger numbers.

  • The tree might just so happen to get long and stringy

  • unless we're smart about how we rebalance it occasionally.

  • And indeed, there are other forms of these trees that

  • are smart, and with more code, will rebalance themselves to make sure

  • that they don't get long and stringy, but stay as high up as possible.

  • But there's another price paid beyond that potential gotcha--

  • more space.

  • Whereas my array used no arrows whatsoever and thus no extra space,

  • my linked list did use one extra chunk of space for each node--

  • storage for that pointer, or address, of its neighbor.

  • But in a tree structure, if you're storing multiple children,

  • you're using as many as two additional chunks of memory

  • to store as many as two of those arrows.

  • And so with a tree structure you are spending more space,

  • but potentially it's saving you time.

  • So again, we see this theme of trade-offs,

  • whereby if you really want less time to be spent,

  • you're going to have to spend more of that space.

  • Now can we do even better?

  • With an array, we had instant access to data,

  • but we painted ourselves into that corner.

  • With a linked list we solved that particular problem,

  • but we gave up the ability to jump right where we want.

  • But with trees, particularly binary search trees,

  • we can rearrange our data intelligently and regain that logarithmic time.

  • But wouldn't it be nice if we could achieve even better, say,

  • constant time searches of data and insertions thereof?

  • Well for that, perhaps we could amalgamate some of the ideas

  • we've seen thus far into just one especially clever structure.

  • And let's call that particular structure a hash table.

  • And indeed, this is perhaps, in theory, the holy grail of data structures,

  • insofar as you can store anything in it in ideally constant time.

  • But how best to do this?

  • Well let's begin by drawing ourselves an array.

  • And that array this time I'll draw vertically simply

  • to leave ourselves a bit more room for something clever.

  • This array, as always, can be indexed into by way of these locations

  • here where this might be location 0 and 1, 2, and 3,

  • followed by any number of others.

  • Now how do I want to use this array?

  • Well suppose that I want to store names and not numbers.

  • Those names, of course, could just be inserted in any old location,

  • but if unsorted, we already know we're going

  • to suffer as much as big O of n time--

  • linear time with which to find a particular name in that array

  • if you know nothing a priori about the order.

  • Well we know already, too, we could do better just like the phone company,

  • and if we sort the names we're putting into this structure,

  • we can at least then do binary search and whittle that search time down

  • to log base 2 of n.
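
As a quick sketch of that baseline in Python-- using the standard bisect module to binary-search a sorted list of names, with found a helper name of my own:

    import bisect

    names = sorted(["Alice", "Bob", "Carol", "David"])

    def found(name):
        i = bisect.bisect_left(names, name)    # binary search: O(log n)
        return i < len(names) and names[i] == name

    print(found("Carol"))   # True
    print(found("Doug"))    # False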

  • But wouldn't it be nice if we could whittle that down further

  • and get to any name we want in nearly constant time-- one step, maybe two

  • or a few?

  • Well, with a hash table you can approximately or ideally do that,

  • so long as we decide in advance how to hash those strings.

  • In other words, those strings of characters, here called names,

  • they have letters inside of them, say D-A-V-I-D for my own.

  • Well, what if we looked at not the whole name,

  • but just that first letter, which takes, of course, constant time,

  • since it's just one value?

  • And so if D is the fourth letter in the English alphabet, what if I store

  • DAVID--

  • or really, any D name at the fourth index in my array,

  • location 3 if you start counting at 0?

  • So here might be the A names, and here the B names, and here the C names,

  • and someone like David now belongs in this bucket, if you will.

  • Now suppose I want to store other names in this structure.

  • Well Alice belongs at location 0, and Bob, for instance, location 1.

  • And we can continue this logic and can continue

  • to insert more and more names so long as we hash those names

  • and jump right to the right location.

  • After all, I can in one step look at A or B or D

  • and instantly know 0 or 1 or 3.

  • How?

  • Well recall that in a computer you have ASCII or Unicode.

  • And we already have numbers predetermined to map

  • to those same characters.

  • Now to be fair, capital A, I'm pretty sure, is 65 in ASCII,

  • and we could certainly subtract 65 from 65 to get 0.

  • And since capital B is 66, we could certainly subtract 65 from 66 to get 1.

  • So we can look, then, at the first letter of any name, convert it to ASCII

  • and subtract quite simply 65 if it's capital,

  • and get precisely to the index we want.

  • So to be fair, that's not one, but it is two or three steps,

  • but that is a constant number of steps again and again independent

  • of n, the total number of names.

  • Now what's nice about this is that we have a data structure into which we

  • can insert names instantly by hashing them and getting as output

  • that number or index 0 through 25, in the case of an English alphabet.
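
That hash function might look like the following in Python-- a sketch, with hash_name a name of my own invention, assuming names beginning with English letters:

    def hash_name(name):
        """Map a name to a bucket 0-25 via its first letter alone."""
        return ord(name[0].upper()) - ord("A")   # 'A' is 65 in ASCII

    print(hash_name("Alice"))   # 0
    print(hash_name("Bob"))     # 1
    print(hash_name("David"))   # 3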

  • But what problem might arise?

  • The catch, though, is that if we have someone else, like Doug,

  • whose name happens to start with the same letter,

  • unfortunately there seems to be no room at the moment for Doug,

  • since I'm already there.

  • But there we can draw inspiration from other data structures still.

  • We could maybe not just put David in this array directly--

  • not even treat this array as the entire data structure,

  • but really as the beginning of another.

  • In fact, let me go ahead and put David in his or my own box

  • and give Doug his own as well.

  • Now Doug and I are really just nodes in a structure.

  • And we can use this array still to get to the right nodes of interest,

  • but now we can use arrows to stitch them together.

  • If I have multiple names, each of which starts with a D,

  • I just need to remember to link those together,

  • thereby allowing myself to have any number of names

  • that start with that same letter, treating that list really

  • as a linked list.

  • But I get to that linked list instantly by looking at that first letter

  • and jumping right to the correct location.

  • And so here I get both dynamic growth and instant access to that list,

  • thereby decreasing significantly the amount of time

  • it takes me to find someone-- to maybe 1/26 of what it was.

  • Now to be fair, wait a minute, we're already

  • seeing collisions, so to speak, whereby I have multiple inputs hashing

  • to the same output--

  • three in this instance.

  • And in the worst case, perhaps everyone in the room

  • has a name that starts with D, which means, really,

  • you don't have a hash table or array at all--

  • you just have one really long linked list, and thus, linear time.

  • But that would be considered a more perverse scenario, which you should try

  • to avoid by way of that hash function.

  • If that is the problem you're facing, then your hash function is just bad.

  • You should not, in that case, have looked

  • at just the first letter of every name.

  • Perhaps you should have looked at the first two letters

  • back-to-back, and put anyone's name that starts with D-A in one list;

  • and D-B, if there is any, in a second list; and D-C, if there's any of those,

  • in some third list altogether; and D-D and D-E and D-F

  • and so forth, and actually have multiple combinations of every two letters,

  • and have as many buckets, so to speak, as many indexes in your array

  • as there are pairs of two alphabetical letters.

  • Now to be fair, you might have two people

  • whose names start with D-A or D-O, but hopefully there are even fewer.

  • And indeed, I say a hash table--

  • this whole structure-- approximates the idea of constant time,

  • because it can devolve in places to linear time with longer lists of names.

  • But if your hash function is good and you don't have these collisions,

  • and therefore ideally you don't have any linked lists, just names, then

  • you indeed have a structure that gives you constant-time access,

  • combining all of these underlying

  • principles of dynamic growth and random access

  • to achieve, ultimately, the storage of all your values.
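
Putting those pieces together, here is a minimal sketch of such a hash table with chaining, reusing the hypothetical hash_name from above-- with the caveat that Python lists stand in here for what are conceptually the linked lists just described:

    table = [[] for _ in range(26)]            # one bucket per letter

    def insert_name(name):
        table[hash_name(name)].append(name)    # jump straight to a bucket

    def lookup(name):
        return name in table[hash_name(name)]  # linear only within a bucket

    for person in ["Alice", "Bob", "David", "Doug"]:
        insert_name(person)
    print(lookup("Doug"))   # True; David and Doug collide in bucket 3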

  • How, then, might a language like Python implement data types like int and str?

  • Well in the case of Python's latest version,

  • it allows ints to grow as big as you need them to be.

  • And so it surely can't just be using contiguous memory, once allocated,

  • that stays in the same place.

  • If you want a number to grow over time,

  • well, you're probably going to need to allocate some variable number of bytes

  • in that memory.

  • Strings, too.

  • If you want to allocate strings, you're going to need to allow them to grow,

  • which means finding extra space in proximity

  • to the characters you already have, or maybe relocating the whole structure

  • so that that value can keep growing.

  • But we know now, we can do this with our canvas of memory.

  • How the particular language does it isn't even necessarily of interest--

  • we just know that it can, and even, underneath the hood, how

  • it might do so.

  • As for these other structures in Python like dict or dictionary and list,

  • well those, too, are exactly what we've seen here.

  • A dictionary in Python is really just a hash table-- some sort of variable

  • that has indexes that are not necessarily numbers, but words,

  • and via those words you can get back a value.

  • Indeed, more generally, a hash table has keys and values.

  • The keys are the inputs via which you produce those outputs.

  • So in our data structure, the inputs might have been names.

  • The output of my hash function was an index-- some number.

  • And in Python you have a wonderful abstraction in code that

  • allows you to express that idea of associating keys

  • with values-- names with yes or no, true or false,

  • whether they are present-- so that you can ask those questions yourself in your code.

  • And as for list, it's quite simply that.

  • It's the idea of an array but with that added dynamism,

  • and as such, a linked list of sorts.
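
In code, those built-in abstractions hide all of this machinery; for instance:

    # Python's dict is backed by a hash table: keys map to values.
    present = {"Alice": True, "Bob": True, "David": True}
    print(present.get("Doug", False))   # near-constant-time lookup: False

    # Python's list grows and shrinks dynamically, no bookkeeping needed.
    numbers = [4, 8, 15, 16, 23, 42]
    numbers.append(50)
    numbers.remove(15)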

  • And so now, at this higher level of code, you can not only think computationally,

  • but express yourself computationally, knowing and trusting

  • that the computer can do that bidding.

  • How the data structures are organized really

  • is the secret sauce of these languages and tools,

  • and indeed, when you have some database or backend system, too,

  • the intellectual property that underlies those systems

  • ultimately boils down not only to the algorithms

  • in use, but also the data structures.

  • Because together-- and we've seen this--

  • they combine to produce not only the correctness of the answers you want,

  • but the efficiency with which you can get to those answers.

[MUSIC PLAYING]
