  • And in fact, we can visualize this even a little metaphorically.

  • So here is, for instance, a mailbox.

  • And suppose that this is address 123.

  • What is in address 123?

  • Well it's a variable of type int, called n,

  • looks like it's storing the number 50.

  • Right?

  • We saw these letters--

  • these numbers last week.

  • So here's the number 50, which is an integer inside of this variable, today,

  • represented as a mailbox instead of as a locker.

  • Well suppose that this mailbox over here is not n but suppose this is p.

  • And it happens to be at address 456.

  • But who really cares?

  • If this variable p is a pointer to an integer, namely that one over there,

  • when I open this door, what am I going to find?

  • Well I'm hoping I find the equivalent of-- we

  • picked these up at the Coop earlier --the equivalent

  • of a conceptual pointer saying the number n is over there.

  • But what specifically, at a lower level, is actually inside this mailbox

  • if that variable n is at location 0x123?

  • What's probably inside this mailbox?

  • AUDIENCE: [INAUDIBLE]

  • DAVID J. MALAN: Yeah, the address, indeed, 123.

  • So it's sort of like a treasure map if you will.

  • Oh, I have to go to 123 to get this value.

  • Oh, the integer in question is indeed 50.

  • And that's the fundamental difference.

  • This is the int that happens to be inside of this variable of type int.

  • This is the address that's a pointer that's in this other variable, p,

  • but that is conceptually, simply pointing from one variable

  • to another, thereby giving us a sort of conceptual breadcrumbs.

  • And we'll see-- frankly, in one week --how amazingly powerful it is.

  • When you can have one piece of memory pointing at another,

  • pointing at another, pointing at another,

  • you can start to construct very sophisticated data structures,

  • as they're called, things like family trees,

  • and lists, and other data structures that you might have heard of.

  • Or even if you haven't, these will be the underpinnings next week

  • of all of today's fanciest algorithms used by,

  • certainly the Googles, and the Facebooks,

  • and the Microsofts of the world to manage large data sets.

  • That's where we're going next week, in terms of application.

  • So questions about that representation?

  • Yeah, in the middle.

  • AUDIENCE: Does that mean that your memory has to be twice as big?

  • DAVID J. MALAN: Sorry can you say it once more?

  • AUDIENCE: Is that to say your memory has to be twice as big to store pointers?

  • DAVID J. MALAN: Ah, really good question.

  • Is it the case that your pointers need to be twice as big?

  • Not necessarily, just, this is the way life is these days.

  • On most modern Macs and PCs, pointers use 64 bits-- the equivalent of a long,

  • if you recall that brief discussion in Week 1.

  • So I deliberately drew my pointer on the screen

  • here as taking up 8 bytes or 64 bits.

  • I've deliberately drawn my integer n as taking up 4 bytes or 32 bits.

  • That is convention these days on modern hardware.

  • But it's not necessarily the case.

  • Frankly, I could not find a bigger mailbox at Home Depot,

  • so we went with two identical, differently colored ones.

  • So the metaphor is imperfect.

  • All right.

  • So moving from this to something more familiar now, if you will.

  • Recall that we've been talking about strings for quite some time.

  • And in fact, most of the interesting programs we've written thus far

  • involve maybe input from the human and some form of text

  • that you are then manipulating.

  • But string we said in Week 1 is a bit of a white lie.

  • I mean, it is the training wheels that I promised

  • we would start taking off today.

  • So let's consider what a string actually is now in this new context.

  • So if we have a string like EMMA here, declared in a variable

  • called s, and quote unquote, EMMA in all caps, as we've done a couple of times

  • now.

  • What does this actually look like inside of the computer?

  • Well somewhere in my computer's memory there are four, nay, five bytes,

  • storing E-M-M-A, and then additionally, that null terminating character that

  • demarcates where the end of the string is.

  • This is just eight individual 0 bits.

  • So that's where EMMA might be represented in the computer's memory.

  • But recall that the variable in question was s.

  • That was my string.

  • And so that's why over the past few weeks

  • any time you want to manipulate a string, you use its name, like s.

  • And you can access bracket 0, bracket 1, bracket 2, bracket 3,

  • to get at the individual characters in that string like EMMA, E-M-M-A,

  • respectively.

  • But of course it's the case, especially per today's revelation, that really,

  • all of those bytes have their own addresses.

  • Right?

  • We're not going to care after this week what those addresses are

  • but they certainly exist.

  • For instance, E might be at 0x123.

  • M might be at 0x124--

  • 1 byte away --0x125, 0x126, 0x127.

  • They're deliberately 1 byte away because remember a string is defined

  • by characters back-to-back-to-back.

  • So let's say for the sake of discussion that Emma's name in memory

  • happens to start at 0x123.

  • Well, what then really is that variable s?

  • Well, I dare say that s is really just a pointer.

  • Right?

  • It can be a variable, depicted here just as before, called s.

  • And it stores the value 0x123.

  • Why?

  • That's where Emma's name begins.

  • But of course, we don't really have to care about this level of precision,

  • the actual numbers.

  • Let's just draw it as a picture.

  • s is, if you will, a pointer to Emma's actual name in memory,

  • which might be down over here.

  • It might be over here.

  • It might be over here, depending on where in the computer's memory

  • it ended up by chance.

  • But this arrow just suggests that s is pointing to Emma, specifically

  • at the first letter in her name.

  • But that's sufficient though, right?

  • Because how-- if s stores the beginning of Emma's name, 0x123.

  • And that's indeed where the E is but we just

  • draw this pictorially with an arrow.

  • How does the computer know where Emma's name

  • ends if all it's technically remembering is the beginning?

  • AUDIENCE: The null terminating character.

  • DAVID J. MALAN: The null terminating character.

  • And we stipulated a couple of weeks ago that that is important.

  • But now it's all the more important because it turns out

  • that s, this thing we've been calling a string,

  • has no familiarity with MMA or the null terminator.

  • All s is pointing at technically, as of today,

  • is the first letter in her name, which happens to be in this story at 0x123.

  • But the computer is smart enough to know that if you just point it

  • at the first letter in a string, it can figure out

  • where the string ends by just looking--

  • as with a loop --for that null terminating character.

  • So this is to say ultimately, that there is no such thing as string.

  • And we'll see if this strikes a chord.

  • There is no such thing as a string.

  • This was a little white lie we began telling in Week 1

  • just so that we could get interesting, real work done, manipulating text.

  • But what is string most likely implemented as would you say?

  • AUDIENCE: An array of characters.

  • DAVID J. MALAN: An array of characters, yes.

  • But that was Week 1's definition.

  • What technically now, as of today, must a string be?

  • AUDIENCE: [INAUDIBLE]

  • DAVID J. MALAN: Sorry, over here.

  • AUDIENCE: A pointer.

  • DAVID J. MALAN: A pointer.

  • Right? s, the variable in which I was storing

  • Emma's name would seem to manifest a pattern just

  • like we saw with the numbers a moment ago, the number 50.

  • s seems to be storing the address of the first character

  • in that sequence of characters.

  • And so indeed, it would seem to be a pointer.

  • Well, how do we actually connect these dots?

  • Well suppose that we have this line of code

  • again where we had int n equals 50.

  • And then we had this other line of code where we said,

  • go ahead and create a variable called p and store in it the address of n.

  • That's where we left off earlier.

  • But it turns out that this thing here is our data type from Week 1.

  • This thing here, int star, is a new data type as of today.

  • The variable stores, not an int, but the address of an int.

  • It turns out that something like this line of code, with Emma's name,

  • is synonymous with char star.

  • Right?

  • If a star represents an address and char represents the type of address being

  • pointed at, just as int star can let you point at a value like n--

  • which stored 50 --so could a char star--

  • by that same logic --allow you to store the address of and therefore

  • point at a character.

  • And of course, as you said, from Week 1, a string

  • is just a sequence of characters.

  • So a string would seem to be just the address of the first byte

  • in the sequence of characters.

  • And the last byte happens to be all 0s by convention, to help us find the end.

  • So what then more technically is a string

  • and what is the CS50 library that we're now going

  • to start taking off as training wheels?

  • Well last week we introduced you to the notion of typedef,

  • where you can create your own customized data type that does not exist in C

  • but does exist in your own program.

  • And we introduced this keyword, typedef.

  • We proposed last week that this was useful because you could actually

  • declare a fancy structure that encapsulates

  • multiple variables, like name and number,

  • and then we called this data structure, last week, a person.

  • That was the new data type we invented.

  • Well it turns out you can use typedef in exactly the same way

  • even more simply than we did last week by saying this.

  • If you say typedef char star string--

  • typedef means give me a new data type, just for my own use.

  • Char star means the type of value is going to be the address of a character.

  • And the name I want to give to that data type is going to be string.

  • And so literally, this line of code here, this

  • is one of the lines of code in cs50.h-- the header

  • file you've been including for several weeks,

  • where we are creating a data type called string

  • to make it a synonym for char star.

  • So that if you will, it's an abstraction,

  • a simplification on top of the idea of a sequence of characters

  • being pointed at by an address.

  • Any questions?

  • And honestly, this is why-- and maybe those sort

  • of blank stares --this is why we introduced strings in Week 1

  • as being an actual type as opposed to not existing at all.

  • Because who really cares about addresses and pointers

  • and all of that when all you want to do is like,

  • print, hello world, or hello, so and so's name?

  • Yeah, question.

  • AUDIENCE: What other-- what other functions are created--

  • major functions are created by CS50 are not intrinsic to--

  • DAVID J. MALAN: Really good question.

  • We'll come back to this later today.

  • But other functions that are defined in the CS50 library that

  • are training wheels that come off today are get_string,

  • get_int, get_float, and the other get functions as well.

  • But that's about it that we do for you.

  • Other questions?

  • Yeah.

  • AUDIENCE: Can you define all of these words again?

  • Like, it's-- so string is like a character pointer which points--

  • I was confused about that.

  • Can you repeat that?

  • DAVID J. MALAN: Sure.

  • A string, per this definition, is a char star, as a programmer would say.

  • What does that mean?

  • A string is quite simply a variable that contains the address of a character.

  • By our human convention, that character might be the beginning

  • of a multi character sequence.

  • But that's what we called strings in Week 1.

  • So a string is just the address of a single character.

  • And we leave it to human convention to know that the end of the string

  • will just be demarcated by eight 0 bits, a.k.a. the null terminator.

  • And this is the sense in which-- especially

  • if you have some prior programming experience

  • --that C is much more low level.

  • In Python, as you'll soon see in a few weeks,

  • everything just works so splendidly easily.

  • If you want a string, you can have a string.

  • You don't have to worry about any of these low level details.

  • But that's because Python is built here, conceptually,

  • where C is built down here-- so to speak --closer to the computer's memory.

  • But there's no magic.

  • If you want a string, fine.

  • Just remember where it starts, remember where it ends.

  • And boom, you're done.

  • The star in the syntax today is just a way of expressing those ideas in code.

  • So let's go ahead then