Placeholder Image

Subtitles section Play video

  • Hi, everybody.

  • How we doing?

  • We got caffeinated?

  • Feeling good?

  • Nice.

  • So I'm Anjana Vakil, hello.

  • You can find me on Twitter at my name and today I'd like to talk to you about immutable

  • data structures for functional programming in JavaScript.

  • We're going to take a look at what immutable data structures are, why they're a really

  • cool way to handle the immutability that we typically use when we're doing functional

  • programming and how we can do that in JavaScript because I hear y'all like JavaScript!

  • So a little about me.

  • I'm probably the only not-web-developer in the room.

  • I am an engineer for Uber Research.

  • I work with them to develop a custom query language for data in the scientific research

  • funding domain.

  • I'm also an alum of the Recurse Center, which is a fantastic programming community in New

  • York City, and I am an alum of the Outreach Program, which if you have haven't heard of

  • it, it's getting women and more folks involved in these by giving them internships at Mozilla.

  • So I'm really happy to chat about those things if you want to come grab me after the talk.

  • But you might know that I like functional programming.

  • I think it rocks.

  • Anybody else agree with me that functional programming is cool?

  • Yeah!

  • Yeah, so functional programming is a pretty great way to avoid some of the headaches of

  • like imperative and object-oriented programming.

  • In functional programming, what we typically do is conceive of our programs as being just

  • pure functions.

  • That means their transform their inputs to outputs, and that's all they do.

  • They don't have my side effects like changing things in the console, and my taking things

  • in the global state are side effects.

  • But our data becomes data in, data out, and transformers of data.

  • And one thing that goes hand-in-hand with this, with avoiding side effects is immutable

  • data.

  • Immutable data meaning once we've created it, it never changes.

  • So this is a really good way of changing something accidental outside of your function.

  • If everything is immutable, you can't change anything.

  • So immutability another thing that rocks and it rocks pretty hard for other reasons that

  • we'll see in a moment.

  • But speaking of rocks, let's talk about rocks.

  • So this is a rock, and immutability rocks in the way that rocks rock.

  • Now I don't know about you, but I've been going to a lot of tech conferences recently

  • and I've been feeling like there has enough poetry.

  • So I'd like to read you a poem: Nobody sits like this rock sits.

  • You rock, rock.

  • The rock just sits and is.

  • You show us how to just sit here.

  • And that's what we need.

  • It's so true, so deep.

  • This is from -- Don't thank me, thank I Heart Huckabees, that's

  • a great movie.

  • Check it out.

  • So this is really how immutable data rocks.

  • It just sits there.

  • It just is.

  • Once we've created it, it never changes and that's amazing because it can help us avoid

  • some of the headaches of immutability.

  • So with immutability, we have some things pretty easy, but other things become harder

  • and we'll see how that looks.

  • So let's say I have an array called foo and it's got some numbers in it.

  • Hm, and I'm already bored.

  • Let's make it more fun.

  • Let's say I have a zoo with some animals -- more fun!

  • And I decided that I want to change something up about my zoo.

  • Maybe I want to replace that rabbit there with something a little more exotic.

  • Like an alien!

  • So this is cool.

  • I'm happy because I wanted a more exotic zoo.

  • I got an alien in my zoo now.

  • I didn't have to change anything except for that one little cell in my array.

  • That's pretty sweet but my co-worker over was expecting zoo to be filled with earth

  • beings, earth animals, and wasn't accounting for there being an alien in it.

  • Who put that in there?

  • Now my program doesn't work anymore.

  • Who did that?

  • So immutability has a couple problems.

  • We have to manage who's been changing what, when -- who's been putting which animals in

  • the zoo.

  • We have to have a lot of overhead to manage that state, and that gives us headaches as

  • individuals, and as teams.

  • We also get bugs in the code because maybe I was only planning -- or my co-worker was

  • only planning -- to handle terrestrial beings and didn't have a case of aliens being accounted

  • for, and that broke something.

  • So these are some side effects of immutability that don't make us happy.

  • Let's try doing things the immutable way.

  • So in an immutable world, my array, my zoo, once I've created it, it just sits and is

  • forever.

  • I cannot change it.

  • What I can do if I want a new zoo that's more exotic is I can make a copy that's the same

  • size as my original array, and I can make the modification I want, so I can put my alien

  • in there in place of the rabbit.

  • And so this is pretty sweet because now my co-worker is maybe, and they're, like, whoo,

  • nothing broke in my program, and it's all still animal creatures but I had to copy over

  • that whole array.

  • I had to allocate the space for that entire array, even all of the stuff that didn't change.

  • I had to copy all of that over, as well.

  • So this means that my code runs pretty slow.

  • And it also takes up a lot of memory.

  • It takes up a lot of space and time.

  • The complexity on those things are bad because copying is a waste of both time and space.

  • It makes us sad face!

  • We don't want that.

  • So if we want to do immutability, we must be able to find a better way of doing that.

  • Luckily for us, a lot of very smart folks have been thinking very hard about this problem

  • for a while, and they've come up with some really good solutions for how we can deal

  • with immutability efficiently.

  • immutable data structures!

  • So immutable data structures is a term that you may have heard about, with functional

  • programming, or also in terms of React where they come in handy.

  • Technically, an immutable data structure is like the rock, it just sits, and is once you

  • create it.

  • It never changes.

  • But also hear the term persistent data structures banged about.

  • Sometimes these are used interchangeably, but they have slightly different meanings.

  • So if immutable data is data that never changes, persistent data is data for which we have

  • access to old versions.

  • So as we've been creating new modified versions of our data structures, we keep the old versions

  • around.

  • You might hear about partially persistent data structures where we can look at the old

  • versions, we can access them, but we can't go back and update any of them.

  • All we can update is the most current version that we have.

  • And then you might also hear about fully persistent data structures where we can actually time

  • travel, we can go back and update any of our past versions.

  • And if this is starting to ring a bell like it's version control like git, it's sort of

  • the same idea.

  • So we're going to talk about these as persistent immutable data structures, they're both persistent,

  • and immutable.

  • Let's see how this works.

  • The key to all of this is we want the old versions of our data, like, my original zoo

  • to stay put.

  • We just want to to sit like the rock but we want new versions to be created efficiently.

  • So what magical tricks do we have to use to, like, make this happen?

  • Do we have to make invocations do dances to the gods of space and time complexity?

  • No.

  • It's very simple.

  • Trees and sharing.

  • Isn't that sweet?

  • These two simple concepts will get us efficient immutable data.

  • How?

  • So let's talk about trees because trees rock pretty hard, as well, alternative, unfortunately

  • I don't have a poem for that, sorry.

  • Imagine that we could find a way to represent our zoo array as a tree.

  • So one thing I could do is I could put all of my animals -- all of my values -- in the

  • leaves of a tree, and I could make it so that each leaf holds one value, one animal.

  • But they might get lonely, so let's put them with a buddy.

  • Let's put them 2x2.

  • So each of our leaves will have two values and we'll hope that the buddies get along

  • and not each each other -- looking at you, tiger, number six, don't eat that koala, and

  • we can go up to intermediate nodes up and up, until we get to the root node of the whole

  • structure, and now that root is an array represented previously by a tree.

  • So this is my tree now in this structure.

  • So given this type of structure, how do we update something?

  • Given that my data is immutable, and it can never change, how can I handle the fact that

  • it has an alien in it.

  • So here what I would do is I would take the node that contains the value that I want to

  • change.

  • So in this case it would be the 0/1 node that you see on the bottom of the screen.

  • And so I make a new copy where I've still got my monkey but I've changed the rabbit

  • to an alien.

  • And then I need to copy any of the intermediate nodes in the tree that were pointing to the

  • node that I changed.

  • So I basically trace a path up towards the root of the tree, which, now, I've got a new

  • root, which means another version of the data structure.

  • So this technique of making this update by copying the path from the leaf I changed to

  • the root is called path copying.

  • That's pretty cool because now I didn't have to copy the entire array; I just had to copy

  • the nodes on the way from the root to the leaf that I changed.

  • So if we've turned in something linear and copying into something logarithm.

  • That's pretty cool, that's more performant, and the data of this is that all of these

  • nodes in yellow here, so most of the tree is shared between the two versions, between

  • the old version and the new.

  • And so this saves me a lot of space because I can actually reuse the parts of the original

  • version that didn't change, whereas, before, I had to copy those over, as well.

  • So this means that what was before, like, a lot of memory consumption becomes a lot

  • smaller because you don't have to store as many copies of the things if they didn't change.

  • And that's called structural changing because we're sharing the structure of the tree between

  • the two versions.

  • So we've been talking about updating things but how do we get at the values in our data

  • structure?

  • How do we access them?

  • Well, it turns out this isn't just a tree, it's a special type of tree called a TRIE

  • tree, which originally came from the world "retrieval," so people could, I guess, call

  • it tree, which is funny because we also call TREE trees, so we can call them "tries" if

  • we want.

  • So a try is a type of tree, where the leaves represent the values, and the paths to the

  • value are the keys that that data is associated with.

  • So often you see TRIEs with values stored as keys.

  • So, for example, if I have T stored as a key, what I do to get to the T is I trace the tree

  • one letter at a time.

  • Then I go to T, and then to E, and then to EA, is my key, and then my value there is

  • three.

  • Because everything at the end sounds like "ee" in this talk.

  • So this is pretty cool, but in our data structure, we weren't using words, we just wanted an

  • array-type thing, we wanted indeces, right?

  • So the insight here is if we treat the index as a binary number, then we can pretend that

  • that's kind of, like, our word and we can descend the tree, bit-by-bit as if each representation

  • of our binary representation is a letter.

  • So let's see how that works.

  • If I'm trying to get at item five in my array, so the animal at index five, I'd convert that

  • to binary, so that's one, zero, one, and then I step through that as if it was a word.

  • I step through it letter-by-letter, bit-by-bit.

  • So I go from the root to the branch.

  • I have a choice of either zero or one.

  • I go to branch one first.

  • And then I go to branch zero, and then I take the thing on the one side.

  • So I go one, zero, one, down my tree and then I end up at my frog at index five.

  • So this is a pretty simple insight but it ends up being incredibly powerful because

  • it allows us to quickly traverse this tree structure, which lets us use that structural

  • sharing to more efficiently represent our new copies of our immutable data structure.

  • And, importantly, we don't have to be using a binary tree, meaning we have two branches

  • from each node.

  • That fits pretty well on a slide, but actually what you mostly see is a 32-way branching.

  • So in our trees that we've been looking at, we've kind of had one bit of information per

  • level.

  • And we've been descending bit-by-bit but if we had a 32-way branching tree, it would be

  • five bits of information that we would be representing at each level.

  • So that would look something like this.

  • If we had a much bigger number, like, 18,977, in binary, that's that bunch of ones and zeros.

  • This would be a really deep tree if I had to descend into it one at a time, it would

  • be like 15 levels deep.

  • Too much, too long.

  • So if I'd make more branches at each level, then I can chunk this up into kind of 5-bit

  • letters as it were, and descend the tree that it's now only three levels using the 32-way

  • branching.

  • So this is kind of a tradeoff between how deep your tree is going to be, and how big

  • the nodes are going to be because if I have just one bit of information at each level

  • then I have really small nodes.

  • That's quick to copy over but I have to go very, very deep down in the tree for a larger

  • array.

  • And generally, research has found that 32 is a pretty good tradeoff between the depth

  • of the tree.

  • So what we've seen is a bitmap vector TRIE.

  • That's just jargon.

  • We don't need to care about that.

  • But if you need something to Google, you can Google that.

  • This is cool for array-type of things and we have an index we want to jump there, but

  • what about objects?

  • We also want to be able to associate objects with arbitrary keys, not just indeces, so

  • we want non-integers as keys, how does that work?

  • So if I want a version of my data structure where it's no longer an array but it's something

  • like an object where I'm associated letters with each of my animals like M for monkey,

  • and P for panda, et cetera, what I can do is I can take my keys, in this case, they're