Placeholder Image

Subtitles section Play video

  • Hi, everybody.

  • How we doing?

  • We got caffeinated?

  • Feeling good?

  • Nice.

  • So I'm Anjana Vakil, hello.

  • You can find me on Twitter at my name and today I'd like to talk to you about immutable

  • data structures for functional programming in JavaScript.

  • We're going to take a look at what immutable data structures are, why they're a really

  • cool way to handle the immutability that we typically use when we're doing functional

  • programming and how we can do that in JavaScript because I hear y'all like JavaScript!

  • So a little about me.

  • I'm probably the only not-web-developer in the room.

  • I am an engineer for Uber Research.

  • I work with them to develop a custom query language for data in the scientific research

  • funding domain.

  • I'm also an alum of the Recurse Center, which is a fantastic programming community in New

  • York City, and I am an alum of the Outreach Program, which if you have haven't heard of

  • it, it's getting women and more folks involved in these by giving them internships at Mozilla.

  • So I'm really happy to chat about those things if you want to come grab me after the talk.

  • But you might know that I like functional programming.

  • I think it rocks.

  • Anybody else agree with me that functional programming is cool?

  • Yeah!

  • Yeah, so functional programming is a pretty great way to avoid some of the headaches of

  • like imperative and object-oriented programming.

  • In functional programming, what we typically do is conceive of our programs as being just

  • pure functions.

  • That means their transform their inputs to outputs, and that's all they do.

  • They don't have my side effects like changing things in the console, and my taking things

  • in the global state are side effects.

  • But our data becomes data in, data out, and transformers of data.

  • And one thing that goes hand-in-hand with this, with avoiding side effects is immutable

  • data.

  • Immutable data meaning once we've created it, it never changes.

  • So this is a really good way of changing something accidental outside of your function.

  • If everything is immutable, you can't change anything.

  • So immutability another thing that rocks and it rocks pretty hard for other reasons that

  • we'll see in a moment.

  • But speaking of rocks, let's talk about rocks.

  • So this is a rock, and immutability rocks in the way that rocks rock.

  • Now I don't know about you, but I've been going to a lot of tech conferences recently

  • and I've been feeling like there has enough poetry.

  • So I'd like to read you a poem: Nobody sits like this rock sits.

  • You rock, rock.

  • The rock just sits and is.

  • You show us how to just sit here.

  • And that's what we need.

  • It's so true, so deep.

  • This is from -- Don't thank me, thank I Heart Huckabees, that's

  • a great movie.

  • Check it out.

  • So this is really how immutable data rocks.

  • It just sits there.

  • It just is.

  • Once we've created it, it never changes and that's amazing because it can help us avoid

  • some of the headaches of immutability.

  • So with immutability, we have some things pretty easy, but other things become harder

  • and we'll see how that looks.

  • So let's say I have an array called foo and it's got some numbers in it.

  • Hm, and I'm already bored.

  • Let's make it more fun.

  • Let's say I have a zoo with some animals -- more fun!

  • And I decided that I want to change something up about my zoo.

  • Maybe I want to replace that rabbit there with something a little more exotic.

  • Like an alien!

  • So this is cool.

  • I'm happy because I wanted a more exotic zoo.

  • I got an alien in my zoo now.

  • I didn't have to change anything except for that one little cell in my array.

  • That's pretty sweet but my co-worker over was expecting zoo to be filled with earth

  • beings, earth animals, and wasn't accounting for there being an alien in it.

  • Who put that in there?

  • Now my program doesn't work anymore.

  • Who did that?

  • So immutability has a couple problems.

  • We have to manage who's been changing what, when -- who's been putting which animals in

  • the zoo.

  • We have to have a lot of overhead to manage that state, and that gives us headaches as

  • individuals, and as teams.

  • We also get bugs in the code because maybe I was only planning -- or my co-worker was

  • only planning -- to handle terrestrial beings and didn't have a case of aliens being accounted

  • for, and that broke something.

  • So these are some side effects of immutability that don't make us happy.

  • Let's try doing things the immutable way.

  • So in an immutable world, my array, my zoo, once I've created it, it just sits and is

  • forever.

  • I cannot change it.

  • What I can do if I want a new zoo that's more exotic is I can make a copy that's the same

  • size as my original array, and I can make the modification I want, so I can put my alien

  • in there in place of the rabbit.

  • And so this is pretty sweet because now my co-worker is maybe, and they're, like, whoo,

  • nothing broke in my program, and it's all still animal creatures but I had to copy over

  • that whole array.

  • I had to allocate the space for that entire array, even all of the stuff that didn't change.

  • I had to copy all of that over, as well.

  • So this means that my code runs pretty slow.

  • And it also takes up a lot of memory.

  • It takes up a lot of space and time.

  • The complexity on those things are bad because copying is a waste of both time and space.

  • It makes us sad face!

  • We don't want that.

  • So if we want to do immutability, we must be able to find a better way of doing that.

  • Luckily for us, a lot of very smart folks have been thinking very hard about this problem

  • for a while, and they've come up with some really good solutions for how we can deal

  • with immutability efficiently.

  • immutable data structures!

  • So immutable data structures is a term that you may have heard about, with functional

  • programming, or also in terms of React where they come in handy.

  • Technically, an immutable data structure is like the rock, it just sits, and is once you

  • create it.

  • It never changes.

  • But also hear the term persistent data structures banged about.

  • Sometimes these are used interchangeably, but they have slightly different meanings.

  • So if immutable data is data that never changes, persistent data is data for which we have

  • access to old versions.

  • So as we've been creating new modified versions of our data structures, we keep the old versions

  • around.

  • You might hear about partially persistent data structures where we can look at the old

  • versions, we can access them, but we can't go back and update any of them.

  • All we can update is the most current version that we have.

  • And then you might also hear about fully persistent data structures where we can actually time

  • travel, we can go back and update any of our past versions.

  • And if this is starting to ring a bell like it's version control like git, it's sort of

  • the same idea.

  • So we're going to talk about these as persistent immutable data structures, they're both persistent,

  • and immutable.

  • Let's see how this works.

  • The key to all of this is we want the old versions of our data, like, my original zoo

  • to stay put.

  • We just want to to sit like the rock but we want new versions to be created efficiently.

  • So what magical tricks do we have to use to, like, make this happen?

  • Do we have to make invocations do dances to the gods of space and time complexity?

  • No.

  • It's very simple.

  • Trees and sharing.

  • Isn't that sweet?

  • These two simple concepts will get us efficient immutable data.

  • How?

  • So let's talk about trees because trees rock pretty hard, as well, alternative, unfortunately

  • I don't have a poem for that, sorry.

  • Imagine that we could find a way to represent our zoo array as a tree.

  • So one thing I could do is I could put all of my animals -- all of my values -- in the

  • leaves of a tree, and I could make it so that each leaf holds one value, one animal.

  • But they might get lonely, so let's put them with a buddy.

  • Let's put them 2x2.

  • So each of our leaves will have two values and we'll hope that the buddies get along

  • and not each each other -- looking at you, tiger, number six, don't eat that koala, and

  • we can go up to intermediate nodes up and up, until we get to the root node of the whole

  • structure, and now that root is an array represented previously by a tree.

  • So this is my tree now in this structure.

  • So given this type of structure, how do we update something?

  • Given that my data is immutable, and it can never change, how can I handle the fact that

  • it has an alien in it.

  • So here what I would do is I would take the node that contains the value that I want to

  • change.

  • So in this case it would be the 0/1 node that you see on the bottom of the screen.

  • And so I make a new copy where I've still got my monkey but I've changed the rabbit

  • to an alien.

  • And then I need to copy any of the intermediate nodes in the tree that were pointing to the

  • node that I changed.

  • So I basically trace a path up towards the root of the tree, which, now, I've got a new

  • root, which means another version of the data structure.

  • So this technique of making this update by copying the path from the leaf I changed to

  • the root is called path copying.

  • That's pretty cool because now I didn't have to copy the entire array; I just had to copy

  • the nodes on the way from the root to the leaf that I changed.

  • So if we've turned in something linear and copying into something logarithm.

  • That's pretty cool, that's more performant, and the data of this is that all of these

  • nodes in yellow here, so most of the tree is shared between the two versions, between

  • the old version and the new.

  • And so this saves me a lot of space because I can actually reuse the parts of the original

  • version that didn't change, whereas, before, I had to copy those over, as well.

  • So this means that what was before, like, a lot of memory consumption becomes a lot

  • smaller because you don't have to store as many copies of the things if they didn't change.

  • And that's called structural changing because we're sharing the structure of the tree between

  • the two versions.

  • So we've been talking about updating things but how do we get at the values in our data

  • structure?

  • How do we access them?

  • Well, it turns out this isn't just a tree, it's a special type of tree called a TRIE

  • tree, which originally came from the world "retrieval," so people could, I guess, call

  • it tree, which is funny because we also call TREE trees, so we can call them "tries" if

  • we want.

  • So a try is a type of tree, where the leaves represent the values, and the paths to the

  • value are the keys that that data is associated with.

  • So often you see TRIEs with values stored as keys.

  • So, for example, if I have T stored as a key, what I do to get to the T is I trace the tree

  • one letter at a time.

  • Then I go to T, and then to E, and then to EA, is my key, and then my value there is

  • three.

  • Because everything at the end sounds like "ee" in this talk.

  • So this is pretty cool, but in our data structure, we weren't using words, we just wanted an

  • array-type thing, we wanted indeces, right?

  • So the insight here is if we treat the index as a binary number, then we can pretend that

  • that's kind of, like, our word and we can descend the tree, bit-by-bit as if each representation

  • of our binary representation is a letter.

  • So let's see how that works.

  • If I'm trying to get at item five in my array, so the animal at index five, I'd convert that

  • to binary, so that's one, zero, one, and then I step through that as if it was a word.

  • I step through it letter-by-letter, bit-by-bit.

  • So I go from the root to the branch.

  • I have a choice of either zero or one.

  • I go to branch one first.

  • And then I go to branch zero, and then I take the thing on the one side.

  • So I go one, zero, one, down my tree and then I end up at my frog at index five.

  • So this is a pretty simple insight but it ends up being incredibly powerful because

  • it allows us to quickly traverse this tree structure, which lets us use that structural

  • sharing to more efficiently represent our new copies of our immutable data structure.

  • And, importantly, we don't have to be using a binary tree, meaning we have two branches

  • from each node.

  • That fits pretty well on a slide, but actually what you mostly see is a 32-way branching.

  • So in our trees that we've been looking at, we've kind of had one bit of information per

  • level.

  • And we've been descending bit-by-bit but if we had a 32-way branching tree, it would be

  • five bits of information that we would be representing at each level.

  • So that would look something like this.

  • If we had a much bigger number, like, 18,977, in binary, that's that bunch of ones and zeros.

  • This would be a really deep tree if I had to descend into it one at a time, it would

  • be like 15 levels deep.

  • Too much, too long.

  • So if I'd make more branches at each level, then I can chunk this up into kind of 5-bit

  • letters as it were, and descend the tree that it's now only three levels using the 32-way

  • branching.

  • So this is kind of a tradeoff between how deep your tree is going to be, and how big

  • the nodes are going to be because if I have just one bit of information at each level

  • then I have really small nodes.

  • That's quick to copy over but I have to go very, very deep down in the tree for a larger

  • array.

  • And generally, research has found that 32 is a pretty good tradeoff between the depth

  • of the tree.

  • So what we've seen is a bitmap vector TRIE.

  • That's just jargon.

  • We don't need to care about that.

  • But if you need something to Google, you can Google that.

  • This is cool for array-type of things and we have an index we want to jump there, but

  • what about objects?

  • We also want to be able to associate objects with arbitrary keys, not just indeces, so

  • we want non-integers as keys, how does that work?

  • So if I want a version of my data structure where it's no longer an array but it's something

  • like an object where I'm associated letters with each of my animals like M for monkey,

  • and P for panda, et cetera, what I can do is I can take my keys, in this case, they're

  • letters, and hash them to get a number that represents the key.

  • So that each key will have its own number.

  • They won't be in order necessarily, but that's okay.

  • Objects don't have to be in order.

  • And then we can use the hash of that number in binary to descend the tree as before.

  • So if I wanted to look up the value associated with key "F," I could hash F, get some number,

  • and let's say I get five, like, A, B, C, D, E, five.

  • And that would be represented in binary as one, and I descend the tree as before, here,

  • for simplicity, just using a one bit at a time, two-way branching tree.

  • But typically we would be doing this with 32 branches per level.

  • So, again, we just descend the tree using the binary representation of our key, in this

  • case, we used a hash function to transform it from some arbitrary object into a number

  • and we get the animal we want -- in this case, our frog.

  • Cool.

  • So that, if you want to Google it, the thing you could Google is a hash array mapped TRIE.

  • And this was a data structure parented by Phil Bagwell, and Rich Hickey, kind of started

  • using it, and a lot of these an implemented in languages like Clojure to implement the

  • data efficiently.

  • There's a ton of optimizations that are usually done on these data structures to make them

  • super-duper fast and lots of details that we're not covering here but this is the basic

  • idea.

  • Trees to represent our data, structural sharing so that we can reuse as much information as

  • possible between the old versions and the new versions.

  • And this idea of using binary representations of our keys, whether indeces, or hashed keys

  • to descend the tree to find the thing we're looking for.

  • So to recap, mutability induces headaches.

  • It is to be avoided especially if you're doing functional programming where the essential

  • idea is to not have side effects and only be using pure functions that don't change

  • anything except do the computation on the input and return the output.

  • Immutability, on the other hand, is great because if I'm using immutable data, I can't

  • mess up my co-worker's program by making the zoo she only thought was animals suddenly

  • have an alien in it.

  • But copying is a really bad way of handling data because it is not efficient neither with

  • respect to time, nor space.

  • And structural sharing, using these tree structures -- or TRIE structures, and sharing as much

  • information from one version to the next is the really performant way to do this.

  • And so you're probably thinking, okay, these data structures are pretty cool.

  • But what am I supposed to do with them?

  • I'm not going to be building boxes of emoji here, am I?

  • No, you don't have to.

  • In JavaScript, there are some really great libraries out there to help you use these

  • right off the bat.

  • There are various solutions but I'm going to talk about a couple of them.

  • So one is called Mori.

  • Mori is basically a port of Clojure script by David Nolan that allows you to leverage

  • the implementations of these data structures from ClojureScript, which is the version of

  • Clojure which targets JavaScript from the comfort of your vanilla JavaScript.

  • And it's got a bit more of a Clojure feel to it.

  • A bit more of a functional language feel.

  • The API is functional and we're going to see what that looks like in a moment.

  • But that's one thing that kind of sets this library apart.

  • On the other hand, there's also Immutable.js.

  • This is a library put out by Facebook.

  • It was created by Lee Byron.

  • And this is a JavaScript implementation of these data structures.

  • So it has a bit more of that native JavaScript feel to it.

  • It doesn't have kind of the Clojure background brought in.

  • And that means it's got a more object-oriented style API, although it is still returning

  • new versions of data structures instead of changing mutable structures in place.

  • So let's see what those look like.

  • This is how you might use Mori to create what's called a vector.

  • A vector is the data structure from Mori that you'd probably be using as an array-type thing.

  • So I've got a vector that I'm calling A because it's sort of array-ish.

  • It's got one and two in it.

  • And if I want to push something onto that, the function that I'd use is conj.

  • This is from the Clojure called, Lisp-speak.

  • And what I would put in is the original A, and then what I want, which is, in this case,

  • three.

  • And you'll see that this creates this new structure on the right.

  • These vector, one, two, and one, two, three, they look different because they're not really

  • JavaScript arrays although you can convert back and forth.

  • But the point is this cong function returns a new value which I can catch as A2 and I

  • can prove to myself that my original A didn't change by using the count function to see

  • how many things are in it.

  • And there's only two things in it.

  • But I can prove that my version, A2, has the third thing by trying to access, by using

  • the get function to trying to get two, which it tells me, it is indeed three.

  • This is the same thing that you would use in Immutable.js.

  • Here you would use Immutable.js.list.of, that's interesting syntax.

  • But it creates something more like a JavaScript array.

  • Although it is not an array, it is a JS list.

  • That I'll call an array and if I want to add something onto a new version of A, I use this

  • sort of dot-method notation that we're used to.

  • I'd say a.push(3), but, importantly, this is not changing a.

  • It's just returning a new value of a, which I'm going to capture as a2 and I can prove

  • to myself that it didn't change.

  • A.size tells me it's two, and if I try to get the item at index two, I find that it's

  • three, as I expected.

  • So, similarly, for what are called maps, which is kind of the key-value object that we might

  • be using, if I create an object, o, which is going to be my Mori hashmap data structure,

  • I'm associating a is one, b with two, again, we see that the syntax is a little different

  • from our regular JavaScript beastlier not regular JavaScript objects.

  • They're super special immutable data structures, they need special syntax.

  • And so if I want to change the value of one of my keys, I can use this asoc function,

  • and then change the value of three in my new version, o2, and then I can prove to myself

  • that the original didn't change by using the get function to make sure that a in the original

  • one -- o, is one, and the a in o2 is three, as I would expect.

  • And it looks quite similar in Immutable.js except the structure is called map, not hashmap,

  • and I can pass in a little JavaScript object, and it gives me a little o, a little more

  • JavaScript syntax than we're used to.

  • This has a bit more of a syntax and feel that you might be used to from JavaScript programming,

  • I can use the set method on o to create a new version where a is now three, and I can

  • use the get methods on my old version o, and my new-version o2 to prove to myself that

  • the old one didn't change.

  • So these are really immutable data structures.

  • They look really weird if you try to look at them in the console just as JavaScript

  • objects.

  • They're really fun to kind of poke down into because they have this complicated tree structure.

  • So I highly recommend that you try out these libraries and see what works for you.

  • I can tell you really just briefly before I run out of time here, that how they compare

  • is basically, again, Mori is from the Clojure world, it's ClojureScript.

  • But the Immutable.js has more of the o.get() kind of feel to it, if you're comfortable

  • writing JavaScript like that.

  • However, for me, it gives me a little bit of a cognitive dissonance there because it

  • looks like we're mutating things with those calls -- we're not -- but for me, to get more

  • into the mindset of functional programming, I prefer the functional programming of Mori

  • because it gets to the way that we conceive things as inputs and not just outs.

  • We don't want to be in the mindset of making changes in place to objects.

  • There's also some minor performance differences between the two, Mori is a bit faster, and

  • Immutable.js is a bit smaller.

  • But they're both great options, try them out, and I hope one of them works for you.

  • So that's my talk.

  • I hope it's been useful.

  • Go forth and don't mutate your data!

  • Here's some references for you.

Hi, everybody.

Subtitles and vocabulary

Click the word to look it up Click the word to find further inforamtion about it