## Subtitles

• Traditionally, dot products are something that's introduced really early on in a linear algebra

• course,

• typically right at the start.

• So it might seem strange that I push them back this far in the series.

• I did this because there's a standard way to introduce the topic which

• requires nothing more than a basic understanding of vectors,

• but a fuller understanding of the role dot products play in math

• can only really be found under the light of linear transformations.

• Before that, though, let me just briefly cover

• the standard way that dot products are introduced,

• which I'm assuming is at least partially review for a number of viewers.

• Numerically, if you have two vectors of the same dimension,

• two lists of numbers with the same length,

• taking their dot product means

• pairing up all of the coordinates,

• multiplying those pairs together,

• and adding the results.

• So the vector [1, 2] dotted with [3, 4],

• would be 1 x 3 + 2 x 4.

• The vector [6, 2, 8, 3] dotted with [1, 8, 5, 3] would be:

• 6 x 1 + 2 x 8 + 8 x 5 + 3 x 3.
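The numerical recipe just described can be sketched in a few lines of Python (a minimal illustration, not something from the video itself):

```python
def dot(v, w):
    # Pair up the coordinates, multiply each pair, and add the results.
    assert len(v) == len(w), "dot product needs vectors of the same dimension"
    return sum(a * b for a, b in zip(v, w))

print(dot([1, 2], [3, 4]))              # 1*3 + 2*4 = 11
print(dot([6, 2, 8, 3], [1, 8, 5, 3]))  # 6*1 + 2*8 + 8*5 + 3*3 = 71
```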

• Luckily, this computation has a really nice geometric interpretation.

• To think about the dot product between two vectors v and w,

• imagine projecting w onto the line that passes through the origin and the tip of v.

• Multiplying the length of this projection by the length of v, you have the dot product

• v・w.

• Except when this projection of w is pointing in the opposite direction from v,

• that dot product will actually be negative.

• So when two vectors are generally pointing in the same direction,

• their dot product is positive.

• When they're perpendicular, meaning,

• the projection of one onto the other is the 0 vector,

• the dot product is 0.

• And if they're pointing generally the opposite direction, their dot product is negative.
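As a sanity check on this geometric picture, here's a small sketch (my own, with arbitrarily chosen vectors) that computes the signed projection length purely from angles, with no dot product involved, and confirms it matches the coordinate formula:

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def signed_projection_length(w, v):
    # Signed length of w projected onto the line through v,
    # computed from angles alone (no dot product used).
    angle_between = math.atan2(w[1], w[0]) - math.atan2(v[1], v[0])
    return math.hypot(*w) * math.cos(angle_between)

v, w = (2.0, 1.0), (-1.0, 3.0)
geometric = signed_projection_length(w, v) * math.hypot(*v)
assert math.isclose(geometric, dot(v, w))
print(dot(v, w))  # 2*(-1) + 1*3 = 1.0
```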

• Now, this interpretation is weirdly asymmetric,

• it treats the two vectors very differently,

• so when I first learned this, I was surprised that order doesn't matter.

• You could instead project v onto w;

• multiply the length of the projected v by the length of w

• and get the same result.

• I mean, doesn't that feel like a really different process?

• Here's the intuition for why order doesn't matter:

• if v and w happened to have the same length,

• we could leverage some symmetry.

• Projecting w onto v,

• then multiplying the length of that projection by the length of v,

• is a complete mirror image of projecting v onto w then multiplying the length of that

• projection by the length of w.

• Now, if you scale one of them, say v, by some constant like 2,

• so that they don't have equal length,

• the symmetry is broken.

• But let's think through how to interpret the dot product between this new vector 2v and

• w.

• If you think of w as getting projected onto v,

• then the dot product 2v・w will be

• exactly twice the dot product v・w.

• This is because when you scale v by 2,

• it doesn't change the length of the projection of w

• but it doubles the length of the vector that you're projecting onto.

• But, on the other hand, let's say you're thinking about v getting projected onto w.

• Well, in that case, the length of the projection is the thing that gets scaled when we multiply

• v by 2.

• The length of the vector that you're projecting onto stays constant.

• So the overall effect is still to just double the dot product.

• So, even though symmetry is broken in this case,

• the effect that this scaling has on the value of the dot product is the same

• under both interpretations.
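Numerically, this scaling property is easy to verify for yourself (a quick check with the example vectors from earlier, not from the video):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v, w = (1.0, 2.0), (3.0, 4.0)
v2 = (2 * v[0], 2 * v[1])  # v scaled by 2

# Scaling v by 2 doubles the dot product, whichever way
# you picture the projection.
assert dot(v2, w) == 2 * dot(v, w)
print(dot(v, w), dot(v2, w))  # 11.0 22.0
```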

• There's also one other big question that confused me when I first learned this stuff:

• Why on earth does this numerical process of matching coordinates, multiplying pairs,

• and adding them together have anything to do with projection?

• Well, to give a satisfactory answer,

• and also to do full justice to the significance of the dot product,

• we need to unearth something a little bit deeper going on here

• which often goes by the name "duality".

• But, before getting into that,

• I need to spend some time talking about linear transformations

• from multiple dimensions to one dimension

• which is just the number line.

• These are functions that take in a 2D vector and spit out some number.

• But linear transformations are, of course,

• much more restricted than your run-of-the-mill function with a 2D input and a 1D output.

• As with transformations in higher dimensions,

• like the ones I talked about in chapter 3,

• there are some formal properties that make these functions linear.

• But I'm going to purposely ignore those here so as to not distract from our end goal,

• and instead focus on a certain visual property that's equivalent to all the formal stuff.

• If you take a line of evenly spaced dots

• and apply a transformation,

• a linear transformation will keep those dots evenly spaced,

• once they land in the output space, which is the number line.

• Otherwise, if there's some line of dots that gets unevenly spaced

• then your transformation is not linear.
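The evenly-spaced-dots test can be mimicked numerically. Below is a small sketch (the maps f and g are hypothetical examples of my choosing): a linear map from 2D to the number line keeps the gaps between dots equal, while a nonlinear map does not:

```python
def f(v):
    # A linear map from 2D to the number line (example chosen for illustration).
    return 2 * v[0] - 1 * v[1]

def g(v):
    # A nonlinear map, for contrast.
    return v[0] * v[1]

# Evenly spaced dots along a line: p + t*d for t = 0, 1, 2, 3.
p, d = (1.0, 2.0), (0.5, -1.0)
dots = [(p[0] + t * d[0], p[1] + t * d[1]) for t in range(4)]

f_out = [f(v) for v in dots]
g_out = [g(v) for v in dots]

# Gaps between consecutive outputs on the number line.
f_gaps = [b - a for a, b in zip(f_out, f_out[1:])]
g_gaps = [b - a for a, b in zip(g_out, g_out[1:])]
print(f_gaps)  # all equal: the dots stay evenly spaced
print(g_gaps)  # unequal: the spacing is ruined
```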

• As with the cases we've seen before,

• one of these linear transformations

• is completely determined by where it takes i-hat and j-hat

• but this time, each one of those basis vectors just lands on a number.

• So when we record where they land as the columns of a matrix

• each of those columns just has a single number.

• This is a 1 x 2 matrix.

• Let's walk through an example of what it means to apply one of these transformations to a

• vector.

• Let's say you have a linear transformation that takes i-hat to 1 and j-hat to -2.

• To follow where a vector with coordinates, say, [4, 3] ends up,

• think of breaking up this vector as 4 times i-hat + 3 times j-hat.

• A consequence of linearity, is that after the transformation

• the vector will be: 4 times the place where i-hat lands, 1,

• plus 3 times the place where j-hat lands, -2,

• which in this case implies that it lands on -2.

• When you do this calculation purely numerically, it's a matrix-vector multiplication.
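That matrix-vector multiplication amounts to the following (a minimal sketch of the computation just described):

```python
def apply_1x2(matrix, v):
    # A 1 x 2 matrix [a  b] applied to [x, y] gives a*x + b*y:
    # where i-hat and j-hat land, weighted by the vector's coordinates.
    a, b = matrix
    x, y = v
    return a * x + b * y

# i-hat goes to 1, j-hat goes to -2; applied to the vector [4, 3]:
print(apply_1x2((1, -2), (4, 3)))  # 4*1 + 3*(-2) = -2
```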

• Now, this numerical operation of multiplying a 1 by 2 matrix by a vector,

• feels just like taking the dot product of two vectors.

• Doesn't that 1 x 2 matrix just look like a vector that we tipped on its side?

• In fact, we could say right now that there's a nice association between 1 x 2 matrices

• and 2D vectors,

• defined by tilting the numerical representation of a vector on its side to get the associated

• matrix,

• or tipping the matrix back up to get the associated vector.

• Since we're just looking at numerical expressions right now,

• going back and forth between vectors and 1 x 2 matrices might feel like a silly thing

• to do.

• But this suggests something that's truly awesome from the geometric view:

• there's some kind of connection between linear transformations that take vectors to numbers

• and vectors themselves.

• Let me show an example that clarifies the significance

• and which just so happens to also answer the dot product puzzle from earlier.

• Unlearn what you have learned

• and imagine that you don't already know that the dot product relates to projection.

• What I'm going to do here is take a copy of the number line

• and place it diagonally in space somehow, with the number 0 sitting at the origin.

• Now think of the two-dimensional unit vector,

• whose tip sits where the number 1 on the number line is.

• I want to give that guy a name: u-hat.

• This little guy plays an important role in what's about to happen,

• so just keep it in the back of your mind.

• If we project 2D vectors straight onto this diagonal number line,

• in effect, we've just defined a function that takes 2D vectors to numbers.

• What's more, this function is actually linear

• since it passes our visual test

• that any line of evenly spaced dots remains evenly spaced once it lands on the number

• line.

• Just to be clear,

• even though I've embedded the number line in 2D space like this,

• the outputs of the function are numbers, not 2D vectors.

• You should think of a function that takes in two coordinates and outputs a single coordinate.

• But that vector u-hat is a two-dimensional vector

• living in the input space.

• It's just situated in such a way that it overlaps with the embedding of the number line.

• With this projection, we just defined a linear transformation from 2D vectors to numbers,

• so we're going to be able to find some kind of 1 x 2 matrix that describes that transformation.

• To find that 1 x 2 matrix, let's zoom in on this diagonal number line setup

• and think about where i-hat and j-hat each land,

• since those landing spots are going to be the columns of the matrix.

• This part's super cool, we can reason through it with a really elegant piece of symmetry:

• since i-hat and u-hat are both unit vectors,

• projecting i-hat onto the line passing through u-hat

• looks totally symmetric to projecting u-hat onto the x-axis.

• So when we ask what number i-hat lands on when it gets projected,

• the answer is going to be the same as whatever u-hat lands on when it's projected onto the

• x-axis

• but projecting u-hat onto the x-axis

• just means taking the x-coordinate of u-hat.

• So, by symmetry, the number where i-hat lands when it's projected onto that diagonal number

• line

• is going to be the x coordinate of u-hat.

• Isn't that cool?

• The reasoning is almost identical for the j-hat case.

• Think about it for a moment.

• For all the same reasons, the y-coordinate of u-hat

• gives us the number where j-hat lands when it's projected onto the number line copy.

• Pause and ponder that for a moment; I just think that's really cool.

• So the entries of the 1 x 2 matrix describing the projection transformation

• are going to be the coordinates of u-hat.

• And computing this projection transformation for arbitrary vectors in space,

• which requires multiplying that matrix by those vectors,

• is computationally identical to taking a dot product with u-hat.
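Here's a quick numeric check of that claim, using a hypothetical unit vector of my choosing (the 3-4-5 direction): the entries of the projection matrix are u-hat's coordinates, so applying it is the same computation as dotting with u-hat, and a vector already on the line projects to its own signed length:

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

# A unit vector defining a diagonal copy of the number line
# (chosen here as the 3-4-5 direction for exact arithmetic).
u_hat = (3 / 5, 4 / 5)

# i-hat and j-hat land on u-hat's x- and y-coordinates, respectively.
assert dot((1, 0), u_hat) == u_hat[0]
assert dot((0, 1), u_hat) == u_hat[1]

# Sanity check: a vector already on the line, t * u_hat,
# should project to the number t.
t = 7.0
on_line = (t * u_hat[0], t * u_hat[1])
assert math.isclose(dot(on_line, u_hat), t)
```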

• This is why taking the dot product with a unit vector,

• can be interpreted as projecting a vector onto the span of that unit vector and taking

• the length.

• So what about non-unit vectors?

• For example,

• let's say we take that unit vector u-hat,

• but we scale it up by a factor of 3.

• Numerically, each of its components gets multiplied by 3,

• So looking at the matrix associated with that vector,

• it takes i-hat and j-hat to 3 times the values where they landed before.

• Since this is all linear,

• it implies more generally,

• that the new matrix can be interpreted as projecting any vector onto the number line

• copy

• and multiplying where it lands by 3.

• This is why the dot product with a non-unit vector

• can be interpreted as first projecting onto that vector

• then scaling up the length of that projection by the length of the vector.
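Continuing the same sketch (with the same hypothetical 3-4-5 unit vector as before): dotting with the scaled-up vector gives exactly 3 times the projection onto the line:

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

u_hat = (3 / 5, 4 / 5)                   # unit vector, length 1
scaled = (3 * u_hat[0], 3 * u_hat[1])    # scaled by 3, so length 3

w = (2.0, 5.0)
# Dotting with the scaled vector: project w onto the line,
# then multiply where it lands by the vector's length, 3.
projection_of_w = dot(w, u_hat)
assert math.isclose(dot(w, scaled), 3 * projection_of_w)
print(projection_of_w, dot(w, scaled))
```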

• Take a moment to think about what happened here.

• We had a linear transformation from 2D space to the number line,

• which was not defined in terms of numerical vectors or numerical dot products.

• It was just defined by projecting space onto a diagonal copy of the number line.

• But because the transformation is linear,

• it was necessarily described by some 1 x 2 matrix,

• and since multiplying a 1 x 2 matrix by a 2D vector

• is the same as turning that matrix on its side and taking a dot product,

• this transformation was, inescapably, related to some 2D vector.

• The lesson here, is that anytime you have one of these linear transformations

• whose output space is the number line,

• no matter how it was defined there's going to be some unique vector v

• corresponding to that transformation,

• in the sense that applying the transformation is the same thing as taking a dot product

• with that vector.
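That correspondence can be made concrete: given any linear map from 2D to numbers, its dual vector is recovered by just asking where i-hat and j-hat land. A small sketch (the helper name and the example map are my own, purely for illustration):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def dual_vector(f):
    # For a linear map f from 2D vectors to numbers, the corresponding
    # vector's coordinates are where i-hat and j-hat land.
    return (f((1, 0)), f((0, 1)))

# Some linear map, defined without ever mentioning a vector:
f = lambda v: 3 * v[0] - 5 * v[1]

v_dual = dual_vector(f)
print(v_dual)  # (3, -5)

# Applying f is the same thing as taking a dot product with its dual vector.
assert f((4, 7)) == dot((4, 7), v_dual)
```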

• To me, this is utterly beautiful.

• It's an example of something in math called "duality".

• "Duality" shows up in many different ways and forms throughout math

• and it's super tricky to actually define.

• Loosely speaking, it refers to situations where you have a natural but surprising correspondence

• between two types of mathematical things.

• For the linear algebra case that you just learned about,

• you'd say that the "dual" of a vector is the linear transformation that it encodes.

• And the dual of a linear transformation from space to one dimension,

• is a certain vector in that space.

• So, to sum up, on the surface, the dot product is a very useful geometric tool for understanding

• projections

• and for testing whether or not vectors tend to point in the same direction.

• And that's probably the most important thing for you to remember about the dot product,

• but at a deeper level, dotting two vectors together

• is a way to translate one of them into the world of transformations:

• again, numerically, this might feel like a silly point to emphasize;

• it's just two computations that happen to look similar.

• But the reason I find this so important,

• is that throughout math, when you're dealing with a vector,

• once you really get to know its personality

• sometimes you realize that it's easier to understand it, not as an arrow in space,

• but as the physical embodiment of a linear transformation.

• It's as if the vector is really just a conceptual shorthand for a certain transformation,

• since it's easier for us to think about arrows in space

• rather than moving all of that space to the number line.

• In the next video, you'll see another really cool example of this "duality" in action.