# Cramer's rule, explained geometrically | Essence of linear algebra, chapter 12

## Subtitles

• In a previous video, I've talked about linear systems of equations, and I sort of brushed

• aside the discussion of actually computing solutions to these systems.

• And while it's true that number-crunching is something we typically leave to the computers,

• digging into some of these computational methods is a good litmus test for whether or not you

• actually understand what's going on, since this is really where the rubber meets the road.

• Here I want to describe the geometry behind a certain method for computing solutions to

• these systems, known as Cramer's rule.

• The relevant background needed here is an understanding of determinants, dot products,

• and of linear systems of equations, so be sure to watch the relevant videos on those

• topics if you're unfamiliar or rusty.

• But first!

• I should say up front that Cramer's rule is not the best way for computing solutions

• to linear systems of equations.

• Gaussian elimination, for example, will always be faster.

• So why learn it?

• Think of this as a sort of cultural excursion; it's a helpful exercise in deepening your

• knowledge of the theory of these systems.

• Wrapping your mind around this concept will help consolidate ideas from linear algebra,

• like the determinant and linear systems, by seeing how they relate to each other.

• Also, from a purely artistic standpoint, the ultimate result is just really pretty to think

• about, much more so than Gaussian elimination.

• Alright, so the setup here will be some linear system of equations, say with two unknowns,

• x and y, and two equations.

• In principle, everything we're talking about will work for systems with a larger number of

• unknowns, and the same number of equations.

• But for simplicity, a smaller example is nicer to hold in our heads.

• So as I talked about in a previous video, you can think of this setup geometrically

• as a certain known matrix transforming an unknown vector, [x; y], where you know what

• the output is going to be, in this case [-4; -2].

• Remember, the columns of this matrix tell you how the matrix acts as a transform, each

• one telling you where the basis vectors of the input space land.

• So this is a sort of puzzle, what input [x; y], is going to give you this

• output [-4; -2]?
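
As a quick notational sketch (the transcript doesn't spell out the on-screen matrix entries, so they're left as generic letters here), the puzzle has the form:

```latex
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} -4 \\ -2 \end{bmatrix},
\qquad
\text{where the columns }
\begin{bmatrix} a \\ c \end{bmatrix}
\text{ and }
\begin{bmatrix} b \\ d \end{bmatrix}
\text{ are where } \hat{\imath} \text{ and } \hat{\jmath} \text{ land.}
```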

• Remember, the type of answer you get here can depend on

• whether or not the transformation squishes all of space into a lower dimension.

• That is, if it has a zero determinant.

• In that case, either none of the inputs land on our given output or there are a whole bunch

• of inputs landing on that output.

• But for this video we'll limit our view to the case of a non-zero determinant, meaning

• the output of this transformation still spans the full n-dimensional space it started in;

• every input lands on one and only one output and every output has one and only one input.

• One way to think about our puzzle is that we know the given output vector is some linear

• combination of the columns of the matrix; x*(the vector where i-hat lands) + y*(the

• vector where j-hat lands), but we wish to compute what exactly x and y are.

• As a first pass, let me show an idea that is wrong, but in the right direction.

• The x-coordinate of this mystery input vector is what you get by taking its dot product

• with the first basis vector, [1; 0].

• Likewise, the y-coordinate is what you get by dotting it with the second basis vector,

• [0; 1].
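
In symbols, writing the mystery vector as $\vec{v} = [x; y]$:

```latex
x = \vec{v} \cdot \begin{bmatrix} 1 \\ 0 \end{bmatrix},
\qquad
y = \vec{v} \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix}
```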

• So maybe you hope that after the transformation, the dot products of the transformed version

• of the mystery vector with the transformed versions of the basis vectors will also be

• these coordinates x and y.

• That'd be fantastic because we know the transformed versions of each of these vectors.

• There's just one problem with this: it's not at all true!

• For most linear transformations, the dot product before and after the transformation will be

• very different.

• For example, you could have two vectors generally pointing in the same direction, with a positive

• dot product, which get pulled away from each other during the transformation, in such a

• way that they then have a negative dot product.
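
To make that failure concrete, here's a small numerical check; the matrix and vectors are made up purely for illustration, not taken from the video:

```python
import numpy as np

# A hypothetical shear-and-stretch matrix, chosen only to illustrate the point.
A = np.array([[ 1.0, -2.0],
              [-2.0,  1.0]])

v1 = np.array([1.0, 0.1])   # two vectors pointing in roughly the same direction
v2 = np.array([0.1, 1.0])

print(v1 @ v2)              # 0.2   -> positive dot product before the transformation
print((A @ v1) @ (A @ v2))  # -3.04 -> negative dot product after the transformation

i_hat, j_hat = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(i_hat @ j_hat)              # 0.0  -> perpendicular before
print((A @ i_hat) @ (A @ j_hat))  # -4.0 -> no longer perpendicular after
```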

• Likewise, if things start off perpendicular, with dot product zero, like the two basis

• vectors, there's no guarantee that they will stay perpendicular after the transformation,

• preserving that zero dot product.

• In the example we were looking at, dot products certainly aren't preserved.

• They tend to get bigger since most vectors are getting stretched.

• In fact, transformations which do preserve dot products are special enough to have their

• own name: Orthonormal transformations.

• These are the ones which leave all the basis vectors perpendicular to each other with unit

• lengths.

• You often think of these as rotation matrices.

• They correspond to rigid motion, with no stretching, squishing, or morphing.

• Solving a linear system with an orthonormal matrix is very easy: Since dot products are

• preserved, taking the dot product between the output vector and all the columns of your

• matrix will be the same as taking the dot products between the input vector and all

• the basis vectors, which is the same as finding the coordinates of the input vector.

• So, in that very special case, x would be the dot product of the first column with the

• output vector, and y would be the dot product of the second column with the output vector.
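
Here's a minimal sketch of that special case, using a rotation matrix chosen just for illustration:

```python
import numpy as np

theta = 0.7
# An orthonormal (rotation) matrix: its columns are perpendicular unit vectors.
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([3.0, 2.0])   # the "mystery" input, hidden in a real problem
b = R @ v                  # the known output

# Because dot products are preserved, dotting the output with each column
# recovers the input coordinates directly (this is just R.T @ b).
x = R[:, 0] @ b
y = R[:, 1] @ b
print(x, y)                # ~3.0, ~2.0
```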

• Now, even though this idea breaks down for most linear systems, it points us in the direction

• of something to look for: Is there an alternate geometric understanding for the coordinates

• of our input vector which remains unchanged after the transformation?

• If your mind has been mulling over determinants, you might think of this clever idea: Take

• the parallelogram defined by the first basis vector, i-hat, and the mystery input vector

• [x; y].

• The area of this parallelogram is its base, 1, times the height perpendicular to that

• base, which is the y-coordinate of our input vector.

• So, the area of this parallelogram is sort of a screwy roundabout way to describe the

• vector's y-coordinate; it's a wacky way to talk about coordinates, but run with me.

• Actually, to be more accurate, you should think of the signed area of this parallelogram,

• in the sense described by the determinant video.

• That way, a vector with negative y-coordinate would correspond to a negative area for this

• parallelogram.

• Symmetrically, if you look at the parallelogram spanned by the vector

• and the second basis vector, j-hat, its area will be the x-coordinate of the vector.

• Again, it's a strange way to represent the x-coordinate, but you'll see what it buys

• us in a moment.
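
Written as determinants, those two signed areas are, for $\vec{v} = [x; y]$:

```latex
\det\!\begin{bmatrix} 1 & x \\ 0 & y \end{bmatrix} = y
\quad\text{(columns: } \hat{\imath},\ \vec{v}\text{)},
\qquad
\det\!\begin{bmatrix} x & 0 \\ y & 1 \end{bmatrix} = x
\quad\text{(columns: } \vec{v},\ \hat{\jmath}\text{)}
```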

• Here's what this would look like in three-dimensions: Ordinarily the way you might think of one

• of a vector's coordinates, say its z-coordinate, would be to take its dot product with the

• third standard basis vector, k-hat.

• But instead, consider the parallelepiped it creates with the other two basis vectors,

• i-hat and j-hat.

• If you think of the square with area 1 spanned by i-hat and j-hat as the base of this guy,

• its volume is the same as its height, which is the third coordinate of our vector.

• Likewise, the wacky way to think about any other coordinate of this vector is to form

• the parallelepiped between this vector and all the basis vectors other than the one you're

• looking for, and get its volume.

• Or, rather, we should talk about the signed volume of these parallelepipeds, in the sense

• described in the determinant video, where the order in which you list the three vectors

• matters and you're using the right-hand rule.

• That way negative coordinates still make sense.
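
In determinant form, the signed volume encoding the z-coordinate of $\vec{v} = [x; y; z]$ is:

```latex
\det\!\begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \\ 0 & 0 & z \end{bmatrix} = z
\qquad\text{(columns: } \hat{\imath},\ \hat{\jmath},\ \vec{v}\text{)}
```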

• Okay, so why think of coordinates as areas and volumes like this?

• As you apply some matrix transformation, the areas of the parallelograms don't stay the

• same, they may get scaled up or down.

• But(!), and this is a key idea of determinants, all these areas get scaled by the same amount.

• Namely, the determinant of our transformation matrix.
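
That scaling fact is just the multiplicativity of determinants: stacking $A\hat{\imath}$ and $A\vec{v}$ as columns is the matrix product $A[\hat{\imath}\ \ \vec{v}]$, so

```latex
\det\big[\,A\hat{\imath}\;\;A\vec{v}\,\big]
= \det(A)\,\det\big[\,\hat{\imath}\;\;\vec{v}\,\big]
= \det(A)\cdot y
```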

• For example, if you look at the parallelogram spanned by the vector where your first basis

• vector lands, which is the first column of the matrix, and the transformed version of

• [x; y], what is its area?

• Well, this is the transformed version of that parallelogram we were looking at earlier,

• whose area was the y-coordinate of the mystery input vector.

• So its area will be the determinant of the transformation multiplied by that value.

• So, the y-coordinate of our mystery input vector is the area of this parallelogram,

• spanned by the first column of the matrix and the output vector, divided by the determinant

• of the full transformation.

• And how do you get this area?

• Well, we know the coordinates for where the mystery input vector lands, that's the whole

• point of a linear system of equations.

• So, create a matrix whose first column is the same as that of our matrix, and whose

• second column is the output vector, and take its determinant.
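
In the generic notation from earlier, with $[-4; -2]$ as the output vector, that recipe reads:

```latex
y \;=\; \frac{\det\!\begin{bmatrix} a & -4 \\ c & -2 \end{bmatrix}}
             {\det\!\begin{bmatrix} a & b \\ c & d \end{bmatrix}}
```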

• So look at that; just using data from the output of the transformation, namely the columns

• of the matrix and the coordinates of our output vector, we can recover the y-coordinate of

• our mystery input vector.

• Likewise, the same idea can get you the x-coordinate.

• Look at that parallelogram we defined earlier, which encodes the x-coordinate of the mystery

• input vector, spanned by the input vector and j-hat.

• The transformed version of this guy is spanned by the output vector and the second column

• of the matrix, and its area will have been multiplied by the determinant of the matrix.

• So the x-coordinate of our mystery input vector is this area divided by the determinant of

• the transformation.

• Symmetric to what we did before, you can compute the area of that output parallelogram by creating

• a new matrix whose first column is the output vector, and whose second column is the same

• as that of the original matrix.

• So again, just using data from the output space, the numbers we see in our original

• linear system, we can recover the x-coordinate of our mystery input vector.
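
And in the same notation, the x-coordinate comes out as:

```latex
x \;=\; \frac{\det\!\begin{bmatrix} -4 & b \\ -2 & d \end{bmatrix}}
             {\det\!\begin{bmatrix} a & b \\ c & d \end{bmatrix}}
```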

• This formula for finding the solutions to a linear system of equations is known as Cramer's

• rule.

• Here, just to sanity check ourselves, let's plug in the numbers.

• The determinant of that top altered matrix is 4+2, which is 6, and the bottom determinant

• is 2, so the x-coordinate should be 3.

• And indeed, looking back at that input vector we started with, its x-coordinate is 3.

• Likewise, Cramer's rule suggests the y-coordinate should be 4/2, or 2, and that is indeed the

• y-coordinate of the input vector we started with here.
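
If you want to play with this yourself, here's a minimal Python sketch of the rule. The 2x2 example matrix is a guess that happens to be consistent with the determinants quoted above (6, 4, and 2), since the transcript doesn't show the on-screen entries.

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b via Cramer's rule (a teaching sketch; for real work,
    prefer np.linalg.solve or Gaussian elimination)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("zero determinant: no unique solution")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b          # replace the i-th column with the output vector
        x[i] = np.linalg.det(Ai) / det_A
    return x

# A hypothetical 2x2 example (not necessarily the exact numbers from the video):
A = np.array([[2.0, -1.0],
              [0.0,  1.0]])
b = np.array([4.0, 2.0])
print(cramer_solve(A, b))      # [3. 2.]
print(np.linalg.solve(A, b))   # agrees
```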

• The case with three dimensions is similar, and I highly recommend you pause to think

• it through yourself.

• Here, I'll give you a little momentum.

• We have this known transformation, given by a 3x3 matrix, and a known output vector, given

• by the right side of our linear system, and we want to know what input vector lands on

• this output vector.

• If you think of, say, the z-coordinate of the input vector as the volume of this parallelepiped

• spanned by i-hat, j-hat, and the mystery input vector, what happens to the volume of this

• parallelepiped after the transformation?

• How can you compute that new volume?

• Really, pause and take a moment to think through the details of generalizing this to higher

• dimensions; finding an expression for each coordinate of the solution to larger linear

• systems.
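
Once you've thought it through, here's the general statement to check yourself against, stated only as a reference: writing $A_i$ for the matrix $A$ with its $i$-th column replaced by the output vector $\vec{b}$,

```latex
x_i \;=\; \frac{\det(A_i)}{\det(A)}
\qquad\text{for each } i,\quad \text{provided } \det(A) \neq 0
```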

• Thinking through more general cases and convincing yourself that it works is where all the learning

• will happen, much more so than listening to some dude on YouTube walk through the reasoning

• again.
