Subtitles section Play video
-
Hello, my name is Christian Rudder,
-
and I was one of the founders of OK Cupid.
-
It's now one of the biggest dating sites in the United States.
-
Like almost everyone at the site,
-
I was a math major, and, as you might expect,
-
we're known for the analytic approach
-
we have taken to love.
-
We call it our matching algorithm.
-
Basically OK Cupid's matching algorithm helps us decide whether two people should go on a date.
-
We built our entire business around it.
-
Now, algorithm is a fancy word,
-
and people like to drop it like it's this big thing,
-
but, really, an algorithm is just a systematic,
-
step-by-step way to solve a problem.
-
It doesn't have to be fancy at all.
-
Here, in this lesson, I'm going to explain
-
how we arrived at our particular algorithm
-
so you can see how it's done.
-
Now, why are algorithms even important?
-
Why does this lesson even exist?
-
Well, notice one very significant phrase I used above:
-
they are a "step-by-step" way to solve a problem,
-
and, as you probably know,
-
computers excel at step-by-step processes.
-
A computer without an algorithm
-
is basically an expensive paperweight.
-
And since computers are such a pervasive part of everyday life, algorithms are everywhere.
-
The math behind OK Cupid's matching algorithm
-
is surprisingly simple.
-
It's just some addition,
-
multiplication,
-
a little bit of square roots.
-
The tricky part in designing it, though,
-
was figuring out how to take something mysterious,
-
human attraction,
-
and break it into components that a computer can work with.
-
Well, the first thing we needed to match people up was data,
-
something for the algorithm to work with.
-
The best way to get data quickly from people
-
is to just ask for it.
-
So, we decided that OK Cupid should ask users questions,
-
stuff like, "Do you want to have kids one day?"
-
and "How often do you brush your teeth?",
-
"Do you like scary movies?"
-
and big stuff like "Do you believe in God?"
-
Now, a lot of the questions are good
-
for matching like with like,
-
that is when both people answer the same way.
-
For example, two people who are both into scary movies are probably a better match than one person who is and one person who isn't.
-
But what about a question like,
-
"Do you like to be the center of attention?"
-
If both people in a relationship are saying yes to this,
-
then they are going to have massive problems.
-
We realized this early on,
-
and so we decided we needed
-
a bit more data from each question.
-
We had to ask people to specify not only their own answer,
-
but the answer they wanted from someone else.
-
That worked really well,
-
but we needed one more dimension.
-
Some questions tell you more about a person than others.
-
For example, a question about politics, something like,
-
"Which is worse: book burning or flag burning?"
-
might reveal more about someone than their taste in movies.
-
And it doesn't make sense to weigh all things equally,
-
so we added one final data point.
-
For everything that OK Cupid asks you,
-
you have a chance to tell us
-
the role it plays in your life,
-
and this ranges from irrelevant to mandatory.
-
So now, for every question,
-
we have three things for our algorithm:
-
first, your answer;
-
second, how you want someone else,
-
your potential match,
-
to answer;
-
and three, how important the question is to you at all.
-
With all this information,
-
OK Cupid can figure out how well two people will get along.
-
The algorithm crunches the numbers and gives us a result.
-
As a practical example,
-
let's look at how we'd match you with another person,
-
let's call him, "B".
-
Your match percentage with B is based on
-
questions you've both answered.
-
Let's call that set of common questions, "s".
-
As a very simple example, we use a small set "s"
-
with just two questions in common
-
and compute a match from that.
-
Here are our two example questions.
-
The first one, let's say, is, "How messy are you?"
-
and the answer possibilities are
-
very messy,
-
average,
-
and very organized.
-
And let's say you answered "very organized,"
-
and you'd like someone else to answer "very organized,"
-
and the question is very important to you.
-
Basically you are a neat freak.
-
You're neat,
-
you want someone else to be neat,
-
and that's it.
-
And let's say B is a little bit different.
-
He answered very organized for himself,
-
but average is OK with him
-
as an answer from someone else,
-
and the question is only a little important to him.
-
Let's look at the second question,
-
it's the one from our previous example:
-
"Do you like to be the center of attention?"
-
The answers are just yes and no.
-
Now you've answered "no,"
-
how you want someone else to answer is "no,"
-
and the questions is only a little important to you.
-
Now B, he's answered "yes,"
-
he wants someone else to answer "no,"
-
because he wants the spotlight on him,
-
and the question is somewhat important to him.
-
So, let's try to compute all of this.
-
Our first step is,
-
since we use computers to do this,
-
we need to assign numerical values
-
to ideas like "somewhat important" and "very important"
-
because computers need everything in numbers.
-
We at OK Cupid decided on the following scale:
-
irrelevant is worth 0,
-
a little important is worth 1,
-
somewhat important is worth 10,
-
very important is 50,
-
and absolutely mandatory is 250.
-
Next, the algorithm makes two simple calculations.
-
The first is how much did B's answers satisfy you,
-
that is, how many possible points did B score on your scale?
-
Well, you indicated that B's answer
-
to the first question about messiness
-
was very important to you.
-
It's worth 50 points and B got that right.
-
The second question is worth only 1
-
because you said it was only a little important,
-
and B got that wrong.
-
So B's answers were 50 out of 51 possible points.
-
That's 98% satisfactory.
-
It's pretty good.
-
And, the second question of the algorithm looks at
-
is how much did you satisfy B.
-
Well, B placed 1 point on your answer
-
to the messiness question
-
and 10 on your answer to the second.
-
Of those, 11, that's 1 plus 10,
-
you earned 10,
-
you guys satisfied each other on the second question.
-
So your answers were 10 out of 11
-
equals 91% satisfactory to B.
-
That's not bad.
-
The final step is to take these two match percentages
-
and get one number for the both of you.
-
To do this, the algorithm multiplies your scores,
-
then takes the nth root,
-
where n is the number of questions.
-
Because s, which is the number of questions,
-
in this sample, is only 2,
-
we have match percentage equals
-
the square root of 98% times 91%.
-
That equals 94%.
-
That 94% is your match percentage with B.
-
It's a mathematical expression
-
of how happy you'd be with each other
-
based on what we know.
-
Now, why does the algorithm multiply as opposed to, say,
-
average the two match scores together
-
and do the square-root business?
-
In general, this formula is called the geometric mean,
-
which is a great way to combine values
-
that have wide ranges
-
and represent very different properties.
-
In other words, it's perfect for romantic matching.
-
You've got wide ranges
-
and you've got tons of different data points,
-
like I said, about movies,
-
about politics,
-
about religion,
-
about everything.
-
Intuitively, too, this makes sense.
-
Two people satisfying each other 50%
-
should be a better match
-
than two others who satisfy 0 and 100,
-
because affection needs to be mutual.
-
After adding a little correction for margin of error,
-
in the case when we have a very small number of questions,
-
like we do in this example,
-
we're good to go.
-
Any time OK Cupid matches two people,
-
it goes through the steps we just outlined.
-
First it collects data about your answers,
-
then it compares your choices and preferences
-
to other people in simple, mathematical ways.
-
This, the ability to take real world phenomena and make them something a microchip can understand, is, I think, the most important skill anyone can have these days.
-
Like you use sentences to tell a story to a person,
-
you use algorithms to tell a story to a computer.
-
If you learn the language,
-
you can go out and tell your stories.
-
I hope this will help you do that.