  • MALE SPEAKER: Today we're very pleased, very happy, to have

  • Luis Von Ahn here today, from Carnegie Mellon University.

  • His talk is on human computation.

  • Luis is a very new assistant professor in computer science

  • at the School of Computer Science at Carnegie Mellon

  • University.

  • He received his Ph.D. in 2005, and I'm told he was the

  • hottest new graduate on the market, with offers from just

  • about every university out there, including corporate

  • offers, too.

  • He received his B.S. from Duke University.

  • He received a Microsoft Research Fellowship Award.

  • His research interests include encouraging people to work for

  • free, as well as catching and thwarting cheaters in online

  • environments.

  • His work has appeared in over a hundred news publications

  • around the world, including the

  • New York Times, CNN, USA Today, BBC, and

  • the Discovery Channel.

  • Luis holds four patent applications and has licensed

  • technology to major internet companies.

  • Please join me in welcoming Luis Von Ahn.

  • [APPLAUSE]

  • LUIS VON AHN: Can you hear me now?

  • OK.

  • So, I want to start by asking a question to the people in

  • the audience.

  • How many of you have had to fill out a registration form

  • for something?

  • Like Yahoo, Hotmail, or Gmail, or some sort of web form where

  • you've been asked to read a distorted sequence of

  • characters or a distorted word such as this one?

  • How many of you found it annoying?

  • Awesome.

  • OK, well, that was part of my thesis.

  • That thing is called a CAPTCHA, and the reason it's

  • there is to make sure that you, the entity filling out

  • the web form, are actually a human, and not some sort of

  • computer program that was written to submit the form

  • millions and millions of times.

  • The reason it works is because humans--

  • at least non-visually impaired humans--

  • have no trouble reading distorted characters, whereas

  • computer programs simply can't do it as well yet.

  • More generally, a CAPTCHA is just a program that can tell

  • whether its user is a human or a computer.

  • OK, let me say that another way.

  • A CAPTCHA is a program that can generate and grade tests

  • that most humans can pass, but current computer

  • programs cannot.

  • Notice the paradox here.

  • A CAPTCHA is a program that can generate and grade tests

  • that it itself cannot pass.

  • So in that way, CAPTCHAs are a lot like some professors.

  • [LAUGHTER]

  • Just to make things crystal clear, let me give you an

  • example of one of these programs that can generate and

  • grade tests that most humans can pass, but current computer

  • programs cannot.

  • Here's how the program works.

  • First, the program picks a random string of letters.

  • O-A-M-G, in this case.

  • Then the program renders the string into a randomly

  • distorted image, and then the program generates a test,

  • which consists of the randomly distorted image and the

  • question, "What are the characters in this image?"
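
The three steps just described (pick a random string, render it distorted, pose the question) plus the grading step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the speaker's implementation: the names `Captcha` and `render_distorted` are hypothetical, the image rendering is stubbed out so the sketch needs no imaging library, and the case-insensitive grading is an assumption.

```python
import secrets
import string


def render_distorted(text: str) -> str:
    """Hypothetical stand-in for the image-rendering step.

    A real CAPTCHA would draw `text` into a randomly warped image. Here we
    only return a label, so the sketch stays dependency-free.
    """
    return f"<distorted image of {len(text)} characters>"


class Captcha:
    """Generates one test and grades a response to it."""

    def __init__(self, length: int = 4) -> None:
        # Step 1: pick a random string of letters.
        self.answer = "".join(
            secrets.choice(string.ascii_uppercase) for _ in range(length)
        )
        # Step 2: render the string into a distorted image (stubbed here).
        self.image = render_distorted(self.answer)
        # Step 3: the test is the image plus the question.
        self.question = "What are the characters in this image?"

    def grade(self, response: str) -> bool:
        # Assumption: ignore case and surrounding whitespace when grading.
        return response.strip().upper() == self.answer
```

Note that the program can grade the test only because it remembers the string it picked. It never has to read the distorted image itself, which is why it can generate and grade a test that it cannot pass.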

  • CAPTCHAs are used all over the place, for all kinds of

  • things, and I could spend the next hour talking about all

  • the different applications of CAPTCHAs.

  • But since I don't want to do that, I want to illustrate one

  • of the applications through a little story.

  • So a few years ago, Slashdot--

  • which is a very popular website--

  • put up this poll on their site, asking which is the best

  • computer science graduate school in the United States?

  • This is a very dangerous question to ask over the web.

  • As with most online polls, IP addresses of voters were

  • recorded to make sure that each person could only vote,

  • at most, once.

  • However, as soon as the poll went up, students at CMU wrote

  • a program that voted for CMU thousands and

  • thousands of times.

  • The next day, students at MIT wrote their own program.

  • And a few days later, the poll had to be taken down with CMU

  • and MIT having, like, a gazillion votes and every

  • other school having less than 1,000.

  • I guess the poll worked in this case.

  • [LAUGHTER]

  • I'm just kidding.

  • But in general, this is a huge problem.

  • You simply cannot trust the results of an online poll,

  • because anybody could just write a program to vote for

  • their favorite option thousands and

  • thousands of times.

  • One solution is to use a CAPTCHA to make sure that only

  • humans can vote.
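
As a rough sketch of that idea, a poll could combine the one-vote-per-IP rule with a CAPTCHA check. Everything here is hypothetical: the `Poll` class and the `captcha_passed` flag are illustrative names, and the flag is assumed to come from whatever CAPTCHA grading the site uses.

```python
class Poll:
    """Toy poll that counts a vote only if a CAPTCHA was passed."""

    def __init__(self, options: list[str]) -> None:
        self.votes = {option: 0 for option in options}
        self.seen_ips: set[str] = set()

    def vote(self, ip: str, option: str, captcha_passed: bool) -> bool:
        # Reject automated submissions: no solved CAPTCHA, no vote.
        if not captcha_passed:
            return False
        # Keep the original rule: at most one vote per IP address.
        if ip in self.seen_ips:
            return False
        # Ignore options that are not on the ballot.
        if option not in self.votes:
            return False
        self.seen_ips.add(ip)
        self.votes[option] += 1
        return True
```

The IP rule alone limits each address to one vote. The CAPTCHA check adds a second requirement: every counted vote needs a human to have solved a test.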

  • CAPTCHAs have many, many other applications.

  • Another one is in free email services.

  • For instance, there are several companies that offer

  • free email services--

  • Yahoo, Microsoft, Google--

  • and up until a few years ago, all of them were suffering

  • from a very specific type of attack.

  • It was people who wrote programs to obtain millions of

  • email accounts every day, and the people who wrote these

  • programs were usually spammers.

  • So if you're a spammer and you want to send spam from, say,

  • Yahoo, you run into the problem that each Yahoo

  • account only allows you to send, like,

  • 100 messages a day.

  • So if you want to send millions of messages a day

  • from Yahoo accounts, you have to own

  • millions of Yahoo accounts.

  • And this is why spammers wrote programs to obtain millions of

  • Yahoo accounts.

  • And the solution--

  • or one solution-- and this is what we originally suggested

  • to Yahoo-- was to use a CAPTCHA to make sure that only

  • humans can obtain free email accounts.

  • Now, since CAPTCHAs are used all over the place to stop

  • spammers from doing bad things, spammers have started

  • coming up with all kinds of dirty hacks to get around the

  • CAPTCHAs that are being used in practice.

  • So let me explain a couple of them.

  • Here's one.

  • I'm sure a lot of you have heard of this.

  • CAPTCHA sweatshops.

  • Spam companies actually are hiring people to solve

  • CAPTCHAs all day long.

  • And they are usually being hired in other countries where

  • the minimum wage is a lot lower, and this

  • is currently happening.

  • But there's at least two consolations.

  • First, it's at least costing them some.

  • So whereas before, they could get the accounts for free, now

  • it costs them a fraction of a cent per account, so they

  • can't get that many.

  • Second, CAPTCHAs are actually generating jobs in

  • underdeveloped countries.

  • [LAUGHTER]

  • So this is one dirty hack.

  • There's an even dirtier hack, and I'm sure a lot of you have

  • heard of it, and this is what some porn companies are

  • allegedly doing.

  • And I'm going to emphasize the word "allegedly." So, porn

  • companies also want to send spam.

  • They also want to break CAPTCHAs, and here's how they

  • are allegedly doing it.

  • They write a program that fills out the entire registration

  • form, say, at Yahoo.

  • And whenever the program gets to the CAPTCHA,

  • it can't solve it.

  • So what it does is it copies the CAPTCHA

  • back to the porn page.

  • Now, back at the porn page, there's a lot of people

  • looking at porn.

  • And suddenly, one of them gets this screen saying, "If you

  • want to see the next picture, you got to tell me what word

  • is in the box below." And you know what people do?

  • They type the word as fast as possible.

  • [LAUGHTER]

  • And by doing so, they are effectively solving the

  • CAPTCHA for the porn company bot.

  • That is, they're effectively obtaining a free

  • email account for them.

  • So pornographers, they're really, really smart.

  • So CAPTCHAs take advantage of human processing power in

  • order to differentiate humans from computers, and it turns

  • out that being able to do so has some very, very nice

  • applications in practice.

  • Now that I've told you about CAPTCHAs, now I can tell you

  • what this talk really is about.

  • This talk is not about CAPTCHAs.

  • This talk is about human computation.