  • Imagine if you could record your life --

  • everything you said, everything you did,

  • available in a perfect memory store at your fingertips,

  • so you could go back

  • and find memorable moments and relive them,

  • or sift through traces of time

  • and discover patterns in your own life

  • that previously had gone undiscovered.

  • Well, that's exactly the journey

  • that my family began

  • five and a half years ago.

  • This is my wife and collaborator, Rupal.

  • And on this day, at this moment,

  • we walked into the house with our first child,

  • our beautiful baby boy.

  • And we walked into a house

  • with a very special home video recording system.

  • (Video) Man: Okay.

  • Deb Roy: This moment

  • and thousands of other moments special for us

  • were captured in our home

  • because in every room in the house,

  • if you looked up, you'd see a camera and a microphone,

  • and if you looked down,

  • you'd get this bird's-eye view of the room.

  • Here's our living room,

  • the baby bedroom,

  • kitchen, dining room

  • and the rest of the house.

  • And all of these fed into a disk array

  • that was designed for a continuous capture.

  • So here we are flying through a day in our home

  • as we move from sunlit morning

  • through incandescent evening

  • and, finally, lights out for the day.

  • Over the course of three years,

  • we recorded eight to 10 hours a day,

  • amassing roughly a quarter-million hours

  • of multi-track audio and video.

  • So you're looking at a piece of what is by far

  • the largest home video collection ever made.

  • (Laughter)

  • And what this data represents

  • for our family at a personal level,

  • the impact has already been immense,

  • and we're still learning its value.

  • Countless moments

  • of unsolicited natural moments, not posed moments,

  • are captured there,

  • and we're starting to learn how to discover them and find them.

  • But there's also a scientific reason that drove this project,

  • which was to use this natural longitudinal data

  • to understand the process

  • of how a child learns language --

  • that child being my son.

  • And so with many privacy provisions put in place

  • to protect everyone who was recorded in the data,

  • we made elements of the data available

  • to my trusted research team at MIT

  • so we could start teasing apart patterns

  • in this massive data set,

  • trying to understand the influence of social environments

  • on language acquisition.

  • So we're looking here

  • at one of the first things we started to do.

  • This is my wife and I cooking breakfast in the kitchen,

  • and as we move through space and through time,

  • a very everyday pattern of life in the kitchen.

  • In order to convert

  • this opaque, 90,000 hours of video

  • into something that we could start to see,

  • we use motion analysis to pull out,

  • as we move through space and through time,

  • what we call space-time worms.

  • And this has become part of our toolkit

  • for being able to look and see

  • where the activities are in the data,

  • and with it, trace the pattern of, in particular,

  • where my son moved throughout the home,

  • so that we could focus our transcription efforts,

  • all of the speech environment around my son --

  • all of the words that he heard from myself, my wife, our nanny,

  • and over time, the words he began to produce.

  • So with that technology and that data

  • and the ability to, with machine assistance,

  • transcribe speech,

  • we've now transcribed

  • well over seven million words of our home transcripts.

  • And with that, let me take you now

  • for a first tour into the data.

  • So you've all, I'm sure,

  • seen time-lapse videos

  • where a flower will blossom as you accelerate time.

  • I'd like you to now experience

  • the blossoming of a speech form.

  • My son, soon after his first birthday,

  • would say "gaga" to mean water.

  • And over the course of the next half-year,

  • he slowly learned to approximate

  • the proper adult form, "water."

  • So we're going to cruise through half a year

  • in about 40 seconds.

  • No video here,

  • so you can focus on the sound, the acoustics,

  • of a new kind of trajectory:

  • gaga to water.

  • (Audio) Baby: Gagagagagaga

  • Gaga gaga gaga

  • guga guga guga

  • wada gaga gaga guga gaga

  • wader guga guga

  • water water water

  • water water water

  • water water

  • water.

  • DR: He sure nailed it, didn't he?

  • (Applause)

  • So he didn't just learn water.

  • Over the course of the 24 months,

  • the first two years that we really focused on,

  • this is a map of every word he learned in chronological order.

  • And because we have full transcripts,

  • we've identified each of the 503 words

  • that he learned to produce by his second birthday.

  • He was an early talker.

  • And so we started to analyze why.

  • Why were certain words born before others?

  • This is one of the first results

  • that came out of our study a little over a year ago

  • that really surprised us.

  • The way to interpret this apparently simple graph

  • is, on the vertical is an indication

  • of how complex caregiver utterances are

  • based on the length of utterances.

  • And the [horizontal] axis is time.

  • And all of the data,

  • we aligned based on the following idea:

  • Every time my son would learn a word,

  • we would trace back and look at all of the language he heard

  • that contained that word.

  • And we would plot the relative length of the utterances.

  • And what we found was this curious phenomenon,

  • that caregiver speech would systematically dip to a minimum,

  • making language as simple as possible,

  • and then slowly ascend back up in complexity.

  • And the amazing thing was

  • that bounce, that dip,

  • lined up almost precisely

  • with when each word was born --

  • word after word, systematically.
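
The alignment described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the research team's actual pipeline: the function name `aligned_mean_length`, the 30-day bin width, and the toy utterances and birth day are all made up, and the sketch uses plain mean utterance length in words where the talk speaks of "relative length."

```python
from collections import defaultdict

def aligned_mean_length(utterances, births, bin_days=30):
    """Mean length (in words) of caregiver utterances containing a word,
    binned by time relative to the day the child first produced that word.

    utterances: iterable of (day, text) pairs
    births:     dict mapping word -> day of first production (hypothetical)
    """
    totals = defaultdict(lambda: [0, 0])  # relative bin -> [sum of lengths, count]
    for day, text in utterances:
        tokens = text.lower().split()
        for word in set(tokens):
            if word in births:
                # Negative bins are before the word's birth, bin 0 starts at it.
                rel_bin = (day - births[word]) // bin_days
                totals[rel_bin][0] += len(tokens)
                totals[rel_bin][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}

# Toy, invented data: (day of recording, caregiver utterance)
utterances = [
    (100, "do you want some water in your cup"),
    (160, "water"),
    (170, "more water"),
    (230, "shall we get a glass of water"),
]
births = {"water": 165}  # hypothetical day the word was first produced

curve = aligned_mean_length(utterances, births)
for rel_bin, mean_len in curve.items():
    print(rel_bin, mean_len)
```

With real transcripts, averaging these curves over every learned word would show whether the dip sits near relative bin 0, which is the pattern the talk reports.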

  • So it appears that all three primary caregivers --

  • myself, my wife and our nanny --

  • were systematically and, I would think, subconsciously

  • restructuring our language

  • to meet him at the birth of a word

  • and bring him gently into more complex language.

  • And the implications of this -- there are many,

  • but one I just want to point out,

  • is that there must be amazing feedback loops.

  • Of course, my son is learning

  • from his linguistic environment,

  • but the environment is learning from him.

  • That environment, people, are in these tight feedback loops

  • and creating a kind of scaffolding

  • that has not been noticed until now.

  • But that's looking at the speech context.

  • What about the visual context?

  • We're not looking at --

  • think of this as a dollhouse cutaway of our house.

  • We've taken those circular fish-eye lens cameras,

  • and we've done some optical correction,

  • and then we can bring it into three-dimensional life.

  • So welcome to my home.

  • This is a moment,

  • one moment captured across multiple cameras.

  • The reason we did this is to create the ultimate memory machine,

  • where you can go back and interactively fly around

  • and then breathe video-life into this system.

  • What I'm going to do

  • is give you an accelerated view of 30 minutes,

  • again, of just life in the living room.

  • That's me and my son on the floor.

  • And there's video analytics

  • that are tracking our movements.

  • My son is leaving red ink. I am leaving green ink.

  • We're now on the couch,

  • looking out through the window at cars passing by.

  • And finally, my son playing in a walking toy by himself.

  • Now we freeze the action, 30 minutes,

  • we turn time into the vertical axis,

  • and we open up for a view

  • of these interaction traces we've just left behind.

  • And we see these amazing structures --

  • these little knots of two colors of thread

  • we call "social hot spots."

  • The spiral thread

  • we call a "solo hot spot."

  • And we think that these affect the way language is learned.

  • What we'd like to do

  • is start understanding

  • the interaction between these patterns

  • and the language that my son is exposed to

  • to see if we can predict

  • how the structure of when words are heard

  • affects when they're learned --

  • so in other words, the relationship

  • between words and what they're about in the world.

  • So here's how we're approaching this.

  • In this video,

  • again, my son is being traced out.

  • He's leaving red ink behind.

  • And there's our nanny by the door.

  • (Video) Nanny: You want water? (Baby: Aaaa.)

  • Nanny: All right. (Baby: Aaaa.)

  • DR: She offers water,

  • and off go the two worms

  • over to the kitchen to get water.

  • And what we've done is use the word "water"

  • to tag that moment, that bit of activity.

  • And now we take the power of data

  • and take every time my son

  • ever heard the word water

  • and the context he saw it in,

  • and we use it to penetrate through the video

  • and find every activity trace

  • that co-occurred with an instance of water.

  • And what this data leaves in its wake

  • is a landscape.

  • We call these wordscapes.

  • This is the wordscape for the word water,

  • and you can see most of the action is in the kitchen.

  • That's where those big peaks are over to the left.
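
One way to picture a wordscape is as a count, per patch of floor, of how often the child was there when the word was heard. The sketch below assumes that reading; the function name `wordscape`, the grid cell size, and the toy timestamps and floor coordinates are hypothetical and are not taken from the project's software.

```python
from collections import Counter

def wordscape(word, utterances, positions, cell=1.0):
    """Count, per floor-grid cell, how often the child was in that cell
    at a moment when `word` was heard.

    utterances: iterable of (timestamp, text) pairs
    positions:  dict mapping timestamp -> (x, y) floor position of the child
    cell:       grid cell size, in the same units as the positions
    """
    grid = Counter()
    for t, text in utterances:
        if word in text.lower().split() and t in positions:
            x, y = positions[t]
            grid[(int(x // cell), int(y // cell))] += 1
    return grid

# Toy, invented data: utterances heard and where the child was at each time.
utterances = [
    (0, "you want water"),
    (1, "here is the water"),
    (2, "bye bye"),
    (3, "water"),
]
positions = {0: (0.5, 0.2), 1: (0.7, 0.9), 2: (5.1, 3.0), 3: (2.4, 0.1)}

peaks = wordscape("water", utterances, positions)
print(peaks.most_common())
```

The tallest cells in `peaks` play the role of the big peaks over the kitchen; running the same function with "bye" would put the counts near the entrance instead.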

  • And just for contrast, we can do this with any word.

  • We can take the word "bye"

  • as in "goodbye."

  • And we're now zoomed in over the entrance to the house.

  • And we look, and we find, as you would expect,

  • a contrast in the landscape

  • where the word "bye" occurs much more in a structured way.

  • So we're using these structures

  • to start predicting

  • the order of language acquisition,

  • and that's ongoing work now.

  • In my lab, which we're peering into now, at MIT --

  • this is at the Media Lab.

  • This has become my favorite way

  • of videographing just about any space.

  • Three of the key people in this project,

  • Philip DeCamp, Rony Kubat and Brandon Roy are pictured here.

  • Philip has been a close collaborator

  • on all the visualizations you're seeing.

  • And Michael Fleischman

  • was another Ph.D. student in my lab