  • Big data is an elusive concept.

  • It represents an amount of digital information,

  • which is uncomfortable to store,

  • transport,

  • or analyze.

  • Big data is so voluminous

  • that it overwhelms the technologies of the day

  • and challenges us to create the next generation

  • of data storage tools and techniques.

  • So, big data isn't new.

  • In fact, physicists at CERN have been wrangling

  • with the challenge of their ever-expanding big data for decades.

  • Fifty years ago, CERN's data could be stored

  • in a single computer.

  • OK, so it wasn't your usual computer,

  • this was a mainframe computer

  • that filled an entire building.

  • To analyze the data,

  • physicists from around the world traveled to CERN

  • to connect to the enormous machine.

  • In the 1970s, our ever-growing big data

  • was distributed across different sets of computers,

  • which mushroomed at CERN.

  • Each set was joined together

  • in dedicated, homegrown networks.

  • But physicists collaborated without regard

  • for the boundaries between sets,

  • hence needed to access data on all of these.

  • So, we bridged the independent networks together

  • in our own CERNET.

  • In the 1980s, islands of similar networks

  • speaking different dialects

  • sprang up all over Europe and the States,

  • making remote access possible but torturous.

  • To make it easy for our physicists across the world

  • to access the ever-expanding big data

  • stored at CERN without traveling,

  • the networks needed to be talking

  • with the same language.

  • We adopted the fledgling internetworking standard from the States,

  • followed by the rest of Europe,

  • and we established the principal link at CERN

  • between Europe and the States in 1989,

  • and the truly global internet took off!

  • Physicists could easily then access

  • the terabytes of big data

  • remotely from around the world,

  • generate results,

  • and write papers in their home institutes.

  • Then, they wanted to share their findings

  • with all their colleagues.

  • To make this information sharing easy,

  • we created the web in the early 1990s.

  • Physicists no longer needed to know

  • where the information was stored

  • in order to find it and access it on the web,

  • an idea which caught on across the world

  • and has transformed the way we communicate

  • in our daily lives.

  • During the early 2000s,

  • the continued growth of our big data

  • outstripped our capability to analyze it at CERN,

  • despite having buildings full of computers.

  • We had to start distributing the petabytes of data

  • to our collaborating partners

  • in order to employ local computing and storage

  • at hundreds of different institutes.

  • In order to orchestrate these interconnected resources

  • with their diverse technologies,

  • we developed a computing grid,

  • enabling the seamless sharing

  • of computing resources around the globe.

  • This relies on trust relationships and mutual exchange.

  • But this grid model could not be transferred

  • out of our community so easily,

  • where not everyone has resources to share

  • nor could companies be expected

  • to have the same level of trust.

  • Instead, an alternative, more business-like approach

  • for accessing on-demand resources

  • has been flourishing recently,

  • called cloud computing,

  • which other communities are now exploiting

  • to analyze their big data.

  • It might seem paradoxical for a place like CERN,

  • a lab focused on the study

  • of the unimaginably small building blocks of matter,

  • to be the source of something as big as big data.

  • But the way we study the fundamental particles,

  • as well as the forces by which they interact,

  • involves creating them fleetingly,

  • colliding protons in our accelerators

  • and capturing a trace of them

  • as they zoom off near light speed.

  • To see those traces,

  • our detector, with 150 million sensors,

  • acts like a really massive 3-D camera,

  • taking a picture of each collision event -

  • that's up to 14 million times per second.

  • That makes a lot of data.

  • But if big data has been around for so long,

  • why do we suddenly keep hearing about it now?

  • Well, as the old metaphor explains,

  • the whole is greater than the sum of its parts,

  • and this is no longer just science that is exploiting this.

  • The fact that we can derive more knowledge

  • by joining related information together

  • and spotting correlations

  • can inform and enrich numerous aspects of everyday life,

  • either in real time,

  • such as traffic or financial conditions,

  • in short-term evolutions,

  • such as medical or meteorological,

  • or in predictive situations,

  • such as business, crime, or disease trends.

  • Virtually every field is turning to gathering big data,

  • with mobile sensor networks spanning the globe,

  • cameras on the ground and in the air,

  • archives storing information published on the web,

  • and loggers capturing the activities

  • of Internet citizens the world over.

  • The challenge is on to invent new tools and techniques

  • to mine these vast stores,

  • to inform decision making,

  • to improve medical diagnosis,

  • and otherwise to answer needs and desires

  • of tomorrow's society in ways that are unimagined today.
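
The detector figures quoted in the talk (150 million sensors, up to 14 million collision events per second) can be turned into a rough upper bound on the raw reading rate. The sketch below is only back-of-the-envelope arithmetic, not a description of how CERN's data acquisition works: it assumes every sensor produces one reading for every collision, and `BYTES_PER_READING` is a hypothetical value chosen purely for illustration (the talk gives no per-reading size).

```python
# Figures taken from the talk.
SENSORS = 150_000_000               # sensors in the detector
COLLISIONS_PER_SECOND = 14_000_000  # "up to 14 million times per second"

# Hypothetical assumption, NOT from the talk: size of one sensor reading.
BYTES_PER_READING = 1


def readings_per_second(sensors: int, collisions_per_second: int) -> int:
    """Upper bound on readings per second, assuming every sensor
    is read out once for every collision."""
    return sensors * collisions_per_second


def raw_bytes_per_second(sensors: int, collisions_per_second: int,
                         bytes_per_reading: int) -> int:
    """Raw data rate in bytes per second under the same assumption."""
    return readings_per_second(sensors, collisions_per_second) * bytes_per_reading


def to_petabytes(num_bytes: int) -> float:
    """Convert bytes to petabytes (1 PB = 10**15 bytes)."""
    return num_bytes / 10**15


if __name__ == "__main__":
    readings = readings_per_second(SENSORS, COLLISIONS_PER_SECOND)
    rate = raw_bytes_per_second(SENSORS, COLLISIONS_PER_SECOND, BYTES_PER_READING)
    print(f"Readings per second (upper bound): {readings:.2e}")
    print(f"Raw rate at {BYTES_PER_READING} byte(s) per reading: "
          f"{to_petabytes(rate):.1f} PB/s")
```

Changing `BYTES_PER_READING` scales the result linearly, so the point is the order of magnitude rather than any specific number.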

【TED-Ed】Big Data - Tim Smith

Posted by 阿多賓 on 2014/03/14