  • Did you know that US retail giant Walmart generates 2.5 petabytes of data from approximately

  • 1 million customers every hour? And in case you’re wondering how much is

  • a petabyte, as I did when I first read this, it is equal to 1 million gigabytes. The equivalent

  • of 13.3 years of HD video. Considering that Walmart locations are open

  • for business for more than 10 hours a day, we get a staggering 25 petabytes of data,

  • or more than 330 years of HD video, collected on a daily basis!
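
  • The arithmetic behind these figures can be checked with a minimal sketch. It takes the quoted numbers at face value (2.5 petabytes per hour, 13.3 years of HD video per petabyte, 1 million gigabytes per petabyte) and assumes exactly 10 opening hours a day; the variable and function names are illustrative only.

```python
# Figures quoted in the video, taken at face value.
PB_PER_HOUR = 2.5             # petabytes generated per hour
HD_VIDEO_YEARS_PER_PB = 13.3  # years of HD video that fit in one petabyte
GB_PER_PB = 1_000_000         # gigabytes in one petabyte (decimal units)

# Assumption: exactly 10 opening hours per day (the video says "more than 10").
HOURS_OPEN_PER_DAY = 10


def daily_petabytes(pb_per_hour: float, hours_per_day: float) -> float:
    """Return the data volume collected in one day, in petabytes."""
    return pb_per_hour * hours_per_day


daily_pb = daily_petabytes(PB_PER_HOUR, HOURS_OPEN_PER_DAY)
daily_gb = daily_pb * GB_PER_PB
daily_video_years = daily_pb * HD_VIDEO_YEARS_PER_PB

print(f"Per day: {daily_pb:.1f} PB = {daily_gb:,.0f} GB")
print(f"Equivalent HD video: about {daily_video_years:.1f} years")
```

  • Multiplying through gives 25 petabytes a day, which at 13.3 years per petabyte is roughly 330 years of HD video.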

  • Yes, there aren’t many companies like Walmart. But even smaller enterprises nowadays generate

  • huge amounts of data, so it becomes increasingly challenging to take advantage of such

  • information abundance. And yes, data science is at the heart of all

  • that. But before we can apply data science, we must do justice to another crucial player:

  • the cloud and cloud computing in general. That’s exactly what we will focus on in

  • this video: Why cloud computing is essential for data science in the 2020s.

  • But before we continue, let me tell you about something else we’ve put together:

  • We’ve created The 365 Data Science Program to help people enter the field of data science,

  • regardless of their background. We have trained more than 350,000 people around the world

  • and are committed to continuing to do so. If you are interested in learning more, follow the

  • link in the description. It will also give you 20% off all plans if you want to start

  • learning from an all-around data science training. Now, back to cloud computing.

  • To understand the advantages cloud computing provides when it comes to data science, let’s

  • imagine a world with as much data as we have today, but without servers.

  • In such an unfortunate scenario, firms would need databases that run locally, right?

  • So, every time you, as a data scientist, want to engage in new analyses or refresh

  • an existing algorithm, you’d have to transfer information to your machine from the central

  • database, and then proceed to operate locally. This unfortunate world would have several

  • main drawbacks:

  • Manual intervention would be necessary to retrieve data.

  • Your machine becomes a single point of failure for the analyses you have worked on locally.

  • Processing speed would be limited to the computing power of your computer.

  • Chances are you would be able to work with only a limited amount of data, due to the limited computing resources at your disposal.

  • Moreover, under this setup, you wouldn’t be able to leverage real-time data to build recommender systems or any type of machine learning algorithm that requires live data.

  • Doesn’t sound like the perfect scenario, does it?

  • Well, that’s why we invented servers. And

  • then these servers had drawbacks of their own.

  • The most obvious one is that a server needs space to be stored. A cloud is basically somebody else’s server, so storing it is their problem.

  • Server infrastructure is expensive to buy and set up. Cloud infrastructure is already there and is simply awaiting your consumption.

  • In-house data storage requires you to have backups, and ideally to keep them in different locations. Clouds offer data everywhere, anytime, usually backed up on many different servers across the world.

  • Servers need planning. For fast-growing companies, server needs could be unpredictable even for the current quarter. With in-house servers, you usually end up buying more servers than you actually need at a given time. With the cloud, you pay for as much as you use.

  • You see my point, right?
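
  • The planning point can be illustrated with a rough sketch. Every number here is hypothetical (the monthly demand, both prices, and the assumption that in-house capacity is bought up front for the busiest month), so treat it as an illustration of the reasoning, not as real pricing.

```python
def in_house_cost(monthly_demand, price_per_server_month):
    """Cost when capacity for the busiest month is bought up front.

    Assumption: every server is paid for in every month, even when idle.
    """
    return max(monthly_demand) * len(monthly_demand) * price_per_server_month


def pay_as_you_go_cost(monthly_demand, price_per_server_month):
    """Cost when only the servers actually used each month are billed."""
    return sum(monthly_demand) * price_per_server_month


# Hypothetical demand of a fast-growing company: servers needed per month.
demand = [2, 2, 3, 5, 8, 4]

# Hypothetical prices: the cloud server-month is priced higher than in-house.
in_house = in_house_cost(demand, price_per_server_month=100.0)  # 8 * 6 * 100
cloud = pay_as_you_go_cost(demand, price_per_server_month=150.0)  # 24 * 150

print(f"In-house, sized for the peak: {in_house:.0f}")
print(f"Cloud, pay as you go: {cloud:.0f}")
```

  • With spiky demand like this, paying per use comes out cheaper even at a higher unit price. With perfectly flat demand, the same functions would favor the in-house option, which is why unpredictability is the key assumption.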

  • Fortunately, we now have clouds. They overshadow local servers in almost every conceivable

  • aspect. And, in fact, data scientists should be focused on developing great algorithms,

  • testing hypotheses, and taking advantage of all available data without having to wait hours

  • to see the results of the tests they are performing and certainly without having to worry about how

  • much memory space they have left on their computer. And yes, sometimes data scientists

  • do end up waiting for long hours for an algorithm to train, but with a cloud, they have the

  • option to pay more and get the job done faster. That’s yet another advantage of cloud computing

  • over servers. That being said, the biggest winners are smaller

  • entities, as they get cheap access to the same tools as enormous corporations. And this

  • is why cloud technologies are a huge enabler. They create a level playing field and allow

  • small players to compete with much bigger ones.

  • If you think about it, this technological progress changed a number of businesses in

  • a way similar to how the Internet changed commerce.

  • Remember when, all of a sudden, people around the world were able to open e-commerce stores

  • and compete on a global scale with the established firms?

  • Well, in the same way, cloud technologies democratized data analysis and data science.

  • The fact that data scientists and data analysts can rely on data stored on the cloud truly

  • makes their life so much easier! In addition, most cloud providers allow data

  • scientists to access readily installed open-source frameworks right away. This is not only super

  • convenient but can also be a huge time saver. By contrast, if you wanted to use Apache

  • Spark in the conventional way you would have to:

  • Start by installing Java,

  • Then continue by installing Scala,

  • After which you’ll be able to download Apache Spark and install it.
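
  • To see which of these prerequisites are still missing on a local machine, a small check like the one below can help. It is only a sketch: it assumes the tools expose the usual command names (`java`, `scala`, `spark-submit`) on the PATH, and the helper `missing_tools` is a hypothetical name, not part of Spark.

```python
import shutil

# Assumed command names for the three installs listed above.
REQUIRED_TOOLS = ("java", "scala", "spark-submit")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH.

    `which` is injectable so the check can be exercised without
    touching the real system.
    """
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Still to install:", ", ".join(missing))
else:
    print("All prerequisites found on the PATH.")
```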

  • That’s the setup you need to go through if you are working on your own PC. However,

  • if you are using a cloud service, you’ll be able to start working with the Apache Spark

  • framework right away! Yep, it’s already been installed for you. The same is valid for many

  • different open-source frameworks. This type of easy-to-access, easy-to-use infrastructure

  • is very attractive and potentially applies to all sorts of applications data analysts

  • and data scientists use in their work. Over the last few years, Amazon Web Services,

  • Microsoft Azure, and Google Cloud have tried to boost their cloud services in terms of

  • capability to run machine learning algorithms. The Big 3 of cloud services focused on this

  • area extensively, as they realized it could be an important source of competitive advantage

  • in the long run. And, in case you’re wondering, one of the biggest selling points of cloud machine

  • learning is that it allows small and medium enterprises to access a machine learning infrastructure

  • they otherwise wouldn’t be able to afford. For example, thanks to cloud-based machine

  • learning, a small e-commerce retailer could run a real-time recommender system algorithm

  • to improve the product offering shown to customers based on the products they have already added

  • to their cart. In this type of business, every website click can be interpreted as a particular

  • type of intention and signal, and hence the real-time updated algorithm operating in the

  • cloud will be able to make a suggestion that improves the chances of making a conversion

  • and maximizing revenues. Without cloud-based machine learning, setting

  • up the necessary infrastructure to perform this type of analysis would be really costly

  • and difficult to execute for small and medium enterprises.
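
  • To make the cart-based recommender idea concrete, here is a minimal sketch. It uses simple co-occurrence counting, which is an assumption chosen for illustration and not necessarily what a production system or any cloud service does. The class name `CartRecommender` and the sample products are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


class CartRecommender:
    """Suggest products that were often bought together with the cart items."""

    def __init__(self):
        # co_counts[a][b] = number of past carts containing both a and b
        self.co_counts = defaultdict(Counter)

    def observe(self, cart):
        """Update the counts with one completed cart (can be called per event)."""
        for a, b in permutations(set(cart), 2):
            self.co_counts[a][b] += 1

    def recommend(self, cart, k=3):
        """Return up to k products not in the cart, ranked by co-occurrence."""
        in_cart = set(cart)
        scores = Counter()
        for item in in_cart:
            for other, count in self.co_counts.get(item, {}).items():
                if other not in in_cart:
                    scores[other] += count
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
        return [product for product, _ in ranked[:k]]


recommender = CartRecommender()
for past_cart in (["laptop", "mouse"], ["laptop", "mouse", "bag"],
                  ["laptop", "bag"], ["phone", "case"]):
    recommender.observe(past_cart)

print(recommender.recommend(["laptop", "mouse"]))
```

  • Because `observe` updates the counts one cart at a time, the same structure can keep learning as new events arrive, which is the part that benefits from always-on infrastructure.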

  • It is still unclear who will win the cloud war between giants like AWS, Microsoft Azure,

  • and Google Cloud. But one thing is certain. This is a service that greatly benefits small

  • and medium-sized businesses, enabling them to level the playing field when competing

  • against large multinationals with superior IT infrastructure.

  • If you liked this video, don’t forget to give it a like, or a share!

  • And if data science is what you’d like to learn more about, subscribe to our channel

  • - you’ll find plenty of data science insights and data science career advice.

  • Thanks for watching!


Why Cloud Computing is Critical for a Data Scientist

Posted by 林宜悉 on 2020/03/14