  • [MUSIC PLAYING]

  • JULIA ELLIOTT: Hi, I'm Julia Elliott,

  • and I lead Kaggle's competitions team.

  • I'm speaking today about using TensorFlow 2.0 on Kaggle.

  • This is based on work led by Phil Culliton.

  • So what's Kaggle?

  • Kaggle is a data science platform.

  • Over four million data scientists

  • from all over the world come to Kaggle

  • to participate in machine learning competitions,

  • practice data science, build portfolios, and share

  • data sets and code.

  • It's an incredibly exciting place.

  • My job at Kaggle is to run machine learning competitions.

  • They run the gamut.

  • Right now, we have 17 running, including deep fake detection

  • in videos, image classification, an abstraction and reasoning

  • challenge by François Chollet--

  • if you haven't seen his paper entitled

  • "On the Measure of Intelligence," it's really amazing.

  • You should check it out--

  • an NLP challenge, and a bot building

  • simulation competition.

  • Basically, if you're interested in data or machine learning,

  • we'll likely have a competition that'll excite you.

  • TensorFlow is really heavily used in our competitions.

  • It's actually currently being used in all 17 of them.

  • But in the past few months, we've run two

  • TensorFlow-specific competitions--

  • a question-answering NLP competition

  • that Sandeep mentioned earlier, built

  • around the launch of TensorFlow 2.0,

  • and a new competition for image classification

  • that introduces TPUs to our platform.

  • We have tens of thousands of people using TensorFlow 2.1

  • right now to solve problems.

  • Three of the top solutions for the question-answering

  • competition actually used TensorFlow and TPUs.

  • If you'd like to check out TensorFlow 2.1,

  • especially with a GPU or TPU, Kaggle makes it super easy.

  • You can see from the list on this slide

  • that we've eliminated a lot of the obstacles

  • to getting TensorFlow 2.1 to build your models fast

  • and with next to no specialized code.

  • No provisioning the right kind of VM, no setup--

  • data sets that are ready to go and a weekly amount

  • of TPU and GPU time allotted to users for no cost.

  • We'll look at some code right now

  • that'll build a model on Kaggle for a competition

  • to classify flowers on CPU, GPU, and TPU with no changes.

  • So we start off simple.

  • Import TensorFlow and the Kaggle datasets library.

  • The rest of the code here does a little bit of magic.

  • It figures out what kind of accelerator

  • is attached to your VM and automatically parallelizes

  • your model building.

  • This will return an appropriate strategy

  • whether you're using a V3 TPU or a simple CPU VM.

  • So basically, this code asks for a TPU.

  • If it gets one, it connects to it.

  • If it doesn't, it sets up a strategy for whatever hardware is attached instead.

  • The rest of the code is entirely accelerator-agnostic.

  • So normally, outside of Kaggle and Colab,

  • you need to provision this TPU for usage.

  • But on our platform, it's not necessary.

  • Just select the TPU in the dropdown, and go!
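
The detection step described above can be sketched roughly as follows. This is a minimal sketch, not the code shown on the slide: the function name `detect_strategy` is hypothetical, and it assumes that `TPUClusterResolver()` raises `ValueError` when no TPU is attached. It uses `tf.distribute.TPUStrategy`; in the TensorFlow 2.1 release discussed in the talk, that class was exposed as `tf.distribute.experimental.TPUStrategy`.

```python
import tensorflow as tf


def detect_strategy():
    """Return a tf.distribute strategy suited to the attached accelerator.

    Hypothetical helper: asks for a TPU first and falls back to the
    default strategy (CPU or single GPU) when none is found.
    """
    try:
        # With no arguments, the resolver looks for a TPU advertised
        # by the environment.
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    except ValueError:
        # Assumption: no TPU attached. The default strategy runs the
        # same code unchanged on a CPU or a single GPU.
        return tf.distribute.get_strategy()

    # A TPU was found: connect to it and initialize it before use.
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)


strategy = detect_strategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)
```

Everything after this point only needs the returned `strategy` object, which is what keeps the rest of the code accelerator-agnostic.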

  • Now we'll load up our competition data set.

  • It's provided in a sharded TFRecord format for fast loading.

  • We set some parameters for our training and validation sets,

  • shuffling our training set, batching them

  • for optimal performance, setting our training set to repeat,

  • so it'll loop around for each epoch.

  • And in the end, we have nice out-of-memory data

  • sets ready to run on our accelerator of choice.
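
The shuffle, batch, and repeat steps mentioned above might look something like this with `tf.data`. It is a sketch under stated assumptions: the helper name `prepare`, the buffer size, and the batch size are illustrative, and a small in-memory data set stands in for the competition's sharded TFRecord files (which would normally be read with `tf.data.TFRecordDataset` and a parsing function).

```python
import tensorflow as tf


def prepare(dataset, batch_size, training, shuffle_buffer=2048):
    """Shuffle, repeat, batch, and prefetch a data set (hypothetical helper)."""
    if training:
        # Shuffle the training examples and loop forever, so the
        # training loop decides where an epoch ends.
        dataset = dataset.shuffle(shuffle_buffer).repeat()
    # Dropping the last partial batch keeps training batch shapes fixed.
    dataset = dataset.batch(batch_size, drop_remainder=training)
    # Prepare the next batch while the accelerator works on the current one.
    return dataset.prefetch(tf.data.AUTOTUNE)


# Stand-in data: 10 blank 8x8 RGB "images" with labels 0..4.
images = tf.zeros([10, 8, 8, 3])
labels = tf.range(10) % 5
raw = tf.data.Dataset.from_tensor_slices((images, labels))

train_ds = prepare(raw, batch_size=4, training=True)
valid_ds = prepare(raw, batch_size=4, training=False)
```

Because the training set repeats indefinitely, a training call would also need an explicit number of steps per epoch.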

  • This code does some real work.

  • Here, we load up a VGG-16 for some transfer learning,

  • compile our model, and start training and validating.

  • We're using a standard distribution strategy

  • that was made possible by that previous code

  • that automated the strategy based on the accelerator used.

  • Note the with strategy.scope() block.

  • That's basically saying, hey, whichever type

  • of parallelization you're doing here,

  • this is the process that gets parallelized.

  • Now, our model will be built and trained on the accelerator.
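
As a rough illustration of building a model inside the strategy scope, here is a minimal sketch. It is not the code from the talk: `build_model`, the image size, and the class count are hypothetical, and the default strategy stands in for the one detected earlier. The example call passes `weights=None` only to avoid a download; real transfer learning would use `weights="imagenet"`.

```python
import tensorflow as tf

# Stand-in: on Kaggle this would be the strategy detected for the
# attached accelerator.
strategy = tf.distribute.get_strategy()


def build_model(num_classes, image_size, weights="imagenet"):
    """VGG16 feature extractor with a small classification head."""
    base = tf.keras.applications.VGG16(
        weights=weights,
        include_top=False,
        input_shape=(image_size[0], image_size[1], 3),
    )
    base.trainable = False  # train only the new head

    inputs = tf.keras.Input(shape=(image_size[0], image_size[1], 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    return model


# Variables created inside the scope are placed according to the strategy.
with strategy.scope():
    model = build_model(num_classes=5, image_size=(64, 64), weights=None)

# Training would then be a normal Keras call, for example:
# model.fit(train_ds, steps_per_epoch=..., epochs=..., validation_data=valid_ds)
```

Only model creation and compilation need to sit inside the scope; the `fit` call itself can stay outside it.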

  • Making some flower predictions is really easy after this.

  • We load up our test set, run model.predict,

  • and write our predictions out in a format that Kaggle can read.
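
Writing predictions out could be sketched like this. The column names `id` and `label`, the helper name `write_submission`, and the sample values are assumptions for illustration; the actual submission format is defined by each competition. The probabilities stand in for the rows returned by `model.predict`.

```python
import csv
import io


def write_submission(ids, probabilities, out):
    """Write one (id, predicted class) row per example as CSV."""
    writer = csv.writer(out)
    writer.writerow(["id", "label"])  # assumed column names
    for sample_id, row in zip(ids, probabilities):
        # The predicted class is the index of the highest probability.
        label = max(range(len(row)), key=lambda i: row[i])
        writer.writerow([sample_id, label])


# Made-up ids and probabilities standing in for real predictions.
buffer = io.StringIO()
write_submission(["a1", "b2"], [[0.1, 0.9], [0.7, 0.3]], buffer)
print(buffer.getvalue())
```

Passing an open file instead of the `io.StringIO` buffer would write the same rows to disk.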

  • With a few tweaks and a few more epochs,

  • it can actually score pretty well.

  • I'll finish off this code section

  • with some of the fun flowers that our model made predictions

  • on.

  • You can see that some are right.

  • They have the little OK above them.

  • And some are wrong.

  • If you're interested, you're more than

  • welcome to drop by and beat this model.

  • It's not hard, and it's super fun.

  • If you're interested in seeing TensorFlow 2.1 at work

  • on Kaggle, come check it out.

  • Try out the code I just showed you.

  • Compete in the flower competition,

  • or try it out with a data set of your choice

  • on our hosted notebook platform.

  • In the coming months, we'll be launching even more

  • competitions where you can leverage the power

  • of TensorFlow and TPUs.

  • This is a huge group undertaking.

  • And a special thanks to Martin Gorner,

  • the Dev-Rel PM who led the effort to integrate TPUs.

  • It's pretty awesome to see how easy it

  • is to use TensorFlow 2.1 on Kaggle now.

  • Thank you so much.

  • [MUSIC PLAYING]
