Placeholder Image

Subtitles section Play video

  • SERENA AMMIRATI: Looking at this book page by page

  • and trying to decipher, read, and transcribe whatever is

  • there takes an enormous amount of time.

  • It would require an army of paleographers.

  • ELENA NIEDDU: What I am excited the most about machine learning

  • is that it enabled us to solve problems that up to 10,

  • 15 years ago we thought unsolvable.

  • ELENA NIEDDU: Before using any kind of machine learning model,

  • we needed to collect data first.

  • You have thousands of images of dogs and cats in the internet,

  • but there's very little images of ancient manuscripts.

  • We built our own custom web application for crowdsourcing.

  • And we involved high school students to collect the data.

  • I didn't know much about machine learning in general.

  • But I found it very easy to create a TensorFlow

  • environment.

  • When we were trying to figure out

  • which model worked best for us, Keras was the best solution.

  • The production model runs on TensorFlow layers, an estimator

  • interface.

  • We experimented with binary classification,

  • with fully connected networks.

  • And finally, we moved to convolutional neural network

  • and multiclass classification.

  • ELENA NIEDDU: When it comes to recognizing single characters,

  • we can get 95% average accuracy.

  • SERENA AMMIRATI: This will have an enormous impact.

  • In a short period of time, we will

  • have a massive quantity of historical information

  • available.

  • ELENA NIEDDU: I just think solving problems is fun.

  • It's a game against myself, and how good I can do.

SERENA AMMIRATI: Looking at this book page by page

Subtitles and vocabulary

Operation of videos Adjust the video here to display the subtitles

B1 elena machine learning classification serena machine learning

Powered by TensorFlow: helping paleographers transcribe medieval text using machine learning

  • 0 0
    林宜悉 posted on 2020/03/25
Video vocabulary