Placeholder Image

Subtitles section Play video

  • SERENA AMMIRATI: Looking at this book page by page

  • and trying to decipher, read, and transcribe whatever is

  • there takes an enormous amount of time.

  • It would require an army of paleographers.

  • ELENA NIEDDU: What I am excited the most about machine learning

  • is that it enabled us to solve problems that up to 10,

  • 15 years ago we thought unsolvable.

  • ELENA NIEDDU: Before using any kind of machine learning model,

  • we needed to collect data first.

  • You have thousands of images of dogs and cats in the internet,

  • but there's very little images of ancient manuscripts.

  • We built our own custom web application for crowdsourcing.

  • And we involved high school students to collect the data.

  • I didn't know much about machine learning in general.

  • But I found it very easy to create a TensorFlow

  • environment.

  • When we were trying to figure out

  • which model worked best for us, Keras was the best solution.

  • The production model runs on TensorFlow layers, an estimator

  • interface.

  • We experimented with binary classification,

  • with fully connected networks.

  • And finally, we moved to convolutional neural network

  • and multiclass classification.

  • ELENA NIEDDU: When it comes to recognizing single characters,

  • we can get 95% average accuracy.

  • SERENA AMMIRATI: This will have an enormous impact.

  • In a short period of time, we will

  • have a massive quantity of historical information

  • available.

  • ELENA NIEDDU: I just think solving problems is fun.

  • It's a game against myself, and how good I can do.

SERENA AMMIRATI: Looking at this book page by page

Subtitles and vocabulary

Click the word to look it up Click the word to find further inforamtion about it