  • Thanks to Brilliant for supporting this episode of SciShow.

  • Go to Brilliant.org/SciShow to learn how you can take your STEM skills

  • to the next level!

  • [♪ INTRO]

  • One of the trickiest issues in science has always been small studies.

  • Like, how much do the results of small studies

  • actually tell you about a broader population?

  • And how much of their results are just random noise?

  • Fortunately, these days, we do have a way

  • to gauge how much we can trust small studies.

  • And it's all thanks to a beer brewer

  • working at Guinness Brewery in the early 1900s.

  • Today, a simple statistical test that was invented to assess beer quality is one of

  • the most important tests used in biology, medicine, and some other scientific fields.

  • The guy at the heart of this story is William Sealy Gosset,

  • a British chemist and mathematician born in 1876.

  • Gosset had enjoyed experimenting and inventing things from a young age,

  • and he studied math and chemistry

  • at both Winchester College and Oxford.

  • Then, when he was just 23, he got a job at Guinness as a brewer:

  • in other words, a beer chemist.

  • Back when Gosset was hired, Guinness was still a pretty small operation,

  • and their brewing process wasn't much of a science.

  • Basically, they would mix some barley or other grain with flavored water

  • and let the yeast do its thing.

  • The solution would ferment, and in just over a week, voilà!

  • Beer.

  • Along the way, brewers would sample and sniff their products,

  • and that was how they kept the quality consistent.

  • But now Guinness was looking to start brewing beer on an industrial scale.

  • So sniff and taste tests weren't going to cut it anymore.

  • It just wasn't practical to have someone sampling that much stuff

  • every step of the way.

  • But they needed some way to guarantee that the quality of their beer

  • was still good and consistent.

  • Without wasting too much time or money.

  • That was Gosset's challenge.

  • For instance, one of his specific projects was to compare the sugar content

  • in the barley malt extract from different batches.

  • This sugar is what feeds the yeast,

  • so any differences can change the outcome of the beer.

  • But Gosset could only use a small set of samples to make this comparison.

  • And he knew that any difference between the sets

  • could possibly mean one of two things:

  • It could mean the two batches had different concentrations of sugar.

  • But it could also just mean that at least one of the samples

  • had a different average concentration than the batch it came from.

  • Maybe the batch wasn't uniformly mixed, or something fluke-y like that.

  • Basically, he was running into the same problem

  • scientists have with any small study:

  • It's easy to randomly select samples that don't really represent the whole batch.

  • It's like if you wanted to estimate the average cost of a home in a certain town,

  • but you only sampled the price of ten houses.

  • That wouldn't necessarily tell you much of anything about the town as a whole.

  • So, Gosset realized that his key challenge was how to figure out

  • whether or not a given sample set made a reliable proxy for the whole batch.

  • And so far, statisticians hadn't bothered to investigate small samples,

  • since they weren't considered useful for analyses.

  • So this was uncharted territory.

  • But fortunately, Gosset's stats background came in handy here,

  • and he decided to figure it out himself.

  • He compared the average concentration of small sets of samples

  • to averages from much bigger sample sizes,

  • specifically, ones that follow the classic bell curve called a normal distribution.

  • And he found that the smaller the sample size,

  • the more different its mean concentration could be from that of a large batch.
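That effect is easy to see in a quick simulation. Here is a minimal sketch in Python, not Gosset's actual procedure: the "batch" is a made-up normal population (mean 100, spread 10), and the function name, sample sizes, and trial count are all assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_sample_means(sample_size, trials=2000, seed=0):
    """Standard deviation of the means of many random samples of one size."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Hypothetical batch: sugar readings centered on 100 with spread 10.
        sample = [rng.gauss(100.0, 10.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

small = spread_of_sample_means(4)
large = spread_of_sample_means(100)
print(f"n=4:   sample means vary by about {small:.2f}")
print(f"n=100: sample means vary by about {large:.2f}")
```

The means of the four-reading samples should wander much farther from the true value than the means of the hundred-reading samples.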

  • So, to deal with this situation, Gosset developed the concept of a t-distribution.

  • A t-distribution looks a lot like your classic bell curve,

  • which is used to depict a range of probabilities.

  • Like, say you're trying to get the distribution of test scores in a class.

  • You've got the average score at the top of the bell,

  • and the curve of the bell shows how much variability there is.

  • The difference is that while a normal bell curve

  • quickly trails off to zero on either end, a t-distribution doesn't.

  • Instead, its ends slowly taper off, with long tails that represent

  • the greater amount of noise, or unreliability, that's inherent in a small sample.
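To put a rough number on those longer tails, here is a small sketch that evaluates the standard density formulas for both curves. The function names and the choice of 4 degrees of freedom (roughly, a sample of five) are assumptions made for this example.

```python
import math

def normal_pdf(x):
    """Density of the standard normal bell curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # lgamma avoids overflow when df is large.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

# Compare how much probability density remains far from the center.
x = 3.0
ratio = t_pdf(x, 4) / normal_pdf(x)
print(f"At x = {x}, the t curve (df=4) is {ratio:.1f} times taller than the normal curve")
```

As the degrees of freedom grow, `t_pdf` gets closer and closer to `normal_pdf`, which matches the idea that big samples behave like the classic bell curve.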

  • Gosset's t-distribution related the size of a set of samples

  • to how much variability it was likely to have.

  • And it came with a cutoff: a critical value.

  • If the difference in concentrations between two sample sets was significantly bigger

  • than this value, he could be pretty sure that the difference really existed

  • in the larger batches he was actually trying to compare.

  • If the difference was smaller, there was a good chance that the batches

  • were similar enough to be considered the same.

  • He called this comparison a t-test.
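Here is a sketch of what that comparison can look like in code. It uses the textbook pooled two-sample form of the test, which assumes both batches have the same variability; the sugar readings are made up, and the lookup table holds only a few standard two-sided 5% critical values.

```python
import math
import statistics

# Two-sided 5% critical values of Student's t, indexed by degrees of freedom
# (standard table values; only a few rows are included in this sketch).
T_CRITICAL_5PCT = {4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228}

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming both batches share one variance."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df

# Made-up sugar readings from two batches, five samples each.
batch_a = [10.1, 10.4, 9.8, 10.2, 10.0]
batch_b = [10.9, 11.2, 10.7, 11.0, 11.1]

t_stat, df = pooled_t_statistic(batch_a, batch_b)
differs = abs(t_stat) > T_CRITICAL_5PCT[df]
print(f"t = {t_stat:.2f}, df = {df}, batches differ: {differs}")
```

The t statistic is the difference between the two sample averages, scaled by how noisy the samples are. When its size exceeds the critical value for that sample size, the difference is unlikely to be a fluke of sampling.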

  • And the approach worked!

  • The development of the t-test made it possible for Guinness to industrialize

  • with confidence and start brewing the massive amounts of beer it puts out today.

  • And more than a hundred years later, the t-test isn't just for beer.

  • It's been adopted by all kinds of scientists

  • needing to interpret their experimental results.

  • Like, whether you need to compare the concentration of sodium

  • in patients' blood samples, or test which variety of crop yields the starchiest wheat,

  • the t-test is your go-to.

  • Today, it's known as the Student's t-test,

  • not because it's the bane of stats learners everywhere,

  • but because Guinness would only allow Gosset

  • to publish the results under a pen name.

  • He seems to have pulled the name "Student"

  • from a brand stamped on his lab notebook!

  • But whether or not we know Gosset's name, we've probably all been touched

  • by science that his test made possible and reliable,

  • so we can raise a glass to that!

  • We hope we've whetted your appetite to learn more about stats,

  • because today's episode is brought to you by Brilliant.

  • And they've got loads of courses that will teach you all about stats.

  • Like Random Variables and Distributions, which will help you understand

  • why Gosset had to invent a whole test just for small sample sizes.

  • But you don't have to stop there, because Brilliant has courses in basic science,

  • engineering, and computer science too.

  • If you're interested, you can head to brilliant.org/scishow,

  • where you can sign up and get 20% off an annual Premium subscription.

  • And thanks for checking them out!

  • [♪ OUTRO]
