DESTON BENNETT: Thanks for coming out today.
Again, my name's Deston Bennett.
I'm with the Grammy producers and engineers wing.
The Grammy Awards, as many of you may know, are the only
music awards that are peer determined, meaning it's not
the public that votes.
Those who vote are members of the Recording Academy, and who
are hands-on music creators--
artists, songwriters, musicians,
producers, and engineers.
From the very beginning at our founding in 1956, the basis
for the Grammy Awards process has been a commitment to
excellence.
The Recording Academy's original credo clearly states
that the awards are not about sales, and they're not about
popularity.
Musical excellence in all areas is the only criteria
Grammy voters are charged with to determine who gets
nominated, and what will win.
Knowing that little bit of information should help you
whenever there's some controversy about the Grammys,
as there sometimes can be.
For example, the year jazz bassist Esperanza Spalding won
the Best New Artist category against fellow nominees Drake,
Florence and the Machine, Mumford and
Sons, and Justin Bieber.
She won because the majority of the Grammy voters were
familiar with Esperanza and her work, and they saw her as
a stand-out that year.
It's pretty cool when you think about it.
Many of you who are musicians or audio producers or
engineers may be eligible to be members of the Academy.
And I'm happy to speak to you about that after this is over
if you like.
You can also get some more information or join by
visiting Grammy365.com.
Today specifically, we're here to talk about excellence in
sound, something that's key to great recordings.
The P&E wing has partnered with the Consumer Electronics
Association and others on an initiative we call Quality
Sound Matters.
We represent people who truly understand the difference good
sound makes, and we want to share their enthusiasm and
excitement about quality with everybody.
Today, we have a very cool presentation from
a Grammy-winning engineer that we think you'll enjoy.
And I want to give a big thank you to Neil Annala and Joe
Rosenberg for bringing us here today.
We'd also like to thank JBL and Prism Sound for this
amazing sound system you're going to hear today, too.
The speakers in particular, they encompass some very new,
exciting technology that you're amongst
the first to hear.
And to top it off, I really want to introduce an amazing
engineer producer who's worked with artists including
Metallica, Linkin Park, Green Day, and U2,
along with last week's--
well, not last week, but a recent number one album, the
Black Sabbath project.
He's a two-time Grammy winner for his work on the Red Hot
Chili Peppers' "Stadium Arcadium" project, as well as
Adele's "21" album.
He has another interesting honor.
In 2012, he was named the International Engineer of the
Year by England's Music Producers Guild.
Please welcome Andrew Scheps.
[APPLAUSE]
ANDREW SCHEPS: First of all, thanks for coming.
This is as full a house as we could have in here, I think.
So thank you so much, and thanks again to Neil and Joe
for putting this together.
This is awesome.
So later on, we're going to listen to a bunch of stuff,
which is the point of what I do.
And the Recording Academy has been really great about
sponsoring me to do this talk all over the country.
The idea of the talk, it was originally put together for
what the Recording Academy called their Grammy Future Now
conference, which was sort of a mini, one-day TED conference
for producers and engineers, for people who make music in
Los Angeles.
And since then, I've gone around the country, and most
of the time I give this presentation to
producers and engineers.
And it's because there's a lot of information in the
presentation that, as people who make records, we sort of
kind of know, but we don't actually know.
And so I'm trying to put numbers and facts behind the
things we think we know so that when we listen, you can
actually compare things and you know what it is you're
listening to, and why there might be differences and
things like that.
So I'll start the way I usually start by asking how
many people in this room are artists and make records, or
have ever released a record.
So still a good number of you.
So if you've released a record, how many of you have
then gone and bought your record to make sure that what
comes off the services sounds like what you sent them?
So that's about normal.
About a third of the hands, maybe less.
And that's exactly the same with people who do that as
their day job-- or night job, depending on
the hours you keep.
It's not something people really think about.
They finish their record, they master it, like oh, it's done.
Send it off, and you're done.
And of course, now with all the digital services--
and we'll get into lots of them specifically--
and there are a few that happen to be housed in this
building or down in San Bruno--
there are lots and lots of different ways that music gets
out into the world.
And so, the idea is to give some context to know, what are
all these possibilities?
How do they compare?
And do they actually impact the consumer's experience when
they listen to your music?
So that's the idea.
Now along the way, I can usually get away with a lot of
sort of vagaries, because I'm talking to
producers and engineers.
All right, this is the question just for you guys.
How many people in this room know more about digital audio
theory than me?
There's going to be-- come on.
It's everybody in the room.
But seriously, how many people work directly with digital
audio in the room?
OK, I'm going to be vague.
I'm going to be slightly inaccurate, and I would
welcome corrections along the way.
So I've done the presentation, I think, 12 times now.
11 of those times were for producers and engineers, and
once was a few weeks ago, which is when Neil came and
saw me at Fantasy Studios in Berkeley.
And that was for a room full of people from the tech
community, including people from Google and YouTube and
Google Play music, as well as SoundCloud and Apple and
Rhapsody and Rdio, and a few other companies, and
Fraunhofer, who developed the MP3 and AAC codecs.
And I got my butt kicked.
And I'm fine with that.
I would love to get my butt kicked, because every time I
give this presentation, I know more, and I can kick butt back
a little bit, which is the point.
And I think what happens is people get into their little
rabbit holes on what they work on.
So I make records, and I want to make great sounding
records, but I don't want to follow it through the food
chain down to the consumer, because that's not what I do.
Now two years ago, I started my own record label, so now
that became part of what I do.
And I used to think, I'm going to start a label because the
labels suck.
They don't know what they're doing.
It turns out I don't know what I'm doing, and it's really,
really difficult, and there's a lot to it.
So every part of this process of getting music into the
hands of people who listen to it is unbelievably difficult,
incredibly technical, and fraught with peril for the
audio along the way.
So we'll talk about some of the specifics.
So what I want to do first, though, is put recording into
perspective, OK?
So for thousands and thousands of years-- and now we start my
very fine PowerPoint--
there has been music.
OK, for who knows how many?
Let's say 10,000?
Is that a good number?
This is where I get vague, and everybody in the
room backs me up.
So we're going to say 10,000 years, there's been music in
the form of songs that have been written by somebody.
And then, they would perform their song for somebody in
their village or something like that.
And the only way music could propagate would be either they
would go to the next village, or they would teach their song
to somebody and then they would go to the next village,
or people from the next village would come hear them
and go back.
Right?
So there's your music industry for the first 9,900 years.
Fair enough?
OK, about 100 years ago-- a little bit more-- but
basically, about 100 years ago, there started to be
consumer recordings of audio.
And there were a few things before this, but let's say the
wax cylinder was the first viable format.
So you have the Edison cylinder where people would
come into a room.
They would make lots of noise.
That noise gets collected by a horn.
It would get scratched on to this disc
that's spinning around.
And then, you could take that disc, and go
play it back elsewhere.
So all of a sudden, you have created
what is called recording.
Recording, especially back then, was technically just
a delay process, right?
So you perform the music, and then you capture it for a
second, and then you can carry it around.
And then later on at any point, you can play it back.
So now, you can get rid of some of the space and time
constraints of everybody come to your concert.
Now you can record your concert and send it out.
Now this caused a huge uproar.
And in researching this for this presentation, and also I
teach a recording class where I try to give a little bit of
a history, there's some amazing quotes from John
Philip Sousa and people like that about how recording was
going to destroy not music, but society.
Destroy it.
You have to be in the room with the musician.
So I think we've all kind of gotten over that.
I mean, I would hope everybody here enjoys going to concerts
and things like that.
But we've gotten over the fact that we're going to completely
destroy society.
Music isn't the only thing that's destroyed.
And it's just one of many things.
OK, so that's 100 years.
That's it.
Then about 50 years ago, mainly with technology out of
Germany in the '40s, and then also some techniques developed
bouncing from one tape machine to another tape machine, you
started to be able to not just capture a live performance,
but you had what we call overdubs, which is basically,
you make a recording, and then you record some more stuff to
go with it.
So now, you can record at different times, and all of
those things add up to make a recording.
A lot of the early Beatles recordings
were examples of bouncing.
They would record the band, then they would play the band
back while recording something else, combine those together.
So that was a technique.
The German tape machines allowed you to actually have
multiple tracks side by side.
So you record on a couple of tracks, then you record on
another track.
So things we're sort of familiar with.
But that basically '40s, into the '50s--
but even in the '50s, most commercial recordings were
live recordings, to mono or possibly starting to get into
three-track tape, but eventually going to be mono
going out into the world.
But once you start having these multi-track tapes, then
you have to mix those things together.
So this created something in the music
industry that didn't exist.
It used to be there were only recording engineers who
captured things.
Now all of a sudden, you needed people who could take
all the stuff that was captured, combine it together,
and make it something that could go off into
the world to be heard.
So that's the mix of the recording of
the song with overdubs.
And then, once you actually had consumer formats--
whether it was the cylinders or onto LPs or 45's or 78's or
cassettes or eight-track tapes, up into CDs--
you needed to have some sort of standard as to how the
music would have to be put onto these media to then be
distributed.
So you would get your mastered mix of the recording of a song
with overdubs.
Now in a room full of engineers, that kills.
That's really funny, because it's a font joke.
Mastering makes things loud.
That's the idea so-- all right.
I'm sorry.
It's the wrong crowd.
OK, so this is now what the artist sends off
into the world, OK?
This is what a record is.
But it's much more than that, right?
Music, pre-recording, was nothing more than art.
There was some commerce involved, but it
was basically art.
It was musicians and composers who would have a piece of
themselves that they would want to capture, and then let
other people hear it, and recreate the emotion they were
trying to create when they performed the music live.
So let's say that really, the recording is more like this.
And you don't have to read it.
The point is I needed to get a lot of text on the screen for
later on for one of my very clever, inaccurate analogies.
The idea being that we need to keep in mind that this is art,
and this is the difference between looking at an art book
and going to a museum, OK?
There are differences.
And the idea of live performance versus recording
is one stage of this difference.
But there's also a huge difference depending on how
that recording gets to you at the end of the day.
And when we actually get to the listening portion, I think
someone once said it's stuff that you can't unhear.
You'll hear the difference between some of these file
formats and bit rates and things like that, and you'll
decide for yourself whether it makes a difference.
My theory is I think it does.
OK, so now we're going to go through part of the
presentation, which is a little more technical, which
means it's a little dumbed down for most of the
people in this room.
But there are a couple important things.
So the first thing is, the difference
between sound and audio.
And I'm sure most people in this room know this, but the
idea that's important is that all sound is analog, period.
And analog meaning infinitely variable, OK?
Until you get down to the molecular quantum level, any
sound in the air is infinitely variable acoustic pressure
waves that travel around the room, right?
Everybody cool with that?
Now, you can buy a digital microphone or a digital pair
of headphones, and that isn't actually what they are.
They are analog microphones and analog headphones that
happen to have converters built into them.
So they are two things in one.
But they are an analog device.
There's no such thing as a digital microphone.
The only way you can record something is to put something
in the air in the way of the pressure wave so it moves
because of the pressure wave, and then using lots of
different technologies for how you design your microphone.
You turn that into a voltage is the most
common way to do it.
Then, you can digitize the voltage, OK?
So this would be the simplest sound.
It's a sine wave.
It's information at only one frequency.
But the idea is while it's a sound wave, you zoom in, you
zoom in, you zoom in.
It never pixelates, right?
It's smooth all the way down.
So the idea of digitizing--
and this is where, feel free to take a nap or something
real quick.
So obviously with digital systems, you don't have the
luxury of looking at something infinitely many times a
second, right?
You have to have a clock.
You have to decide how many times you're going to look.
So for the producers and engineers I talk to, this is
actually really helpful.
I know it's very simplistic, but it's just the easiest
visual representation of what sampling is.
So the idea is, time across the bottom,
voltage up and down.
And every time there's a vertical
line, that's a sample.
So how many times a second-- let's say that's a second, and
then we count the number lines, and that's how many
times a second we're looking at it.
And each time we look, we say, how big's the voltage?
And we write it down using a number.
And how many bits we get to write down that number are our
horizontal lines.
Everybody's good with that, right?
So the idea being that if you look at this particular grid
superimposed on the sine wave, we almost never go directly
through an intersection.
So we are always wrong.
We are always rounding.
And obviously, anyone in the room who really knows digital
theory knows that that's OK.
There's quantization error, but you make up for it,
and you can reconstruct things quite well.
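As a rough illustration of the grid described above, here is a minimal Python sketch. It samples a sine wave at fixed time steps (the vertical lines) and rounds each value to one of a fixed number of levels (the horizontal lines). The function names, the 1 kHz test tone, and the choice of evenly spaced levels between -1 and 1 are assumptions made for illustration, not something from the talk.

```python
import math

def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    step = 2.0 / (2 ** bits - 1)          # spacing between the horizontal grid lines
    return round((value + 1.0) / step) * step - 1.0

def max_rounding_error(freq_hz, sample_rate, bits, seconds=0.01):
    """Sample a sine wave and report the largest rounding error seen."""
    worst = 0.0
    for n in range(int(sample_rate * seconds)):
        t = n / sample_rate                # each n is one vertical grid line
        exact = math.sin(2 * math.pi * freq_hz * t)
        worst = max(worst, abs(exact - quantize(exact, bits)))
    return worst

# More bits means finer horizontal lines, so the worst-case rounding shrinks.
for bits in (4, 8, 16):
    print(bits, "bits -> worst rounding error", max_rounding_error(1000, 44100, bits))
```

The rounding error never exceeds half the spacing between levels, which is why adding bits makes each sample closer to the true voltage.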
You'll also know this sample rate is way higher than we
actually need to capture this sine wave.
You only need just over two samples per
cycle, and you're good.
So that's fine.
And I'm not saying that this is not a good sample for this
particular sine wave.
But as a visual
representation, it's important.
The idea being, though, if we want to be more accurate, we
can do two things--
we can up the sample rate, and we can up the bit depth.
So now, this is sort of the aha moment for a lot of
engineers who've got little pop-up menus for sample rates
and bit depths, and they don't actually know what they do
other than bigger is better, so I'll record more stuff.
Now there are diminishing returns.
In terms of actually building audio hardware, it's very hard
to build something that will work equally well at every
single sample rate.
And I do lots of listening tests for just my studio for
making records, and I found that there's a lot of gear
that works great at 96 kilohertz, and up at 192, it
doesn't really work so well, because some things are
getting stressed, and it's just not optimized for it.
So it's not always that higher sample rate is better.
But in a perfect system, a higher sample rate will be
more accurate more of the time.
Right?
I mean, I think that's fair enough to say.
And the same thing with bit depth.
And in some ways, bit depth is more
important than sample rate.
Now the other thing is you could very easily make the
theoretical argument that 44.1 kilohertz is fine, because
human hearing goes up to around 20 kilohertz.
And I know everyone probably already knows this, but
basically, take your sample rate, divide it by 2.
That's the highest frequency you can capture
at that sample rate.
Fair enough?
So 44.1, you get down to 22.05.
Wow.
22.05.
There you go.
Sorry.
My math just went out the window.
But the problem is, to make that work, you need a perfect
filter that cuts off everything above that
frequency, but doesn't touch anything below it, right?
That filter cannot be built.
It doesn't exist, especially as an analog filter.
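The divide-by-two rule, and the reason everything above that limit has to be filtered out before sampling, can be sketched in a few lines of Python. The helper names and the 30 kHz example tone are assumptions chosen for illustration.

```python
import math

def nyquist_hz(sample_rate_hz):
    """Highest frequency a sample rate can represent: half the rate."""
    return sample_rate_hz / 2

def sampled_sine(freq_hz, sample_rate_hz, count=50):
    """Return the first `count` samples of a sine wave at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

print(nyquist_hz(44100))   # 22050.0

# A 30 kHz tone is above the 22.05 kHz limit for 44.1 kHz sampling.
# Its samples match a 14.1 kHz tone with the sign flipped, so once
# sampled the two cannot be told apart.
too_high = sampled_sine(30000, 44100)
alias = sampled_sine(14100, 44100)
print(all(abs(a + b) < 1e-9 for a, b in zip(too_high, alias)))
```

Anything left above the limit folds back down into the audible range like this, which is why the filter matters so much.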
So this is part of why higher sample rates are really
important for capturing things--
to get an accurate picture at 20k, you kind of need to leave
it alone out to 40 or 48, something like that.
So if you start working at 96, and you can either use very
gentle analog filters or you can start getting into
over-sampling and digital filters, but you can do things
way past where we hear that are brutal, and they don't
affect what goes on down where we do hear.
Now there are also people who argue that we respond to
frequencies above 20k.
We're not getting into that.
We're not getting into, we should be tuning everything to
436 instead of 440.
There are lots of holistic arguments about lots of
things, and I try and keep things more real and in
numbers, because then I don't have to argue about them for
12 hours, and not get anywhere.
So I try and keep it that way.
So anyway, this is basically what I try and impart about
sampling, even though you guys know most of this.
So then we start talking about the actual
consumer formats, OK?
Now there are two types of digital audio files.
Again, I'm sure you guys know this, but there's lossless
audio and there is lossy audio.
OK, lossless audio is take a PCM-encoded wave file at some
sample rate and some bit depth, and you keep all the
numbers, period.
That's it.
That's all lossless is.
It's AIFF, WAV, used to be Sound Designer II.
OK, so those are lossless files.
Now if you want to get into the analog versus digital
debate, they're all lossy, right?
We've thrown away some information.
But we're not there.
Let's say that our capture is awesome.
Let's say we're working at 96k, 24-bit.
We've got lots of information.
If we keep all that information, it's a loss file.
Lossy is--
and again, I'm just going to go through the presentation.
You guys know all of this already, which
is why it's so great.
So lossy is the difference between zipping a file, and
using something where when you unzip your 25-page paper
you've just written, it's missing a bunch of letters and
there's stuff spelled wrong.
And again, for a lot of producers and engineers, they
don't actually understand this concept.
They assume that lossy compression is still OK,
because you end up with a PCM audio stream at the other end.
But it's reconstructed, and stuff is thrown away to
actually make those files.
And the reason being that if you zip an audio file, you
save maybe 20%.
If you use FLAC, which is optimized for audio, you can
maybe save 50% of the space.
But that's it.
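To make "lossless" concrete, here is a minimal Python sketch that compresses raw 16-bit PCM with zlib (the same family of compression that zip uses) and checks that every byte comes back unchanged. zlib stands in for zip here and is not FLAC. The test signal is a pure sine wave, an assumption made to keep the example self-contained, so the ratio it prints is not representative of real music.

```python
import math
import struct
import zlib

def pcm16_sine(freq_hz=440, sample_rate=44100, seconds=1.0):
    """Build mono 16-bit little-endian PCM samples of a sine wave as raw bytes."""
    count = int(sample_rate * seconds)
    samples = [int(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
               for n in range(count)]
    return struct.pack(f"<{count}h", *samples)

original = pcm16_sine()
packed = zlib.compress(original, 9)    # general-purpose lossless compression
restored = zlib.decompress(packed)

print("identical after round trip:", restored == original)
print("compressed size / original size:", len(packed) / len(original))
```

The round trip returns exactly the numbers that went in, which is the whole definition of lossless. A lossy codec gives up that guarantee in exchange for a much smaller file.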
So if you do some quick math, and you're looking at a CD,
let's say, which is at 44.1, 16-bit, you're talking about
10 megabytes for every minute of stereo music.
Those are big files.
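The quick math behind the 10 megabytes per minute figure can be written out as a short Python sketch. The function name is an assumption; the formula is just samples per second times bytes per sample times channels times 60 seconds.

```python
def pcm_bytes_per_minute(sample_rate_hz, bit_depth, channels=2):
    """Size of one minute of uncompressed PCM audio, in bytes."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60

cd = pcm_bytes_per_minute(44100, 16)
hi_res = pcm_bytes_per_minute(96000, 24)

print(f"CD quality (44.1 kHz, 16-bit): {cd / 1e6:.1f} MB per minute")
print(f"96 kHz, 24-bit:                {hi_res / 1e6:.1f} MB per minute")
```

CD-quality stereo comes to 10,584,000 bytes per minute, and 96 kHz 24-bit stereo to 34,560,000 bytes, a little over three times as much.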
You guys spend a lot of time trying to get files from one
place to another quickly and efficiently.
Those files are too big, especially up to a few years
ago with the data pipes going to phones,
all the mobile devices.
There's no way you're going to send that much audio.
So this is why the lossy codecs actually exist.
So very briefly, Fraunhofer, which is based up here,
developed first the MP3 lossy codec, and then more recently,
the AAC codec.
These are based upon the way you hear.
If you know anything about the way your brain processes the
information from your ears, your ears have just got lots
of hairs in it.
And Julie will probably talk more about
this than I need to.
But you basically are splitting things into
different frequencies.
All of that information comes up into your brain.
Your brain then processes it, and decides, I don't need to
listen to that, not going to pay attention to that, I hate
that, screw that-- oh, that's important.
And then that's what you hear.
So there's lots and lots of information that's thrown
away, which is why in a crowded room, you can
concentrate on a conversation with somebody, because you
start to mask things out.
And the same is true when you're listening to music.
There are lots of things that can be masked out.
So through a lot of research, they decided, what can we
throw away, right?
The idea being that if we take care of getting rid of some of
this information, then all of a sudden, we're dealing with a
much smaller file.
And if you compare file sizes, a decent bit rate MP3 is maybe
10% of the size of the uncompressed audio file.
Yet in some listening tests, you might be able to actually
do pretty well against the file it was encoded from.
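The roughly 10% figure follows from comparing bit rates. A minimal Python sketch, assuming CD-quality stereo as the uncompressed source and taking 128 and 320 kbps as example MP3 rates (the talk does not say which rate counts as decent):

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels.
CD_KBPS = 44100 * 16 * 2 / 1000

def fraction_of_cd(lossy_kbps):
    """Size of a constant-bit-rate lossy file relative to CD-quality PCM."""
    return lossy_kbps / CD_KBPS

for kbps in (128, 320):
    print(f"{kbps} kbps is about {fraction_of_cd(kbps):.0%} of the uncompressed size")
```

At 128 kbps the file is about 9% of the uncompressed size, and at 320 kbps it is still under a quarter.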
OK, so this is where I get very inaccurate, and people
actually got mad at me about this.
But that's OK, because I'm up here and you're back there,
and you'd have to jump over the screens.
So this is the way I explain lossy encoding to people.
So if we go back to our paragraph of lots and lots of
text, if I take out some of the vowels, everybody can
still read this just as fast as they used to, right?
The idea is your brain is predicting what should be
there as much as it's taking the input of
what actually is there.
So if we look at the word "mastered" in that first line,
as soon as you get to the M and you see the "stered" after
it, your brain has decided there's probably an A there.
There's room for an A there.
There's an A there.
It fills in the blank.
If you have a tiny little smudge on the page, your brain
is all about it.
That is an A. Absolutely.
Whereas on its own, that smudge is nothing.
It's a smudge.
So that's the basic idea, is finding what can we throw
away, and still be able to read as fast as we can?
Or, listen and enjoy the music without having to figure out
what it was supposed to sound like?
So the idea being that if I only take out those vowels, we
don't save a whole lot of space.
If I take out all the vowels, now we're really starting to
save some space, and we can compact it down, but I can no
longer read this, OK?
So somewhere is a threshold.
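The vowel analogy is easy to try out. This Python sketch drops either half of the vowels or all of them from a line of text and reports how much space each saves. The function name and the every-other-vowel rule are assumptions for illustration, and real codecs decide what to discard in a far more sophisticated way.

```python
VOWELS = set("aeiouAEIOU")

def strip_vowels(text, keep_one_in=None):
    """Remove vowels from text.

    keep_one_in=None removes every vowel; keep_one_in=2 keeps every
    second vowel and removes the rest.
    """
    out = []
    seen = 0
    for ch in text:
        if ch in VOWELS:
            seen += 1
            if keep_one_in is None or seen % keep_one_in != 0:
                continue               # this vowel is thrown away
        out.append(ch)
    return "".join(out)

line = "mastered mix of the recording of a song with overdubs"
some = strip_vowels(line, keep_one_in=2)
none_left = strip_vowels(line)

for label, version in (("original", line), ("half the vowels", some), ("no vowels", none_left)):
    print(f"{label:>16}: {len(version):2d} chars  {version}")
```

Dropping half the vowels saves little and stays readable. Dropping all of them saves more but starts to cost legibility, which is the trade-off a bit rate setting makes for audio.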
The problem is when you're reading, you have very
discrete chunks of data.
You either know what that word is, or you don't.
Maybe you can fill in a word from the context around it,
but that's kind of as far as you can go.
When you're listening to music, at some point it just
sounds bad, and you don't really want to
listen to it anymore.
Sometimes it sounds so bad that it's kind of crazy and it
sounds like it's under water, and more
like whales than music.
But until you get to that point, it's very hard to say,
yeah, OK, we compressed too much, because you could put
someone in the room, and especially if they know the
song, they'll fill in some blanks on their own, and
they're like, yeah, I like this song.
It's all good.
So the problem with audio is you go from this analog sine
wave-- which no matter how far we zoom in, is still
infinitely varying.
We capture it, we compress it, we send it off, we
reconstruct, but we're starting to reconstruct
something that's a little more stepped.
Now again, this will get smoothed out by things in both
the circuitry and also by your ears, so there are lots of
things working to help you out in reconstructing this
waveform along the way.
But you compress too much, and then you start getting to
things that start to not really sound like sine waves,
or they've got so many harmonics on them that you
don't hear them as a sine wave anymore.
And at that point, you're listening to something
different than what you started with.
And I think that is more akin to someone who kind of sucks
at art, copying paintings, and selling it to you, and like,
yeah, I'll put that on my wall.
Now for $10 and I can download it?
Maybe that's a trade-off you're willing to make.
But in terms of taking the art that this artist has made, and
saying, this is my record, and I love it, and it makes my mom
cry, at some point you're going to send them such a low
bit rate file that their mom's not going to cry anymore.
And that's a drag, because at that point you've lost the
point of the music, right?
It's art coming through speakers.
It's emotion coming through speakers.
So what can we do as record makers, and then what can we
do as people who get that music out into the world to
help people listen to it?
And the great part is, I would assume that everybody in this
room listens to music recreationally.
Let's start with the hands of people who don't listen to
music ever.
OK, so not only are we in charge of making this music
and getting it out there, but we also consume it, so we want
to make products that we actually like, which with a
lot of things, people don't actually buy their own
products, whereas this is sort of the ultimate consumer
product, because everybody's into it one way or another.
So going back to the actual consumer formats.
Within the lossless category, you've really
only got two choices.
You have CDs, which are dying a very quick death, which are
set at 44.1 16-bit audio, right?
Then you've got what is called high res.
And this is a term that people can argue about.
All it means is anything better than 44.1 16-bit, OK?
So when the Beatles re-released their catalog, I
dunno, six years ago, something like that, there was
a version you could buy on a USB stick
which was 44.1 24-bit.
That is high res audio, because it's higher than a CD.
So that's what the term means out in the audio world.
Now for me, I like to think of high res being up at 96k or
something like that.
But in terms of consumer audio, that's what you get.
Now in terms of buying high res audio, there are very,
very few options.
There's HDtracks, who will sell you things to download,
and there's this crazy Java file.
OK, has anyone bought anything from HDtracks in this room?
So a few people.
Is there anybody who thinks that it's so easy to download
and play back this stuff that everybody should be doing it?
OK.
Got a couple.
So there are a lot of things involved, and I'll talk a
little more about what I have set up here to
play this stuff back.
It's hard to get the high res music, and it's hard to play
it back properly.
It's easy to play it back wrong.
Anybody can do that.
Just throw it in iTunes or any other music player, it'll play
back wrong, and you're all good.
But you're getting into transcoding, and things that
you don't really want to get into.
But anyway, that's what you've got for the two viable sort of
ways you can get lossless audio.
There are a couple others that, once we start
listening--
excuse me-- once you start listening that I'll actually
show you, which are kind of cool.
There's high res streaming starting to
happen, adaptive streaming.
It's really awesome.
OK, then we get into the lossy formats, and those files are
basically MP3 and AAC, which are the
two Fraunhofer codecs--
AAC having not necessarily superseded MP3, but just
coming after.
I think Robert from Fraunhofer would argue that
it supersedes it.
But obviously, there's tons of stuff still coming out
on MP3 as you go.
Depends how you encode things like that.
Then there's Ogg Vorbis, which other than Wikipedia, I don't
know much about it.
Is it that it's open source?
OK, so it's the open source encoder.
There you go.
But of course, there are open source MP3 decoders, which
skirt Fraunhofer's license.
Because if you get the LAME encoder, you're not paying
them, either.
So I don't know.
That's vague.
Yes?
AUDIENCE: It's totally patent free, as
well, but that's debatable.
ANDREW SCHEPS: The Ogg Vorbis?
OK, so Ogg Vorbis is patent-free, which I guess
would be the main difference.
Because if you can build yourself an MP4 encoder that's
open source, you're getting around--
anyway.
Robert and I had a very long conversation about this, and
he was awesome.
He was very, very good about this.
I thought he was going to kill me, but he was great.
OK, so if we actually start looking at the services
themselves, this is where for the producers and engineers
it's a big, big deal, because this is the stuff where they
don't necessarily understand things.
I mean, they understand, but it's the stuff you know but
you don't know.
So the CD and high res are both, I'm going to say, WAV.
You can buy it as FLAC, but that's just compressed WAV.
There are AIFFs and things floating out there.
But WAV is the most robust and the most prolific form of
uncompressed audio.
Everything else is not WAV.
OK, so it's either--
all right, first of all, who here is from the Play Music?
OK, I need an answer, because I have scoured your website,
and it says it plays up to 320 kbps files.
So what format, and what does the up to mean?
I can't-- is that an NDA thing?
AUDIENCE: [INAUDIBLE]
ANDREW SCHEPS: So it's MP3s.
AUDIENCE: [INAUDIBLE]
ANDREW SCHEPS: And I'm assuming it's scaled, so as
you test bandwidth, do you go--
I'm going to guess, 128, 256, 320?
AUDIENCE: 192, 256.
ANDREW SCHEPS: OK.
So three tiers topping out at 320.
OK, I couldn't--
and this is part of the problem of
looking for this stuff.
And I don't think--
and you can correct me if I'm wrong--
I don't think anyone is intentionally being obscure
about this.
Maybe you are.
Are you being intentionally?
Are you obfuscating?
I love that word.
Maybe you are a little bit.
OK.
So--
yes, sir?
AUDIENCE: If people in front can move maybe towards the
back of the room, we're going to be playing stuff
out of those speakers.
ANDREW SCHEPS: Yeah, that's going to hurt.
I think what we can do actually is, what's going to
happen is at about 10 to 5:00, Julie's going to speak,
because she's got a presentation
about what she's doing.
And we're technically sort of 4:00 to 5:00, but we also have
the room to 6:30.
So what I'd love to do is we've got 15 minutes, I'll
finish going blah blah blah.
We can maybe do some questions where you guys kick my ass.
Can I say ass on this?
It's internal, right?
You can kick my ass.
And then, we'll break for Julie to speak.
And then, we'll do the listening, and people who have
to go can go, but then we can also shove people into the
middle of the room, because you guys are
going to get killed.
I mean, I'm not going to have it crazy loud, but still,
you're going to get killed a little bit.
OK, so finding all of this information.
OK, how many people from YouTube?
Do we have anyone?
OK, so we got a couple.
Finding out the information on what happens with the audio on
YouTube was not that difficult, but it was also a
little odd in that-- so does everybody in the room know why
there are two bit rates, and everybody in the room know
when you get which one?
OK, there you go.
So here's the problem.
It's tied to the video rate.
There's no setting that says, give me good audio.
There's only the setting that says, give me good video.
So basically--
and you can correct me if I'm wrong--
720 and 1080 give you 384.
Everything else gives you 128, OK?
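The rule as stated here can be written down as a tiny lookup. This is only a sketch of what the speaker describes; the function name and constants are hypothetical, and the numbers come from the talk, not from any YouTube documentation.

```python
# Audio bit rates in kbps, as quoted in the talk (assumed, not verified).
HD_AUDIO_KBPS = 384
SD_AUDIO_KBPS = 128

# Video heights that the talk says unlock the higher audio rate.
HD_HEIGHTS = {720, 1080}


def audio_kbps_for_video(height: int) -> int:
    """Return the audio bit rate the talk associates with a video height."""
    return HD_AUDIO_KBPS if height in HD_HEIGHTS else SD_AUDIO_KBPS


for height in (240, 360, 480, 720, 1080):
    print(f"{height}p video -> {audio_kbps_for_video(height)} kbps AAC")
```

The point of the sketch is that audio quality is a function of the video setting alone; there is no separate audio parameter to pass in.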
Here's the problem--
a lot of people can't afford to make videos for every song
on their record, and a lot of people who buy records and
then really like a song and want to upload it to YouTube
don't make a video that's HD for that song.
So you upload static art work, or you upload lyrics, or you
upload a picture of your dog-- or cats.
Cats are the internet, right?
So it's kittens.
But unless it's awesome footage of a kitten, nobody is
going to switch to HD.
Nobody.
And it doesn't default to HD, so nobody hears your music at
384, which is, in terms of pure bit rate, the highest of
the lossy formats available, period, and nobody hears it.
Yet from numbers I've seen, and I'm sure my NDA won't
cover this because I haven't even signed one, but for
numbers I've seen, 80% of music
discovery happens on YouTube.
Somebody says, hey, have you heard vrr, and I go, I don't
know, let me search for it.
And you put it in, and you listen to it on YouTube.
So 80% of the time, people are being introduced to music with
one of the lowest bit rates on the board, when the highest
rate on the board is actually there, though not available
for most of the videos, because people aren't
bothering to upload HD video.
And should we just finish up the
YouTube thing right now?
OK, and this is something I'm hoping--
I know I'm speaking with some of you tomorrow, but I would
love to get--
my email address is my name, [email protected]
Hunt me down, find me, because I'd love to
discuss this stuff.
Because another thing is going through all of the YouTube
documentation, there's nothing that I could find about audio
upload guidelines.
OK, so there are no audio upload guidelines on the
YouTube site.
Zero.
The problem is, of course, what you're ending up with are
128 and 384 AACs, but most of the time, people are uploading
lossily compressed audio.
So you're transcoding.
Is there anybody in the room who disagrees that transcoding
is the worst sounding thing you could ever do to a piece
of audio between two lossy formats?
Because we'll fight later.
OK, there are amazing sounding lossy encoded files.
384 AAC, I would defy most people to sit in a room, do
double-blind test between 384 AACs properly encoded and CDs.
I would defy anybody to not tell the difference between that and a
384 transcoded AAC that came from any other lossy format.
It sounds terrible.
This is one of the things we're hoping to
move forward with.
So anyway, this is one of the problems with
comparing the services.
But the big problem that a lot of the people I speak to
normally have is they don't know how to compare the 44.1
and the 256, and zero consumers know how.
256 is way more than 44, right?
I rest my case.
But when you're trying to actually educate people about
just what this is, you need to come and sit in a room, and
have me go blah blah blah, and show you a chart.
So the idea is that, again, as with any scientific thing,
you've got to look at the units.
And the kilohertz and bit depth is totally different
from kilobits per second.
Now the cool thing is that all of the lossy formats are
actually very transparent with their bit rate.
OK, this is, again, where I make records.
I don't work with computers all the time.
I'm rounding.
There's no 1024.
The numbers are very round, because it's easy for us
people to understand.
All right, so basically, I take your bit rate, I put
three zeroes on the end, and that's how many bits per
second I get to represent my stereo piece of art that makes
my mom cry.
Then actually do the math--
44,100 times 16 times 2, and we're at 1.4 million on a CD.
Now obviously, the codecs that encode the lossy encoders are
very smart.
So it's not like just take a percentage, and that's how
much worse it sounds.
I absolutely get that.
But we're talking at a very big difference, and then you
look at the 192 32, which is the highest I've seen coming
off of HDtracks.
And you're up to 12.2 million.
OK, the problem being in the grand scheme of things that
that's really not a whole lot compared to the analog we
started with.
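The arithmetic in this passage is easy to check. Below is a minimal sketch using the same rounding the speaker uses (1 kbps taken as 1,000 bits per second, no 1024); the function names are made up for illustration.

```python
def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Uncompressed PCM data rate: samples/s x bits/sample x channels."""
    return sample_rate_hz * bit_depth * channels


def lossy_bits_per_second(kbps: int) -> int:
    """'Put three zeroes on the end' of the advertised bit rate."""
    return kbps * 1000


cd = pcm_bits_per_second(44_100, 16)       # 1,411,200 bits/s
hi_res = pcm_bits_per_second(192_000, 32)  # 12,288,000 bits/s

print(f"CD (44.1 kHz / 16-bit stereo): {cd:,} bits/s")
print(f"192 kHz / 32-bit stereo:       {hi_res:,} bits/s")

for kbps in (128, 256, 320, 384):
    bps = lossy_bits_per_second(kbps)
    print(f"{kbps} kbps lossy: {bps:,} bits/s, {bps / cd:.1%} of the CD rate")
```

As the speaker says, the percentage is not a measure of how much worse a file sounds; it only puts the two kinds of numbers into the same unit.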
So again, we're not going to go the analog versus digital
debate, but how many people here like vinyl?
How many people actually look to see if the vinyl's done
from the analog masters instead of digital remasters?
Get some old Blue Note.
Even just compare it to some of the reissued Blue Note.
And it's kind of astonishing.
It's like you're there.
OK, so this is where we stop talking about numbers.
And now, I want to go through this study very quickly.
This is sort of an older study.
Because of course, the thing is, does anybody care?
If nobody cares, then we don't need to care, right?
If this doesn't make a difference, and it's all just
a bunch of numbers, I don't care.
The idea is I want people to spend enough money on the
music that I work on that the artists I work with don't have to
take a day job so they can keep making records.
And I want to be able to afford to keep making records,
and not necessarily take a day job, but if you've got
something for me, we'll talk.
OK?
That's the idea.
OK, we're not all looking to be on MTV Cribs,
because we're not.
OK, but if people don't care, then by all means, make the
files tiny, because then everything else about the
consumer experience is awesome.
Instant on, very fast, move it from one place to another, fit
25 bazillion songs on anything that fits in your pocket.
That's all good.
OK, now Harman who were actually nice enough to send
up this pair of speakers we're going to listen to later, this
study is from a little while ago to be fair.
But they decided, we need to actually know if people care.
Because they don't care what the outcome is, but they need
to know the answer to that question because they make
equipment for people to listen to music.
That's what they do.
So they need to know, do we need to be really
concentrating on stuff that plays back lossless audio, or even
high res audio?
Or should we be building better MP3 hardware decoders
in, and just deal with that?
Should we actually limit the bandwidth?
When we're starting to talk about wireless technology--
I mean, if you look at Sonos and RedNet and a lot of the
really cool networked audio and wireless audio
technologies--
where do we need to cap our bandwidth?
These people need to know what people like, but they don't
actually have a horse in the race, because they're just
going to build the gear to play it back.
So Dr. Sean Olive, who works there, who's a pretty amazing
guy, and he's got labs that have all
kinds of stuff in them.
They've got stuff that looks like it's out of an amusement
park, so when you're A-B-ing speakers, they hydraulically
move into the same place.
You don't have the differences in placement when people
change speakers and things like that.
So what he decided to do was get young people, because
there's a lot of sort of anecdotal evidence that young
people not only don't care--
but this is the crazy one to me, and if you know anything
about neurology and cognitive listening, it's even crazier--
but that kids these days have only heard MP3s, so they
actually prefer them.
Again, if anyone wants to discuss that later, I will
talk about that for hours, because that's the rabbit hole
I've been down for the last two years.
But I'll just say that that is pretty much
categorically not true.
So this study from a little while ago was
meant to prove this.
So they got a bunch of young kids these days, or in those
days, both high school and college age students.
The only thing that's really important here-- well, there
are two things.
One is that, for whatever reason, they were mostly male
students, as opposed to female students, studying audio,
which is kind of a drag at all times.
So that's just the way it works.
The other thing is you see this last column, this level
of training--
all this is is that these students were involved in a
recording program, or they had taken a comparative listening
class or a critical listening class, or something like that.
So they were aware of audio quality as a thing, as opposed
to just being someone off the street who really has never,
ever thought about it, OK?
So that's the break up.
Here is what they did.
And I--
all it means is they knew what they were doing, and it's
scientific.
OK, so it's true double-blind listening.
These kids don't know what they're listening to.
They come back multiple times, and they listen.
OK, now this is between 128k MP3, which was what everybody
was selling when they did this study.
And you think, my god, that's the Dark Ages, but it's
really, what, four years ago?
Maybe five?
Maybe five.
That's what you could buy.
So between that and CD.
So we're not talking high res HDtracks downloads.
70% of the time, those stupid kids liked the CD.
And this isn't even a what sounds better.
This is a what do you like listening to?
Which one do you want to hear?
All right, the important part of this is going back to this
sort of threshold of where does my mom cry, is what
happens emotionally?
So part of one of my theories is, if you go back to that
huge block of text, and you take out a bunch of vowels, at
some point it's harder work to read.
So while you will still understand the words, and
enjoy the story maybe, you will be less emotionally
invested because you're doing stuff.
The same thing is true, I believe, when listening to
lossy audio, because while your brain might throw stuff
away, it's expecting it, and your brain gets pissed when
the stuff doesn't show up.
So you can create anxiety, you can create depression at very
low levels, but at the same time, it's also filling in the
blanks for you, right?
You're taking away lots of acoustic
things from the music.
That's one of the first things to go are reverb tails and
acoustic cues.
So your brain is recreating.
Therefore, it becomes more of an active process to listen.
Now while that may not be that much of an issue, one of the
anecdotal things that really sent me down this road is that
my daughter had a friend in high school who was interning
with me in my studio.
And great drummer, really musical kid, listens to music
all the time.
And he showed up at the studio in the afternoon to work on
something, and he came in, and he said, man, been listening
to music all day and I'm exhausted.
And I don't know how many people that sounds absolutely
crazy to, but that to me is crazy, because I would wake up
in the morning and put on records or cassettes--
even that I had recorded from a microphone in front of a
speaker, so not the highest quality audio in the world--
but I would listen for 15 hours, and my parents would
yell at me, and then I would listen to headphones in bed
for a while.
Even recently, I've gone to friends' houses who have these
amazing set-ups, and we listen to vinyl all day.
And as my wife can attest, I was down at this guy's house
for 15 hours, and I got home at 1:30 in the morning and put
on a record.
I was not exhausted.
When I listen to some of the streaming audio services,
though, I get tired.
I get a headache.
I grind my teeth.
And it's not an instantaneous thing.
It is not an, oh my god, that's killing me and making
my ears bleed.
But it is, in terms of a long-term commitment, and I
would also argue in terms of a long-term connection between
people who hear the music and the artist.
And one of the most important things with artists is that
people actually connect with them on an artistic level.
And that happens by them experiencing some of the
emotion that went into the song.
And it could be as simple as a lyric, which means you're in
pretty good shape no matter what.
But it could be because of the chord changes and the
instrumentation and the subtleties of the performance.
And when we start listening, you will, I believe, start to
hear some kind of not subtle differences.
We put the B back in subtle with some of the things that
change when you listen back to back between some of the
lossily encoded music and the lossless music.
In terms of when you get to the second verse of the song,
do you feel like, musically, I've already heard
this, let's move on?
Or do you feel like, god, what's next in the story?
And man, there's a new guitar part.
And these are subtle things.
So if you love an artist, then it doesn't really matter.
You will love them even if it sounds terrible.
But what if it's somewhere in the middle?
What if you're kind of on the fence?
What if the audio quality actually determines where your
threshold moves as you're listening as to whether you're
going to listen to the next song on that record, or even
make it to the end of the first song?
And I know that part of people not listening all the way
through to songs and skipping around all the time is just
due to changes in consumer habits, and we're all
multitasking more, and things like that.
But for the people here who listen to vinyl, I think you
may not always flip it to Side B, but how often do you lift
the needle in the middle of Side A-- unless you're
DJing a party--
because you're just kind of tired of it, and now I
want to move on?
You'll generally have the experience of Side A. So
you're getting 20 minutes straight of something.
When you're just listening online, that
doesn't happen so much.
There's a lot of skipping around, and a lot of moving.
But what I've got here--
I went to a few of the different labels.
I've got 18 songs and a bunch of different genres, and I'll
put up just a list of them.
And you guys will DJ.
And also, we can talk about anything.
If anyone has questions or wants to point out stuff I've
got wrong, I absolutely want that to happen, as well.
And we can do that while we listen, things like that.
And I have them in as many formats as I could possibly
have them in, including--
oh, we didn't make it to this slide.
Sorry.
Google Play Music, I've got my playlist from you guys.
So hopefully because I'm on your ridiculously fast, free
Wi-Fi, we'll be getting 320 the whole time I'm sure.
But also, then I want to show you something called
OraStream, which is adaptive based on
bandwidth, which is awesome.
And we'll talk about other stuff.
Roundabout.
OK, really quickly.
The way I'm playing the stuff back is I'm using my Mac.
I am playing out of a program called Decibel, which is just
a very, very simple music player.
And the only thing that it does is it switches the sample
rate of the hardware to match the files.
So that way, we're not doing any sample rate conversion
in software on the way out of the computer. We get it out to
the converter at its native sample rate.
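The idea of matching the hardware rate to the file can be sketched with Python's standard `wave` module: read the file's native format from its header, then compare it with the rate the output device is running at. This is an illustration only, not how Decibel works internally; `wav_format` and `needs_resampling` are hypothetical helper names, and the test file is a few frames of silence built in memory.

```python
import io
import wave


def wav_format(fileobj):
    """Read (sample rate in Hz, bit depth, channel count) from a WAV header."""
    with wave.open(fileobj, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def needs_resampling(file_rate_hz: int, device_rate_hz: int) -> bool:
    """Software sample rate conversion is needed only when the rates differ."""
    return file_rate_hz != device_rate_hz


# Build a tiny in-memory 96 kHz / 24-bit stereo WAV (10 frames of silence).
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(3)        # 3 bytes per sample = 24-bit
    out.setframerate(96_000)
    out.writeframes(b"\x00" * (3 * 2 * 10))
buf.seek(0)

rate, bits, channels = wav_format(buf)
print(f"file: {rate} Hz, {bits}-bit, {channels} channels")
print("device at 44100 Hz -> resample?", needs_resampling(rate, 44_100))
print("device at 96000 Hz -> resample?", needs_resampling(rate, 96_000))
```

A player that switches the interface to the file's rate always ends up in the second case, so the samples reach the converter untouched.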
It also crashes a lot.
It's a $30 program.
But it generally works.
I'm using this Prism Orpheus, which is a one-rack-space
eight-channel audio interface.
So it's amazing for recording, but I'm using it because it
gives me a volume knob on the front.
I'm just using it stereo going out.
The reason I'm using it, as opposed to something a little
more simple, is because some of my source material is at
192, so I need a box that'll go up to 192 without putting
something else in the middle.
I've tried as hard as I can to make sure that all of these
different files are from the exact same master.
So the same--
remember my font joke from earlier?
Sometimes, that happens multiple times to a release.
Roundabout is one of the examples.
Sorry, we will listen really quickly.
But I needed to say that Roundabout is one of the
examples of something where it is actually from a different
master, the high res version, because it was from a DVD
audio release from, I don't know, eight years ago--
way back in the stone ages when that was a format for
about eight minutes.
So that is actually a different master.
But still, it's a pretty astounding difference.
Now I will also say--
and we can stop this, but anyone who was at the talk I
gave at Berkeley knows that at some point, Robert from
Fraunhofer made me stop playing things off YouTube
because he said it's unfair because it's all transcoded,
and it made him look bad.
And I said, OK, that's fine, but I wasn't sure if everybody
in the room kind of understood what had just happened, that
we just took the biggest player in music discovery out
of the discussion completely because it wasn't fair to the
people who developed the codec that encodes the music that's
on this service.
So I will play--
and that said, I play official videos if I can find them.
But there aren't always official videos.
So let's listen to some Yes, and would you
like to pick a format?
Do you want to go low to high, high to low?
AUDIENCE: High to low.
ANDREW SCHEPS: High to low, OK.
So we'll actually go down through CDs, because you'll
hear a little bit of the difference between the masters.
So this is the 96 24 taken off the DVD-A, or whatever it was.
AUDIENCE: Quick question for you.
Are you relying on the digital-to-analog converter in your MacBook?
ANDREW SCHEPS: No, I'm going FireWire to the Orpheus, and
the Orpheus is the D to A converter.
And it's a great sounding converter.
The Prism converters are--
some people say that they're the best converters out there
for music recording.
In the UK, it's almost exclusively what's used for
all the orchestral scoring guys.
They'll have 80 channels of the Prism converters.
And then, we're just going straight into an amplifier to
these speakers.
And that's it.
Yeah?
AUDIENCE: What are you doing to match levels?
ANDREW SCHEPS: I'm fudging it.
OK, so this is not a scientific test.
This is an anecdotal test.
Unless I unplug the monitor, which we can do as well,
you're going to know what you're listening to.
So I'll try and match levels as best I can from up here,
but it does vary a little bit.
So I'll always make the high res stuff louder, because then
you'll like it better.
AUDIENCE: How much power are you using to drive the
amplifiers?
ANDREW SCHEPS: It says it's 4 by 350.
So each speaker is bi-amp, so we get 700 watts a side.
So I'm barely cranking it.
You let me know how loud to go.
And I apologize again.
Yeah?
AUDIENCE: [INAUDIBLE] volume [INAUDIBLE]
digital in this thing?
ANDREW SCHEPS: In this?
No, it's actually an analog control on the output, which
is bizarre.
That's what they tell me.
You can hook it up in lots of different ways.
There's an audio path within it.
The way it is supposedly hooked up is as analog.
But if it is digital, I have to be able to
turn it up and down.
I don't have a choice.
There have been times when I actually had an analog control
room section instead, but it was a lot of
gear to bring up here.
So we're going to use that.
Again, everything is going through that.
Everything is constant except the files themselves.
AUDIENCE: Is it worth turning off the air conditioning, or
will that not matter because of the volume?
ANDREW SCHEPS: I think we'll get over the top of it, yeah.
I mean, again, this is not the most--
here's the crux of this.
And I do want to get to music for those of
you who have to leave.
But the crux of this is that you could set up audiophile
double-blind A-B tests--
A-B-X tests-- and be really precise about this, and see
what you can tell the difference of.
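For reference, the "precise" version of such a test usually comes down to counting correct answers and asking how likely that score is by pure guessing. The following is a minimal sketch of that scoring, assuming each ABX trial is an independent 50/50 guess under the null hypothesis; the function name is made up.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by guessing.

    Each ABX trial is modeled as a fair coin flip, so this is the upper
    tail of a binomial distribution with p = 0.5.
    """
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


for correct in (8, 10, 12, 14, 16):
    print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

With 16 trials, 12 correct gives a p-value of about 0.038, which is the kind of threshold a formal listening test would report. None of this applies to the informal comparison in this session.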
But I think especially as we jump from ends of the
spectrum, it's not subtle.
It's huge differences, and then it's a question about
whether it matters to you.
I mean, who cares if you can hear the difference?
If you like them both, then fine.
Then you're good with the small files.
I'm not trying to evangelize one particular type of file,
or to convince anybody that you have to listen this way,
or you're missing out on the music.
My theory is that once you get to a certain point, you're no
longer kind of interfering in the emotional response.
But in terms of an audiophile, short burst listening test,
this is more fun than anything else, because it takes a lot
of work to actually find all these stupid files and put
them in one place.
So that's the fun of it is I wasted days of my life so that
we can sit here and DJ.
OK, so that said, let me know how loud--
OK.
[MUSIC PLAYING]
ANDREW SCHEPS: All right, and here's CD, which--
again, a different master, but it's more to set up for when we
listen to the other formats.
So this will be the same master as
all the other formats.
OK, so that's pretty different.
But it's also a different master.
So let's for fun, because it is fun.
This is when I'm glad I'm behind the speakers.
Sorry.
Let's just listen to some more stuff, and then we can talk
more, because--
AUDIENCE: What resolution were you playing that at?
ANDREW SCHEPS: Well, that would have been--
is it 128 AAC?
Because there was no high def video.
AUDIENCE: OK, so it was an old upload [INAUDIBLE]?
ANDREW SCHEPS: I guess, yeah.
I mean, or it's a static artwork upload, so they didn't
bother uploading it in HD.
AUDIENCE: [INAUDIBLE].
ANDREW SCHEPS: Yeah.
OK, so let's do Coltrane.
So this is the same master, OK?
There have been reissues and things like that of this, but
I know for a fact because I got this from Blue Note that
this is the same master in all formats.
OK, so where do we want to start?
You guys tell me.
So that's A. We'll do A,B,C. What do you think about that?
Or do you just want A, B?
AUDIENCE: A, B, A, B
ANDREW SCHEPS: Just A B?
Well, hold on.
A, B or A, B, C?
A, B. OK.
AUDIENCE: Are you sure [INAUDIBLE].
ANDREW SCHEPS: Yeah, that happens a lot.
And this is why, again, we had to stop going to YouTube as
any of them, because a lot of them are either swapped, or
depending on the transcoding start to collapse into mono.
Like the Beatles stuff is mono, but it's
not the mono mixes.
So yeah, that happens, but that's--
AUDIENCE: [INAUDIBLE]
resolution.
You can't have very high good placement [INAUDIBLE].
ANDREW SCHEPS: Oh yeah.
Yeah, I mean, with the CD.
OK, so that was A and B. That was YouTube versus 192.
And so again, it was the low resolution possibly
transcoded, even though that was an
official Blue Note upload.
But the problem is--
I mean, I'm sure you guys know, working at kind of a big
company, that at some point, someone told the people at
Blue Note, OK, now we're going to start doing our official
YouTube uploads.
And here are all the assets, and go ahead and do it.
And that definitely filtered down to an intern who had to
sit in front of a computer uploading for three weeks,
because nobody who really knows what they're doing is going to
spend longer than it takes to just point them to the assets.
So their official uploads could've
been completely destroyed.
I mean, it's easier sometimes--
and this happens at HDtracks a lot, where they're sent
something that they're told is 96/24 so that they can sell
it, but the person who actually sent them the files
didn't know how to get over the 2 gig file size limit, and
the album was too big, so they just ripped a CD and sent it.
And it happens.
And then HDtracks gets in a lot of trouble, because there
are a bunch of crazy audiophiles at home doing FFTs
of this stuff.
And also, depending on how it was recorded, there isn't
necessarily anything above 20k.
But if they don't see stuff at 40k, they're like,
that's not high res.
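The kind of check those listeners run can be sketched in pure Python: take a spectrum and ask what fraction of the energy sits above a cutoff such as 22.05 kHz, the ceiling of a CD. This is an illustration under simplifying assumptions (a naive DFT on a short synthetic signal, hypothetical function names); real tools use a proper FFT over the whole track. As the speaker points out, an empty top end does not by itself prove a file is an upsampled CD.

```python
import cmath
import math


def energy_fraction_above(samples, sample_rate_hz, cutoff_hz):
    """Fraction of non-DC spectral energy at or above cutoff_hz (naive DFT)."""
    n = len(samples)
    total = above = 0.0
    for k in range(1, n // 2):  # positive-frequency bins, skipping DC
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate_hz / n >= cutoff_hz:
            above += power
    return above / total if total else 0.0


def tone(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


RATE = 96_000
N = 480  # 200 Hz bin spacing, so the test tones fall exactly on bins

# A "96 kHz" file that only contains CD-band content.
cd_like = tone(1_000, RATE, N)
# The same tone plus a quieter 30 kHz component.
wide = [a + 0.5 * b for a, b in zip(tone(1_000, RATE, N), tone(30_000, RATE, N))]

cd_like_fraction = energy_fraction_above(cd_like, RATE, 22_050)
wide_fraction = energy_fraction_above(wide, RATE, 22_050)
print(f"CD-band only:       {cd_like_fraction:.6f}")
print(f"with 30 kHz content: {wide_fraction:.6f}")
```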
So there are lots of problems in the supply chain, as well
as just the file formats, which is, again, why this is
not meant to be a scientific test, and
more of just an anecdote.
Now if you want, we can stay away from YouTube, because it
is, unfortunately, the most problematic.
But--
AUDIENCE: Which one [INAUDIBLE]
ANDREW SCHEPS: A was YouTube, and B was 192.
Now an interesting thing to me-- with these speakers, I
added some low end to tune this room very quickly
before I came in.
There's some thumping on that side that I'm hearing on the
192 which I don't really hear in the MP3.
So you don't always--
like, oh my god, it's just so much better.
Sometimes you uncover other things along the way.
AUDIENCE: Do you have a non-YouTube [INAUDIBLE]
ANDREW SCHEPS: Yeah.
We can do--
well, your stuff would be 320.
So we could do 320, or we could do Amazon if
you want to do that.
Let's do Amazon.
Well, I just told you, it's Amazon.
I'll leave it up.
OK, but here's Amazon.
AUDIENCE: Can you do it, and then we vote which is which?
ANDREW SCHEPS: Yeah.
Yeah.
Let me just play you some of the Amazon of the Coltrane,
and then we'll go to a different song, and I won't
say a word.
Now I'm crazy, so I did some FFTs of some of this stuff.
And one of the things that Amazon does-- because they're
only selling 256 MP3s, that's what they sell--
and presumably to help their encoder--
because they're not getting 24-bit files, either--
they actually pretty much cut off everything above 15k.
So that's band-limited to 15k on the way into the encoder,
because if you don't have to bother encoding from 15 to 20,
you've got that much more room to encode below.
So that's their decision.
Again, now I'm right between the speakers, so for me the
imaging is a pretty obvious thing.
The 192 is the only one where things are either on the left
or in the right.
And Rudy Van Gelder, who recorded this album, did not
have a pan pot.
It was a patch cord.
It was either in the left or the right, and that's it.
So as soon as you get anything that isn't discretely on one
side or the other, you know it's part of the process of
the encoding that has made things shift.
And that's another way that a lot of the encoders work.
And I don't know specifically the ones you use because
you're writing your own encoders, but if you mono up
stuff, it makes it much easier to encode.
It's one audio stream, and it's
identical in both channels.
So you can save a lot of space doing it that way.
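Why identical channels are cheap can be seen with a mid/side split: mid is the average of left and right, side is half their difference. The sketch below is only an illustration with made-up function names and a toy signal; real encoders' joint-stereo modes are considerably more involved.

```python
def mid_side_energy(left, right):
    """Return (mid energy, side energy) for a pair of channels."""
    mid = [(a + b) / 2 for a, b in zip(left, right)]
    side = [(a - b) / 2 for a, b in zip(left, right)]
    return sum(x * x for x in mid), sum(x * x for x in side)


def mono_up(left, right):
    """Collapse stereo to mono: both output channels carry the mid signal."""
    mid = [(a + b) / 2 for a, b in zip(left, right)]
    return mid, list(mid)


# A source hard-panned to the left, as on the discrete stereo records
# mentioned in the talk: the right channel is silent.
signal = [0.5, -0.25, 1.0, 0.0]
left, right = signal, [0.0] * len(signal)

panned_mid, panned_side = mid_side_energy(left, right)
mono_mid, mono_side = mid_side_energy(*mono_up(left, right))

print(f"hard-panned: mid={panned_mid}, side={panned_side}")
print(f"mono'd up:   mid={mono_mid}, side={mono_side}")
```

For the hard-panned source the side signal carries as much energy as the mid signal, so there are two streams' worth of information to encode. After collapsing to mono the side signal is exactly zero, which is the saving, and also why the placement of discrete stereo material changes completely.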
And I'm sure that's part of the pre-encoding of a lot of
this stuff, especially at the lower bit rates.
So that'll happen.
And it's not that big a deal on modern pop stuff because
stuff is everywhere, but any of the Beatles stuff, all the
old Motown stuff, the Blue Note stuff, that is all
discrete stereo, and it will change it completely.
AUDIENCE: [INAUDIBLE]
are you thinking about [INAUDIBLE]
ANDREW SCHEPS: No.
No, I refuse to.
So here's my theory, is that I need to make my records sound
as good as I can make them sound regardless of what
happens afterwards.
So then, when I realized what was happening afterwards, I
asked the Recording Academy to let me come and talk about it.
And they said, sure, we've been trying to figure--
because they've had this Quality Sound Matters
initiative officially for a little over a year, but
unofficially for the last 10 years.
And they've had ideas about--
we're going to get buses and put awesome sound systems in
them, and we're going to drive them around and play this
stuff for people, and trying to come up with ways to let
people hear what the difference is so that you can
start to understand.
So when I came up with this presentation as a way to do
it, they were all over it and have allowed me
to come and do it.
So my idea is to find out what's actually
important, and change it.
I refuse to live with the crap, and just say, I got to
make it work on earbuds, because in five years, it
won't be earbuds.
And the pipes will be bigger, and you guys will flip a
switch, and it's going to be either uncompressed or barely
compressed.
And so now, I've changed my whole workflow to cater to
something that goes away.
And it's one of--
not to talk about your neighbors-- but it's one of
the biggest problems I have with Apple conceptually, is
that they will talk a lot about what they want to get
from the labels and from the artists in terms of their
ingestion, and they want 24 bit, and they
want the high res.
But if I master specifically for their encoder right now,
in three weeks, if they say, bandwidth is awesome.
We're going to start selling 320 AACs.
Well, now it's a new encoder, or they just
update their encoder.
All of a sudden, I'm making decisions based on
things that go away.
And I think it's a very big difference between the record
making process and the consumer distribution world,
and you can't make records for the consumer distribution
world other than a lot of the analog limitations we used to
have to deal with.
Like you can't pan your bass off to one side if there's a
lot of low end, and still cut vinyl.
OK, like there are physical limitations to things which
I'm fine with.
And AM radio--
they shave off the top and the bottom, and it's mono.
OK, that's fine, I know what's going to happen.
But in terms of taking some sort of encoding algorithm
that's constantly being updated-- otherwise, some
people in this room would be out of a job--
I can't work for that because it's a moving target.
So my idea is if I make it sound great, it will survive
the process better.
And that, I've actually found is true.
Like this Blue Note stuff sounds so amazing and so
natural that you can start to hear things get hashy and it's
a little more annoying and a little brash, and the panning
isn't as wide.
But musically, it's still pretty awesome, and it's OK.
And it survives better.
And strangely, a lot of the urban music survives better,
because there's lots of separation between the
instruments.
Things are very discretely encompassed in terms of their
frequencies and things like that.
They're not sharing a lot of space.
You don't have 15 microphones on a drum kit that are all
making noise.
So that actually translates better.
And strangely, there is zero hip hop or R&B that I was able
to get, other than the Espreranza Spalding
record, in high res.
It doesn't exist.
CD is as high as it goes.
They turn in masters that are 44.1-16, because they're
building it on a laptop.
And they're actually building their tracks with MP3s.
AUDIENCE: [INAUDIBLE]
compressed not in bytes but making the lowest part of the
music-- the softest one-- high.
So if people start doing that [INAUDIBLE]
what's the point in going to high res?
ANDREW SCHEPS: Well, I would argue that even something that
doesn't have a whole lot of dynamic range, you will still
absolutely hear the difference when you have a very lossily
encoded file.
You start to destroy things other than just the dynamic
range, right?
There's frequency content, there's panning content,
there's the mono versus stereo content,
there's depth of field.
There are all of the cues that are being taken away, all the
acoustic cues and reverb tails and things like that.
And that will affect it even if it's super loud.
I mean, there's this whole thing called the loudness war,
which maybe you know about, but they just like--
I won that war, OK?
I mixed "Death Magnetic," which was the album that
everybody said was the poster child for things
being way too loud.
OK, so I won.
Therefore, the war is over, we don't have to worry about it.
[LAUGHTER]
ANDREW SCHEPS: I spent weeks reencoding for iTunes and
Amazon at that time to make those
files work lossily encoded.
So what happens when you start to get rid of dynamic range
and things like that is you start to break the encoders.
The encoders need some room to work.
So I'm making it very difficult for that to work.
And one of the things we found that worked great was turn the
mix down 0.7 dB, period.
Just let there be headroom that we never even use,
because it's brick wall right there.
We never get up to that last 0.7, but all of a sudden, all
of the encoders sounded about 100 times better.
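To put the 0.7 dB figure in concrete terms, a level change in decibels converts to a linear gain of 10^(dB/20). The sketch below applies that standard formula; the function names are hypothetical, and scaling a 16-bit full-scale peak is just one way to picture the headroom.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier back to dB."""
    return 20 * math.log10(gain)


gain = db_to_gain(-0.7)
full_scale_16bit = 32767
new_peak = int(full_scale_16bit * gain)

print(f"-0.7 dB is a gain of {gain:.4f}")
print(f"a full-scale 16-bit peak of {full_scale_16bit} drops to {new_peak}")
print(f"headroom left above the new peak: {gain_to_db(full_scale_16bit / new_peak):.2f} dB")
```

The gain works out to roughly 0.92, so the mix gives up about 8% of peak amplitude to leave room above a brick-walled signal.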
When we got to give them 24-bit files for the last
Chili Peppers record-- that was right at the beginning of
the Mastered for iTunes project at Apple-- and the big
crux of that project is give us 24-bit files instead of
16-bit files.
That made a huge difference.
So in terms of what you feed the encoder, it isn't just
about the source material in terms of a sonic thing.
Because I think there are lots of hardcore and punk albums
that, from a sonic audiophile point of view, sound terrible.
But they are so super exciting that people love those bands
and they want to listen to them.
And if you do a 128 MP3 of that album, what used to be
hashy and exciting is now just hashy and noisy, and I think
there are lots of people who wouldn't get into the band as
much as they would even if they buy it on a cassette,
which doesn't have anything above 12k on it, or
something like that.
So there are two very different aesthetic paths you
can take when you talk about the music.
And the problem is, it's not like with TV.
Right, with TV, who is going to argue that a high def set
looks worse than an SD set?
Because you see it, and it's easy to A-B.
Some people like the artifacts and you're used
to things like that.
And if you have a bad digital set that pixelates,
there can be issues.
And if you look at bad material on an HD set, it
looks terrible.
OK, so all those arguments are true.
But let's say you have a well-captured still image, and
you show it on these two different TVs.
One of them has way more information about it and it
just looks a hell of a lot better, the
other one does not.
Whereas with audio, people don't trust what they hear.
People think you have to be trained to like something
better when you just talk about audio formats.
And people believe what they're told, period.
I mean, nothing influences your opinion about things more
than me telling you how great it is, right?
If someone's about to play you something by a certain band
and you like them, and they say, I can't stand this band,
check it out, you will not like that band.
If they say this is my favorite band in the whole
world, you're going to try really, really hard to like
that band because you like that person.
So there's so much that goes into liking music that has
nothing to do with any of this, but it also has
everything to do with it, because I really believe that
there are just thresholds.
And for every person listening to a new piece of music,
there's a threshold of, am I going to like it?
Am I not going to like it?
And the more you can give them something that sounds true to
whatever the artist decided was done, the lower that
threshold will be, and the easier it is to connect.
So regardless, let's listen to some stuff, unless you want to
keep talking.
AUDIENCE: So when did you do that, the Death Metallic?
ANDREW SCHEPS: The Metallica mix? "Death Magnetic"?
That was--
I don't know, six years ago, seven years ago?
AUDIENCE: What made you [INAUDIBLE]
ANDREW SCHEPS: What made me destroy it?
AUDIENCE: Yeah.
ANDREW SCHEPS: OK.
That is a conversation that is not--
I mean, really the only thing I would say about that is I
have nothing to say about that.
The idea that me as an engineer could mix a record in
such a way that was destroyed, but everybody would be OK with
it and let it out into the world is just crazy.
There is a band involved, there are producers involved,
there are plenty of people involved who
said, this is awesome.
Now during the process, whether or not I made quieter
mixes to A/B, and let them hear differences and whatever,
I may have done, but it's irrelevant.
It's irrelevant.
What happens is at the end of the day, that album sounds the
way it does because that's what the band and the producer
thought was great.
And there's some people who really don't like the way it
sounds, but there are a lot of people I've talked to who
think it sounds awesome.
It's super aggressive.
It's not the most hi-fi thing in the world, but a lot of
stuff I do is not hi-fi.
But I hope that it's emotionally awesome, and makes
you love it, and makes you want to either kick a hole in
the wall or cry or call your mom or whatever it is that
we're trying to get across.
So this discussion in terms of what you do with that file
afterwards is also very different from the audio
quality in the audiophile sense.
There are lots and lots of records that if you go to one
of the big consumer electronics shows where they
have a million dollar set-up where a speaker this size will
cost you $85,000 each, and has iridium tweeters.
And you've got a stand for the turntable that costs more than
your house-- that kind of thing.
You can only listen to audiophile stuff on there, right?
And so what are you going to hear?
You're going to hear a few jazz records and Steely Dan,
and that's kind of it.
And those are great records, and they're also amazing
sounding records.
But if you put on something like the Metallica record on
there, at that point, maybe some of that's wasted.
But it's not because you're putting on a low bit rate MP3.
OK, another thing just anecdotally--
and we will listen more.
I'm sorry.
I will talk about this for days.
But while I was putting all these files together, I had
this massive folder of files, and I'm keeping things organized,
and making sure things are named.
And I was just listening on my laptop speakers.
First of all, I'm letting the OS do sample rate conversion
in real time.
Right, whatever QuickTime has, that's what happened, so it
can play back at whatever sample rate the stuff was set
to, which is probably 44.1.
And I'm just listening to the first 25 seconds of each song,
making sure they're all the right song.
I can tell the difference in my laptop speakers.
So I bring this set up because it's cool, and we've got a
room this big.
And if I played stuff on my laptop, no one can hear it.
So this helps.
But if you have any sort of decent kind of system-ish that
has some good DSP on the back end to make it sound pretty
good, and it's got a little bit of power so some of the
dynamics come through, I think you absolutely will hear the
difference.
And even more than that, you'll feel the difference.
One of them is just more fun to listen to.
But that's a discussion that could go for weeks, and
there's no necessarily right answer.
But the good thing is, I won the war, so the war is over.
So now we can all make quiet records again.
Yeah?
AUDIENCE: So there's a new standard from the ITU to set
record loudness levels.
Are you following that at all?
ANDREW SCHEPS: Well, what those are as far as I
understand, and correct me if I'm wrong, that's what's used
in the Apple Sound Check, as well, where you scan a record
to say how loud it is, and then it uses it to even out
the level if you take advantage of that in whatever
playback system you're using.
Is that--
OK.
So basically--
AUDIENCE: [INAUDIBLE] a little different
from the ITU's standard.
So there's different, competing implementations.
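The "scan a record, then even out the level" idea can be sketched roughly as follows. This is a deliberately simplified illustration that measures plain RMS level; it is not the ITU measurement (which adds frequency weighting and gating) and not Apple's Sound Check algorithm. The function names and the -16 dB target are assumptions made for the example.

```python
import math


def rms_dbfs(samples):
    """Plain RMS level in dB relative to full scale (no weighting, no gating)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def normalization_gain_db(samples, target_db=-16.0):
    """Playback gain (in dB) that would bring the track to the target level."""
    return target_db - rms_dbfs(samples)


# A hypothetical "loud" track: a full-scale square wave (0 dBFS RMS).
loud = [1.0, -1.0] * 500
# A hypothetical "quiet" track: a low-level sine wave.
quiet = [0.1 * math.sin(2 * math.pi * i / 100) for i in range(1000)]

print(f"loud track gain:  {normalization_gain_db(loud):+.2f} dB")
print(f"quiet track gain: {normalization_gain_db(quiet):+.2f} dB")
```

The loud track gets turned down and the quiet one turned up, so both play back at a similar level and the louder master gains no advantage.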
ANDREW SCHEPS: Again, I don't.
I mean, if we got into my mix process--
which I could talk for a different set
of hours about that--
my mixes are what sounds good.
And sometimes, the level of the mix really doesn't matter.
But a lot of times, it does.
And I mix on analog equipment which has voltage rails.
So as I hit that rail, I don't just cut it off.
It smooshes it off, and it takes a while to smoosh it off
completely.
And different amounts of that smooshing differ.
And it's just because I'm in the analog world, so clipping
and harmonic distortion are your friend until they're not
your friend, and something catches on fire.
So when I'm mixing something like the AFI record I just
mixed or Black Sabbath record, those mixes are going to be
loud because they don't really sound right until they're loud.
But when I make something like [INAUDIBLE]
which is on my label or jazz record--
I mixed a Jeff Babko record last year--
those end up being much quieter mixes, because I want
it to be more open, and the dynamic range
really helps the music.
So for me, it's much more a feel thing.
And then I find out later that I've kind of screwed up, and
the mastering guy gets angry.
And then I will send the quieter mix and say, if you
get it to sound as good as my one that you say is too loud,
then we're good.
But if it doesn't feel as good, then we have to go with
my screwed up mix.
So I'm not the best person with that.
There are a lot more technical mixers than me who adhere to
things more than I do.
I'm kind of a disaster with that.
Yeah?
AUDIENCE: What's your take on [INAUDIBLE]
Pandora, [INAUDIBLE]?
ANDREW SCHEPS: OK.
So streaming, I mean, the filetypes are the same, right?
And on that chart, I had bit rates for the streaming files.
So I have no problem with streaming versus download.
I mean, there's a whole other conversation which is about
making the music business still exist.
And that's actually a really important conversation, and
encompasses way more than just this.
This is the esoteric, I think this makes a difference part.
Then there is the recording album credits part, which is a
discussion I'm hoping we're having tomorrow a little bit--
implementation of that, getting consumers to interact
directly with artists more, because that's what creates
the relationships that last so that I don't have to go get
another day job.
That's my goal in all of that.
In terms of just the audio, though, the streaming and not
is exactly the same thing.
So actually let me plug the monitor back in.
And let me show you one other thing, which is a technology.
It's called OraStream.
Does anyone in here know about OraStream?
So we've got one, because you were there last time.
Does anyone here know about the MP4 SLS format?
It's another Fraunhofer encoding format.
So it's meant to be an archival strength format.
So what it does is it will wrap audio in its own
metadata, and preserve whatever the native bit rate
and sample rate is of that audio.
But one of the byproducts it has is you can do what they
call truncating of the stream to produce in real time any
bit rate stream you want.
So what OraStream have done is they've come up with all of
the server side and back end technology to do pinging of
your connection in real time, and to granularly scale.
So the Google Play Music--
you've got three bit rates.
You check out how fast people are able to get the stuff, and
you give them the fastest one you think they can get without
any buffering, right?
Because buffering sucks.
No one wants their music to stop.
But that's what you do, right?
And you will skip between those levels.
So if when you start playing a song, you're in a black hole,
even though you're listening on your cell phone.
You're in a parking garage.
You're going to start off at a very low bit rate.
Now are you constantly pinging, and you'll up the bit
rate as soon as you can?
Or do you wait for the next song?
AUDIENCE: You wait for it.
ANDREW SCHEPS: You wait for the next song.
OK.
And this is technology-- these guys, I mean, they probably
had meetings here, I don't know, with anyone in the room.
But originally, it's a few guys from Singapore who
developed the technology.
And they were hoping someone else would just license it,
because they thought it was awesome, and why wouldn't
people want to do this?
So what they do is they're pinging constantly, and the
bandwidth will change.
And it plays back in HTML5 using a WebSocket, and it
plays back on iOS and Android via an app, because MP4 SLS
isn't supported directly in the OS of
anybody's computer yet.
So let me just quickly go to my account here.
And for audiophile people, by the way, this
is an awesome service.
So as a listener, it's like a Dropbox that can stream your
audio to you.
So you can get a free 1 gigabyte
account, I think it is.
Or you can pay $5 a month for 5 gig.
You can pay a little bit more for 10 gig or 50 gig, or
something like that.
You upload your lossless music to the service, and you can
immediately stream it anywhere in the world on any platform.
So it's their version of a cloud iPod.
But here is what's awesome about it.
So let's see, where are all of my playlists?
Here, let's stream something that's kind of--
oh, here we go.
Come on.
OK, so I've got some of the same songs here.
But here's what's important is see it right up at the top,
below the scroll.
What do you call it?
What's the official name for that, the progress bar with
the thing in it?
You know, it's the position bar thing.
OK, so watch what happens.
So everybody heard the song come out from under the water,
and start sounding good?
Here's one that is not a hi-fi recording.
Oh, this is a band from Austin who are the most exciting show
I've ever seen.
And I signed them to my label.
They made a record in two days.
I mixed it in one day.
It's psychedelic rock stuff.
It's not the most hi-fi thing in the world,
but this is at 96/24.
And again, just watch the bit rate if you can see it.
So we're going to start off at 128 because there's a cache.
OK, so we're just streaming 96/24.
And if you do the math and figure out the bit rate, the
number will always be a little lower, because the last part
of the decoding happens at the WebSockets, so you don't
actually need to give the full bit rate.
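For reference, "doing the math" on an uncompressed PCM stream is just sample rate times bit depth times channel count. A minimal sketch, assuming two-channel stereo (the function name is made up for illustration):

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Raw (uncompressed) PCM bit rate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


print(pcm_bit_rate_kbps(44_100, 16))   # CD quality: 1411.2
print(pcm_bit_rate_kbps(96_000, 24))   # 96/24: 4608.0
print(pcm_bit_rate_kbps(192_000, 24))  # 192/24: 9216.0
```

These are the ceilings for raw audio; a losslessly compressed stream of the same material comes in lower, which fits with the displayed number sitting below the raw figure.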
So the drawback is if you compare a 256 stream from MP4 SLS
to a 256 encoded MP3 or AAC, the MP4 SLS will not sound as
good, because it's not optimized for that bit rate.
But I've never had to listen to 256 with this.
Wandering around on the 4G or 3G that I get off AT&T, I'm CD
quality all the time.
And as you go from the cell network onto your
wi-fi, it jumps up.
And it's seamless, and it works in real
time, and it's awesome.
So this is another example of, I think, where stuff can go
where you still get the convenience of things having
to start playing immediately, which I totally get.
You don't want to start streaming CD quality audio to
people on crappy cell connections.
But if you can hit Play immediately, then realize
they're not on a crappy cell connection and be CD quality
within the first few bars of a song, and when they jump on a
wi-fi network, be up at audiophile quality,
that's pretty cool.
So hopefully, this is sort of where some
things will get headed.
And it's one of many possibilities.
But if anyone's interested in talking to the Ora guys, please
get in touch with me, because they've set it up where now
it's a lockbox service for people who want to just upload
their own stuff.
I can sell my artists' albums through there, download as
individual apps.
So they have a business model, but they're also always
looking for partners.
When Neil Young released his last record, and everyone has
heard of the Pono system that he's touting, which is a
hardware-based high res audio system?
The Warner Brothers wanted to stream his record for a week
before it came out, because that's what record labels do
now is give you a free stream.
And he said, yeah, that's fine, as long as it streams at
192/24, which of course, that's not going to happen.
So they got the Ora guys to do it, and they actually did it.
And they were streaming about 5 terabytes an hour all over
the world of people who wanted to listen.
And if they were on their mobile browser, they were
probably getting maybe CD quality.
But if they were on a computer hooked up to a stereo, they
could listen to his album at 192/24.
And again, granularly scaling, so if there's any little blip
in the traffic, or if your buddy starts streaming a movie
down the hall, you granularly dip, so it's
not a stepping dip.
So in terms of the listening experience, it's a lot less
intrusive, because you dip down and come back up.
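The difference between stepping between a few fixed bit rates and scaling granularly can be sketched like this. It is an illustrative model only: the tier values, the 90% safety margin, and the function names are all assumptions, and it says nothing about how OraStream or any streaming service actually measures bandwidth.

```python
TIERS_KBPS = [128, 256, 320]  # hypothetical fixed tiers


def stepped_rate(bandwidth_kbps, tiers=TIERS_KBPS):
    """Pick the highest fixed tier that fits; fall back to the lowest tier."""
    fitting = [t for t in tiers if t <= bandwidth_kbps]
    return max(fitting) if fitting else min(tiers)


def granular_rate(bandwidth_kbps, floor_kbps=128, ceiling_kbps=9216, safety=0.9):
    """Scale continuously with measured bandwidth, clamped to [floor, ceiling]."""
    target = bandwidth_kbps * safety
    return max(floor_kbps, min(ceiling_kbps, target))


# Hypothetical bandwidth measurements over time, in kbps.
for bandwidth in [100, 300, 1600, 1500, 12000]:
    print(bandwidth, stepped_rate(bandwidth), round(granular_rate(bandwidth)))
```

With fixed tiers, a small drop in bandwidth either changes nothing or knocks the stream down a whole level. With granular scaling, the stream gives up only as much bit rate as the connection lost.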
Anyway, so that's OraStream.
Yeah?
AUDIENCE: There's one form that you haven't mentioned a
single time.
I was wondering [INAUDIBLE]
DSD?
ANDREW SCHEPS: OK, so DSD, just really quickly, is
basically 1 bit encoding at a megahertz level.
So instead of taking this grid and putting it over, many,
many, many more times a second than you would on a PCM
encoding, you say, what's the voltage?
Is it higher or lower than last time?
And you use your 1 bit--
this is the dumb version--
say, yeah, it's higher, it's higher, it's higher, it's
higher, now it's lower, it's lower.
So you're basically tracing the waveform very, very
quickly as it goes.
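The "dumb version" described here, one bit per sample saying higher or lower than last time, is essentially plain delta modulation. Below is a minimal sketch of that simplified idea. It is not how real DSD encoders work (those use sigma-delta modulation with noise shaping), and the step size and sample counts are arbitrary illustrative choices.

```python
import math


def delta_encode(samples, step):
    """Emit 1 if the input is above the running estimate, else 0."""
    bits, estimate = [], 0.0
    for x in samples:
        bit = 1 if x > estimate else 0
        estimate += step if bit else -step  # move the estimate toward the input
        bits.append(bit)
    return bits


def delta_decode(bits, step):
    """Rebuild the waveform by stepping up on 1 and down on 0."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out


# One cycle of a sine wave, heavily oversampled (hypothetical numbers).
n = 2000
signal = [0.8 * math.sin(2 * math.pi * i / n) for i in range(n)]
step = 0.005

bits = delta_encode(signal, step)
rebuilt = delta_decode(bits, step)
worst = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(f"max tracking error: {worst:.4f}")
```

Because the signal is sampled so often that it never moves more than one step between samples, the one-bit stream can follow it closely. That is why the sample rate has to be in the megahertz range.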
The only problem is-- the reason I don't mention it is
because until about a week ago, there was no viable
consumer format.
And now there is one site that is actually selling DSD audio
files that you can download.
And it's even more cumbersome to get a player to work.
Now in terms of audio quality, listening to DSD versus high
res PCM encoding, I haven't gotten to do an A/B test, but a
lot of people love it, think it sounds absolutely amazing.
It's a very different way to encode music.
It's awesome.
I try to only cover established consumer formats
during this, because that's what's out there.
And there's no way I can distribute
anything DSD right now.
It's impossible.
AUDIENCE: And it would be hard for you to edit it.
ANDREW SCHEPS: It's almost impossible.
There's one system that allows you to do multi-track editing,
and it's really expensive, and their software sucks.
So I can edit, but it would not be good.
So again, obviously, there's always the ability to work
versus what would be best.
[APPLAUSE]
Andrew Scheps, "Lost in Translation: Audio Quality in Streaming Media" | Talks at Google

阿絑 published on October 5, 2014