
  • Sparks: Good afternoon. My name's Dave Sparks.

  • I'm on the Android team.

  • And I'm the technical lead for the multimedia framework.

  • I've been working on Android since October of 2007.

  • But actually, technically, I started before that,

  • because I worked on the MIDI engine that we're using.

  • So I kind of have a long, vested interest

  • in the project.

  • So today, we have kind of an ambitious title,

  • called "Mastering the Media Framework."

  • I think the reality is that if you believe that--

  • that we're going to do that in an hour,

  • it's probably pretty ambitious.

  • And if you do believe that,

  • I have a bridge just north of here

  • that you might be interested in.

  • But I think we actually will be able to cover

  • a few kind of interesting things.

  • In thinking about this topic,

  • I wanted to cover stuff that wasn't really available

  • in the SDK, so we're really going to delve

  • into the lower parts of the framework,

  • the infrastructure that basically

  • everything's built on.

  • Kind of explain some of the design philosophy.

  • So...

  • Oh, I guess I should have put that up first.

  • Here we go.

  • So on the agenda,

  • in the cutesy fashion of the thing,

  • we're talking about the architecture--

  • Frank Lloyd Android.

  • What's new in our Cupcake release,

  • which just came out recently.

  • And those of you who have the phone,

  • you're running that on your device today.

  • And then a few common problems that people run into

  • when they're writing applications for the framework.

  • And then there probably will be

  • a little bit of time left over at the end

  • for anybody who has questions.

  • So moving along,

  • we'll start with the architecture.

  • So when we first started designing the architecture,

  • we had some goals in mind.

  • One of the things was to make development

  • of applications that use media, rich media applications,

  • very easy to develop.

  • And so that was one of the key goals

  • that we wanted to accomplish in this.

  • And I think you'll see it as we look at the framework.

  • It's really simple to play audio,

  • to display a video, and things like that.

  • One of the key things, because this is

  • a multi-tasking operating system,

  • is that you could potentially have

  • things happening in the background.

  • For example, you could have a music player

  • playing in the background.

  • We need the ability to share resources

  • among all these applications,

  • and so one of the key things

  • was to design an architecture

  • that could easily share resources.

  • And the other thing is, you know,

  • paramount in Android is the security model.

  • And if you've looked over the security stuff--

  • I'm not sure we had a talk today on security.

  • But security is really important to us.

  • And so we needed a way to be able to sandbox

  • parts of the application that are--

  • that are particularly vulnerable,

  • and I think you'll see as we look at the--

  • the framework, that it's designed

  • to isolate parts of the system

  • that are particularly vulnerable to hacking.

  • And then, you know, providing a way

  • to add features in the future

  • that are backwards compatible.

  • So that's the-- the room for future growth.

  • So here's kind of a 30,000-foot view

  • of the way the media framework works.

  • So on the left side, you'll notice

  • that there is the application.

  • And the red line-- red dashed line there--

  • is denoting the process boundary.

  • So applications run in one process.

  • And the media server actually runs

  • in its own process that's

  • brought up during boot time.

  • And so the codecs

  • and the file parsers and the network stack

  • and everything that has to do with playing media

  • is actually sitting in a separate process.

  • And then underneath that are the hardware abstractions

  • for the audio and video paths.

  • So Surface Flinger is the abstraction for video

  • and graphics.

  • And Audio Flinger's the abstraction for audio.

  • So looking at a typical media function,

  • there's a lot of stuff--

  • because of this inter-process communication that's going on,

  • there's a lot of things that are involved

  • in moving a call down the stack.

  • So I wanted to give you an idea--

  • for those of you who've looked at the source code,

  • it's sometimes hard to follow, you know, how is a call--

  • A question that comes up quite frequently

  • is how does a function call, like, you know,

  • prepare, make its way

  • all the way down to the framework

  • and into the-- the media engine?

  • So this is kind of a top-level view

  • of what a stack might look like.

  • At the very top is the Dalvik VM proxy.

  • So that's the Java object that you're actually talking to.

  • So, for example, for a media player,

  • there's a media player object.

  • If you look at the media player definition,

  • it's a pretty-- I mean,

  • there's not a lot of code in Java.

  • It's pretty simple.

  • And basically, it's a proxy for--

  • in this case, actually, the native proxy,

  • which sits underneath it, and then eventually,

  • the actual implementation.

  • So from that, we go through JNI,

  • which is the Java Native Interface.

  • And that is just a little shim layer

  • that's static bindings

  • to an actual MediaPlayer object.

  • So when you create a MediaPlayer in Java,

  • what you're actually doing is making a call

  • through this JNI layer

  • to instantiate a C++ object.

  • That's actually the MediaPlayer.

  • And there's a reference to that that's held in the Java object.

  • And then some tricky stuff--

  • weak references to garbage collection

  • and stuff like that, which is a little bit too deep

  • for the talk today.

  • Like I said, you're not going to master the framework today,

  • but at least get an idea of what's there.

  • So in the native proxy,

  • this is actually a proxy object for the service.

  • So there is a little bit of code in the native code.

  • You know, a little bit of logic in the native code.

  • But primarily, most of the implementation

  • is actually sitting down in this media server process.

  • So the native proxy is actually the C++ object

  • that talks through this binder interface.

  • The reason we have a native proxy

  • instead of going directly through JNI

  • the way a lot of the other pieces of the framework do,

  • is that we wanted to be able to provide

  • access to native applications in the future

  • to use MediaPlayer objects.

  • So it makes it relatively easy,

  • because that's something you'd probably want to do

  • with games and things like that

  • that are kind of more natural to write in native code.

  • We wanted to provide the ability to do that.

  • So that's why the native proxy sits there

  • and then the Java layer just sits on top of that.

  • So the binder proxy and the binder native piece--

  • Binder is our abstraction for inter-process communication.

  • Binder, basically, what it does,

  • is it marshals objects across this process boundary

  • through a special kernel driver.

  • And through that, we can do things like move data,

  • move file descriptors that are duped across processes

  • so that they can be accessed by different processes.

  • And we can also share memory

  • between processes.

  • And this is a really efficient way of moving data

  • back and forth between the application

  • and the media server.

  • And this is used extensively

  • in Audio Flinger and Surface Flinger.

  • So the binder proxy is basically the marshalling code

  • on the applications side.

  • And the binder native code is the marshalling code

  • for the server side of the process.

  • And if you're looking at all the pieces

  • of the framework-- they start with

  • mediaplayer.java, for example--

  • there's an android_media_MediaPlayer.cpp,

  • which is the JNI piece.

  • There's a mediaplayer.cpp,

  • which is the native proxy object.

  • Then there's an imediaplayer.cpp,

  • which is actually a-- a binder proxy

  • and the binder native code in one chunk.

  • So you actually see the marshalling code

  • for both pieces in that one file.

  • And one is called a BpMediaPlayer object

  • and the other a BnMediaPlayer object.

  • So when you're looking at that code,

  • you can see the piece that's on the native side--

  • the server side and the proxy.

  • And then the final piece of the puzzle

  • is the actual implementation itself.

  • So in the case of the media server--

  • sorry, the MediaPlayer-- there's a MediaPlayer service

  • which instantiates a MediaPlayer object

  • in the service that's, you know, proxied

  • in the application by this other MediaPlayer object.

  • That's basically-- each one of the calls

  • goes through this stack.

  • Now, because the stack is, you know, fairly lightweight

  • in terms of we don't make a lot of calls through it,

  • we can afford a little bit of overhead here.

  • So there's a bit of code that you go through

  • to get to this place, but once you've started playing,

  • and you'll see this later in the slides,

  • you don't have to do a lot of calls

  • to maintain the application playing.

  • So this is actually kind of a top-level diagram

  • of what the media server process looks like.

  • So I've got this media player service.

  • And it can instantiate a number of different players.

  • So on the left-hand side, you'll see, bottom,

  • we have OpenCORE, Vorbis, and MIDI.

  • And these are three different media player types.

  • So going from the simplest one, which is the Vorbis player--

  • Vorbis basically just plays Ogg Vorbis files,

  • which is a-- we'll get into the specifics

  • of the codec, but it's a psycho-acoustic codec

  • that's open sourced.

  • We use this for a lot of our internal sounds,

  • because it's very lightweight.

  • It's pretty efficient.

  • And so we use that for our ringtones

  • and for our application sounds.

  • The MIDI player, a little more complex.

  • But basically, it's just another instantiation

  • of a media player.

  • These all share a common interface,

  • so if you look at the MediaPlayer.java interface,

  • there's almost, you know, a one-for-one correspondence

  • between what you see there and what's actually happening

  • in the players themselves.

  • And then the final one is OpenCORE.

  • So anything that isn't an Ogg file or a MIDI file

  • is routed over to OpenCORE.

  • And OpenCORE is basically the-- the bulk of the framework.

  • It consists of all of the major codecs,

  • like, you know, MP3 and AAC and AMR

  • and the video codecs, H.263 and H.264 AVC.

  • So any file that's not specifically one of those two

  • ends up going to OpenCORE to be played.

  • Now, this provides some extensibility.

  • The media player service is smart enough

  • to sort of recognize these file types.

  • And we have a media scanner that runs at boot time--

  • that goes out, looks at the files,

  • figures out what they are.

  • And so we can actually, you know, replace or add

  • new player types by just instantiating

  • a new type of player.

  • In fact, there are some projects out there

  • where they've replaced OpenCORE with GStreamer

  • or other media frameworks.

  • And we're talking to some other--

  • some different types of player applications

  • that might have new codecs and new file types,

  • and that's one way of doing it.

  • The other way of doing it is you--

  • if you wanted to add a new file type,

  • you could actually implement it inside of OpenCORE.

  • And then on the right-hand side,

  • we have the media recorder service.

  • Prior to-- in the 1.0, 1.1 releases,

  • that was basically just an audio record path.

  • In Cupcake, we've added video recording.

  • So this is now integrated with a camera service.

  • And so the media recorder-- again, it's sort of a proxy.

  • It uses the same sort of structure,

  • where there's a MediaRecorder object

  • in the Java layer.

  • And there's a media recorder service

  • that actually does the recording.

  • And for the actual authoring engine,

  • we're using OpenCORE.

  • And it has the-- the encoder side.

  • So we've talked about the decoders,

  • and the encoders would be H.263, H.264 AVC,

  • and also MPEG-4 SP.

  • And then, the audio codecs.

  • So all those sit inside of OpenCORE.

  • And then the camera service both operates

  • in conjunction with the media recorder

  • and also independently for still images.

  • So if your application wants to take a still image,

  • you instantiate a camera object,

  • which again is just a proxy for this camera service.

  • The camera service takes care of handling preview for you,

  • so again, we wanted to limit the amount of traffic

  • between the application and the hardware.

  • So this actually provides a way for the preview frames

  • to go directly out to the display.

  • Your application doesn't have to worry about it,

  • it just happens.

  • And then in the case where the media recorder

  • is actually doing video record,

  • we take those frames into the OpenCORE

  • and it does the encoding there.
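
A minimal sketch of driving the camera service from an application for preview and a still capture; the SurfaceHolder and the callback body are placeholders for illustration:

```java
import android.hardware.Camera;
import android.view.SurfaceHolder;
import java.io.IOException;

public class StillCaptureSketch {
    private Camera camera;

    // Start sending preview frames straight to the given surface.
    void startPreview(SurfaceHolder holder) throws IOException {
        camera = Camera.open();            // proxy for the camera service
        camera.setPreviewDisplay(holder);  // preview frames bypass the application entirely
        camera.startPreview();
    }

    // Grab a still image; the JPEG bytes arrive via callback.
    void capture() {
        camera.takePicture(null, null, new Camera.PictureCallback() {
            public void onPictureTaken(byte[] jpeg, Camera cam) {
                // write the JPEG bytes to storage, then restart the preview if needed
            }
        });
    }

    void shutdown() {
        camera.stopPreview();
        camera.release();                  // give the camera back to other applications
    }
}
```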

  • So kind of looking at what a media playback session

  • would look like.

  • The application provides three main pieces of data.

  • It's going to provide the source URI.

  • The "where is this file coming from."

  • It'll either come from a local file that's on the--

  • you know, on the SD card.

  • It could come from a resource

  • that's in the application, the .apk,

  • or it could come from a network stream.

  • And so the application provides that information.

  • It provides a surface, which at

  • the application level is called a SurfaceView.

  • This, at the binder level, is an ISurface interface,

  • which is an abstraction for the--the view that you see.

  • And then it also provides the audio stream type,

  • so that the hardware knows where to route the audio.

  • So once those have been established,

  • the media server basically takes care of everything

  • from that point on.

  • So you--once you have called the prepare function

  • and the start function,

  • the frames--video frames, audio frames, whatever, are--

  • they're going to be decoded inside the media server process.

  • And they get output directly to either Audio Flinger

  • or Surface Flinger, depending on whether

  • it's an audio stream or a video stream.

  • And all the synchronization is handled for you automatically.

  • Again, it's a very low overhead.

  • There's no data that's flowing back up to the application

  • at this point--it's all happening inside the hardware.
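
A minimal sketch of that playback session in Java; the URI and the SurfaceHolder here are placeholders for illustration:

```java
import android.media.AudioManager;
import android.media.MediaPlayer;
import android.view.SurfaceHolder;
import java.io.IOException;

public class PlaybackSketch {
    void play(SurfaceHolder holder) throws IOException {
        MediaPlayer mp = new MediaPlayer();
        // 1. Where the data comes from: a file, a resource, or a network stream.
        mp.setDataSource("http://example.com/clip.mp4");   // placeholder URI
        // 2. The surface the decoded video frames should land on.
        mp.setDisplay(holder);
        // 3. The audio stream type, so the hardware knows where to route the audio.
        mp.setAudioStreamType(AudioManager.STREAM_MUSIC);
        // After this, decoding and A/V sync happen inside the media server process.
        mp.setOnPreparedListener(new MediaPlayer.OnPreparedListener() {
            public void onPrepared(MediaPlayer player) {
                player.start();
            }
        });
        mp.prepareAsync();   // asynchronous prepare, useful for network streams
    }
}
```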

  • One other reason for doing that

  • we mentioned earlier is that in the case--

  • in many cases, for example the G1 and the Sapphire,

  • the device that you guys got today--

  • those devices actually have hardware codecs.

  • And so we're able to take advantage

  • of a DSP that's in the device to accelerate.

  • In the case of, for example, H.264,

  • we can accelerate the video decode in there

  • and offload some of that from the main processor.

  • And that frees the processor up to do other things,

  • either, you know, doing sync in the background,

  • or just all sorts of things that it might need--

  • you might need those cycles for.

  • So again, that's-- all that is happening

  • inside the media server process.

  • We don't want to give applications direct access

  • to the hardware, so it's another good reason

  • for putting this inside the media server process.

  • So in the media recorder side,

  • we have a similar sort of thing.

  • It's a little more complex.

  • The application can either

  • create its own camera

  • and then pass that to the media server

  • or it can let the media server create a camera for it.

  • And then the frames from the camera go directly

  • into the encoders.

  • It again is going to provide a surface for the preview,

  • so as you're taking your video, the preview frames are going

  • directly to the-- to the display surface

  • so you can see what you're recording.

  • And then you can select an audio source.

  • Right now that's just the microphone input,

  • but in the future, it could be other sources.

  • You know, potentially you could be recording

  • from, you know, TV or some-- some other hardware device

  • that's on the device.

  • And then--so once you've established that,

  • the camera service will then start feeding frames

  • up to the media server

  • and then they're pushed out to the Surface Flinger

  • and they're also pushed out into OpenCORE for encoding.

  • And then there's a file authoring piece

  • that actually takes the frames from audio and video,

  • boxes them together, and writes them out to a file.
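
A minimal sketch of a video record session with the MediaRecorder class; the output path and the SurfaceHolder are placeholders, and the camera could also be supplied explicitly as described above:

```java
import android.media.MediaRecorder;
import android.view.SurfaceHolder;
import java.io.IOException;

public class RecordSketch {
    void record(SurfaceHolder previewHolder) throws IOException {
        MediaRecorder recorder = new MediaRecorder();
        recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        recorder.setVideoSource(MediaRecorder.VideoSource.CAMERA);
        // 3GPP container with H.263 video and AMR-NB audio for broad compatibility.
        recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
        recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
        recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H263);
        recorder.setOutputFile("/sdcard/demo.3gp");              // placeholder path
        recorder.setPreviewDisplay(previewHolder.getSurface());  // preview goes straight to the display
        recorder.prepare();
        recorder.start();
        // ...later: recorder.stop(); recorder.release();
    }
}
```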

  • So, get into a little more detail about the codecs.

  • We have a number of different--

  • we have three different video codecs.

  • So one of the questions that comes up a lot

  • on the forums

  • is what kind of codecs are available,

  • what should they be used for, and things like that.

  • So just kind of a little bit of history

  • about the different codecs.

  • So H.263 is a codec that came out--

  • I think it was standardized around 1996.

  • It was originally intended for video conferencing,

  • so it's really low bit-rate stuff.

  • You know, designed to go over an ISDN line

  • or something like that.

  • So it's actually worked out pretty well for mobile devices,

  • and a lot of mobile devices support H.263.

  • The encoder is pretty simple.

  • The decoder is pretty simple.

  • So it's a lightweight kind of codec for an embedded device.

  • It's part of the 3GPP standard.

  • So it's adopted by a number of different manufacturers.

  • And it's actually used by a number of existing

  • video sites-- of websites--

  • for their encode.

  • For example, YouTube-- if you go to, like,

  • the m.youtube.com,

  • typically you'll end up at an H.263 stream.

  • Because it's supported on most mobile devices.

  • So MPEG-4 SP was originally designed

  • as a replacement for MPEG-1 and MPEG-2.

  • MPEG-1, MPEG-2--fairly early standardized codecs.

  • They wanted to do something better.

  • Again, it has a very simple encoder model, similar to H.263.

  • There's just single frame references.

  • And there's some question about whether

  • it's actually a better codec or not than H.263,

  • even though they're--

  • they came out very close together.

  • It's missing the deblocking filter, so--

  • I didn't mention that before.

  • H.263 has a deblocking filter.

  • If you've ever looked at video,

  • it typically comes out in, like, 8x8 pixel blocks.

  • And you get kind of a blockiness.

  • So there's an in-loop deblocking filter in H.263,

  • which basically smooths some of those edges out.

  • The MPEG-4 SP, in its basic profile,

  • is missing that.

  • So it--the quality of MPEG-4,

  • some people don't think it's quite as good,

  • even though it came out at roughly the same time.

  • Then the final codec we support

  • is a fairly recent development.

  • I think it was 2003, or something like that,

  • when the H.264 AVC codec came out.

  • Compression's much better.

  • It includes the ability

  • to have multiple reference frames,

  • although on our current platforms,

  • we don't actually support that.

  • But theoretically, you could get better compression

  • in the main-- what's called the main profile.

  • We support base profile.

  • It has this mandatory in-loop deblocking filter

  • that I mentioned before,

  • which gets rid of the blockiness in the frames.

  • One of the really nice things

  • is it has a number of different profiles.

  • And so different devices support different levels

  • of--of profiles.

  • It specifies things like frame sizes, bit rates,

  • the--the types of advanced features

  • that it has to support.

  • And there's a number of optional features in there.

  • And basically, each of those levels

  • and profiles defines what's in those codecs.

  • It's actually used in a pretty wide range of things.

  • Everything from digital cinema, now, HDTV broadcasts,

  • and we're starting to see it on mobile devices like the G1.

  • When you do a--if you're using the device itself today,

  • and you do a YouTube playback,

  • you're actually-- on Wi-Fi,

  • you're actually getting an H.264 stream,

  • which is why it's so much better quality.

  • On the downside, it's a lot more complex than H.263

  • because it has these advanced features in it.

  • So it takes a lot more CPU.

  • And in the case of the G1, for example,

  • that particular hardware,

  • some of the acceleration happens in the DSP,

  • but there's still some stuff that has to go

  • on the application processor.

  • On the audio side, MP3 is something

  • everybody's pretty familiar with.

  • It uses what's called a psycho-acoustic model,

  • which is why we get better compression than a typical,

  • you know, straight compression algorithm.

  • So psycho-acoustic means you look for things

  • that are hidden within the audio.

  • There are certain sounds

  • that are going to be masked by other sounds.

  • And so the psycho-acoustic model

  • will try to pick out those things,

  • get rid of them, and you get better--

  • much better compression there.

  • You get approximately 10:1 compression

  • over straight linear PCM at 128 kbits per second,

  • which is pretty reasonable, especially for a mobile device.
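
A quick sanity check on that 10:1 figure, assuming a CD-quality source (44.1 kHz, 16-bit, stereo):

    44,100 samples/s × 16 bits × 2 channels ≈ 1,411 kbit/s of raw PCM
    1,411 kbit/s ÷ 128 kbit/s ≈ 11, i.e. roughly 10:1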

  • And then if you want to, you know, be a purist,

  • most people figure you get full sonic transparency

  • at about 192 kbits per second.

  • So that's where most people won't be able to hear

  • the difference between the original

  • and the compressed version.

  • For a more advanced codec,

  • AAC came out sometime after MP3.

  • It's built on the same basic principles,

  • but it has much better compression ratios.

  • You get sonic transparency at roughly 128 kbits per second.

  • So, you know, much, much better compression.

  • And another mark that people use

  • is that 128 kbits per second

  • MP3 is roughly equivalent to 96 kbits per second AAC.

  • We also find it's-- it's used, commonly used,

  • in MPEG-4 streams.

  • So if you have an MPEG-4 audio--video stream,

  • you're likely to find an AAC codec with it.

  • In the case of our high-quality YouTube streams,

  • they're typically a 96 kbits per second AAC format.

  • And then finally, Ogg Vorbis, which I'd mentioned earlier,

  • we're using for a lot of our sounds.

  • Again, it's another psycho-acoustic model.

  • It's an open source codec,

  • so it doesn't have any patent,

  • you know, issues in terms of licensing--

  • whereas any of the other codecs, if you're selling a device,

  • you need to go, you know,

  • get the appropriate patent licenses.

  • Or I probably shouldn't say that,

  • because I'm not a lawyer,

  • but you should probably see your lawyer.

  • From our perspective, it's very low overhead.

  • It doesn't bring in all of the OpenCORE framework,

  • 'cause it's just an audio codec.

  • So it uses-- it's very lightweight

  • in terms of the amount of memory usage it uses

  • and also the amount of code space

  • that it has to load in in order to play a file.

  • So that's why we use it for things like ringtones

  • and other things that need fairly low latency

  • and we know we're gonna use it a lot.

  • The other thing is that, unlike MP3--

  • MP3 doesn't have a native way of specifying a seamless loop.

  • For those of you who aren't audio experts,

  • "seamless loop" basically means

  • you can play the whole thing as one seamless,

  • no clicks, no pops loop to play over and over again.

  • A typical application for that would be a ringtone,

  • where you want it to continue playing

  • the same sound over and over again

  • without--without the pops and clicks.

  • MP3 doesn't have a way to specify that accurately enough

  • that you can actually do that without having some sort of gap.

  • There are people that have added things in the ID3 tags

  • to get around that, but there isn't

  • any standardized way to do it.

  • Ogg does it-- actually, both Ogg and AAC

  • have conventions for specifying a seamless loop.

  • So that's another reason why we use Ogg

  • is that we can get that nice seamless loop.

  • So if you're doing anything in a game application

  • where you want to get, you know, some sort of--

  • a typical thing would be like an ambient sound

  • that's playing over and over in the background.

  • You know, the factory sound or, you know,

  • some eerie swamp noises or whatever.

  • The way to do it is to use an Ogg file.

  • You'll get pretty good compression.

  • It's pretty low overhead for decoding it.

  • And you can get those loops that won't click.

  • And then finally, the last codecs

  • we're going to talk about in terms of audio

  • are the AMR codecs.

  • AMR is a speech codec,

  • so it doesn't get the full bandwidth.

  • If you ever try to encode one with music on it,

  • it will sound pretty crappy.

  • That's because it-- it wants to kind of focus in

  • on one central tone.

  • That's how it gets its high compression rate.

  • But at the same time, it throws away a lot of audio.

  • So it's typically used alongside video codecs.

  • And in fact, GSM basically is based

  • on AMR-type codecs.

  • The input sample rate,

  • for AMR narrow band, is 8 kilohertz.

  • So going back to Nyquist, that basically means

  • your highest frequency you can represent

  • is just shy of 4 kilohertz.

  • And the output bit-rates are, you know,

  • anywhere from just under 5 kbits per second up to 12.2.

  • AMR wide band is a little bit better quality.

  • It's got a 16 kilohertz input, and slightly higher bandwidths.

  • But again, it's a speech codec primarily,

  • and so you're not going to get great audio out of it.

  • We do use these, because in the package,

  • the OpenCORE package, the AMR narrow band codec

  • is the only audio encoder--

  • native audio encoder we have in software.

  • So if your hardware platform doesn't have an encoder,

  • that's kind of the fallback codec.

  • And in fact, if you use the audio recorder--

  • like attaching an audio clip in MMS--

  • this is the codec you're going to get.

  • If you do a video record today,

  • that's the codec you're going to get.

  • We're expecting that future hardware platforms

  • will provide, you know, native encoders for AAC.

  • It's a little too heavy to do AAC

  • on the application processor

  • while you're doing video record and everything else.

  • So we really need the acceleration

  • in order to do it.

  • AMR is specified in 3GPP streams.

  • So most phones that will decode an H.263

  • will also decode the AMR.

  • So it's a fairly compatible format.

  • If you look at the--the other phones that are out there

  • that support, you know, video playback,

  • they typically will support AMR as well.

  • So we've talked about codecs.

  • Both audio and video codecs.

  • The other piece of it, when you're doing a stream,

  • is what's the container format?

  • And so I'm going to talk a little bit about that.

  • So 3GPP is the stream that's defined

  • by the 3GPP organization.

  • Phones that support that standard

  • are going to support these types of files.

  • 3GPP is actually an MPEG-4 file format.

  • But it has a very restricted set

  • of things that you can put into that file,

  • designed for compatibility with these embedded devices.

  • So you really want to use an H.263 video codec

  • for--for broad compatibility across a number of phones.

  • You probably want to use a low bit rate for the video,

  • typically like 192 kbits per second.

  • And you also want to use the AMR narrow band codec.

  • For MPEG-4 streams, which we also support,

  • they're typically higher quality.

  • They typically are going to use

  • either an H.264 or a higher-- bigger size H.263 format.

  • Usually they use an AAC codec.

  • And then on our particular devices,

  • the G1 and the device that you just received today--

  • I'm not even sure what we're calling it--

  • are capable of up to 500 kbits per second

  • on the video side

  • and 96 kbits per second on the audio side.

  • So a total of about 600 kbits per second,

  • sustained.

  • If you do your encoding well,

  • you're going to actually get more than that out of it.

  • We've actually been able to do better

  • than 1 megabit per second, but you have to

  • have a really good encoder.

  • If it gets "burst-y," it will interfere

  • with the performance of the codec.

  • So one question that comes up a lot on the forums

  • is what container should I use

  • if I'm either authoring or if I'm doing video recording?

  • So for authoring for our Android device,

  • if you want the best quality--

  • the most bang for your bits, so to speak--

  • you want to use an MPEG-4 container file

  • with an H.264 encoded stream.

  • It needs to be, for these devices today,

  • a baseline profile roughly, as I was saying before,

  • at 500 kbits per second, HVGA or smaller,

  • and an AAC codec up to 96 kbits per second.

  • That will get you a pretty high quality--

  • that's basically the screen resolution.

  • So it looks really good on-- on the display.

  • For the other case, where you're going to

  • create content on an Android device--

  • say you have a video record application, for example--

  • And you want to be able to send that via MMS

  • or some other email or whatever to another phone,

  • you probably want to stick to a 3GPP format,

  • because not all phones will support an MPEG-4 stream,

  • particularly the advanced codecs.

  • So in that case, we recommend using the QCIF format

  • at 192 kbits per second.

  • Now, if you're creating content

  • on the Android device itself,

  • intended for another Android device,

  • we have an H.263 encoder.

  • We don't have an H.264 encoder,

  • so you're restricted to H.263.

  • And for the same reason I've discussed before,

  • we won't have an AAC encoder,

  • so you're going to use an AMR narrow band encoder,

  • at least on the current range of devices.

  • So those are kind of the critical things

  • in terms of inter-operability with other devices.

  • And then the other thing is-- a question that comes up a lot

  • is if I want to stream to an Android device,

  • what do I need to do to make that work?

  • The thing where most people fail on that

  • is the "moov" atom, which is the index of frames

  • that tells--basically tells the organization of the file,

  • needs to precede the data-- the movie data atom.

  • Most applications will not do that naturally.

  • I mean, it's more-- it's easier for a programmer

  • to write something that builds that index afterwards.

  • So you have-- you typically have

  • to give it a specific-- you know,

  • turn something on,

  • depending on what the application is,

  • or if you're using FFmpeg,

  • you have to give it a command line option

  • that tells it to put that atom

  • at the beginning instead of the end.

  • So...

  • We just recently came out with what we've been calling

  • the Cupcake release, or the 1.5 release.

  • That's the release that's on the phones

  • you just received today.

  • Here are some of the new features we added in the media framework.

  • We talked about video recording before.

  • We added an AudioTrack interface

  • and an AudioRecord interface in Java,

  • which allows direct access to raw audio.

  • And we added the JET interactive MIDI engine.

  • These are kind of the-- the highlights

  • in the media framework area.

  • So kind of digging into the specifics here...

  • AudioTrack-- we've had a lot of requests

  • for getting direct access to audio.

  • And...so what AudioTrack does is allow you

  • to write a raw stream from Java

  • directly to the Audio Flinger mixer engine.

  • Audio Flinger is a software mixer engine

  • that abstracts the hardware interface for you.

  • So it could actually-- it could mix multiple streams

  • from different applications.

  • To give you an example,

  • you could be listening to an MP3 file

  • while the phone rings.

  • And the ringtone will play

  • while the MP3 file is still playing.

  • Or a game could have multiple sound effects

  • that are all playing at the same time.

  • And the mixer engine takes care of that automatically for you.

  • You don't have to write a special mixer engine.

  • It's in-- built into the device.

  • Potentially could be hardware accelerated in the future.

  • And it also does sample rate conversion for you.

  • So you can mix multiple streams at different sample rates.

  • You can modify the pitch and so on and so forth.

  • So what AudioTrack does, it gives you direct access

  • to that mixer engine.

  • So you can take a raw Java stream,

  • you know, 16-bit PCM samples, for example,

  • and you can-- you can send that out

  • to the mixer engine.

  • Have it do the sample rate conversion for you.

  • Do volume control for you.

  • It has anti-zipper volume filters

  • so--if anybody's ever played with audio before,

  • if you change the volume,

  • it changes the volume in discrete steps

  • so you don't get the pops or clicks

  • or what we typically refer to as zipper noise.

  • And that's all done with either

  • writes on a thread in Java,

  • or the callback mechanism to fill the buffer.
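
A minimal sketch of the streaming write path through AudioTrack; fillNextBuffer() is a placeholder source of 16-bit samples, and this uses the blocking write-on-a-thread style rather than the callback style:

```java
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

public class AudioTrackSketch {
    void streamPcm() {
        int sampleRate = 44100;
        int bufSize = AudioTrack.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT,
                bufSize, AudioTrack.MODE_STREAM);
        track.play();
        short[] pcm = new short[bufSize / 2];      // 16-bit PCM samples
        while (fillNextBuffer(pcm)) {              // placeholder: your audio source
            track.write(pcm, 0, pcm.length);       // blocking write; run on a worker thread
        }
        track.stop();
        track.release();
    }

    private boolean fillNextBuffer(short[] pcm) {
        return false;   // placeholder for real sample generation
    }
}
```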

  • Similarly, AudioRecord gives you direct access to the microphone.

  • So in the same sort of way,

  • you could pull up a stream from the microphone.

  • You specify the sample rate you want it in.

  • And, you know, with the combination

  • of the two of those,

  • you can now take a stream from the microphone,

  • do some processing on it, and now put it back out

  • via the AudioTrack interface

  • to that mixer engine.

  • And that mixer engine will go wherever audio is routed.

  • So, for example, a question that comes up

  • a lot is, well, what if they have a Bluetooth device?

  • Well, that's actually handled for you automatically.

  • There's nothing you have to do as an application programmer.

  • If there's a Bluetooth device paired that supports A2DP,

  • then that audio is going to go directly

  • to the A2DP device--

  • whether it's a headset or even your car or whatever.
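
A minimal sketch of the record-process-playback loop described above, combining AudioRecord and AudioTrack; the process() step and the running flag are placeholders:

```java
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioRecord;
import android.media.AudioTrack;
import android.media.MediaRecorder;

public class LoopbackSketch {
    volatile boolean running = true;   // placeholder stop flag

    void loop() {
        int rate = 8000;               // voice-quality capture
        int inSize = AudioRecord.getMinBufferSize(rate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord rec = new AudioRecord(MediaRecorder.AudioSource.MIC, rate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, inSize);
        int outSize = AudioTrack.getMinBufferSize(rate,
                AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack out = new AudioTrack(AudioManager.STREAM_MUSIC, rate,
                AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
                outSize, AudioTrack.MODE_STREAM);

        short[] buf = new short[inSize / 2];
        rec.startRecording();
        out.play();
        while (running) {
            int n = rec.read(buf, 0, buf.length);   // pull samples from the microphone
            process(buf, n);                        // placeholder DSP step
            out.write(buf, 0, n);                   // push them back out to the mixer
        }
        rec.stop(); rec.release();
        out.stop(); out.release();
    }

    private void process(short[] buf, int n) { /* placeholder */ }
}
```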

  • And then we've got this callback mechanism

  • so you can actually

  • just set up a buffer and,

  • when you get a callback, you fill it.

  • You know, if you're doing a ping-pong buffer,

  • where you have half of it being filled

  • and the other half is actually being output to the device.

  • And there's also a static buffer mode

  • where you give it a-- for example,

  • a sound effect that you want to play

  • and it only does a single copy.

  • And then it just automatically mixes it,

  • so each time you trigger the sound,

  • it will mix it for you,

  • and you don't have to do additional memory copies.

  • So those are kind of the big highlights

  • in terms of the-- the audio pieces of it.

  • Then another new piece that's actually been in there

  • for a while, but we've finally implemented the Java support,

  • is the JET Interactive MIDI Engine.

  • So JET is--

  • it's based upon the EAS MIDI engine.

  • And what it does is allow you to pre-author some content

  • that is very interactive.

  • So if you're an author, you're going to create content

  • in your favorite authoring tool--

  • a digital audio workstation tool.

  • It has a VST plugin, so that you can, you know,

  • basically write your game audio in the tool

  • and hear it played back as it would be played on the device.

  • You can take and have multiple tracks

  • that are synchronized and mute them and unmute them

  • synchronously with the segment.

  • So basically, your piece is going to be divided up into

  • a bunch of little segments.

  • And just as an example,

  • I might have an A section, like the intro,

  • and maybe I have a verse and I have a chorus.

  • And I can interactively queue those to play

  • one after another.

  • So, for example, if I have a game that, um--

  • it has kind of levels, I might start with

  • a certain background noise, and perhaps, you know,

  • my character's taking damage.

  • So I bring in some little element

  • that heightens the tension in the game

  • and this is all done seamlessly.

  • And it's very small content, because it's MIDI.

  • And then you can actually have little flourishes

  • that play in synchronization with it--

  • with the music that's going on.

  • So, for example, let's say

  • you take out an enemy.

  • There's a little trumpet sound or whatever.

  • A sound effect that's synchronized

  • with the rest of the-- the audio that's playing.

  • Now all this is done under-- under program control.

  • In addition to that, you also have the ability

  • to have callbacks that are synchronized.

  • So a good example would be a Guitar Hero type game

  • where you have music playing in the background.

  • What you really want to do is have the player

  • do something in synchronization with the rhythm of the sound.

  • So you can get a callback in your Java application

  • that tells you when a particular event occurred.

  • So you could create these tracks of events

  • and measure against them, you know--

  • did they hit before or after?

  • And we actually have a sample application

  • in the SDK that shows you how to do this.

  • It's, I think, a two- or three-level game,

  • complete with graphics

  • and sound and everything, to show you how to do it.

  • The code--the code itself is written in native code

  • that's sitting on top of the EAS engine,

  • so again, in keeping with our philosophy

  • of trying to minimize the--

  • the overhead from the application,

  • this is all happening in the background.

  • You don't have to do anything to keep it going

  • other than keep feeding it segments.

  • So periodically, you're going to wake up and say,

  • "Oh, well, here's the next segment of audio to play,"

  • and then it will play automatically

  • for whatever the length of that segment is.
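
A minimal sketch of driving JET from Java, assuming a .jet file authored with the JET tools has been copied to the path shown; the segment, library, mute-flag, and clip values are purely illustrative:

```java
import android.media.JetPlayer;

public class JetSketch {
    void startMusic() {
        JetPlayer jet = JetPlayer.getJetPlayer();   // singleton JET engine
        jet.clearQueue();
        jet.loadJetFile("/sdcard/game_music.jet");  // placeholder path to authored content
        // Queue segment 0 from DLS library 0: no repeat, no transpose, no muted tracks.
        jet.queueJetSegment(0, 0, 0, 0, 0, (byte) 0);
        jet.play();
    }

    void onTensionRises(JetPlayer jet) {
        // Change which tracks are muted at the next synchronization point (illustrative mask).
        jet.setMuteFlags(0x00000000, true);
    }

    void onEnemyDown(JetPlayer jet) {
        jet.triggerClip(1);   // fire a pre-authored flourish in sync with the music
    }
}
```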

  • It's all open source.

  • Not only is the-- the code itself open source,

  • but the tools are open sourced,

  • including the VST plugin.

  • So if you are ambitious

  • and you want to do something interesting with it,

  • it's all sitting out there for you to play with.

  • I think it's out there now.

  • If not, it will be shortly.

  • And so those are the big highlights of the--

  • the MIDI-- the MIDI engine.

  • Oh, I forgot. One more thing.

  • The DLS support-- so one of the critiques

  • of general MIDI, or MIDI in general,

  • is the quality of the instruments.

  • And admittedly, what we ship with the device is pretty small.

  • We try to keep the code size down.

  • But what the DLS support does with JET

  • is allow you to load your own samples.

  • So you can either author them yourself

  • or you can go to a content provider

  • and author these things.

  • So if you want a high-quality piano

  • or you want, you know, a particular drum set,

  • you're going for a techno sound or whatever,

  • you can actually, you know,

  • put these things inside the game,

  • use them as a resource,

  • load them in and-- and your game will have

  • a unique flavor that you don't get

  • from the general MIDI set.

  • So...

  • I wanted to talk about a few common problems

  • that people run into.

  • Start with the first one here.

  • This one I see a lot.

  • And that is that the behavior of the volume control

  • in the application is inconsistent.

  • So, volume control on Android devices

  • is an overloaded function.

  • And as you can see from here,

  • if you're in a call, what the volume control does

  • is adjust the volume that you're hearing

  • from the other end of the phone.

  • If you're not in a call, if it's ringing,

  • pressing the volume button mutes the--the ringer.

  • Oh, panic.

  • I'm in a, you know, middle of a presentation

  • and my phone goes off.

  • So that's how you mute it.

  • If we can detect that a media track is active,

  • then we'll adjust the volume of whatever is playing.

  • But otherwise, it adjusts the ringtone volume.

  • The issue here is that if your-- if your game is--

  • or your application is just sporadically making sounds,

  • like, you know, you just have little UI elements

  • or you play a sound effect periodically,

  • you can only adjust the volume of the application

  • during that short period that the sound is playing.

  • It's because we don't actually know

  • that you're going to make sound until that particular instant.

  • So if you want to make it work correctly,

  • there's an API you need to call.

  • It's part of the Activity class.

  • It's called setVolumeControlStream.

  • So you can see a little chunk of code here.

  • In your onCreate,

  • you're going to call this setVolumeControlStream

  • and tell it what kind of stream you're going to play.

  • In the case of most applications that are in the foreground,

  • that are playing audio,

  • you probably want STREAM_MUSIC,

  • which is kind of our generic placeholder

  • for, you know, audio that's in the foreground.

  • If you have a ringtone application--

  • you know, you're playing ringtones--

  • you would select a different type.

  • But this basically tells the activity manager,

  • when you press the volume button,

  • if none of those previous things apply--

  • in other words,

  • if we're not in a call, if it's not ringing,

  • and if none of these other things are happening--

  • then that's the default behavior of the volume control.

  • Without that, you're probably going to get

  • pretty inconsistent behavior and frustrated users.

  • That's probably the number one problem

  • I see with applications in the marketplace today--

  • they're not using that.
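
A minimal sketch of that onCreate call for an activity that plays foreground audio:

```java
import android.app.Activity;
import android.media.AudioManager;
import android.os.Bundle;

public class GameActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Tie the hardware volume keys to the music stream for this activity,
        // even when no sound happens to be playing at that instant.
        setVolumeControlStream(AudioManager.STREAM_MUSIC);
    }
}
```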

  • Another common one I see

  • on the forums is people saying,

  • "How do I--how do I play a file from my APK?

  • "I just want to have an audio file

  • that I ship with the-- with the package,"

  • and they get this wrong for whatever reason.

  • I think we have some code out there

  • from a long time ago that looks like this.

  • And so this doesn't work.

  • This is the correct way to do it.

  • So there's this AssetFileDescriptor.

  • I talked a little bit earlier about the binder object

  • and how we pass things through,

  • so we're going to pass the file descriptor,

  • which is a pointer to your resource,

  • through the binder to the...

  • I don't know how that period got in there.

  • It should be setDataSource.

  • So setDataSource takes a FileDescriptor,

  • a start offset, and a length,

  • and so what this will do is, using a resource ID,

  • it will open it,

  • find the offset where that raw

  • resource starts,

  • and set those values so that we can tell

  • the media player where to find it,

  • and the media player will then play that

  • from that offset in the FileDescriptor.

  • I had another thought there.

  • Oh, yeah. So--yeah.

  • Raw resources, make sure that when you put your file in,

  • you're putting it in as a raw resource,

  • so it doesn't get compressed.

  • We don't compress things like MP3 files and so on.

  • They have to be in the raw directory.
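
A minimal sketch of the AssetFileDescriptor approach described above; R.raw.beep stands in for whatever uncompressed file you shipped in res/raw:

```java
import android.content.Context;
import android.content.res.AssetFileDescriptor;
import android.media.MediaPlayer;
import java.io.IOException;

public class RawResourceSketch {
    MediaPlayer playRaw(Context context) throws IOException {
        // Open the resource; for uncompressed files in res/raw this gives us
        // a file descriptor into the .apk plus an offset and length.
        AssetFileDescriptor afd =
                context.getResources().openRawResourceFd(R.raw.beep);  // placeholder resource
        MediaPlayer mp = new MediaPlayer();
        mp.setDataSource(afd.getFileDescriptor(),
                afd.getStartOffset(), afd.getLength());
        afd.close();
        mp.prepare();
        mp.start();
        return mp;
    }
}
```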

  • Another common one I see on the forums

  • is people running out of MediaPlayers.

  • And this is kind of an absurd example,

  • but, you know, just to make a point.

  • There is a limited amount of resources.

  • This is an embedded device.

  • A lot of people who are moving over from the desktop

  • don't realize that they're working with something

  • that's, you know, equivalent to a desktop system

  • from maybe ten years ago.

  • So don't do this.

  • If you're going to use MediaPlayers,

  • try to recycle them.

  • So our solution is, you know,

  • there are resources that are actually allocated

  • when you create a MediaPlayer.

  • It's allocating memory, it may be loading codecs.

  • It may--there may actually be a hardware codec

  • that's been instantiated that you're preventing

  • the rest of the system from using.

  • So whenever you're done with them,

  • make sure you release them.

  • So you're going to call release,

  • then set the MediaPlayer object to null.

  • Or you can call reset and do a new setDataSource,

  • which, you know, is basically just recycling your MediaPlayer.

  • And try to keep it to, you know, two or three maximum.

  • 'Cause you are sharing with other applications, hopefully.

  • And so if you get a little piggy with your MediaPlayer resources,

  • somebody else can't get them.

  • And also, if you go into the background--

  • so you're in onPause--

  • you definitely want to release all of your MediaPlayers

  • so that other applications can get access to them.
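
A minimal sketch of recycling a single MediaPlayer and releasing it in onPause; the resource ID is a placeholder:

```java
import android.app.Activity;
import android.content.res.AssetFileDescriptor;
import android.media.MediaPlayer;
import java.io.IOException;

public class SoundActivity extends Activity {
    private MediaPlayer player;

    void playClick() throws IOException {
        if (player == null) {
            player = new MediaPlayer();
        } else {
            player.reset();   // back to the idle state; reuses the existing player
        }
        AssetFileDescriptor afd = getResources().openRawResourceFd(R.raw.click); // placeholder
        player.setDataSource(afd.getFileDescriptor(), afd.getStartOffset(), afd.getLength());
        afd.close();
        player.prepare();
        player.start();
    }

    @Override
    protected void onPause() {
        super.onPause();
        if (player != null) {
            player.release();   // give the codec and memory back to the system
            player = null;
        }
    }
}
```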

  • Another big one that happens a lot

  • is the CPU... "My CPU is saturated."

  • And you look at the logs and you see this.

  • You know, "CPU is..."--

  • I can't remember what the message is now.

  • But it's pretty clear that the CPU is unhappy.

  • And this is kind of the typical thing,

  • is that you're trying to play too many

  • different compressed streams at a time.

  • Codecs take a lot of CPU resources,

  • especially ones that are running on software.

  • So, you know, a typical, say, MP3 decode

  • of a high-quality MP3 might take 20% of the CPU.

  • You add up two or three of those things,

  • and you're talking about some serious CPU resources.

  • And then you wonder why your, you know, frame rate

  • on your game is pretty bad.

  • Well, that's why.

  • So we actually have a solution for this problem.

  • It's called SoundPool.

  • Now, SoundPool had some problems in the 1.0, 1.1 release.

  • We fixed those problems in Cupcake.

  • It's actually pretty useful.

  • So what it allows you to do is take resources

  • that are encoded in MP3 or AAC or Ogg Vorbis,

  • whatever your preferred audio format is.

  • It decodes them and loads them into memory

  • so they're ready to play,

  • and then uses the AudioTrack interface

  • to play them out through the mixer engine

  • just like we were talking about before.

  • And so you can get much lower overhead.

  • You know, somewhere in the order of about 5% per stream

  • as compared to these, you know, 20% or 30%.

  • Depending on what the audio codec is.

  • So it gives you the same sort of flexibility.

  • You can modify--in fact, it actually gives you

  • a little more flexibility, because you can set the rates.

  • It will manage streams for you.

  • So if you want to limit the number of streams

  • that are playing, you tell it upfront,

  • "I want," let's say, "eight streams maximum."

  • If you exceed that, it will automatically,

  • based on the priority, you know, select the lowest-priority stream,

  • get rid of that one, and start the new sound.

  • So it's kind of managing resources for you.

  • And then you can do things like pan in real time.

  • You can change the pitch.

  • So if you want to get a Doppler effect

  • or something like that, this is the way to do it.
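
A minimal sketch of the SoundPool usage described above; R.raw.laser is a placeholder for a short Ogg resource, and the rate tweak at the end is the kind of thing you would do for a Doppler effect:

```java
import android.content.Context;
import android.media.AudioManager;
import android.media.SoundPool;

public class SoundPoolSketch {
    private SoundPool pool;
    private int laserId;

    void init(Context context) {
        // Up to 8 simultaneous decoded streams, routed to the music stream.
        pool = new SoundPool(8, AudioManager.STREAM_MUSIC, 0);
        // Decode during startup so the sample is resident before it's needed.
        laserId = pool.load(context, R.raw.laser, 1);   // placeholder resource, priority 1
    }

    void fire() {
        // left volume, right volume, priority, loop (0 = no loop), playback rate
        int streamId = pool.play(laserId, 1.0f, 1.0f, 1, 0, 1.0f);
        // Real-time tweaks on the returned stream ID:
        pool.setVolume(streamId, 0.2f, 1.0f);   // shift the balance to the right
        pool.setRate(streamId, 1.5f);           // pitch it up, e.g. for a Doppler effect
    }
}
```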

  • So that's pretty much it.

  • We have about ten minutes left for questions,

  • if anybody wants to go up to a microphone.

  • [applause]

  • Thank you.

  • man: Hi, thank you. That was a great talk.

  • Is setting the volume control stream to music,

  • so you can respond to the volume control--

  • do you have to do that every time you create a new activity,

  • or is it sticky for the life of the app?

  • Sparks: It's sticky--

  • you're going to call it in your onCreate function.

  • man: But in every single activity?

  • Sparks: Yeah, yeah. man: Okay.

  • man: Hi, my first question is that currently,

  • Android is using OpenCORE

  • for the multimedia framework.

  • And my question is, does Google have any plan

  • to support any other middleware,

  • such as GStreamer or anything else?

  • Sparks: Not at this time.

  • We don't have any plans to support anything else.

  • man: Okay.

  • What's the strategy of Google

  • for supporting other pioneers

  • providing this multimedia middleware?

  • Sparks: Well, so, because of the flexibility

  • of the MediaPlayer service, you could easily add

  • another code--another media framework engine in there

  • and replace OpenCORE.

  • man: Okay.

  • So my second question is that, um--

  • [coughs]

  • currently--

  • you mentioned Google implementing the MediaPlayer

  • and the recording service.

  • Is there any plan to support mobile TV and others,

  • such as video conferencing, in the framework?

  • Sparks: We're--we're looking at video conferencing.

  • Digital TV is probably a little bit farther out.

  • We kind of need a platform to do the development on.

  • So we'll be working with partners.

  • Basically, if there's a partner that's interested

  • in something that isn't there,

  • we will--we can work with you on it.

  • man: Okay, thank you.

  • man: Does the media framework support RTSP control?

  • Sparks: Yes.

  • So RTSP support is not as good as we'd like it to be.

  • It's getting better with every release.

  • And we're expecting to make some more strides

  • in the next release after this.

  • But Cupcake is slightly better.

  • man: And that's specified by...

  • in the URL, by specifying the RTSP?

  • Sparks: Yeah. Right. man: Okay.

  • And you mentioned, like, 500 kilobits per second

  • being the maximum, or--

  • What if you tried to play something

  • that is larger than that?

  • Sparks: Well, the codec may fall behind.

  • What will typically happen is that you'll get a--

  • if you're using our MovieView, you'll get an error message

  • that says that it can't keep up.

  • man: Mm-hmm. So it will try, but it will--

  • It might fall behind. Sparks: Yeah.

  • man: Thank you.

  • man: My question is--

  • how much flexibility do we have

  • to control the camera services?

  • For example, can I control the frame rate,

  • the color tunings, et cetera?

  • Sparks: Yeah, some of that's going to depend on the--

  • on the device.

  • We're still kind of struggling

  • with some of the device-specific things,

  • but in the case of the camera,

  • there's a setParameters interface.

  • And there's access, depending on the device,

  • to some of those parameters.

  • The way you know that is, you do a setParameter.

  • Let's say you ask for a certain frame rate.

  • You--you do a getParameter.

  • You find out if it accepted your frame rate or not.

  • Because there's a number of parameters.
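
A minimal sketch of that set-then-read-back pattern, including a vendor-prefixed key; the key name shown is purely illustrative:

```java
import android.hardware.Camera;

public class CameraParamsSketch {
    void tune(Camera camera) {
        Camera.Parameters params = camera.getParameters();
        params.setPreviewFrameRate(15);        // ask for 15 fps
        params.set("some-vendor-key", "on");   // illustrative vendor-prefixed key
        camera.setParameters(params);

        // Read back to find out what the driver actually accepted.
        int actualFps = camera.getParameters().getPreviewFrameRate();
    }
}
```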

  • man: Yeah, but also, for example, in low light.

  • Not only do you want to slow the frame rate,

  • but also you want to increase the integration time.

  • Sparks: Right.

  • man: So sometimes,

  • even in low light,

  • you want to slow the frame rate

  • but still keep the normal integration time.

  • So do you have that kind of flexibility to control?

  • Sparks: Well, so that's going to depend

  • on whether the hardware supports it or not.

  • If the hardware supports it, then there should be

  • a parameter for that.

  • One of the things we've done is--

  • for hardware manufacturers

  • that have specific things that they want to support,

  • that aren't like, standard--

  • they can add a prefix to their parameter key value pairs.

  • So that will, you know-- it's unique to that device.

  • And we're certainly open to manufacturers suggesting,

  • you know, new-- new standard parameters.

  • And we're starting to adopt more of those.

  • So, for example, like, white balance is in there.

  • Scene modes, things like that are all part of it.

  • man: Okay. Sparks: Yeah.

  • man: I was wondering what kind of native code hooks

  • the audio framework has?

  • I'm working on an app that basically would involve,

  • like, actively doing a fast Fourier transform,

  • you know, on however many samples you can get at a time.

  • And so, it seems like for now--

  • or in the Java, for example,

  • it's mostly built toward recording audio and--

  • and doing things with that.

  • What sort of active control do you have over the device?

  • Sparks: So officially, we don't support

  • native API access to audio yet.

  • The reason for that is,

  • we, you know-- any API we publish,

  • we're going to have to live with

  • for a long time.

  • We're still playing with APIs,

  • trying to, you know, get-- make them better.

  • And so the audio APIs

  • have changed a little bit in Cupcake.

  • They're going to change again in the next two releases.

  • At that point, we'll probably be ready

  • to start providing native access.

  • What you can do,

  • very shortly we'll have a native SDK,

  • which will give you access to libc and libm.

  • You can get access to the audio

  • from the Java-- official Java APIs,

  • do your processing in native code,

  • and then feed it back, and you'll be able to do that

  • without having to do extra memory copies.

  • man: And so basically, that would just be

  • accessing the buffer that the audio writes to.

  • And also, just a very tiny question about the buffer.

  • Does it--

  • does it loop back when you record the audio?

  • Or is it--does it record in, essentially, like, blocks?

  • Do you record an entire buffer once in a row,

  • or does it sort of go back to the start and then keep going?

  • Sparks: You can either have it cycle through a static buffer,

  • or you can just pass in new buffers each time,

  • depending on how you want to use it.

  • man: Okay. Thanks.

  • man: Let's say you have a game

  • where you want to generate a sound instantly

  • on a button press or a touch.

  • Sparks: "Instantly" is a relative term.

  • man: As instantly as you can get.

  • Would you recommend, then, the JET MIDI stuff,

  • or an Ogg, or what?

  • Sparks: You--you're probably going to get best results

  • with SoundPool,

  • because SoundPool's really aimed at that.

  • What SoundPool doesn't give you--

  • and we don't have an API for it,

  • we get a lot of requests for it,

  • so, you know, it's on my list of things to do--

  • is synchronization.

  • So if you're trying to do a rhythm game

  • where you--you want to be able to have very precise control

  • of--of, say, a drum track--

  • you--there isn't a way to do that today.

  • But if you're just trying to do--

  • man: Like gunfire kind of thing.

  • Sparks: Gunfire? SoundPool is perfect for that.

  • That's--that's what it was intended for.

  • man: Yeah, if I use the audio mixer,

  • can I control the volume

  • of the different sources differently?

  • Sparks: Yes. man: Okay.

  • Sparks: So, SoundPool has a volume control

  • for each of its channels that you--

  • basically, when you trigger a SoundPool sound,

  • you get an ID back.

  • And you can use that to control that sound.

  • If you're using the AudioTrack interface,

  • there's a volume control interface on it.
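
A small sketch of the SoundPool pattern described here: load a short clip, trigger it on a touch, and use the stream ID that play() returns to control that one stream's volume afterward. The clip path is hypothetical.

```java
import android.media.AudioManager;
import android.media.SoundPool;

public class GameSounds {
    private final SoundPool pool;
    private final int gunfireId;
    private int lastStreamId;

    public GameSounds() {
        // Up to four simultaneous streams, mixed onto the music stream.
        pool = new SoundPool(4, AudioManager.STREAM_MUSIC, 0);
        // Hypothetical clip path; load(Context, resId, priority) works for raw resources too.
        gunfireId = pool.load("/sdcard/gunfire.ogg", 1);
    }

    public void onShotFired() {
        // play() returns a stream ID identifying this particular playback.
        lastStreamId = pool.play(gunfireId,
                1.0f, 1.0f,  // left / right volume
                1,           // priority
                0,           // no looping
                1.0f);       // normal playback rate
    }

    public void duckLastShot() {
        // The stream ID lets you adjust just that one sound afterward.
        pool.setVolume(lastStreamId, 0.3f, 0.3f);
    }
}
```

If you drive raw PCM through AudioTrack instead, its setStereoVolume() call plays the same per-source role.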

  • man: My question is,

  • for the testing side, how--

  • does Google have a plan to release a certain application

  • or testing program to verify MediaPlayer

  • and other media middleware like this?

  • Sparks: Right.

  • man: 3D and everything else?

  • Sparks: So we haven't announced

  • what we're doing there yet.

  • I can't talk about it.

  • But it's definitely something we're thinking about.

  • man: Okay.

  • Another question is about the concurrency

  • there for the mobile devices.

  • The resource is very limited.

  • So for example, the service you mentioned.

  • The memory is very limited.

  • So how do we handle any--

  • or maybe you have any experience--

  • handle the 3D surface

  • and also the multimedia surface

  • and put together a raw atom surface

  • or something like that?

  • Sparks: So when you say "3D," you're talking about--

  • man: Like OpenGL, because you do the overlay

  • and you use the overlay and you--

  • Sparks: Yeah, I'm-- I'm not that up on it.

  • I'm not a graphics guy.

  • I'm really an audio guy.

  • But I actually manage the team that does the 3D stuff.

  • So I'm kind of familiar with it.

  • There's definitely limited texture memory

  • that's available--that's probably the most critical thing

  • that we're running into-- but obviously,

  • you know, that--

  • we're going to figure out how to share that.

  • And so--

  • I don't have a good answer for you,

  • but we're aware of the problem.

  • man: Okay. Yeah.

  • Just one more question is do you have any plan

  • to move OpenGL 2.0 for the Android?

  • Sparks: Yes. If you--

  • man: Do you have a time frame?

  • Sparks: Yeah, if you're following

  • the master source tree right now,

  • you'll start to see changes come out for--

  • we're--we're marrying 2D and 3D space.

  • So the 2D framework will be running as an OpenGL context,

  • which will allow you, then, to, you know--

  • ES 2.0 context.

  • So you'll be able to share between the 3D app

  • and the 2D app.

  • Currently, if you have a 3D app,

  • it takes over the frame buffer

  • and nothing else can run.

  • You'll actually be able to run 3D

  • inside the 2D framework.

  • man: Okay, thank you.

  • man: I think this question is sort of related.

  • I was wondering how would you take, like, the--

  • the surface that you use to play back video

  • and use it as a texture, like in OpenGL?

  • Sparks: That's coming, yeah.

  • Yeah, that--so you actually would be able to map

  • that texture onto a 3D--

  • man: Is there any way you can do that today

  • with the current APIs?

  • Sparks: Nope.

  • Yeah, there's no access to the--

  • to the video after it leaves the media server.

  • man: And no time frame

  • as far as when there'll be

  • some type of communication as far as

  • how to go about doing that in your applications?

  • Sparks: Well, it's-- so it's in our--

  • what we call our Eclair release.

  • So that's master today.

  • man: Okay. Okay, thank you.

  • Sparks: I think-- are we out of time?

  • woman: [indistinct]

  • Sparks: Okay.

  • woman: Hi, do you have any performance metrics

  • as to what are the performance numbers

  • with the certain playback of audio and video to share,

  • or any memory footprints available

  • that we can look up, maybe?

  • Sparks: Not today.

  • It's actually part of some of the work we're doing

  • that somebody was asking about earlier.

  • That I can't talk about yet. But yeah.

  • There's definitely some-- some plans to do metrics

  • and to have baselines that you can depend on.

  • woman: And then the second question that I have

  • is that do you have any additional formats

  • that are lined up or are in the roadmap?

  • Like VC-1 and additional audio formats?

  • Sparks: No, not-- not officially, no.

  • woman: Okay.

  • woman: Hi, this is back to the SoundPool question.

  • Is it possible to calculate latency

  • or at least know, like,

  • when the song actually went to the sound card

  • so I could at least know when it actually did play--

  • if there's any sort of callback or anything?

  • Sparks: So you can get a playback complete callback

  • that tells you when it left the player engine.

  • There's some additional latency in the hardware

  • that we...we don't have complete visibility into,

  • but it's reported back

  • through the AudioTrack interface,

  • theoretically, if it's done correctly.

  • So at the MediaPlayer level, no.

  • At the AudioTrack level, yes.

  • If that's...makes any sense.

  • woman: Okay, so I can at least get that,

  • even if I can't actually calculate latency

  • for every single call?

  • Sparks: Right, right.

  • woman: Okay. Thank you.

  • Sparks: Uh-huh.
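
A hedged sketch of the two callback levels mentioned: MediaPlayer reports completion when the clip leaves the player engine, while AudioTrack can fire a marker notification tied to a playback-head position, which tracks the output more closely. The file path is hypothetical.

```java
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;
import android.media.MediaPlayer;

public class PlaybackCallbacks {
    // MediaPlayer level: completion means the clip has left the player engine.
    public static void playWithCompletion(String path) throws Exception {
        MediaPlayer mp = new MediaPlayer();
        mp.setDataSource(path);   // e.g. "/sdcard/clip.ogg" (hypothetical path)
        mp.setOnCompletionListener(new MediaPlayer.OnCompletionListener() {
            public void onCompletion(MediaPlayer player) {
                player.release();
            }
        });
        mp.prepare();
        mp.start();
    }

    // AudioTrack level: a marker at the last frame fires when the playback
    // head actually reaches it.
    public static void playPcmWithMarker(short[] pcm, int sampleRate) {
        int minBuf = AudioTrack.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_CONFIGURATION_MONO,
                AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
                sampleRate, AudioFormat.CHANNEL_CONFIGURATION_MONO,
                AudioFormat.ENCODING_PCM_16BIT,
                Math.max(minBuf, pcm.length * 2), AudioTrack.MODE_STREAM);
        track.setNotificationMarkerPosition(pcm.length);   // frames, mono PCM
        track.setPlaybackPositionUpdateListener(
                new AudioTrack.OnPlaybackPositionUpdateListener() {
            public void onMarkerReached(AudioTrack t) {
                t.release();   // playback head reached the end of the clip
            }
            public void onPeriodicNotification(AudioTrack t) { }
        });
        track.play();
        track.write(pcm, 0, pcm.length);
    }
}
```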

  • man: Yeah, this is a question

  • about the samples processing.

  • You partially touched upon that.

  • But in your architecture diagram,

  • where do you think the sound processing effect

  • really has to be placed?

  • For example, it could be an equalizer

  • or different kind of audio post processing

  • that needs to be done.

  • Because in the current Cupcake version, 1.5,

  • I do not see a placeholder

  • or any implementation of that sort.

  • Sparks: So one of the things we're in the process of doing

  • is we're-- we're looking at OpenAL--

  • Have I got that right? OpenAL ES?

  • As the, um--possibly the-- an abstraction for that.

  • But it definitely is something you want to do

  • on an application-by-application basis.

  • For example, you don't want to have

  • effects running on, you know, a notification if...

  • The--you--you wouldn't want the application

  • in the foreground and forcing something

  • on some other application that's running in the background.

  • So that's kind of the direction we're headed with that.

  • man: What's the current recommendation?

  • How do you want the developers to address?

  • Sparks: Well, the-- since there isn't any way,

  • there's no recommendation.

  • I mean, if you were doing native code,

  • it's kind of up to you.

  • But our recommendation would be if you're, you know,

  • doing some special version of the code,

  • you would probably want to insert it

  • at the application level and not sitting

  • at the bottom of the AudioFlinger stack.

  • man: Okay, thanks.
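
Since there is no system-level insertion point in Cupcake, an application-level effect amounts to transforming the PCM yourself before it reaches AudioTrack. A minimal sketch under that assumption, using a trivial gain stage as a stand-in for an equalizer; the PcmStage and GainStage names are made up.

```java
/** Application-level post-processing stage, applied to PCM buffers
 *  before they are handed to AudioTrack. */
interface PcmStage {
    void process(short[] samples, int count);
}

/** A trivial gain stage standing in for an equalizer or other effect. */
class GainStage implements PcmStage {
    private final float gain;

    GainStage(float gain) { this.gain = gain; }

    public void process(short[] samples, int count) {
        for (int i = 0; i < count; i++) {
            int v = (int) (samples[i] * gain);
            // Clamp to the 16-bit sample range.
            samples[i] = (short) Math.max(Short.MIN_VALUE,
                    Math.min(Short.MAX_VALUE, v));
        }
    }
}
```

In practice you would call stage.process(block, n) on each buffer between decoding (or recording) and the AudioTrack.write() call, as in the capture loop sketched earlier.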

  • woman: Is it better to get the system service once

  • and share it across activities in an application,

  • or let each activity fetch the service?

  • Sparks: I mean, there's a certain amount of overhead,

  • 'cause it's a binder call to do it.

  • So if you know you're going to use it,

  • I would just keep it around.

  • I mean, it's just a-- a Java object reference.

  • So it's pretty cheap to hold around.
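
A quick illustration of keeping a single system-service reference around rather than re-fetching it in every activity; the MediaApp class name is made up.

```java
import android.app.Application;
import android.content.Context;
import android.media.AudioManager;

/** Hypothetical Application subclass that fetches the AudioManager once
 *  and hands the same reference to every activity in the process. */
public class MediaApp extends Application {
    private AudioManager audioManager;

    @Override
    public void onCreate() {
        super.onCreate();
        // Fetch once up front instead of once per activity.
        audioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
    }

    public AudioManager getAudioManager() {
        return audioManager;
    }
}
```

The subclass has to be named in the manifest's `<application android:name=...>` entry; after that, any activity can call `((MediaApp) getApplication()).getAudioManager()`.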

  • man: Is there any way to listen to music

  • on a mono Bluetooth?

  • Sparks: Ah, on a SCO?

  • Yeah, no. [chuckles]

  • The reason we haven't done that

  • is the audio quality is really pretty poor.

  • I mean, it's designed for-- for call audio.

  • So the experience isn't going to be very good.

  • Theoretically, you know, it's possible.

  • We just don't think it's a good idea.

  • [chuckling]

  • man: If you want to record for a long period of time,

  • you know, like a half-hour,

  • can you frequency scale the processor

  • or put it to sleep, or...

  • Sparks: It--well, that happens automatically.

  • I mean, it's-- it's actually going to sleep

  • and waking up all the time.

  • So it's just depending on what's--

  • man: But if you're doing, like, a raw 8k sample rate,

  • how big a buffer can you have, and then will it sleep in--

  • while that buffer's filling?

  • Sparks: So the--the size of those buffers

  • is defined in the media recorder service.

  • And I think they're...

  • I want to say they're like 2-- 2k at...

  • whatever the output rate is.

  • So they're pretty good size.

  • I mean, it's like a half a second of audio.

  • So the processor, theoretically,

  • would be asleep for quite some time.

  • man: So is that handled by the codec,

  • or is it handled by-- I mean, the DSP on a codec?

  • Or is it handled by--

  • Sparks: So the... the process

  • is going to wake up when there's audio available.

  • It's going to...

  • you know, route it over to the AMR encoder.

  • It's going to do its thing.

  • Spit out a bunch of bits that'll go to the file composer

  • to be written out.

  • And then theoretically,

  • it's gonna go back to sleep again.

  • man: No, I mean on the recorder.

  • If you're recording the audio.

  • If you're off the microphone.

  • Sparks: I'm sorry?

  • man: If you're recording raw audio off the microphone.

  • Sparks: Yeah.

  • Oh, oh, are you talking about using the AudioTrack

  • or AudioRecord interface?

  • man: The AudioRecord interface. ADPCM.

  • Sparks: Yeah, that's...

  • So it's pretty much the same thing.

  • I mean, if you define your buffer size large enough,

  • whatever that buffer size is, that's the buffer size

  • it's going to use at the lower level.

  • So it'll be asleep for that amount of time.

  • man: And the DSP will be the one filling the buffer?

  • Sparks: Yeah, yeah. The DSP fills the buffer.

  • man: All right, thanks.
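
A sketch of the buffer-size trade-off just discussed: for a raw 8 kHz, 16-bit mono capture through AudioRecord, asking for a buffer on the order of half a second lets the processor stay idle longer between reads. The half-second figure is just arithmetic on those numbers, not a guarantee about any particular device.

```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class LongRecording {
    public static AudioRecord create() {
        int rate = 8000;   // raw 8 kHz capture
        int min = AudioRecord.getMinBufferSize(rate,
                AudioFormat.CHANNEL_CONFIGURATION_MONO,
                AudioFormat.ENCODING_PCM_16BIT);

        // Half a second of 16-bit mono audio: 4000 samples * 2 bytes = 8000 bytes.
        // Use the larger of that and the hardware minimum, so the app only
        // needs to wake up roughly twice a second to drain the buffer.
        int halfSecondBytes = (rate / 2) * 2;
        int bufferBytes = Math.max(min, halfSecondBytes);

        return new AudioRecord(MediaRecorder.AudioSource.MIC, rate,
                AudioFormat.CHANNEL_CONFIGURATION_MONO,
                AudioFormat.ENCODING_PCM_16BIT, bufferBytes);
    }
}
```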

  • man: One last question.

  • From a platform perspective,

  • would you be able to state a minimum requirement

  • on OpenGL performance?

  • Sparks: I'm not ready to say that today.

  • But...

  • at some point we'll--

  • we'll be able to tell you about that.

  • man: Okay, thanks. Sparks: Uh-huh.

  • Guess that's my time. Thanks, everyone.

  • [applause]
