
  • >> So my name is [INDISTINCT] with the developer programs [INDISTINCT] here at Google. And

  • I run [INDISTINCT] and that's where I first kind of met Steve, I had him come out for

  • this conference called The Ajax Experience to talk about all of these great stuff he

  • was doing at Yahoo! with respect to performance, that then went on to become his book and the YSlow Firebug extension that he's going to talk about. And he's got

  • some fun anecdotes about how he came up with the name and some other fun stories to share

  • with us. So, let's all welcome Steve from Yahoo!.

  • >> SOUDERS: I just want to say, every once in a while in your career, you get to meet

  • people who are incredibly smart and incredibly nice people too, and [INDISTINCT] is one of

  • those guys and, so I wanted to present you with a copy of my book, [INDISTINCT]

  • >> Oh, very nice. Thank you very much. Thank you.

  • >> SOUDERS: All right. Thanks for having me here.

  • >> [INDISTINCT] >> SOUDERS: I'm sorry?

  • >> [INDISTINCT] >> SOUDERS: I'm sentimental. I'm emotional.

  • That's okay. So, my name is Steve Souders. I'm the Chief Performance Yahoo! and I understand

  • Doug Crockford from Yahoo! was here a month or so ago. So, it's--maybe not the first time

  • that someone from Yahoo! has come and done a presentation at Google. But I do have to

  • say, I was, you know, wondering how this was going to go and, you know, was trying

  • to get myself prepared for this. And so, I just want to start off with this note, and

  • I don't know if anyone remembers Concentration, Ed McMahon was one of the hosts for it, yeah.

  • And so, anyone from the audience, want to take a guess? Sissy Spacek and someone who

  • looks like Peter Frampton in the movie Carrie. How many people here actually know what this

  • expression means or has ever used it? So, preaching to the choir might have been another

  • good one to use but in rebus that--I didn't think that was as interesting. So, my wife

  • said no one is going to know what that means. So, I thought I would look it up in Wikipedia

  • and bring that as well. So, it kind of means something that's--so selling coal, or carrying coal to Newcastle, is something that may be foolhardy or pointless. And

  • so, given how fast Google sites are and the work that you're doing on performance maybe

  • there's not a lot of motivation or point for me to come and talk more about performance.

  • But as was shown with this expression, there can even be cases where someone comes and

  • surprises people by carrying coal to Newcastle and actually being able to sell it. So I'm

  • hoping that there are some takeaways that you can get from this performance best practices

  • that we found at Yahoo! and so let's dig into that. So I've been at Yahoo! for about eight

  • years working on various things and about three years ago some of the folks there asked

  • me to start a group to focus on performance. And so I called the group the Exceptional

  • Performance Group. And the charter is very simple. We're supposed to quantify and improve

  • the performance of all Yahoo! products worldwide. That's a really big charter, so we scoped it down

  • a little bit. I kind of break performance into two areas, response time, so it's kind

  • of a black box perspective of it, and efficiency. So, efficiency actually is easier to correlate

  • to dollars. If you can do what you're doing with half the hardware, that's a lot of hardware

  • cost that you've saved, power consumption, rack space, but actually the area that I focused

  • on for the last three years is the response time. And the main reason for that is at Yahoo!,

  • of course, we want to reduce our hardware cost and our power cost, but really, it's

  • really important for us to have a very good user experience, very engaging products that

  • increase stickiness and user adoption. So that's where we've been focusing. And we've

  • also narrowed it down a little bit by focusing almost exclusively on web apps. So we're not

  • like trying to optimize Yahoo! Messenger. And we're kind of like a consulting group within

  • Yahoo!. We've--we've done this at Yahoo! in other areas like security and, you know, redundancy.

  • We can have a small group of people; in my case, our group's about five to seven people.

  • We have a small group of people that focus full time on this area and can really do a

  • deep dive. And then we disseminate, we evangelize those best practices across the company. So,

  • we build tools like YSlow, I'll be showing later. We have lots of other tools. Some we've

  • released, some are internal. We look at a lot of data. We do research, so we'll--we'll

  • do--we kind of have on the job experience. We'll go out and we'll do consulting with

  • groups and we'll go, "Oh, wow! This is what really made this site go a lot faster. Let's

  • store that away as a best practice. Let's see if we can generalize it and make it applicable.

  • See if it's applicable to 80% of the properties," that's what we call them at Yahoo!, "80% of

  • the properties at Yahoo!." So we try to identify these best practices. Sometimes we have to

  • research the best practices. So no one's actually doing it yet but we think there's a way to

  • navigate through this--these set of constraints to find something that's going to accelerate

  • the user experience. So we do research and when we find these best practices we evangelize

  • them out to the--to the company. So when I started this, my background is more on backend

  • engineering. So, some of the first projects I did at Yahoo!, I ran the My Yahoo! team

  • for three years, I built an architecture that pushed all of our content, all our sports

  • scores and movie listings and TV listings worldwide. I wrote a caching layer between

  • all of our properties. So if you had a property that needed personal information, like someone's

  • calendar for that day, they wouldn't have to constantly bombard the calendar service

  • with web service calls. We could cache that in a way of updating the cache and expiring

  • the cache and flushing the cache. So when I was approached to start this performance

  • team, I thought, "Okay. Well, this is going to work well." You know, I've worked on large

  • scale systems trying to make them as efficient as possible, but I said, "Before I start digging

  • into this, if the--" we identified early on that the goal was really to make the user

  • experience faster. I said, "let me look at that. Let me analyze that, profile that, and

  • see what the long pole in the tent is." And so I found something that was kind of surprising

  • and I'm glad I found it right away because it completely flipped the approach to looking

  • at performance at Yahoo!. So this is--this is from a packet sniffer called IBM Page Detailer; each of the bars is an http request. The first--you can see they're labeled, and the first one is the html document, and in this case this is www.yahoo.com with an empty

  • cache. And the thing that's--was very surprising to me was only 5% of the overall user wait

  • time was getting that html document, and that includes not just the web server stitching

  • the content together but the time for the request to go up and all of the packets to

  • come back. All of that was only 5%. And in my previous life, working on these large websites,

  • that's the part I was always focusing on, how can I build a better database or cache

  • data in memory or change my compiler options, anything to squeeze out a couple more milliseconds.

  • And it turns out that I was working on the short poles in the tent. So really the long pole is this, I call it the frontend. I think--most of the time when someone says, "The frontend

  • part," they might be thinking about like JavaScript executions, so this is bigger than that. It's

  • really--I call it frontend, everything after the html document has arrived to the browser.

  • Once the browser has that delivery of the--of the page, what does the browser have to do

  • from that point forward? I call that the frontend part. So there's certainly html parsing, CSS parsing, JavaScript execution, but there also is a lot of other network time involved there for all of these other http requests. Most of the time there isn't, for

  • this other part, this frontend part, there usually isn't a lot of backend time, web server

  • time, because most of these are static assets that are just read off the disk. But some

  • of them could be Ajax requests or something that takes a little longer. But in this case we found that 95% of the time for an empty cache is spent on everything after the html document. So I thought, "Okay. Well, what is it with a prime cache?" So even in that case it's only 12%. There's a little white gap here in the middle where the browser's reading those cached

  • assets off the disk and having to reparse the CSS and JavaScript and execute the JavaScript,

  • and at the end there are still a handful of requests for images that have changed or ads

  • or beacons or something. But still only 12% was that backend part, getting the html document.

  • So this, you know, really surprised me and I said, "Well, maybe this is something peculiar

  • to www." But as I looked at more and more sites, I found that this pattern held true.

  • That only ten to twenty percent of the total end-user experience was spent getting the

  • html document down. So these are the top, as of about 6 months ago, these are the top

  • ten sites in the U.S. And there's only one that breaks that kind of guideline and that's

  • Google in a prime cache. So there are only two http requests for Google with a prime cache, just www.google.com. And--but even here, the html--you'd think like--I think the other

  • one is a beacon, maybe I was in like a test bucket or something like that, I don't know

  • if it happens all the time, but here the html document was still only 36%. But this is the

  • exception; almost every site you go to, you'll find what we call the performance golden rule,

  • that 80% to 90% of the end-user response time, the time the user is waiting, is spent on

  • this part after the html document arrives. And so that's--if you really want to improve

  • response times that's where you have to focus. I've got three good reasons why you should

  • believe that. One, just the a priori probability of making an improvement is greater if you

  • focus on that frontend part. In your wildest dreams, maybe you could cut the backend performance

  • in half, and you'd only put a 5% to 10% dent in the response time of the user experience.

  • But if you could cut that frontend part in half, you're going to make a 40% to 45% dent,

  • and that's going to be huge. Users are actually going to notice that. So just a priori, you

  • have a better chance of making a big difference. The changes are simpler. So, you know, if

  • you want to change--cut half of the backend response time, you have to come up with a

  • new database schema, optimize your code, replicate your architecture across multiple data centers

  • worldwide. Huge, huge complex task. Whereas, you'll see in a minute, I'll talk about some

  • of these guidelines in more detail, most of them are hours or days of work. Change your

  • web server configuration, rearrange the page a little bit, nothing that's really that complex.

  • So the changes are simpler and they're proven to work. So my group has worked with, probably,

  • a hundred properties at Yahoo!. It's pretty easy for us, in fact there's only one exception

  • I can think of where we haven't been able to go in and, with just a few days' work, cut

  • 25% off the response time of any website. And what's also cool is now that the book

  • is out and YSlow is out, I'm getting e-mails from people at small and large companies everywhere

  • that they've tried rules, you know, 2, 3 and 7 and they've cut 20% or 40% of the response

  • times. So this doesn't just happen at Yahoo! and it doesn't just happen at gigantic websites,

  • these rules really apply to almost any web page that you're building. So, I wanted to

  • talk a little bit--I'm going to have just a few slides about some research and then

  • we'll go through the bulk of the talk, which is the rules, the guidelines we have. And

  • at the end I'll run YSlow and we could like look at some Google sites or any other sites

  • and analyze, to do kind of some live analysis. So, my coworker Tenni Theurer blogs

  • about most of our research on yuiblog.com, and I'm going to talk about one of these experiments

  • that we wrote up, the browser cache experiment, because a lot of our best practices hinge

  • on increasing the use of the browser's cache. But before we really could know how valuable

  • that was, we had to answer the question, "How many users come in with prime cache?" We didn't

  • know, no one knew and I couldn't find any research about that out there in the world.

  • So we made up this experiment where we put a little 1x1 pixel image in a page, but we had to

  • be kind of careful about these two response headers. We put the expires in the past and

  • we made sure that on all the servers the file timestamps were identical. So the last modified

  • timestamp for--no matter which server you went to, for this image, would always be the

  • same. And so, we know that there's going to be two possible http status codes returned

  • for this, either a 200 which tells us the user had an empty cache or 304 which tells

  • us that the user had downloaded this image previously, they have it in their cache with

  • this last modified header, so when they requested--when they went to the page again and requested

  • that image again they made an IMS, If-Modified-Since request, or conditional GET request,

  • they said, "I have a copy of this image on my disk that was last modified at this time,"

  • and the web server says, "Oh, we'll just use that one. 304 not modified, just use that

  • one." But those two different status codes, 200 and 304 are written into the web server

  • logs. So we can just go through the web server logs and find the ratios of these 200s to

  • 304s and answer these two questions: what percentage of users come in every day with an empty cache, and what percentage of page views happen every day with an empty cache--and we'll see that those are two different numbers. So on the first day no one has seen this image before, so 100% of the users come in at least once that day with an empty cache, and 100% of page views

  • have an empty cache. But then over time, more and more users are going to get this pixel,

  • this image, written into their cache and as they go to the pages that image is going to

  • get a 304 status code response. So after--and we've run these on various sites at Yahoo!,

  • it always happens that after about fifteen days, we hit a steady state and it always comes

  • out to these numbers. Pretty much no matter what website we're looking at, about 80% of

  • the page views are done with a prime cache or full cache and twenty percent with an empty

  • cache. And for users, it varies between 40% to 60% and I don't mean that we run it on

  • property X and it will be 40% then 60%, 40% then 60%. What I mean is, if property X is

  • really sticky, then maybe only 40% of the users are coming in with an empty cache every day,

  • whereas if property X is not very sticky, 60%. But it ranges in those values, in that

  • range. So, what does this tell us? Unfortunately, it tells us our job is really hard because

  • both of those numbers are really high. You can't ignore these users who are coming in

  • with an empty cache every day. You know, about 50% of your users are coming in with an

  • empty cache and that's going to be their first page view, that's really going to set their

  • expectations, their impression of what the site's performance is going to be like. So

  • you have to optimize for that, you have to make sure that when people come in with an

  • empty cache for that first page view, the page still really fires fast. But then you

  • also have to think about 80% of the time people are coming in with a prime cache. So you don't

  • want to do things that really optimize the empty cache but then kind of penalize that 80% of page views that happen over time. So, that was one experiment we wrote up, and you can go there and you can read the other ones.
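A sketch of the two log entries the experiment hinges on, as raw HTTP (the image name and dates here are invented; the essential parts are a Last-Modified that is identical on every server and an Expires in the past):

    # first visit (empty cache): the pixel is downloaded, logged as a 200
    GET /pixel.gif HTTP/1.1

    HTTP/1.1 200 OK
    Last-Modified: Tue, 09 Oct 2007 00:00:00 GMT
    Expires: Thu, 15 Apr 2004 20:00:00 GMT

    # later visit (primed cache): the browser revalidates, logged as a 304
    GET /pixel.gif HTTP/1.1
    If-Modified-Since: Tue, 09 Oct 2007 00:00:00 GMT

    HTTP/1.1 304 Not Modified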

  • So now we'll dive into the 14 rules. I'm actually going to--for the sake of time, because I want to try to wrap up in under an hour--skip four of them, and a lot of these you guys already know, you're already practicing. But certainly these four--avoid CSS expressions, reduce DNS look-ups, remove duplicate scripts, and configure ETags--are things that I haven't seen be a concern for Google sites. So we're going to skip those, and we'll go

  • through the others as fast as I can. So the most important one--and these are in approximate

  • priority order. So, we saw from that packet sniffer plot that there's a lot of http requests that happen after the html document comes down and that's [INDISTINCT] even more,

  • so our sites are becoming richer, there's more JavaScript on our pages, so an obvious

  • way to improve performance is to reduce the number of http requests. But the constraint I set for myself was, "How do you do that without changing the content

  • on the page?" Because I'm not a designer, I don't want to go back and tell the designers

  • that they should, maybe not have so many rounded corners or not so many images. Given the content

  • on the page, what the designers have come up with, how can you reduce the number of

  • http requests in that design? So one is--and we'll get to CSS sprites in the next slide,

  • but let me just do these other three. If you have six JavaScript files, just combine them

  • into one. So instead of six http requests, you're just going to have one; the reduced overhead is going to make that page faster, even if you have [INDISTINCT] on. Same thing if you

  • have four CSS files, combine them into one. Image maps are kind of old school but if it

  • works for you, if you have four icons that are next to each other and they could just

  • be one image map, do an image map. Inline images are very, very cool. Unfortunately

  • they're not supported in IE, but if you have--this is where you can actually take, like the contents

  • of an image or JavaScript file or anything else, and actually inline it in the page.

  • And so, for us there are some cases where this has been so important that we've actually forked our backend code to just deliver data URLs for browsers that support it.
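A sketch of an inline image as a data: URL (this base64 string is the well-known 1x1 transparent GIF, used here purely for illustration):

    <!-- the image bits travel inside the page itself: no extra http request -->
    <img alt="" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">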

  • But the most important one to have a big gain in performance is CSS sprites. How many people here have

  • ever built a sprite? Cool. The Google front page uses a CSS sprite. So the idea is--how

  • many people here have ever played with a Ouija board? Wow! More people have built sprites

  • than have used a Ouija board. I've never had that happen before. That's very cool. So the idea

  • is, you know, with the Ouija board, there's that glass thing that everyone puts their

  • fingers on, the planchette. And so think of--think of any box that you have in your page, [INDISTINCT]

  • expand or whatever, as that planchette, and the Ouija board is really all these images

  • that you've combined into a single image. So in this case, these are 60 icons. I don't

  • think the Yahoo! front page ever had all 60 of these on the page, but they have a lot

  • of them. And we said, "Wow, that's a lot of http requests. Let's combine those into a sprite."

  • And when we did that, they said, "Oh, well, if it's just one image, let's add these other

  • icons that we might want to use in the future." So we have 60 icons here, we just combined

  • them into one image with a little bit of white space separating them. And now we can take

  • our div like the planchette and we can slide it over the background image using the background

  • position CSS styling, and the size of the div will dictate how much of the background

  • shows through. So now we can get 60 icons available on that page without--with only

  • one http request. So this is really powerful if you have a lot of background images, combining

  • them into one image is a way to really cut down on http requests. There are some cases

  • where, depending on--if they're being used for corners, you might not be able to fit

  • all of your CSS background images into one sprite, but if you can go from 20 background

  • images to just four images that's still a huge savings. Something that's kind of interesting

  • is you would think that the overall combined size would actually be bigger than the sum

  • of the individual files because of the extra white space, but each individual file has

  • some color table and formatting overhead in it, so when you add them up, the combined

  • file, the sprite file is actually smaller than the sum of all the individual files, so

  • you save download size, too. Yes? Oh, and please ask questions in the middle.

  • >> [INDISTINCT] the white space? >> SOUDERS: Just so that your box might be

  • a little bigger than the actual image, like my icon might be a 16x16 size but maybe I

  • have one that's just a candle and it's very narrow. And I might have, if I said, "Well,

  • let me just try to squeeze them together as close as possible," I might have gone too

  • far inside that 16x16 space, so you--and then you just add that extra white space, just

  • kind of for a little bit of safety, right? It makes it a little more flexible. Okay.
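A minimal sketch of the sprite technique he's describing (the image path, class names, and pixel offsets are invented):

    /* one combined image serves many icons: one http request instead of sixty */
    .icon        { width: 16px; height: 16px; background: url(/i/icons.gif) no-repeat; }
    .icon-mail   { background-position: 0 0; }       /* slide the "planchette"... */
    .icon-search { background-position: -20px 0; }   /* ...a 16px icon plus 4px of safety space */

In the page, <div class="icon icon-mail"></div> then shows just that icon's slice of the background.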

  • So use a CDN. So I don't know what--I just did some nslookups on these popular sites with the CDNs they used and--yes, actually you guys host everything on the same domain.

  • So, maybe we could talk later and you could describe your [INDISTINCT] topology to me.

  • But you can see that, you know, Akamai is kind of the industry leader, and we made this

  • change on Yahoo! Shopping about two and a half years ago, they were still serving their

  • content off shopping.yahoo.com. We made this one change, moving all the static stuff to

  • our CDN, and it cut 25% off the response time, just this one change alone. And the point

  • I try to emphasize, especially to kind of startup companies, is: make this step before

  • you try replicating your architecture because, like I was saying before, splitting your backend

  • application across multiple data centers can be very complex and time consuming and this

  • is pretty easy. There are some costs involved, paying for a service, Akamai or whatever.

  • There's a new one, Panther Express, I think, which is very, very reasonable. I just heard

  • about them this weekend. But make this step first before you ever decide to split your

  • [INDISTINCT]. Next: add a far future expires header. So I want to mention here, I wasn't--I received

  • a very nice compliment on my book, that they thought, being that I was from Yahoo!, that

  • I was very even-handed in my analysis of different sites, including Google, in the book. I really

  • appreciated that. I tried, you know, very sincerely to be objective. So I just want

  • to point out here, I didn't pick Froogle because I was trying to find a Google site that was

  • bad. It's just that, www.google.com doesn't have really any content on it, it's not very

  • rich, and so it wasn't a very interesting one to analyze. So I looked around and I picked

  • Froogle as the kind of Google example to analyze for things like expires headers. And I should

  • also mention, that's why Craigslist isn't on here. Craigslist is still in the top 10,

  • but there's nothing on it. So I switched it out for AOL. But anyway, what we see here

  • is that, depending on the site that you're looking at, they more or less believe in making

  • assets cacheable on the browser. And so, the idea here, just really quick is, if you put

  • a far future expires header, now the browser has that on the disk, the next time the user

  • goes to the page, if it hasn't been flushed from the disk cache, the browser says, "Oh,

  • there's the thing I need. Oh, and look it's still valid. It's still fresh. It doesn't

  • expire until 2010 or 2038." So, it can just use it off the disk. Whereas, if you don't

  • have an expires header, the browser will see it on disk, if it hasn't been flushed, but

  • it'll say, "Oh, it's not fresh anymore. Let me make that If Modified Since conditional

  • GET request," and the web server can still luckily return the 304 if the asset hasn't

  • changed at all. But that's still a round trip. If you're in Idaho on a slow Internet connection,

  • that's going to slow down the page. So it's really good if you have static assets, to

  • put a far future expires header on it, in that way the browser, the next time the user

  • goes to that page, it can just read it off the disk without any network traffic at all.
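As a sketch, this is roughly what that looks like in Apache with mod_expires (the ten-year window is arbitrary; any far-future date does the job):

    # far-future Expires header for static assets
    ExpiresActive On
    <FilesMatch "\.(gif|jpg|png|js|css)$">
      ExpiresDefault "access plus 10 years"
    </FilesMatch>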

  • So, the challenge about this is suppose you have an asset, a JavaScript file or an image

  • and it changes, well, for years, we've had the policy at Yahoo! that once you push something

  • out to a large user base on the Internet, you can't change it, because there are so

  • many misconfigured proxies or overly aggressive caching technologies that they might not pick

  • up that change. And when we make a change, especially if it's like a bug fixed to a JavaScript

  • file, we want to make sure that every user gets that new file. So for years, we've had

  • the policy of putting a timestamp or a version number in our URLs of our static assets. So

  • if you're doing that, if you've already swallowed the pill that you can never change an asset

  • once it's pushed, the only way to do that is to change the file name, you might as well

  • make that asset cacheable forever.
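A sketch of that version-in-the-URL convention (the file names are invented):

    <!-- the date is part of the file name, so the file itself never has to change -->
    <script src="/js/site_20071009.js"></script>
    <!-- a bug fix ships under a new name, so every user is guaranteed to fetch it: -->
    <!-- <script src="/js/site_20071016.js"></script> -->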

  • So, sometimes people say--like CNN, only two out of 151 assets have a far future expires header. Well, that's a news site. Maybe a lot of these

  • things were changing a lot, their photos, and they're constantly changing, it was just

  • easier for them. So, then what I do is I look at the median age. This is the number of days

  • between when I ran this test and when the asset was last modified, based on the last modified header, so I can count the number of days it has been since this

  • asset was modified. And if I look at the median of those on CNN, it's been seven months. Fifty

  • percent of the assets on this page that are not cacheable have not been touched in seven

  • months. And so we can see that value for other sites. So we kind of had the same attitude,

  • especially with JavaScript and CSS at Yahoo!. "Oh, well, you know, those are changing a

  • lot." And when we looked at the last modified header, we saw that they really weren't changing

  • as much as we thought. So it's a good practice to look at that and figure out if it would

  • make sense to give everything a far future expires header. So, it's kind of interesting

  • to look at these rules from the perspective of, do they help empty cache, so first page

  • viewers, only prime cache or subsequent page viewers or both. So this is one that—-and

  • the ones that help both are really, really key. So this is one that helps both, especially

  • first--people who visit the site for the very first time. So you can just compress anything

  • that's not already binary. You don't want to compress images or flash or PDFs but--not

  • just HTML documents but JavaScript files, CSS files, JSON requests, Ajax requests--all

  • of those can be--can be compressed. And typically you'll cut about 70% off the size of what's

  • sent over the wire. And so everything sort of gets there a lot faster. And there's always

  • some [INDISTINCT] browsers that might have problems but that number is getting smaller

  • and smaller. And you can do different approaches, like a white-list approach, to turn gzip on or not. Yeah, and here's the point I was making. It's pretty popular to gzip the HTML

  • document but it's less well known or less practiced to gzip the CSS and JavaScript.
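For example, with Apache's mod_deflate (a sketch, including the classic white-list-style exceptions for old Netscape 4 builds):

    # compress text responses; images, Flash, and PDFs are already binary
    AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch ^Mozilla/4\.0[678] no-gzip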

  • So, that would be good to do. So this is an interesting one. This actually doesn't make

  • the response time, the mechanical instrumented response time any faster, but it makes the

  • perceived response time faster. And that's really what we're after, trying to make that user

  • experience feel as fast as it can. So the thing that happens here is it's a little different

  • in IE and Firefox. In IE it gets the HTML page, parses it, finds all the assets that

  • have to be downloaded, all the components or resources and maybe at the bottom, there's

  • a CSS file, right? Well IE says, "Well, that CSS file might change the way that I draw

  • elements in the page, tables, or anchors. So what I'm going to do is, I'm not going

  • to draw anything in the page until I download that CSS file." Well, since it's the last

  • thing in the page, it's one of the last things likely to get downloaded. So IE will

  • have all the static html text in the page, it might also download images and other things

  • in the page, it's going to hold all of those, it's just going to leave the page white until it downloads that final CSS file, and then finally draw the page. So this isn't

  • what you want. You want to get the CSS files, the style sheet declarations, inclusions up

  • in the head, and that's also what the spec says to do. So it's a good thing to do. Firefox

  • is different, Firefox will render things when it has them and so if you had the same scenario

  • where you had a style sheet at the bottom, it would render the html, render the images,

  • finally it would download the style sheet which would say, "Draw everything differently,

  • change the font, change the way anchors look." And now you have what's called the flash of

  • unstyled content. The page gets redrawn, and it's a flashing experience for the user, which

  • isn't pleasant either. So the key is to put the style sheets in the head. Another small

  • change to try to follow is don't use the @import rule because in IE that will cause the style

  • sheet to actually be deferred later in the page, and since it's deferred you'll have

  • this non-rendering, no progressive rendering behavior in IE, so use the link tag for pulling

  • in style sheets.
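A sketch of the two inclusion styles (the file name is invented):

    <head>
      <!-- good: a <link> in the head lets the page render progressively -->
      <link rel="stylesheet" type="text/css" href="/css/site.css">
      <!-- avoid: IE defers @import-ed sheets, blanking the page until they arrive -->
      <style>@import url("/css/site.css");</style>
    </head>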

  • And kind of the other side of the coin is scripts. Scripts have two bad behaviors. One is they block all parallel downloads. So with http 1.1 you can download

  • two components per host name in parallel. So if everything was on one host name you'd

  • see a stair step pattern like this. But maybe you're using two or three host names, so you

  • can actually get some things in parallel but still on any given host name no more than

  • two. I've actually gotten IE to download 114 things in parallel. But as soon as the browser

  • hits a script, and this is in both IE and Firefox, it won't download--start any other

  • downloads no matter what the host name is until that script is returned. So, one thing

  • you want to be careful of is putting the scripts higher than they need to be. If we could move

  • the script, maybe we could move it all the way down. Maybe it actually is doing a document

  • write or something like that. But if we can move it just a little ways down, some of those

  • images would actually be drawn. Once they got downloaded the browser would go ahead

  • and render them and all of the text above the script would be rendered. But not only

  • do scripts block parallel download, they also block rendering. Anything that's below the

  • script will not be rendered until the script is downloaded. So it's not always possible,

  • you know, you might have scoping issues that require it to be higher in the page but if

  • not, move the scripts as low in the page as possible, or better yet load them with an onload event handler. And defer doesn't help; it only works in IE and it's not supported by Firefox. And even in IE it doesn't defer to the end, it just defers it a couple resources, so it will still have this blocking behavior.
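A sketch of the placement he's recommending (the file name is invented):

    <body>
      <p>Static text and images up here render right away...</p>
      <img src="/i/photo.jpg">
      <!-- moved out of the <head>: nothing is left blocked behind it -->
      <script type="text/javascript" src="/js/widgets.js"></script>
    </body>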

  • We'll skip rule seven. Rule eight. So far I've kind of talked a lot about scripts and style sheets being external, but should you

  • really make them external or not? And so, that really kind of depends on the way users

  • interact with the site. So if this is a site for example, that users only come to three

  • times a month and they only have one page view, you shouldn't make your JavaScript and CSS external--inline it. Because the advantage of making it external is, it will be cached

  • and the next time the user comes, they'll--they won't have to download that 10k or 40k of

  • JavaScript. But the user is only doing one page view. Their next page view might not

  • be for eight or ten days and by that time, especially when we look at how many users

  • come in with an empty cache, that asset might have been purged from their cache. So, it might

  • be the fastest experience for that type of site to inline everything. But then if you're

  • a site that has multiple page views per session or a high re-visitation rate, you might want

  • to make those--that JavaScript and CSS external components with a far future expires header,

  • make them cacheable, and now when the user comes in they might have to do an extra http

  • request on their first page view but now for the next four page views that will be a faster

  • experience, it will be less bandwidth cost. And there's a couple extra credit things you

  • could do here. Post-onload download is one--okay, maybe mail is a good example.

  • On the first page of mail, the mail launch page, I want that to be really fast. So I'm

  • going to put all my JavaScript and CSS in the page itself, then in the onload event,

  • I'm going to download that JavaScript and CSS again but I'm going to download them in

  • their external file format and they'll get written to cache. So now when the user actually

  • goes to the next page, you can include those external assets and they'll already be in

  • the cache and that page will be very fast because it'll read the JavaScript and CSS

  • from cache rather than having to download it over the wire again. The problem with that

  • is, that first page, that launch page, is always going to have that JavaScript and CSS

  • in it even if they have the external assets. So then you can do something like, when you

  • download those external assets, set a cookie, a pretty short lived cookie, maybe session

  • based, maybe just a day or a week. And now on the backend server, when you're serving

  • the launch page, look for the presence of that cookie. If you see the cookie, it's a

  • good indicator that you have the external assets, so you use script src. But if you don't

  • see the cookie, inline it and do the post-onload download.
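A sketch of the post-onload download plus the cookie hint (names are invented; a real version would also chain any existing onload handler):

    // after onload, fetch the external copy so it lands in the browser's cache
    window.onload = function () {
      var script = document.createElement("script");
      script.src = "/js/site.js";   // the same code that was inlined in this page
      document.getElementsByTagName("head")[0].appendChild(script);
      // a short-lived cookie tells the server the next page can use script src
      document.cookie = "assets=1; path=/";
    };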

  • Rule 10. Minify JavaScript. Google certainly does this. You'll see--most of the top 10 sites don't do it. Minification,

  • just removing white space, comments. But you could also minify inline scripts, too, and

  • there's even fewer sites that do this. And I would even argue it might be easier to do

  • that because you just have to hook in--hook all your JavaScript--script insertions into

  • a function. And JSMin is kind of the most popular one; it was written by Doug Crockford at Yahoo! and it's available in almost every language, so it would be easy to hook into your backend system. But I wanted to point

  • out, just recently, Julien Lecomte has come out with the YUI Compressor that's available

  • as of about a month or two ago. And it works on JavaScript and CSS, so that's nice. But

  • it's more of an obfuscator. Typically obfuscators have--so they have greater savings, you see

  • here, minification cut 20%. Obfuscation where we take long function, variable names, symbol

  • names and make them shorter has even greater savings as you would expect, but they can

  • also introduce bugs. I've seen that happen. But the nice thing about the YUI compressor,

  • Julien has taken a little different approach there, it's very safe. It's almost as safe

  • as JSMin. Now it's not as fast as JSMin, so if you're doing real time clean up of your

  • JavaScript, I recommend using JSMin, but if you're doing that as part of a build process,

  • the YUI Compressor will actually have greater savings, and it works on CSS as well. Avoiding redirects--I kind of call this the worst form of blocking. I talked about how scripts

  • can block downloads, but if you put a redirect in front of the html response, everything

  • is delayed. So we can do things at our html page that I've been talking about to increase

  • parallelization, to increase progressive rendering, but if you put a redirect in front of the

  • html document, none of that hard work can be taken advantage of. So try not to put redirects

  • in front of your html documents. Last rule I'm going to talk about is making the Ajax

  • cacheable. So Ajax requests are dynamic, and a lot of times we think--like our html documents,

  • we almost never want to make cacheable because they're very dynamic. Okay, so Ajax responses

  • are dynamic. Well, yes, so maybe we shouldn't make them cacheable. And a lot of times Ajax

  • responses are personalized. They have parts of them that are only appropriate for that

  • single individual user. So you would kind of think, "Well, maybe I shouldn't make that

  • cacheable." I mean that's not a static image for example. But the scenario I always bring

  • up is something like a mail web app. Maybe the mail application on launch is doing an

  • Ajax request to get your addresses, right? And so those addresses, maybe you change your

  • addresses a lot. For me, maybe I add an address once a week. So if I go to mail three times

  • a day, seven days a week, 21 times I'm going to that page and it's making 21 downloads

  • of my Ajax address book. Whereas, if instead we just put something in the URL that indicated

  • the timestamp of that address book, when was the last time I edited my address book? And

  • just--on the backend server when you're stitching together the URL for that Ajax request, make

  • sure that you embed that in the URL. If I haven't edited my address book, the URL's the same, and the Ajax request will just read from disk. But if I have changed it, the Ajax URL

  • will change and now I'll make another request but I'll make it cacheable. So look at your

  • Ajax requests, you might be able to make those cacheable as well.
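A sketch of the timestamp-in-the-URL idea (the URL and version value are invented; the server stitches the last-edit time into the page):

    // the server embedded the address book's last-edit time in the URL it emitted
    var addressBookUrl = "/ajax/addressbook?v=20071009221530";
    // unchanged book: identical URL, so the far-future-Expires response is read
    // from the disk cache; after an edit: new URL, exactly one fresh download
    var xhr = new XMLHttpRequest();   // IE6 would need ActiveXObject instead
    xhr.open("GET", addressBookUrl, true);
    xhr.send(null);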

  • So then really quick, I'm going to talk about the second edition of the book, I'm going to give a little prelude

  • to that. I've got five new rules. One is split dominant content domains. So this is Google

  • News, and you see that stair step pattern, two, two, two, two, two, right? And that's

  • because all of those images are using the same host name. So, and we could see that

  • kind of two, two, two, right? You can see that whenever, you know, a dark blue one ends,

  • the next one starts. The next one starts, the next one starts. The light blue one, so

  • on and so on. So if we actually use two host names there instead of one, we could get four

  • downloads in parallel and look what it does to the overall response time of the page.

  • It cuts about a third off the response time, right? So we've done studies and we wrote

  • them up there about, well, you know, how far should I take this? Should I use three hostnames,

  • four hostnames, five hostnames? We found that once you go to four and above, the benefits

  • start degrading because of DNS look-ups and I think thrashing on the browser side, on

  • the CPU side, but definitely splitting things across more than one domain is something to

  • investigate.
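A sketch of the split (hostnames are invented; both can point at the same servers):

    <!-- one hostname: HTTP/1.1 browsers fetch only two of these at a time -->
    <img src="http://img.example.com/a.png"> <img src="http://img.example.com/b.png">
    <!-- two hostnames: four parallel downloads, roughly a third off this page -->
    <img src="http://img1.example.com/c.png"> <img src="http://img2.example.com/d.png">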

  • Just be careful of cookie weight; the expiration model is different than http's. So a lot of times I think when people are creating cookies they'll--they don't think

  • about aggressively cleaning up that cookie weight. So, try not to set your cookie expiration

  • dates too far out or you might see those cookies lingering for a while. At Yahoo!, we host

  • our static content on a different domain. It's not on yahoo.com, and that is to avoid

  • that cookie weight. For our static content, it doesn't change based on the user's cookie

  • state, so we serve it on a different domain and those http headers are much smaller. Minify

  • CSS now with the YUI Compressor; I'll start doing more research on that. And then, we also want to do, maybe not obfuscation but simplification, like changing #ffffff to just #fff, 0px to just 0. Use iframes wisely. Especially for ads, putting third party content inside an iframe

  • has some benefits, especially sandboxing JavaScript, but just be careful about how much you do that. Don't go creating iframes willy-nilly. They're very expensive [INDISTINCT] just a

  • blank iframe can add twenty to fifty milliseconds to the load time of the page. And so just

  • be careful about that. I got some more details about that. That'll be coming. And look at

  • optimizing images. I haven't talked about this too much in my previous work but we've

  • seen sites where we could save 80% on the size of images by changing the format, we're

  • just optimizing the format that we are using without any loss in image quality. So I'll

  • be writing about that, too. So I'm almost done. Yahoo! Search is our poster child. We

  • started working with them about a year and a half ago. We recommended these kinds of deep frontend performance changes, and over the year and a half we've cut the response time by 40% for broadband users. You can see--I always make the analogy, it's like closet space

  • at home, right? You clean out the closet and the next week it's full again. So you can

  • see they're like, you know, we drive things down and then the people go, "Oh, it's a lot

  • faster now, now we can add features." So you're always toeing the line, right? You got to

  • fight the battle to keep everyone focused on performance. But sometimes it does allow

  • you a little room to add features that users really want, that improve the experience.

  • So, my book is out, it's been out for about two months: High Performance Web Sites. Tenni

  • and I do a lot of talking at conferences. I mostly blog on YDN, she's on yuiblog and

  • we've also released YSlow. How many people here have used YSlow? Very cool. How many people

  • here use Firebug? Great. So there's the URL, you can download it. It's a performance lint

  • tool. It gives you a grade against these 13--actually the first 13 rules. It doesn't recognize Ajax

  • right now, we're working on that. It's an extension to Firebug so it's an extension

  • to an extension, and it's open source. So what I wanted to do now was just really quick,

  • look at a couple sites. So, here's www.google.com--oh, yes?

  • >> [INDISTINCT] performance getting worse again? Yeah, do you have any, sort of, ongoing

  • testing or, you know, submit rules that ensure people don't make performance worse?

  • >> SOUDERS: Yes, so the question was, do we at Yahoo! have any kind of ongoing analysis

  • or monitoring that helps us make sure that people aren't making the performance worse.

  • Yes we do. We have ways of running YSlow, there's this little option to run YSlow in

  • autorun mode which means it will kick off automatically and if you--I'll give you my

  • card, if you send me an email there's a couple open source technologies where you can actually,

  • from, like, a Perl script, reach inside and touch the DOM of the browser. And so you could

  • pull out--you could build some kind of harness that could run things in an automated fashion

  • with a set of scripted URLs. But really the biggest thing that we've done and the reason

  • why we did--why we did YSlow in Firebug: I had--I first wrote YSlow about two, two and a half years ago, as a bookmarklet, then a Greasemonkey script, and then Firebug came

  • out and it really took off at Yahoo!. Most of these rules--some of them have to do with

  • CDNs and web server configuration but really, a lot of these rules are targeted towards

  • frontend developers. And Firebug is the tool of choice, at least at Yahoo! for frontend

  • developers. And so we wanted to try to keep this focus on performance during the product

  • development cycle. And the way to do that, that has really worked out, is put it in the

  • tool that the--that the developers, the frontend developers are already using. So that every day,

  • as they're doing stuff, they'll just run YSlow every once in a while. And they'll know what

  • their grade is for the thing that's out live right now, and they can see if they're getting

  • better or worse. And it kind of keeps people on their toes or at least, it makes them aware

  • that there's a--if they're adding a new feature, there might be a response time penalty for

  • that, a tradeoff to consider. So, we see here, Google got a 99, it's not possible to get

  • a 100. So this is the highest grade you can get. But I just wanted to point out, so we

  • can see here that it took about 170, I'm on about four megabits per second right now,

  • it took about 170 milliseconds and it's about 11K. So now I wanted to do this. How am I

  • doing on time? Pretty good? All right. So very different styles, two front pages, right?

  • Yahoo! has a lot more content and again, I'm not a designer, I don't want to talk about

  • the--the tradeoffs there. But I just want to point something about the way YSlow works.

  • So we see here, this page took one second and was 153K, and yet it still got an A,

  • how is that? YSlow, as I mentioned, I try not to make recommendations about changing

  • design. YSlow looks at the quality with which a site was built. So, different sites are

  • going to have different design requirements, some might need more images, some might need

  • less, some might need more JavaScripts, some might need less. So, what YSlow does, is it

  • says, "Given what you've done in this page, have you done it the best way possible?" So

  • you might have a lot of JavaScripts but did you minify it? You might have a lot of CSS

  • background images but did you do them with sprites or not? So, even with--so YSlow

  • doesn't take any consideration at all the size or response time of any of the assets

  • in the page, it's just looking at the way the page was built. So, I'll come back to

  • that in a minute but I just wanted to--you know what, actually is--we're running out

  • of time a little bit. So, I'm going to wrap up, and then I'll do some questions. So, on

  • that point, here I have the top 10 sites, their total page weight, response time, and YSlow

  • grade. I did a little correlation coefficient calculations, so just a reminder, correlation

  • coefficients run from -1 to 1: -1 means inverse correlation, 0 means no correlation, 1 means highly correlated. Anything above 0.5 is considered strong or

  • high correlation, typically. So we're not too surprised to see that there's a very high

  • 0.94 correlation between response time and page weight. So, if you have a bloated page,

  • it's typically going to be slower. But what I was very satisfied to see was this high

  • correlation between response time and YSlow grade. So, YSlow was built to measure these

  • best practices that if you follow them will make your page faster. And what we found and

  • we've looked at a much larger number of sites than just these, what we found is, during

  • development, where sometimes it's difficult to gather a lot of response time measurements

  • on your page as you're building it, you can have a pretty good idea of how the page is

  • going to respond based on your YSlow grade. If it's getting better than what's out there,

  • you're probably going to have a faster page. If it's getting worse, your YSlow grade, you're

  • probably going to have a slower page. So, I think that's very powerful. So, I wanted

  • to wrap up the takeaways. The main thing that I emphasize is looking at performance from

  • a different perspective. I don't want to say, "Don't optimize backend performance, especially for hardware cost and power consumption," but if you really want to put a dent in response

  • time, you got to look at this frontend part of it. A lot of these things are not that

  • hard to implement and it's not that development from this point forward is going to be 10%

  • slower because of these practices, a lot of them are, "Get this mechanism in place, and

  • then it's behind you." It's a gift that keeps on giving. So, harvest this low-hanging fruit

  • early on and then you'll just reap the benefits of it on an on-going basis. So, that's that

  • one, make the investments. And you should feel empowered that you control response times.

  • So it used to be, I think that we felt the [INDISTINCT] speed was actually the most critical

  • controller or variable for the end-user response time, but we've been able to cut some response

  • time by like 50% by making these engineering changes. So there's really a lot that you

  • do control in how fast your pages will appear to users. And finally, look out for number

  • one. At Yahoo!, the users are number one; we're always focused on doing harder work,

  • us taking on harder work, to make their experience better. We kind of feel that we're the last

  • line of defense before the pages get out to users and we want to do everything we can

  • to make that experience as fast as it can be. We think that's critical at Yahoo!. And

  • I hope all of us doing frontend work feel that same way and we're all looking out for

  • users on the Internet. And that's it, thank you. So, I'll take questions.

  • >> [INDISTINCT] >> SOUDERS: That's the Verrazano Bridge in

  • New York City. The question was, what bridge is that? Yes?

  • >> So how exactly do you measure [INDISTINCT] >> SOUDERS: Well, you can use--we have various

  • ways of measuring response times. You can use services like Keynote or Gomez. You can

  • use tools like Fasterfox. The definition I have for response time is the unload event

  • to the onload event. And, excuse me, the main reason that I've defined it that way is because

  • it's a measurement that we can apply across all pages. And it's absolutely true that the

  • more important thing is not to optimize this instrumentation that's been put in place,

  • it's to optimize the user experience. So, if users engage with your page after the first,

  • you know, everything above the fold is rendered or like in My Yahoo!, the first modules across

  • the top are rendered, then that's great. And if you can, try to figure out a way to measure

  • to that point in the page where users feel the page is engageable and they feel it's

  • downloading. But if you're not sure, we fall back to that definition, unload to onload.
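A rough sketch of that unload-to-onload instrumentation (the cookie name and beacon URL are invented):

    // on the previous page: stamp the moment the user leaves
    window.onbeforeunload = function () {
      document.cookie = "t0=" + new Date().getTime() + "; path=/";
    };
    // on this page: response time = onload minus that stamp
    window.onload = function () {
      var match = document.cookie.match(/t0=(\d+)/);
      if (match) {
        var t = new Date().getTime() - parseInt(match[1], 10);
        new Image().src = "/beacon?t=" + t;   // report it with an image beacon
      }
    };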

  • Also, you can play games: you can move a lot of stuff that's critical to the user experience to after the onload event and make that mechanical time shorter. And so we try to

  • emphasize, you know, whatever time you're measuring, try to understand how that instrumentation

  • works and really understand how it reflects what the user perceives the response time

  • to be. Yes? >> More and more websites are [INDISTINCT]

  • and there's a lot of [INDISTINCT] do you see YSlow [INDISTINCT] performance [INDISTINCT]

  • in the future? >> SOUDERS: The question was, we observed

  • that there's a lot more web 2.0 HD [INDISTINCT] web apps that have a lot of JavaScripts that

  • runs on the client side, do I see YSlow helping to measure that or make performance suggestion

  • there. I'm not envisioning that right now. We're starting to, at Yahoo!, especially with

  • people like Julien and Doug there, we're starting to try to more formally gather our JavaScript

  • performance best practices, similar to what we've done with kind of overall frontend browser

  • web server interaction. And I also feel that Firebug does a pretty good job of profiling,

  • in fact, does an incredibly good job profiling JavaScript code on the client. And so, I don't

  • feel that there's such a lack of tools to help with that right now.

  • >> [INDISTINCT] >> SOUDERS: Yes, this gentleman pointed out

  • that performance will also vary browser to browser. So you have to pay attention to that.

  • You know, I will say that the 14th rule came out from looking at more Ajax-y apps where

  • there's so much JavaScript, you know, hundreds of kilobytes of JavaScript in some cases,

  • that all of the sudden the amount of http traffic was not really the issue. It was that

  • JavaScript code executing. Now, we did kind of--we did come up with rule 14 which is,

  • "Okay, how much of that hundreds of [INDISTINCT] of JavaScript that's in, maybe in an Ajax

  • response or Json response, could that be cache?" And that will help. But there's still this

  • whole other world where even if--even if everything was in cache, just reading that JavaScript

  • off disk and having to execute it could take seconds. And there's some exciting things

  • coming up in JavaScript performance out of Microsoft and, I forget who the other folks

  • are. So stay tuned to that, they'll--you know, some ways of improving JavaScript performance

  • by an order of magnitude or more. Any other questions? Yes, in the back.

  • >> What's the intuition behind the browsers stalling [INDISTINCT] when it's downloading

  • some scripts. [INDISTINCT] >> SOUDERS: The question was, why do the browser

  • developers stall things when they download scripts. So the idea is that, suppose you

  • downloaded scripts in--so, I want to preface this by saying I've talked with the IE team

  • and I'm hoping to--I've passed on this suggestion to the Mozilla Team, there's no reason you

  • couldn't download scripts in parallel and get everything to work correctly but it would

  • be a fair amount of work. And here's the reason they don't do it now, why they didn't do it initially.

  • Suppose you have two scripts, A and B, A is a hundred K and B is one K, but B requires

  • A and you list them in that order and you decide to download them in parallel. Well, guess

  • what? B is going to come back first, it's going to be executed and it's going to generate

  • errors because the code it relies on has not been downloaded yet. So, that's one reason,

  • there's also the document write reason. So, that's one reason why they only download one

  • thing at a time to make sure that scripts will be downloaded and executed in order.

  • But clearly we can all think of ways that you could achieve that without having to have

  • this blocking behavior. I will say that Opera 6 does download images in parallel with

  • scripts and that's kind of nice but still it will not download more than one script

  • at a time. Okay. Two more questions--okay, three more questions.

  • >> Do you think [INDISTINCT] how much of [INDISTINCT] you can get by turning on pipelining?

  • >> SOUDERS: Yes, the question was, have we looked at how much pipelining helps with performance.

  • It helps a lot, I don't have numbers that are firmer than that because we feel it's

  • moot. I forget what it is, like, Firefox has pipelining support but it's turned off by

  • default. And IE, even IE 7, doesn't support pipelining. So it's going to be so long before we could really take advantage of that; we've looked at that a little but we haven't spent

  • too much time quantifying it. >> Have you looked at JavaScript parse times

  • at all? Are there certain constructs more expensive to parse?

  • >> SOUDERS: Do you mean parse time separated from execution time? No, I don't have any

  • best practice--the question was, have we looked at best practices for making JavaScript parse

  • time faster, nothing comes to mind. >> When you're actually in the process of

  • doing [INAUDIBLE] how are you measuring [INDISTINCT] YSlow? Like before and after you do everything?

  • >> SOUDERS: Well, the grades we measure with YSlow, do you mean the response times?

  • >> MALE: Right. Right. Like if you have some specific [INDISTINCT] how do you measure [INDISTINCT]

  • as you go, as you [INDISTINCT] >> SOUDERS: Oh, well, a lot of times--so you

  • don't mean, mechanically, how do we measure the time? Just what's our process for...

  • >> Yes. [INAUDIBLE] >> SOUDERS: Yes. So, you know sometimes there's

  • nothing you can do, you make a change and it just goes out and you just have to measure

  • the site that's out there. And there are lots of variables that could make that comparison invalid. So, what we try to do is, we try to run things in parallel. So we'll have a subset

  • of users, some that are exposed to the benefit and some that are the control and then among

  • those users we can see how fast or slow the page is. So, that's the best way to do side by

  • side comparisons. >> [INAUDIBLE]

  • >> SOUDERS: Yes. Or we can use scripted means, as well, Keynote, Gomez things. And maybe

  • last question, and then I'll wrap up. Was there one more? Yes?

  • >> Have you received contributions to YSlow from outside of Yahoo!? And if so, what kind?

  • >> SOUDERS: The question was, have I received contributions to YSlow from outside and if

  • so, what kind. Now the--it's an open source license but it's not open code; we don't have the code published. We get--I won't say lots, maybe one or two emails a day of bugs

  • or suggestions. And we're, you know, working on a new release--I do monthly releases.

  • But certainly, going to an open code, like Firebug has, is something that we've talked

  • about and I imagine will come sometime, but it's not on the roadmap right now. Okay.

  • So I want to wrap up and let everyone get back to work. Thank you for having me.
