
  • >> Okay. Hi, guys. So this is Jay Pipes. He's a MySQL employee. He's in charge of North

  • American Community Relations. I met him this week at the MySQL User Conference which is

  • down in Santa Clara. Now, the deal is that we're going to host this on Google video.

  • So if you have confidential questions, if you could leave them till the cameras are

  • turned off, that would be cool. With that, no more ado, Jay Pipes.

  • >> PIPES: All right. Nice to meet all of you. I'm going to be giving a talk that I gave

  • on, I think, Tuesday or Wednesday, on "Performance Tuning Best Practices for MySQL." Some of

  • the things are version specific but for the most part, it's pretty general. We're going

  • to be covering Profiling and Benchmarking Core Concepts, Sources of Problems that you

  • might encounter, Index Guidelines, Schema Guidelines, Coding Techniques, and lastly,

  • we'll go over some general things about tuning server parameters. Since the audience is,

  • I think, sort of mixed levels of experience, just shout out questions or tell me

  • if you want more in-depth stuff or so. All right, Benchmarking Concepts. The general idea behind

  • benchmarking MySQL is to, obviously, have a track record. You know, when you change

  • things in your code, in your application, or the configuration files with MySQL, or

  • your schema design, or you're indexing, you want to make sure you have a track record

  • of what you're doing. Because a lot of times, if you--if you don't keep a track record and

  • don't do benchmarking, you won't really know whether a specific change that you made had

  • a detrimental impact or whether it actually increased performance. So, the baseline--and

  • you always want to have your baseline for your starting point. And always give yourself

  • a target. Don't just do it so it's a moving start--moving target, you always want to have

  • an end goal, you know. You either want to get, you know, 20% increase in performance.

  • So that's your goal. But just don't have it, you know, just kind of end with nothing because

  • then you'll just spend years and years and years trying to, you know, tune milliseconds

  • out of your--out of your application when there's really no benefit to it. So, change

  • only one thing at a time. A pretty basic concept but it really--it goes a long way in benchmarking.

  • If you change two things and you rerun a test, how do you know which one of those two things

  • actually affected the difference between the two tests? So, always change one thing at

  • a time and rerun the benchmarks. And record everything, configuration files, information

  • on the hardware, information on obviously the operating system, the version of MySQL,

  • and everything to do with the benchmarks. Keep it in a separate folder and keep everything

  • for historical record. A lot of times, you'll think that you won't need information when

  • you're running benchmarks. And then, six months later, you realize that, "Wow! It would be

  • really useful if I had--oh, I threw it away". So, always keep information from your benchmarks

  • so you have a historical record of what you did. And specifically with MySQL, you want

  • to disable the query cache whenever you're running benchmarks. The reason is it will

  • skew the results that you get from read-intensive queries and write-intensive queries as well.

  • So, disable the query cache by setting the query cache size equal to zero when you--when

  • you run benchmarks, so. Yes. Go ahead. >> [INDISTINCT] benchmarking for [INDISTINCT]

  • problem with the operating system, caches [INDISTINCT]?

  • >> PIPES: Right. Yes. The question was--what you were experiencing that the operating system

  • caching was skewing the results of the benchmark. Generally, the idea is either you can disable

  • caching if it's actually getting in the way or if you bump up the actual number of runs

  • in the benchmark tests you can minimize the skew effect. So that's usually what we recommend.

  • But, when you get into specific caches there's very specific things to do. So it kind of

  • depends on the exact scenario that you're doing, so. But, yes. Bumping up the number of runs

  • in the benchmarks generally tends to mitigate a lot of that, so. Does that answer your question?
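Coming back to the query-cache tip from a moment ago, a minimal sketch of the setting, assuming a pre-8.0 MySQL server (the query cache was removed entirely in MySQL 8.0):

```sql
-- Disable the query cache before a benchmark run so cached result sets
-- don't skew read-intensive (and write-intensive) measurements.
SET GLOBAL query_cache_size = 0;

-- Confirm it took effect.
SHOW VARIABLES LIKE 'query_cache_size';
```

The same setting can go in my.cnf (`query_cache_size = 0`) so it survives a server restart between runs.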

  • Okay. Profiling Concepts, profiling is a little different than benchmarking. Generally, you're

  • profiling a production or running a test system that you're trying to figure out, you know,

  • diagnosing what's going on while it's running. So, instead of doing--you know finding the

  • stress or the load that the system can take, you're actually trying to pinpoint where the

  • bottlenecks are in the code while it's running. With MySQL, the best tool that you have is

  • the EXPLAIN command. So, regardless of what version of MySQL you are running, you really

  • want to get familiar with everything in the EXPLAIN command. Understand what the access

  • types--does everyone know what the EXPLAIN command is? Everyone use it? Okay. Does everyone

  • know what every single type of access type is? Okay. It's sort of a poorly named column

  • in the--in the EXPLAIN results. But, it actually describes the path or the optimization pattern

  • at which the optimizer's saying, "Okay well, I'm going to, you know, access, you know,

  • these results at and join it to these results at." And each of those things, the ref, the

  • ref_or_null, the eq_ref, the range--all the different access types mean different things. And the

  • performance that you get from each of them, you can go into MySQL manual, depending on

  • your version. You can see which ones have a better performance for different types of

  • queries. It's not always the same. So, you want to make sure you understand those. Understanding

  • obviously that if you see ALL in the access type, it means it's doing a full table scan.

  • You know just little things like that. Be aware of what they all mean and what the performance

  • impact of the different ones are. So, by using the Slow Query Log and mysqldumpslow to parse

  • the query log, very easy to set up, Log Slow Queries and then you give it a long query

  • time, anything above that, it will log to the--to the Slow Query Log. There's also--I

  • think in 4.1, we put in an option that you can--anything that--any table, any select

  • statement that does not use an index on a table can get logged to the Slow Query Log

  • regardless of how fast it runs. And that's--and that's useful for when you're doing an indexing

  • strategy and you're trying to determine, okay, my patterns of access--do I--do I have indexes

  • on those fields that I'm actually accessing? You know, in my WHERE clause and ON clause

  • and that kind of thing. So, low-hanging fruit, law of diminishing returns: tackle stuff that's

  • going to get you the absolute best return on investment for your time. Don't concentrate

  • on, you know, I know you guys are Google. So, a millisecond definitely does count. But,

  • if you got a--if you got a choice of, you know, going from five milliseconds to four

  • milliseconds, versus going from a minute to, you know, a second, it's a no-brainer. You

  • know, concentrate on the stuff that--that's going to give you the biggest impact. Using

  • Mytop. Jeremy Zawodny wrote a nice little tool. Most of you are probably familiar with

  • the top utility in UNIX. It's almost identical but it's just for MySQL. And it gives you

  • an ability to on a delay--on a repeating delay, you know, show you the process list of what's

  • going on in MySQL, and show, you know, show various show commands, the status variables

  • and gives you a nice little interface to profile into the system while it's running. And one

  • nice thing about Mytop that I really like is that you can--you can use it for remote

  • hosts as well. So, you can--you can run it on your local machine and access a remote

  • host. So that it is a nice little tool. Okay. Sources of Problems. This is kind of my personal

  • opinion of where I rank the law of diminishing returns, like what I actually like go for

  • immediately. The performance team tends to think that, you know, sometimes you can--you

  • can tune a server parameter and that's like the biggest bang for the least, you know,

  • amount of time. But I think, really, where performance problems come in are the--are

  • the definitely the top three here. Poor Indexing Choices will kill an application faster than

  • really anything else. If you're querying on a field and you don't have an index on it

  • and you're repeatedly hitting that, it's going to kill the application faster than, you know,

  • whether sort buffer size can be adjusted. So, same thing with the Inefficient or Bloated

  • Schema Design, you know, I talked in the--in the performance talk. I've seen a number of

  • cases where people designing new applications will actually de-normalize from the beginning

  • because they think it's like going to give them better performance. Don't do that. You

  • know, there's specific cases where you can de-normalize the schema and you can get some

  • performance benefits. But don't do it right away. You know, just because you think it's

  • going to get, you know, a performance benefit. Use good, efficient, normalized schemas. And

  • we'll talk--we'll go into detail on this later. Bad coding practices. MySQL has some inefficiency

  • problems with sub-queries. But in general, using joins is a lot more efficient than using

  • sub-queries. We're going to show some specific examples where you can really get a performance

  • increase by just simply changing the way that you--the way that you code SQL. Server Variables

  • Not Tuned Properly, that's a source of problems. But in my opinion, very specific situations

  • can be--can be helped by tuning server parameters. It's not some catch-all, you know, it totally

  • depends on, you know, how much memory you have in the machine, what the architecture

  • is, what operating system you're running on, what version of MySQL, what storage engines

  • you're using, your, you know, the type of the application that you have, whether it's

  • OLTP or heavy reads, you know, all these things. Yes. Tuning server parameters can help but

  • it's very specific to certain situations. And it also requires retesting and rerunning.

  • But, you know, you tweak one thing, you have to re, you know, run the benchmark test to

  • see if--see if you got the performance gain out of it. It can be time-consuming to tweak

  • server variables whereas it's pretty easy to, you know, change some seq--some SQL statements

  • and immediately rerun it and see if you got a huge performance gain from doing that. And

  • finally, Hardware and--and Network Bottlenecks. I/O, CPU bound, network latency. Right. There

  • is a good book that I brought along that I recommend for that and when we get to the

  • last slide, I'll show you that, so. All right, Indexing Guidelines. A poor or missing index,

  • definitely the fastest way to kill an application, like I said. Look for covering index opportunities.

  • We're going to go into some slides here to kind of explain what that is. But the covering

  • index is--especially for my--who's using MyISAM versus InnoDB? InnoDB, raise hands. MyISAM,

  • raise hands. Nobody is using MyISAM? For a single thing, okay.

  • >> [INDISTINCT] >> PIPES: Say again?

  • >> [INDISTINCT] >> PIPES: Got you, okay. Well, we'll explain

  • some things about covering index opportunities that are very specific to InnoDB. But covering

  • index, basically what it means is that MySQL can get all the information from the

  • index records which are by definition slimmer than the data records because they don't contain

  • all the fields in the table and use that information to complete whatever the query was without

  • having to go into the data records. One of the key concepts to this, the slimmer that

  • your index records are, the more that you can fit into a single index block and the

  • more that you can fit into an index block, the fewer reads you're going to do, the faster

  • your application's going to be. So, covering index opportunities definitely look around

  • for those. When you're deciding on which fields you actually want to index, make sure you

  • take a look at what the selectivity of the--of the field is. Selectivity is the cardinality

  • or the number of unique values that are contained in that field divided by the total number

  • of values in a table. So, obviously a unique index would have a selectivity of one. Because

  • it's, you know, unique over the total count. That's obviously the best-case scenario but

  • you can get situations where you've got awful selectivity. If you're querying just on that

  • index or just on that field you may get awful performance from that because--or MySQL may just

  • decide not to use the index because it ceases to be useful. I think generally like, [INDISTINCT]

  • could talk a little bit more about this but I think the number is about 30%. If your

  • optimizer sees that, okay, you do--you do a query, select, you know, star from whatever,

  • where column A equals two. And it will take a look and estimate, from the information

  • that it has on the--on the index, whether this particular query returns a third--you know, around 30%

  • or more of the rows in the table. If it does, it's actually much more efficient to just

  • sequentially scan the table than it is to go into the index and randomly seek and grab

  • a bunch of records and then do lookups back to the data records. So, be aware of--of the

  • uniqueness of the values in your index. For lower-selectivity fields where you--where you

  • don't have very much selectivity you can sometimes tack them on, you know, to a multicolumn index

  • so that, especially with GROUP BY queries--and we'll show an example of this--you get

  • some efficiencies there. But just be aware of the selectivity, so. On multicolumn indexes,

  • like those I kind of explained just now, the order of the fields is very important. You

  • have to look at the queries that you're running against the database. You know, if you have

  • the columns mixed up or they're not optimal, the optimizer just won't use the index. And

  • I'll show you a perfect example of something that happens all the time with this on group

  • by queries. And as the database grows you want to ensure that the distribution is good.
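One quick way to re-check the selectivity he defined earlier as the data grows; the table and column names here are invented for illustration:

```sql
-- Selectivity = number of distinct values / total number of rows.
-- A unique column gives 1.0; values near 0 suggest the optimizer
-- may skip the index in favor of a sequential scan.
SELECT COUNT(DISTINCT status) / COUNT(*) AS selectivity
FROM orders;

-- MySQL's own per-index cardinality estimates:
SHOW INDEX FROM orders;
```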

  • You know if you set up an application, writing new application and, you know, you go through

  • a couple of months and you got a couple of hundred thousand records in there. Okay, great.

  • Everything looks great. The index strategy that you designed for it is wonderful but

  • that doesn't necessarily mean that in nine months, you know, you've got millions of records

  • in there that the index strategy that you initially picked out is going to be optimal,

  • you know. Make sure that you understand how your data changes over time, you know. If

  • business practices can change. And so, if you've got values, let's say you got seven

  • values in a--in a field, right. And you've got an index on that, seven distinct values

  • that you're putting into like a status field. Okay. And over time 90% of the field values

  • contain one specific thing, that index ceases to be as useful as it was when you initially

  • designed the application because 90% of the--the rows contain the same value. So just be aware

  • that as your database grows and your application changes that you need to reexamine, you know,

  • what you initially thought was a good index strategy. Okay. Remove redundant indexes for

  • faster write performance. Obviously, every index that you put on a field you get a, you

  • know, slight performance impact on writes because it's got to write an index record

  • as well as the data record. Every time you do an insert and every time you update. If,

  • you know, update the key value, it's got to write to the index. So unfortunately, MySQL

  • allows you to create redundant indexes on the exact same things. So, I've seen cases

  • where you get 40 indexes on the table and, you know, 10 of them are redundant. You know,

  • because over time teams of people work on it and no one wants to change other people's

  • stuff, you know. So, they just add their own indexes and with different names and all that

  • but, they're redundant. They're on the same field. So, you know, as time goes by make

  • sure you check, remove redundant indexes to get some write performance. Okay. Common Index

  • Problem here. Kind of a web 2.0 example. You got a--you got a tags table which has got tag

  • ID and a text for the tag and a products table. Both of these tables have an auto-incrementing

  • primary key and then you've got a relationship table that stores the many to many relationship

  • between products and tags. And if you noticed that the primary key on these Products2Tags

  • table is product ID--tag ID. Makes sense? Pretty basic? Okay. Here's two examples of

  • pretty common queries run on this Products2Tags table. Only the top one's going to use the

  • index on it. And the reason is because of the order of the columns in the index. On

  • the top one, you're grouping by the--what's essentially the product ID. And so, it's going

  • to be able to take that left side of the index and use it in order to count the right side

  • value which is the tag ID. Now in the bottom you're reversing that. You're saying for each

  • distinct tag give me the--give me the count of products. You can't use the index. You

  • can't use that primary key index. So, you won't use it, you get no read performance

  • from it. The remedy would be to create an index that has tag on the left side. Okay.
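The Products2Tags example being walked through might look roughly like this; the table and index names are reconstructions from the description, not the actual slide:

```sql
CREATE TABLE Products2Tags (
  product_id INT UNSIGNED NOT NULL,
  tag_id     INT UNSIGNED NOT NULL,
  PRIMARY KEY (product_id, tag_id)      -- product_id is the left side
) ENGINE=InnoDB;

-- Can use the primary key: grouping on the leftmost column.
SELECT product_id, COUNT(*) AS num_tags
FROM Products2Tags
GROUP BY product_id;

-- Cannot use it: tag_id is the right side of the index.
SELECT tag_id, COUNT(*) AS num_products
FROM Products2Tags
GROUP BY tag_id;

-- Remedy for InnoDB: the primary key is already appended to every
-- secondary index record, so indexing tag_id alone is enough.
CREATE INDEX ix_tag ON Products2Tags (tag_id);

-- Remedy for MyISAM: build the covering index explicitly.
-- CREATE INDEX ix_tag ON Products2Tags (tag_id, product_id);
```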

  • Now, there's two examples here where it says create index, the--the top example, create

  • index, ix_tag, on just the tag ID--for all you guys using InnoDB, that's exactly what

  • you'd want to do. From MyISAM, you'd want to do the bottom one to create a covering

  • index opportunity. Does anyone know why the top one should be used for InnoDB and not

  • the bottom one? What's the difference between the secondary indexes in InnoDB versus MyISAM?

  • >> Because it went back to the primary key. >> PIPES: Exactly. For it--it--because it's

  • a clustered index organization in InnoDB, every time you add an index on a field, that's

  • not the primary--every non-primary key index, the primary key is actually appended to every

  • single index record. So, if you've got a primary key product ID, tag ID and you add an index

  • on tag ID, product ID you're being redundant because you've already got the product ID

  • actually in the index record in InnoDB. So, it's important to understand what the difference

  • between the cluster organization and the MyISAM which is non-clustered where it's just a record

  • number that's attached to each index record instead of the clustering key. So, be aware

  • of those differences. So, I was talking about redundant indexes. That one on the bottom

  • would be redundant for InnoDB. Did the mic just change?

  • >> [INDISTINCT]. >> PIPES: Okay. Okay. Sorry. Schema Guidelines.

  • Inefficient schema, another great way to kill your application. Use the smallest data types

  • possible that you really need. If--if you don't have four billion rows in the table,

  • don't use big int. I mean, it's just the basics, you know. Keep them as trim as possible,

  • you know, if you have a maximum of a hundred values, don't use int. Use tiny int, you

  • can fit, you know, 255 unique values in it. Don't use char(200), you know, when a varchar

  • or you know, smaller char will do. Little things like this make a huge impact in performance.
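The data-type trimming advice, as a before/after sketch; the table and column names are made up for illustration:

```sql
-- Bloated: every column wider than the data requires.
CREATE TABLE accounts_bloated (
  id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status INT NOT NULL,        -- only a handful of distinct values exist
  note   CHAR(200) NOT NULL   -- mostly short, variable-length text
);

-- Trimmed: smallest types that still fit the data.
CREATE TABLE accounts_trim (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status TINYINT UNSIGNED NOT NULL,  -- 0-255 covers ~100 values
  note   VARCHAR(200) NOT NULL
);
```

Slimmer rows mean more records fit per block, which is the same fewer-reads argument made for covering indexes earlier.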

  • And the