  • In these first few videos, I want to lay the foundation of some core concepts you will

  • need throughout these video tutorials. Let's get started with the basics. So what is a

  • data structure? One definition that I really like is: a data structure is a way of organizing

  • data so that it can be used efficiently. And that's all a data structure really is: it is

  • a way of organizing data in some fashion so that later on it can be accessed, queried,

  • or perhaps even updated quickly and easily. So why data structures? Why are they important?

  • Well, they are essential ingredients in creating fast and powerful algorithms. Another good

  • reason might be that they help us manage and organize our data in a very natural way. And

  • this last point is more of my own making: it's that they make code cleaner and easier

  • to understand. As a side note, one of the major distinctions that I have noticed from

  • bad or mediocre to excellent programmers is that the ones who really excel are the ones who

  • fundamentally understand how and when to use the appropriate data structure for the task

  • they're trying to finish. Data structures can make the difference between having an

  • okay product and an outstanding one. It's no wonder that every computer science undergraduate

  • student is required to take a course in data structures. It is strange that before we even

  • begin talking about data structures, we need to talk about the abstraction of data

  • structures. What I'm talking about is the concept of an abstract data type. What is

  • an abstract data type, and how does it differ from a data structure? Well, the answer is

  • that an abstract data type is an abstraction of a data structure, which provides only the

  • interface to which that data structure must adhere. The interface does not give any

  • specific details on how

  • a specific data structure should be implemented, or in what programming language. One particular

  • example I like to give is to suppose that your abstract data type is for a mode of transportation

  • to get from point A to point B. Well, as we both know, there are many modes of transportation

  • to get from one place to another. So which one do you choose? Some specific modes of

  • transportation might be walking or biking, perhaps even taking a train, and so on. These

  • specific modes of transportation would be analogous to the data structures themselves.

  • We wanted to get from one place to another through some mode of transportation; that was our

  • abstract data type. How exactly did we do that? Well, that is the data structure. Here are

  • some examples of abstract data types on the left, and some possible underlying implementations

  • on the right hand side. As you can see, a list can be implemented in two ways. You can

  • have a dynamic array or a linked list. They both provide ways of adding, removing and

  • indexing elements in the list. Next, we have a queue and the map abstract data types, which

  • themselves can be implemented in a variety of ways. Notice that under the implementation

  • for a queue, I put a stack based queue because yes, you can have a queue, which is implemented

  • using only stacks. This may not be the most efficient way of implementing a queue. But

  • it does work and it is possible. The point here is that the abstract data type only defines

  • how a data structure should behave, and what methods it should have, but not the details

  • surrounding how those methods are implemented.
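
As a rough illustration of that idea, here is a minimal sketch in Java (my own, not code from the video): the interface plays the role of the abstract data type, and the class implementing it plays the role of the data structure.

```java
// The interface says what a list can do, not how: it is the abstract data type.
interface MyList<T> {
    void add(T elem);   // append an element
    T get(int index);   // read the element stored at an index
    int size();         // number of stored elements
}

// One possible underlying data structure: a growable array.
class MyArrayList<T> implements MyList<T> {
    private Object[] data = new Object[8];
    private int len = 0;

    public void add(T elem) {
        if (len == data.length) data = java.util.Arrays.copyOf(data, len * 2);
        data[len++] = elem;
    }

    @SuppressWarnings("unchecked")
    public T get(int index) { return (T) data[index]; }

    public int size() { return len; }
}
// A linked-list-backed class could implement the same MyList interface, satisfying the
// same abstract data type with different performance trade-offs.
```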

  • Alright, now that we are done with abstract data types, we need to have a quick look

  • at the wild world of computational complexity,

  • to understand the performance that our data structures are providing. So as programmers,

  • we often find ourselves asking the same two questions over and over and over again. That

  • is, how much time does this algorithm need to finish, and how much space does this

  • algorithm need for my computation? So, if your program takes the lifetime of the universe

  • to finish, then it's no good. Similarly, if your program runs in constant time, but requires

  • a space equal to the sum of all the bytes of all the files on the internet,

  • your algorithm is also useless. So, to standardize a way of talking about how much time and how

  • much space is required for an algorithm to run, theoretical computer scientists have

  • invented big O notation, amongst other things like big theta, big omega, and so on. But

  • we're interested in Big O because it tells us about the worst case. Big O notation only

  • cares about the worst case. So if your algorithm sorts numbers, imagine the input is the worst

  • possible arrangement of numbers for your particular sorting algorithm. Or, as a concrete example,

  • suppose you have an unordered list of unique numbers, and you're searching for the number

  • seven, or the position where seven occurs in your list. The worst case isn't when

  • seven is at the beginning or in the middle of the list; it's when seven is the

  • very last element of the list. For that particular example, the time complexity would be linear

  • with respect to the number of elements, the size of your list, because you have to traverse

  • every single element until you find seven.
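
As a small sketch of that worst case (my own Java, with a hypothetical method name), scanning an unordered array for the value seven:

```java
// If 7 sits at the very last position (the worst case), all n elements are visited: O(n).
class LinearSearchExample {
    static int indexOfSeven(int[] list) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] == 7) return i;  // found seven at position i
        }
        return -1;                       // seven is not in the list
    }
}
```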

  • The same concept applies to space: you can just consider the worst possible

  • amount of space your algorithm is going to need for

  • that particular input. There's also the fact that big O really only cares about what happens

  • when your input becomes arbitrarily large. We're not interested in what happens when

  • the input is small. For this reason, you'll see that we ignore things like constants and

  • multiplicative factors. So in our big O notation, you'll often see these coming up

  • again and again. So to clarify one thing,

  • when I say n, n is almost always going to be the input size coming into your algorithm;

  • there's always going to be some limitation of size n. So constant time is referred to as

  • a one wrapped around a big O. If your algorithm takes a logarithmic amount of time to finish,

  • we say that's big O of log n. If it's linear, then we just say n. If it takes

  • quadratic time or cubic time, then we say that's n raised to that power. If it's exponential,

  • usually this is going to be something like two to the n, three to the n, or in general some

  • number greater than one raised to the n. And then we also have n factorial. But we also have

  • things in between these, like square root of n, log log of n, n to the fifth, and so on. Actually,

  • almost any mathematical expression containing n can be wrapped in a big O and is valid big O notation.
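
For reference, the complexities just listed can be summarized from slowest to fastest growing (this recap is mine, not a slide from the video):

```latex
O(1) \;<\; O(\log n) \;<\; O(\sqrt{n}) \;<\; O(n) \;<\; O(n \log n)
\;<\; O(n^2) \;<\; O(n^3) \;<\; O(b^n),\ b > 1 \;<\; O(n!)
```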

  • Now, we want to talk about some properties of Big O. So to reiterate what we just saw

  • in the last two slides, Big O only really cares about what happens when the input becomes

  • really big. So we're not interested in what happens when n is small, only

  • what happens when n goes to infinity. So this is how and why we get the first two properties.

  • The first is that we can simply remove constant values added to our big O notation, because

  • if you're adding a constant to infinity, well, that's just infinity, or if you're multiplying

  • a constant by infinity, yeah, that's still infinity. So we can ignore constants. But

  • of course, this is all theoretical. In the real world, if your constant is of size 2

  • billion, that's probably going to have a substantial impact on the running time of

  • your algorithm.

  • However, let us look at a function f, which I've defined over some input size. If

  • f of n is 7 log(n) cubed plus 15 n squared plus 2 n cubed plus 8, well, big O of f

  • of n is just n cubed, because n cubed is the biggest, most dominant term in this function.
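
Written out, the two properties and the example above look like this (my own restatement):

```latex
% Additive and multiplicative constants can be dropped (c is any positive constant):
O(n + c) = O(n), \qquad O(c \cdot n) = O(n)

% Only the dominant term of the example function survives inside the big O:
f(n) = 7\log(n)^3 + 15n^2 + 2n^3 + 8 \quad\Longrightarrow\quad O(f(n)) = O(n^3)
```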

  • All right, now let's

  • look at some concrete examples of how big O is used. So both of the following run in

  • constant time with respect to the input size, because they don't depend on n

  • at all.

  • So on the left, when we're just adding or doing some mathematical formula, yes, that's

  • constant time. And on the right, okay, we're doing a loop, but the loop itself doesn't

  • depend on n. So it also runs in a constant amount of time. So as our input size gets

  • arbitrarily large, well, that loop is still going to run in the same amount of time regardless.
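
A minimal Java sketch of those two constant-time snippets (the method names and the fixed loop bound of 100 are my own guesses at the slide's code):

```java
// Both methods are O(1): neither depends on the input size n.
class ConstantTimeExamples {
    static int onTheLeft(int a, int b) {
        return a + b * 3;                // a single arithmetic formula: constant time
    }

    static int onTheRight(int n) {
        int sum = 0;
        for (int i = 0; i < 100; i++) {  // loop bound is fixed at 100, not n
            sum += i;
        }
        return sum;                      // same amount of work no matter how large n is
    }
}
```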

  • Now let's look at a linear example. So both the following actually run in

  • linear time with respect to the input size, because we do a constant amount of work

  • n times.

  • So on the left, we're incrementing the counter i by one each time. So f of n is n, and

  • clearly, when we wrap this in a big O we get a big O of n. On the right, it's a little bit more

  • complicated: we're not incrementing by one, we're incrementing by three, so we're going

  • to finish that loop three times faster. So f of n is n over three, which is still big O of n.
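
Here is a rough Java reconstruction of those two linear loops (method names are mine):

```java
// Both methods are O(n) with respect to the input size n.
class LinearTimeExamples {
    static void onTheLeft(int n) {
        for (int i = 0; i < n; i++) {
            // constant work, done n times: f(n) = n
        }
    }

    static void onTheRight(int n) {
        for (int i = 0; i < n; i += 3) {
            // constant work, done about n/3 times: f(n) = n/3, still O(n)
        }
    }
}
```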

  • Our next example is two algorithms that run in quadratic time. The first one seems obvious: we do n work

  • n times, so n times n is big O of n squared. But observe the second one: I changed the

  • zero to an i. So pause the video and try to figure out why that's big O of n

  • squared. Okay, let's go over the solution. So first, just focus on the second loop. The

  • first loop isn't as important. Since j goes from i to n, the amount of looping done

  • is going to be directly related to what i is, and remember, i is changing in the outer

  • loop. So if we fix i to be zero, we do n work. If we fix it at one, we do n minus one work.

  • If we fix it at two, we do n minus two work, et cetera. So the question then becomes:

  • what is n plus n minus one plus n minus two plus n minus three and so on? Well, this is

  • a well known identity, and it turns out to be n times n plus one, divided by two. So if we

  • wrap this in a big O and simplify our equation, we can see that this is big O of n squared.
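
A sketch of the two quadratic examples in Java (my own reconstruction of the slide's loops):

```java
// Both methods are O(n^2).
class QuadraticExamples {
    static void first(int n) {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { /* n work, done n times */ }
    }

    static void second(int n) {
        for (int i = 0; i < n; i++)
            for (int j = i; j < n; j++) { /* n + (n-1) + ... + 1 = n(n+1)/2 iterations */ }
    }
}
```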

  • Now let's look at perhaps a more complicated example. Earlier, you may have been wondering

  • how we ever get these logarithmic or linearithmic time complexities. So here's

  • a classic algorithm for doing a binary search, which actually yields a logarithmic time complexity.

  • So what this algorithm does is it starts by making two pointers, one at the very start

  • and one at the very end of the array. Then it selects a midpoint between the two and checks

  • whether the value we're looking for was found at the midpoint; it has either found it

  • or not. If it has found it, it stops. Otherwise, it discards one half of the array and adjusts

  • either the high or the low pointer. Remark that even in the worst case, we're still discarding

  • half of the array each iteration. So very, very quickly, we're going to run out of a

  • range to check. So if you do the math, in the worst case you will do exactly log base two of

  • n iterations, meaning that the binary search runs in logarithmic time. A very cool, very powerful algorithm.
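
Here is a sketch of that binary search in Java (my own code, assuming a sorted int array):

```java
// Returns the index of 'target' in the ascending-sorted array 'ar', or -1 if absent. O(log n).
class BinarySearchExample {
    static int binarySearch(int[] ar, int target) {
        int lo = 0, hi = ar.length - 1;               // pointers at the very start and very end
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;             // midpoint between the two pointers
            if (ar[mid] == target) return mid;        // found it: stop
            else if (ar[mid] < target) lo = mid + 1;  // discard the lower half
            else hi = mid - 1;                        // discard the upper half
        }
        return -1;
    }
}
```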

  • Here's a slightly different example worth going over. So first,

  • notice that there is an outer loop with a counter i that does n work. Then notice

  • that there are two inner loops, one that does three n work and another one that does two n

  • work. The rule we use to determine the complexity of this algorithm is to multiply

  • loops on different levels and add those that are on the same level, generally speaking. So,

  • using the rule above, we can see that it takes n work to do the outer loop, multiplied by three

  • n plus two n for both inner loops, which gives us five n squared, which is big O of n squared.
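
A Java sketch of that example (my reconstruction of the loops on the slide):

```java
// An outer loop of n iterations containing two sequential inner loops of 3n and 2n iterations.
class MultiplyAndAddExample {
    static void run(int n) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < 3 * n; j++) { /* 3n work */ }
            for (int k = 0; k < 2 * n; k++) { /* 2n work */ }
        }
        // total work: n * (3n + 2n) = 5n^2, which is O(n^2)
    }
}
```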

  • All right, so this next one looks very similar, but it's actually quite different. So on the

  • outer loop with i, we have i going from zero to three n. So there's three n work done

  • on the outside. But we have to multiply that by what's going on on the inside.

  • So on the inside,

  • j goes from 10 to 50, so that does exactly 40 loops every time. So that's a constant

  • amount of work. Then, in the second loop, j is less than n cubed, but j is

  • incremented as j equals j plus two, so it's accelerated a little, and we're going to finish that loop

  • a little faster. So on the inside we get 40 plus n cubed divided by two, but

  • we have to multiply that by three n. So we wrap that in a big O. So big O of f of n

  • is going to give us big O of n to the fourth, because n to the fourth is the dominant term

  • in our function, f of n.
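
And a sketch of this trickier example in Java (again my own reconstruction of the slide):

```java
// The outer loop runs 3n times; each iteration does 40 inner loops plus roughly n^3 / 2 inner loops.
class QuarticExample {
    static void run(int n) {
        for (int i = 0; i < 3 * n; i++) {
            for (int j = 10; j < 50; j++) { /* exactly 40 iterations: constant work */ }
            for (long j = 0; j < (long) n * n * n; j += 2) { /* about n^3 / 2 iterations */ }
        }
        // total: 3n * (40 + n^3 / 2); the dominant term is n^4, so this is O(n^4)
    }
}
```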

  • For some other classic examples of big O: if we have to find all the subsets of a set, that

  • takes an exponential amount of time, because there are two to the n subsets. Finding all

  • permutations of a string takes big O of n factorial. Another classic one is merge sort: we

  • have n times log n for merge sort. And if we have to iterate over all the cells of an array

  • of size n by m, we have big O of n times m time. All

  • right, let's talk about arrays, probably the most used data structure. This is part one

  • of two in the array videos. The reason the array is used so much is because it forms

  • a fundamental building block for all other data structures, so we end up seeing them everywhere.

  • With arrays and pointers alone, I'm pretty sure we can construct just about any data

  • structure. So an outline for today's video. First, we're going to begin by having a discussion

  • about arrays and answer some fundamental questions such as: what, where, and how are arrays used?

  • Next, I will explain the basic structure of an array and the common operations we are

  • able to perform on them. Lastly, we will go over some complexity analysis and

  • look at some source code on how to construct a dynamic array using only static arrays.

  • Let's start with the discussion and examples. So what is a static array? A static array is a fixed length container

  • containing n elements, which are indexable, usually over the range of zero, inclusive, to

  • n minus one, also inclusive. So a follow up question is: what is meant by being indexable?