Placeholder Image

Subtitles section Play video

  • In our previous lesson, we saw what binary search trees are, now in this lesson we are

  • going to implement binary search tree. We will be writing some code for binary search

  • tree. prerequisite for this lesson is that you must understand the concepts of pointers

  • and dynamic memory allocation in C/C++. If you have already followed this series and

  • seen our lessons on linked list, then implementation of binary search tree or binary tree in general

  • is not going to be very different. We will have nodes and links here as well. Ok, so

  • lets get started. Binary search tree or BST as we know is a binary tree in which for each

  • node, value of all the nodes in left subtree is lesser or equal and value of all the nodes

  • in right subtree is greater. We can draw BST as a recursive structure like this. Value

  • of all the nodes in left subtree must be lesser or equal and value of all the nodes in right

  • subtree must be greater and this must be true for all nodes and not just the root node.

  • So, in this recursive definition here, both left and right subtrees must also be binary

  • search trees. I have drawn a binary search tree of integers here. Now, the question is,

  • how can we create this non-linear logical structure in computer's memory. I had talked

  • about this briefly when we had discussed binary trees. The most popular way is - dynamically

  • created nodes linked to each other using pointers or references just the way we do it for linked

  • lists. Because in a binary search tree, or in a binary tree in general, each node can

  • have at most 2 children, we can define node as an object with 3 fields something like

  • what I'm showing here. We can have a field to store data, another to store address or

  • reference of left child and another to store address or reference to right child. If there

  • is no left or right child for a node, reference can be set as NULL. In C or C++, we can define

  • node like this. There is a field to store data. Here the data type is integer, but it

  • can be anything. There is one field which is pointer to node. Node asterisk means pointer

  • to node. This one is to store the address of left child and we have another one to store

  • the address of right child. This definition of node is very similar to definition of Node

  • for doubly linked list. Remember in doubly linked list also, each node had two links

  • - one to previous node and another to next node. but doubly linked list was a linear

  • arrangement. This definition of node is for a binary tree. We could also name this something

  • like BstNode, but Node is also fine, lets go with Node. Now, in our implementation,

  • just like linked list, all the nodes will be created in dynamic memory or heap section

  • of application's memory using malloc function in 'C' or new operator in C++. We can use

  • malloc in C++ as well. Now, as we know any object created in dynamic memory or heap section

  • of applications memory cannot have a name or identifier. It has to be accessed through

  • a pointer. malloc or new operator return us pointer to the object created in heap. if

  • you want to revise some of these concepts of dynamic memory allocation, you can check

  • the description of this video for link to a lesson. Its really important that you understand

  • this concept of stack and heap in applications memory really well. Now, for a linked list,

  • if you remember, the information that we always keep with us is address of the head node.

  • If we know the head node, we can access all other nodes using links. In case of trees,

  • the information that we always keep with us is address of the root node. If we know the

  • root node, we can access all other nodes in the tree using links. To create a tree, we

  • first need to declare a pointer to BstNode. I'll rather call node BstNode here, BST for

  • binary search tree. So, to create a tree, we first need to declare a pointer to BstNode

  • that will always store the address of root node. I have declared a pointer to Node here

  • named rootPtr - Ptr for pointer. In C, you can't just write BstNode asterisk rootPtr.

  • You will have to write struct space BstNode asterisk, you will have to write struct here

  • as well. I am gonna write C++ here, but anyway right now I'm trying to explain the logic.

  • we will not bother about minute details of implementation. In this logical structure

  • of tree that I'm showing here, each Node as you can see has 3 fields, 3 cells. leftmost

  • cell is to store the address of left child and rightmost cell is to store the address

  • of right child. Lets say root node is at address 200 in memory and I'll assume some random

  • addresses for all other nodes as well. Now, I can fill in these left and right cells in

  • each node with addresses of left and right children. In our definition, data is first

  • field, but in this logical structure, I am showing data in the middle. Ok, so for each

  • Node, I have filled in addresses for both left and right child. Address is 0 or NULL

  • if there is no child. Now, as we were saying, identity of the tree is address of the root

  • node. We need to have a pointer to Node in which we can store the address of the root

  • node. We must have a variable of type pointer to node to store the address of root node.

  • All these rectangles with 3 cells are Nodes. They are created using malloc or new operator

  • and live in heap section of application's memory. We cannot have name or identifier

  • for them. They are always accessed through pointers. This rootPtr, root pointer has to

  • be a local or global variable. We will discuss this in little more detail in some time. Quite

  • often, we like to name this root pointer, just root. We can do so, but we must not confuse.

  • This is pointer to root and not the root itself. To create a BST, as I was saying, we first

  • need to declare this pointer. Initially, we can set this pointer as NULL to say that the

  • tree is empty. A tree with no node can be called empty tree and for empty tree, root

  • pointer should be set as NULL. We can do this declaration and setting the root as NULL in

  • main function in our program. Actually, lets just write this code in a real compiler. I

  • am writing C++ here. As you can see, in the main function I have declared this pointer

  • to Node which will always store the address of root node of my tree and I am initially

  • setting this as NULL to say that the tree is empty. With this much of code, we have

  • created an empty tree. But, whats the point of having an empty tree. We should have some

  • data in it. So, what I want to do now is I want to write a function to be able to insert

  • a node in the tree. I will write a function named Insert that will take address of the

  • root node and data to be inserted as argument and this function will Insert a node with

  • this data at appropriate position in the tree. In the main function, I'll make calls to this

  • insert function passing it address of root and the data to be inserted, lets say first

  • I want to insert number 15 and then I want to insert number 10 and then number 20. We

  • can insert more, but lets first write the logic for Insert function. Before I write

  • the logic for Insert function, I want to write a function to create a new Node in dynamic

  • memory or heap. This function GetNewNode should take an integer, the data to be inserted as

  • argument, create a node in heap using new or malloc and return back the address of this

  • new node. I am creating a new node here using this new operator. the operator will return

  • me the address of the newly created node which I am collecting in this variable of type pointer

  • to BstNode. In C, instead of new operator, we will have to use malloc. We can use malloc

  • in C++as well. C++ is only a super-set of C. malloc will also do here. Now, anything

  • in dynamic memory or heap is always accessed through pointer. Now, using this pointer - newNode,

  • we can access the fields of newly created Node. I will have to dereference this pointer

  • using asterisk operator. i am writing asterisk newNode and now I can access the fields. We

  • have 3 fields in node - data and 2 pointers to node , left and right;. i have set the

  • data here. Instead of writing asterisk newNode dot data, we have this alternate syntax that

  • we could use. We could simply write newNode->data and this will mean the same. We have used

  • this syntax quite a lot in our lessons on linked list. Now, for the new node, we can

  • set the left and right child as NULL and finally we can return the address of the new Node.

  • Ok, coming back to the insert function. We can have couple of cases in insertion. First

  • of all, tree may be empty. For this first insertion, when we are inserting this number

  • 15, tree will be empty. if tree is empty, we can simply create a new node and set it

  • as root. With this statement, root equal GetNewNode, I am setting root as address of the new node.

  • But there is something not alright here. This root is local variable of Insert function

  • and its scope is only within this function. We want this root, root in main to be modified.

  • This guy is a local variable of main function. There are 2 ways of doing this. We can either

  • return the address of the new root. So, return type of insert function will be pointer to

  • BstNode and not void. And here, in the main function, we will have to write statement

  • like root equal Insert and the arguments. So, we will have to collect the return and

  • update our root in main function. Another way is that, we can pass the address of this

  • root of main to the Insert function. This root is already a pointer to Node. So, its

  • address can be collected in a pointer to pointer. So, Insert function , in insert function first

  • argument will be a pointer to pointer and here, we can pass the address. We will say

  • ampersand root to pass the address. We can name this argument root or we can name this

  • argument rootPtr. We can name this whatever. Now, what we need to do is, we need to dereference

  • this using asterisk operator to access the value in root of main and we can also set

  • the value in root of main. So, here with this statement, we are setting the value and the

  • return type now can be void. This pointer to pointer thing gets a little tricky. I'll

  • go with the former approach. Actually, there is another way. instead of declaring root

  • as a local variable in main function, we can declare root as global variable. Global variable,

  • as we know has to be declared outside all the functions. if root would be global variable,

  • it would be accessible to all the functions and we will not have to pass the address stored

  • in it as argument. Anyway, coming back to the logic for insertion. As we were saying,

  • if the tree is empty, we can simply create a new node and we can simply set it as root.

  • At this stage, we wanted to insert 15. If we will make call to the insert function,

  • address of root is 0 or NULL. NULL is only a macro for 0 and the second argument is the

  • number to be inserted. In this call to Insert function, we will make call to get new Node.

  • Lets say, we got this new Node at address 200. GetNewNode function will return us address

  • 200 which we can set as root here, but this root is a local variable. We will return this

  • address 200 back to the main function and in the main function, we are actually doing

  • this root equal Insert. So, in the main function we are building this link. Ok, our next call

  • in the main function was to insert number 10. At this stage, root is 200. the address

  • in root is 200 and the value to be inserted is 10. Now, the tree is not empty. So, what

  • do we do. If the tree is not empty, we can basically have 2 cases. If the data to be

  • inserted is lesser or equal, we need to insert it in the left subtree of root and if the

  • data to be inserted is greater, we need to insert it in right subtree of the root. So,

  • we can reduce this problem in a self similar manner, in a recursive manner. Recursion is

  • one thing that we are going to use almost all the time while working with trees. In

  • this function, I'll say that if the data to be inserted is less than or equal to the data

  • in root, then make a recursive call to insert data in left subtree. The root of the left

  • subtree will be the left child. So, in this recursive call, we are passing address of

  • left child and data as argument and after the data is inserted in left subtree, the

  • root of the left subtree can change. Insert function will return the address of the new

  • root of the left subtree and we need to set it as left child of the current node. In this

  • example tree here, right now, both left and right subtree are empty. We are trying to

  • insert number 10, so we have made call to this function Insert. From main function,

  • we have called Insert passing it address 200 and value or data 10. Now, 10 is lesser than

  • 15, so control will come to this line and a call will be made to Insert data in left

  • subtree. Now, left subtree is empty. So, address of root for left subtree is 0. Data passed,

  • data to be inserted passed as argument is 10. Now, this first insert call will wait

  • for this insert below to finish and return. For this last, insert call, root is NULL.

  • Lets say we got this node at address 150. Now, this Insert call will return back 150

  • and execution of first insert call will resume at this line and now this particular address

  • will be set as 150. So, we will build this link and now this insert call can finish.

  • It can return back the current root. Actually this return root should be there for all cases,

  • so I am taking it out and I have it after all these conditions. Of course, we will have

  • one more else here. If the data is greater, we need to go Insert in right subtree. the