
  • Hello, everyone! I'm Sandeep from codeKarle.

  • And in this video, let's look at how to design a cab booking system, something like Uber, OLA, Lyft or something of that sort.

  • Let's start with the functional and non functional requirements that we want this system to support.

  • So, very first thing is - as a customer, when you open your app, you should be able to see what cabs are around you.

  • So, that is the See Cab feature in your vicinity.

  • The next thing is - if you want to book a cab from point A to point B, let's say, you need to know how much time it will take to travel

  • from point A to point B. And you should be able to get an approximate price of how much it would cost you to book a cab via this platform.

  • The next thing is - you should obviously be able to book a cab.

  • We'll not get into the different types of cab, like go, premium and all of that. We'll just assume that there is one type of cab.

  • You can add more features, simply! That's not a big trouble.

  • The next thing is - there should be a very good location tracking of what driver was at what place at what point in time, for various reasons.

  • From a non-functional standpoint, this platform should be global and it should be accessible to people of all countries.

  • At least, we'll design it in that way.

  • What that means is - you need to have servers in a lot of geographies so as to make sure that people in a certain geography

  • are accessing the servers near to them and thereby reducing the latency.

  • Next obvious thing - it needs to work at a fairly low latency.

  • This is not very, very mission critical, but it still needs to be, you know, reasonably fast.

  • Availability should be very high.

  • This system should not go down. It will cause a lot of problems to people who are stuck somewhere, if the system is down.

  • And at the same time, it should have high consistency.

  • High availability along with high consistency might seem to you like it is trying to violate the CAP theorem, which basically says that -

  • assuming all the systems in the world are distributed nowadays and so have to tolerate network partitions, out of availability and consistency, you can guarantee just one.

  • The idea is - certain components of this need to be very highly available and certain other components need to be very highly consistent,

  • not both at the same time.

  • From a scale standpoint, this system should scale to a very good number.

  • If I look at just some of the statistics of Uber, there are roughly 100 million active users that use Uber on a monthly basis. These are unique users.

  • And Uber, in general, does roughly 14 million rides per day.

  • So with that thought in mind, let's try to look at a system that can scale up to these numbers, with these criteria.
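
  • As a rough back-of-the-envelope sketch of what that scale means (this arithmetic is added for illustration and assumes rides are spread evenly over the day, which real traffic is not), the daily figure can be turned into an average per-second rate:

```python
# Back-of-the-envelope conversion of a daily volume into an average rate.
# Assumption: load is spread evenly over 24 hours (peaks will be higher).
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(events_per_day: int) -> float:
    """Return the average number of events per second for a daily total."""
    return events_per_day / SECONDS_PER_DAY


rides_per_second = average_per_second(14_000_000)
print(f"{rides_per_second:.0f} rides booked per second on average")  # 162
```

  • The peak rate will be noticeably higher than this average, so treat the number only as a lower bound when sizing things.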

  • Now, the main problem that companies like Uber are trying to solve is - when you have a customer's location, who wants to book a cab,

  • you try to find a few drivers who are very near to this location and then, using some logic,

  • try to come up with the best driver who is suited to do this trip for this customer.

  • So the problem then becomes, how do you find these 2 - 3 closest drivers to this customer?

  • There are multiple ways to do that. We'll go over one of the ways which uses a concept called segment and mapping segments, basically.

  • Now, this is a term that I just made up. It's not an industry standard term. So, just keep that in mind.

  • Now let's just say, you have a city, something like this.

  • The idea is you basically divide it into rectangular segments.

  • So, you kind of divide this city into multiple pieces.

  • And you say that - this is probably your segment id 1.

  • This is your segment id 2, something of that sort.

  • Now, the idea is - you are dividing a city.

  • It'll normally be divided into a lot more segments than what you can see here.

  • Now, the idea is - given certain coordinates of the segment boundary and given certain coordinates of a cab,

  • you should be able to figure out which segment a cab belongs to.

  • Now, the problem looks trivial, and it isn't difficult either.

  • So, think of it like a standard coordinate system.

  • This is point (0,0). This is (0,1). This is (1,0). This is (1,1).

  • Somewhere here.

  • Now, if a point lies somewhere in between here, let's say (0.4, 0.5),

  • you should be able to mathematically say that this point lies within this boundary.

  • We'll try to use very similar logic when we assign a particular segment to a cab.

  • Also, keep in mind that cabs are continuously moving and their locations are continuously changing.

  • So, we'll try to make sure that we get continuous pings from all the cabs and then keep track of which segment they belong to.

  • A cab could be here right now. Could be in this particular segment. And the cab driver is going via a road, over here. Now, this changes the segment.

  • And this information would be calculated at runtime, as and when we are getting pings from the cab.
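
  • The point-in-rectangle check described above can be sketched roughly as follows. This is a minimal illustration, not the actual Maps Service: the `Segment` class, the `find_segment` helper and the plain x/y coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Segment:
    """A rectangular segment of the city (hypothetical structure)."""
    segment_id: int
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x: float, y: float) -> bool:
        # Upper edges are exclusive so that a point lying on a shared
        # border belongs to exactly one segment.
        return self.min_x <= x < self.max_x and self.min_y <= y < self.max_y


def find_segment(segments: Iterable[Segment], x: float, y: float) -> Optional[Segment]:
    """Return the segment containing the point, or None if it is outside all of them."""
    for segment in segments:
        if segment.contains(x, y):
            return segment
    return None


# Two neighbouring unit squares, like segment id 1 and segment id 2.
segments = [
    Segment(1, 0.0, 0.0, 1.0, 1.0),
    Segment(2, 1.0, 0.0, 2.0, 1.0),
]

# A cab pings from (0.4, 0.5), then drives along a road and pings from (1.2, 0.5).
first_ping = find_segment(segments, 0.4, 0.5)
second_ping = find_segment(segments, 1.2, 0.5)
print(first_ping.segment_id, second_ping.segment_id)  # 1 2
```

  • The linear scan is only to keep the sketch short. With a large number of segments, some kind of spatial index would be the more likely choice.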

  • So, we'll have something called a Maps Service.

  • This Maps Service will do a couple of things.

  • The very first thing is that - it will be responsible for dividing the city into these segments and taking care of the segments.

  • The other thing it will also do is - given the lat-long of a cab and the lat-long of a customer,

  • tell which segment each of them belongs to at this point in time.

  • This service will also be used to calculate ETA from point A to point B and the route from point A to point B and thus even the distance.

  • But we'll abstract that out. We'll not go into much detail on how that ETA and distance piece is implemented.

  • Think of it for now as if we'll be using a Google Maps Service and we'll go over the details of implementation of that

  • when we do another system design video on implementing Google Maps.

  • With that being said, there is also one more thing that this service does.

  • So let's just say, there is a huge amount of traffic or a huge number of cab drivers in this particular segment. And it is getting unmanageable.

  • So the idea of a segment is - it should contain only a small set of drivers.

  • So this service will take care of dividing this segment into multiple parts. It could do it into 4 parts. It could divide it into 6 parts.

  • That logic resides within Maps Service.

  • Let's just say, there is very little traffic in some other locality.

  • So, it could also decide to merge a couple of segments into one segment and say, this whole thing is now one segment.

  • So all the segment management remains within this service.
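
  • A minimal sketch of the splitting side of that logic, assuming a segment is split into 4 equal quadrants once it holds too many drivers. The threshold `MAX_DRIVERS_PER_SEGMENT` and the function names are made up for this example, and merging is left out.

```python
from typing import List, Tuple

# (min_x, min_y, max_x, max_y) of a rectangular segment.
Bounds = Tuple[float, float, float, float]

# Hypothetical threshold; the real value would be tuned by the Maps Service.
MAX_DRIVERS_PER_SEGMENT = 50


def split_into_quadrants(bounds: Bounds) -> List[Bounds]:
    """Split one rectangle into 4 equal rectangles around its midpoint."""
    min_x, min_y, max_x, max_y = bounds
    mid_x = (min_x + max_x) / 2
    mid_y = (min_y + max_y) / 2
    return [
        (min_x, min_y, mid_x, mid_y),  # bottom-left
        (mid_x, min_y, max_x, mid_y),  # bottom-right
        (min_x, mid_y, mid_x, max_y),  # top-left
        (mid_x, mid_y, max_x, max_y),  # top-right
    ]


def rebalance(bounds: Bounds, driver_count: int) -> List[Bounds]:
    """Return the segment unchanged, or its 4 quadrants if it is too crowded."""
    if driver_count > MAX_DRIVERS_PER_SEGMENT:
        return split_into_quadrants(bounds)
    return [bounds]


crowded = rebalance((0.0, 0.0, 2.0, 2.0), driver_count=120)
quiet = rebalance((0.0, 0.0, 2.0, 2.0), driver_count=10)
print(len(crowded), len(quiet))  # 4 1
```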

  • Now, let's look at the overall architecture and how individual users and drivers get connected to the system.

  • All the users get connected through this User App, which talks to a Load Balancer, which talks to something called a User Service.

  • This User Service is your repository of all the user information.

  • Plus, it is also a proxy, that will connect to other services to get any information that a user wants.

  • So, for example, if a user wants to see their profile, update their profile, all the APIs to do that are powered by User Service.

  • If somebody else, any other service for example, wants to fetch user information, then all the APIs for that are also powered by this service.

  • Let's say, if a user wants to see their trips, then this User Service will talk to Trip Service, fetch all the trips for that user and send it back to the user.

  • So that's the responsibility of User Service.

  • From a database standpoint, it sits on top of a MySQL cluster, which stores all the user information within that.

  • And it also uses Redis for caching the same information.

  • So let's say, a GET API to get a user's information is called. It first queries Redis. If Redis has the information, it returns it from there. If it doesn't have

  • the information, then it queries a MySQL slave, fetches that information, stores it in Redis and then returns it to whoever was calling.
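
  • That read path can be sketched like this. To keep it runnable on its own, plain Python dicts stand in for Redis and for the MySQL slave; the `UserRepository` class and the `db_reads` counter are assumptions made for the illustration, not part of the actual User Service.

```python
from typing import Optional


class UserRepository:
    """Read-through lookup: check the cache first, fall back to the database."""

    def __init__(self, cache: dict, database: dict):
        self.cache = cache        # stand-in for Redis
        self.database = database  # stand-in for a MySQL slave
        self.db_reads = 0         # only here to show when the database is hit

    def get_user(self, user_id: str) -> Optional[dict]:
        user = self.cache.get(user_id)
        if user is not None:
            return user  # cache hit: the database is not touched

        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        user = self.database.get(user_id)
        if user is not None:
            self.cache[user_id] = user
        return user


repo = UserRepository(cache={}, database={"u1": {"name": "Asha"}})
repo.get_user("u1")   # miss: goes to the database and fills the cache
repo.get_user("u1")   # hit: served from the cache
print(repo.db_reads)  # 1
```

  • A real setup would also need to expire or invalidate the cached entry whenever the user's profile is updated, which this sketch skips.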

  • The next flow that the user calls is basically when they try to book a cab.

  • The whole screen that the customer, kind of, goes through when they are trying to book a cab is powered by this Cab Request Service.

  • Essentially what it does is -

  • it basically makes a WebSocket connection with the User App, which shows them, on their UI, a few cabs that are around them.

  • Also, it places a request with something called a Cab Finder. We'll go over this in the next section. Whenever Cab Finder responds back with a cab,

  • this Cab Request Service talks to the User App, basically sends them a response, through this WebSocket connection,

  • saying that the cab is booked and these are the details and whatever is required.

  • These are, approximately, the major user flows.

  • Now quickly, let's look at the driver flows.

  • A driver basically talks via the Driver App.

  • Again, there is a Driver Service, very similar to the User Service.

  • But this one is for drivers. Again, all the APIs for getting and updating driver information are powered by it.

  • If a driver wants to see their payment information for example, their payment history, this Driver Service will expose an API

  • which the Driver App will call and Driver Service will internally call a Payment Service to get that information and respond back.

  • It could call Trip Service to get the trips of a driver. And, all the UI data gets powered by this service.

  • This service sits on top of another MySQL which has all the driver information.

  • And it uses Redis for caching in exactly the same way the User Service does.

  • This Driver App also talks to something called a Location Service through a series of servers,

  • again maintaining a WebSocket connection with these servers.

  • And, as and when a driver is moving through the city, every 5 seconds or 10 seconds, their location is being sent out to this Location Service,

  • which then queries the Maps Service that we talked about earlier, to find out which segment the driver belongs to.

  • And when the customer places this cab request, customer's segment is calculated, driver segments are calculated

  • and they are mixed and matched by the Cab Finder and a couple of other components to give the best suited driver for a particular trip.

  • Now, let's look at how a customer and a driver come together.

  • All the active drivers in the system, who are online right now, ready for trips and all, are shown as D1, D2, D3.

  • There'll be a lot more such drivers.

  • Each of those drivers is connected to one of the servers through a WebSocket connection.

  • And those servers are called out as WebSocket Handler 1, WebSocket Handler 2, WebSocket Handler 3.

  • Now, why do we need WebSocket here?

  • So, we always need a connection between the driver and a service.. for a lot of reasons.

  • One of the very first things is that - a driver continuously sends location pings to the backend, telling about the location.

  • Now, if they create a new connection each time, that's kind of a heavy operation. So, we'll keep this connection alive.

  • Also, at times, the servers might want to talk to the driver. So, let's say, if a trip is assigned to a driver, we need to inform the driver.

  • So, we can reuse the same connection to talk to a driver and tell them that this is the trip information that you have to do right now.

  • For all of that, there are these WebSocket Handler servers.

  • In the real world, there'll be hundreds of such servers, geographically split, interfacing with all the drivers throughout the world.

  • Now, let's say, some component in the system decides that a trip is being given to a driver. To reach out to the driver,

  • it first needs to know which of these hundreds of WebSocket Handler servers it needs to talk to.

  • For that, there is something called a WebSocket Manager.

  • Now, this WebSocket Manager is another distributed service which keeps track of which server is connected to which drivers.

  • So, let's say, D3 is a new driver that has come online right now and through the load balancer, it got connected to WebSocket Handler 3.

  • So, this Handler will inform the Manager, saying I have now got connected to D3 also. So, if there's anything for D3, inform me.

  • And this Manager will store this in its database.

  • Now, let's say, this connection got broken and D3 is offline right now. Again this Handler will inform the Manager that D3 is now offline.

  • Do not reach out to me for any communication of D3.

  • This manager sits on top of a Redis cluster.

  • This Redis would not just be storing data in-memory, it will also be storing it in a persistent store on disk.

  • And it will basically store 2 kinds of mapping.

  • One, the most frequently used one, says which driver is connected to which host,

  • something like D1 is connected to H1.

  • Similarly, there'll be an entry for each of the drivers in the system, saying which driver_id is connected to which host_id.

  • It'll also have a reverse mapping, saying which host_id is connected to which driver_ids.

  • So, it could have a mapping saying H1 is connected to D1, D2, D3, so on and so forth. Because that mapping might be used for something.
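
  • The two mappings can be sketched as below. This is only an in-memory illustration with Python dicts and sets standing in for the Redis cluster; the `ConnectionRegistry` class and its method names are assumptions for the example, not the actual WebSocket Manager API.

```python
from collections import defaultdict
from typing import Optional


class ConnectionRegistry:
    """Tracks which WebSocket Handler host each driver is connected to."""

    def __init__(self):
        self.host_by_driver = {}                # driver_id -> host_id
        self.drivers_by_host = defaultdict(set)  # host_id -> {driver_id, ...}

    def connect(self, driver_id: str, host_id: str) -> None:
        # Drop any stale entry first, in case the driver reconnected
        # through a different handler.
        self.disconnect(driver_id)
        self.host_by_driver[driver_id] = host_id
        self.drivers_by_host[host_id].add(driver_id)

    def disconnect(self, driver_id: str) -> None:
        host_id = self.host_by_driver.pop(driver_id, None)
        if host_id is not None:
            self.drivers_by_host[host_id].discard(driver_id)

    def host_for(self, driver_id: str) -> Optional[str]:
        """Return the handler to contact for this driver, or None if offline."""
        return self.host_by_driver.get(driver_id)


registry = ConnectionRegistry()
registry.connect("D1", "H1")
registry.connect("D2", "H1")
registry.connect("D3", "H3")   # D3 comes online on WebSocket Handler 3
print(registry.host_for("D3"))  # H3
registry.disconnect("D3")      # D3 goes offline
print(registry.host_for("D3"))  # None
```

  • Both mappings are updated together in `connect` and `disconnect`, so the forward and the reverse lookup never disagree about where a driver is.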

  • Coming to other things that this WebSocket is used for.

  • So these drivers' devices send location pings to our backend, let's say, every 5 seconds.

  • So, every 5 seconds, we get a hit about the location information.

  • All the location related information is managed by something called Location Service. It does a lot of things.

  • One of the things that happens here is - it stores the information about the driver's location into its Cassandra.

  • Why Cassandra?

  • Because, again, there are probably thousands, or maybe even millions, of drivers across the globe

  • who are sending their location updates every 5 seconds.

  • So, there are a lot of updates happening. So, Cassandra should be able to easily scale up to that number. That's the main reason.

  • There are 2 kinds of information that get stored here.

  • One is - the live location of the driver, which is the last known location.

  • The other thing that is stored is - while a driver is doing a trip with a customer, we need to know exactly what was the route followed,

  • for any auditing purpose or billing purpose.

  • Very common use case is - Once we know the points that the driver followed, we would be able to trace that out and then come up

  • with the real distance that the driver actually travelled and then use that to come up with the pricing.
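
  • One way to turn those recorded points into a distance is to add up the straight-line (great-circle) distance between consecutive pings. The sketch below uses the haversine formula for that; the choice of haversine, the function names and the sample points are assumptions for illustration, and the source does not say how the real system computes it.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two lat-long points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_distance_km(points: List[Point]) -> float:
    """Sum the distances between consecutive location pings of a trip."""
    return sum(
        haversine_km(*start, *end)
        for start, end in zip(points, points[1:])
    )


# Made-up pings: the cab moves north along one line of longitude.
route = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
print(f"{trip_distance_km(route):.1f} km")
```

  • With pings every 5 seconds the straight pieces are short, so the sum stays close to the road distance actually driven.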