I'm running something called Private AI. It's kind of like ChatGPT, except it's not. Everything about it is running right here on my computer. I'm not even connected to the internet. This is private, contained, and my data isn't being shared with some random company. So in this video, I want to do two things. First, I want to show you how to set this up. It is ridiculously easy and fast to run your own AI on your laptop, computer, or whatever it is. This is free, it's amazing, it'll take you about five minutes. And if you stick around to the end, I want to show you something even crazier, a bit more advanced. I'll show you how you can connect your knowledge base, your notes, your documents, your journal entries, to your own Private GPT, and then ask it questions about your stuff. And then second, I want to talk about how Private AI is helping us in the area we need help most, our jobs. You may not know this, but not everyone can use ChatGPT or something like it at their job. Their companies won't let them, mainly because of privacy and security reasons. But if they could run their own Private AI, that's a different story. That's a whole different ballgame. And VMware is a big reason this is possible. They are the sponsor of this video, and they're enabling some amazing things that companies can do on-prem in their own data center to run their own AI. And it's not just the cloud, man, it's like in your data center. The stuff they're doing is crazy. We're gonna talk about it here in a bit. But tell you what, go ahead and do this. There's a link in the description. Just go ahead and open it and take a little glimpse at what they're doing. We're gonna dive deeper, so just go ahead and have it open right in your second monitor or something, or on the side, or minimize. I don't know what you're doing, I don't know how many monitors you have. You have three, actually, Bob. I can see you.
Oh, and before we get started, I have to show you this. You can run your own private AI that's kind of uncensored. Like, watch this. I love you, dude, I love you. So yeah, please don't do this to destroy me. Also, make sure you're paying attention. At the end of this video, I'm doing a quiz. And if you're one of the first five people to get 100% on this quiz, you're getting some free coffee. NetworkChuck coffee. So take some notes, study up, let's do this. Now, real quick, before we install a private local AI model on your computer, what does it even mean? What's an AI model? At its core, an AI model is simply an artificial intelligence pre-trained on data we've provided. One you may have heard of is OpenAI's ChatGPT, but it's not the only one out there. Let's take a field trip. We're gonna go to a website called huggingface.co. Just an incredible brand name, I love it so much. This is an entire community dedicated to providing and sharing AI models. And there are a ton. You're about to have your mind blown, ready? I'm gonna click on models up here. Do you see that number? 505,000 AI models. Many of these are open and free for you to use, and they're pre-trained, which is kind of a crazy thing. Let me show you this. We're gonna search for a model named Llama 2, one of the most popular models out there. We'll do Llama 2 7B. I, again, I love the branding. Llama 2 is an AI model known as an LLM, or large language model. OpenAI's ChatGPT is also an LLM. Now this LLM, this pre-trained AI model, was made by Meta, AKA Facebook. And what they did to pre-train this model is kind of insane. And the fact that we're about to download this and use it, even crazier. Check this out. If you scroll down just a little bit, here we go, training data. It was trained on over 2 trillion tokens of data from publicly available sources, instruction data sets, over a million human-annotated examples. Data freshness, we're talking July 2023. I love that term, data freshness.
And getting the data was just step one. Step two is insane, because this is where the training happens. Meta, to train this model, put together what's called a super cluster. It already sounds cool, right? This sucker is over 6,000 GPUs. It took 1.7 million GPU hours to train this model. And it's estimated it cost around $20 million to train it. And now Meta's just like, here you go kid, download this incredibly powerful thing. I don't want to call it a being yet. I'm not ready for that. But this intelligent source of information that you can just download on your laptop and ask it questions. No internet required. And this is just one of the many models we could download. They have special models like text to speech, image to image. They even have uncensored ones. They have an uncensored version of Llama 2, too. This guy, George Sung, took this model and fine-tuned it with a pretty hefty GPU, took him 19 hours, and made it to where you could pretty much ask this thing anything you wanted. Whatever question comes to mind, it's not going to hold back. So how do we get this fine-tuned model onto your computer? Well, actually, I should warn you, this involves quite a few llamas, more than you would expect. Our journey starts at a tool called Ollama. Let's go ahead and take a field trip out there real quick. We'll go to ollama.ai. All we have to do is install this little guy, Mr. Ollama. And then we can run a ton of different LLMs. Llama 2, Code Llama, told you, lots of llamas. And there's others that are pretty fun, like Llama 2 Uncensored, more llamas. Mistral, I'll show you in a second. But first, what do we install Ollama on? We can see right down here that we have it available on Mac OS and Linux, but oh, bummer, Windows coming soon. It's okay, because we've got WSL, the Windows Subsystem for Linux, which is now really easy to set up. So we'll go ahead and click on download right here.
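Those training numbers are big enough to be hard to picture, so here's a quick back-of-the-napkin calculation using the figures quoted above (treat it as a rough illustration; the GPU count and hours are Meta's published estimates, and real training runs don't use every GPU in parallel the entire time):

```shell
# Rough sanity check on Llama 2's training run, using the numbers above:
# ~1.7 million GPU-hours spread across a super cluster of ~6,000 GPUs.
gpu_hours=1700000
gpus=6000
# If every GPU ran in parallel the whole time, wall-clock days of training:
echo $(( gpu_hours / gpus / 24 ))   # roughly 11 days
```

In other words, even with one of the biggest GPU clusters on the planet, this took on the order of weeks, and you're about to download the result in minutes.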
For Mac OS, you'll just simply download this and install it like one of your regular applications. For Linux, we'll click on this. We get a fun curl command that we'll copy and paste. Now, because we're going to install WSL on Windows, this will be the same step. So, Mac OS folks, go ahead and just run that installer. Linux and Windows folks, let's keep going. Now, if you're on Windows, all you have to do now to get WSL installed is launch your Windows terminal. Just go to your search bar and search for terminal. And with one command, it'll just happen. It used to be so much harder. The command is wsl --install. It'll go through a few steps. It'll install Ubuntu as the default. I'll go ahead and let that do that. And boom, just like that, I've got Ubuntu 22.04.3 LTS installed, and I'm actually inside of it right now. So now at this point, Linux and Windows folks, we've converged, we're on the same path. Let's install Ollama. I'm going to copy that curl command that Ollama gave us, jump back into my terminal, paste that in there, and press enter. Fingers crossed, everything should be going great, like the way it is right now. It'll ask for my sudo password. And that was it. Ollama is now installed. Now, this will directly apply to Linux people and Windows people. See right here where it says NVIDIA GPU installed? If you have that, you're going to have a better time than other people who don't have that. I'll show you here in a second. If you don't have it, that's fine. We'll keep going. Now let's run an LLM. We'll start with Llama 2. So we'll simply type in ollama run, and then we'll pick one, llama2. And that's it. Ready, set, go. It's going to pull the manifest. It'll then start pulling down and downloading Llama 2, and I want you to just realize this, that powerful Llama 2 pre-training we talked about, all the money and hours spent, that's how big it is. This is the 7 billion parameter model, or the 7B. It's pretty powerful.
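For reference, the whole walkthrough above boils down to three commands. This is a sketch of the flow, not a substitute for the site: grab the exact install one-liner from ollama.ai's download page, since the script URL there is the authoritative one.

```shell
# On Windows only: install WSL (Ubuntu is the default distro), then restart the terminal
wsl --install

# On Linux, or inside WSL: install Ollama using the one-liner from ollama.ai
# (copy the current command from the download page; it asks for your sudo password)
curl https://ollama.ai/install.sh | sh

# Pull and run the Llama 2 7B model; the first run downloads the weights
ollama run llama2
```

Once `ollama run llama2` drops you at a prompt, you're chatting with the model entirely on your own machine.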
And we're about to literally have this in the palm of our hands. In like three, two, one. Oh, I thought I had it. Anyways, it's almost done. And boom, it's done. We've got a nice success message right here, and it's ready for us. We can ask it anything. Let's try, what is a pug? Now, the reason this is going so fast, just as a side note, is that I'm running a GPU, and AI models love GPUs. So let me show you real quick. I did install Ollama on a Linux virtual machine. And I'll just demo the performance for you real quick. By the way, if you're running like a Mac with an M1, M2, or M3 processor, it actually works great. I forgot to install it. I gotta install it real quick. And I'll ask it that same question, what is a pug? It's going to take a minute. It'll still work, but it's going to be slower on CPUs. And there it goes. It didn't take too long, but notice it is a bit slower. Now, if you're running WSL, and you know you have an Nvidia GPU and it didn't show up, I'll show you in a minute how you can get those drivers installed. But anyways, just sit back for a minute, sip your coffee, and think about how powerful this is. The tinfoil hat version of me stinkin' loves this. Because let's say the zombie apocalypse happens, right? The grid goes down, things are crazy. But as long as I have my laptop and a solar panel, I still have AI, and it can help me survive the zombie apocalypse. Let's actually see how that would work. It gives me next steps. I can have it help me with the water filtration system. This is just cool, right? It's amazing.
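If you're not sure whether that "NVIDIA GPU installed" message applied to you, one quick check on Linux or WSL (assuming you have an NVIDIA card and its drivers) is:

```shell
# Prints your GPU model, driver version, and current usage if the system can see the card.
# If this command isn't found or errors out inside WSL, the NVIDIA drivers aren't set up yet.
nvidia-smi
```

On a Mac with an M1, M2, or M3, there's nothing to check; Ollama uses Apple's GPU automatically.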