  • How's it going?

  • I'm Megha.

  • Today I'm going to be talking about large language models.

  • Don't know what those are?

  • Me either.

  • Just kidding.

  • I actually know what I'm talking about.

  • I'm a customer engineer here at Google Cloud, and today I'm going to teach you everything you need to know about LLMs.

  • That's short for large language models.

  • In this course, you're going to learn to define large language models, describe LLM use cases, explain prompt tuning, and describe Google's generative AI development tools.

  • Let's get into it.

  • Large language models, or LLMs, are a subset of deep learning.

  • To find out more about deep learning, check out our Introduction to Generative AI course video.

  • LLMs and generative AI intersect and they are both a part of deep learning.

  • Another area of AI you may be hearing a lot about is generative AI.

  • This is a type of artificial intelligence that can produce new content including text, images, audio, and synthetic data.

  • All right, back to LLMs.

  • So what are large language models?

  • Large language models refer to large, general purpose language models that can be pre-trained and then fine-tuned for specific purposes.

  • What do pre-trained and fine-tuned mean?

  • Great questions.

  • Let's dive in.

  • Imagine training a dog.

  • Often you train your dog basic commands such as sit, come, down, and stay.

  • These commands are normally sufficient for everyday life and help your dog become a good canine citizen.

  • Good boy.

  • But if you need a special service dog such as a police dog, a guide dog, or a hunting dog, you add special training, right?

  • A similar idea applies to large language models.

  • These models are trained for general purposes to solve common language problems such as text classification, question answering, document summarization, and text generation across industries.

  • The models can then be tailored to solve specific problems in different fields such as retail, finance, and entertainment using relatively small field datasets.

  • So now that you've got that down, let's further break down the concept into three major features of large language models.

  • We'll start with the word large.

  • Large indicates two meanings.

  • First is the enormous size of the training dataset, sometimes at the petabyte scale.

  • Second, it refers to the parameter count.

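  • In machine learning, parameters are often confused with hyperparameters, but they differ: parameters are learned during training, while hyperparameters are settings chosen before training starts.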

  • Parameters are basically the memories and the knowledge the machine learned from the model training.

  • Parameters define the skill of a model in solving a problem such as predicting text.
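
To make "parameter count" concrete, here is a minimal sketch that counts the learnable weights and biases in a small fully connected network. The layer sizes are made up for illustration and the function name is hypothetical; real LLMs use transformer layers, but the idea of counting learned values is the same.

```python
def count_dense_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input-to-output connection
        total += n_out         # one bias per output unit
    return total


# Hypothetical toy network: 4 inputs, one hidden layer of 8 units, 2 outputs.
toy_layers = [4, 8, 2]
print(count_dense_parameters(toy_layers))
```

A model described as having billions of parameters has billions of such learned values.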

  • So that's why we use the word large.

  • What about general purpose?

  • General purpose means the models are sufficient to solve common problems.

  • Two reasons led to this idea.

  • First is the commonality of human language regardless of the specific tasks.

  • And second is the resource restriction.

  • Only certain organizations have the capability to train such large language models with huge datasets and a tremendous number of parameters.

  • How about letting them create fundamental language models for others to use?

  • So this leaves us with our last terms, pre-trained and fine-tuned, which mean to pre-train a large model for a general purpose with a large dataset and then fine-tune it for specific aims with a much smaller dataset.

  • So now that we've nailed down the definition of what large language models (LLMs) are, we can move on to describing LLM use cases.

  • The benefits of using large language models are straightforward.

  • First, a single model can be used for different tasks.

  • This is a dream come true.

  • These large language models that are trained with petabytes of data and generate billions of parameters are smart enough to solve different tasks, including language translation, sentence completion, text classification, question answering, and more.

  • Second, large language models require minimal field training data when you tailor them to solve a specific problem.

  • Large language models obtain decent performance even with little domain training data.

  • In other words, they can be used for few-shot or even zero-shot scenarios.

  • In machine learning, few-shot refers to training a model with minimal data, and zero-shot implies that a model can recognize things that have not explicitly been taught in the training before.

  • Third, the performance of large language models continues to improve as you add more data and parameters.

  • Let's take PaLM as an example.

  • In April 2022, Google released PaLM, short for Pathways Language Model, a 540 billion parameter model that achieves state-of-the-art performance across multiple language tasks.

  • PaLM is a dense decoder-only transformer model.

  • It leverages the new Pathways system, which enabled Google to efficiently train a single model across multiple TPU v4 pods.

  • Pathways is a new AI architecture that will handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world.

  • The system enables PaLM to orchestrate distributed computation for accelerators, but I'm getting ahead of myself.

  • I previously mentioned that PaLM is a transformer model.

  • Let me explain what that means.

  • A transformer model consists of an encoder and a decoder.

  • The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representations for a relevant task.

  • We've come a long way from traditional programming to neural networks to generative models.

  • In traditional programming, we used to have to hard code the rules for distinguishing a cat.

  • Type: animal. Legs: 4. Ears: 2. Fur: yes. Likes: yarn and catnip.
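
As a rough illustration of the hard-coded approach, the rules above could be written as follows. The dictionary keys and the function name are assumptions made for this sketch, not part of the course material.

```python
# Hand-written rules for recognizing a cat, taken from the list above.
CAT_RULES = {"type": "animal", "legs": 4, "ears": 2, "fur": True}
CAT_LIKES = {"yarn", "catnip"}


def is_cat(candidate):
    """Return True only if every hard-coded rule matches."""
    matches_rules = all(candidate.get(key) == value for key, value in CAT_RULES.items())
    likes = set(candidate.get("likes", ()))
    return matches_rules and CAT_LIKES <= likes


cat = {"type": "animal", "legs": 4, "ears": 2, "fur": True, "likes": ["yarn", "catnip"]}
dog = {"type": "animal", "legs": 4, "ears": 2, "fur": True, "likes": ["bones"]}
print(is_cat(cat), is_cat(dog))
```

The rules are brittle: a cat that ignores yarn is rejected, and any four-legged furry animal that happens to like yarn and catnip is accepted.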

  • In the wave of neural networks, we could give the network pictures of cats and dogs and ask, is this a cat?

  • And they would predict a cat.

  • What's really cool is that in the generative wave, we as users can generate our own content, whether it be text, images, audio, video, or more.

  • For example, models like PaLM, or Pathways Language Model, or LaMDA, Language Model for Dialogue Applications, ingest very, very large data from multiple sources across the internet, and build foundation language models we can use simply by asking a question, whether typing it into a prompt or verbally talking into the prompt itself.

  • So when you ask it, what's a cat?

  • It can give you everything it has learned about a cat.

  • Let's compare LLM development using pre-trained models with traditional ML development.

  • First, with LLM development, you don't need to be an expert.

  • You don't need training examples, and there is no need to train a model.

  • All you need to do is think about prompt design, which is a process of creating a prompt that is clear, concise, and informative.

  • It is an important part of natural language processing, or NLP for short.

  • In traditional machine learning, you need expertise, training examples, compute time, and hardware.

  • That's a lot more requirements than LLM development.

  • Let's take a look at an example of a text generation use case to really drive the point home.

  • Question answering, or QA, is a subfield of natural language processing that deals with the task of automatically answering questions posed in natural language.

  • QA systems are typically trained on a large amount of text and code, and they are able to answer a wide range of questions, including factual, definitional, and opinion-based questions.

  • The key here is that you needed domain knowledge to develop these question answering models.

  • Let's make this clear with a real-world example.

  • Domain knowledge is required to develop a question answering model for customer IT support, or healthcare, or supply chain.

  • But using generative QA, the model generates free text directly based on the context.

  • There's no need for domain knowledge.

  • Let me show you a few examples of how cool this is.

  • Let's look at three questions given to Gemini, a large language model chatbot developed by Google AI.

  • Question one.

  • This year's sales are $100,000.

  • Expenses are $60,000.

  • How much is net profit?

  • Gemini first shares how net profit is calculated, then performs the calculation.

  • Then Gemini provides the definition of net profit.

  • Here's another question.

  • Inventory on hand is 6,000 units.

  • A new order requires 8,000 units.

  • How many units do I need to fill to complete the order?

  • Again, Gemini answers the question by performing the calculation.

  • And our last example.

  • We have 1,000 sensors in 10 geographic regions.

  • How many sensors do we have on average in each region?

  • Gemini answers the question with an example on how to solve the problem and some additional context.
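
For reference, the arithmetic behind the three questions can be checked with a short sketch. The function names are hypothetical, and this only verifies the numbers; it does not reproduce how the chatbot arrives at its answers.

```python
def net_profit(sales, expenses):
    """Net profit is sales minus expenses."""
    return sales - expenses


def units_still_needed(on_hand, order_size):
    """Units missing to complete an order, never negative."""
    return max(order_size - on_hand, 0)


def average_per_region(total_sensors, regions):
    """Mean number of sensors per region."""
    return total_sensors / regions


print(net_profit(100_000, 60_000))        # question one
print(units_still_needed(6_000, 8_000))   # question two
print(average_per_region(1_000, 10))      # question three
```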

  • So how is that?

  • In each of our questions, a desired response was obtained.

  • This is due to prompt design.

  • Fancy.

  • Prompt design and prompt engineering are two closely related concepts in natural language processing.

  • Both involve the process of creating a prompt that is clear, concise, and informative.

  • But there are some key differences between the two.

  • Prompt design is the process of creating a prompt that is tailored to the specific task the system is being asked to perform.

  • For example, if the system is being asked to translate a text from English to French, the prompt should be written in English and should specify that the translation should be in French.

  • Prompt engineering is the process of creating a prompt that is designed to improve performance.

  • This may involve using domain-specific knowledge, providing examples of the desired output, or using keywords that are known to be effective for the specific system.

  • In general, prompt design is a more general concept while prompt engineering is a more specialized concept.

  • Prompt design is essential while prompt engineering is only necessary for systems that require a high degree of accuracy or performance.
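
One way to picture the difference, using the translation example: the first function below only states the task (prompt design), while the second also supplies example translations to steer the output (prompt engineering). Function names and prompt wording are assumptions for this sketch, and no model is called.

```python
def design_translation_prompt(text, target_language="French"):
    """Prompt design: state the task and the target language clearly."""
    return f"Translate the following English text to {target_language}:\n{text}"


def engineer_translation_prompt(text, examples, target_language="French"):
    """Prompt engineering: add examples of the desired output to the same task."""
    parts = [f"Translate the following English text to {target_language}."]
    for english, translated in examples:
        parts.append(f"English: {english}\n{target_language}: {translated}")
    parts.append(f"English: {text}\n{target_language}:")
    return "\n\n".join(parts)


print(design_translation_prompt("Good morning."))
print(engineer_translation_prompt("Good morning.", [("Thank you.", "Merci.")]))
```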

  • There are three kinds of large language models.

  • Generic language models, instruction-tuned, and dialogue-tuned.

  • Each needs prompting in a different way.

  • Let's start with generic language models.

  • Generic language models predict the next word based on the language in the training data.

  • Here is a generic language model.

  • In this example, "The cat sat on".

  • The next word should be "the", and you can see that "the" is most likely the next word.

  • Think of this model type as an autocomplete in search.
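
Next-word prediction can be sketched with a toy bigram counter. This is a deliberately simplified stand-in: real generic language models use neural networks trained on far more text, and the three-sentence corpus here is invented for illustration.

```python
from collections import Counter, defaultdict

# Invented toy corpus; a real model trains on vastly more text.
CORPUS = "the cat sat on the mat . the dog sat on the rug . the cat lay on a bed ."


def train_bigram_counts(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


counts = train_bigram_counts(CORPUS)
print(predict_next(counts, "on"))
```

In this corpus "on" is followed by "the" twice and by "a" once, so "the" is the predicted next word.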

  • Next, we have instruction-tuned models.

  • This type of model is trained to predict a response to the instructions given in the input.

  • For example, summarize a text of x.

  • Generate a poem in the style of x.

  • Give me a list of keywords based on semantic similarity for x.

  • In this example, classify text into neutral, negative, or positive.
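
A minimal sketch of how the classification instruction might be built and how a model's reply could be checked against the allowed labels. The prompt wording and helper names are assumptions, and the reply string stands in for real model output.

```python
ALLOWED_LABELS = ("neutral", "negative", "positive")


def build_classification_instruction(text):
    """Put the instruction first, then the text to classify."""
    labels = ", ".join(ALLOWED_LABELS)
    return f"Classify the text into one of: {labels}.\nText: {text}\nSentiment:"


def parse_label(response):
    """Normalize a reply and accept it only if it is an allowed label."""
    cleaned = response.strip().lower().rstrip(".")
    return cleaned if cleaned in ALLOWED_LABELS else None


print(build_classification_instruction("The delivery was late again."))
print(parse_label(" Negative.\n"))  # stand-in for a model reply
```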

  • And finally, we have dialogue-tuned models.

  • This model is trained to have a dialogue by predicting the next response.

  • Dialogue-tuned models are a special case of instruction-tuned where requests are typically framed as questions to a chatbot.

  • Dialogue-tuning is expected to be in the context of a longer back-and-forth conversation and typically works better with natural question-like phrasings.
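
The back-and-forth context can be pictured as a list of turns that is passed along with each new question. The role names "user" and "model" and the rendering format are assumptions for this sketch; actual chat systems define their own formats.

```python
def render_conversation(turns):
    """Flatten (role, text) turns into one prompt that ends where the model replies."""
    lines = [f"{role}: {text}" for role, text in turns]
    lines.append("model:")
    return "\n".join(lines)


turns = [
    ("user", "What is a large language model?"),
    ("model", "A large, general purpose language model that is pre-trained and then fine-tuned."),
    ("user", "What does fine-tuned mean?"),
]
print(render_conversation(turns))
```

Because the earlier turns are included, the follow-up question can be understood in context.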

  • Chain of thought reasoning is the observation that models are better at getting the right answer when they first output text that explains the reason for the answer.

  • Let's look at the question.

  • Roger has five tennis balls.

  • He buys two more cans of tennis balls.

  • Each can has three tennis balls.

  • How many tennis balls does he have now?

  • This question is posed initially with no response.

  • The model is less likely to get the correct answer directly.

  • However, by the time the second question is asked, the output is more likely to end with the correct answer.
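
The reasoning a chain-of-thought answer would spell out for the tennis ball question can be sketched as explicit intermediate steps. The function name and the exact wording of the reasoning text are assumptions; the point is that the final answer follows from the written-out steps.

```python
def tennis_ball_reasoning(initial, cans, balls_per_can):
    """Work through the problem step by step and return (reasoning, answer)."""
    bought = cans * balls_per_can   # step 1: balls in the new cans
    total = initial + bought        # step 2: add them to the starting count
    reasoning = (
        f"Roger started with {initial} balls. "
        f"{cans} cans of {balls_per_can} tennis balls each is {bought} tennis balls. "
        f"{initial} + {bought} = {total}. The answer is {total}."
    )
    return reasoning, total


reasoning, answer = tennis_ball_reasoning(initial=5, cans=2, balls_per_can=3)
print(reasoning)
```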

  • But there is a catch.

  • There's always a catch. A model that can do everything has practical limitations.

  • But task-specific tuning can make LLMs more reliable.

  • Vertex AI provides task-specific foundation models.

  • Let's get into how you can tune with some real-world examples.

  • Let's say you have a use case where you need to gather how your customers are feeling about your product or service.

  • You can use a sentiment analysis task model.

  • Same for vision tasks.

  • If you need to perform occupancy analytics, there is a task-specific model for your use case.

  • Tuning a model enables you to customize the model response based on examples of the tasks that you want the model to perform.

  • It is essentially the process of adapting a model to a new domain or a set of custom use cases by training the model on new data.

  • For example, we may collect training data and tune the model specifically for the legal or you can also further tune the model by fine-