Since almost everybody uses ChatGPT now, and we'll keep using it (or another similar model) more and more, I believe it is important for everyone to understand the vocabulary used in this field.
Here is a segment of my podcast with Logan Kilpatrick, developer advocate at OpenAI, where I asked him for a "one-liner" definition of the most useful (popular) terms involved with GPT (or generative AI) in 2023.
Here's the final list for you:
What is GPT?
GPT is just the architecture of the model essentially. GPT stands for Generative Pre-trained Transformer, which is the technology that GPT-4, GPT-3.5, and all those models are based on.
What is the difference between ChatGPT and GPT-4?
ChatGPT is a website, it's a UI, it's an engineering product that is an interface for OpenAI's machine learning models, of which GPT-4 is one.
So GPT-4, GPT-3.5, there's a bunch of GPT-3.5 variant models. Those are the actual machine learning models. ChatGPT is the UI in which you interact with the machine learning models.
What is a token?
Tokens are how the model understands text. Instead of understanding text at a word level or an individual character level, it understands at a token level.
And if you look up OpenAI Tokenizer on your favorite search engine, you'll be taken to a nice visual explanation of how this tokenizing process actually works.
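To make the idea of "token level" concrete, here is a minimal sketch of a subword tokenizer. It is a toy: the vocabulary `TOY_VOCAB` and the greedy longest-match rule are assumptions made up for illustration, not how OpenAI's tokenizer works (real GPT tokenizers use a learned byte-pair-encoding vocabulary). It only shows that the pieces a model sees are often smaller than words and larger than characters.

```python
# Toy greedy longest-match subword tokenizer.
# TOY_VOCAB is a hypothetical vocabulary invented for this example;
# real GPT tokenizers learn their vocabulary from data.
TOY_VOCAB = ["token", "iz", "ation", "un", "believ", "able", " ", "is"]


def tokenize(text, vocab=TOY_VOCAB):
    """Split text into the longest matching vocabulary pieces, left to right."""
    pieces = sorted(vocab, key=len, reverse=True)  # try longer pieces first
    tokens = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # Nothing in the vocabulary matches: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("tokenization is unbelievable"))
# ['token', 'iz', 'ation', ' ', 'is', ' ', 'un', 'believ', 'able']
```

Three words become nine tokens here, which is why token counts and word counts rarely match.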
What is alignment?
Alignment is a process of taking a base model and actually making it more useful to a human. Right after the training process finishes for a large language model, it's not super useful or safe in general in that form. But you can go through this whole alignment process to actually make the model more in tune with what humans want to see.
What is RLHF?
RLHF, which stands for reinforcement learning from human feedback, is part of that alignment process in some sense, where you actually tune the model on a bunch of human preferences.
You can show a bunch of example outputs and then let users, real humans, decide which output is better for them.
And then you tune the model to more commonly choose the options that are better for us as humans to interpret.
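One common way to turn "humans picked output A over output B" into a training signal is a pairwise preference loss on a scoring model. The sketch below assumes that formulation; the podcast does not say which method OpenAI uses, and the function name `preference_loss` and the example scores are hypothetical. The loss is small when the human-preferred output already gets the higher score and large when it does not.

```python
import math


def preference_loss(score_chosen, score_rejected):
    """Pairwise loss: -log(sigmoid(score_chosen - score_rejected)).

    score_chosen is the score given to the output the human preferred,
    score_rejected the score given to the other output.
    """
    margin = score_chosen - score_rejected
    # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


# Scores agree with the human choice: low loss.
print(round(preference_loss(2.0, 0.0), 4))  # 0.1269
# Scores disagree with the human choice: high loss.
print(round(preference_loss(0.0, 2.0), 4))  # 2.1269
```

Lowering this loss over many human comparisons pushes the scores, and eventually the tuned model, toward the outputs people prefer.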
What is a prompt and a prompt engineer?
Prompts are just text that you send to the model.
And it can be something as simple as a question or a statement or a command, but it can also be super intricate, like a multi-paragraph prompt or a 20-step (or n-step) process. It's essentially what you want the model to do.
Prompt engineering is sort of the iterative process of refining the prompt to better have the model do what you want it to do. The general sense is the model might not always do what you want the first time you ask. And as you prompt engineer, you modify your prompt and see if you can get it closer to giving an output that you would want.
What is a modality or a multimodal model?
Multimodal model is a tongue twister. GPT-4 is OpenAI's first multimodal model, which just means that it can take text input and image input.
Multiple modalities, text and image in this case, and in the future, maybe other modalities, but at the present it's text and image.
What is model hallucination?
Model hallucinations are when you ask what we humans would consider to be a factual question, like "how many dogs are in the state of Illinois in the United States", and the model generates something that appears to be a potentially valid answer but is actually fabricated and just completely made up.
And that just goes back to the process of how outputs from these large language models are actually created. In that case, instead of saying the model is lying or giving us some incorrect answer, we call it a model hallucination.
What is an agent or a GPT agent?
GPT agents are this emerging area where, instead of going one step at a time with the human iterating on the prompt and adding additional steps, you define a set of goals and you let the model go and write however many prompts and issue as many commands as it needs to accomplish whatever goal you set out.
And this has the potential for some really interesting things, some potentially harmful things as well.
It's definitely one of these emergent use cases of large language models.
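The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `fake_model` is a hypothetical stand-in that returns canned actions instead of calling a real language model, and the `max_steps` limit is a design choice added here, not something from the podcast.

```python
def fake_model(goal, history):
    """Hypothetical stand-in for a language model: returns the next action."""
    plan = ["search: " + goal, "summarize results", "DONE"]
    return plan[min(len(history), len(plan) - 1)]


def run_agent(goal, model=fake_model, max_steps=10):
    """Let the model choose actions until it says DONE or the step limit is hit."""
    history = []
    for _ in range(max_steps):
        action = model(goal, history)
        if action == "DONE":
            break
        # A real agent would execute the action here (a tool call, an API
        # request) and store the result; this sketch only records the action.
        history.append(action)
    return history


print(run_agent("dog population of Illinois"))
# ['search: dog population of Illinois', 'summarize results']
```

The human only supplies the goal; the model decides each following step. The step limit is one simple guard against an agent that never stops, which matters given the potential for harm mentioned above.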
What is an AGI?
AGI is artificial general intelligence. There's a bunch of different definitions for AGI.
OpenAI's definition of AGI is, roughly, when these artificial intelligence systems are able to outperform humans at most economically valuable work.
So once we get to that point, that's when OpenAI will check the box that we've gotten to AGI, but people have very different definitions.
The Turing test was sort of an original example of this: as long as you could not tell whether you were talking to a human or a robot, that might be considered AGI.
And I think we've already sort of passed that mark.
There are a bunch of different definitions.
And voilà! Those are the most important terms to understand when you are working with OpenAI's products or generative AI in 2023! If you went through the list, I definitely invite you to watch the podcast episode (or listen on Spotify) with Logan: