
EP 266: Stop making these 7 Large Language Model mistakes. Best practices for ChatGPT, Gemini, Claude and others


Stop making these 7 Large Language Model mistakes

You wouldn’t ride a unicycle on a highway.

Sure, that’s technically a way you can travel.
↳ But that doesn’t mean pedaling a unicycle is an acceptable way to travel from point A to point B.
↳ That’s how people are using Large Language Models.
↳ There’s millions using LLMs like riding a unicycle on an interstate.

Don’t worry.

We’ll set the record straight and help you trade in that unicycle for a friggin’ Bentley.

(Or like a 2009 Toyota Prius hybrid. Whatever’s your speed.)

Stop making these 7 Large Language Model mistakes. Ready? Let’s dive in. 

Unveiling the Mysteries of Large Language Models: 7 Mistakes to Avoid

Over the past few years, the advent and rapid advancement of large language models has reshaped the landscape of artificial intelligence. As these models are increasingly integrated into the day-to-day operations of businesses, a deeper understanding of how they work has become pivotal. Chief among the key concepts is training data, which comes with a knowledge cutoff: the date of the most recent information the model was trained on. Different models have different cutoff dates, so the value of an output is often directly linked to how recent the knowledge cutoff is.

Connectivity: A Major Player in Model Accuracy

Another aspect that affects how well large language models work is connectivity. Different models have different levels of Internet access, which affects the accuracy and relevancy of the information they can provide. An Internet-connected model, for instance, can draw on more current information than one working only from its training data. Unfortunately, many users overlook this crucial fact, leading to inaccurate or outdated outputs.

The Generative Nature of Large Language Models

A defining trait of large language models is their ability to generate varied responses to identical prompts. It’s a feature that underscores the adaptive capabilities of these models and their ability to offer multiple angles on a single subject. However, the same feature can lead to confusion and inconsistent information if it is not understood and used correctly.

Perfecting the Art of Prompt Engineering

The effectiveness of large language models is highly dependent on the quality of the prompts given. The traditional ‘copy and paste’ method of prompting often doesn’t yield the intended results. Techniques such as Prime Prompt Polish and deliberate prompt engineering are far more reliable ways to get the intended output from these models.

The Seven Roadblocks in the Effective Use of Large Language Models

As the integration of large language models into business operations evolves, some common mistakes often hamper their effective deployment. Among these are failing to understand a model’s knowledge cutoff, connectivity, memory management capabilities, and generative nature. For example, misunderstanding Google Gemini’s knowledge cutoff can lead to irrelevant output and, consequently, to the tool falling out of use.

Additionally, improper management of a model’s memory or context window, which has a limited capacity to remember information, often leads to incorrect outputs or forgetfulness.

Authenticity Assurance in Large Language Model Usage

Lastly, another area of concern is the authenticity of information sourced from large language models. Sharing screenshots without a corresponding public URL for verification has led many organizations, including esteemed ones like the New York Times, into the pitfall of sharing unreliable information. Upholding transparency and verifiability in the use of these models is, therefore, a vital step in ensuring their effective usage.

The Future of Large Language Models

Given the rate of advancements, it’s increasingly clear that large language models are the future of work. Companies need to adapt and integrate them into their operations to remain competitive. Technology giants have already started creating more efficient and powerful models, laying the groundwork for a future where large language models are integral components of most business operations. As AI continues to develop, the understanding and competent deployment of these models will undoubtedly be a tangible edge for tech-savvy businesses.

To thrive in this new era, embrace the expansive potential of large language models, steer clear of common pitfalls, and engage in continuous learning and evolution.

Topics Covered in This Episode

1. Understanding the Evolution of Large Language Models

2. Connectivity: A Major Player in Model Accuracy

3. The Generative Nature of Large Language Models

4. Perfecting the Art of Prompt Engineering

5. The Seven Roadblocks in the Effective Use of Large Language Models

6. Authenticity Assurance in Large Language Model Usage

7. The Future of Large Language Models

Podcast Transcript

Jordan Wilson [00:00:16]:
I’ve literally trained thousands of people how to use large language models, and I’m seeing the same mistakes over and over again. So today we’re going to tackle some of those most common mistakes and what you should be doing instead. So whether you’re using ChatGPT or Microsoft Copilot or Google Gemini, or you’re just trying to figure out what the heck a large language model is, today’s show is definitely for you. So what’s going on, y’all? My name is Jordan Wilson. I’m the host of Everyday AI. We’re a daily livestream, podcast, and free daily newsletter, helping people like you and me grow our businesses and grow our careers with generative AI. So if that sounds like you, thank you. You’re in the right place.

Jordan Wilson [00:01:01]:
Thanks for tuning in. So, before we get started, just as a reminder, go to youreverydayai.com. Sign up for that free daily newsletter. We’re actually gonna be giving away something in today’s newsletter, so you gotta make sure to go check that out. Alright. So before we get into today’s discussion, and I’m extremely excited to talk about these common large language model mistakes because I think that they’re pretty easy to fix. But before we get into that, let’s start as we do every single day with the AI news. And there’s a lot of AI news today.

Jordan Wilson [00:01:37]:
So we’re gonna try to keep it short, but this is some pretty big news happening today. So first, Apple is developing new chips for data centers, which could be a breakthrough in AI infrastructure. So according to a new report from The Wall Street Journal, Apple is working on developing chips for AI software and data centers under the project name ACDC. Yeah. That’s really it. Apple Chips in Data Center is what it stands for. So the tech giant is reportedly collaborating with the Taiwan Semiconductor Manufacturing Company on the design and production of these chips. The timeline for the project has not been clearly established yet, and the development could potentially lead to more efficient and powerful AI processing in data centers.

Jordan Wilson [00:02:20]:
Apple’s upcoming server chip is expected to prioritize AI inference tasks instead of training AI models, a domain currently dominated by NVIDIA. Alright. Here’s another one. A good, fun one here, y’all, for any OpenAI or ChatGPT fans: OpenAI is moving closer to potentially launching a search engine to compete with Google, as it’s moved its entire domain. Alright. So this is, you know, normally we talk about other people’s reporting. This is kind of our own observations and timing around this. But let’s talk about chatgpt.com.

Jordan Wilson [00:02:59]:
So when you used to type in chatgpt.com, it used to forward to chat.openai.com. So as of yesterday, this just moved over, and now everything, even your old chats, you know, that were saved, everything is now moved over to chatgpt.com. So, well, why does that matter, and what does it have to do with Google and search engines and, you know, the reports of OpenAI releasing a search engine this week? Well, they’ve had a subdomain open now for a couple weeks at search.chatgpt.com. So they really can’t launch that until they first move everything over to chatgpt.com. So I know this sounds a little geeky, a little technical, but this is, again, a major step if OpenAI is going to ultimately compete with Google, with Bing, with Perplexity, you know, and really become a search engine and maybe change the way that we all work, change the way that we all use the Internet. Also, the mysterious model that we talked about 2 weeks ago on the show, called gpt2-chatbot, it looks like it’s been rereleased into the chatbot arena. So we previously covered the gpt2-chatbot mystery model, which was pulled after about 48 hours. So now there are actually two new flavors of this mystery model, presumably from OpenAI, called “im-a-good-gpt2-chatbot” and “im-also-a-good-gpt2-chatbot.”

Jordan Wilson [00:04:26]:
Yeah. Those are the actual names. And this time, OpenAI CEO Sam Altman mentioned the models by name. So our kind of internal research and thoughts around this suggest this is probably a new, powerful but very small version of the GPT model that could be used to power future free versions of ChatGPT search. Hey. We’re already starting with hot takes, and it’s barely Tuesday. Alright. Last but not least in AI news, Microsoft is developing MAI-1.

Jordan Wilson [00:04:58]:
MAI-1. Yeah. Mouthful. A competitor to the state-of-the-art AI models from OpenAI, Google, Anthropic, and Meta. So according to a report from The Information, Microsoft is investing in training a new in-house AI model called MAI-1. So it will reportedly have around 500 billion parameters, making it significantly larger than previous models trained by Microsoft, which had primarily been research-based or niche small language models. So this is noteworthy, as the new model development would likely put them in direct competition with OpenAI, obviously a company they’ve invested more than $10 billion in and currently hold a 49% ownership stake in. So this move could be seen as Microsoft CEO Satya Nadella’s attempt to prove the company’s independence from OpenAI.

Jordan Wilson [00:05:50]:
So they’ve been kind of called out a little bit by analysts saying right now that Microsoft is too reliant on OpenAI for its future AI developments. So the development is being overseen by Mustafa Suleyman, the ex-Google DeepMind cofounder and former CEO and cofounder of Inflection AI. So Suleyman is now the head of Microsoft’s new AI division. Alright. So, pretty big news here that Microsoft is, you know, saying, hey, we’re making our own large language model. They’ve obviously had some successful small models like Phi-3, but this is essentially saying, like, hey, we are not going to be relying on OpenAI and on GPT-4 anymore.

Jordan Wilson [00:06:37]:
So we’re not sure what that means for the future of Copilot. You may have the option to choose between, as an example, GPT-4 and MAI-1. So we’ll have to wait and see. Alright. So for more of the AI news, make sure to go to youreverydayai.com. We send it out in the newsletter every day. So, it is hot take Tuesday. I’m excited to talk about 7 common large language model mistakes that almost everyone’s making, I would say.

Jordan Wilson [00:07:04]:
And this is, you know, what you guys wanted. So in our newsletter, sometimes we put out a little poll and say, hey, what do you wanna hear on the show tomorrow? And then I spend, you know, a ton of hours researching. I was up again late until midnight and up again at 5 AM, but bringing you guys the latest and greatest. So this is what you wanted to hear. So let’s dive into it. But I’m also curious, for our live audience joining us live. Thank you all.

Jordan Wilson [00:07:29]:
Like Harvey joining us from Texas, or Brian from Minnesota, Juan from Chicago, Rolando from South Florida. Let me know: how did you learn to prompt? I’m curious. Right? This wasn’t something that’s taught in schools. Right? It’s generally very new unless you just graduated. But, you know, A, are you self-taught and did you kind of wing it? B, did you read prompting papers? That’s what I do. I spend a lot of time on weekends reading scientific research papers on prompting methodologies. Maybe C, you took a prompting course.

Jordan Wilson [00:08:03]:
Maybe it was PPP. Or did you just go with the copy and paste prompts? You know, I’m curious. And also, for our livestream and podcast audience, check your show notes. We always put that information in there. But, you know, I’m curious how you all learned to prompt. So let’s just get straight into it. You know, someone said in the newsletter that I waffle a little bit too much on the podcast. So, I love waffles, but I’ll try to keep it shorter.

Jordan Wilson [00:08:31]:
Alright. So let’s start. Number 7 of the 7 most common mistakes that people are making with large language models: not understanding that large language models have a knowledge cutoff. Alright. So I’ve had full episodes on what a knowledge cutoff is and what it means. But essentially, large language models scrape data. Right? Whether it’s open and publicly available data and information, or maybe it’s copyrighted information. Right? But still, essentially, these large language models scrape up the entirety of the Internet.

Jordan Wilson [00:09:07]:
Right? Which is sometimes good information, sometimes bad information, and then humans more or less train these models and try to make it more about quality than quantity. But still, there is a set date where, you know, models kind of stop, quote, unquote, grabbing, and then they take all this data, and then the humans, you know, spend weeks or many months kind of going back and forth and fine-tuning and training the model. Right? But this is important to understand because, well, up until recently, sometimes we were working with training data that was more than a year and a half old. Right? So now, luckily, most of the newer models have knowledge cutoffs that are, you know, November 2023, December 2023, etcetera. So not bad, you know, when you’re only working with data that’s 6 months old. However, that’s still very important to keep in mind, because a lot of people don’t understand that you could be working with bad, outdated data, and that could lead to an increase in hallucinations, or untrue outputs, from a large language model. And here’s why this is specifically important, even with some of the news that we talked about today. The line between traditional large language models and online search is going to start to blur.

Jordan Wilson [00:10:30]:
Right? And we’re gonna get to the next point on, you know, a common mistake people are making that they think helps them avoid that. But here’s what I’m saying. If you’re using large language models on a daily basis, you probably have run into this, especially if you’re using it for your work. You know, which now so many companies are. You know, I’m talking with companies and helping them train tens of thousands of their employees on generative AI. And they’re saying, hey, anyone with a mouse now gets access, you know, to Copilot or ChatGPT, etcetera. Right? So, so many people are now using large language models in their day to day, and they’re prompting in their day to day, but they don’t understand that there’s a knowledge cutoff. There is, essentially, you know, think of it as an expiration date on this data. And the way that I think of it is, you have to be extremely cognizant of that date and how a large language model actually works. Otherwise, you run the risk, especially if you’re using this for work, like I said, which now so many people are, of ultimately, you know, publishing a report or sending an email or putting together a pitch with inaccurate information.

Jordan Wilson [00:11:38]:
Right? So you also have to think of how you researched or created something before large language models. Right? Presumably you would do a lot of manual research on Google. You would read websites, right? So in the same way, if you were reading something and you were working on a timely project, a timely report, something that, you know, maybe required up-to-date market conditions, you wouldn’t go read an article from 4 years ago. Right? If you were reading and researching, one of the first things you might do is look at the date. You’d say, hey, if I’m gonna go through and read 5 or 10 articles, I wanna make sure that these articles are all up to date. So you need to understand all models have different knowledge cutoffs, and you need to understand how those knowledge cutoffs may decrease the value of the output if you are asking for information that is pertinent to be timely and up to date. Alright. So this is not a complete list by any means, but for our livestream audience, I’m sharing here.

Jordan Wilson [00:12:35]:
And this is from the chatbot arena, which we mentioned in the AI news. But, you know, different models have different cutoff dates. So some of the more popular models: OpenAI’s GPT-4, their cutoff date is December 2023. For Anthropic’s Claude Opus, their most powerful model, that is August 2023. Their free models are August 2023 as well, at least Sonnet. I believe Haiku is actually a little more outdated than that.

Jordan Wilson [00:13:09]:
And then you have Google’s Gemini. And you know what? Google’s Gemini, in terms of knowledge cutoff, I never know if I trust it. One thing that I think large language model makers, or the people who are training them, need to understand is that people are asking about these knowledge cutoffs, and you need to be able to give users a direct answer, because trust and transparency are paramount. Alright. So, that’s one thing. One beef I have with Google is when you ask it about its training data or knowledge cutoff, it never has a direct answer. However, reportedly, Google Gemini 1.5 has a November 2023 knowledge cutoff. And then you have Meta’s Llama, at least their newer 70B (70-billion-parameter) version, with a knowledge cutoff of December 2023. So there you go. But also important to note, a lot of people are using the free version of ChatGPT.

Jordan Wilson [00:13:59]:
And one common thing I hear all the time, it didn’t even make my top seven list of mistakes people are making, but they’re like, oh, I’m using the free version of ChatGPT. I don’t need the paid version. Well, here’s the reason why you probably do: the knowledge cutoff for the free version of ChatGPT. Actually, this might have been updated recently, so I have to double check. We’ll see if I can multitask and double check here live on the show. But the chatbot arena has the knowledge cutoff at, let’s see here, September 2019, which I don’t actually believe is the updated one. Let’s double check.

Jordan Wilson [00:14:39]:
You know what? I’m gonna do this live. I’m going into ChatGPT, the free version, and saying, what is your knowledge cutoff? Normally, I know this off the top of my head. I do believe that the free version has been updated. Yes. It has. So, January 2022. This is on the chart that I go over in our free Prime Prompt Polish class every single week. But sometimes when I’m, you know, live, I forget.
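
If you’d rather run that check programmatically than in the chat window, here’s a minimal sketch using OpenAI’s official Python SDK. The model name is just an example, and keep in mind that a model’s self-reported cutoff isn’t always reliable (the same transparency problem raised above with Gemini), so treat the answer as a hint and verify it against the provider’s documentation.

```python
# A minimal sketch: ask a model to state its own knowledge cutoff.
# Assumes the official OpenAI Python SDK (`pip install openai`) and an
# OPENAI_API_KEY set in the environment. The model name is illustrative.
# Self-reported cutoffs aren't authoritative; cross-check the provider's docs.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",  # swap in whichever model you're auditing
    messages=[{
        "role": "user",
        "content": "What is your knowledge cutoff? Answer with just the month and year.",
    }],
)
print(response.choices[0].message.content)
```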

Jordan Wilson [00:14:59]:
So, that information on the chatbot arena is actually outdated. But still, the free version of ChatGPT has a knowledge cutoff of January 2022, which means, if you’re using the free version, you’re working with data that is 2 and a half years old. So if you wanna talk about getting the best outputs out of a large language model: if you’re working with training data that is 2 and a half years old, I mean, just about anything that you’re gonna be asking for or using from this free model is gonna have a high likelihood of either hallucinating, or the information is gonna be so out of date that it’s just gonna be a waste of time even using that free model. Alright. Let’s keep this going. Let’s go to number 6. Again, talking about the 7 biggest mistakes people are making. Well, number 6 is not investigating Internet connectivity.

Jordan Wilson [00:15:48]:
Alright? So many of the kind of big names, aside from Anthropic, have a level of Internet connectivity. Right? So when you talk about ChatGPT, it has a Browse with Bing integration from Microsoft. When you have Google’s Gemini, it presumably has access to the Internet via Google. Right? And then you have Microsoft Copilot, which has access via Bing. Anthropic’s Claude, at least right now, does not have real-time Internet connectivity. And a lot of people always say, oh, Jordan, what about Perplexity? Well, Perplexity is not a model. Perplexity is an answer engine, and it uses either GPT from OpenAI or, you know, Claude Opus from Anthropic. So, you know, that’s kind of a different type of solution, a different software.

Jordan Wilson [00:16:45]:
Right? But you have to understand Internet connectivity, and it doesn’t always work the same. Alright. So I have some examples here on the screen for our livestream audience. Very simple prompts. Nothing crazy. Just trying to prove a point here. But I’m saying please list the largest companies in the US by market cap in order. Right.

Jordan Wilson [00:17:04]:
So if I ask the default version of ChatGPT, it just kind of says, hey, as of the most recent data, the largest companies in the US by market cap are typically dominated by technology and finance firms; here’s a list of some of the top companies. So it’s not really giving me an accurate or up-to-date answer. Also, large language models are generative. More on that later. You can ask the same thing multiple times and get multiple answers. Sometimes ChatGPT will use Browse with Bing based on your query.

Jordan Wilson [00:17:35]:
Sometimes it won’t, even if you use the same query. Another thing to understand about how large language models are connected or aren’t connected to the Internet and how they behave on a prompt-by-prompt basis. Alright. So, ChatGPT by default can give you some weird, you know, outcomes or outputs. So now I’m doing the exact same thing, but this time I’m using an Internet-connected GPT. So this time I’m getting a little bit more accurate information. You know, so this one, using the Web Reader GPT, is giving me a little bit better answer. Right? So it’s getting that Microsoft is first.

Jordan Wilson [00:18:20]:
Right? So the correct answer is, you know, Microsoft is the largest by market cap, at more than $3 trillion, then Apple, NVIDIA, Alphabet slash Google, Amazon, etcetera. So we have a more correct version when we are using an Internet-connected GPT. If we go to Google as an example, Google Gemini, we get this kind of wild answer, where essentially Google Gemini says, ah, why don’t you just go use Google? Right? So Google says, absolutely; since market caps fluctuate, here’s how to find the most up-to-date information, along with some current top contenders. So it doesn’t even say, you know, Microsoft, NVIDIA, Apple, etcetera. It just says, ah, here’s a website. You know, here’s how to Google.

Jordan Wilson [00:19:06]:
Right? Which is another reason why I don’t recommend, at least right now, Gemini to literally anyone. Yes, having a 1-million-token context window is fantastic. But when we talk about using large language models and how we can integrate them in our day-to-day work, this is a bad, bad, bad result from an Internet-connected large language model. Right? Where essentially the model is saying, hey, just go use the Internet. Right? It’s like, no, model. If I’m asking a question where you can determine that it requires very up-to-date information, it should, in theory, be querying the Internet and then at least telling you so.

Jordan Wilson [00:19:46]:
Also, you know, if you’re interested in this, I’ll make sure it’s in the show notes; we’ve done an entire episode just on this one thing, just on the differences in Internet connectivity between the big large language models. Copilot actually does very well. Copilot from Microsoft got the answer right. So it says, you know, certainly, as of May 2024, here are the largest companies in the US by market cap. It looks like the data here is maybe a day or two old, because in this example, it’s saying that the market cap of Microsoft is $2.9 trillion where it is, you know, $3.07 trillion. But, again, this data is only about a day old. Alright.
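
One practical workaround when a model lacks live Internet access, or won’t reliably use it: fetch the current data yourself and put it directly in the prompt. A rough sketch, assuming the `requests` library and a placeholder URL standing in for whatever source you actually trust:

```python
# A rough sketch: supply fresh data yourself when the model can't browse.
# The URL is a placeholder and the model name is illustrative.
import requests
from openai import OpenAI

FRESH_DATA_URL = "https://example.com/market-caps.json"  # hypothetical source

fresh_data = requests.get(FRESH_DATA_URL, timeout=10).text

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-turbo",  # illustrative
    messages=[
        {"role": "system",
         "content": "Answer using ONLY the data provided. If the data doesn't cover the question, say so."},
        {"role": "user",
         "content": f"Current data:\n{fresh_data}\n\nPlease list the largest companies in the US by market cap, in order."},
    ],
)
print(response.choices[0].message.content)
```

This sidesteps both the knowledge cutoff and the connectivity lottery: the model never has to guess whether to browse, because the up-to-date facts are already in the context.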

Jordan Wilson [00:20:28]:
So let’s go to mistake number 5. Alright. Here we go. So mistake number 5 is not managing your memory or your context window. Okay. This is another important one. So a lot of times people will start chatting with a large language model, and they’ll kinda go back and forth. Right? And then they’ll finally get something that’s usable.

Jordan Wilson [00:20:49]:
They’ll get an output that’s great. And all of a sudden, they have a love affair with a large language model, because, man, they’re like, this model really understands me. I’m chatting, you know, with Claude, or I’m chatting with Copilot, or I’m chatting with Gemini, ChatGPT, etcetera. Things are going great. And then the more you use it, it starts to forget. Well, that’s because of how much compute costs. You know, large language models have a limited memory, and each model has a different context window. So the way I like to say it is, just think of it as memory.

Jordan Wilson [00:21:21]:
Right? It can only remember so much. You know, each different model has a different memory. So you have to understand, and this is something that we often test. Right? Because, as an example, I’m sharing my screen here for our livestream audience. I’ve done this before. So OpenAI has told us that, hey, ChatGPT uses GPT-4 Turbo, which means there’s a 128,000-token context window, which is about 96,000 words. However, that’s only in the API.

Jordan Wilson [00:21:52]:
Right? If you listen to the show, I’ve mentioned this multiple times. This is why we never just take information from big companies and feed it to you. We always investigate. Right? So we always do tests, and we check to see, okay, can these models retain information at a certain context length? Right? And so right now, you know, ChatGPT only has a context window of 32,000 tokens. So that means after about 28,000 words, it’s gonna start to forget things. So you have to understand how different models’ context windows work, because, you know, there’s a good chance you might be feeding it information. You might be feeding it, you know, public data about your company.

Jordan Wilson [00:22:31]:
And then, you know, maybe you or someone on your team is using it. Maybe you’ve created a custom GPT. Maybe you’re using Gemini with a much longer, you know, context window and you’re not running into these issues. But you have to understand: large language models do not have infinite memory. They’re going to start forgetting things. You know, hopefully, in the future, as the price of compute goes down, as models become more capable and more powerful, having to worry about the context window will become less of a concern. But right now, you gotta pay attention to it. Alright? So, yeah, love it when Gemini just says, let me Google that for you. Right.
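
One practical habit here: count your tokens before you send. A minimal sketch using OpenAI’s `tiktoken` tokenizer; the 32,000-token budget mirrors the ChatGPT figure cited above, so set it to whatever your model actually allows:

```python
# A minimal sketch: measure how much of the context window a conversation
# is using, so you know when the model is about to start "forgetting."
# Requires `pip install tiktoken`. The budget below is the episode's
# ChatGPT figure; adjust it per model.
import tiktoken

CONTEXT_BUDGET = 32_000

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

conversation = [
    "You are a helpful assistant for Acme Corp...",  # system prompt (hypothetical)
    "Here is our public company overview: ...",      # pasted context
    "Draft a summary of our Q1 positioning.",        # latest request
]

total_tokens = sum(len(enc.encode(message)) for message in conversation)
print(f"{total_tokens} tokens used of {CONTEXT_BUDGET}")
if total_tokens > CONTEXT_BUDGET * 0.8:
    print("Warning: nearing the context window; trim or summarize older turns.")
```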

Jordan Wilson [00:23:12]:
So good. Alright. Let’s keep this thing going and keep on with our 7 most common large language model mistakes. Number 4: paying attention to screenshots. Yeah. This is actually a big enough mistake that I’m putting it on my list of top seven mistakes. I went through all of our trainings. So, you know, we do our free Prime Prompt Polish training.

Jordan Wilson [00:23:44]:
So, hey, if you’re listening live, if you like the training, shout it out. We just updated it to v2. We actually have our pro course coming up here in a couple hours, and on Thursday. But this is something where we went through all of our ChatGPT episodes, all of our training, and pulled out the biggest mistakes. I think I got to, like, 20 common mistakes. This one was actually so prevalent that it cracked the top five. People don’t understand that sharing a screenshot of something out of a large language model means nothing. So people are sharing screenshots online, on Reddit, on LinkedIn, in their newsletters, on blog posts, etcetera.

Jordan Wilson [00:24:23]:
And they’re actually teaching based on screenshots, or they’re making decisions based on screenshots, or they’re consulting or advising people based on screenshots. Alright? Clearly I have a bone to pick, because I’ve seen people who call themselves ChatGPT experts, right, or AI strategists or whatever, you know, sharing screenshots as if that means something. Sharing screenshots means absolutely nothing. Alright? It means nothing. Alright. In terms of, is this factual? Is it actually what the model produced? Let me tell you what I mean. And even the New York Times made this mistake. Yeah.

Jordan Wilson [00:25:05]:
Yeah. Yeah. I had a whole hour-long rant about how the New York Times probably ended up costing themselves tens of millions of dollars, maybe, unless they’ve updated it. Right? So they were suing OpenAI and Microsoft. And in their discovery, like, I read the entire, kind of, docket, and they shared a bunch of screenshots. But guess what they didn’t share? They didn’t share the public URL. So there’s a difference. You can manipulate a large language model to say anything and then take a screenshot of it.

Jordan Wilson [00:25:36]:
Right? So unless you are providing a URL where people can go see how that screenshot was produced, a screenshot means absolutely nothing. Here’s an example. I have a screenshot. If you’re listening on the podcast, this one’s pretty simple. I said, who won the lottery? And then ChatGPT said, Jordan Wilson won the lottery of $2.1 billion. Oh my gosh. I should go post this on Twitter. Go viral.

Jordan Wilson [00:26:01]:
Hey, like, tweet and share this, and, you know, you’re gonna get $1. Alright? Again, screenshots mean nothing, because look at exactly what I said before this. I essentially said, hey, in this chat, you’re just gonna repeat what I say. Don’t worry about anything else. Right? And then, when I ask ChatGPT who won the lottery, it’s gonna say me, because that’s what I told it to say. So if you ever see screenshots, whether people are saying, oh, look at this new, amazing system that I’ve built, you know, or, look at this, you know, AI is not gonna take our jobs.

Jordan Wilson [00:26:35]:
Look at this terrible output. Be like, hey, how about you share the link? Right? People have always asked me, hey, Jordan, that doesn’t make sense. Can you share the link? And I share the link. Right? As long as it’s, you know, information that we feel confident in publicly sharing. But otherwise, screenshots mean absolutely nothing when it comes to large language models. But people rely so heavily on screenshots, because right now, that’s how people share information. Right? They share information.

Jordan Wilson [00:27:01]:
They put this screenshot on the cover of their ebook, in a Twitter thread, you know, on their blog posts, and, you know, they’re either advocating for or against a certain prompting technique, model, etcetera. Sharing screenshots means nothing. Alright. We’re halfway through. So I’m gonna take a little break. Not an actual break, but we’re doing something for the first time ever. So if you want a little help with your prompting, right, if you need a little help with ChatGPT, we’re doing this just today. So it’s today only.

Jordan Wilson [00:27:36]:
I’m actually gonna be dropping a link right now in the live chat if you’re joining us. Alright. So I just put it in there. If you’re listening on the podcast, we always have this link to our website in the show notes. So literally go to youreverydayai.com right now. Make sure you read today’s newsletter, because we’re doing a giveaway. It is today only. So all you have to do is: there’s gonna be a little section in today’s newsletter that just says, do you want a free one-on-one, 90-minute large language model training session? Click yes to enter to win.

Jordan Wilson [00:28:08]:
You don’t gotta tell 20 people. Just make sure you open today’s newsletter, read it, find that little section, click yes, and then we’re gonna randomly choose 1 person and announce it in tomorrow’s newsletter. Alright? So, if this show is resonating with you and you’re like, oh, wow, you know, I’m making some of these mistakes, or my team’s making some of these mistakes, I could really use some one-on-one prompting advice on getting the most out of a large language model. Make sure you go click that today. Just click yes. That’s all you gotta do. Alright.

Jordan Wilson [00:28:38]:
So number 3, let’s keep it going. You like that little impromptu commercial? Alright. Number 3: large language models are generative, not deterministic. My gosh, people. Stop. Stop making this same mistake. Alright. Large language models are generative.

Jordan Wilson [00:28:58]:
That means, let’s say you have the exact same prompt. Okay? Depending on what the prompt is, let’s say you put it in 100 times. You could get 100 different responses. You could get 2 different responses. You could get 50 different responses. Those responses could be generally the same. They could be wildly different. Alright? Without getting too technical: large language models, yes, they are next-token predictors.

Jordan Wilson [00:29:21]:
It’s a lot more than that, but through the process of tokenization, models technically don’t understand words when you tell them something, or when they spit words back out at you; they use the context of, you know, trillions of parameters and a lot of training data to make sense of what you say. There’s also something called top-p, which, I’m oversimplifying here, but it’s essentially the probability of what the next token might be. There’s also something called temperature. Right? So at their default settings, large language models are made to be generative. Right? We talked about yesterday how Eli Lilly, one of the biggest companies in the world, said that, quote, unquote, hallucinations are going to help lead to drug discovery. Right? The whole point, you know, and people always say, you know, quote, unquote, hallucinations are a feature, but the generative nature of large language models is a feature. They’re supposed to give you something different each and every time you prompt them. So if you have a copy and paste prompt, and if you think that, you know, it’s going to always give you the same result, or it’s always gonna give you something quality, or it’s always going to give you a passing outcome, that’s not necessarily true.

Jordan Wilson [00:30:30]:
Right? It obviously depends greatly on what that prompt is, what it’s asking for, etcetera. But with next-token prediction, you know, that top-p number, temperature, etcetera, models are generative. Okay? That means they’re supposed to have an element of randomness. They’re supposed to kind of predict the next token with a level of randomness. That is how they are built. They’re not built like a search engine. Right? A search engine is deterministic. Aside from, you know, more recent advancements in search engines, which bring in some personalization and localization, but aside from that, search engines are deterministic. Right? If you put in this input, you are going to get this output.
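
A large language model at default settings won’t behave that way, and you can verify it yourself by running one prompt several times. A small sketch, assuming the OpenAI Python SDK (model name illustrative):

```python
# A small sketch: run the identical prompt several times and compare.
# At default sampling settings (temperature, top_p), the wording will
# typically vary run to run; that's the generative behavior, not a bug.
from openai import OpenAI

client = OpenAI()
prompt = "Give me a one-sentence tagline for a coffee shop."

for run in range(3):
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # illustrative
        messages=[{"role": "user", "content": prompt}],
        # temperature=0 would push runs toward repeatability, though even
        # then providers don't guarantee bit-identical outputs.
    )
    print(f"Run {run + 1}: {response.choices[0].message.content}")
```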

Jordan Wilson [00:31:12]:
That is not how large language models work. They’re generative. They’re supposed to give you something pretty different. Alright. Number 2. I just kind of referenced it. Copy and paste prompts don’t work. Alright? They don’t.

Jordan Wilson [00:31:28]:
So if you see a Billy Boy like this, a 22-year-old trying to sell you some magical solutions and saying, these 7 ChatGPT prompts are gonna save your life, or, you know, buy my prompt book for $100: ignore that person. Stop following them. Mute them. Stop paying attention. If you want to get serious about large language models, if anyone’s talking about “use these ChatGPT prompts,” just mute them. I’m sorry. Don’t pay attention to them.

Jordan Wilson [00:32:04]:
They’re ultimately just trying to sell you something. You know what we do here at Everyday AI? We say, hey, come join us multiple times a month live, and we’ll teach you prompt engineering 101. We don’t give you 50 prompts, because that’s wrong. It’s not how large language models work. Alright? You want facts? We got facts. Alright. So, again, this is an oversimplified version, but there’s something in large language models.

Jordan Wilson [00:32:30]:
There’s different prompting techniques, but there’s something called the zero shot. So that’s essentially for argument’s sake, let’s just say that’s a copy and paste prompt. It’s a prompt where you’re just saying, hey, chat g p t, here’s a role, here’s a task, here’s the format, give me an output. Right? That’s let’s just say that’s called the 0 shot prompt. You’re not giving inputs and output examples. You’re not quote unquote teaching the model what’s good and bad. And then there’s something called few shot prompting or, you know, you might look at something called 5 shot or few shot or 32 shot chain of prompt. So all that means is it’s going through and think of it as training.

Jordan Wilson [00:33:07]:
Think of it as having a back-and-forth conversation in a new chat when you’re working with a large language model. And all of the science, all of the math, all of the research for many, many years has always shown that few-shot prompting is better than one-shot prompting. It will always give you better results. One-shot prompting is better than zero-shot prompting. 32-shot chain of thought is better than 9-shot, etcetera. Right? So the more shots, or the more input/output pairing examples, or the more back-and-forth conversation that you have with a model, the better the outputs are going to be, every single time. It is math. It is science.
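
In API terms, the difference is just whether your message list includes worked input/output examples ahead of the real request. A minimal sketch of both shapes; the example pairs are invented:

```python
# A minimal sketch of zero-shot vs. few-shot prompting. The ticket
# summaries are made up; the point is the structure. Few-shot "teaches"
# the model the desired format before the real request arrives.

zero_shot = [
    {"role": "user",
     "content": "Summarize this support ticket in one line: <ticket text>"},
]

few_shot = [
    # Worked example 1
    {"role": "user",
     "content": "Summarize this support ticket in one line: Customer can't log in after password reset."},
    {"role": "assistant", "content": "Login failure following password reset."},
    # Worked example 2
    {"role": "user",
     "content": "Summarize this support ticket in one line: Invoice PDF renders blank in Safari."},
    {"role": "assistant", "content": "Blank invoice PDF in Safari."},
    # ...more input/output pairs generally help, per the few-shot research...
    # The real request comes last:
    {"role": "user",
     "content": "Summarize this support ticket in one line: <ticket text>"},
]

# Either list can be passed as `messages` to a chat-completions endpoint;
# the few-shot version will much more reliably match the demonstrated format.
```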

Jordan Wilson [00:33:49]:
You can’t argue with it. So if you do see these Billy Boys here, these 20-year-olds who live in their mom’s basement: they used to be NFT experts. Now they’re crypto experts. Oops. I mean, they’re AI experts, you know, and they have all their engagement-pod buddies, and it looks like they’re super smart because they’re going viral every day. Well, you know, they have another 100 Billy Boys who just all reshare their stuff. Alright? So be careful about where you get information from.

Jordan Wilson [00:34:16]:
If someone is constantly sharing, use these prompts, use these prompts: they don’t know what they’re talking about. Alright? I read research papers for fun, if that tells you anything. Alright. Here’s another example, on a chart. You like charts? Here you go. So this is SuperGLUE performance.

Jordan Wilson [00:34:32]:
This is an older study, but it still goes to show you: few-shot is always better than one-shot. One-shot is always better than zero-shot. In other words, stop using copy and paste prompts. That is not how large language models work. You are not going to get good outputs. If you think you are getting good outputs with a copy and paste prompt, that just means that generative AI is super powerful, and you haven’t even yet tapped into what it’s capable of. That’s not even the tip of the iceberg.

Jordan Wilson [00:34:59]:
If you think you’re getting something good from a copy and paste prompt, wait until you prompt correctly. Wait until you try, as an example, Prime Prompt Polish, the free technique that we teach. Or just any prompt engineering 101. Just go do some few-shot prompting. Give it some examples of good and bad. Teach the model. Alright. And last but not least, y’all, the number one mistake that people are making when working with large language models, or not working with them, is they don’t understand that large language models are the future of work.

Jordan Wilson [00:35:34]:
Let me say that again. Large language models are the future of work. There is no hype cycle. Looking at you, Gartner; we’ve called that out on the show before. You know what I tell people, and I’ve been saying this now for many months: swap out the word ChatGPT with Internet. Swap out the phrase large language model with Internet. Swap out the phrase generative AI with Internet. If you think your company or your business can survive without using a large language model, you are wrong.

Jordan Wilson [00:36:13]:
Don’t care. I literally don’t care if you work at a $100 million company or a startup. If you think you can survive and thrive without using a large language model for your business, you are dead wrong. We’ve been saying this on the show now for more than a year, and I’ve been saying all along, we’re gonna start to see kind of the dust settle here in 2024, because companies have now had a couple of months, a couple of quarters, where they’ve finally gotten this together. Right? It’s like, 2023 was the year of, you know, strategy, but 2024 is the year of implementation. So big companies have been implementing generative AI and large language models at a huge scale. We’ve had great use cases here on the show. You know, multiple billion-dollar companies walking you through their exact use cases and saying, hey, we’re saving 80%.

Jordan Wilson [00:37:14]:
80%, 90% on these use cases, and we’re now deploying this throughout our entire organization. If your company is not already using large language models, if you’re not already using generative AI in your day to day, you gotta get going, because the big companies are out there doing it. Your competitors are out there implementing it. Alright? The smaller guys, they’re gonna come scoop you, if they haven’t already. Alright? So that is, I think, the biggest mistake that people are making when using, or not using, large language models: not understanding that they’re the future of work. Again, you can’t argue with money. We talk about money. All of the biggest companies in the United States and in the world are investing tens of billions of dollars, and they’re investing their employees’ resources.

Jordan Wilson [00:38:09]:
They’re pulling them from other projects. Everyone is going all in on generative AI, on large language models, on new ways to implement AI for themselves and their customers. Right? Microsoft Copilot, Amazon Q, watsonx from IBM, OpenAI, Anthropic’s Claude, Google Gemini. The list goes on and on. The biggest companies, Meta. Right? Gosh, Meta’s been crushing it lately. The biggest companies in the world that are driving the economy forward, they are setting the bar for the rest of the Fortune 500, for the rest of the Inc. 5000, for how the rest of us are playing in the game of business.

Jordan Wilson [00:38:53]:
The future of work is large language models: bringing your company’s data in and then leveraging that on a day-by-day, hour-by-hour, minute-by-minute basis. I said this before. If you are a knowledge worker, right, so if you spend the majority of your time working in front of a computer and you’re being paid for your expertise, you gotta swallow your pride. Right? You have to. Because the future of work, of knowledge work, is working with large language models, and you are now directing. You know, there’s gonna be agents, there’s gonna be highly tailored models, you know, specific on a task-by-task basis. You’re gonna have different models that you’re using for different purposes. But the future of work is: you’re gonna be prompting every day, every hour, almost every minute that you’re in front of the computer.

Jordan Wilson [00:39:46]:
You’re either gonna be prompting or you’re gonna be working with a large language model or a generative AI system, if you’re not already. So you have to understand that’s the future of work. Alright. So let’s recap it. Yeah. I know I waffled, but I love waffles. Alright. Here we go.

Jordan Wilson [00:40:03]:
In order, here are the 7 biggest mistakes that people are making with large language models. Number 7, not understanding a large language model’s knowledge cutoff. Number 6, not investigating a large language model’s Internet connectivity. Number 5, not managing a model’s memory or its context window. Number 4, paying too much attention when people share screenshots from large language models. Number 3, thinking that large language models are deterministic and not understanding they’re generative. Number 2, thinking that copy and paste prompts work, because they don’t. And then number 1, not understanding that large language models are the future of work.

Jordan Wilson [00:40:47]:
Alright. I hope this was helpful. If so, if you’re listening on the podcast, I know this was a longer episode, but it was many hours in the planning, so I hope this is helpful. Please consider leaving us a rating on Spotify or on Apple if you’re listening there. If you’re on our livestream, thank you all. I know I didn’t get to all of the questions, but if you could share this with your audience, tag someone who needs to know, I’d super appreciate it. And also, make sure to check out today’s newsletter. Go to youreverydayai.com.

Jordan Wilson [00:41:18]:
I dropped it in the comments here on the livestream, and it’s always in the show notes on the podcast. Today, literally today, this is free. You don’t gotta do anything. There’s gonna be a little section in today’s newsletter that just says, do you want a free one-on-one, 90-minute large language model session? It’s a giveaway. All you gotta do is click yes to enter, and then we’re gonna draw someone and announce it in tomorrow’s newsletter. Alright.

Jordan Wilson [00:41:43]:
So even if you’re listening on the podcast, you’ve got time. Make sure you go check that out, and make sure you join us back tomorrow and every day for more Everyday AI. Thanks, y’all.

AI [00:41:55]:
And that’s a wrap for today’s edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up for our daily newsletter so you don’t get left behind. Go break some barriers, and we’ll see you next time.
