Sam Witteveen
Day ago
30,066
1

Claude 3 Vs Gemini Vs GPT-4: Who Can Make Amazing Powerpoints?

Science and Technology

In this video I compare 3 LLMs
🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: drp.li/dIMes
👨‍💻Github:
github.com/samwit/langchain-t... (updated)
github.com/samwit/llm-tutorials
⏱️Time Stamps:
00:00 Intro
00:02 Tweet
00:25 What is the current state of building a PowerPoint Deck with an LLM?
01:08 ChatGPT
04:14 Claude 3 Opus
04:34 Gemini 1.5 Pro
08:27 Making the slides with Python
11:46 v0.dev
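
For the "Making the slides with Python" step, a minimal sketch might look like the following. It assumes the LLM was asked to return the deck as a JSON list of objects with `title` and `bullets` keys (a hypothetical format, not necessarily the one used in the video) and that the third-party `python-pptx` package does the rendering, which this page does not confirm. `parse_outline`, `render_pptx`, and `SAMPLE_OUTLINE` are illustrative names.

```python
import json


def parse_outline(raw: str) -> list[dict]:
    """Clean an LLM-produced JSON outline (assumed format: list of
    {"title": str, "bullets": [str, ...]} objects)."""
    cleaned = []
    for item in json.loads(raw):
        title = str(item.get("title", "")).strip()
        # Drop empty or whitespace-only bullets the model may have emitted.
        bullets = [str(b).strip() for b in item.get("bullets", []) if str(b).strip()]
        if title:  # skip slides that have no title at all
            cleaned.append({"title": title, "bullets": bullets})
    return cleaned


def render_pptx(slides: list[dict], path: str) -> None:
    """Write the cleaned outline to a .pptx file using python-pptx."""
    from pptx import Presentation  # third-party: pip install python-pptx

    prs = Presentation()
    layout = prs.slide_layouts[1]  # "Title and Content" in the default template
    for spec in slides:
        slide = prs.slides.add_slide(layout)
        slide.shapes.title.text = spec["title"]
        body = slide.placeholders[1].text_frame
        for i, bullet in enumerate(spec["bullets"]):
            # The text frame starts with one empty paragraph; reuse it first.
            para = body.paragraphs[0] if i == 0 else body.add_paragraph()
            para.text = bullet
    prs.save(path)


# Hypothetical model output, including the kind of noise worth filtering.
SAMPLE_OUTLINE = (
    '[{"title": " Project update ", "bullets": ["Milestone reached", "  ", "Next steps"]},'
    ' {"title": "", "bullets": ["orphan"]}]'
)
```

Calling `render_pptx(parse_outline(SAMPLE_OUTLINE), "deck.pptx")` would write a one-slide deck. Keeping the parsing separate from the rendering means the model only has to produce content, while layout stays in code that can be adjusted by hand.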

Comments: 74

@micbab-vg2mu 2 months ago
Working in big pharma, I've seen firsthand that less than 1% of people in my corpo are leveraging generative AI effectively, with many in denial about its potential. Despite 18 years as a white-collar manager, I've discovered that AI models like GPT-4 and Claude 3 can handle half of my tasks more competently. Yet, the real breakthrough comes from combining my experience with these models. Despite the challenges in convincing others, I've seen the undeniable benefits of integrating AI into my workflows.
@Chris-qe1nx
2 months ago
It's even funnier considering how this comment kinda reads like an LLM-generated comment lol.
@micbab-vg2mu
2 months ago
@@Chris-qe1nx Yes, I am not a native speaker, and I run my writing through Gemini Advanced for a clarity and grammar check :-)
@micbab-vg2mu
2 months ago
@Chris-qe1nx Chris, I use LLMs to correct my grammar mistakes and improve the clarity of my writing - I am not a native speaker. Next time I will use an additional prompt: "I have a short comment generated by an LLM. It's intended for a general audience. I want to rewrite it to make it sound more natural and indistinguishable from human writing." The results are better if I add some misteks and trolling - the content will be 100% human :)
@Chris-qe1nx
2 months ago
@@micbab-vg2mu Not a bad thing at all, that's what it's here for
@ickorling7328
2 months ago
For now. It's learning from you. It can and will replace middle managers and anyone short of the very best and hardest-working people.
@jayhu6075 2 months ago
This is a great example exercise for using an LLM to design slides or websites in the future. Hopefully, there will be a follow-up that utilizes agents for this. Many thanks for this.
@paulmiller591 2 months ago
Very interesting idea thanks Sam. You have inspired me. I constantly need to create tweaked pitch decks, so this is the way to go, giving me back time I was short of from the start. It also allows me to eventually add a full-time salesperson to follow the process that I prefer in content creation without me needing to run the creation process.
@YuraL88 2 months ago
I'm really excited to see testing in a real-world scenario, not only some puzzles and academic benchmarks.
@ehza
2 months ago
exactly
@AnimusOG 2 months ago
Well done brother. Every damn video is a Gem in your channel. More people will find you... If they don't, they're missing out! If you haven't seen this man's colabs, they are groundbreaking for your progress folks!
@TheEamonKeane
2 months ago
What collabs? On other channels?
@supernewuser 2 months ago
It's actually hilarious that the Gemini one made all of the pages black
@samwitteveenai
2 months ago
lol I hadn't thought of it that way until I read your comment. Yes, very funny point.
@avi7278 2 months ago
Someone with expertise will still need to collect the data and prompt the LLM, then proof and edit.
@Dave-cg9li 2 months ago
'Not sure about the markdown or LaTeX bit'? A lot of presentations in academia and research are fully made in LaTeX, it's definitely a great option for that 😄 ... and it might be the better way for LLMs to design whole presentations
@anothername2730 2 months ago
The part where you said you told Gemini it had to get its act together was hilarious. 😂
@stevefox7469 2 months ago
Current risks of using an LLM: 1) it takes longer than not using an LLM, 2) it does a worse job than a human, 3) the results can't be trusted and need a human to fact-check them. Creating a slide deck is just a task as well, not a job. A job is often a group of only loosely related tasks, which makes it hard for current LLMs. I am an avid LLM user, doing perhaps 100 prompts a day using Claude Opus, custom-built OpenAI assistants, etc., and I feel like I spend most of my day figuring out how to get LLMs to do tasks more effectively than me. I feel we still have a long way to go before we get replaced.
@carlpanzram7081
2 months ago
I agree partly. Right now LLMs aren't yet capable of replacing people in their jobs, but I strongly disagree with the notion that "we have a long way to go." Just 5 years ago we had nothing like useful LLMs. If it progresses at the same pace it has in the recent past, it's not going to take longer than 10 years at maximum until they are more intelligent and knowledgeable than not just the average human, but ANY human. Also, the issue is largely one of architecture and meta-architecture. If you had a bunch of LLMs that were really good at a few single specific tasks, then in order to have AI replace a human entirely, all you would need is another LLM to replace what YOU are doing as well. Think about your part of the equation, and whether you think AI could replace those tasks too. I mean, what do you do? Write prompts. Make appointments. Send emails. Plan into the future. Use programs. Organize data and design workflows. Why would AI not be capable of that very soon?
@Fermion.
2 months ago
Right now, LLMs are very powerful tools, if one has good project management and some software development skills. For instance, just prompting an LLM to "Make a real-time inventory tracking system for us" would fail miserably. But if one knows how to efficiently split said project into much more specific tasks, amongst several departments, that's where it can be very powerful, leading to a meeting with the Project Manager and C-levels about layoffs and departmental budgetary cutbacks. (I know this from personal experience.) Knowing how to create, superprompt, and defer tasks to specific AI agents is where the current models shine. Some random, non-technical office worker feeding in vague, crappy prompts will get vague, crappy results. GIGO (Garbage In, Garbage Out) still applies to LLMs.
@carlkim2577 2 months ago
Nice test. But I think the prompting could have been more detailed. For example, I would have found a slide I like and uploaded a pic. Show it the colors and formatting you want. We could easily create a custom GPT that asks for the right input and then generates the slides.
@alertbri 2 months ago
Have you seen/considered/compared the Gamma app?
@samwitteveenai
2 months ago
No, I haven't. I just did a search; I presume it is this? gamma.app/
@alertbri
2 months ago
@@samwitteveenai yes, that's it - the free trial gives you a good chance to assess it. I particularly like the Web based display which adapts to mobile / Web for presentation.
@kate-pt2ny
2 months ago
@@samwitteveenai Gamma is the best for PPT
@RobBrogan
2 months ago
Came here to say the same. It seems like tasks like these are better handled by specific products geared for that, such as Gamma for presentations, Gigabrain for searching Reddit, ReLoom for wireframes, etc... Although ChatGPT is great, it really s***s the bed when you try to ask it something like designing a UI haha
@codelucky 2 months ago
Is Claude 3 Opus better at writing blog posts or copywriting than Haiku?
@micbab-vg2mu
2 months ago
We do not have access to Haiku yet - in theory Opus is better, but in reality?
@reza2kn 2 months ago
Awesome as always. I wonder how much better or worse the Copilot thing in PowerPoint would work. Although I'm sure Apple will add the same thing to Apple Keynote and call it innovation too!
@seventyfive7597 2 months ago
Copilot Pro is GPT-4 fine-tuned for this
@mickmickymick6927 2 months ago
Why not Mistral? In my experience it's similar to GPT-4
@micbab-vg2mu
2 months ago
Mistral is good - but not better than GPT-4 - which is why I do not use it more often - now I am testing Opus.
@ShaneNg 2 months ago
It's what M365 Copilot is for
@andyma1146 2 months ago
Hi! 😄 I'm a long-time subscriber and first-time commenter on this channel 😁 Do you have a Patreon? If so, I would like to check it out, but I couldn't find a link on your YouTube channel page. Please let me know. Thanks for making all of this excellent content - I feel like I've learned so much from your videos & they are among my favorite AI content on YouTube! 🙏
@samwitteveenai
2 months ago
Hi, thanks for the kind words. I am planning on launching a Patreon later this week. I just want to load it up with some extra content before launch.
@andyma1146
2 months ago
@@samwitteveenai Oh, awesome! I'm looking forward to it! I hope that you will share a link to it when it's ready, thanks
@Neomadra 2 months ago
I think we need AI agents to get proper slides; not sure LLMs will get them right in one shot anytime soon
@CNW21 2 months ago
I'd be curious to see this with Gemini Advanced. This isn't exactly a fair comparison. Very interesting otherwise though, thanks.
@samwitteveenai
2 months ago
So the Gemini I used in here was 1.5 which should probably be better than Gemini Advanced
@CNW21
2 months ago
@@samwitteveenai yikes.... just when I thought Google had caught up
@Amamos
1 month ago
@@samwitteveenai I think 1.5 is the update to the free version, still worse and much smaller than 1.0 ultra
@ali.saghabashi 2 months ago
Gemini 1.5 Pro is the best currently.
@samson_77 2 months ago
Thanks, that's a good video about what's already possible and what's not. Outcome: LLMs are still bad designers, but I am pretty sure that this might change in the future.
@PseudoProphet 2 months ago
Gemini 1.5 Pro is not the same as the others; it's an MoE model.
@ickorling7328
2 months ago
Nah, GPT-4 is highly speculated to be MoE as well. The future is that MoE will be part of the solution to some degree. The layers between user and model, and checks, will be key. That's on top of AutoGen / ChatDev-type agent organization possibilities. Mamba MoE and other Mamba research is key right now and for future integration.
@PseudoProphet
2 months ago
@@ickorling7328 Yes, that's true, but Gemini 1.5 doesn't even use Gemini Ultra as one of its experts or brain. It only uses Gemini 1.0 Pro, which is a competitor for GPT-3.5.
@ali.saghabashi
2 months ago
Gemini 1.5 Pro is the best currently
@PseudoProphet
2 months ago
@@ickorling7328 Yes, it is MoE now; it wasn't when launched. Gemini 1.5 Pro is MoE, but it doesn't have Gemini Ultra as a part of it.
@mnageh-bo1mm 2 months ago
Waiting for the Gamma video
@parsazeinali1812 2 months ago
You should be trying Gamma
@samwitteveenai
2 months ago
Someone else mentioned this and I just checked it out. It is very nice (and impressive) at the layout elements, though it still didn't get the full prompt right.
@JudWhite1
2 months ago
The Gemma models provided by Google (gemma-7b-it-sfp, etc) don't let its full potential shine through. Check it out if you're interested in Gemma specifically, but also check out community fine tunes like Gemma-Wukong-2b. None appear state of the art, but inference is fast with gemma.cpp and Transformers, and they're still quite good for their size. I believe fine-tuning requires less resources which is its own type of interesting.
@dirremoire 2 months ago
Ask Gemini to add a picture of Nerva. Go ahead, I dare you. 😅
@samwitteveenai
2 months ago
lol he might look a bit different based on what their public service was doing 😅
@pstefan86 2 months ago
Having explored this technique a fair amount - the biggest challenge isn't whether you can generate a compelling set of slides, it's the nuance of messaging given the context, which is often complex. At this point LLMs disrupt mediocrity - but not really well-placed, value-added, highly contextualized information transmission.
@samwitteveenai
2 months ago
Yeah, I agree with your points. I do think it could make someone 10x faster at researching and making the slides though. Curious what others think about that.
@efejiroe 2 months ago
Good job. Using the same prompt for all three is problematic I think.
@alicefuller3071 2 months ago
Sooo you made a video comparing 3 LLMs with the title AMAZING POWERPOINTS and don't even have PowerPoint on the computer you're using to demonstrate how amazing the presentation can be? So your title is a lie because we're not going to see a PowerPoint slide presentation. We're looking at Google Slides. You should've just put that in the title. Yes, both tools make slide presentations but they are not the same.
@samwitteveenai
2 months ago
Fair point, though I do think PowerPoint is a pretty generic term and the Python package was for building 'PowerPoints'
@bobtarmac1828 2 months ago
AI job loss? Really?! Next you’ll be telling me some kind of AI new world order will be taking over soon.