Claude 3 vs ChatGPT in Street Fighter | Local 7B Model Tournament (Mistral, Gemma ++)

Science and technology

👊 Become a member and get access to GitHub and Code:
/ allaboutai
🤖 AI Engineer Course:
scrimba.com/learn/aiengineer?...
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
GitHub:
github.com/OpenGenerativeAI/l...
ROM:
wowroms.com/en/roms/mame/down...
In this video I share how you can install and test the open-source project Street Fighter LLM Eval. I create my own strategies, upgrade the code to include the Claude 3 API, and run a local 7B model tournament! Very fun!
00:00 Street Fighter LLM Intro
00:24 How to Install
06:10 Claude 3 vs OpenAI Setup
08:45 OP Counter Strategy!
12:58 Local 7B LLM Tournament
18:21 Conclusion

Comments: 39

@FerGodSakes219 a month ago
Just wanted to say thanks for another great video. Discovered your channel last weekend and really appreciate your content. thank you!
@AllAboutAI
a month ago
thnx a lot! really appreciate it, glad you're enjoying the content :) let me know if you have any questions!
@joannot6706 a month ago
That was so interesting, please do more of this!
@AllAboutAI
a month ago
thnx :) yeah this was a lot of fun to create. i have way more ideas so will def do more of these in the future!
@joannot6706
a month ago
@@AllAboutAI Awesome
@nic-ori a month ago
Thank you. Useful information.👍👍👍
@AllAboutAI
a month ago
thnx mate :) happy to help! let me know if there is anything else i can assist with.
@TheHistoryCode125 a month ago
This video demonstrates how to set up and use an open-source project called LLM Coliseum that allows you to evaluate large language models in real-time using the video game Street Fighter. The process involves installing Docker, cloning the GitHub repository, and setting up API keys. The video then shows how to pit OpenAI's GPT-3.5 model against Anthropic's Claude in the game.
@AllAboutAI
a month ago
nice, sounds like a super fun project! i've had a blast playing around with it. as i mentioned in the video, i've made a few tweaks to the open source code so you can use models like claude 3 haiku as well. def checkout the github if you become a channel member, i upload all my code and experiments there. let me know if you have any other questions!
@orterves
a month ago
This is an AI generated summary of the video
@DemiGoodUA a month ago
Would be cool to see the same thing, but without the response time dependency. It is more interesting to see who is smarter, not faster.
@App2bits a month ago
@AllAboutAI, could you explain why you used WSL instead of plain Windows? :)
@AllAboutAI
a month ago
ah yeah, i just found it a bit easier to run everything on linux for this project. the docker setup and all that just seemed to work a bit better for me on wsl. but no worries if you just wanna use windows instead, should still work fine :)
@peterkonrad4364 a month ago
what i've always been wondering, also with games like connect 4 and so on: if there's a strategy a that always beats strategy b, and strategy b always beats strategy c, does that automatically mean that a beats c? or could it be that c beats a? that would mean we have a kind of rock paper scissors situation, which would be much more fun. it would mean you first have to identify what strategy your opponent is using, and then react with something that you know beats it. it would mean that there is no single strategy that beats everything; a single dominant strategy would be kind of boring.
@AllAboutAI
a month ago
thnx, that's a really good point. i think you're absolutely right, it could be a rock paper scissors type situation. that's something i've been thinking about as well. it would definitely make it more fun and challenging, having to identify your opponent's strategy and then counter it. i'm going to have to experiment more with that. i agree, having a single optimal strategy that beats everything would be a bit boring. the fun is in trying to outsmart and outmaneuver your opponent. great insights, thanks for sharing!
@coryarmbrecht
a month ago
@@AllAboutAI Is this where things like AlphaStar and AlphaGo come in?
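The question raised in this thread, whether "a beats b" and "b beats c" implies "a beats c", can be checked mechanically once head-to-head results exist. Below is a minimal sketch, assuming a hypothetical table that maps each strategy to the set of strategies it reliably beats. The strategy names and outcomes are made up for illustration and are not results from the video.

```python
def is_transitive(beats):
    """True if 'a beats b' and 'b beats c' always implies 'a beats c'."""
    for a, a_wins in beats.items():
        for b in a_wins:
            for c in beats.get(b, set()):
                if c != a and c not in a_wins:
                    return False
    return True


def find_cycles(beats):
    """List rock-paper-scissors style triples (a beats b, b beats c, c beats a)."""
    cycles = []
    for a in sorted(beats):
        for b in sorted(beats[a]):
            for c in sorted(beats.get(b, set())):
                # Report each cycle once, starting from its alphabetically first member.
                if a in beats.get(c, set()) and a < b and a < c:
                    cycles.append((a, b, c))
    return cycles


# Hypothetical results: each key maps to the set of strategies it reliably beats.
ladder = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
cyclic = {
    "aggressive": {"defensive"},
    "defensive": {"counter"},
    "counter": {"aggressive"},
}

print(is_transitive(ladder), find_cycles(ladder))
print(is_transitive(cyclic), find_cycles(cyclic))
```

If `find_cycles` returns anything for real match data, no single strategy dominates and the opponent's strategy has to be identified before picking a counter, which is the rock paper scissors situation described above.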
@App2bits a month ago
What is your hardware setup to work with these models in parallel?
@AllAboutAI
a month ago
thnx for tuning in! i am actually running this on my personal GPU, so i have a decent rig. but i agree, having multiple gpus would be ideal to play with different models in parallel. im just using this for a bit of fun and learning, not for anything production ready. let me know if you have any other questions!
@negadan77 4 days ago
We need to do a fight test with gpto Vs gpt4
@rishabhsingh1406 a month ago
I think using groq for inference and then having battle can make it even more fun
@AllAboutAI
a month ago
oh yeah, that's a great idea! i've actually been looking into using groq myself recently. i think it could add a really cool extra layer to the battle sim. i'll definitely give that a try and see how it goes. thanks for the suggestion, it sounds super fun!
@Grandork a month ago
Why use GPT 3.5 and Claude Haiku instead of GPT 4 and Claude Opus?
@AllAboutAI
a month ago
thnx for the question. i decided to use gpt 3.5 turbo and claude haiku for a few reasons. first, they are a bit more accessible and affordable compared to the more powerful gpt-4 and claude opus models. i wanted to make this project something anyone could try out. plus, i've found the haiku and 3.5 models to be surprisingly capable for things like this. but you're right, the newer models could potentially offer even better performance! i'll have to experiment more with those in the future.
@ReNiCGaming a month ago
holy shit
@AllAboutAI
a month ago
tnx! yeah, i tried to make this project as fun and engaging as possible. if you wanna get the code, just sign up as a member and i'll invite you to the community github :)
@ReNiCGaming
a month ago
I've played a whole lot of sf3 (check the channel). If you want help refining prompts/"moves" and optimal strategy, I'd love to help.
@ReNiCGaming
a month ago
@@AllAboutAI if I can find the time I may.. it would be interesting to play AGAINST AI.
@wurstelei1356 a month ago
Aw, I thought it was doing the multimodal thing, but this is just the keyboard combos as text. Multimodal would be way too slow I guess.
@AllAboutAI
a month ago
yeah, i agree. the keyboard combos are just a quick demo, the real power is in the multimodal stuff. but that does take a lot more compute, so its not quite ready for prime time yet. i'll try to do a more in-depth tutorial on the multimodal stuff soon!
@wurstelei1356
a month ago
@@AllAboutAI Nice, I am collecting everything I can about multimodal AI. Especially the robot controlling ones. Would be nice to see a tutorial on controlling a cheap robot arm by multimodal here. Or maybe with a Google robot transformer.
@qadirtimerghazin a month ago
I guess it may be OK in WSL, but seeing stuff done as root freaks me out :)
@AllAboutAI
a month ago
haha yeah, i know what you mean. i try to avoid root where possible too. but sometimes it just makes things easier, ya know? anyways, hope you're still enjoying the vids! let me know if you have any other questions.
@jana171 a month ago
It's like you just taught Skynet that a less aggressive battleplan will get you the win in the end... maybe humanity will survive a few more years due to your research 🙂
@AllAboutAI
a month ago
haha cheers mate :) yeah i def think there is a lot of potential in using llms strategically for different tasks. its just a bit of fun and learning, but who knows what the future holds!
@NotEpoch
a month ago
ai is way too smart to fight a war. only humans are stupid enough to do that. if ai really wanted to take over it would very slowly infiltrate our ideas, opinions, government, etc. it told me so itself lol!
@jana171
a month ago
@@AllAboutAI Yeah i totally loved this.. we could be looking at an entire new sport here, or a complete shift in how measuring of models and hardware are done. Epic !
@Lancelotxxx a month ago
i hope they manage to get it running at real speed someday and with multiple characters. It's quite a good and fun benchmark imho
@AllAboutAI
a month ago
thnx, yeah that would be really cool. i think we are still a way off from that, but the progress in ai is moving so fast, who knows what the future holds! :)