🇫🇷 Mistral AI's NEW 22B Coding Model with Code Inpainting 🎨 Beats DeepSeekCoder 33B!

Science and Technology

Meet Codestral, the game-changing code generation model by Mistral AI! This powerful tool assists developers with code completion and interaction through an easy-to-use API. Codestral surpasses the competition, even beating DeepSeek Coder 33B and Llama 3 70B! Unlock your coding potential and boost your productivity with Codestral.
Tell us what you think in the comments below!
Maxime Tweet: x.com/maximelabonne/status/17...
Mistral Blog Post: mistral.ai/news/codestral/
La Plateforme (use Codestral FREE): chat.mistral.ai/chat/feec47ed...
Hugging Face Card (weights): huggingface.co/mistralai/Code...
-----------------
This video contains affiliate links, meaning if you click and make a purchase, I may earn a commission at no extra cost to you. Thank you for supporting my channel!
My 4090 machine:
amzn.to/3QMvE4s - MSI 4090 Suprim Liquid X 24G (best Linux compatibility)
amzn.to/3V5R0My - Corsair 1500i PSU
amzn.to/4dIwybZ - 12VHPWR Cables that DON'T MELT!
Tech I use to produce my videos:
amzn.to/4bN5eaR - Samsung T7 2TB SSD USB-C
amzn.to/4dJFHky - SanDisk 32GB USB-C flash drive
amzn.to/44LHZeG - Blue XLR Microphone
amzn.to/3ULTT3N - Focusrite Scarlett Solo USB-C to XLR interface

Comments: 84

@ppbroAI 28 days ago
yup, is rlly good. Tried in 4 bits, I like its explanations so far
@aifluxchannel
28 days ago
Great to hear! I can't wait to try 8 bit quants once I get back to my GPU machine! :)))
@OMGanger
27 days ago
Any suggestions on something better than gpto? I feel like it’s not that hard to run tree and retrieve and dump context at each node along it
@QuickTechNow 1 day ago
Helped me a lot in my C++ project, thought that these companies translate "code" to "python". Thanks!
@aifluxchannel
1 day ago
Glad it helped!
@JakubHohn 28 days ago
I really like the coding AIs, but what feels like a big downside is that none of them can CRUD (create, read, update, delete) files directly. Once they can do that, I think they will be radically more useful.
@aifluxchannel
28 days ago
Good point! I'll add this in the next video. I have noticed these models even struggle to string together relatively simple TypeScript / React apps.
@southcoastinventors6583 28 days ago
Nice video and test of Codestral, but if you're going to do a snake implementation or some kind of visual program, please run it. Need to add some pizazz. Also, it's great to have some competition from Europe; always look forward to what Mistral releases.
@aifluxchannel
28 days ago
Thanks for the feedback! I wanted to keep the video under 20 min! Will do a full demo next time.
@JoeBrigAI 28 days ago
Looks good. Let's see it in a real workflow.
@aifluxchannel
28 days ago
What would you like to see? Webdev, Solidity / web3, I'm all ears!
@onoff5604 27 days ago
Many thanks for the detailed coverage of the topic.
@aifluxchannel
27 days ago
Glad it was helpful!
@hobologna 25 days ago
code inpainting is a brilliant concept!
@aifluxchannel
25 days ago
I think it could become a really popular way to interact with coding models, especially if you could point / direct where you want it to focus in a codebase with comments.
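For readers unfamiliar with the idea, here is a minimal sketch of how comment-directed inpainting could be prepared on the client side. It is only an illustration: the marker text `# FILL HERE` and the helper names `split_at_marker` and `fill_gap` are made up for this example and are not part of Codestral or any Mistral API. The assumption is that a fill-in-the-middle model receives the code before the gap (prefix) and the code after it (suffix), and generates what belongs in between.

```python
from typing import Tuple

# Hypothetical marker a developer leaves where the model should write code.
MARKER = "# FILL HERE"


def split_at_marker(source: str, marker: str = MARKER) -> Tuple[str, str]:
    """Split source into (prefix, suffix) around the first line containing marker.

    The marker line itself is dropped, so the model only sees real code
    on either side of the gap.
    """
    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if marker in line:
            return "".join(lines[:i]), "".join(lines[i + 1:])
    raise ValueError(f"marker {marker!r} not found")


def fill_gap(prefix: str, completion: str, suffix: str) -> str:
    """Stitch a model's completion back between prefix and suffix."""
    if completion and not completion.endswith("\n"):
        completion += "\n"
    return prefix + completion + suffix


example = (
    "def area(width, height):\n"
    "    # FILL HERE\n"
    "\n"
    "print(area(2, 3))\n"
)
prefix, suffix = split_at_marker(example)
# A real setup would send prefix and suffix to the model; here the
# completion is hard-coded to keep the sketch self-contained.
patched = fill_gap(prefix, "    return width * height", suffix)
print(patched)
```

Dropping the marker line keeps the instruction comment out of the final file; whether a given model works better with the comment left in the prefix is something to test rather than assume.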
@cd92606 25 days ago
Excellent overview. Personally my goal is ultimately to only use locally running models, so this is an exciting step!
@aifluxchannel
25 days ago
Which models are you planning to run locally!?
@justindressler5992 28 days ago
Cool, a Mandelbrot set; that's the only use case I have for a code gen. Literally the most useful code ever. In my entire career of 30 years, I can't say I ever needed or even felt the urge to write a Mandelbrot set. Why don't people use real-life tasks, like writing a React login form with unit tests, e2e tests, and backend verification with a Node Express server and database, again with unit tests? Have it explain the security techniques used to protect against hacking and to protect credentials. This is needed in almost every app. Until these things can be done flawlessly (passwords encrypted in the DB, TLS-enabled connections, data validation, avoiding code injection, 2FA, CORS, SSO with Google, a secure session/DB account scheme, RBAC, and so on), they won't be replacing anyone.
@aifluxchannel
28 days ago
I generally like to stick to tasks that a human could do, but also tasks that don't take too much time to demo. I generally find that a lot of coding models will "explain away" things they're unsure how to actually implement with pseudo code or explanations of "best practices" - but also because they're just regurgitating documentation when that happens. What else would you like me to focus on / change in future videos when I'm evaluating coding performance?
@maloukemallouke9735 25 days ago
thanks for this experience
@aifluxchannel
25 days ago
Thanks for watching!
@pn4960 28 days ago
super cool!
@aifluxchannel
28 days ago
Thanks, we're glad you liked it!
@PythonAndy 28 days ago
thanks for the vid ♥
@aifluxchannel
28 days ago
You bet! Let us know what you'd like to see more of!
@peterwood6875 27 days ago
I like to use Claude 3 Haiku for coding. I can always use Opus for things like coming up with the coding project itself, or to ask tricky technical questions. I talk to Haiku about the implementation and to plan, then get it to come up with some unit tests, then get it to write the code. Getting it to think a bit before generating the code seems to get it to generate good code.
@aifluxchannel
27 days ago
Thanks for sharing! Have you used the new phi-3 as well? Curious what kind of coding you're using this for?
@peterwood6875
25 days ago
@@aifluxchannel I often have conversations with Claude about maths and physics. Writing some code to do some calculations is a good way to familiarise oneself with relevant concepts, and more fun than doing calculations by hand with a pen and paper. A recent project was to implement a homomorphism and representations of Lie groups that are related to quantisation of spin. I haven't tried phi3. It looks like some versions have a decent context length, but I find that Claude's context length isn't quite enough for the way I use it.
@dkracingfan2503 28 days ago
Yes, it beats it!
@aifluxchannel
28 days ago
Pretty exciting isn't it? What kind of finetunes do you want to see done to this mistral model?
@AaronALAI 28 days ago
I've been having great success with Wizard's Mixtral 8x22B model for coding. My workflow is pretty simple: I use textgen webui to talk to my models, with the Spyder IDE in another window, and just talk to the LLM like a normal person.
@aifluxchannel
28 days ago
It'll be curious to see how similar the evals for those two models are. Given they're the same size I wonder if this is just a super-sampling of one of the "experts" from their 8x22B model
@AaronALAI
28 days ago
Ooh interesting hypothesis, I noticed it was a 22b model they released and wondered if it was related in some way to their 8*22b model.@@aifluxchannel
@VastCNC 28 days ago
I’d like to see a model tuned to a specific language other than Python and JS derivatives. Elixir is a prime candidate with an excellent documentation library (hex docs)
@jonmichaelgalindo
27 days ago
Base model training literally needs hundreds of millions of lines of code.
@aifluxchannel
27 days ago
It would be interesting to train the model with as little documentation / english commentary and context to see if a more accurate or actionable model would come from it.
@VastCNC
27 days ago
@@aifluxchannel do you think fine tune would be sufficient? I think with elixir, outside of the documentation, open source repositories would be of higher quality because of the skill involved to become productive compared with Python and Js
@moak4052 26 days ago
Which AI do you recommend for coding?
@aifluxchannel
26 days ago
I generally use DeepSeek Coder 33B and GPT4 ;)
@siegfriedcxf 24 days ago
They didn't include codeqwen1.5-7b-chat. It actually scores higher on HumanEval than Codestral and is way smaller (7B vs 22B). I tried both; codeqwen is actually better.
@aifluxchannel
23 days ago
I haven't tried CodeQwen yet, but I've definitely been impressed with Qwen 1.5 - what kind of coding do you do with this model?
@garrettbates2639 28 days ago
I feel like i missed something about Devin
@aifluxchannel
28 days ago
Devin turned out to have faked their demo, and in reality was actually quite far away from "replacing software engineers" with ai ;)
@garrettbates2639
28 days ago
@@aifluxchannel Ahhh. Makes sense. Not much better than repeatedly prompting other models, I imagine? That's unfortunate, but at least it spawned some open source projects to try and do what they pretended to do, I suppose.
@m12652 25 days ago
There have been so many changes in JavaScript, HTML and CSS in the last couple of years, why would a web dev want to use a tool that is only trained to 2001...
@aifluxchannel
25 days ago
Base reasoning is key, because it means finetuning on top of newer javascript docs / code is even easier and translates to solid performance after the fact.
@m12652
25 days ago
@@aifluxchannel and yet every coder AI model I tried has produced such flaky code it hurts to read it. Even taking into account they might not be trained on new functionality.
@OMGanger 27 days ago
Phi has 128k context and is only 4B?
@aifluxchannel
27 days ago
It's more about how you use the context window than its length ;)
@tapu_ 27 days ago
You should test out if it can write and run DreamBerd, the greatest language ever.
@aifluxchannel
27 days ago
Hahaha can't tell if this is a joke or a real programming language?
@Arcticwhir 28 days ago
Doing some testing, it can be quite lazy and its creativity is low, although its coding abilities are definitely sharp and I have yet to get any bugs. The way I would use this would be for autocomplete and pseudocode (you have to be quite detailed).
@aifluxchannel
28 days ago
Interesting, thanks for sharing your results. Curious what terms / attributes you use to measure how "creative" a coding LLM is? This might help me improve how I test models in the future!
@lel7531 27 days ago
Why are you not running the code ?
@aifluxchannel
27 days ago
I can do this in livestreams, but for model review videos it takes too much time. thanks for the suggestion.
@hjups 28 days ago
An interesting model, but unimpressive in my testing. Although, it seems to be dependent on the language and problem difficulty: high-resource languages with simpler problems are more likely to succeed. Coming from the computer architecture side (hardware design), I always test the models on low-level C and Verilog problems (relatively simple due to low expectations). GPT 3.5 and Llama3-70B succeeded more often than not, but Codestral failed all of my test cases. In fact, Codestral broke math by insisting that a*b == a+b if b is odd, and otherwise a random number (whatever was previously stored). When I pointed out the contradiction, it only doubled down. Llama3-70B and GPT 3.5 have never failed that badly for me.
@aifluxchannel
28 days ago
It's been a while since I've written Verilog, but definitely an interesting edge case to test Codestral with. What kind of work do you generally use Llama3-70B to assist / accelerate?
@hjups
27 days ago
@@aifluxchannel It's a fun yet frustrating language. I haven't been using LLama3-70B to assist with any hardware tasks, it still fails on anything useful (only succeeds at simple tasks). GPT4 can sometimes generate more complicated Verilog, but usually requires manual correction. It's mostly useful for generating sub-function behavior in tooling (C and python). That still requires manual guidance, but speeds up development by ~10x. I would be more hopeful of LLama3-400B, but I guess that won't be released.
@linklovezelda 28 days ago
Check your title bro
@aifluxchannel
28 days ago
Thanks for the tip!
@sevilnatas 28 days ago
Wait, what happened to Devin?
@aifluxchannel
28 days ago
Demo was fake, wasn't actually as capable as its creators claimed.
@sevilnatas
28 days ago
@@aifluxchannel Ah, crazy! I guess it was good enough for Microsoft.
@firstlast493 28 days ago
How about AutoCoder 33b?
@aifluxchannel
28 days ago
We can test this soon! Is this your go-to coding model?
@firstlast493
23 days ago
@@aifluxchannel No. There's just very little video about this model.
@jonmichaelgalindo 27 days ago
But can it write ffmpeg commands?
@mirek190
27 days ago
Yes. Also, you can paste in the newest documentation, and then it works even better.
@jonmichaelgalindo
27 days ago
@@mirek190 Have you tried? I guarantee you haven't. Not even GPT-4 can do anything more complicated than mp3 -> ogg, and even struggles with something simple like that.
@aifluxchannel
27 days ago
GPT4 and Mixtral 8x7B are particularly good with these commands. This was one of the first things that really impressed me about these models.
@aifluxchannel
27 days ago
It can do things much more complicated! You should try it out.
@jonmichaelgalindo
27 days ago
@@aifluxchannel We must be prompting it differently then. :-/ For example (real example): I wanted to input my two camera videos, convert them from fisheye to equirectangular, combine them with one on the left and the other on the right (stereo), crop 120 pixels from left and right of both, move the right down 180 pixels (bad lens alignment from manufacturer), then scale the entire output to no more than 8K. GPT-4 was nowhere near being able to write the command. (I never did figure it out. I'm doing those operations manually in Blender.)
@AI-Wire 22 days ago
"We all know what happened with Devin." Nice engagement bait. Just tell us what you mean. But instead, you bait us for engagement.
@aifluxchannel
22 days ago
Thanks for the feedback, I assumed it was well known that Devin was caught faking their demo about a week after announcing their model.
@pigeon_official 28 days ago
But GPT-4o is barely decent at coding. I can't imagine the open source stuff will be remotely useful if GPT-4o can't do 90% of coding tasks more complex than, like, an intro-to-coding-course type thing.
@aifluxchannel
28 days ago
I do generally agree that GPT-4o (outside of OpenAI's demo) is basically useless for real coding tasks. Especially as a co-pilot.
@brulsmurf
27 days ago
The "open source stuff" isn't lagging behind. And yes, there are a lot of problems with using LLMs for coding tasks. You need to be very careful.
@yongamamkolokotho9904
27 days ago
I was creating a BFS-generated maze using 4o; so far, for me, it's impressive.
@handsanitizer2457
24 days ago
He means for anything complex @@yongamamkolokotho9904