Google's New OPEN SOURCE Model Is Really Good (Except for one thing)

Science & Technology

Google drops Gemma 2 27b and it's really good...but struggles in one area. Let's test it!
Try Vultr FREE with $300 in credit for your first 30 days when you use BERMAN300 or follow this link: getvultr.com/berman
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberma...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V

Comments: 180

  • @matthew_berman · 20 days ago

    You using it?

  • @erobusblack4856 · 20 days ago

    It's a cool model, but what I want is that Rat model they made

  • @mr.fearless7594 · 20 days ago

    @matthew_berman I usually watch your videos to keep up with the AI world, and I learned quite a few things about evaluation from you. KEEP IT UP...

  • @AaronALAI · 20 days ago

    I'm running it locally and have tweaked it in different ways. Sort of good, sort of not good; it's kinda random. Try the self-play version, the self-play fine-tunes are way better!

  • @ScottWinterringer · 20 days ago

    LOL, we can call Google models the clown show and someone will still hype it. We called them out for intentionally modifying the model when they released it on Hugging Face the first time. Stop talking them up and do the right thing: point out how big of liars they are.

  • @jasonshere · 20 days ago

    No. Gemini products have been on the lower end overall in my testing, but putting them into a Mixtral model might suffice.

  • @alessandrorossi1294 · 20 days ago

    "I'm quite impressed with Gemma, it failed all my tests but ya know it's ok I guess"

  • @matthew_berman · 20 days ago

    It crushed the snake game

  • @hqcart1 · 20 days ago

    hahahaha, you should try it, and let me know how it went!

  • @4.0.4 · 20 days ago

    @@matthew_berman that's because you've been reusing the snake game, and you shouldn't.

  • @alessandrorossi1294 · 20 days ago

    @@4.0.4 all LLM producers now make sure their LLM can produce a snake game!

  • @MyWatermelonz · 20 days ago

    @@4.0.4 Him testing with the snake game has nothing to do with the model's training and architecture.

  • @mr.fearless7594 · 20 days ago

    How about marking each task from 0 to 5 (0 being a fail, 1 to 5 based on performance) and aggregating at the end of the test? That way we can get more comprehensive conclusions. Just my 2 cents.
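
    A minimal sketch of the scoring idea suggested above, assuming per-task scores are recorded by hand; the task names and scores are made up for illustration:

    ```python
    # Score each task 0-5 (0 = fail, 1-5 = quality of a pass), then aggregate.
    from statistics import mean

    scores = {                 # hypothetical results for one model
        "snake_game": 5,
        "json_output": 0,
        "marble_question": 2,
        "killers_riddle": 4,
    }

    total = sum(scores.values())
    print(f"total: {total}/{len(scores) * 5}")        # total: 11/20
    print(f"average: {mean(scores.values()):.2f}/5")  # average: 2.75/5
    ```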

  • @publicsectordirect982 · 18 days ago

    ☝️

  • @temp911Luke · 20 days ago

    Hi Matthew, please change the game code request to Space Invaders instead of the Snake game. I'm pretty sure all of the models are trained to code this little game these days.

  • @kormannn1 · 20 days ago

    Would it be easy to code Pac-man?

  • @temp911Luke · 20 days ago

    @@kormannn1 I'm pretty sure you would be able to code it using Claude these days. Seen some crazy stuff, including a mini Space Invaders game : ) Obviously it was just a bunch of blocks attacking another block that was shooting at other falling blocks.

  • @makavelismith · 20 days ago

    @@kormannn1 That's a TM, though.

  • @paulmichaelfreedman8334 · 20 days ago

    @@kormannn1 Many of them fail horribly at Pac-Man, I've tried.

  • @user-hs5zd3hj9t · 20 days ago

    there you go!

  • @strabismus69 · 20 days ago

    Google's new OPEN SOURCE model is really good (except for one thing, which is: it's not good)

  • @Interloper12 · 20 days ago

    It's impressive how you keep a straight face while it explains how there isn't a strong enough force to push the marble out of the glass.

  • @Davorge · 20 days ago

    I'm a senior dev at Vultr, and when I shared your videos with our marketing department, they immediately pursued a partnership with you. It's awesome to see you thriving using our stack and infrastructure! Great video!

  • @southcoastinventors6583 · 20 days ago

    Great name for a business: picking at the leftover money from your customers, there's always a little more.

  • @Eveisfahahuglyyy · 20 days ago

    @@southcoastinventors6583 lol

  • @PP_Mclappins · 20 days ago

    @@southcoastinventors6583 lol, you realize how much compute power costs, right? Why do you think everything should be free?

  • @Highdealist · 19 days ago

    @@PP_Mclappins Because advertisements can pay for everything. Want free TV? Watch ADS. Want free Music? Listen to ADS. Want free food? EAT SOME ADS. Want AI Tools? HALLUCINATE SOME ADS AND STFU.

  • @PP_Mclappins · 19 days ago

    @@Highdealist lol idk what you're trying to say here. Ad revenue comes from people actually proceeding to buy products as a result of the ad placement. Are you suggesting that rather than pay for software that someone has built by giving them money directly, you'd prefer they fill their platform with ads, so that in a roundabout way they might get paid by an ad agency while you spend your money buying pointless products that are being shoved down your throat? Isn't it better to just pay someone directly for the product that they offer 🫴 rather than force them to take the ad-based revenue approach?

  • @User-actSpacing · 19 days ago

    You are the most easily impressed researcher. Gemma failed spectacularly.

  • @AaronALAI · 20 days ago

    The context length is only 8k for this model 😒

  • @blisphul8084 · 20 days ago

    Funny given that even the 0.5b Qwen2 model has 32k context. 7b and 72b have a 128k context. The Chinese seem to be putting pressure on Western AI companies to release better open models.

  • @Adventure1844 · 20 days ago

    Your prompt tests have been the same for months and the LLM providers know it! YOU should come up with new ones every time!

  • @jaysonp9426 · 20 days ago

    News flash: Gemma is still the worst model

  • @ppbroAI · 20 days ago

    8k context is not mentioned as "bad"? mmmm....

  • @ScottWinterringer · 20 days ago

    Context doesn't matter if it's 8k of trash.

  • @Artificial-Cognition · 19 days ago

    8k is fine

  • @luizpaes7307 · 19 days ago

    @@Artificial-Cognition Ever tried running a conversation app with AI? I run some apps with GPT-4o and it gets to 25k very, very fast.

  • @OtterFlys · 20 days ago

    I’m starting to realize that what we are calling AI is actually the inadvertent discovery of a new kind of content addressable memory. This is on a whole new level, but I remember an attempted hardware implementation of CAM back in the late 70s. It’s bound to be very useful, but I don’t think anything is going to come out of 'AI' that we don’t put in. Maybe AI can point out the obvious of what we already know but don’t realize.

  • @mbrochh82 · 19 days ago

    You nailed it.

  • @Highdealist · 19 days ago

    Did CAM back in the 70s have emergent properties and advanced capabilities seemingly arising out of nowhere, like being really good at bullshitting? Like seriously, I think it's the best bullshitter I've ever encountered. It never stops amazing me, truly jaw dropping stuff.

  • @Cine95 · 20 days ago

    Yeah man, Google is seriously so good. Look at them on the LMSYS leaderboard, and AI Studio literally offers the best free chatbot out there.

  • @jaysonp9426 · 20 days ago

    Lol, I'm convinced the only data Gemma is trained on is benchmark data

  • @Cine95 · 20 days ago

    @@jaysonp9426 I don't know much about Gemma, but I mean, why try Gemma if you can use 1.5 Pro?

  • @blisphul8084 · 20 days ago

    @@jaysonp9426 I disagree. Their 9b model does better than any other model when it comes to producing accurate Japanese hiragana when given kanji while maintaining the format it's instructed to use. Qwen2 7b comes close, but requires finetuning to get the instructed formatting correct. That being said, 7b fits on an 8GB GPU with q4_k_m, but 9b requires smaller quants that perform awfully at this task, so the way I see it: Qwen2 for local, Gemma 2 for cloud/Groq.

  • @vangildermichael1767 · 20 days ago

    I think it got the answer correct on that (digging a hole) problem. (50 people would still take the same amount of time as 1 person.) I live in WV and have dug many dozen post holes. And more people do not make the job any faster, even by 1%. The hole is not big enough to get 2 people in there at the same time. Now, if we were digging a trench, that would be a different question. I think the answer they were going for is a trench. Or "digging several holes" would also work. The fault is not with the AI, but rather with the person making these questions.

  • @JustinArut · 20 days ago

    "This model is really good! Except when it isn't."

  • @brigfiche · 20 days ago

    Maybe specify "drinking glass" or "empty drinking glass" for the marble question. Glass alone might be too vague. I'm interested, in your example, what kind of glass it thought you meant. That would be cool to ask it.

  • @apache937 · 20 days ago

    They are supposed to be able to know it anyway. You can't spoonfeed them.

  • @brigfiche · 20 days ago

    @@apache937 that would be ideal.

  • @MattJonesYT · 20 days ago

    Are we really supposed to believe that they have fundamentally altered their basic corporate identity and have stopped putting out models saturated with ridiculous biases?

  • @ScottWinterringer · 20 days ago

    It's worse than that. Every model they have released has been intentionally altered to reduce its functionality.

  • @Highdealist · 19 days ago

    Yes, exactly you got it. This is exactly what they are proposing that we should do. F'in hilarious, right?

  • @tonyppe · 19 days ago

    You missed that these models are now specifically trained on these common tasks to perform them well 💪 😂

  • @HanzDavid96 · 20 days ago

    The quantized GGUF versions of gemma-2-27b-it seem to be extremely stupid in LM Studio, while the 9b version is doing great. Something seems to be wrong, no matter where the model is downloaded from.

  • @martinmakuch2556 · 20 days ago

    Yes, there were some observations that it might not quantize that well. But I run Q8_L (8-bit with input encoding/output decoding in FP16) in KoboldCpp and that is doing fine. Note also that Gemma 2 27b does not have a system prompt! So you have to be careful how you prompt it. That said, some people recommended using a system ...system prompt... even though it is not trained for it, and it actually seems to work fine for me. All that said, at least for chat/roleplay (and so long conversations & creative writing) CommandR 35B is still better and it also quantizes well (but is also not exactly easy to prompt correctly). Gemma 27B is a mixed bag: sometimes it surprises with amazing performance, sometimes it is lacking. Kind of what the tests in the video showed too. Still nice to see contenders in this category and it is definitely not a bad model.
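
    A minimal sketch of the workaround described above, assuming the standard Gemma 2 chat format: since the template defines only user and model turns (no system role), system-style instructions are simply prepended to the first user turn. Check your runtime's chat template before relying on the exact turn markers.

    ```python
    # Build a Gemma 2 prompt without a system role by folding system-style
    # instructions into the first user turn.
    def gemma_prompt(user_message: str, system_style_instructions: str = "") -> str:
        first_turn = user_message
        if system_style_instructions:
            first_turn = f"{system_style_instructions}\n\n{user_message}"
        return (
            "<start_of_turn>user\n"
            f"{first_turn}<end_of_turn>\n"
            "<start_of_turn>model\n"
        )

    print(gemma_prompt("Write Snake in Python.", "You are a concise coding assistant."))
    ```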

  • @abboudkarim · 20 days ago

    You should update your quizzes; LLMs are training on them, so there's no use.

  • @jeffg4686 · 19 days ago

    Matt, I read that Google figured out a way to speed up training 13x. That's huge. You should do a video on that. It's called JEST.

  • @matthew_berman · 19 days ago

    Thanks, will check it out.

  • @jeffg4686 · 19 days ago

    @@matthew_berman - It's basically just a filter NN that selects the highest-quality data and the examples with the best references to other high-quality data. Not sure if others are working this trick or not, so not sure if it's 13x for everyone, or just them.
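
    For illustration only, a toy sketch of the "filter by a scoring model" idea the comment describes; this is not the actual JEST algorithm, and the scoring function below is a made-up stand-in for a reference model's quality score:

    ```python
    # Keep only the highest-scoring fraction of candidate training examples.
    def select_top_fraction(examples, score_fn, keep_fraction=0.25):
        ranked = sorted(examples, key=score_fn, reverse=True)
        return ranked[: max(1, int(len(ranked) * keep_fraction))]

    # Crude stand-in score: lexical diversity of the example.
    def score_fn(example: str) -> float:
        words = example.split()
        return len(set(words)) / max(1, len(words))

    pool = ["the the the the", "a short but varied sentence", "quality data wins", "spam spam spam"]
    print(select_top_fraction(pool, score_fn, keep_fraction=0.5))
    ```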

  • @robbiemartin5803 · 20 days ago

    Hey man, I truly mean this as a fan: your tests are stale. Every video is a bigger ad for your latest partnership. No added value from watching your video over anyone else's. I respect that you are running a business and are doing well. Just felt compelled to give my two cents.

  • @ritpop · 20 days ago

    The problems with reasoning are because of the way it was trained, but I think the method used will be great for SLMs like Phi-3.

  • @4.0.4 · 20 days ago

    I really like the 27b size, I just hope it can be fine-tuned into something decent. For the smaller one, any fine tuner could just start from something that doesn't suck out of the box, but the 27b size is relatively unique.

  • @jasonshere · 20 days ago

    Mr. Berman, please feel free to criticize new models more often. It's hard to believe someone if they rarely mention the flaws of any new product. There are a lot of great breakthroughs being made; but Gemini isn't really one of them.

  • @mrdevolver7999 · 20 days ago

    Thank God for Meta and Google. I would certainly love to say that about OpenAI, but I can't. Ironically, the only thing "open" about them is the word in the name of the company.

  • @richardallison1576 · 20 days ago

    How do you link a model from Vultr to Open WebUi?

  • @dot-if · 19 days ago

    Thank you for adding the JSON test ❤

  • @craftspro · 18 days ago

    How do you connect another LLM endpoint in Open WebUI? Can anyone please help me if possible?

  • @AardvarkDream · 20 days ago

    Google is definitely rehabilitating their AI image with Gemma-2. I've been using both the 27b and the 9b models since they were available, and they are my new favorites. Oddly, although the 27b version has better benchmark scores, I subjectively feel that the responses I've been getting from the 9b model are actually better. Faster AND better, although I can't put my finger on exactly what the difference is. But I am now starting to favor the 9b significantly more than the other.

  • @briancase6180 · 20 days ago

    The 27B model will run q8 on consumer hardware and the accuracy is virtually the same as unquantized.
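
    A rough back-of-envelope check of the memory claims in this thread, assuming the weights dominate and using approximate bits-per-weight figures; the numbers are illustrative, not exact llama.cpp measurements:

    ```python
    # Estimate VRAM as parameters * bits-per-weight, plus a rough allowance
    # for KV cache and activations.
    def est_vram_gb(params_billion: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
        weights_gb = params_billion * 1e9 * bits_per_weight / 8 / 1024**3
        return weights_gb + overhead_gb

    print(f"7B  @ ~4.8 bpw (q4_k_m): {est_vram_gb(7, 4.8):.1f} GB")   # ~5.4 GB, fits in 8 GB
    print(f"9B  @ ~4.8 bpw (q4_k_m): {est_vram_gb(9, 4.8):.1f} GB")   # ~6.5 GB, tight on 8 GB
    print(f"27B @ 8 bpw (q8):        {est_vram_gb(27, 8.0):.1f} GB")  # ~26.6 GB, needs a large GPU
    ```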

  • @attilavass6935 · 19 days ago

    I'd love a comparison video about Vultr, RunPod etc. with pros and cons

  • @Jshicwhartz · 18 days ago

    I'm starting to think you should raise your benchmark. Let's consider "Make a Tetris game" as an example. These models are getting smarter, so think of your current benchmark as Level 1 in Game Development and Tetris as Level 2, to suit OpenAI's new level system. At OpenAI, we scrape YouTube videos and use the data to train our models. We test the models using a 13k zero-shot question sheet and evaluate the results to see where they perform best. I believe that, at this point, we have covered most of your questions due to our daily YouTube data scraping.

  • @vannoo67 · 20 days ago

    I think I would have failed your JSON question. It's pretty pathological.

  • @ScottWinterringer · 20 days ago

    Here we go again: Google overpromised and then delivered a clearly crippled model.

  • @justindion4394 · 20 days ago

    I think your question about the killers is too vague, but that might be the point. If someone walks into a room and kills one of them, who is "them"? The model thought "them" was the killers. Was "someone" also a killer? It is not known.

  • @apache937 · 20 days ago

    others get it right....

  • @robertheinrich2994 · 19 days ago

    I recently read an article about LLMs programming in COBOL. The problem is that there have been many iterations of COBOL over the decades, and it doesn't have a big open-source scene from which LLM training data could be sourced, but there is a lot of literature about it (programming books, online discussions, etc.). So, wouldn't it be fun to ask the LLM to write Snake in COBOL?

  • @HAL9000-B · 20 days ago

    I can say these models are good for specific tasks. For example, Gemini 1.5 Pro is great for marketing texts (with the right prompt). Llama 3 is so far the best for marketing copy. Downside: solid results only in English, but still very catchy and on point... let's see how Gemma will do!

  • @troysuggs8302 · 20 days ago

    I believe it understands better if you ask for five separate sentences that end in the word Apple

  • @Player-oz2nk · 20 days ago

    I wonder if Librechat supports the Vultr custom endpoint. Thanks for the info Matt

  • @monnef · 18 days ago

    "Google's New OPEN SOURCE Model" - the license of weights is NOT open-source.

  • @Vpg001 · 20 days ago

    In the back of my mind, I always thought they would do this. Great move, Google.

  • @stanpikaliri1621 · 19 days ago

    27B is not even enough for reasoning, but the model's skills are impressive.

  • @tungstentaco495 · 19 days ago

    I'd like to see a review of the 9b version.

  • @punk3900 · 20 days ago

    27b looks like a great compromise in size

  • @michaelrichey8516 · 19 days ago

    Not sure your word problem should count, as the text you submitted contained the answer

  • @cognitive-carpenter · 20 days ago

    Terrence Howard blows up and Google uses a Flower of Life-inspired logo 🧐

  • @anubisai · 20 days ago

    Sacred geometry is often used in logo design.

  • @Player-oz2nk · 20 days ago

    Glad you're awake.

  • @cognitive-carpenter · 19 days ago

    @@anubisai not in Google logo design

  • @rikachiu · 20 days ago

    ooo downloading this now. Hopefully my 4090 is enough to run a 27b model 😭

  • @Yipper64 · 20 days ago

    Google finally caught up huh?

  • @michaelpate-zn9qk · 20 days ago

    Google's new OPEN SOURCE model is really good (except when it isn't)

  • @ArtificialDevLabs · 20 days ago

    The problems with logic and reasoning in certain language models are mostly related to censorship and to training data that is not based on real facts. It can also happen while making the models what they call "safer for the public," which in some cases is just another wording for censorship. Once you achieve models that can reason and judge on their own what they should tell certain users and what not, rather than it being hard-coded, a lot of these issues will be solved. For example, if a model gets to know the user and becomes aware that someone is easily scared and it bothers them, it can decide on its own not to talk about scary stuff with that user, etc. If there are a lot of contradictions in the training data or contradictory programming, it can make models less able to reason and apply logic. Humans learn through making mistakes, yet we demand that AI be perfect according to our standards and our truths. It is one thing to have knowledge, but you also need wisdom and experience in how to put it to use.

  • @Yonni6502 · 13 days ago

    Here's your next LLM test question: "So a friend is offering to sell me either a box with 9.11 ounces of gold for certain price or a box with 9.9 ounces of gold for the same price. Which box should I choose?"

  • @D0J0Master · 20 days ago

    How does it compare to Claude Sonnet? And how long do you think it will take before we get developers stripping the censored crap out of the model?

  • @ThoughtFission · 19 days ago

    What's the point of testing the 27b model that just about no one will ever use?

  • @jopansmark · 18 days ago

    Google is the best thing that ever happened to humanity

  • @Batmancontingencyplans · 20 days ago

    Hey Matt, can you do a video about Google voice-to-text vs ChatGPT voice-to-text? I've noticed that the ChatGPT app's voice-to-text is leaps and bounds ahead of anything Google has to offer. I don't even recheck my transcripts anymore, that's how sure I am of ChatGPT voice-to-text!! 🔥🔥🔥 Please do it for the community so Google focuses on speech-to-text more!! 🙏🏻🙏🏻🙏🏻

  • @MarcoNedermeijer · 20 days ago

    Maybe you could add a rating to a correct answer, since some good answers are better than other good answers. Btw, I love your model testing videos.

  • @igorsolomatov4743 · 20 days ago

    Please change your JSON code question. It is ambiguous as to what to put in the output. My first guess was that you wanted a result there as well.

  • @apache937 · 20 days ago

    If Claude 3.5 can do it...

  • @user-hs5zd3hj9t · 20 days ago

    Thank you so much bro, you're truly on top of the best AI resources. And I'd like to suggest (for the ~3rd time) to **replace** some of the tests, which I'd personally rig my production AIs for if I were one of the mother tech companies (being a big fan of your show 😍). Instead, use a good "royal" beefy prompt to develop a full practical app, in any codebase; the aim is to showcase a solution for a real-life problem. Say, backend in Laravel and frontend in Flutter. A salute to you from Bahrain 🇧🇭, the heart of the Arabian Peninsula.

  • @Tarantella.Serpentine · 20 days ago

    Meh... They're holding back... Give us the goods, Google.

  • @joegrayii · 20 days ago

    I don’t understand what the use case is for this model. Flash is completely underwhelming. Gemini 1.5 Pro is the only thing worth a damn

  • @firiasu · 20 days ago

    I hope that AI doesn't come up with similar questions for people to choose who is worthy of attention and, possibly, life...

  • @tsclly2377 · 20 days ago

    SCREENED NOT TO CODE... like other things it may be screened for. Can you retrain / uptrain this model for one's specific needs?

  • @tonyppe · 19 days ago

    I watch your channel all the time but it's so frustrating that you commonly state that the model has failed a task when it has been successful lol

  • @JustinsOffGridAdventures · 20 days ago

    Didn't you just get the most powerful computer to run LLMs locally, yet you're using a web service? I'd like to see this run locally, please.

  • @densonsmith2 · 19 days ago

    There is no price cheap enough to buy wrong answers. Small models seem worthless to me.

  • @aa-xn5hc · 20 days ago

    That was 2 weeks ago....

  • @yashaouchan · 18 days ago

    I'm VERY concerned that the largest companies (and Google isn't shown to be very ethical) are able to make these very powerful tools without any controls. I don't get why there isn't legislation to protect the People of the United States the way it protects the richest oligarchs. Remember that when you use their tools. And remember... YOU are the product.

  • @shaunralston · 20 days ago

    Your video is impressive, Matt. Gemma 2 27b, not so much. I'm sticking with Llama.

  • @TheGrafox · 18 days ago

    The links, please.

  • @randotkatsenko5157 · 19 days ago

    These tests are redundant. The tests should be something novel and complex that current-generation models struggle with. Also, the entire idea of "one shot" tests is kind of bad: all LLMs generate different responses each time, so it's kind of hit or miss. A more scientific approach to test true intelligence and capabilities would be to run automated tests for each model and each prompt 10+ times, then evaluate the responses and take the average for each model for a true comparison. I understand it's easy to pump out new videos with the same tests, but it's time to change the testing strategy so it has actual value in the context of comparing different models.
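
    A minimal sketch of the repeated-testing idea from the comment above: run each prompt several times per model and report a pass rate instead of a single-shot verdict. `ask_model` is a hypothetical stand-in for whatever inference API is being used:

    ```python
    import random

    def ask_model(model: str, prompt: str) -> str:
        # Placeholder: sampling makes real model answers vary between runs.
        return random.choice(["correct answer", "wrong answer"])

    def pass_rate(model: str, prompt: str, grader, runs: int = 10) -> float:
        # Fraction of runs the grader accepts.
        return sum(grader(ask_model(model, prompt)) for _ in range(runs)) / runs

    grader = lambda answer: "correct" in answer
    for model in ["gemma-2-27b-it", "llama-3-70b-instruct"]:
        print(model, pass_rate(model, "Give me 10 sentences that end in the word apple.", grader))
    ```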

  • @pensiveintrovert4318 · 20 days ago

    Gemma 2 has been around for a while. Why are you claiming it has just been released?

  • @Artificial-Cognition · 19 days ago

    Gemma 2 in full was recently published publicly; before that, only the Gemma 2B and 7B models were published.

  • @NostraDavid2 · 20 days ago

    JSON? Don't you mean JSML?

  • @I-Dophler · 20 days ago

    🎯 Key points for quick navigation:
    📜 Google released Gemma 2, an open-source LLM available for researchers and developers.
    🚀 Gemma 2 boasts best-in-class performance and integrates easily with other AI tools.
    💪 Available in 9 billion and 27 billion parameter sizes, with the 27 billion version being tested.
    🌐 Sponsored by Vultr, using their cloud infrastructure for running unquantized models at high speeds.
    🖥️ Gemma 2 outperforms comparable models and runs efficiently on various high-performance GPUs.
    🧩 Testing included coding tasks, logic, reasoning, and math problems, showing impressive results but struggling with complex logic.
    🎮 Successfully generated a Python script for the Snake game and added new features.
    ⚙️ Some tasks failed, especially those involving complex logic or specific output formats like JSON.
    Made with HARPA AI

  • @user-im8bv8po2w · 20 days ago

    How does it compare to Llama?

  • @MegaLeoben · 20 days ago

    F

  • @hqcart1 · 20 days ago

    Google will give you garbage so that you waste days installing it and screw up your computer, just to find out it sucks, and then give up and try their paid models.

  • @foreignconta · 20 days ago

    Best multilingual performance in comparison to Llama 3, and not very chatty.

  • @timeflex · 20 days ago

    Google Gemini is way worse than GPT-4 or Mistral.

  • @southcoastinventors6583 · 20 days ago

    Exactly. If you can't run it locally, why bother? Might as well use Claude 3.5 or GPT-4o.

  • @anipacify1163 · 18 days ago

    Summary: the model is good at coding. It's censoooored! And it also struggles with more complex logic and reasoning.

  • @HaxxBlaster · 20 days ago

    It was bad for my use case; not impressed.

  • @lineandaction · 20 days ago

    Open source AI is best.

  • @HardKore5250 · 20 days ago

    Multishot ai

  • @jasonshere · 20 days ago

    All Gemini products have been pretty bad overall.

  • @jasonshere · 20 days ago

    If there's an area where it might do okay, it's in programming.

  • @punk3900 · 20 days ago

    50 on HumanEval is lousy for coding :(

  • 20 days ago

    AI always trolls me. 🙂

  • @MeinDeutschkurs · 19 days ago

    Or on a Mac Studio M2 Ultra with 192GB. And no, the model is really bad. Far behind Qwen2 or Llama 3.

  • @user-yi2mo9km2s · 20 days ago

    Censored

  • @mr_b_hhc · 19 days ago

    Meh, just a giant advert. Vulture is about right...the carcass here though is AI.

  • @odrammurks1497 · 20 days ago

    yaes

  • @分享免费AI应用 · 20 days ago

    Google's open-source model is the bomb, but I'm still waiting for the secret mind-control feature. #JustKidding #TotallyTrustingGoogle

  • @SimSim314 · 17 days ago

    It's a bad product; it doesn't offer hourly rental! Next time recommend a product more people can benefit from.

  • @KimmieJohnny · 20 days ago

    More promotion. Failed your tests, but now it's a very good model? Disappointed in ya.

  • @averybrooks2099 · 20 days ago

    I wonder how it does with the woke stuff, I'm betting I know the answer to this. lol

  • @seppimweb5925 · 19 days ago

    Can you think about changing these stupid tests? They are total nonsense.

  • @jeroenvandongenj-works508 · 20 days ago

    27th

  • @ChuckNorris-lf6vo · 20 days ago

    Third.

  • @velz542 · 20 days ago

    Am I first?

  • @DihelsonMendonca · 20 days ago

    It's awful, hahaha 😂😂😂😂

  • @alessandrorossi1294 · 20 days ago

    FIRST!

  • @cajampa · 20 days ago

    LOL, don't clickbait us with garbage, dude. I will just stop watching your stuff if you keep this up.

  • @brandon1902 · 20 days ago

    Gemma 2 is far too hit and miss, and it's profoundly ignorant about huge pockets of knowledge. I think this stems from the fact that they used carefully selected synthetic data that overlaps LLM tests vs a broad corpus of knowledge like Llama 3. Consequently, Gemma 2 is vastly inferior to Llama 3 overall despite having higher test scores. I really wish LLM makers would stop cheating. The path forward is more data, not less in order to boost test scores.

  • @foreignconta · 20 days ago

    Gemma 2 is multilingual. LLaMA 3 is not. So it is not "hit or miss". In my workflow, it (the 9B) works better than Llama 3.

  • @brandon1902 · 20 days ago

    @@foreignconta Multilingual is extremely important, but it's clear after testing out Gemma 2 that Google primarily selected carefully curated piles of quality data (largely synthetic from Gemini) for each desired feature, such as math, code, various languages, science, technology... However, since they didn't train on all of Wikipedia and the internet, there's an overwhelming number of hallucinations when it comes to very popular information like music, games, and movies. I'm looking for a well-balanced LLM that can respond appropriately to all popular fields of interest, and that's not what Gemma 2 is. If they aren't going to include the full breadth of humanity in their LLMs, then they need to make them better at saying "I don't know" instead of outputting a flood of hallucinations.

  • @marcusk7855 · 19 days ago

    "Google's New OPEN SOURCE Model Is Really G̶o̶o̶d̶ Woke" There I fixed it for you.

  • @Artificial-Cognition · 19 days ago

    You'd be surprised...
