[ML News] Llama 3 changes the game

Science & Technology

Meta's Llama 3 is out. New model, new license, new opportunities.
References:
llama.meta.com/llama3/
ai.meta.com/blog/meta-llama-3/
github.com/meta-llama/llama3/...
llama.meta.com/trust-and-safety/
ai.meta.com/research/publicat...
github.com/meta-llama/llama-r...
llama.meta.com/llama3/license/
about. news/2024/04/met...
minchoi/status/17...
_akhaliq/status/1...
_philschmid/statu...
lmsysorg/status/1...
SebastienBubeck/s...
_Mira___Mira_/sta...
_philschmid/statu...
cHHillee/status/1...
www.meta.ai/?icebreaker=imagine
OpenAI/status/177...
OpenAIDevs/status...
OpenAIDevs/status...
CodeByPoonam/stat...
hey_madni/status/...
cloud.google.com/blog/product...
altryne/status/17...
xenovacom/status/...
minchoi/status/17...
www.udio.com/
www.udio.com/pricing
Links:
Homepage: ykilcher.com
Merch: ykilcher.com/merch
KZread: / yannickilcher
Twitter: / ykilcher
Discord: ykilcher.com/discord
LinkedIn: / ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: www.subscribestar.com/yannick...
Patreon: / yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Comments: 150

  • @tantzer6113 · 24 days ago

    “If you don’t know what I’m talking about - and I don’t know why you wouldn’t…” I don’t know it because you’re my main source for important developments in machine learning.

  • @tantzer6113 · 24 days ago

    PS I don’t mind getting news with delay. I like it that you get into algorithms, capabilities, and technical overviews.

  • @mriz · 24 days ago

    If you're on Twitter and follow a few folks in the LLM community, it's almost impossible to escape the hype and news on your timeline.

  • @YuraCCC · 24 days ago

    I had my doubts about Zuck, but check him out now: championing open source AI like a boss! Maybe he should just grab the name 'Open AI', that is, if nobody's snagged it yet.

  • @demetriusmichael · 24 days ago

    The training data will be a legal nightmare on these proprietary things. Making it open source is the only way in this case.

  • @antonystringfellow5152 · 24 days ago

    I know, this latest version of Zuck is amazing! I watched an interview of him talking about Llama 3 and he was so human-like

  • @peterfireflylund · 24 days ago

    Opener AI?

  • @monad_tcp · 24 days ago

    @@antonystringfellow5152 yeah, his avatar got a massive upgrade, he's almost human now

  • @Wobbothe3rd · 24 days ago

    He was like this in VR too. Throughout VR development Meta/Facebook published many things open source, including computer vision models.

  • @mikethedriver5673 · 24 days ago

    Llama 3: they had no moat

  • @float32 · 24 days ago

    This wouldn’t be such big news if they didn’t have a moat that is just now being bridged.

  • @GuagoFruit · 24 days ago

    The next revolution imo definitely needs to be getting things to run locally with any sort of fidelity.

  • @Aphixx · 24 days ago

    If Stable Diffusion is a good historical example, then we should see some pretty significant perf improvements as soon as people (nerds) decide to stubbornly mess with it until it works.

  • @olcaybuyan · 24 days ago

    I am especially happy that Llama 3 supports multiple languages :-) Most open access or open source models are English only and are no real alternative to OpenAI's GPT.

  • @YuraCCC · 23 days ago

    Those t-shirt stripes are an example of reverse CAPTCHA - it spins humans right into dizziness and blackout, but AIs? They just keep watching and learning.

  • @vladimirtchuiev2218 · 24 days ago

    Mixture-of-Depths is a promising direction in modularizing LLMs; you could basically use only part of the model for specific applications.

  • @mikayahlevi · 24 days ago

    The anti-open source AI safety person impression at 13:48 is too accurate🤣

  • @thirdeye4654 · 24 days ago

    Udio is great in my opinion. You don't let the AI create whole songs, but segments (of around 33s). It usually creates 2 variants at the same time. You can then extend those segments (before or after), either with a mid-segment, intro, or outro. You can even insert your own lyrics and it works like a charm. If you are happy with the song, you can then "publish" it and even pick a text-to-image cover art. I love that stuff.

  • @Embassy_of_Jupiter · 21 days ago

    I think a valid reason not to build models on 95% English data is that it could significantly influence the world view and "zeitgeist" of the model in all languages. It makes sense to have fully local models so as not to homogenize the world even further with US thought.

  • @woolfel · 24 days ago

    There are numerous papers about data quality and data selection going back to 2000. Good to see people realize quantity is not the end-all of training LLMs. Creating a good dataset has always been an art. Will the filters and pipeline for processing the data get open sourced?

  • @propeacemindfortress · 24 days ago

    As always the best curated ML news. Love your expertise and humor :D oh, and... more fish for Yann LeCat!

  • @pietrorse · 23 days ago

    If the training text were plain ASCII, with an average token length of 4 characters, the training dataset would have been ~55 terabytes of plain ASCII. Wow!
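The comment's arithmetic checks out in a couple of lines, under its own assumptions (Meta's reported 15T training tokens, ~4 ASCII bytes per token); the ~55 figure falls out when you count in binary terabytes (TiB) rather than decimal:

```python
# Back-of-the-envelope size of a 15-trillion-token corpus, assuming
# plain ASCII at ~4 characters (bytes) per token on average.
tokens = 15e12          # 15 trillion tokens (the reported figure)
bytes_per_token = 4     # assumed average token length in ASCII bytes
total_bytes = tokens * bytes_per_token

print(f"{total_bytes / 1e12:.1f} TB (decimal)")   # 60.0 TB
print(f"{total_bytes / 2**40:.1f} TiB (binary)")  # 54.6 TiB, i.e. the ~55 figure
```

The decimal-vs-binary distinction is the whole gap between the naive 60 TB and the quoted ~55.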

  • @Hacking-Kitten · 24 days ago

    Thank you very much for your videos! Could you point me to some of the techniques that you find most promising for context length extension?

  • @Voljinable · 19 days ago

    I really like the way you frame Meta/Zuck making Llama 3 open source. They choose the option that is best for the company, but what's best changes. For research and optimization an open source model is better; for profit a closed source one is better. What they do just depends on what is best at the moment, but I like that it's open source for Llama 3 right now and hope it will stay that way!

  • @Nico79489 · 24 days ago

    Cool to have great open LLMs. Unfortunately, this is not the case for image generation models: none of the recent advanced models, like SDXL or Photoshop's, are free for commercial use.

  • @pablowentscobar · 24 days ago

    Really enjoy these ML News vids. Great for keeping AI normies like me up to speed.

  • @yaxiongzhao6640 · 24 days ago

    Right at the moment, Phi-3 changed the game again!...

  • @quickpert1382 · 24 days ago

    Huh, yeah, the 7B standard is no longer the standard. It's a pretty good model, really, that can also be run on GPUs with 4 GB of VRAM.

  • @marsandbars · 24 days ago

    20:40 An alternative to this is using a documented SDXL Turbo workflow with ComfyUI locally, which can produce images of decent fidelity at even faster speeds than this demo, at least on my 3090.

  • @sebastianp4023 · 24 days ago

    we are getting to model sizes where they might as well just be compressed lookup tables

  • @andreicozma6026 · 22 days ago

    That's essentially what they are regardless. That's how attention works: tokens are used as queries against keys for computing similarity scores, and then the values are summed up based on those scores. It's essentially keying into a learned "dictionary" index.
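The keyed-lookup intuition above can be sketched in a few lines of NumPy: single-head scaled dot-product attention, a soft dictionary lookup where each query scores against all keys and the values are mixed by those scores (the shapes and names here are illustrative, not any particular model's):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: a soft lookup of V keyed by K, queried by Q."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
seq_len, d = 5, 8
Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))
out = attention(Q, K, V)
print(out.shape)  # (5, 8): one value-mixture per query token
```

Real models add learned Q/K/V projection matrices, multiple heads, and causal masking, but the "soft lookup" core is exactly this weighted sum.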

  • @yurykorolev · 24 days ago

    thank you

  • @Iaotle · 24 days ago

    This channel is one of the rare ones that I genuinely watch, amidst hours and hours of clickbait recycled AI hype videos :)

  • @OperationDarkside · 24 days ago

    Interesting times in many ways.

  • @brandonheaton6197 · 24 days ago

    You should recapitulate the math from the code synthesis project from MIT using Llama 3, because that would be lit.

  • @Embassy_of_Jupiter · 21 days ago

    I find the license really fair. Their models will be obsolete by next year anyway. I think it is only appropriate that they should profit off of it until then, for having developed such a great step forward in local LLMs.

  • @pawelkubik · 20 days ago

    I'm not sure why people have reservations about Phi specifically. We don't know what data were used to train the other models either, or to what extent their performance relies on "fitting to the test dataset". Did OpenAI ever admit what role the human-curated part of their training dataset plays in the model's performance?

  • @FilmFactry · 24 days ago

    If we had access to the GPT-4 weights and biases, how different would they be from the Llama 3 weights? I use all the LLMs and find them pretty much even. Claude is fast but limited. I find Gemini Pro a little dumber.

  • @JorgetePanete · 24 days ago

    I don't even know what benchmarks to believe

  • @auresdz701 · 24 days ago

    The scores are somehow high, and it makes me wonder whether they specially aligned the curated data and the validation data when doing instruction finetuning!!

  • @snarkyboojum · 24 days ago

    🎉

  • @sebastianp4023 · 24 days ago

    "we have used 15T tokens from publicly available sources... pls don't look too closely..." 😂

  • @sebastianp4023 · 24 days ago

    that's a big "trust me bro"

  • @skierpage · 24 days ago

    The open-source dataset "The Pile" contained the 108 GB Books3 shadow library of approximately 196,640 pirated books, most of which are still under copyright. It is a "Publicly available source," so the lying executives can shrug and screw over authors. They have $billions to spend on Nvidia chips but won't even buy e-books of the creative works they train on. (It's hard to tell what the current status is of The Pile and Bibliotek... intentionally so. The first Llama model trained on Books3. Rumor is all the AI companies saved a copy of Books3 before scripts pointing to the dataset were deleted, and Nvidia is being sued over training its NeMo LLM family.)

  • @huveja9799 · 24 days ago

    ... and we are witnessing a t-shirevolution... I'm still dizzy from seeing those stripes...

  • @monad_tcp · 24 days ago

    It's to confuse visual learning algorithms.

  • @huveja9799 · 24 days ago

    @@monad_tcp Well, I hadn't thought of that, but now that you mention it, it may well be ..

  • @cherubin7th · 24 days ago

    Very nice

  • @theosalmon · 24 days ago

    Though we appreciate this greatly, go away and get back to your vacation until you're rested!

  • @dr.mikeybee · 24 days ago

    Do I really need a better LLM than Llama3 70B? If I have a good agent with search, RAG, and memory, isn't that good enough?

  • @alan2here · 24 days ago

    Llama 3 BS generator v5. Don't worry, I'll include the "Llama 3" at the start.

  • @DanielWolf555 · 24 days ago

    Does Llama 3 have any vision capabilities like GPT-4?

  • @214F7Iic0ybZraC · 16 days ago

    6:23 "There is enough research to show that once you are capable at one language, you only need quite little data on another language to transfer that knowledge back and forth." Can anyone point me to papers related to this argument? I am interested in cross-lingual transfer in language models.

  • @thomasmuller7001 · 24 days ago

    helli hello!

  • @propeacemindfortress · 24 days ago

    I tried to use existing LLMs to prepare a fine-tuning dataset specific to Theravada thought, philosophy, and practice. It turned out that all the models I tried were incapable of capturing any nuances in the meaning of words and concepts, but stuck diligently to "their own philosophical framework of interpretation" regardless of the system prompt, and regardless of feeding them scriptures, papers, or video transcripts. They couldn't even identify the proper questions. So please don't mind my disagreement: language alone, maybe even regardless of percentage distribution, doesn't cut it on any task that requires cultural, philosophical, or religious understanding, not even talking about the human component in it. Translation, of course, is a totally different thing; common phrases and such can be captured quite well, but the underlying unspoken human component not so much.

  • @klausschmidt982 · 23 days ago

    Yeah, that seems like a very hard task for such a model. Sometimes you have to properly manage your expectations with these things

  • @propeacemindfortress · 23 days ago

    @@klausschmidt982 I've given up on it. Current models have neither the capability nor the training data that would allow for finer nuances on rare topics. Future models might be capable, but with the move to synthetic data... 🤷‍♂ it's very doubtful that future architectures can do it after the synthetic data has been flattened into a unified interpretation. Then we will have American and Chinese Buddhist interpretations 😂 So I join in with that "Yeah": it was a nice idea, but specialized things might need a lot more human work and training investment than I can afford. Have a good one, thanks for the reply.

  • @pandalayreal · 24 days ago

    Old news, there is Phi-3 now.

  • @semaraugusto · 24 days ago

    The 8B param count strikes me as a bit weird. Why not 7B, to make a fair comparison between the models? Did they not achieve good results with 7B, or did they just not test it and decide in advance to compare against weaker models?

  • @BrandnyNikes · 24 days ago

    They have a larger vocabulary and, through that, more parameters in the embedding layers. The rest of the architecture (number of layers and heads) should still be the same.

  • @skierpage · 24 days ago

    Fat-fingered typing error? 🙂

  • @OmicronChannel · 24 days ago

    ScreenAI seems to be everything you need for an agent capable of performing UX interaction. That's exciting and disappointing at the same time (I agree that the accessibility via Google Vertex AI is very limited). Why can't Google provide a SIMPLE pay-as-you-go API-call solution like OpenAI and Anthropic??

  • @mimszanadunstedt441 · 22 days ago

    It's very easy to get models to hallucinate when asking for music recommendations. Llama is no different.

  • @timothywcrane · 24 days ago

    Be sure to download the Purple and Purple-free versions by getting two emails, one for each model set requested. But be prepared for TB instead of GB worth of downloads.

  • @unvergebeneid · 24 days ago

    I would be careful not to just laugh off the safety folks as silly modern-day Luddites. Anyway, can't wait for llama3-uncensored:400B. But then again, I just want to do cool stuff and see the world burn, so don't mind me! 😊

  • @Dogo.R · 24 days ago

    Remember, math results not utilizing Wolfram are meaningless, since those results will be child's play compared to results using Wolfram as a tool.

  • @eadweard. · 24 days ago

    Cannot tell what you are trying to say.

  • @skierpage · 23 days ago

    @13:16 "and with the past with Llama 2 we've already seen that all these people who have announced how terrible the world is going to be if we open-source these models have been wrong -- have been plainly wrong. The improvement in the field, the good things that have happened undoubtedly, massively, outweigh any sort of bad things that happen, and I don't think there's a big question about that. It's just that the same people now say 'Well okay not this model, but the next model... is really dangerous to release openly.' So this is the next model, and my prediction today is it's going to be just fine, in fact it's going to be amazing releasing this." @Yannic, That's quite a set of claims. What are all "the good things that have happened" beyond technical advances like more efficient models? I'm sure millions of people are more productive and writing better (or at least spewing grammatically correct verbiage), but are there actual studies of the good things, both with AI in general and open-source models? Meanwhile it's unclear how long it will be before we discover the awful uses of AI in the 2024 election cycles in major countries and other disinformation campaigns. I'm willing to believe your take, but some evidence for your optimism would be nice.

  • @hypno5690 · 24 days ago

    I can't care about LLMs until we get personal assistants that are completely customizable and fully transparent with no censorship.

  • @lonelybookworm · 24 days ago

    But you can? It just requires a beefy PC

  • @Rhannmah · 24 days ago

    Well you should care, because large language models and their evolutions are about to take over your life.

  • @logangarcia · 23 days ago

    Please add timestamps to the video.

  • @timothywcrane · 24 days ago

    If it wasn't for open weights, crazies banging away on 1050tis and pis like myself would have never been "allowed".

  • @BinarySplit · 24 days ago

    As a language learner, it doesn't feel so great in other languages, and I question whether 5% non-English data is enough. Discuss German and it'll often make mistakes explaining the grammar. Try to talk in Chinese and it'll switch back to English at every opportunity. Hopefully these are just issues with the prompt or instruction tuning that will be fixed by other fine-tunes, but for now I'm going back to Mixtral and ChatGPT...

  • @zyxwvutsrqponmlkh · 24 days ago

    Llama can pretend to run code; I got it to simulate a DOS prompt and play text adventure games.

  • @kbizzy111 · 24 days ago

    More paper reviews please

  • @henrischomacker6097 · 24 days ago

    I really hoped that the small model would be better at German, but unfortunately it's not good enough that I would prefer to talk to it only in German, and I don't think any of my German-only-speaking friends would like to talk to it. The bigger model is probably much better at foreign languages, but unfortunately it is again too big for a 4090. It's a pity. So having our own app at home, available via VPN from a mobile phone, for our other non-English-speaking friends to use is still not really an option; normal people are ignorant and would laugh at me. Maybe not if I gave a female assistant an erotic French voice? ;-) But I must say that despite all that, I really like the instruct model; the chat model, though, gave me a lot (!) of BS. Maybe some parameter tweaking may change that; I haven't had the time to play around with it more right now. But we'll see...

  • @andytroo · 24 days ago

    I think there is no need to make it private. The moment a model requires more than ~24 GB of RAM to run, it is out of the hands of most businesses to use directly. You can release the weights, and people can privately run the poor models quickly or the medium models slowly, or you can laugh as their hardware runs out of RAM trying to run the full Facebook model...

  • @timothywcrane · 24 days ago

    If you are the first to produce the MMLU, is that an achievement or shameful? Love that OpenAI just added reverse "gas fees".

  • @JacobAsmuth-jw8uc · 24 days ago

    Immediately passed by Phi 3

  • @mauricioalfaro9406 · 24 days ago

    0:05 The usual little cynical chuckle

  • @diga4696 · 24 days ago

    Wow it's only been two days since llama 3 release!? I swear it felt like a month ago..

  • @christopherknight5526 · 24 days ago

    Yikes! Missed the phi-3 announcement..

  • @tomaszkarwik6357 · 24 days ago

    The second half of the video is about phi-3

  • @TiagoTiagoT · 24 days ago

    13:23 To be fair, you can only end the world once, and after that happens you (luckily) won't be around to witness the outcome. It's a black swan with a touch of the anthropic (no pun intended) principle: you can only be alive to witness the state of the world if the world you live in has not been ended yet. Once that happens, you likely will not be in a condition to acknowledge that it has happened. It is not something you can look back on and see after the fact; you can only experience it the first time it happens, and that is if you are able to have any experience at all while it is happening.

  • @Hexanitrobenzene · 24 days ago

    Yannic does not believe that AI can cause existential risk. With this generation of models, he is probably right, but the trend is not promising...

  • @TiagoTiagoT · 24 days ago

    @@Hexanitrobenzene Humanity is blindly approaching "tickling the dragon's tail" territory; but unlike with the Demon Core, once it goes critical, it won't be just a matter of a few lab workers suffering radiation exposure. Who knows, maybe we'll luck out, go the comic book route, and gain godly super-powers; but in the real world, the odds aren't looking good. Don't get me wrong, I'm not saying we would be safer with just the big corps handling the development of the future, or the end of the future, of humanity; we're fucked either way. Moloch, you know?

  • @Rhannmah · 24 days ago

    @@TiagoTiagoT Relax. A language model doesn't have the agency or the tools to take actions in the real world, and even if it did, it wouldn't be able to react to and incorporate the results. We're quite far from the situation you're thinking of. That doesn't mean you don't have to think about it, because it's pretty much undoubtedly coming in the future, but there is nothing to freak out about. The only actual worry in the immediate future is the number of people who become unemployable because of the performance of generative models.

  • @TiagoTiagoT · 24 days ago

    @@Rhannmah You must not have been following the news closely in recent years; people have been giving them all those abilities bit by bit, at a faster and faster rate. Unemployment is a concern, but that's just the bathtub starting to overflow; meanwhile the air faintly smells of gas and there are lit candles all over the place...

  • @skierpage · 24 days ago

    @@Rhannmah even without agency for the unknowable goals of an ASI, current AIs allow bad actors to do bad things with minimal effort. And what's really concerning is Google/Meta/Microsoft/OpenAI are run and owned by billionaire sociopaths whose goals include: getting you hooked on a stream of content so they can know all about you to monetize your profile; avoiding any meaningful government regulation; and stopping the redistribution of their wealth. Now imagine even worse actors and political campaigns having similar capabilities.

  • @alan2here · 24 days ago

    Yeah, you can get it in Africa, the US, and Asia, but not in the UK.

  • @TiagoTiagoT · 24 days ago

    What if they keep overtraining the smaller models until they plateau?

  • @JurekOK · 24 days ago

    That's literally what they did with Llama 3.

  • @TiagoTiagoT · 24 days ago

    @@JurekOK I thought they said they hadn't plateaued yet by the time they stopped training?

  • @SimonJackson13 · 24 days ago

    WinAmp ....

  • @Embassy_of_Jupiter · 21 days ago

    I tried a 2 bit quantized 70B model and it blew my mind how good it still was
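Rough arithmetic on weight storage (a sketch that ignores the KV cache, activations, and the per-group scales that real 2-bit schemes add on top) shows why aggressive quantization makes a 70B model approachable:

```python
def weight_storage_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal GB, ignoring quantization overhead."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4, 2):
    print(f"70B @ {bits:2d}-bit: ~{weight_storage_gb(70, bits):.1f} GB")
# 16-bit ≈ 140 GB, 8-bit ≈ 70 GB, 4-bit ≈ 35 GB, 2-bit ≈ 17.5 GB
```

Even at 2 bits, the weights alone exceed most consumer GPUs' VRAM, which is why such models are often split between GPU and CPU memory.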

  • @user-rk4ux3cj8q · 20 days ago

    8:25

  • @syncrossus · 23 days ago

    > The good that's come from these models far outweighs the bad

    Really? Don't get me wrong, I think language models are great, but I know people have lost their jobs over this, we've seen data breaches, people are falling in love with AI personas, one guy was driven to suicide, and scams are on the rise... I have no shortage of bad things to mention that have come out of AI, but I can't think of anything truly good. I'm sure a good number of people are a bit more productive in their work, but that doesn't seem like a worthy tradeoff to me.

    I also disagree with your cavalier attitude towards safety based on past experience. It seems possible to me that as these models become more powerful, we may attain the AI singularity (the ability for self-improvement). Once that happens, past experience will have very little wisdom to impart regarding what will happen next. It's very possible that we're worried for nothing, but given the scale of what's at stake, it only makes sense to be cautious.

  • @XOPOIIIO · 24 days ago

    People who thought that nuclear proliferation would cause nuclear war were wrong; they were wrong all along.

  • @eadweard. · 24 days ago

    Well, they weren't wrong that it was extremely risky. It just didn't happen to happen.

  • @XOPOIIIO · 24 days ago

    @@eadweard. Exactly

  • @eadweard. · 24 days ago

    @@XOPOIIIO Cannot tell what you are trying to say.

  • @XOPOIIIO · 24 days ago

    @@eadweard. How old are you? What is your IQ? Did you watch the video?

  • @oncedidactic · 23 days ago

    Is nuclear war possible without the proliferation of nuclear weapons? If you're going to talk about logical causes, be specific about the claim.

  • @seventyfive7597 · 24 days ago

    Correction: this is not open source. Open weights without the release of the processes or code is akin to a binary library: you can build with it, but you depend on it without knowledge. Open source should be at minimum as open as Grok 1.0; otherwise it is quite an evil way of sourcing, getting regulatory ease from the government and dev ideas from the community while keeping them dependent on the "binary". Same goes for Mistral.

  • @eadweard. · 24 days ago

    Not sure what you mean. If they release no architectural code/information, how can you use it at all, even for inference?

  • @Hexanitrobenzene · 24 days ago

    It's "open weights"; classical code does not have an analogue. With open weights you can do quite a lot of customization, in contrast to a binary library, which would require the extremely difficult task of reverse engineering to make any modification. True open source would mean revealing the training data, the model code, and the details of the training process.

  • @seventyfive7597 · 24 days ago

    @@Hexanitrobenzene I have to disagree with your first paragraph; what you're referring to is akin to the include file that goes along with the binary. It's closed code. For comparison, look at the amount of information X-AI released along with Grok 1.0. Mistral and Meta are local closed code, while OpenAI is SaaS closed code, but both are closed.

  • @clray123 · 24 days ago

    It is much worse with Llama because they reserve the right to terminate your license by accusing you of violating the Acceptable Use Policy, which they can change basically at any time. They also force you to defend them in court (indemnification) if your users sue them, which could be a big deal for a small company.

  • @Hexanitrobenzene · 21 days ago

    @@seventyfive7597 By "modifications" I meant fine tuning, which can be done way cheaper (

  • @ulamss5 · 24 days ago

    Just a reminder that OpenAI constantly nerfs their production models. Beating today's GPT-3.5 doesn't mean it beats the launch GPT-3.5 that formed our first impressions.

  • @Phobos11 · 23 days ago

    The launch version doesn't exist anymore and will never exist again, so I'm not really sure there's a point in comparing to a ghost. As long as open source models keep getting better, it's progress :D

  • @dr.mikeybee · 24 days ago

    Regarding Llama3, Sam looked scared out of his mind in a recent video. ClosedAI sucks.

  • @halocemagnum8351 · 23 days ago

    Obsessively blaming the safety crowd is, IMO, kinda cringe and lame. It's obvious why OpenAI and Anthropic don't open source their models: it's for profitability reasons. They don't even pretend otherwise, and they don't use safety as an excuse. Constantly blaming people who care about safety is going to lead to a rude awakening when Facebook decides it has tanked enough competitor market share and announces its own fully closed-off monetized models.

  • @TheEarlVix · 24 days ago

    Spun the 8B-parameter Llama 3 model up locally with Ollama and asked it to summarise some text, and it just spat out garbage. Tried it with Q4, Q8, and FP16 quantizations, and apart from "Why is the sky blue?" everything else I tried got a totally rubbish response. Also found that it often went into long, seemingly endless cycles of outputting the same paragraph of nonsense over and over again. Can't speak for the 70B-parameter model, but the results with the 8B show that this smaller version is definitely not ready for prime time.

  • @whoareyouqqq · 24 days ago

    ++++++ same result

  • @whoareyouqqq · 24 days ago

    Phi-3 is significantly better.

  • @TheEarlVix · 24 days ago

    @@whoareyouqqq Yes, I tried Phi-3 as a sanity check, because it all seemed a bit odd, especially after all the Llama 3 release hype, and Phi worked fine. Not perfect, but definitely without the complete-garbage issues.

  • @definty · 24 days ago

    Phi-3 is out and already beats the Llama 3 8B model, and it's like a week after the Llama 3 release.

  • @TylerMatthewHarris · 24 days ago

    Onest

  • @jermunitz3020 · 24 days ago

    🦙

  • @haldanesghost · 24 days ago

    This has really changed my perspective (from pessimistic to a little more optimistic); partly the dunking on the doomers, but also, by releasing these models and being unapologetic about it, we can start to get rid of the mystique that has been given to them because of this Wizard of Oz game OpenAI was playing. Letting people learn to deal with these systems by themselves and see what's under the hood will, I'm confident, lead to more efficient use of them, something that's not achievable when the name of the game is just "MAKE MODEL BIGGA! MOAR DATA! MOAR COMPUTE!!!" The power of having generalized approximators is wasted if all you use them for is, effectively, brute force on a graph.

    The thing about data quality cannot be overstated. If we can be rational adults for a second and drop the hype, the fact of the matter is that calling these systems "artificial intelligence" and acting as if they're a machine god doesn't change the fact that they're not intelligent, aren't doing anything close to it, and don't have any of the cognitive properties the hypers and the doomers keep attributing to them. They are just functions; literal f(x)'s (granted, big spicy ones). You're fitting a function to data under some optimization procedure. The relevance of the data is that in neural networks (and siblings) we have mathematical guarantees about them being able to fit anything (within reason); they're general-purpose approximators. That's a super useful thing to have! Quite powerful. You know what the weakness is, though? **You can fit anything.** Anything includes things you as a human don't want! But if the thing you don't want generated a signal that can be used to minimize loss, then the system doing things you don't want is actually working as intended. Being able to fit anything means that the function you're using to fit ceases to be of central importance, completely shifting the burden onto the data itself.

    Fitting these models (assuming you pulled it off) is just moving the data distribution from an explicit format to a functional representation. Hopefully this leads to a sobering of the field, and maybe an attempt is made to return to symbolics with the gains of these models; and maybe, just maybe, an artificial system could not just sound like a human, but reason like one.

  • @skierpage · 24 days ago

    The only way these LLMs can successfully predict the next word in an answer or conversation on almost any subject expressible in a sequence of characters (!!) is by being intelligent and having cognitive properties like understanding. We are all SO BLOODY TIRED of people claiming otherwise; if you deny AI "intelligence" and "understanding," you are making up your own definitions so you can move the goalposts to another sports stadium altogether. Just say that AI intelligence, understanding, and cognitive properties are not the same as human intelligence and human understanding (yes, we know), and give us your take on how they fall short. (I tried to use Bing Copilot Chat to find the pithy tweet where an AI expert trashed your tired wisdom that these things aren't intelligent... and it couldn't find it.)

  • @markburton5318 · 24 days ago

    I don't think you can say there is no harm from open source AI; it is too early. It was inevitable anyway that there would be leaks. But people will try to cause harm. Kids in the US machine-gun schools in order to be famous, so of course someone will try to create a 'terminator'. You laugh about EU legislation, but at least certain activities have to be illegal, otherwise they cannot be stopped. The effort on safety has to be stepped up.

  • @Effectivebasketball · 23 days ago

    No, it is not. Just another player and not the best of all.

  • @ivanstepanovftw · 24 days ago

    Can you please put a summary at the beginning or end of your videos? It is so boring to listen to "wow, it is so good!" or "best model" etc.

  • @mikethedriver5673 · 24 days ago

    Adding this might improve your videos, but I disagree that they are boring. I very much enjoy your videos 😊

  • @ivanstepanovftw · 24 days ago

    @@mikethedriver5673 OK! Here is spoiler for LLaMA 4/Mistral 2 7B/Phi/etc: OH MY GOD, IT IS SO MUCH BETTER, IT BEATS GPT-3.5.

  • @RozenKrieg · 24 days ago

    Llama 3 is a real downgrade

  • @tunestar · 24 days ago

    Kinda late video.

  • @buttpub · 24 days ago

    Well, Llama 3 compared to Mistral does not really perform much better, the 7B and 8B, that is.
