What's After LLMs?

Science & Technology

In the world of AI, the talk of the town is LLMs: Large Language Models. LLMs are a form of foundation model, a pillar of AI development. However, we seem to forget that the other pillars exist, and in some respects there is plenty more potential revenue in these other foundation models. In this video, we go into some of these details.
Ian Cutress, More than Moore
Mukesh Khare, IBM
Sriram Raghavan, IBM
Foundation Models: ibm.biz/BdSGKM
[00:00] Beyond a Foundation Model
[04:08] Are all models like language?
[09:48] Supply chain security for training data
[14:52] Will foundation models get more strict?
[20:03] What's the Future for IBM in AI?
-----------------------
Need POTATO merch? There's a chip for that!
merch.techtechpotato.com
more-moore.com : Sign up to the More Than Moore Newsletter
/ techtechpotato : Patreon gets you access to the TTP Discord server!
Follow Ian on Twitter at / iancutress
Follow TechTechPotato on Twitter at / techtechpotato
If you're in the market for something from Amazon, please use the following links. TTP may receive a commission if you purchase anything through these links.
Amazon USA : geni.us/AmazonUS-TTP
Amazon UK : geni.us/AmazonUK-TTP
Amazon CAN : geni.us/AmazonCAN-TTP
Amazon GER : geni.us/AmazonDE-TTP
Amazon Other : geni.us/TTPAmazonOther
Ending music: • An Jone - Night Run Away
-----------------------
Welcome to the TechTechPotato (c) Dr. Ian Cutress
Ramblings about things related to Technology from an analyst for More Than Moore
#machinelearning #artificialintelligence #sponsored
------------
More Than Moore, as with other research and analyst firms, provides or has provided paid research, analysis, advising, or consulting to many high-tech companies in the industry, which may include advertising on TTP. The companies that fall under this banner include AMD, Armari, Baidu, Facebook, IBM, Infineon, Intel, Lattice Semi, Linode, MediaTek, NordPass, ProteanTecs, Qualcomm, SiFive, Supermicro, Tenstorrent, TSMC.

Comments: 35

  • @radicalrodriguez5912 · 6 months ago

    channel's getting more and more polished, TechTech. Good on you

  • @Veptis · 6 months ago

    Foundation model or "foundational" model is quite an odd term as well. I believe "base model" is a better fit and actually explains what's going on. Also, the title is misleading: if you start a discussion about what comes "after" LLMs, it's really about what comes after transformers, because left-to-right models have significant limitations. What is talked about here is just taking the transformer architecture and applying it to other modalities of data, which has already been done. ViT (the vision transformer) and CLIP already manage to learn self-supervised (not self-labelled) embeddings for images and captions in a common embedding space; it's the basis on which Stable Diffusion and its derivatives work. CLIP has outperformed many visual models without any fine-tuning or task-specific training. So those models already exist for concepts beyond sequences (natural language, code, genome, audio, tokenized images). Regarding open-source models: I would argue that open models are much more secure. You know how the model is trained, you know its architecture, and you control any preprocessing or postprocessing that might happen. You can deploy and modify the model to your own liking. A model hiding behind "safety", secrecy, and an API you don't control is much less useful for building custom applications. Plus, if you spend your money with OpenAI and fine-tune a model on your data there, it will never leave their servers, and you can only deploy it with them.
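The zero-shot matching the comment attributes to CLIP can be sketched with toy vectors. All embedding values below are made up for illustration; real CLIP encoders would produce them from pixels and tokens, but the classification step really is just nearest-caption-by-cosine-similarity in the shared space:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for CLIP's image and text encoders,
# which map both modalities into one shared vector space.
image_embedding = [0.9, 0.1, 0.0]           # e.g. a photo of a cat
captions = {
    "a photo of a cat":   [0.8, 0.2, 0.1],
    "a photo of a dog":   [0.1, 0.9, 0.2],
    "a diagram of a CPU": [0.0, 0.1, 0.9],
}

# Zero-shot classification: pick the caption whose embedding is
# closest to the image embedding -- no task-specific training.
best = max(captions, key=lambda c: cosine_similarity(image_embedding, captions[c]))
print(best)  # a photo of a cat
```

Because images and captions live in one space, the same trick supports retrieval in either direction (caption-to-image or image-to-caption) with no extra training.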

  • @kelownatechkid · 6 months ago

    Excellent comment. Secrecy is NOT security!

  • @riskin620 · 4 months ago

    Hi Ian, great video as always. One thing I wanted to point out is that the Llama Community License is not an open source license, as Section 2 of the license breaks the OSI definition of Open Source. It's a source-available license, but it isn't Open Source: a technical, but meaningful, distinction. Second, there *are* open source models under OSI-approved licenses, like Dolly v2 from Databricks. Also, unlike Llama 2, Dolly v2 makes available both the training data and the weights, both under OSI-approved licenses. (This is merely an example; I have no horse in the LLM race and no opinion on Llama 2 or Dolly v2.)
    So while you're right that companies care a great deal about the source of the training data for these models, when you call out "open source" models like Llama 2 as a source of concern for businesses, what you're pointing at isn't the open source nature of the model (it's not open source in the first place, but even if it were, that's not the issue) but rather the fact that the training data *isn't* open source. The problem is actually caused by the *lack* of openness, not by openness, and the example you give (Llama 2) isn't an example of what you're trying to highlight. Yes, training data provenance matters a lot, as does the license covering that training data (Llama 2's training data is not available under any license, as far as I'm aware). No, Llama 2 isn't open source, and it isn't a problem because of the license that covers the model; it's a problem because the training data is secret. Making the data available under an open source license would help, though since it all comes from Meta there may still be issues using either Llama 2 or the data, depending on companies' responsible-AI or other ethics policies.
    Hopefully your future videos will reflect this clarification. This is like the megatransfers vs. clock speed for RAM issue, for me. ;) Thanks, and keep up the outstanding work!

  • @solidreactor · 6 months ago

    I have been looking for this kind of discussion that widens the view of the current state of AI and ML. Currently it's like everyone is focusing only on the coolest new things around the Transformer (previously it was the CNN), but I wish we could take a few steps back and get a larger view of where we are, where we've been, and where we're heading, and explore what we could do and what is currently being researched. With that said, my contribution to this topic and discussion is two things:
    1) I see the LLM's future as the *interface* between humans and computers (AI models); perhaps we could call it the "LLM interface" architecture. Basically, I see that we humans will interact with the LLMs, and they in turn act as the intermediary (middle-man) that then "talks" to the other models. These other models could be specialised in images, video, audio, quantum physics, coding, medicine, etc. This is basically a multi-model approach with the LLM as the only interface to the user. I also believe the next step after this "LLM interface" architecture would be the LLM consulting other LLMs before talking to the specific specialised models; let's call them the "board of LLM directors". This board of LLM directors takes the input from the user and discusses internally the best way to answer: which models should be included (image, video, etc.), whether the answer satisfies the "AI safety protocols", and whether we (the LLM board) need to ask the user for more input or context, and so on.
    2) The other contribution is regarding the "prompt tuning" or "parameter efficiency" talk (around 7:33), which I like to call "context tuning". For me, "context tuning" is a very good description of this fine-tuning step, for when I need to do more tuning than (relatively) "simple" prompt engineering: I have a long discussion with the AI (often ChatGPT 4) where I present in detail the context of what I want the discussion to be about.
    After a longer discussion we have established the context, prerequisites/dependencies, and the goal of the discussion. A tip: sometimes I tell the LLM only the specific initial context (with prerequisites/dependencies) while keeping the goal exploratory and open, so we can discuss "open-mindedly". This is a very good way to search for the "known unknowns": things you know exist but might be too convoluted to get a grasp on. It's a good process if you are like me and go by "for every hour, I focus 55 minutes on the question and the last 5 minutes on the answers" ;)
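The "LLM interface" architecture the comment proposes can be sketched as a front-end that routes each request to a specialised back-end model. Everything below is a toy stand-in: the keyword routing and the specialist functions are made up, where a real system would ask an actual LLM to make the routing decision:

```python
# Toy sketch of an "LLM interface" front-end dispatching to
# specialised back-end models behind a single entry point.

SPECIALISTS = {
    "image": lambda prompt: f"[image model] rendering: {prompt}",
    "code":  lambda prompt: f"[code model] generating: {prompt}",
    "text":  lambda prompt: f"[general model] answering: {prompt}",
}

def route(prompt: str) -> str:
    """Pick a specialist (keyword stand-in for the LLM's own decision)."""
    lowered = prompt.lower()
    if "draw" in lowered or "picture" in lowered:
        return "image"
    if "function" in lowered or "code" in lowered:
        return "code"
    return "text"

def llm_interface(prompt: str) -> str:
    """The single user-facing entry point: route, then delegate."""
    return SPECIALISTS[route(prompt)](prompt)

print(llm_interface("draw a picture of a potato"))
# [image model] rendering: draw a picture of a potato
```

The "board of LLM directors" idea layers another stage in front: several routers debate before `route` is called, but the delegate-and-return shape stays the same.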

  • @weeblewonder · 5 months ago

    Loved this chat. The question "Where is the data you're using to build and train these models coming from?" needs to be much more visible and recurrent in this industry.

  • @shmookins · 6 months ago

    Very helpful and informative. Thank you.

  • @WalnutOW · 6 months ago

    11:15 made me lol

  • @goodfortunetoyou · 6 months ago

    String rewriting is equivalent to a Turing machine. So, in some sense, language *is* computation.
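The equivalence the comment alludes to can be made concrete with a toy string-rewriting (semi-Thue) system: repeatedly applying substitution rules to a string is enough to carry out computation. The single rule below is made up for the example and just performs unary addition:

```python
def rewrite(s, rules):
    """Apply the first matching rule once; return None when no rule fires."""
    for lhs, rhs in rules:
        idx = s.find(lhs)
        if idx != -1:
            return s[:idx] + rhs + s[idx + len(lhs):]
    return None

def run(s, rules, max_steps=1000):
    """Iterate rewriting to a normal form -- a tiny semi-Thue system."""
    for _ in range(max_steps):
        nxt = rewrite(s, rules)
        if nxt is None:
            return s
        s = nxt
    raise RuntimeError("no normal form within step budget")

# Unary addition: erasing '+' normalises "11+111" to "11111" (2 + 3 = 5).
rules = [("+", "")]
print(run("11+111", rules))  # 11111
```

With richer rule sets, such systems can simulate any Turing machine's tape updates, which is the sense in which "language is computation".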

  • @lost4468yt · 6 months ago

    All models are computation? If they weren't, they'd be hypercomputers, aka oracles, aka magic.

  • @goodfortunetoyou · 6 months ago

    @lost4468yt oh oracular uniform distribution, please grant me a sign. If thou wishest me to go left, let the coin face up heads, and if thou wishest me to go right, tails. (Your mileage may vary)

  • @KrunoslavSaho · 6 months ago

    Thanks for the video.

  • @zyxwvutsrqponmlkh · 6 months ago

    The golden rule: he who has the gold, rules.

  • @vasudevmenon2496 · 6 months ago

    Is it still under development, or do we have primitive models beyond LLMs/MLMs?

  • 6 months ago

    So it's similar to an interactive dictionary.

  • @bobsyouruncle1574 · 6 months ago

    This presentation would not feel out of place if watching involved inserting a VHS tape into a VCR on a rolling CRT television rack in a wood paneled board room.

  • @INeedAttentionEXE · 6 months ago

    Ayy-Yaiy!

  • @cal2127 · 6 months ago

    I've made the joke that if you want to get any new equipment in IT, tell the bosses it's AI-related.

  • @v-sig2389 · 6 months ago

    Hahaha good one

  • @aravindpallippara1577 · 6 months ago

    I know you are self-aware, but far too many buzzwords at a foundational level make a man a bit sceptical.

  • @AlbertoMontesSoto · 6 months ago

    AI!?

  • @jorcyd · 6 months ago

    AI!

  • @v-sig2389 · 6 months ago

    ¡Caramba!

  • @marktackman2886 · 6 months ago

    Please keep the topics reachable by new audiences, content is really good

  • @ullibowyer · 6 months ago

    LLMs are so 2023. Glimpse of AGI *yawn*, what's next?

  • @v-sig2389 · 6 months ago

    Bots that will do everything for us, including fulfilling our psychological needs, and we will develop hard-drug-like addictions. People camping in front of the store for the latest iPhone will be nothing compared to how addicted people will become.

  • @lost4468yt · 6 months ago

    @v-sig2389 There are some things AI cannot replace no matter what, because the human is the unique part. E.g. I wouldn't be able to have an AI as a friend or romantic partner, because it would be fundamentally not meaningful to me. It has to be a human; even if an AI were able to act like an ideal friend/partner, it still wouldn't be one to me.

  • @gogogomes7025 · 6 months ago

    I know people are sick of hearing about AI, but people were also sick of hearing about the cloud, and that didn't make it any less of a revolution. I get that for us techies, hearing meaningless corpo marketing gibberish is tiresome, and we can see from a mile away when all they care about is making VCs' and board members' funny bits tingle with their endless yapping. Yet this still doesn't make it any less of a revolution. Sometimes I think to myself, "man, if I had jumped into cloud earlier I could be making bank", and for my money, AI is the next cloud.

  • @Wobbothe3rd · 6 months ago

    Most people aren't sick of it at all; it's just resentment on social media. Nothing about this AI revolution is meaningless corporate marketing: Jensen Huang knows exactly what he's talking about and isn't bluffing. People just resent his success. It is true that LLMs have been overemphasized, but that's really just a reflection of how massive the AI revolution actually is: even a subset of a subset is still a massive revolution in itself.

  • @aravindpallippara1577 · 6 months ago

    Cloud has been slowly reaching a plateau, especially where computation on the edge or on site is critical. For some time it has also been cheaper, in terms of raw power costs, to run your program on your own device than on cloud infrastructure (though infrastructure management and resilience are another matter).

  • @Wild_Cat · 3 months ago

    A.I.

  • @Ren33469 · 6 months ago

    😂

  • @abao · 6 months ago

    MLM? hahaha

  • @marktackman2886 · 6 months ago

    Please make a video that integrates this topic with the OpenAI breakthrough.
