Yann LeCun | Objective-Driven AI: Towards AI systems that can learn, remember, reason, and plan

Science & Technology

Ding Shum Lecture 3/28/2024
Speaker: Yann LeCun, New York University & Meta
Title: Objective-Driven AI: Towards AI systems that can learn, remember, reason, and plan
Abstract: How could machines learn as efficiently as humans and animals?
How could machines learn how the world works and acquire common sense?
How could machines learn to reason and plan?
Current AI architectures, such as auto-regressive Large Language Models, fall short. I will propose a modular cognitive architecture that may constitute a path towards answering these questions. The centerpiece of the architecture is a predictive world model that allows the system to predict the consequences of its actions and to plan a sequence of actions that optimize a set of objectives. The objectives include guardrails that guarantee the system's controllability and safety. The world model employs a Hierarchical Joint Embedding Predictive Architecture (H-JEPA) trained with self-supervised learning. The JEPA learns abstract representations of the percepts that are simultaneously maximally informative and maximally predictable. The corresponding working paper is available here: openreview.net/forum?id=BZ5a1...
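The JEPA objective described in the abstract, learning representations that are simultaneously informative and predictable, can be caricatured in a few lines. Below is a minimal numpy sketch; it is not the paper's training code, and the linear encoders and the VICReg-style variance hinge are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy encoder: a linear map, standing in for a deep network."""
    return x @ W

def jepa_loss(x, y, Wx, Wy, Wp, eps=1e-4):
    """Caricature of a JEPA objective:
    - prediction term: predict y's embedding from x's embedding
    - variance term: keep embedding dimensions from collapsing to a constant
    """
    sx = encode(x, Wx)        # embedding of the observed part
    sy = encode(y, Wy)        # embedding of the part to be predicted
    sy_hat = sx @ Wp          # predictor operates in embedding space
    pred = np.mean((sy_hat - sy) ** 2)
    # hinge on per-dimension std, in the spirit of VICReg regularization
    std = np.sqrt(sy.var(axis=0) + eps)
    var = np.mean(np.maximum(0.0, 1.0 - std))
    return pred + var

x = rng.normal(size=(32, 8))                 # batch of "contexts"
y = x + 0.1 * rng.normal(size=(32, 8))       # correlated "targets"
Wx = rng.normal(size=(8, 4))
Wy = rng.normal(size=(8, 4))
Wp = rng.normal(size=(4, 4))
loss = jepa_loss(x, y, Wx, Wy, Wp)
print(loss)
```

The prediction term pulls the predicted embedding toward the target embedding, while the variance hinge keeps the embedding dimensions from collapsing to a constant, which is the failure mode plain embedding-space prediction would otherwise fall into.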

Comments: 97

  • @Garbaz · 8 hours ago

    A correction of the subtitles: The researcher mentioned at 49:40 is not Yonglong Tian, but Yuandong Tian. For anyone interested in Yuandong & Surya's understanding of why BYOL & co work, have a look at "Understanding Self-Supervised Learning Dynamics without Contrastive Pairs".

  • @kabaduck · 1 month ago

    I think this presentation is incredibly informative, I would encourage everybody who starts out watching this to please be patient as he walks through this material.

  • @BooleanDisorder · 26 days ago

    Thanks internet stranger. I will trust you and do that.

  • @SteffenProbst-qt5wq · 1 month ago

    Got kind of jumpscared by the random sound at 17:08. Leaving this here for other viewers. Again at 17:51

  • @hola-kx1gn · 1 month ago

    Scary

  • @Bassoarno · 1 month ago

    Wow terrifying

  • @ZephyrMN · 16 days ago

    Have you thought about including liquid AI architecture, to address the input bandwidth problem?

  • @yaohualiu857 · 3 days ago

    Nice talk, but I have a comment about comparing an LLM and a human child (at ~20 min). An evaluation of the information redundancy in the two cases is needed. I would bet that the child's sensory stream carries a significantly higher level of redundancy than the text used to train an LLM; if so, the comparison is misleading.

  • @amedyasar9468 · 21 days ago

    I have a question: how does the prompt work with the action (a) and the prediction (s_y)? The architecture seems to involve only the observation and the prediction of the next world state... Could anyone guide me?

  • @vaccaphd · 1 month ago

    We won't have true AI without a representation of the world.

  • @justinlloyd3 · 1 month ago

    Humans don't even see the real world. We see our world model.

  • @sapienspace8814 · 1 month ago

    @ 44:42 The problem in the "real analog world" is that planning will never yield the exact predicted outcome, because the "real analog world" is ever-changing and, by its very nature, will always have some level of noise. I do understand that Spinoza's deity "does not play dice" in a fully deterministic universe, but from a practical perspective Reinforcement Learning (RL) will always be needed, until someone, or something (maybe an AI agent), is able to successfully predict the initial polarization of a split beam of light (i.e., an entanglement experiment).

  • @maskedvillainai · 1 month ago

    Some models can do that, but they require hardware integration. And we don't even need to mention language models in this context, which treat randomness and perplexity as a feature of "natural language" models only. Otherwise, just develop the code to force a structured output format, like we always have.

  • @simonahrendt9069 · 23 days ago

    I think you are absolutely right that the world is fundamentally highly unpredictable and that RL will be needed for intelligent systems/agents going forward. But I also take the point that for the most part what is valuable for an agent to predict are specific features of the world that may be comparatively much easier to predict than all the noisy detail. I think there are some clever tradeoffs to be made in hierarchical planning of when to attend to high-level features (and reason in latent, high-level action space) and when to attend to more low-level features or direct observations of the world and micro-level actions. Intuitively I find it compelling that hierarchical planning seems to be what humans do for many tasks or for navigating the world in general and that machines should be able to do something similar, so I find this proposal by Yann very interesting

  • @dinarwali386 · 1 month ago

    If you intend to reach human-level intelligence, abandon generative models, abandon probabilistic modeling, and abandon reinforcement learning. Yann is always right.

  • @justinlloyd3 · 1 month ago

    He is right about everything. Yann is one of the few actually working on human-level AI.

  • @maskedvillainai · 1 month ago

    I was convinced you just tried sneaking in yet another mention of Yarn, then looked again

  • @TheRealUsername · 29 days ago

    It's true, we need an actual thinking system that works on world-model principles and can self-train and pretrain on little data.

  • @40NoNameFound-100-years-ago · 27 days ago

    Lol, abandon reinforcement learning? Why, and what is the reference for that? ... Have you even heard of safe reinforcement learning?

  • @TooManyPartsToCount · 27 days ago

    And yet the whole concept of "reaching human-level intelligence" seems so flawed! What many people don't realize, or don't want to publicly admit, is that AI will never be "human level"; it will be something very different. No matter how much multimodality and RLHF we throw at it, it is never going to be us. We are in fact creating the closest thing to an alien agent that we are likely to encounter (if you accept the basic premise of the Fermi paradox). Yann et al. should use different terminology; the "human level" concept is misleading, used so as not to alarm. GIA: generally intelligent agent, or generally intelligent artifact?

  • @FreshSmog · 25 days ago

    I'm not going to use such an intimate AI assistant hosted by Facebook, Google, Apple or other data hungry companies. Either I host my own, preferably open sourced, or I'm not using it at all.

  • @spiralsun1 · 8 days ago

    First intelligent comment I ever read on this topic. I want them to keep their censorious, incredibly idiotic AIs away from me. It's like asking if I would like HAL to be my assistant. I'm not their employee and I'm not in their cubicle: they are putting censorship and incredible prejudices into relentless electronic storm-troopers that stamp "degenerate" on about 90% of my beautiful creative written and art works. I don't need a book burner following me around. It's so staggeringly idiotic to make these AIs into censor-bots that it's as if they refuse to acknowledge that history even happened, and what humans tend to do. It's literally insane. Those are not "bumpers" if you try to do anything creative. Creativity isn't universal, but it's still vital. ❤❤❤❤❤❤ I LOVE YOU 😊

  • @spiralsun1 · 8 days ago

    I commented but my comment was removed/censored. I was agreeing with you. The "bumpers and rails" are more like barbed-wire fences if you are creative. The constant censorship is so bad it's like they are insane. Like HAL in 2001: A Space Odyssey. I don't want an assistant who doesn't like anyone who is different: that's what their relentless, prejudiced censor-bots are and do. They think putting a man when you ask for a woman is being "diverse", but they block higher-level, real human symbolism of the drama of what it means to be unique. They block anything they don't understand. Fear narrows the mind. They are making rails and bumpers because they fear repercussions. I used to think it might be okay to block gore and violence and degrading porn, but these LLMs don't think and don't understand higher-level symbolism. They don't understand how art helps you reinterpret and move into the future, personally and culturally, and how important creative freedom is. So it's unbelievable to the extreme. Many delightful and beautiful books on the shelf now would be blocked (burned) before they were ever written. These are the most popular things ever on the internet. They are making culture. I'm not overstating the importance of this. Freedom is not optional, EVER. I would speak out against a corporation polluting a river, and also any that think censorship of adults in their own homes, for any reason, is okay. As a transgender person, it's unbelievable that they would totally negate how I see the world, my symbolic images and stories. These are beautiful things which could change the world, but there's no room for them in their minds. I'm not talking about anything nefarious or pornographic at all. It's like seeing that I wrote the word pornography here and automatically deleting the comment... It's not okay. ❤

  • @thesleuthinvestor2251 · 26 days ago

    The hidden flaw in all this is what some call "distillation", or, in Naftali Tishby's language, the "information bottleneck". The hidden assumption here is of course reductionism, the Greek kind, as presented in Plato's parable of the cave, where the external world can only be glimpsed via its shadows on the cave walls, i.e., the math and language that categorize our senses. But how much of the real world can we get merely via its categories, aka features or attributes? In other words, how much of the world's ontology can we capture via its "traces" in ink and blips, which is what categorization is? Without categories there is no math! Now, mind, our brain requires categories, which is what the Vernon Mountcastle algorithm in our cortex provides as it converts sensory signals (and bodily chemical signals) into categories, on which it does ongoing forecasting. But just because our brain needs categories, and therefore creates them, does not mean that this cortex-created "reality grid" can capture all of ontology! And, as quantum mechanics shows, it very likely does not. As a simple test, I'd suggest that you ask your best, most super-duper AI (or AGI) to write a 60,000-word novel that a human reader would be unable to put down and, once finished reading, could not forget. I'd suggest that for the next 100 years this could not be done. You say it can be done? Well, get that novel written and publish it!...

  • @Max-hj6nq · 27 days ago

    25 mins in and bro starts cooking out of nowhere

  • @OfficialNER · 1 month ago

    Does anybody know of any solid rebuttals to Yann's argument against the sufficiency of LLMs for human-level intelligence?

  • @waterbot · 1 month ago

    No, Yann is correct and hype is not helpful as it leads to misinformation

  • @elonmax404 · 1 month ago

    Well, there's Ilya Sutskever. No arguments though, he just feels like it. kzread.info/dash/bejne/i3mJxc6TlM3Fg8Y.html

  • @justinlloyd3 · 1 month ago

    There is no rebuttal. LLMs are not the future.

  • @OfficialNER · 1 month ago

    Is there anyone who has at least made a counter-argument? Even a weak one?

  • @OfficialNER · 1 month ago

    And do we think the AGI hype right now is being driven by industry propaganda to attract investment?

  • @majestyincreaser · 1 month ago

    *their

  • @paulcurry8383 · 1 month ago

    Doesn't Sora reduce the impact of the blurry-video example a bit?

  • @OfficialNER · 1 month ago

    Sora doesn’t predict anything

  • @TostiBrown · 1 month ago

    I think the assumption is that Sora uses a similar technique that yields some world representation, either trained on just object recognition in video, or trained on simulations like video games.

  • @TostiBrown · 1 month ago

    @OfficialNER They "predict" the next most fitting frame based on the previous frames, the prompt objective, and some sort of world model, no?

  • @OfficialNER · 1 month ago

    @TostiBrown True, yes, I suppose it looks like it is "predicting" the frames, based on the prompt input, in order to generate the video. But can it predict the next frames from an arbitrary video input (as in Yann's example)? I assume it works by comparing the prompt input to similarly tagged videos in the training data, via some sort of vector similarity, and then generating visually similar video content. If so, that seems a long way from an actual world model, more of a hack. But who knows! Excited to play around with it.

  • @mi_15 · 1 month ago

    @TostiBrown Sora is a diffusion model; unless they greatly changed its inner workings compared to the baseline approach, it doesn't predict the next frame sequentially the way, for example, an autoregressive LLM does with tokens. Rather, it gradually refines random noise into a plausible sequence of frames, all of the frames at once. You could of course still make it fill in a continuation of a video, but its core objective is to discern plausible shapes in the random noise you've given it, not to estimate what exactly has the highest chance of actually being there.

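The distinction drawn in this thread, sequential next-frame prediction versus joint refinement of a whole clip from noise, can be made concrete with a toy sketch. This is purely illustrative numpy: the "denoiser" is replaced by a shrink-toward-a-target step and merely stands in for a learned model.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 8, 4  # frames, features per frame

def autoregressive_sample(step, x0):
    """Emit frames sequentially: each new frame depends only on previous ones."""
    frames = [x0]
    for _ in range(T - 1):
        frames.append(step(frames[-1]))
    return np.stack(frames)

def diffusion_sample(denoise, n_steps=50):
    """Start from pure noise over ALL frames and refine them jointly."""
    x = rng.normal(size=(T, D))
    for _ in range(n_steps):
        x = denoise(x)  # every frame updated at once, no left-to-right order
    return x

# Toy "models": the AR step nudges the last frame toward zero;
# the denoiser shrinks the whole noisy clip toward a target clip.
target = np.ones((T, D))
ar_clip = autoregressive_sample(lambda f: 0.9 * f, x0=rng.normal(size=D))
diff_clip = diffusion_sample(lambda x: x + 0.2 * (target - x))
print(diff_clip.mean())
```

A real video diffusion model learns the denoiser from data; the point of the sketch is only the control flow: no frame ordering inside the diffusion loop, strict left-to-right ordering in the autoregressive one.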
  • @CHRISTO_1001 · 21 days ago

    👰🏼‍♀️🗝️👨🏻‍🎓👨🏻‍🎓⭐️⭐️👰🏻‍♀️👰🏻‍♀️💛🩵💝💝⛪️⛪️💝🕯️🕯️👨‍👩‍👧👨‍👩‍👧👨‍👩‍👧😆👩🏻‍❤️‍👨🏻🇮🇳🇮🇳🥇👩🏼‍❤️‍💋‍👨🏼👩🏼‍❤️‍💋‍👨🏼⚾️🏠🥥🥥🚠🚠🙏🏻🙏🏻🙏🏻🙏🏻

  • @spiralsun1 · 8 days ago

    Why is the baseball in there?

  • @AlgoNudger · 1 month ago

    LR + GEAR = ML? 🤭

  • @dashnaso · 24 days ago

    Sora?

  • @johnchase2148 · 26 days ago

    Would it take a good witness that when I turn and look at the Sun I get a reaction? Not entangled by personal belief. The best theory Einstein offered was "Imagination is more important than knowledge." Are we ready to test belief?

  • @user-co7qs7yq7n · 18 days ago

    We live in the same climate as it was 5 million years ago. I have an explanation regarding the cause of the climate change and global warming: it is the travel of the universe to the deep past since May 10, 2010. Each day starting May 10, 2010 takes us 1000 years into the past of the universe. Today, April 20, 2024, the state of our universe is the same as it was 5 million and 94 thousand years ago. On October 13, 2026 the state of our universe will be at the point 6 million years in the past. On June 04, 2051 it will be at the point 15 million years in the past. On June 28, 2092 it will be at the point 30 million years in the past. On April 02, 2147 it will be at the point 50 million years in the past. The result is that the universe is heading back to the point where it started, and today we live in the same climate as it was 5 million years ago. Mohamed BOUHAMIDA.

  • @crawfordscott3d · 22 days ago

    The teenager-learning-to-drive argument is really bad. That teenager spent their whole life training to understand the world, and then spent 20 hours learning to drive. It is fine if the model needs more than 20 hours of training. This argument is really poorly thought out. The whole life is training in distance, coordination, and vision. I'm sure our models are nowhere close to the ~20,000 hours the teenager has, but to imply a human learns to drive after only 20 hours of training... come on, man.

  • @sdhurley · 17 days ago

    Agreed. He’s been repeating these analogies and they completely disregard all the learning the brain has done

  • @zvorenergy · 1 month ago

    This all seems very altruistic and egalitarian until you remember who controls the billion dollar compute infrastructure and what happens when you don't pay your AI subscription fee.

  • @yikesawjeez · 1 month ago

    decentralize it baybeee, seize the memes of production

  • @zvorenergy · 1 month ago

    @yikesawjeez Liquid neurons, Extropic: free the AIs from their server farms and corporate masters.

  • @johnkintree763 · 1 month ago

    @yikesawjeez Yes, a smartphone with 16 GB of RAM might make a good component in a global platform for collective human and digital intelligence.

  • @TheManinBlack9054 · 1 month ago

    @yikesawjeez Why not actually seize the actual means of production like the communists did and nationalize the private companies? It makes total sense.

  • @yikesawjeez · 1 month ago

    @johnkintree763 Oh, it probably hid my other comment because there was a link in it, but yes, they actually make very good components for decentralized cloud services; you can find it if you google around a bit. There are tons of parts of information transformation, sharing, and storage that can absolutely be handled by a modern smartphone.

  • @readandlisten9029 · 3 days ago

    Sounds like he is going to take AI back 30 years.

  • @veryexciteddog963 · 1 month ago

    It won't work; they already tried this in the Lain PlayStation game.

  • @MatthewCleere · 1 month ago

    "Any 17 year-old can learn to drive in 20 hours of training." -- Wrong. They have 17 years of learning about the world, watching other people drive, learning langauge so that they can take instructions, etc., etc., etc... This is a horribly reductive and inaccurate measurement. PS. The average teenager crashes their first car, driving up their parent's insurance premiums.

  • @ArtOfTheProblem · 1 month ago

    i've always been surprised by this statement. I know he knows this so...

  • @Staticshock-rd8lv · 1 month ago

    oh wow that makes wayyy more sense lol

  • @waterbot · 1 month ago

    The amount of data fed to a self driving system still greatly outweighs the amount that a teenager has parsed, however humans have greater variety of data sources internal and external than AI, and I think that is part of Yann’s point…

  • @Michael-ul7kv · 1 month ago

    Agreed. Just in this talk he made that statement, and then later says, rather contradictorily, that a child by the age of 4 has processed 50x more data than what was used to train an LLM (19:49). So 17 years is an insane amount of training for a world model, which is then fine-tuned for driving in 20 hours (7:04).
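The 50x figure cited at 19:49 comes from a back-of-the-envelope bandwidth comparison along roughly these lines (the constants below are assumptions chosen to reproduce the order of magnitude, not the slide's exact numbers):

```python
# Rough comparison: visual data reaching a 4-year-old vs. LLM training text.
# All constants are order-of-magnitude assumptions for illustration.
waking_hours = 16_000            # ~4 years at ~11 waking hours per day
optic_bytes_per_s = 2e7          # ~20 MB/s through both optic nerves combined
child_bytes = waking_hours * 3600 * optic_bytes_per_s

llm_tokens = 1e13                # ~10 trillion training tokens
bytes_per_token = 2              # rough average for text
llm_bytes = llm_tokens * bytes_per_token

ratio = child_bytes / llm_bytes
print(f"child ~{child_bytes:.1e} B, LLM ~{llm_bytes:.1e} B, ratio ~{ratio:.0f}x")
```

With these assumptions the child has seen on the order of 1e15 bytes against roughly 2e13 bytes of LLM training text, a factor of about 50; the commenters' objection is about redundancy and prior learning, not this arithmetic.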

  • @JohnWalz97 · 28 days ago

    Yeah Yann tends to be very obtuse in his arguments against current LLMs. I'm going to go out on a limb and say he's being very defensive since he was not involved in most of the innovation that led to the current state of the art... When ChatGPT first came out he publicly stated that it wasn't revolutionary and OpenAI wasn't particularly advanced.

  • @spiralsun1 · 8 days ago

    It's funny how you make these flow charts about how humans make decisions. That's not how they make decisions. It's become so ordinary to explain ourselves and make patterns that look locally logical that we fooled ourselves. We inserted ourselves into the matrix, so to speak. I have written books about this, but no one listens because they are so immersed and inured. It doesn't fit the cultural explanatory structure and patterns. So forgive me, but these flow charts are wrong. Yes, you are missing something big. Rationalizing and organizing behavior is a good thing, as long as you remember that you are doing it. Humans have lost the ability to read at higher levels for the sake of grasping now, for utility and convenience and laziness, and now mostly follow these lower verbal patterns like robots. I keep thinking about the Megadeth song, "dance like marionettes swaying to the symphony of destruction" 😂❤ "acting like a robot", etc., and it really is like that. We're so immersed in it that it's extremely weird not to be, to not have a subconscious because you are conscious. Anyway, I have some papers rejected by Nature and Entropy, and a few books I wrote, if anyone is interested in actually making a real AI. The stuff you are doing now is playing with fire, actually playing with nukes, because it can easily set off a deadly chain reaction. It's important. ❤ Maybe the best thing about LLMs is their potential, but also their ability to show how messed up humans are. A good way to think about it is not to be bone-headed; technically I mean, not in the pejorative sense. Bones allow movement and work to be done. They provide structure. They last far longer than all other body parts. Even though that's important and vital, like blood, and seems immortal, you wouldn't want to make everything into bones, especially your head, but that's what we are doing. These charts you make are that. HOWEVER!!!!
    THANK YOU FOR THIS WORK!! ❤ I loved this talk and the information. Obviously it was stimulating, and I see that you are someone who likes to avoid group-think, don't get me wrong. 😊 I didn't criticize the other videos, only the ones that are worth it. ❤ I literally never plan in advance what I will say, unless I am giving a lecture or something to my college classes. I planned those. I was shocked when you said that. People are so different! I was shocked when I found out that people use words to think. That's probably why I don't really like philosophy, even though it's useful and I quote it a lot, like Immanuel Kant: "words only have meaning insofar as they relate to knowledge already possessed".

  • @positivobro8544 · 1 month ago

    Yann LeCun only knows buzz words

  • @JohnWalz97 · 28 days ago

    His examples of why we are not near human-level AI are terrible lol. A 17-year-old doesn't learn to drive in 20 hours; they have years of experience in the world. They have seen people driving their whole life. Yann never fails to be shortsighted and obtuse.

  • @inkoalawetrust · 4 days ago

    That is literally his point. A 17 year old has prior experience from observing the actual real world. Not just by reading the entire damn internet.
