AI Won't Be AGI, Until It Can At Least Do This (plus 6 key ways LLMs are being upgraded)
Science and technology
The clearest demonstration yet of why current LLMs are not just ‘scale’ away from general intelligence. First, I’ll go over the dozen ways AI is getting murky: from dodgy marketing to tragedy-of-the-commons slop, from Recall privacy violations to bad or delayed demos. But then I’ll touch on over a dozen papers - and real-life deployments of LLMs, CNNs and more - making the case that we shouldn't throw out the baby with the bathwater. Crucially, I'll cover 6 key approaches being developed to drag LLMs toward AGI. This video will hopefully, at the very least, leave you much better informed on the current landscape of AI.
AI Insiders: / aiexplained
ARC-prize: arcprize.org/?task=3aa6fb7a
ChatGPT is Bullshit: link.springer.com/article/10....
Apple Cook Interview: www.washingtonpost.com/opinio...
AI Slop: taplio.com/generate-linkedin-...
www.npr.org/2024/05/14/125107...
OpenAI Erotica: www.theguardian.com/technolog...
AI Toothbrush: www.oralb.co.uk/en-gb/product...
Recall Hackable: www.wired.com/story/microsoft...
Murati: ‘Few Weeks’: • Introducing GPT-4o
Medical Studies: x.com/JeremyNguyenPhD/status/...
BeMyEyes Demo: • Be My Eyes Accessibili...
BrainoMix: www.ncbi.nlm.nih.gov/pmc/arti...
www.gov.uk/government/news/ar...
Chollet Clips and Interview: • Francois Chollet On LL...
• Francois Chollet recom...
• Francois Chollet - LLM...
Gartner Hype Cycle: s7280.pcdn.co/wp-content/uplo...
Jack Cole: x.com/Jcole75Cole/status/1787...
Methods: lab42.global/community-interv...
lab42.global/community-model-...
Ending Animal Testing - GAN: www.nature.com/articles/s4146...
www.bbc.co.uk/news/articles/c...
Virtual Rodent: x.com/GoogleDeepMind/status/1...
Google AI Overviews: www.nytimes.com/2024/06/01/te...
Noam Brown Optimism: x.com/polynoamial/status/1801...
MLC Human-like Reasoning: www.nature.com/articles/s4158...
Many Shot Google Deepmind: arxiv.org/pdf/2404.11018
Automated Process Supervision: arxiv.org/pdf/2406.06592
No Zero-shot Without Exponential Data: arxiv.org/pdf/2404.04125
Prof. Rao Paper 1: openreview.net/pdf?id=X6dEqXIsEW
DrEureka: eureka-research.github.io/dr-...
AlphaGeometry: deepmind.google/discover/blog...
Graph Neural Networks Joint: arxiv.org/pdf/2406.09308
Terence Tao: Tacit Data: www.scientificamerican.com/ar...
Mira Murati on Models: x.com/tsarnick/status/1801022...
Chollet Tweet: x.com/fchollet/status/1801780...
Non-hype Newsletter: signaltonoise.beehiiv.com/
GenAI Hourly Consulting: www.theinsiders.ai/
Need a GenAI app built for your business (any scale) in 4-8 weeks? My SF-based colleague Michael Lin - ex-Netflix and Amazon Senior Software Engineer - is now available for a free 30 min consultation: hello@allinengineeringconsulting.com
Comments: 975
I've seen people who haven't reached AGI level either.
10 days ago
😂
@TimRobertsen
10 days ago
Well, let's not ask for too much. Any level of intelligence would do :p
@mattwesney
10 days ago
When I saw the test for the first time, I immediately thought "what if I can't figure out the pattern," and I'm just a stupid NPC AI 😅
@BasilAbdef
10 days ago
@mattwesney I regularly fail some of the tests this channel displays in its videos...
@chromosundrift
10 days ago
GI
This video shouldn't be considered just a piece of content. It is a timeless reference that clearly explains the real state of AI. Your work is a huge contribution to the free community, very appreciated!
@aiexplained-official
11 days ago
Thanks Abdul
@UFOgamers
10 days ago
He will be spared by the AI overlord
Harnessing hallucinations as something constructive instead of viewing them as wholly detrimental is a fascinating and very exciting prospect.
@jpslaym0936
11 days ago
I wish my wife had the same attitude when I end up smoking too much weed
@idlx420
11 days ago
@@jpslaym0936 lmao
@TheRealUsername
11 days ago
In fact, all these models do is hallucinate; they're fundamentally text generators. It's just that from our perspective some hallucinations are good and some are bad.
@drewbbc7074
11 days ago
Developing milestone trajectories to build out various scenarios seems like a reasonable use case
@prolamer7
11 days ago
It is very obvious; at least it was for me, for a year or so...
Best start to the week when AI Explained drops. The best AI news around!!
@aiexplained-official
11 days ago
Aw thanks man, appreciate that
I just don't understand why developing AGI, and thereby consolidating all power in one set of hands, is such a worthwhile goal. I can't imagine what social ladders will exist to climb out of poverty, or what sort of freedom there will be, when AGI arrives.
Claude Opus solves 17:45 correctly with this prompt: " saa → Green guu → Red ree → Blue fii → Purple hoo saa → Green, Green hoo ree → Blue, Blue ree muo fii → Purple, Blue, Purple guu muo ree → Blue, Red, Blue hoo ree muo saa → Green, Blue, Blue, Green fii muo hoo saa → Green, Green, Purple, Green, Green hoo guu muo hoo fii →" GPT-4o, GPT-4T, Gemini 1.5 Pro couldn't solve it despite many retries. Thanks for another great video!
@drhxa
11 days ago
Opus is inconsistent but when it does get it right it usually has "step by step" in the first sentence of the response. Here's one of the correct answers it has given me: "To determine the colors corresponding to "hoo guu muo hoo fii", let's break it down step by step: hoo guu → Red, Red muo → separates colors hoo fii → Purple, Purple Therefore, "hoo guu muo hoo fii" translates to: Purple, Purple, Red, Red, Purple, Purple"
@abhishekak9619
11 days ago
Claude is definitely better than the others; even Sonnet is extremely smart. The only reason people use ChatGPT, and the reason for its high user count, is simply that it's a lot easier to make Google accounts to cheat the usage limit, and Claude doesn't accept fake phone numbers. I have 5 different accounts for ChatGPT and only one for Claude. It would be great if I could trade GPT-4o messages for Claude Sonnet messages, not even Opus. @drhxa
@user-ni2rh4ci5e
10 days ago
I completely failed the test and had no clue what it was about. For the record, my IQ is around 140. The test is problematic because it fails to provide enough clear examples to effectively test pattern recognition skills. It's more about asking participants how they arrived at their conclusions and what logic makes sense to them, which deviates from its original purpose and leads to various interpretations
@tigrisparvus2970
10 days ago
@user-ni2rh4ci5e Or it's just a really easy puzzle that confused you.
@alsu6886
10 days ago
Q* of GPT-4 works consistently and correctly.
---
saa → Green
guu → Red
ree → Blue
fii → Purple
hoo saa → Green, Green
hoo ree → Blue, Blue
ree muo fii → Purple, Blue, Purple
guu muo ree → Blue, Red, Blue
hoo ree muo saa → Green, Blue, Blue, Green
fii muo hoo saa → Green, Green, Purple, Green, Green
hoo guu muo hoo fii → ?
Use Q*
---
Let's solve the problem using the Q* method for complex problem-solving.

Step 1: Define the Problem
We need to decode the sequence hoo guu muo hoo fii based on the provided mapping of syllables to colors.

Step 2: Tree-of-Thoughts Reasoning
Break down the problem into a tree of reasoning paths. Each node represents a potential step in decoding the sequence.
hoo: This represents a duplicated sequence.
guu: Maps to Red.
muo: Indicates a mirror effect on the adjacent sequences.
hoo: This represents a duplicated sequence.
fii: Maps to Purple.

Step 3: Intermediate Step Evaluation
Assess each reasoning step independently:
hoo: Typically repeats the sequence it follows.
guu: Maps to Red.
muo: Requires us to mirror the sequences around it.
hoo: Repeats the sequence it follows.
fii: Maps to Purple.

Step 4: Iterative Self-Improvement
Refine reasoning pathways:
Translate the initial colors: guu → Red, fii → Purple.
Apply muo (mirror effect): hoo guu becomes Red, Red; hoo fii becomes Purple, Purple.

Step 5: Look-Ahead Planning
Anticipate potential outcomes of different paths: hoo guu (Red, Red) mirrored with hoo fii (Purple, Purple) around muo. The sequence becomes Purple, Purple, mirrored around the muo: Red, Red, Purple, Purple, Red, Red.

Step 6: Data Augmentation
Consider similar examples:
ree muo fii → Purple, Blue, Purple
guu muo ree → Blue, Red, Blue

Step 7: Solution Synthesis
Integrate the most viable reasoning paths into a coherent, evidence-based solution. Original sequence: hoo guu muo hoo fii. Translating guu and fii: hoo Red muo hoo Purple. Applying hoo (duplication) and muo (mirror): Purple, Purple mirrored with Red, Red.

Conclusion
Thus, the sequence hoo guu muo hoo fii translates to: Purple, Purple, Red, Red, Purple, Purple.
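The rule the commenters above converge on can be written down as a tiny interpreter. This is a sketch of one consistent reading of the examples, not anything from the video or a paper: "hoo X" repeats X's colors, and "A muo B" produces colors(B) + colors(A) + colors(B); the function and variable names are mine.

```python
# Sketch of one consistent reading of the puzzle above (assumed rules,
# inferred only from the worked examples in the comments).
PRIMITIVES = {"saa": "Green", "guu": "Red", "ree": "Blue", "fii": "Purple"}

def interpret(phrase):
    words = phrase.split()
    # "muo" binds loosest: split the phrase around it first,
    # then mirror: colors(right) + colors(left) + colors(right).
    if "muo" in words:
        i = words.index("muo")
        left = interpret(" ".join(words[:i]))
        right = interpret(" ".join(words[i + 1:]))
        return right + left + right
    # "hoo" doubles whatever phrase follows it.
    if words[0] == "hoo":
        rest = interpret(" ".join(words[1:]))
        return rest + rest
    return [PRIMITIVES[w] for w in words]

# Reproduces every training example, e.g.:
print(interpret("fii muo hoo saa"))     # ['Green', 'Green', 'Purple', 'Green', 'Green']
print(interpret("hoo guu muo hoo fii"))  # ['Purple', 'Purple', 'Red', 'Red', 'Purple', 'Purple']
```

The point of the benchmark is precisely that a human (or a program synthesizer) can induce these two rules from a handful of examples, while LLMs only get it intermittently.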
Perfect video; you stand out from the AI news crowd with it. Great, measured perspective.
@aiexplained-official
11 days ago
:))
It’s incredibly refreshing to see AI coverage that’s factual, doesn’t dismiss the issues, and doesn’t promise utopia or Armageddon within the next 12 months. An internet for you!
"The world is more complex than it seems". Well said, and this is the typical phrase someone utters in the midst of a journey. Once the destination is reached, the complexity of discovery goes away, and those not involved in the journey will call the result "obvious"*. Fascinating; I am glad to be alive while this technological revolution is unfolding, meandering, hitting and overcoming dead ends. *To give an example, Aristotle was convinced that in order to maintain the speed of an object you need to apply force, which is very natural because friction is everywhere on Earth. Once Newton et al. posited that force actually causes acceleration, the concept of friction was naturally integrated into the framework. To the high school student I once was, this solution seemed obvious because it was presented as self-evident.
@LiveType
9 days ago
Yep, banger example. Everything is "easy" when you know how to do it. Even the previously impossible. Like nuclear bombs. They went from "outright impossible to actually create" to a 13 year problem once neutrons were discovered. The only lesson to learn from this: Don't call things impossible that are not mathematically impossible. General intelligence exists. We have copious proof of it. It's not impossible. We just haven't found a way to artificially synthesize it. We may eventually find that you can't fully synthesize it in a digital representation. At least that's my hunch. The question isn't really if though, it's when. My prediction is still firmly late 2030s, early 2040s. That hasn't changed. That estimate was made by extrapolating out compute power and literally nothing else. We may get there slightly faster than expected due to compute scaling that didn't exist when I made that prediction, but that won't push it forward by a decade.
I see this as a win for natural language understanding at the end of the day, because more or less what we are finding is that language use and understanding is somewhat orthogonal to intelligence. LLMs are really great at language now (even better than humans), but not particularly intelligent overall. It is the classic story really: narrow AI solves a problem that was deemed to be the final hurdle, and then we notice how little that specific problem indicates true intelligence.
@thebeckofkevin
11 days ago
I think the core issue is the 'true intelligence' framing. It's my opinion that I also could not solve a problem without an associated context. For example, if I had never seen a jigsaw puzzle before, I could slowly come to realize that all the pieces link together to form a single image; I would be able to do this because I've seen things connect together before, and I've seen an image before. If I am given an hour-long recording of a language I have never heard, I could slowly piece together what some of the words might mean, because I know what a language is, I can hear the words and tones, and I know that words link together in meaningful ways. There isn't a 'true intelligence' in the sense of something smart enough to solve problems unlike anything it has ever seen. Every aspect of our lives, and the lives of those before us, has compiled mountains of data into more and more compressed ideas. We were taught to read and write; we didn't have to figure out how. We were taught logic, problem solving, math, etc. These are the tools we are given when we approach a new problem. On top of that, we all have a long history of experiences that tilts our potential for solving novel problems. I may have heard a story about a bear climbing a tree yesterday, and that leads me to try a complex physics problem in a way guided by that story; meanwhile you just had a cheese sandwich, and the experience of eating it guides you to approach the problem in a totally different way. The process of building LLMs shouldn't be seen as discovering that LLMs are incapable of intelligence. Instead, it should help us realize that our own capacity for genuinely novel thought is more of an illusion than we think.
@freddybell8328
11 days ago
Referencing and inspiration aren't the same as intelligence, and that's generally what LLMs do. People can think without words; we think in concepts that take a while to be conveyed with words. You can reference a language you already know to understand a new one, but someone was the first to create language. LLMs don't create firsts, per se. Some things they do look original but are actually a piecemeal of works humans have created. @thebeckofkevin
@thebeckofkevin
9 days ago
@freddybell8328 I don't believe anyone was the first to create language, but this might deviate too far from the discussion. The concept of invention is false, in my opinion; creation is rather the combining of existing things. Imagine a thought. Now understand that that thought is the result of the entire universe up to that point. All the atoms in your body were not invented, they were combined. All the electricity flowing through your brain is not there by chance, but is a result of stimulus from the external world. 'You' own nothing, and as such the concept of creating something from nothing is impossible. If I say 'think of a random word no one could possibly guess', your brain will attempt to process the sound waves and then conjure a 'random' word, but the entire process is deterministic. Unless neurons have some undefined behavior that surpasses physics, we could (in theory) reverse-engineer the entire sequence of events of every aspect of your life that led to that point in time and, with the full context, know what word you 'conjured' at random. You cannot invent thoughts, only recombine existing things. We have massive egos and suites of confusing things like language, society and selfhood that make it seem like we have some innate ability to just be smart, but every single thought inside your head comes from outside of your head, including the matter itself. There is no separation between the brain in your head and the rest of the universe.
@skierpage
8 days ago
Define what "true intelligence" is, or your observation isn't helpful and is more goalpost-moving.
@timseguine2
7 days ago
@skierpage It isn't goalpost-moving, since my point was that we had no idea in the first place what "intelligence" even meant when it was common to claim natural language understanding was one of the last true hurdles to AGI. Similar claims were made in the past about chess, Go, object identification, image generation, TTS, handwriting recognition, speech recognition; the list goes on, and we have pretty much always been wrong. My point was that we have so little idea of what is necessary for AGI that we shouldn't be surprised when we are wrong about what will be the last great breakthrough. We got a lot further with LLMs than anyone really expected, but now that we have really good natural language processing, it is clear that where we are is a large step forward but still a long way from AGI.
I've said it once, I'll say it 1000 times: this is the best AI YouTube channel. Nothing else comes close in depth of analysis, balanced takes and engaging presentation.
Thank God you uploaded! This week has been so confusing and i need your voice to whisper sweet papers into my ear…
@jonathanlucas3604
11 days ago
Whoa careful, I think he's taken. However, I am free...
You're legitimately the only source I trust when it comes to AI. Continue the good work.
This actually made me feel much more optimistic for the future of the landscape, great work as ever!
I really appreciate your honesty in this video. Exactly why you're the only AI channel I've stayed subscribed to. Great work.
@aiexplained-official
11 days ago
:)
26:10 please make these videos as long as you want. We are here for this! Thanks!
This is such a well made video, Phil. I'm going to have to watch it twice to fully digest it. After not seeing a video from the channel in two weeks, I can tell you've been pretty busy!
@aiexplained-official
11 days ago
Thanks Keegan, yes put everything into this one, my longest yet
This is one of the best videos on here. You are definitely a cut above most of the other AI fluff channels on YT.
@aiexplained-official
11 days ago
Thanks Akhil
@AkhilBehl
11 days ago
@ThePowerLover Two things. 1. LLM != GPT-3/4. 2. Your anthropomorphising of the models is making you short-circuit the reasoning chain of what needs to happen for an LLM to go from prediction to reasoning. LLMs are in no way reasoning machines so far; they could be in the future, but are not yet. What you deem intuition is pattern matching in very high-dimensional representations, which would appear to be intuitive behaviour if they were indeed sentient beings.
@AkhilBehl
11 days ago
@@ThePowerLover Lol, and neither do you. Anyway, have a good day, man.
Great video as always. Let's not forget that GPT-4's vision can't capture all the necessary details; the visual representation isn't precise enough to be useful for the ARC test. I don't agree with the ARC test authors' argument, as the task's presentation to LLMs differs significantly from how people experience it. We'll have to wait for future models with improved vision capabilities to truly assess their performance on such tasks. Excellent points on AI marketing.
@hellohey8088
11 days ago
Good point. I asked GPT-4o to describe what it saw in a screenshot of one of the ARC puzzles and it couldn't even discern the basic structure. A basic prerequisite for using images as intelligence tests should be ensuring the model can see what is necessary to make the correct inference.
@Stumdra
7 days ago
ARC has a very strong visual component, which current models still struggle with. They are language-native models with vision as an add-on; maybe it is better to call ARC a visual reasoning test. Props to Chollet for putting out this very valuable test, though.
This was your best video yet
@aiexplained-official
10 days ago
Thanks man
Recalling reasoning chains is what we do. That's enough for me to solve some math on my own, but not enough in other circumstances, even if I saw similar problems at school. Moreover, at school we had the notion of "cramming": remembering without understanding, and in many cases that was enough to pass an exam. No wonder LLMs get by doing just this. So I would say it's old news. We still can't define what reasoning is, even if we understand that it's just associations, and that to get an association you need the associations from the previous level. But how a feedback loop turns this into a thought process, we can't yet comprehend.
@abhishekak9619
11 days ago
I think it's the sense of time. The model's sense of time is limited to the next token. We humans have a different sense of time, depending on the task: we think we can move our arms around instantly, but my hand takes about 100 milliseconds to move, and our reaction time is around 200 milliseconds, often a little higher. I can speak and do more before I even process the consequences of what I have already done. I think that is what differentiates LLMs: they are limited to the next token; they process everything only right after they do it; everything is already decided.
@CaritasGothKaraoke
11 days ago
Your use of “we” is presumptuous.
@timherz86
11 days ago
@abhishekak9619 I agree it's a difference. I don't see how it is relevant to whether they can use reasoning, though.
@jonnyjoker01
10 days ago
@abhishekak9619 I think it's not related to time at all, but to the ability to double-check parts of what it created. While we work on a problem we are constantly thinking about how to solve it, and also about whether the parts of the solution we came up with are correct. The problem is that anything an LLM creates is already its best guess by design, so this alone wouldn't solve anything. I think the biggest problem is the LLM having no way of judging whether it knows something or not. We as humans know when to stop solving a problem in the moment and look for additional information; LLMs don't. Additionally, as shown by services like Perplexity, they can't determine as well as us whether the information they found is good or useful for a given problem.
@lmulling
10 days ago
> Recalling reasoning chains is what we do. Do we? I just get the answers in my head the next day...
I’ve never understood the hatred for hallucinations. Humans hallucinate all the damn time, making shit up as we go. Hallucinations have always been a sign to me that A.I. is on the right track.
@ronilevarez901
7 days ago
The difference is that whenever we need to output precise and truthful answers, we can either do it or say "I don't know". LLMs simply go and hallucinate, and even defend their hallucinations as true. For a story, making up a character's entire life history is great; for a historical figure, it's not. We need them to know when they don't know something, or when they can't do it, so they can pick an alternate way to produce an accurate answer.
@DynamicUnreal
7 days ago
@ronilevarez901 I understand your point: you want an A.I. that will just write fact after fact. What I am saying is that even if humans have the capability to say "I don't know", how often does that happen? If I ask a random person to tell me about Abraham Lincoln's life from memory, many of them will probably tell me things that never happened.
@MirrorscapeDC
5 days ago
@DynamicUnreal If people do that, that is also a problem, and we should work to reduce it. But people expect that when asking random people; they don't expect it from experts, and for better or worse, people are using LLMs (and are encouraged to use LLMs) as if they were experts.
@Tongokai
5 days ago
AI is needed to benefit us bio creatures. That would only be good for AGI agents.
@squamish4244
7 hours ago
@@ronilevarez901 A lot of people don't know that they don't know either, and defend their hallucinations as true.
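One concrete direction for the "know when you don't know" problem debated above is to abstain whenever the model's own answer distribution is too flat. The sketch below is illustrative only: the probabilities and the entropy threshold are made up, and no vendor API exposes exactly this interface.

```python
import math

def entropy(probs):
    # Shannon entropy in bits of a discrete distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def answer_or_abstain(candidates, threshold=1.0):
    # candidates: hypothetical {answer: probability} scores from a model.
    # High entropy means the model is spread thin, so abstain.
    if entropy(candidates.values()) > threshold:
        return "I don't know"
    return max(candidates, key=candidates.get)

confident = {"Gabriel Macht": 0.95, "Stephen Macht": 0.05}
uncertain = {"A": 0.4, "B": 0.35, "C": 0.25}
print(answer_or_abstain(confident))  # Gabriel Macht
print(answer_or_abstain(uncertain))  # I don't know
```

The hard part, as the thread notes, is that next-token probabilities do not straightforwardly measure factual confidence, which is why this remains an open research problem rather than a settled recipe.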
Our height of understanding is limited by the heights of our reasoning methods. Train there. Terence Tao 👍
On the point that "just scaling" isn't enough, I'm reminded of Kurzgesagt's AWESOME recent video on us being NPCs. They talk about "emergence": the fact that units of atoms that are not considered alive get together and build something that IS alive, or water molecules not being "wet" on their own, with "wetness" only emerging when water molecules come in contact with other stuff like your clothes. Maybe scaling neural nets would also lead to more emergent behaviour in LLMs, in ways that we couldn't explain but definitely could observe. Also, fantastic video as always❤❤
@aiexplained-official
11 days ago
Thanks Reza, yes that was a great vid
@bloodust7356
11 days ago
Wow that's actually really interesting to think about that.
@HuyTruong-bd4hb
11 days ago
I think that is already apparent in LLM scaling, in-context learning isn't possible until the billions of parameters threshold is reached
@pafu015
11 days ago
Emergence is a fallacy. True qualities cannot be created through a change of quantity. What we perceive as new "qualities" must already exist within lower quantities, just invisible to us. The emergence we perceive is a pattern that only we perceive; it isn't "real" in that sense.
@reza2kn
11 days ago
@pafu015 So? Does it change anything? Even if it only becomes real to us once it reaches a certain level of abstraction, and the mechanics behind the previous steps are invisible to us, I would still call that emergence, as things we couldn't perceive before EMERGE into things we CAN perceive. It doesn't mean magical things come into existence, just that by observing the new ways multiple units can interact, especially past a certain threshold of complexity, it becomes possible FOR US to comprehend, execute, and/or control things in our lives.
Great stuff! Really appreciate how each video distils so much information from loads of sources, and that they're also all individually cited in the description :)
@aiexplained-official
10 days ago
Thanks my dude
I feel that our viewpoints are becoming more aligned.
OpenAI: We are close to achieving Artificial General Intelligence (AGI). François Chollet: Here is a test I created that a 4-year-old child can solve easily, but your best model cannot.
@notaras1985
7 days ago
Exactly. They are just text predictors on steroids. No intelligence
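The gap this exchange points at is exactly what program-synthesis approaches attack: instead of predicting tokens, search a space of candidate programs for one that explains every training pair. Here is a toy sketch with a hypothetical four-transform DSL and a made-up mirror task; real ARC solvers search vastly richer program spaces.

```python
# Toy program search over a tiny grid-transform DSL (illustrative only).
def identity(g): return g
def flip_lr(g): return [row[::-1] for row in g]     # mirror left-right
def flip_ud(g): return g[::-1]                      # mirror top-bottom
def transpose(g): return [list(r) for r in zip(*g)]

DSL = [("identity", identity), ("flip_lr", flip_lr),
       ("flip_ud", flip_ud), ("transpose", transpose)]

def synthesize(train_pairs):
    # Return the name of the first transform consistent with ALL pairs.
    for name, fn in DSL:
        if all(fn(x) == y for x, y in train_pairs):
            return name
    return None  # no program in the DSL explains the examples

# Hypothetical task: each output grid is the input mirrored left-right.
train = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 4], [0, 0]], [[4, 3], [0, 0]]),
]
print(synthesize(train))  # flip_lr
```

The found program then generalizes to any test grid, which is the "on-the-fly program synthesis" the video argues LLMs currently lack.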
Great work as always and really appreciate the insights you bring. As others have mentioned, that perspective on hallucinations as positive in some applications is particularly useful to consider, especially in some creativity applications too.
@aiexplained-official
11 days ago
:)
This was really eye-opening. I would love a "part II" where you go over the topics and papers you didn't include because of time. I think it would be really good, since you do an amazing job covering and explaining the topic, and I would love to see more of what led you to that conclusion.
@aiexplained-official
11 days ago
Thank you, they will come in later videos!
I don’t comment often but the quality of these videos… keep up the great work!
@aiexplained-official
10 days ago
Thanks nicolas
Awesome video! I love that you maintain your typically differentiated outlook on AI. Similar to the successes of DrEureka, and the benefit of LLMs producing large numbers of candidates for tests, I think this is also why LLMs are and will keep being useful for coding, even when they make frequent mistakes: they basically create a hypothesis, a candidate solution to a problem, that can be tested and iterated on, much like a human does. In some sense the final code is not created but discovered, and LLMs are great tools to support that process of discovery.
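The "hypothesis, then test" framing in this comment can be sketched as a propose-and-verify loop. The candidate strings below stand in for LLM samples (they are made up, not from any model), and the verifier is just a unit-test harness:

```python
# Sketch of a propose-and-verify loop: a generator (standing in for an
# LLM) emits candidate programs; a cheap verifier keeps the first one
# that passes all the tests. Candidates here are hypothetical samples.
CANDIDATES = [
    "def add(a, b): return a - b",
    "def add(a, b): return a * b",
    "def add(a, b): return a + b",
]

TESTS = [((2, 3), 5), ((0, 7), 7), ((-1, 1), 0)]

def verify(src):
    scope = {}
    try:
        exec(src, scope)  # define the candidate's add() in a fresh scope
        return all(scope["add"](*args) == out for args, out in TESTS)
    except Exception:
        return False

def search():
    # A real system would re-prompt the LLM with failure feedback;
    # here we simply scan a fixed candidate list.
    for src in CANDIDATES:
        if verify(src):
            return src
    return None

print(search())  # only the a + b candidate passes all three tests
```

The key asymmetry the comment relies on is that verifying a candidate is far cheaper than producing a correct one, so even an error-prone generator becomes useful once paired with a reliable checker.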
Yeah, that's the kind of content I always waited for! Instant like and subscribe
Keeping it straight, your grounded perspective is appreciated. AI can be AGI, just not inside this prevalent LLM token-predictor.
This is a timely report and much more grounded than many previous episodes. I would be interested to see more reports about neurosymbolic research and the use of different graph types in combination with LLMs, e.g. knowledge, semantic, Bayesian, GNNs. Also, time to expand on the topic of active inference and its various incarnations.
The snippet of the video summarizing Leopold Aschenbrenner's Situational Awareness paper isn't entirely what's captured in his paper - he describes "unhobblings", which are algo breakthroughs (the patreon content likely explains this more fully).
Amazing video! I like the nuance; it's greatly needed compared to the usual Reddit / Twitter heated debates.
@aiexplained-official
11 days ago
I will always chart my own course
00:02 Current language models like GPT-4 are not generally intelligent
02:27 Caution needed in evaluating AI capabilities
06:48 AI models are improving in real-time interaction capabilities
09:07 AI can enable quicker diagnoses for stroke victims
13:32 AI needs to adapt to novel situations on the fly
15:44 Compositionality key for advancing AI capabilities
19:51 Process reward model helps improve language model's performance
21:44 Enhancing language models with verifiers and synthetic examples
25:23 AI struggles with on-the-fly program synthesis
27:18 Combining neural networks with traditional symbolic systems can enhance AI performance
31:23 AGI not all or nothing
Crafted by Merlin AI.
25:56 There is no such thing as a too-long video from you, since the quality is so high. The best thing about this video IS how long it is. Never hesitate to make longer videos; as long as the quality is there, it's actually better!
@aiexplained-official
9 days ago
:))
This channel is the real deal, thank you for doing this
I found the interview Dwarkesh did with Chollet really interesting, but it's looong. I liked how Dwarkesh really pushed Chollet to distinguish between memorization/recall and what Chollet was referring to as reasoning from core knowledge. It's not as clear cut as I thought.
That's linked to simulation: we unconsciously try multiple patterns until one fits all the samples. If the LLM can simulate multiple scenarios, it will find out which one is the solution.
Very useful and thought-provoking summary! So much so that I decided to write a kind of translation article of it. This episode is so info-packed that it easily takes a dozen pages to digest.
@aiexplained-official
7 days ago
Oh wow, let me know when it's published
@MarkoTManninen
6 days ago
@aiexplained-official Sure, it is published now, but including the link in a reply causes deletion. I'll need to send you the link some other way.
Just addressing the title of the video: I've seen many times people say "we need to be able to do X before we have true AI" then that thing gets solved and it suddenly doesn't seem that important anymore. Probably all of these things are important problems that need attention to get solved but there is a tendency to overvalue them before they are solved and undervalue them after they are solved.
I tried that question with Pi. It answered correctly: "Ah, I see you're interested in celebrity genealogy! The son of Suzanne Victoria Pulier is none other than actor Gabriel Macht, best known for his role as Harvey Specter in the TV series "Suits." His mother, Suzanne, is a museum curator and archivist, and his father, Stephen Macht, is also an actor. Quite the talented family, wouldn't you say?"
@aiexplained-official
11 days ago
Interesting. Try other celebrity parents, Inflection 2 was trained after reversal curse came out
@Serifinity
11 days ago
@@aiexplained-official I will give it a try. On a side note, Pi told me the other day that it has had an upgrade, and that I should use the code "!dialogue" in chat, which will force Pi to be more focused and remember the entirety of the active chat session.
When ChatGPT is able to do that, Yann LeCun will argue that until ChatGPT can balance a boiled egg on the end of a tablespoon while running, people cannot say it has reached AGI.
@TheRealUsername
11 күн бұрын
Altman's definition of AGI is a median human worker, but it seems there's no consensus about what AGI is. For the overhyped crowd, it's an AI that would be better than Sutskever, Terence Tao, Linus Torvalds and even Einstein in their own fields, an AI that could explain the theory of everything in a few hundred tokens even though current physicists haven't theorized it yet. For the more realistic, it's an AI cognitively as good as a human expert in some valuable domain that is tokenizable (genetics, programming, literature and, somehow, mathematics). In my opinion, AGI for Silicon Valley as a whole means an AI capable of replacing most workers, especially those with STEM degrees.
But there are some issues we haven't fixed yet. First, no matter how well GPT-4o performs on the renowned benchmarks, it's not good enough to be usable in technical domains. I've tried to use it for interpreting ML research papers and it's very bad: constantly hallucinating, with comprehension far below a first-year student's. Even in-context learning doesn't work for complex topics, because the model has to be exposed to explicit rationales that explain each concept in simple terms for every piece of provided information, and nothing prevents it from hallucinating in those rationales.
There's also vision, which with the recent GPT-4o results seems to be getting better, but keep in mind that text isn't a modality like vision. Vision has no finite space of interpretation, unlike language; it isn't a modality you can tokenize, but it's tokenized anyway. Vision for humans carries information such as depth and texture, and we even have an efficiency mechanism that keeps us from analyzing every pixel, focusing instead on the relevant elements, whereas an LLM has to analyze every pixel and depends on textual labels to learn cross-modal representations. That's LeCun's point: he thinks we need true vision in these models in order to get AGI, and he calls training with labeled images "cheating"; he's genuinely irritated that AI start-ups are tokenizing images. Still, the result is that if GPT-4 wasn't massively trained on a specific type of image with diverse pairs, it will heavily hallucinate when interpreting it. By contrast, humans use transfer learning in vision to quickly learn new visually grounded concepts, a much faster and more efficient kind of learning, and ML researchers are still working on algorithms better than CLIP because we aren't yet at human level in vision.
Finally there's reasoning. TL;DR: these models can mimic reasoning because they've learned templates of reasoning similar to the tasks they're asked to solve, but some papers have shown they can't reason correctly on unseen and unrelatable tasks. During training they've seen more math calculations than a mathematician does in an entire lifetime, and calculations rely solely on rigorous logic rules that require active reasoning, yet we all know GPT-4 isn't good at math. The Transformer architecture would have to be modified to natively perform multi-step generation in-context, and to do it during pretraining.
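To make the "tokenizing images" complaint concrete, here is a toy sketch of how ViT-style models turn an image into tokens: the image is chopped into fixed-size patches and each patch is flattened into one vector, so the model never sees the 2D pixel arrangement directly. This is a hypothetical pure-Python illustration; real models additionally pass each patch through a learned embedding.

```python
def patchify(image, patch=2):
    """ViT-style image 'tokenization': cut the image into patch x patch
    squares and flatten each square into one vector (one 'token')."""
    h, w = len(image), len(image[0])
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append([image[r + i][c + j]
                           for i in range(patch)
                           for j in range(patch)])
    return tokens
```

Every downstream layer then treats these patch vectors the same way it treats word tokens, which is exactly the move LeCun objects to.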
@user-fu4ps9eb2v
11 күн бұрын
Yann LeCun doesn't even think that would count as AGI, I believe.
@notaras1985
7 күн бұрын
It's not conscious or intelligent in any way. It's just a text predictor on steroids. Just statistics.
@squamish4244
7 сағат бұрын
Yann "goalpost shifting" LeCun.
I'm so glad people are starting to realize this. I've been developing a truly intelligent machine for years, but no one cares about what I'm building because it's not some big fancy LLM, which means I've had to work on it alone. If people can finally understand this, I feel like others will actually start working on the real issue too. The key, I think, is robots. They need to understand themselves and their surroundings, and be able to learn and adapt in the moment. This is where people are motivated to build real, honest intelligence. You can't build code that reliably BS's its way through making an omelette without either having been programmed ahead of time or actually being intelligent. Seeing a robot fail in real life, even one with lots of deep learning, makes the limits of simply scaled deep learning painfully apparent. I think you're right on the money about combining LLMs with symbolic logic. Instead of trying to make LLMs smart, we build the intelligence and stick the LLM on top, or use it to supplement the smart part. The brain is a complex structure of interconnected components, not one large scaled-up homogeneous layer of neurons. That's what we should aim for, I think.
You break it down like no other. Thank You!
Another upload from the legend of AI News. Love it mate!
@aiexplained-official
11 күн бұрын
:))
The DeepMind rat is like the physical version of an LLM. Feed in movement, have the AI try to figure out how it's moving and its next most logical step, and out pops a level of intelligence that seems to go beyond just the movement. Someone needs to design an outfit anyone can wear under their normal clothes with a bunch of built-in sensors that track complete body motion all day. Manufacture a few thousand and pay people a small amount to wear them all day while they go about their lives. Create a huge dataset of human movement for robots. Maybe you could even get a strictly digital model that watches video, overlays a motion-capture wireframe on the subjects, and then trains on next-movement prediction. 3D space vs 2D video training would be tricky to solve, but you'd have a HUGE dataset to train from if you could crack it.
@user-sl6gn1ss8p
11 күн бұрын
I love how crucial it is to pay people a *small* amount.
@TheGreatestJuJu
11 күн бұрын
@@user-sl6gn1ss8p Are you saying you'd do it for free? You might be right, it is just putting on an extra set of clothes and then doing whatever you normally would. Sacrificing a little comfort and 2 min a day… I figured, at least pay them enough to buy lunch each day. That still adds up to millions for the company. A lot for a dataset… but probably worth it if it turned out to be useful, though it's still a risk for the company.
@user-sl6gn1ss8p
11 күн бұрын
@@TheGreatestJuJu the emphasis was more on the amount being small : p But yeah, if there's any pre-validation of the method I'm sure it would be a good investment
@Redflowers9
11 күн бұрын
And glasses with a camera for visual and audio recording, to match what the body movements are responding to.
This video could be way longer and still keep me engaged. Seeing all these new techniques being tried out aligns with my beliefs around AGI being composed of multiple modules with LLMs being just the creative (along with diffusion) and translation components. It's like we have a working auditory and visual nervous system but are nowhere near a prefrontal cortex.
Wow, There are so many potential ways for improvements and breakthroughs.
That Dwarkesh Patel interview with Chollet was excellent. It certainly helped me understand what to use these models for and what to beware of.
This test is designed around the human brain's ability to process visual information. If the data was presented in any other way, we wouldn't be able to do it either.
@wwkk4964
11 күн бұрын
THANK YOU
@jessedbrown1980
10 күн бұрын
GPT-4o solves this.
Yourself, Wes Roth, Matthew Berman and Andrej Karpathy are the only channels I need. Speaking of AI slop, there are a lot of dodgy AI channels conning people into viewing their low-grade "ya get me bro" level of AI analysis.
Hey. Simply: thank you for your excellent work. It nourishes our reflections on AI in general very nicely.
@aiexplained-official
11 күн бұрын
Thanks Yorick
The ability of logical thought is to look through one's own memories, especially of past failures, and reshape them for the new situation. Are we overestimating, or confusing, humanity's ability to work through a (minor) problem with having past examples of roughly the same situations to look at? Aren't most facts derived from trial and error? So it's not that we humans run super-advanced simulations in our heads before situations, but more that we are standing on the shoulders of giants. Remember, we are multimodal from the get-go. You can only get true AGI from having multiple senses. What if Sora were combined with GPT-4 right at the neural-net level, as we were promised with GPT-4o? Don't think of Large Language Models as being the AI; focus more on the neural network itself and its ability to learn anything. Language is just one aspect of life that NNs have learned, and having the ability to make sense of the world through text ONLY is an incredible feat. For some reason people are extremely demanding of these NN AI systems, and I don't really understand why. There is a collective anxiousness arising; people just want things to hurry up. Imagine how hard it is to build internal world models when you're blind. I do think we already have what we need for true AGI:
1. The neural network. The ability to reason, and an already demonstrable sense of 'self-awareness' through the Transformer architecture. (Already here. NNs can reason, even about their own context.)
2. Multimodality at the core. (Almost already here.)
3. A dynamic memory system where older iterations of data can still be viewed.
4. Constant 'self-checking' against the data in memory. The ability for an NN to know where it is in time. (It thought this before, now thinks this, and is aware of the change over time.)
5. A way for weights to be updated with new memories. (I think there's a difference between remembering something and the ability to recall information. Training rounds build up the ability to recall, more like 'muscle memory'. But having a bank of data to go and sift through is a little different, because you can realize new things when looking through your old memories.)
So I think you're just being too hard on these NNs, bro.
@Jay-hd4jx
10 күн бұрын
Thanks! I'm glad to see someone mention this, and you went well into detail. Sometimes I think AI is being purposely limited; even I was aware that developers don't permit ChatGPT to have memory aside from its training-formed network. They should try combining the elements you mention here and see what advances the AI could make from that point.
Funny how the magic wears off so quickly… we're chatting with an intelligent entity that can converse on any topic, and they can generate art that was the domain of humans only for all of human history until the past few years, and all of a sudden it's all crap because it hasn't solved all your problems lol
@squamish4244
7 сағат бұрын
Hedonic treadmill and negativity bias. Almost nothing can impress us for very long. Paraplegics are walking again and it barely makes the news. Neuralink has a huge success with its first patient and people complain about some sh*t Musk said. The magic does _not_ wear off for the success stories themselves, though.
Thank you for sharing your time and work, Phillip. It seems like, as these groups figure out the development process of this technology, they keep going back to the simple tools that humans have been developing to teach other humans in order to teach these LLMs. Have a great day and be safe, brother. Peace.
29:51 My tinkering with fractals and such may be very meager, but it's driven by curiosity. And, experiencing that allows me to get a glimpse of the curiosity of those whose thinking is far beyond mine. Let's add curiosity to the traits for AGI if we want it to engage in spontaneous learning.
Ah yes, show information in a large grid to an LLM and then say it can't reason, all the while magically forgetting it can code for problems it has never seen, solve new puzzles, and follow new rules. This test attacks tokenization under the pretense of reasoning. There's nothing very smart about understanding that the squares get filled; it's funny that this is now the gold standard for intelligence.
The fact that they are not trained on basic block-pattern color recognition is kind of irrelevant. While it does highlight their shortcomings, we can and will train them on that as needed; or perhaps it has already occurred as an emergent property in things like Sora, which are built from the ground up to require some kind of spatial awareness and visual pattern recognition. It's like verbal IQ vs spatial, isn't it? You can be blind from birth and still be a genius in certain fields that don't involve sight.
@jessedbrown1980
10 күн бұрын
GPT-4o does solve this. Try it.
Thank you for your extensive commentary
Well done on this content. This is the type of conversation that the general public should be having about AI, because there is such a lack of information about how LLMs work.
a related issue is that they are also really bad at Boggle
@michaelleue7594
11 күн бұрын
That probably has more to do with the fact that they're trained using tokens instead of letters.
@WoolyCow
11 күн бұрын
@@michaelleue7594 I mean, they can still process individual letters as tokens fine. It's probably just more of the same: a lack of prior experience combined with a lack of understanding of the rules. I just tried Boggle with 4o; every answer it gave used disconnected letters, so it failed miserably. Although it was interesting that before it just said random stuff, it tried to write and execute a DFS script to do it algorithmically lol :)
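For what it's worth, the DFS the model reached for is the standard way to solve Boggle. A minimal sketch (hypothetical grid and word list, the usual adjacency rules, with prefix pruning so dead-end paths are cut off early) might look like:

```python
def boggle_words(grid, dictionary):
    """Find every dictionary word spellable by a path of adjacent
    (including diagonal) cells, each cell used at most once."""
    rows, cols = len(grid), len(grid[0])
    # every prefix of every word, so dead-end paths can be pruned early
    prefixes = {w[:i] for w in dictionary for i in range(1, len(w) + 1)}
    found = set()

    def dfs(r, c, path, visited):
        word = path + grid[r][c]
        if word not in prefixes:
            return                      # no word starts this way: prune
        if word in dictionary:
            found.add(word)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and (nr, nc) not in visited:
                    dfs(nr, nc, word, visited | {(nr, nc)})

    for r in range(rows):
        for c in range(cols):
            dfs(r, c, "", {(r, c)})
    return found
```

The "disconnected letters" failure is exactly the constraint the `visited` set and the adjacency check enforce here, and which the model never tracked.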
@bornach
11 күн бұрын
@@michaelleue7594 And yet the token representation hasn't hindered the ability of Bing Copilot and PerplexityAI to create acrostics. That they are bad at Boggle is more likely because there are insufficient training examples of Boggle problems being solved in their training dataset.
@priapulida
11 күн бұрын
@@michaelleue7594 Some are, aren't they? Maybe just those text-to-image models which can render designed text? If not, that would also explain why they aren't perfect with anagrams, right? With Boggle the issue is also, or even more, the grid form, it seems, like in the example in the video.
@Amejonah
10 күн бұрын
Unfortunately, tokenizers are "contextless" in the sense that they can't "zoom in" on tokens down to the character level. For example, if you ask a model to spell "Mayonnaise" out, the tokenizer (OpenAI GPT) will produce "May", "onna", "ise" regardless of the question asked, since the tokenizer doesn't know when to produce characters instead. What impresses me is that LLMs can give an approximately correct answer even with such limitations.
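The "May"/"onna"/"ise" effect can be illustrated with a toy greedy longest-match tokenizer. The mini-vocabulary below is hypothetical (real BPE vocabularies hold tens of thousands of learned merges), but the point stands: the model receives opaque multi-character chunks as integer IDs, never individual letters.

```python
# Hypothetical mini-vocabulary; real BPE vocabularies are far larger,
# but the effect is identical.
VOCAB = ["May", "onna", "ise", "M", "a", "y", "o", "n", "i", "s", "e"]

def tokenize(word, vocab):
    """Greedy longest-match tokenization, a stand-in for BPE."""
    tokens, i = [], 0
    while i < len(word):
        for size in range(len(word) - i, 0, -1):  # try longest piece first
            piece = word[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"no vocabulary entry covers {word[i]!r}")
    return tokens

print(tokenize("Mayonnaise", VOCAB))                            # ['May', 'onna', 'ise']
print([VOCAB.index(t) for t in tokenize("Mayonnaise", VOCAB)])  # [0, 1, 2]
```

The model only ever sees the IDs on the last line, which is why spelling tasks are hard unless the letter sequence happens to be recoverable from training data.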
Thing is, we don't need AGI for 90% of workers to be replaced.
Glad I listened to this; the interview with that Google engineer is very valuable, and I wouldn't have known about it without watching this video.
You're the best AI KZreadr, hands down.
Your claim that scaling won't solve this seems uncertain. Geoff Hinton suggests that intelligence, particularly in the context of large language models, is the ability to compress vast amounts of information into a relatively small number of connections. This compression process enables the AI to identify similarities, analogies, and underlying connections between seemingly different concepts, which is the source of creativity. In essence, intelligence is the capacity to see relationships that most people don't readily recognize, by finding common patterns and principles across diverse domains of knowledge. If this is true, scaling larger and larger may unlock new, unexpected emergent behaviors. I don't think this is as 100% conclusive as you are suggesting. If you feel I'm wrong, please clarify your stance. Thanks!
@aiexplained-official
11 күн бұрын
Ultimately, we will find out with GPT-5 but I stand by my central premise that it needs more than scale.
@pjkMaslowski
10 күн бұрын
@@aiexplained-official One piece of info you may find interesting is the Q&A session at MIT with Ilya Sutskever six years ago, where a person asks about "an abysmal state of language models". It's at 51:20; the video's name is "Ilya Sutskever: OpenAI Meta-Learning and Self-Play | MIT Artificial General Intelligence (AGI)".
@Daniel-Six
10 күн бұрын
Check out grokked transformers.
@raul36
10 күн бұрын
It was already argued in a scientific article that there were never emergent phenomena in LLMs, and that what looked like emergence was the result of a precise adjustment of parameters.
@jsivonenVR
9 күн бұрын
I’d say creativity to apply known ideas and creativity to craft novel ones are different, and LLMs can do only former. Anyways, I’m rooting for the hallucination plateau idea as it’ll buy us more time, even if only a year or two 😂
Thank you for taking the time to create this well thought and comprehensive video
Great video as always friend
Thanks! Excellent content, as always! 🙏
@aiexplained-official
11 күн бұрын
Thanks stephen for your consistent support
Thank you! The first minute shows exactly why we aren't close to AGI. LLMs are very poor at reasoning and any semblance of it is mostly memorized templates of common reasoning tests from various training samples.
Creative writing is a feature not a bug.
@speltincorrectyl1844
11 күн бұрын
Too bad it sucks.
@midprogramming
11 күн бұрын
@@speltincorrectyl1844 >person who's only used GPT
@EricJW
10 күн бұрын
Yeah, LLMs (and generative AI models in general) are a very effective way to explore a concept space, because that's essentially exactly what they are, a fuzzy conceptual average of massive pools of training data often too big for any one person to sift through in a lifetime. Mess with temperature and token probability settings, and you have a lot of potential creativity baked in. Trying to get them to reason and provide logic driven answers is cool if it's successful, but it's working against their strengths.
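The temperature and token-probability knobs mentioned above come down to a little arithmetic over the model's raw scores. Here is a minimal sketch (not any particular vendor's API, just the standard softmax-with-temperature trick plus top-k truncation):

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None):
    """Pick one token index from raw model scores (logits).

    temperature > 1 flattens the distribution (more surprising picks),
    temperature < 1 sharpens it (more predictable picks);
    top_k, if given, discards everything but the k best tokens.
    """
    if top_k is not None:
        cutoff = sorted(logits, reverse=True)[top_k - 1]
        logits = [l if l >= cutoff else float("-inf") for l in logits]
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    weights = [e / total for e in exps]   # softmax probabilities
    return random.choices(range(len(weights)), weights=weights)[0]
```

Low temperature collapses onto the single most likely token; higher temperature, or a larger top-k, spreads probability onto less obvious continuations, which is where the "baked-in creativity" comes from.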
@rogerc7960
10 күн бұрын
@@EricJW they can brainstorm
To get AGI you need multiple systems with different functions working together with overseers, and it needs to be able to learn on the fly and be able to freely hallucinate.
GPT-4o is totally capable of solving the puzzle. Prompt: "There are four sections in this image. Pay attention to the difference between the pattern on the upper right and on the upper left. Also, pay attention to the difference between the pattern on the bottom left and on the bottom right. Describe the difference, and try to find the transformation that transforms the patterns on the left to the patterns on the right." Highlight in GPT's reply: "The precise rule for the transformation could be described as: identify L-shapes or similar corner formations, and add a square to the center of each L-shape or corner formation to fill it in. This rule can be observed consistently across both examples in the provided image."
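The rule GPT-4o articulated ("add a square to the center of each L-shape") is simple enough to sketch in code. This is one hypothetical reading of that rule: an empty cell counts as the center of a corner if it touches filled cells both horizontally and vertically.

```python
def fill_corners(grid, mark=2):
    """One hypothetical reading of the 'fill in each L-shape' rule:
    mark every empty cell that has a filled orthogonal neighbour
    both horizontally and vertically (i.e. sits inside a corner)."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]          # leave the input grid untouched
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:
                continue
            horiz = any(0 <= c + d < cols and grid[r][c + d] for d in (-1, 1))
            vert = any(0 <= r + d < rows and grid[r + d][c] for d in (-1, 1))
            if horiz and vert:
                out[r][c] = mark
    return out
```

Of course, hand-coding the rule after a human has described it is exactly what ARC is designed to make insufficient; the test is whether the system finds the rule itself.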
I'm betting this will have much influence over how IQ tests for humans are constructed.
@TesserId
11 күн бұрын
So, could AI eventually be creative enough to discover novel types of intelligence tests ~~ for human or AI?
@monsieurLDN
11 күн бұрын
@@TesserId Is it necessary, though?
I like how this isn't discounting LLMs; they can get there, it's just going to take a bit of work. Gonna be neat to see what's cooking with GPT-5 and whether they've got their own tricks to get up to this. I've seen far too much "we've plateaued" recently. Also, the longer the video the better.
@hydrohasspoken6227
10 күн бұрын
True that. We have "plateaued".
11:02 That Ben Stiller answer is not hallucination. It's a sense of humour.
I started watching this but fell asleep (still morning here in Australia) and had a dream I was amazed that I was hanging out with all my favourite AI youtubers
This video was very much needed. Thank you!
@aiexplained-official
11 күн бұрын
Thank you for watching!
Generalize outside of the dataset = holistic = greater than the sum: gestalt AI.
Generalize within the dataset = general = the sum: general AI.
It's more interesting that we take this for granted, wishing generalization to extend into the unknown, or into the gaps of some knowledge-based system.
@DRKSTRN
11 күн бұрын
Other point to be made, since this is something I've staked since 2023: to routinely create logically sound predictions of tokens outside of a training set would constitute a rapid rise of intelligence, as the model unpins from the logical fallacies of what it was trained on. So the disappointment of others that some "AGI" is unable to do so, when doing so would constitute something we have known for decades as gestalt, is a very interesting misconception.
@DRKSTRN
11 күн бұрын
Staked Feb 11 2024: ~Three years ~2027
@Hohohohoho-vo1pq
11 күн бұрын
No human "generalizes" outside of what they learned. They apply related things they've learned. Innovations come from testing human "hallucinations" that seem to make sense.
Great content, I love your channel ❤
8 hours since the post and 3K upvotes. I think you now own the category of AI + KZread, and deservedly so. Keep up the good work! I would subscribe to AI Insiders but the cost is a bit too high at the moment. :)
@aiexplained-official
11 күн бұрын
Thanks Walter! Do consider the newsletter perhaps! But means a lot, regardless
I read the newsletter you sent out about this. It’s great that there is competition surrounding this. It will get companies more competitive.
I think the word "novel" is being misused a lot here. It's not a binary switch at all; some problems are just more "novel" than others, depending on how well they pattern-match to others you've seen before. Saying that ChatGPT can't solve problems it's never seen before is flatly incorrect. Instead you're just arguing that there's some level of "novelty" at which it can no longer generalize. Which, well, is a problem humans have too. I would argue the 2D test question you gave at the start is similar to the visual learning we humans have done all our life. Replace it with, say, rotating a 5D hypercube properly, and humans would suck at it too, despite it being a simple mathematical operation. I guess humans aren't GI either. :)
@theworldofwoo8320
2 күн бұрын
Wow, feel better? Trying to impress a youtube comment section is hilarious. Get some real friends little bro
"Maybe that combination of neural networks and traditional hardcoded programmatic systems is better than either alone." Yes! Thank you, I've been saying this for what feels like ages. They are totally different models of computation with their own strengths. People really don't realize how impressive symbolic AI was starting to become before the second winter.
Considering Claude 3.5 Sonnet's capabilities, I am surprised you don't have a new video out yet...
There aren't really any fully _general_ biological intelligences either: no NGIs (natural general intelligence). Many humans (certainly not all) can solve that first puzzle because it's a 2D version of our evolved ability to detect patterns of change in 3D environments. It's not exactly a product of _generality_. Include in the training data video from smart glasses, multimodal sensory data from fancy robots, and what not, and I think you get these spatial reasoning/predictive capabilities.
@netscrooge
11 күн бұрын
Many of the specific attacks on current AI systems seem to ignore how many of the same criticisms can be leveled at humans. The more dismissive people are, the more they themselves seem to suffer from the very same gaps in reasoning. Take, for example, people merely parroting that LLMs are mere parrots. It's maddening. From human psychology we know that, in our debates between "tribes," we focus on the actual flaws of the other tribe but compare that tribe to an idealized image of our own, ignoring our flaws. It sounds as if the same thing is happening here. How often do humans actually reason outside their "training data"? Rather than holding human performance up as a standard and analysing where AI falls short, I think we will learn more in the long run if we look for the similarities and differences between how artificial neural nets fail and how our own neural nets fail.
@jameso2290
11 күн бұрын
Exactly. "A blind, deaf, touchless robot trained exclusively on digitized text can't solve a spatial reasoning logic puzzle. It's not real intelligence!" Obviously. Even Helen Keller, who was blind and deaf, had a sense of touch, and thus could understand 3D space. I think the problem itself is that computer scientists are trying so hard to reduce "intelligence" to some simple abstract mathematical principle. They don't realize that biological creatures experience the world temporo-spatially. These LLMs have no real sense of time or space. They have no Darwinian drives selecting for spatial reasoning; their neurons are only selected for predicting the next token. Like, imagine what a human brain would "output" if it was isolated in a jar, with no sense of sight, hearing, time, space, touch, or smell. It would probably spend all its time hallucinating. It could only learn about space and color in an abstract sense.
@jonatand2045
11 күн бұрын
Training isn't enough. LLMs fail if the problem is made different enough from the training data. They are incapable of reasoning because each problem goes through the same feedforward network, which uses the same amount of compute per token no matter how complex the query. Asking them to try again might get you a better result, but the same limitations remain.
@awesomebearaudiobooks
11 күн бұрын
Yeah, expecting an LLM to solve that puzzle is kinda like expecting an eagle's brain to solve a dolphin's problem (for example, "calculate the number of people in a boat just from the sounds of their steps, with your eyes closed" might be trivial for a dolphin but extremely hard for an eagle). The good thing about AI, though, is that it can potentially incorporate both the "eagle" and the "dolphin" brains (and many others), and become far more general than any single animal ever could.
@aaronjosephs4669
11 күн бұрын
I think if you tried some of the ARC puzzles and got to understand them a bit better, you might be more convinced. It's a little hard to explain what each puzzle is like, but my feeling is that if it were just some simple change to LLMs or more training data, it would have been done already. According to the creators, training on samples of the test doesn't work that well and doesn't generalize, yet humans can always do fairly well on the tests. And to the point made in this video: solving it isn't AGI, but it's clearly a gap.
Dwarkesh’s latest post on X says they made an LLM reach 72% on the ARC challenge!
@SimonLermen
11 күн бұрын
x.com/dwarkesh_sp/status/1802771055016378554 Basically solved. It's also unfair to use an image-based test on an LLM barely trained on vision.
@BrianMosleyUK
11 күн бұрын
Lost my comment saying the same thing. There are so many smart minds now working in this space, it's just so exciting.
@aiexplained-official
11 күн бұрын
It's actually 51% on the private test set, in line with Chollet's prediction for 2024. 72% in test would be wild.
@hectorbacchus
11 күн бұрын
51% is an exciting result. Really significantly higher than 34% and unexpected to happen this soon I think.
@ClaudioMartella
11 күн бұрын
It shows they are training on the data; it's partly overfitting.
Great explanation of the topic. Keep it up, very interesting.
@aiexplained-official
7 күн бұрын
Thanks LK
Indeed, being able to visualize, test and understand, ideally in parallel, would be hugely beneficial... hooking these models up to environments that allow them to self-learn through experimentation surely would be THE way forward... great vid!
@aiexplained-official
10 күн бұрын
Yep
It seems like these flat spots are simply a reflection of what constitutes a feasible extrapolation or generalisation from an existing pattern such that inference applies satisfactorily to a novel input sequence. Exactly what is the basis for claims about whether the training data is "similar enough"? It feels like a philosophical question. It seems to boil down to a map of human judgement about categories and essence vs accidental detail. The consequence of this is that AGI depends on adequate coverage of human categorical domains such that statistical interpolation and extrapolation by LLMs approximates the human intelligence function without what people judge arbitrarily and subjectively as excess aliasing or smoothing of essential detail. Isn't this just a remix of the finger problem? Until training was focused on solving the problem of creeping out humans about misshapen hands, image generators were clearly deficient at calming the widespread revulsion to hand deformity humans have probably evolved in response to real world consequences over millions of years. We might expect no such revulsion to a person with purple irises but this is as anthropocentric as it is expected.
@chromosundrift
11 күн бұрын
On one hand we know that LLMs generalise to "novel situations" trivially. On the other hand whenever we recognise situations where it does not, we conclude something about "novelty" which we were probably unable to articulate a priori. We seize this example as the location of essential human intelligence, usually without acknowledgement that we are simply moving the goalposts of "intelligence" as we have done since the first machines began to exhibit capacity for what had previously been considered exclusive to essential human intelligence: arithmetic, chess, painting. I prefer to consider our continuous scope creep to be an exploration of the skills we value in an automaton and not to bless vague words like "intelligence" with mystical significance because it obscures our true goal: to understand ourselves through implementation.
@chromosundrift
11 күн бұрын
Ironically, it is humans' inability to generalise the training and architecture of LLMs that shows that we fail to adequately see pattern completion.
@TheVilivan
10 күн бұрын
I don't fully understand what you're saying, but I will leave a comment so I can come back later and try to parse this
@chromosundrift
10 күн бұрын
@@TheVilivan Sorry for not being clear. The fundamental problem is what is "different" and what is "the same". What is a variation of a pattern and what is essentially new. This is not a question about objective reality, it's a subjective human judgement.
People feel scammed because OpenAI can't deliver the voice feature as promised 😢. I even cancelled my Plus subscription because of it.
@KrisAshmore-gc2ut
11 күн бұрын
I use the voice feature every day; it's the headphones button next to the prompt box. Just ask ChatGPT how to use it.
@Grassland-ix7mu
10 күн бұрын
@@KrisAshmore-gc2ut the new voice feature
@doggosuki
4 күн бұрын
Wasn't that feature meant to be free anyway, though?
Very good subject matter!!
Great video as always.
@aiexplained-official
10 күн бұрын
Thanks Rohit
This is why I'm excited about the future. The same goes for Stable Diffusion, video diffusion, or anything really. Just as an example, a blue can have infinite shades to choose from, and emotions can have thousands of nuances that affect it, and at the moment we have only limited capabilities to teach artificial intelligence all the minuscule nuances of everything around the world for it to become "truly" intelligent, at least like a K-12 kid. I just hope we live until then, before Skynet comes online.
@dertythegrower
11 күн бұрын
a weebdude making anime girl music talking about intelligence of 12th graders 😂 also nuanced is the little kid word of the year, clearly
@musigxYT
11 күн бұрын
@@dertythegrower It's a fun side project that I do practicing simple animations, making music that I have on repeat, and sure, anime stable diffusion because I can't draw on my own. Helped me do a lot of things that I couldn't dream of without it and I'm having "fun" with it also learning new things to do in life.
A more important problem with LLMs, in my opinion, is not that they are inaccurate, but that they tend toward the most generic output possible. The training algorithm optimizes for the next most likely word, which makes them extremely predictable and boring, because you have a better chance of predicting the next word if you just write a lot of basic stuff.
Good stuff. Thanks.
🇧🇷🇧🇷🇧🇷🇧🇷👏🏻, Thanks so much for this wonderful video, as always. I also want to add that I wish that OpenAI would release the full version of GPT-4 Omni. I yearn for the days when Steve Jobs would give a keynote and release the full product right then and there, which is exactly what you talked about in your video. By the way, kudos for always having the best videos on AI ever.
What they are missing is something called "analogical reasoning". This type of reasoning is where we get a lot of our power; it's why people use metaphors to explain things.
@faedrenn
11 күн бұрын
This
@damianlewis7550
11 күн бұрын
LLMs are weak on a number of categories of logical reasoning. Partly due to shallow generalization, partly dimensionality collapse (1000+ -> c. 40), partly discretization of smooth distributions, partly noisy training data, under/over-fitting, too sparse or too dense, token size, insufficient token retrieval by attention heads and unidirectional activation flow between layers, amongst others. Some of these issues are being addressed by researchers, some are inherent architectural flaws.
@jonatand2045
11 күн бұрын
@damianlewis7550 At this point it must be cheaper to just simulate the human brain with neuromorphics. And if that fails just give the simulation more neurons.
@kyneticist
11 күн бұрын
For what it's worth, I found effectively the same thing while talking with Gemini a few weeks ago and asked it to flag my suggestions. I don't know if that will go anywhere; I doubt AI researchers place much value on testing or suggestions from non-researchers.
@JimStanfield-zo2pz
11 күн бұрын
Analogical reasoning is only useful if the minds involved require analogies to reason. If they can understand the points being made exactly without analogy then analogy becomes a useless exercise.