Q* - Clues to the Puzzle?

Science and technology

Are these some clues to the Q* (Q star) mystery? Featuring barely noticed references, KZread videos, article exclusives and more, I put together a theory about OpenAI’s apparent breakthrough. Join me for the journey and let me know what you think at the end.
www.assemblyai.com/playground
AI Explained Bot: chat.openai.com/g/g-804sC5lJ6...
AI Explained Twitter: / aiexplainedyt
Lukasz Kaiser Videos: • Deep Learning Decade a...
• Lukasz Kaiser (OpenAI)...
Let’s Verify Step by Step: arxiv.org/abs/2305.20050
The Information Exclusive: www.theinformation.com/articl...
Reuters Article: www.reuters.com/technology/sa...
Original Test Time Compute Paper: arxiv.org/pdf/2104.03113.pdf
OpenAI Denial: / 1727472179283919032
DeepMind Music: deepmind.google/discover/blog...
Altman Angelo: / sama
Karpathy: peterjliu/status/...
STaR: arxiv.org/abs/2203.14465
Noam Brown Tweets: polynoamial/statu...
Q Policy: www.analyticsvidhya.com/blog/...
Sutskever Alignment: • Ilya Sutskever - Openi...
/ aiexplained Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/

Comments: 991

  • @aiexplained-official
    @aiexplained-official6 ай бұрын

    My computer crashed 7 times while making this video and I had a hard deadline to get a flight. There is little of my normal editing in here, or captions, just my raw investigation! Do follow the links for more details.

  • @literailly

    @literailly

    6 ай бұрын

    We appreciate your dedication, sir!

  • @JohnVance

    @JohnVance

    6 ай бұрын

    Still the best AI channel on KZread, none of the hype of the other channels. Maybe the news cycle will calm down and you can get some sleep!

  • @patronspatron7681

    @patronspatron7681

    6 ай бұрын

    Bon voyage

  • @thebrownfrog

    @thebrownfrog

    6 ай бұрын

    It's great as always!

  • @alertbri

    @alertbri

    6 ай бұрын

    You did a great job Philip, as always! Much appreciated attention to detail and balance. Exciting times ahead! Have a safe trip. 🙏👍

  • @SaInTDomagos
    @SaInTDomagos6 ай бұрын

    Dude woke up and thought to himself, “How thorough will I be today?” and said: “Yes!” You definitely should get some interviews with those top researchers.

  • @Dannnneh

    @Dannnneh

    6 ай бұрын

    Oooh, that would be interesting!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Stay tuned :)

  • @JustinHalford

    @JustinHalford

    6 ай бұрын

    @@aiexplained-official🔥🫡

  • @daikennett

    @daikennett

    6 ай бұрын

    We'll hold you to this. ;) @@aiexplained-official

  • @DaveShap

    @DaveShap

    6 ай бұрын

    Philip is nothing if not thorough. Dude reads several novels' worth of text per day.

  • @nathanfielding8587
    @nathanfielding85876 ай бұрын

    I'm truly grateful for this channel. Finding accurate news about almost anything is hard as heck, and having accurate AI news is especially important. We can't afford to be misled.

  • @akathelobster1914

    @akathelobster1914

    6 ай бұрын

    He's good, I'm very interested in reading the references.

  • @DevinSloan
    @DevinSloan6 ай бұрын

    Ah, the Q* video I have been waiting for from the only YouTuber I really trust on the subject. Thanks!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Let me know what you think of the theory

  • @AllisterVinris

    @AllisterVinris

    6 ай бұрын

    Same

  • @Elintasokas

    @Elintasokas

    6 ай бұрын

    @@aiexplained-official Rather hypothesis, not theory.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    @@Elintasokas but the evidence came first, so a theory no?

  • @sebby007

    @sebby007

    6 ай бұрын

    My thought exactly

  • @gaborfuisz9516
    @gaborfuisz95166 ай бұрын

    Who else is addicted to this channel

  • @danielbrockman7402

    @danielbrockman7402

    6 ай бұрын

    me

  • @FranXiT

    @FranXiT

    6 ай бұрын

    He is literally me

  • @a.thales7641

    @a.thales7641

    6 ай бұрын

    I am

  • @shaftymaze

    @shaftymaze

    6 ай бұрын

    7 min later. He digs a bit further than I have time to. And yeah, Ilya was on our side (humanity's). Remember that.

  • @ytrew9717

    @ytrew9717

    6 ай бұрын

    who else do you follow? (Please feed me)

  • @dcgamer1027
    @dcgamer10276 ай бұрын

    I'd expect the Q to refer to Q-learning. Human beings think and function by predicting the future and acting upon those predictions, at least at a subconscious level. The way we make these predictions is by simulating our environment and observing what would happen in different variations of that simulation given the different choices we make. We then pick the future we feel is best and take the actions to manifest that future.

    I think a good example might be walking through a messy room with legos everywhere. You observe the environment (the room), identify the hazards (legos), then plan a course through the room of where you can step to be safe (not step on a lego). You imagine that stepping in one spot would mean you are stuck or would step on a lego, so that whole route is bad and you try another. Repeat till you find a solution, or decide there isn't one and just pick some legos up, or give up, or whatever. Of course not everyone does this; some people just walk on through without thought and either accept stepping on legos or regret that they did not stop to think. Those emotional responses of accepting the consequences or regretting them are more akin to reinforcement learning, in my opinion. There are also times when you need to act without thought: if the room was on fire, you might not have the time (or compute) to plan it all out.

    The Q-learning stuff, in the context of these LLMs, seems like it would be their version of simulating the future/environment. It would generate a whole bunch of potential options (futures) and then pick the best one. The difficult task there is creating a program that knows what the best option actually is, but they apparently already have that figured out. My bet is we will need to add in a few different systems of 'thought' that the AI can choose from given different contexts and circumstances; these different methods of decision-making will become tools for the AI to use and deploy, and at that point it will really look like AGI. That's just my guess, and who knows how many tools it will even need.

    Either way, it's cool to see progress; all this stuff is exciting. Now to go look for some mundane job so I can eat and pay off student loans lmao, post-money world come quickly plz XD.
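    (For anyone who hasn't met the Q-learning this comment refers to, here is a minimal tabular sketch of the commenter's "room full of legos" framing. The grid layout, reward values and hyperparameters are made-up illustrations, not anything from OpenAI.)

    ```python
    import random
    from collections import defaultdict

    # Minimal tabular Q-learning on a toy 3x3 "messy room":
    # start at (0,0), goal at (2,2), a "lego" penalty square at (1,1).
    ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    START, GOAL, LEGO = (0, 0), (2, 2), (1, 1)

    def step(state, action):
        dr, dc = ACTIONS[action]
        r, c = max(0, min(2, state[0] + dr)), max(0, min(2, state[1] + dc))
        if (r, c) == GOAL:
            return (r, c), 1.0, True       # reached the far side of the room
        if (r, c) == LEGO:
            return (r, c), -1.0, False     # stepped on a lego: painful, not fatal
        return (r, c), -0.01, False        # small per-step cost encourages short paths

    Q = defaultdict(float)                 # Q[(state, action)] -> estimated return
    alpha, gamma, eps = 0.5, 0.9, 0.2

    def greedy(s):
        best = max(Q[(s, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

    for _ in range(3000):
        s, done, steps = START, False, 0
        while not done and steps < 50:
            a = random.choice(list(ACTIONS)) if random.random() < eps else greedy(s)
            s2, reward, done = step(s, a)
            # Core Q-learning update: nudge Q(s,a) toward reward + gamma * max_a' Q(s',a')
            Q[(s, a)] += alpha * (reward + gamma * max(Q[(s2, a2)] for a2 in ACTIONS) - Q[(s, a)])
            s, steps = s2, steps + 1

    # After training, following argmax_a Q(s, a) routes around the lego on the way to the goal.
    print({s: greedy(s) for s in [(0, 0), (0, 1), (1, 0)]})
    ```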

  • @gregoryallen0001

    @gregoryallen0001

    6 ай бұрын

    normally a long post like this will be trash so THANK YOU for this helpful and engaging response ❤

  • @RichardGrigonis

    @RichardGrigonis

    6 ай бұрын

    Many years ago AI researchers speculated how to represent "thoughts." One approach was to treat them essentially as "mental objects," the other was to resort to possible worlds theory.

  • @GS-tk1hk

    @GS-tk1hk

    6 ай бұрын

    What you described is just reinforcement learning. Q-learning is a specific algorithm for solving the RL objective, and the "Q" refers to the Q-function, which has a specific meaning in RL. It seems likely that Q* refers to the Q-function (and the star generally means "optimal"), but not necessarily the Q-learning algorithm.
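    (For reference, "Q-function" and "optimal" here have standard textbook definitions: the optimal action-value function, conventionally written Q*, is the best achievable expected return from taking action a in state s, and it satisfies the Bellman optimality equation. This is standard RL notation, not anything specific to OpenAI's project.)

    ```latex
    % Optimal action-value function and its Bellman optimality equation
    Q^{*}(s,a) \;=\; \max_{\pi}\, \mathbb{E}_{\pi}\!\left[\sum_{t \ge 0} \gamma^{t} r_{t} \,\middle|\, s_{0}=s,\; a_{0}=a\right],
    \qquad
    Q^{*}(s,a) \;=\; \mathbb{E}\!\left[\, r + \gamma \max_{a'} Q^{*}(s',a') \,\middle|\, s, a \right],
    \qquad
    \pi^{*}(s) \;=\; \arg\max_{a} Q^{*}(s,a).
    ```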

  • @kokopelli314

    @kokopelli314

    6 ай бұрын

    But if you have the whole world in q learning you can just use your intelligence to make money and pay someone to sweep up the room

  • @lucasblanc1295

    @lucasblanc1295

    6 ай бұрын

    Anyone who has played a bit with these LLMs intuitively knows that already. I prompt it all the time with chain-of-thought and other reasoning methods like "Write a truth table to check for errors in our logic." The major issue I always arrive at is that it ends up getting stuck somewhere along its line of reasoning and needs human intervention. This happens exactly because it was never taught how to think and structure its thoughts; that was just a side effect of language.

    I believe that once it's able to reason through mathematical problems with proper proofs, it will be able to generalize to any field thanks to lateral knowledge transfer. So they will just need to keep fine-tuning the model in that direction, effectively creating a feedback loop that improves its capability at reasoning correctly, so that it requires fewer parameters and less compute for the same quality. Add new breakthroughs on top of that, such as a bigger context window, and AGI is just a matter of quantity and quality of the same technique. Just run that thing in a loop, because that's how thinking happens: it's a trial-and-error process. Then fine-tune it at being better at trial-and-error processes, instead of simply giving seemingly useful answers. We were simply being lazy about it by tuning it towards being useful quickly, without caring about how it gets there in the first place.

    It is already AGI, but it's severely misaligned, just like GPT-3 was impressive before chat fine-tuning. Now we are fine-tuning Chat as Q*. It's just a step. After Q*, it will probably be fine-tuned for further generalization, beyond the domain of math/programming. That will be tricky to train: humans don't generate textual content for the sake of thinking through it - perhaps only mathematical proofs get there, and it's extremely time-consuming. Because we make assumptions about the reader's pre-existing intelligence, we convey information through text without ever showing our full thought process. In other words, we are truly starting to fine-tune it to use text for thinking, not simply to generate cute answers to fool humans. This may seem obvious, but I don't think people get this.

  • @pedxing
    @pedxing6 ай бұрын

    THIS was the technical dive I've wanted to find for the last few days. thank you so much for taking the time to dig into the development of these papers and the technologies they represent.

  • @Reece-hf1zx

    @Reece-hf1zx

    6 ай бұрын

    saaaaaahhj

  • @bobtivnan
    @bobtivnan6 ай бұрын

    Wow. Very impressive investigative journalism. No other AI channel does their homework better than you. Well done sir.

  • @caiorondon
    @caiorondon6 ай бұрын

    This channel outpaces ANY other AI news channel on KZread in quality. The way you try your best to keep the hype out and reduce the amount of speculation is really something to be proud of, and really what makes your content so different from other creators'. You, sir, run the only channel on the topic whose every video I am happy to watch (and like). ❤ Cheers from Brazil!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks so much cai

  • @Madlintelf
    @Madlintelf6 ай бұрын

    We all spent the last week watching the soap opera drama and listening to wild ideas, and nobody put it all together in a nice package with a bow on it until you posted this video. It is a theory, but one that is well thought out, has references, and seems extremely logical. Thanks for putting so much work into this; it's not falling on deaf ears, and we truly appreciate you. Thanks, Bill Borgeson

  • @lollerwaffleable

    @lollerwaffleable

    6 ай бұрын

    Who is listening? Remember I just want like a fucking job. From OpenAI specifically.

  • @lollerwaffleable

    @lollerwaffleable

    6 ай бұрын

    When do we announce that I’m the new ceo of open ai

  • @lollerwaffleable

    @lollerwaffleable

    6 ай бұрын

    Lmao

  • @gmmgmmg
    @gmmgmmg6 ай бұрын

    The New York Times or another major newspaper should hire you, seriously. The amount and quality of research and the way you explain and convey AI news and information is truly remarkable. You are currently my favourite yt channel.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks so much gm, too kind

  • @Peteismi
    @Peteismi6 ай бұрын

    Q* as an optimizing search through the action space sounds quite plausible - just like the A* algorithm, which is a more generic optimal pathfinding algorithm.

  • @adfaklsdjf

    @adfaklsdjf

    6 ай бұрын

    ohhh that Q* / A* link is very interesting!

  • @productjoe4069

    @productjoe4069

    6 ай бұрын

    This was my thought too. Possibly using edits of the step-by-step reasoning as the edges, or some more abstract model. You could then weight the edges by using a verifier that only needs to see a bounded context (the original, the edited, and the prompt) to say whether or not the edit is of high quality. It’s sort of like graph-of-thought, but more efficient.

  • @ZeroUm_

    @ZeroUm_

    6 ай бұрын

    A* was my first thought as well, it's such a famous, CompSci graduate level algorithm. (Sagittarius A* is also the name of the Milky Way's central supermassive black hole)

  • @mawungeteye657

    @mawungeteye657

    6 ай бұрын

    Even if it's just speculative it's a decent idea for an actual study. Wish someone would test it.

  • @sensorlock

    @sensorlock

    6 ай бұрын

    I was thinking something along this line too. Is there a way to prune chains of thought, like A* prunes minimax?

  • @grimaffiliations3671
    @grimaffiliations36716 ай бұрын

    This really is the best AI channel around, we're lucky to have you

  • @DomainAspect
    @DomainAspect6 ай бұрын

    This channel and Dave Shapiro are my go to for AI news!

  • @Veileihi

    @Veileihi

    6 ай бұрын

    lmao, I left the same comment on one of Daves videos but in reverse

  • @MarkosMiller15

    @MarkosMiller15

    6 ай бұрын

    I'd add Wes too which I discovered recently but yeah, those 2 really are the main trustworthy non *cryptobro* vibes channels

  • @krishp1104

    @krishp1104

    6 ай бұрын

    I just found Dave Shapiro today but I think he's wayyy too impulsive to sound the AGI alarm

  • @nescirian
    @nescirian6 ай бұрын

    At 17:20 Lukasz Kaiser says multi-modal chain of thought would basically be a simulation of the world. Unpacking this, you can think of our own imaginations as essentially a multi-modal "next experience predictor", which we run forwards as part of planning future actions. We imagine a series of experiences, evaluate the desirability of those experiences, and then make choices to select the path to the desired outcome.

    This description of human planning sounds a lot like Q-learning: modeling the future experience space as a graph of nodes, where the nodes are experiences and the edges are choices, then evaluating paths through that space based on expected reward. An A* algorithm could also be used to navigate the space of experiences and choices, possibly giving rise to the name Q*. But it's been many years since I formally studied abstract pathfinding as a planning method for AI, and as far as I can tell from googling just now over my morning coffee, it seems like the A* algorithm would not be an improvement over the Markov decision process traditionally used to map the state space underlying Q-learning. My extrapolation gets a bit muddy at that point, but maybe there's something there.

    To me, a method that allows AI to choose a path to a preferred future experience would seem a valuable next step in AI development, and a possible match for both the name Q* and the thoughts of a researcher involved with it.

  • @a.s8897
    @a.s88976 ай бұрын

    you are my first source for AI news, you go deep into the details and do not cut corners, like a true teacher

  • @rcnhsuailsnyfiue2
    @rcnhsuailsnyfiue26 ай бұрын

    18:49 I believe Q* is a reference to the "A* search algorithm" in graph theory. Machine learning is fundamentally described by graph theory, and an algorithm like A* (which finds a least-cost path through a graph as efficiently as possible) would make total sense.

  • @bl2575

    @bl2575

    6 ай бұрын

    It was also my thought when I heard the algorithm name. It is basically a cost-minimization algorithm to reach a target node. The difficult part in this context is figuring out what heuristic to use to evaluate whether one step of reasoning is closer to answering the question than another. Maybe that's where the Q-learning policy plays a role.

  • @rioiart
    @rioiart6 ай бұрын

    Hands down best KZread channel for AI news.

  • @apester2
    @apester26 ай бұрын

    I was in two minds about whether to take the Q* thing seriously until you posted about it. Now I accept that it is at least not just sensational hype. Thanks for keeping us up to date!

  • @DavidsKanal
    @DavidsKanal6 ай бұрын

    “You need to give the model the ability to think longer than it has layers” is what really sticks with me; it's such an obvious next step for LLMs, which currently spend a fixed amount of computation per token. Let's see where this leads!
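    (One published way to spend extra compute at inference time is self-consistency: sample several reasoning chains and majority-vote the final answer. A rough sketch follows; sample_chain_of_thought is a hypothetical stand-in for a temperature-sampled LLM call, not a real API.)

    ```python
    from collections import Counter

    def self_consistency_answer(question, sample_chain_of_thought, n=10):
        """Spend more inference-time compute by sampling n reasoning chains
        and majority-voting their final answers (self-consistency style).

        sample_chain_of_thought(question) is assumed to return
        (reasoning_text, final_answer), sampled with temperature > 0.
        """
        answers = [sample_chain_of_thought(question)[1] for _ in range(n)]
        best_answer, votes = Counter(answers).most_common(1)[0]
        return best_answer, votes / n   # answer plus a rough agreement score
    ```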

  • @etunimenisukunimeni1302
    @etunimenisukunimeni13026 ай бұрын

    Amazing work. Thanks for, ahem, pushing back the veil of ignorance 😁 So refreshing to get an informed and non-sensational take on this latest OpenAI X-Files case. It doesn't even matter if your educated guess ends up missing the mark. It's this kind of detective work that is sorely needed in any case, at least before we get some official and/or trustworthy info on this James Bond style "great achievement" called Q*

  • @DaveShap
    @DaveShap6 ай бұрын

    This is way better than breaking AES-192.

  • @zero_given

    @zero_given

    6 ай бұрын

    Loved your video mate!

  • @prolamer7

    @prolamer7

    6 ай бұрын

    You are big person for acknowledging that this video is better than yours!

  • @DaveShap

    @DaveShap

    6 ай бұрын

    @@prolamer7 we're all speculating here and I have a lot of respect for my fellow creators. I view it as all part of a bigger conversation.

  • @prolamer7

    @prolamer7

    6 ай бұрын

    @@DaveShap That said!!! Of many other AI youtubers you are consistently among TOP too!!! I hate to sound too simplistic. Sadly yt comment system is kinda designed to allow only short thoughts and shouts.

  • @garrettmyles6493
    @garrettmyles64936 ай бұрын

    As someone outside the industry, this is such a great resource. Thank you very much for the hard work and keeping us in the loop! I've been waiting for this video since the Reuters article

  • @zandrrlife
    @zandrrlife6 ай бұрын

    I would say he's actually understating the dramatic impact CoT has on multi-modal output. Also, things get wacky when you combine vertical CoT iteratively reflecting on horizontal CoT outputs (the actual outputted tokens). Increasing model inner monologue (computation width) across layers is definitely the wave. This is also why I think synthetic data/hybrid data curation cost will soon match model pretraining cost. Even if you're perturbing existing data, you can lift its salient density to better fit this framework. It's also why I keep saying local models are the way, and why I've been obsessed with increasing representational capacity in smaller models.

  • @MasonPayne
    @MasonPayne6 ай бұрын

    A* is an algorithm mainly used in pathfinding, which works very similarly to what you described as Q. Imagine the idea landscape as a set of information you need to search through to find a path to the answer. That is what I think they mean by Q*.

  • @ddwarful
    @ddwarful6 ай бұрын

    Q* found the fabled NSA AES backdoor.

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx6 ай бұрын

    By far one of the most informative and condensed videos about the essential concepts/building blocks towards creating AGI. Very succinct, great tempo. 👏🏼

  • @adfaklsdjf
    @adfaklsdjf6 ай бұрын

    as always, _whatever happens_ , thank you for your work

  • @darinkishore9606
    @darinkishore96066 ай бұрын

    you’re goated for this one man

  • @colin2utube
    @colin2utube6 ай бұрын

    Game developers will be familiar with the "A*" algorithm, used to find optimal shortest paths between two points on a grid containing obstacles (e.g. a path between the player's location and some target, or between an AI opponent's position and the player's position). I wonder if Q* is some similar shortest-path-finding algorithm between two more abstract nodes in an AI network problem containing some kind of obstruction that has to be navigated around?
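    (For readers who haven't seen it, here is a minimal A* sketch over exactly this kind of obstacle grid. The grid contents and the Manhattan-distance heuristic are illustrative assumptions, not anything tied to Q*.)

    ```python
    import heapq

    def a_star(grid, start, goal):
        """Shortest path on a grid of 0 = free, 1 = obstacle, with 4-connected moves."""
        rows, cols = len(grid), len(grid[0])
        h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # admissible Manhattan heuristic
        frontier = [(h(start), 0, start, [start])]                # entries: (f = g + h, g, node, path)
        best_g = {}
        while frontier:
            f, g, node, path = heapq.heappop(frontier)
            if node == goal:
                return path
            if node in best_g and best_g[node] <= g:
                continue                                          # already reached this cell more cheaply
            best_g[node] = g
            r, c = node
            for nr, nc in [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]:
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                    heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]))
        return None                                               # no route around the obstacles

    grid = [[0, 0, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0],
            [0, 1, 1, 0]]
    print(a_star(grid, (0, 0), (3, 3)))
    ```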

  • @johntiede2428

    @johntiede2428

    6 ай бұрын

    I'd add that mazes can be decomposed into trees, and A* is applied to that. Think Trees of Thought not just Chain of Thought, and applying an A*-like algorithm.

  • @Rawi888
    @Rawi8886 ай бұрын

    I'm laying here depressed beyond all reasoning, hearing you speak about your passions really lift my spirits. Thank you friend.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks Rawi, that's so kind. Now time for you to find and speak on your passions!

  • @Rawi888

    @Rawi888

    6 ай бұрын

    @@aiexplained-official GOTCHA 🫡. You just joined twitter, imma find you and make you proud.

  • @sgstair
    @sgstair6 ай бұрын

    Here's the idea that I had: think of the output of a "Let's verify step by step" prompt as a tree of possible responses, where each step has a wide variety of possible subsequent steps. Then suppose you have a classifier network that decides how relatively good chains of responses are. You could then run an A*-style search over the tree of possible response chains efficiently, only following the most promising ones, and explore an unimaginably huge search space without that much compute.
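    (Purely to illustrate the idea in this comment, not OpenAI's actual method: a best-first search over partial chains of thought might look roughly like the sketch below, where propose_next_steps and verifier_score are hypothetical stand-ins for an LLM and a trained verifier such as the one in "Let's Verify Step by Step".)

    ```python
    import heapq
    from itertools import count

    def search_reasoning(question, propose_next_steps, verifier_score, is_complete, budget=200):
        """Best-first ("A*-flavoured") search over partial reasoning chains.

        Hypothetical stand-ins:
          propose_next_steps(question, chain) -> candidate next steps (an LLM in practice)
          verifier_score(question, chain)     -> higher = more promising (a trained verifier in practice)
          is_complete(chain)                  -> True once the chain ends in a final answer
        """
        tie = count()                                   # tie-breaker so chains themselves are never compared
        frontier = [(-verifier_score(question, []), next(tie), [])]
        expansions = 0
        while frontier and expansions < budget:
            neg_score, _, chain = heapq.heappop(frontier)
            if is_complete(chain):
                return chain                            # best-scoring completed chain found so far
            expansions += 1
            for step in propose_next_steps(question, chain):
                new_chain = chain + [step]
                heapq.heappush(frontier, (-verifier_score(question, new_chain), next(tie), new_chain))
        return None
    ```

    The compute saving comes from only ever expanding the highest-scoring partial chains, rather than sampling every continuation to completion.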

  • @TheLegendaryHacker
    @TheLegendaryHacker6 ай бұрын

    Damn, to me this feels like the discovery of nuclear chain reactions. It's not quite there yet, but you can see the faint glimmer of something world changing to come. Especially that "general self-improvement" stuff... GPT-5 is gonna be wild.

  • @KP-sg9fm
    @KP-sg9fm6 ай бұрын

    Would love to see you do interviews with lesser known but key figures in the industry, you would have such good questions.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Coming

  • @drawnhere

    @drawnhere

    6 ай бұрын

    Excellent 👌

  • @lollerwaffleable

    @lollerwaffleable

    6 ай бұрын

    Lol

  • @uraszz
    @uraszz6 ай бұрын

    I've been seeing news about Q* for a day or two but refused to watch anything before you uploaded. I trust you with anything AI. Thank you!!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    I might be wrong, but I gathered quite a bit of evidence for you to evaluate!

  • @stcredzero
    @stcredzero6 ай бұрын

    This makes me want to produce a generative AI comic called, "The Verifier." It would be about a verifier AGI fighting a David versus Goliath guerilla war against a malevolent superoptimizer, using its ability to poke holes in the answers of a much larger model to save humanity. EDIT: The tactic of doing lots of iterations, then rewarding on the raw probability of winning -- This smells a lot like evolution by natural selection. It's a brutally simple emergent fitness function!

  • @lollerwaffleable

    @lollerwaffleable

    6 ай бұрын

    Heartwarming, thanks whoever you are

  • @omaviquadir

    @omaviquadir

    6 ай бұрын

    Read blame! for inspo.

  • @KalebPeters99

    @KalebPeters99

    6 ай бұрын

    Wow, amazing... 💕

  • @spaceadv6060
    @spaceadv60606 ай бұрын

    Still the highest quality AI channel on KZread. Thanks again!

  • @user-hk8jt6so3l
    @user-hk8jt6so3l6 ай бұрын

    YOU ARE THE BEST! I am so happy to have found you back at the beginning of the AI "craze", and words cannot describe how grateful your other viewers and I are for such high-quality content! I believe your work will play a huge role in humanity's future! edit: grammar

  • @nathanbanks2354
    @nathanbanks23546 ай бұрын

    GPT-4 is already using "let's verify step by step". I've often asked it to program or refactor something, and the first thing it does is come up with an English list of what it's about to do. This list then becomes part of the tokens it uses to generate the following tokens as it actually writes the program. It's like it changes my query into an easier query. It wasn't doing this when I signed up in April.

  • @thearchitect5405

    @thearchitect5405

    6 ай бұрын

    It does it on small scales, but not quite on the same scale as in the paper. Otherwise you'd be getting 30 line responses to basic questions. It also doesn't verify on a step by step basis.

  • @nathanbanks2354

    @nathanbanks2354

    6 ай бұрын

    @@thearchitect5405 I meant that they're using the techniques suggested from some papers earlier this year which suggested to use "think step-by-step" as part of the query to an LLM. It was a prompt-engineering technique. This was one of several techniques which substantially improved accuracy for answering exam questions. It could definitely be improved and I didn't read this particular paper, so I'm sure you're right about the scale being larger.

  • @adfaklsdjf

    @adfaklsdjf

    6 ай бұрын

    @@nathanbanks2354 have you set any custom instructions, by chance? ;)

  • @homelessrobot

    @homelessrobot

    6 ай бұрын

    @@thearchitect5405 maybe there is a threshold of complexity or something, but a couple of weeks ago I did an open-book calculus course with GPT-4. It was generating step-by-step answers so large that it would stop and ask me if I wanted it to continue - well over 30 lines each. These answers took several minutes each to generate in full. It also passed that course with flying colors.

  • @mattpattison4227
    @mattpattison42276 ай бұрын

    Great video! The only thing you forgot to do is ask ChatGPT what project Q-Star is. It knows more than you might think (without needing to search the web):

    Me: What is the difference between the GPT models (GPT-3, GPT-4, GPT-5) and Project Q-STAR?

    ChatGPT: The GPT (Generative Pretrained Transformer) models, such as GPT-3, GPT-4, and the hypothetical GPT-5, and Project Q-Star represent different approaches and objectives within the field of artificial intelligence.

    GPT Models (GPT-3, GPT-4, GPT-5)
    - Type of AI: The GPT models are examples of large language models (LLMs). They are designed primarily for natural language processing tasks.
    - Training and Functionality: These models are trained on vast amounts of text data and use this training to generate text, answer questions, translate languages, and perform a variety of other language-related tasks.
    - Capabilities: Each successive version of the GPT series typically represents an increase in the size of the model (number of parameters), training data, and consequently, capabilities in language understanding and generation. For example, GPT-4 is more advanced than GPT-3 in terms of its understanding of context, subtlety of language, and ability to provide detailed and nuanced responses.
    - Application Focus: The GPT models are focused on applications involving language, such as content creation, customer service automation, language translation, and more.

    Project Q-Star
    - Type of AI: Project Q-Star (Q*) is aimed at developing Artificial General Intelligence (AGI).
    - Objective: The goal of Q* is to create an AI system that can perform a wide range of tasks at or beyond human level, not just language processing. This includes the ability to understand, learn, and apply intelligence to a variety of problems, similar to how a human would.
    - Capabilities: Q* is reported to have capabilities in mathematical reasoning, a skill traditionally considered exclusive to human intelligence. Unlike the GPT models that excel in language tasks, Q* aims to generalize across a broader spectrum of cognitive abilities, including reasoning and problem-solving in various domains.
    - Application Focus: While less specific information might be available about Q*'s applications due to its nature as a general intelligence project, it potentially includes a wide array of tasks, from scientific research to complex problem-solving across various disciplines.

    Key Differences
    - Specialization vs. Generalization: GPT models are specialized in language-related tasks, whereas Project Q-Star aims for a more generalized cognitive ability.
    - Approach to Intelligence: GPT models replicate aspects of human intelligence related to language, while Q* aspires to a broader replication of human cognitive abilities.
    - Scope of Capabilities: GPT models, despite being advanced in their domain, are still considered narrow AI, focused on specific tasks. Q*, on the other hand, represents an attempt to create a more versatile and broadly capable AI system.

    In conclusion, while the GPT models are groundbreaking in the field of natural language processing, Project Q-Star represents an ambitious step towards the development of AGI, with a focus on a wider range of cognitive abilities beyond language.

  • @Lvxurie
    @Lvxurie6 ай бұрын

    Listening to the guy talk about AlphaGo reminds me of how human development occurs. An early stage of learning is the actor stage, where kids copy what the people around them do to try and figure out the correct way to act, often also copying poor behaviours. The next stage is called the motivated agent. To be an agent is to act with direction and purpose, to move forward into the future in pursuit of self-chosen and valued goals. Since AI is essentially trying to recreate human thinking, I wonder if creating AI models that follow the development of humans is the best way to get to AGI.

  • @honkhonk8009

    @honkhonk8009

    6 ай бұрын

    Lol, I'll apply that to my math courses. I'm having trouble with proofs. Right now all I can do is copy what other people have written and regurgitate it. But hopefully with enough practice I can get into the "motivated agent" phase like you suggest, I guess lmfao.

  • @Neomadra
    @Neomadra6 ай бұрын

    It's just incredible how you connect all these dots in such a short amount of time. Even if Q* turns out to be a mirage, at least I learned something about promising research directions :)

  • @xXWillyxWonkaXx

    @xXWillyxWonkaXx

    6 ай бұрын

    If I'm understanding this correctly, it's: test-time computation, chain of thought (CoT), "Let's Verify Step by Step", and self-taught reasoning (STaR).

  • @Gonko100
    @Gonko1006 ай бұрын

    By far the best channel regarding this topic. Like, it's not even close.

  • @ShadyRonin
    @ShadyRonin6 ай бұрын

    Love the longer video format! Amazing as usual

  • @FranXiT
    @FranXiT6 ай бұрын

    I was just thinking about how much I wanted a new video from you :3 thank you.

  • @agenticmark
    @agenticmark6 ай бұрын

    This is the basis of a Monte Carlo search, or even a genetic algorithm: you are simulating many worlds and selecting the world that best fits the needed model. By the way, this is great work - the research you did, the papers you referenced, and the video in general! Love it.

  • @Rawi888
    @Rawi8886 ай бұрын

    You, Matt Wolfe, Wes and David Shapiro are my only trusted sources. Especially you. I reaaaaally love and appreciate all the work you do.

  • @jimg8296
    @jimg82966 ай бұрын

    Great research, thank you very much. I appreciate how you have pulled together a vast amount of data into an understandable video. It would take me months to get close to this understanding of Q*; now it took just half an hour thanks to your research and video editing. RESPECT!

  • @Stephen_Lafferty
    @Stephen_Lafferty6 ай бұрын

    I can barely believe that it has been just seven days since Sam Altman was fired by OpenAI. What an American Thanksgiving it was for Sam to return to OpenAI. Thank you for your insightful analysis, as always!

  • @tlskillman
    @tlskillman6 ай бұрын

    Great job. A real service to us all. Thank you.

  • @MrSchweppes
    @MrSchweppes6 ай бұрын

    Oh, I've been waiting for your video since the Q Star news. Great dive. Thanks a lot for making this video! 👍

  • @zaid6527
    @zaid65276 ай бұрын

    Just came across your AI channel and found it to be one of the best AI channels you can find on YouTube. I also like the intuition part where you talked about Let's Verify. Amazing video, keep up the good work 👍

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks zaid

  • @ryanhm1004
    @ryanhm10046 ай бұрын

    This reminds me of the movie "Arrival", where it was very difficult to communicate with the aliens because you had to explain what an adjective is and what a noun is. It would be easier to communicate with robots through mathematics than language (as Karpathy said), because you could simply reward them by giving them functions to solve and evolving this ability to reason. In the end, as Aristotelian logic suggests, language is mathematics too.

  • @Mr_Duck_RVA

    @Mr_Duck_RVA

    6 ай бұрын

    I just watched that movie for first time the other night

  • @electron6825

    @electron6825

    6 ай бұрын

    ​@@Mr_Duck_RVAWhat did you think about it?

  • @jtjames79
    @jtjames796 ай бұрын

    Q* make me a design for a cold fusion powered jetpack, please. 😎👍

  • @bobbyjonas2323

    @bobbyjonas2323

    6 ай бұрын

    XDC 😎👍

  • @Dojimanoryyu
    @Dojimanoryyu6 ай бұрын

    So basically, if we give the LLM enough inference time, it generates a million possibilities through its creative reasoning steps, then hands them to an LLM verifier to check which steps are correct; the possibilities built from the correct steps indicate the best result/option. With this approach it is possible for the LLM to predict the next invention, something original and not bound by its training data, just like AlphaGo surpassed its human training data. The data it gets from the process can then be used to further train the model. Unlimited self-improvement.

  • @xwkya
    @xwkya6 ай бұрын

    This channel is a blessing. I have been navigating news the past week, but this is the place that I feel gives the most accurate information and informed speculations.

  • @xwkya

    @xwkya

    6 ай бұрын

    And the theory of Q* applying Q-learning to decoding is very interesting. Thinking of a "GPT Zero", I have wondered whether algorithms used in AlphaZero, such as MCTS (using GPT as a policy function), have been tested for decoding purposes; this also fits the idea of increasing inference cost. I hope you will continue to share your knowledge and investigations.

  • @6GaliX
    @6GaliX6 ай бұрын

    The name Q* might just be an homage to the A* pathfinding method, and therefore a special way of creating chains of thought, while the "Q" = Q-learning, a common reinforcement learning method in machine learning.

  • @andrew.nicholson
    @andrew.nicholson6 ай бұрын

    20:45 The idea of training a model on its own output makes me think about our own brains and how dreaming and sleep are critical to our ability to learn. Sleep is when we take what has happened during the day - our success and failures - and integrate them into our long term memory.

  • @adfaklsdjf

    @adfaklsdjf

    6 ай бұрын

    we also loop over our thoughts while we're thinking about a problem... we come up with an idea and then reconsider it, poke holes, test it out in various ways, compare it to other ideas. a neural net's inputs traverse the network once and become outputs.. without loops it's like it's only given one shot to "think about" something before giving its answer.. the sleeping/dreaming/integration analogy is interesting.

  • @homelessrobot

    @homelessrobot

    6 ай бұрын

    Or the concept of active recall: you read a little, then you answer questions about what you learned or summarize it, and receive feedback on that. It's important to note that the output it's training on isn't just raw output - it's not a closed loop. There is a second model 'grading' the answers, so there is external feedback involved.

  • @gobl-analienabductedbyhuma5387
    @gobl-analienabductedbyhuma53876 ай бұрын

    Such deep research! Man, you're just always way ahead of everyone else with your work. Thank you!

  • @guilleru2365
    @guilleru23656 ай бұрын

    Things are going so fast that it's hard to imagine what it will look like in just a few more weeks. Amazing work!

  • @jeff__w
    @jeff__w6 ай бұрын

    25:45 “I think the development is likely a big step forward for narrow domains like mathematics, but is in no way yet a solution for AGI; the world is still a bit too complex for this to work yet.” That's a really important qualification - we're not _yet_ on the verge of our glorious/terrifying AGI future - and that, I think, undercuts the (to me, much over-hyped) theory that some AI “breakthrough” was what spooked the board into ousting Sam Altman. Some old-fashioned power play or interpersonal conflict seems a lot more likely to me (although an AI breakthrough might have exacerbated the already-existing tensions). And that Q* is a reference to “the optimal Q-function” (18:44) seems entirely plausible; it's just what you'd expect from the AI researchers at OpenAI.

  • @krishp1104
    @krishp11046 ай бұрын

    I've been checking your channel impuslively waiting for this video

  • @Datalata
    @Datalata6 ай бұрын

    This is the information I’ve been looking for. Thanks for doing the heavy lifting on the research that we’ve all needed on this topic!

  • @Shulyaka
    @Shulyaka6 ай бұрын

    If step-by-step verification creates a smarter network, we can use it for reinforcement training (see the sketch below):
    1. Use the step-by-step network of the current level as a verifier (i.e. a fitness function) for training a new network.
    2. The new network is better trained than the previous one; make it the current one.
    3. Repeat.
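    (A rough sketch of that loop, in the spirit of STaR-style self-improvement rather than any claim about what Q* actually does; model.sample, verifier.accepts and finetune are hypothetical stand-ins for an LLM, a step-by-step verifier, and a training routine.)

    ```python
    def self_improvement_loop(model, verifier, finetune, problems, rounds=3, k=16):
        """Generate solutions, keep the verifier-approved ones, fine-tune, repeat.

        Hypothetical stand-ins:
          model.sample(problem)               -> one candidate solution with reasoning steps
          verifier.accepts(problem, solution) -> True if every reasoning step checks out
          finetune(model, examples)           -> a newly trained model
        """
        for _ in range(rounds):
            accepted = []
            for problem in problems:
                candidates = [model.sample(problem) for _ in range(k)]
                accepted += [(problem, sol) for sol in candidates
                             if verifier.accepts(problem, sol)]
            model = finetune(model, accepted)   # step 2: the better network becomes the current one
            # (optionally the verifier is retrained as well, so both sides keep improving)
        return model
    ```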

  • @supremebeme
    @supremebeme6 ай бұрын

    AGI happening sooner than we think?

  • @SaInTDomagos

    @SaInTDomagos

    6 ай бұрын

    That’s the power of exponential functions.

  • @middle-agedmacdonald2965
    @middle-agedmacdonald29656 ай бұрын

    Thanks! Very down to earth, and well thought out.

  • @DiscoTuna
    @DiscoTuna6 ай бұрын

    Wow - what a detailed line of thought and extensive amount of research you have gone through to produce this vid. Thanks

  • @SamGirgenti
    @SamGirgenti6 ай бұрын

    You and Wes are the best AI presenters on youtube in my opinion. Thanks for taking the time to teach. :)

  • @Y3llowMustang
    @Y3llowMustang6 ай бұрын

    I've been refreshing waiting for this video from you

  • @Chris-se3nc
    @Chris-se3nc6 ай бұрын

    Obviously they have developed Q from Star Trek. Q is initially presented as a cosmic force judging humanity to see if it is becoming a threat to the universe, but as the series progresses, his role morphs more into one of a teacher to Picard and the human race generally - albeit often in seemingly destructive or disruptive ways, subject to his own will

  • @giantneuralnetwork
    @giantneuralnetwork6 ай бұрын

    This was super interesting! I was just thinking that programming provides a nearly unlimited dataset for LLMs, since the process would be to write some code, actually run it, and verify the output. So I would expect LLMs to improve greatly in their programming abilities using these new techniques of writing many programs and picking the best ones at test time. One could even take existing code, run it to get the output, then clear out portions of it, present that and the desired output to the LLM, have the LLM fill in the missing code, run it, and verify the outputs match exactly.

    On a side note about using existing models to get better answers, adversarial agents could be interesting too: pitting different LLMs against each other as they solve a problem, creating a more discussion-like reasoning process.
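    (A toy version of that write-run-verify idea, using only the standard library; generate_candidates is a hypothetical stand-in for an LLM, and in practice generated code would need sandboxing before being executed.)

    ```python
    import os
    import subprocess
    import sys
    import tempfile

    def passes(program_src: str, expected_output: str, timeout: int = 5) -> bool:
        """Run a candidate program in a subprocess and compare its stdout to the target."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(program_src)
            path = f.name
        try:
            # NOTE: real systems must sandbox untrusted generated code before running it.
            result = subprocess.run([sys.executable, path], capture_output=True,
                                    text=True, timeout=timeout)
            return result.returncode == 0 and result.stdout.strip() == expected_output.strip()
        except subprocess.TimeoutExpired:
            return False
        finally:
            os.unlink(path)

    def best_of_n(task, expected_output, generate_candidates, n=20):
        """Sample n candidate programs and keep only those whose output actually matches."""
        survivors = [p for p in (generate_candidates(task) for _ in range(n))
                     if passes(p, expected_output)]
        return survivors  # in the comment's scheme, these would become new training data
    ```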

  • @holographicman
    @holographicman6 ай бұрын

    Hands down the best AI update channel; I just remove any suggested channels popping up at this point. Oh, and as a musician and synth developer, that last demo is cool. I can imagine a synthesizer or DAW in the future where humans can interact in super creative ways. Love it. ❤

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Nice!

  • @KyriosHeptagrammaton
    @KyriosHeptagrammaton6 ай бұрын

    I remember back when they had AIs learning to play mario and it was super slow to get generically good, and then they encouraged it to get a high score instead of reaching the end goal, or something like that, and suddenly it was learning way faster and much better at arbitrary levels.

  • @aspuzling
    @aspuzling6 ай бұрын

    Thank you for not just spouting the "have OpenAI reached AGI?" hyperbole. This is really interesting research.

  • @skier340
    @skier3406 ай бұрын

    Fantastic breakdown. Really doing your homework to get us some real concrete possibilities of what's actually happening with the architecture of Q* when everything else just seems wild speculation.

  • @2023gainer
    @2023gainer6 ай бұрын

    AI surging in 2023...The future is here.? * SoundHound...* SOUN climbing 38 % month. All AI sector stocks surpassing earnings estimates in 3rd qt. Stronger revenue and positive news. Global investors filling the Dips daily. Thumbs UP Video.

  • @randomuser5237
    @randomuser52376 ай бұрын

    This actually makes me even less hopeful about open-source AI. It's quite clear that most of the people who can make new breakthroughs are working at these companies and will not publish their research. It also throws out the idea that better models are only about data and compute. Open source will keep falling further behind every day unless governments step up and provide the financial incentives for national labs to attract top researchers and publish open-source models.

  • @prolamer7

    @prolamer7

    6 ай бұрын

    You are right, there is only really a handful of really smart people in open source working for "free", unlike in companies where you are paid millions. BUT once there is a model as smart as GPT-4 for everyone to use, it will help even the small guys create novel and good models.

  • @antoniopaulodamiance
    @antoniopaulodamiance6 ай бұрын

    Best channel. The amount of time this dude spends reading and following all the noise to get to a high-quality 15-minute video is fantastic.

  • @HenriKoppen
    @HenriKoppen6 ай бұрын

    Whenever I have a discussion about any topic and someone makes a claim, I ask, "Please help me understand your conclusion - can you walk me there step by step?" This is so powerful, because when a claim is heavily biased, that will emerge from the step-by-step process. It has really made me stronger in discussions and in sharing my own step-by-step reasoning. All truth comes from the details... This video is really inspiring, smart, in the right tone, and well explained. Thank you for spending the time to do this right!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks so much Henri

  • @kombinatsiya6000
    @kombinatsiya60006 ай бұрын

    This is the channel i return to over and over again to make sense of the latest AI research.

  • @QuarkTwain
    @QuarkTwain6 ай бұрын

    As if things weren't trending enough towards the conspiratorial, now they have their own "Q". Feel the AGI!

  • @adfaklsdjf

    @adfaklsdjf

    6 ай бұрын

    I feel it.

  • @tomaszkarwik6357
    @tomaszkarwik63576 ай бұрын

    3:44 - hey, another Polish note. "Łukasz" is indeed the Polish version of Lucas, but if you want to be 100% correct, the transliteration into British English would be something like "wukash". Edit: I forgot to "cite" my sources - I am a native Polish speaker.

  • @JustinHalford
    @JustinHalford6 ай бұрын

    I was waiting for this one! Absolutely riveting. Our collective progress is quickly being rendered compute bound.

  • @JohnSmith762A11B

    @JohnSmith762A11B

    6 ай бұрын

    This is perhaps why Sam has been running around trying to get new chip fabs built. Nvidia is simply not enough when infinite computing power is best. This in fact has always been a primary doom scenario: that an AGI/ASI becomes addicted to getting smarter and reformats the entire cosmos into one gigantic mind.

  • @En1Gm4A
    @En1Gm4A6 ай бұрын

    Now I can sleep fine again - you uncovered what's behind the scenes. It's even aligned with what I thought might be the key to more capabilities. THANK YOU!!!!

  • @nomadv7860
    @nomadv78606 ай бұрын

    Amazing video. I appreciate your investigation into this

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks nomad

  • @felipoto
    @felipoto6 ай бұрын

    New Ai Explained video letss goooooo

  • @Robert_McGarry_Poems
    @Robert_McGarry_Poems6 ай бұрын

    20:00 I think you pretty much nailed it. This sounds pretty amazing. In all honesty this should be how the core models are trained. In my opinion, this type of processing would make alignment super easy. In the sense that you could have multiple "observers" all with their own _obviously programmed in bias_ as a second layer, that then would be filtered by a third layer, which is the true autonomous "observer."

  • @jacorachan
    @jacorachan6 ай бұрын

    Great video as usual. Please keep on making them! You provide a thoughtful vision of current state of AI and I really appreciate the way that you elaborate your ideas with what you read or listen in videos. Again, fantastic work 👏

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks jaco!

  • @nanow1990
    @nanow19906 ай бұрын

    Let's break this down step by step.

  • @vnehru1
    @vnehru16 ай бұрын

    This is one of the only channels on AI - or virtually anything - that is no fluff, all good information. I never write comments, but I can't help but commend the high quality.

  • @attilaszekeres7435

    @attilaszekeres7435

    6 ай бұрын

    I too rarely write comments, but I was compelled to say that I skip over comments like yours. Flattery adds nothing to the conversation and only makes it harder to find valuable information. Hopefully my little feedback contributes toward seeing less ass licking and more juice.

  • @brycefegley

    @brycefegley

    6 ай бұрын

    I smashed the subscribe button

  • @michaelwoodby5261
    @michaelwoodby52616 ай бұрын

    20:43 is the most succinct description of Q* I have heard yet.

  • @MrBorndd
    @MrBorndd6 ай бұрын

    This channel provides the best, most well-researched, cutting-edge information about AI development available, while the competition just repeats each other and doesn't offer much more than what we already got from Reuters. Excellent journalism!

  • @beowulf2772
    @beowulf27726 ай бұрын

    A 6B model with that much capability 💀

  • @the_prophet_of_N17
    @the_prophet_of_N176 ай бұрын

    The custom GPT is good, man.

    AI Explained Bot
    User: What is the most exciting development in AI up to now?
    GPT: The most exciting development in AI up to now, based on my knowledge, revolves around significant advancements in AI models, specifically in areas such as reasoning, language processing, and problem-solving capabilities. One key breakthrough has been the progress made in optimizing existing AI models to improve their reasoning abilities. This includes work on language models that can solve complex tasks involving reasoning, like mathematics or science problems.

    A critical development has been the approach of using verifiers in AI models. This technique involves training a separate model to identify correct solutions or reasoning steps, dramatically enhancing the performance of the base language models. This method has proven to be more effective than simply increasing model size, offering performance boosts equivalent to significant model size increases.

    These advancements are not just theoretical but have practical implications in various fields, including mathematics, chemistry, physics, and potentially even more general problem-solving scenarios. Such developments represent a significant step towards more sophisticated and capable AI systems.

  • @WilliamsDarkoh
    @WilliamsDarkoh6 ай бұрын

    Congrats on the 200 k, my predictions were on point!

  • @williamjmccartan8879
    @williamjmccartan88796 ай бұрын

    Thank you, Phillip - glad to see you've taken the dive on X. Thank you again; teaching these lessons is really important to a lot of us who don't have your skills and experience in researching all of this material and are educated through that process. Peace

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks Bill. Peace

  • @jumpstar9000
    @jumpstar90006 ай бұрын

    Maybe it is a search for Quality/model refinement of the weights using something similar to the A* algorithm. Pure speculation of course. Very interesting stuff. Thanks for the insights and commentary Philip.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks Jumpstar!

  • @jumpstar9000

    @jumpstar9000

    6 ай бұрын

    @@aiexplained-official I was thinking. Maybe it's a goal seeking strategy that's better than simple CoT. That would make a lot of sense.

  • @jumpstar9000

    @jumpstar9000

    6 ай бұрын

    @@aiexplained-official You know what else I was thinking. If Ilya is running the superalignment team, and super means superintelligence, doesn't that kind of imply that AGI is already done if they are on to superintelligence. Unless they are just trying to get ahead of the game a bit of course. but it is difficult to guess what an ASI would even be like.

  • @TheRealistMus
    @TheRealistMus6 ай бұрын

    First. Always a pleasure when AI explained uploads

  • @jerkevandenbraak

    @jerkevandenbraak

    6 ай бұрын

    Best AI channel there is (or that I know of)

  • @kugelblitz777
    @kugelblitz7776 ай бұрын

    Activate ads! I would voluntarily watch several of them to support this channel. This is magnificent work.

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Will let you know another way to support soon!

  • @memegazer
    @memegazer6 ай бұрын

    Thanks!

  • @aiexplained-official

    @aiexplained-official

    6 ай бұрын

    Thanks memegazer!!

  • @memegazer

    @memegazer

    6 ай бұрын

    Really glad you dug into this to offer some new insight...it is really fascinating
