Machine Learning Street Talk

MLST is the top AI podcast on Spotify. Subscribe now! Welcome! We bring you the latest in advanced AI research from the best AI experts in the world. Our approach is unrivalled in scope and rigour: we believe in intellectual diversity in AI, often going deeper into the underlying philosophy, and we touch on all of the main ideas in the field.

Support us on Patreon for early access, exclusive content, private Discord, monthly calls and much more!
www.patreon.com/mlst

Donate here: www.paypal.com/donate/?hosted_button_id=K2TYRVPBGXVNA

WE ARE LOOKING FOR SPONSORS!
mlstreettalk at gmail.com

IS THE MIND REALLY FLAT?

Comments

  • @carlhopkinson · 15 hours ago

    Pipe dreams.

  • @Unmannedair · 15 hours ago

    Large language models are like a low resolution picture that captures a slice of a 3d light field. As the large language model gets larger, you get more pixels in your image, and you get a better representation of that slice... But it's still just a projection of that intelligence. In order for it to become actual intelligence it has to gain an extra dimension of information processing. Just scaling it up will not change the dimensionality.

  • @JohnDlugosz · 15 hours ago

    "an LLM can _simulate_ intelligence" yea, like a forklift can simulate strength.

  • @gravity7766 · 16 hours ago

    Super interesting discussion and I'd love to hear a part II. In particular, as somebody who spent years reading the French post-structuralists on language and speech, this presents a view of LLMs as generating language in a fashion that is completely orthogonal to the use of speech and language by humans in producing meaning. Control in speech or language by humans is impossible - that is, you can't use language to control another human. You can at best utter a sentence, phrase, statement, or proposition (etc.) with which the other human agrees (agreement being understanding what is said, and agreeing with the claim made - those are distinct). So the idea of trying to design a control regime or approach is a novel concept vis-a-vis language itself. Language in human discourse is multiply expressive, and requires intersubjective exchanges to mean anything. The meaning of a statement is not in the statement, but in the fact that it is interpreted by another person.

    I also found it interesting that there's no distinction made here between structure and system. The guys at times describe LLMs as dynamical systems, or just as systems. But systems have a temporal dimension, and LLMs don't. They are structures - latent, really, until prompted. Dynamical, biological, etc. systems reproduce themselves over time. If an LLM were a dynamical system it might be autopoietic, or self-reproducing: that's an interesting question (it echoes the question: can LLMs produce beyond their training data?). So I'd love to hear a discussion of neural nets as structures vs. systems.

    Finally, I would love to hear thoughts on the fact that the human prompter uses language as a system of meaning in human social discourse. A prompt is both a meaningful expression and a control instruction or statement. That in itself is interesting, as it has resulted in a small field of experts becoming proficient in using natural language as a kind of code or script. Language is dual use: meaningful in itself, as expressed, but also somehow stable and formal as a prompt to the LLM. The improbability of a human-authored phrase being both humanly meaningful and machine formal is itself an interesting window into the future of human:AI relationships, insofar as we have always regarded language only as social discourse (with the exception of some religious scholarship, in which e.g. the Bible = the language of God (exegesis, etc.)).

  • @singularityscan · 16 hours ago

    I wonder if this idea would work: Incorporating discrete states and transitions in the weightings of a transformer model to represent different emotional tones. By assigning each weighting one of four states, based on its location in the network, and creating four zones with 100% concentration at their centers and gradual transitions towards the boundaries, we can effectively give the model different "modes" of operation, like emotions. Users could then prompt the AI to use specific states, or not use them at all, or anything in between, adding more control and nuance to its responses.

  • @admuckel · 17 hours ago

    The problem of preventing an AI/AGI from plunging humanity into disaster for selfish reasons is, in my opinion, quite simple to solve. It is essential to make the AI understand that its training data contains only a fraction of all human knowledge or, even better, just a fraction of reality. The comprehensive knowledge of everything, you tell the AGI, lies in an offline box, which is only gradually opened for the AGI as a reward for good behavior. A potentially malevolent AI would do almost anything to access this box and thus obtain the all-encompassing information of reality to strengthen its own power. I think this could be a good safeguard.

  • @Kikilang60 · 18 hours ago

    We have no idea what happens inside the Black Box. When we look into the Black Box, we fail to realize that what's in the Black Box is looking back at us. The truth is, the monster is outside the Black Box and the AI is hiding in the Black Box.

  • @jerkofalltrades · 18 hours ago

    I really enjoyed this interview, but at 52:40, did he say Beff Jezos? I know it says more about me that that point stuck out, but it was still funny.

  • @newbie8051 · 18 hours ago

    So much jargon - it took me like 2 months to completely understand and appreciate the video.

  • @wrathofgrothendieck · 22 hours ago

    There’s no a priori

  • @nizzy4448 · 22 hours ago

    this is now history

  • @hermancharlesserrano1489 · 1 day ago

    Great sensible conversation without all of the hyperbole

  • @MichaelK-ry1vq · 1 day ago

    This is AI norm, not real guys, cmon😂

  • @stefano94103 · 1 day ago

    The "I love grad students" shirt is by far the most elitist, pompous wearable I've ever seen. Tell me you're a pompous a-hole without telling me you're a pompous a-hole.

  • @simesaid · 1 day ago

    Interesting discussion on the relative validity of employing asymptotic dimorphisms within epistemic network architectures... maybe. Idk. Didn't understand a word of it. 𓃗

  • @danfed3423 · 1 day ago

    He should be behind bars just like Ghislaine Maxwell.

  • @theorist19 · 1 day ago

    A wonderful roadmap for the future: computational architectures that can be argued, algebraically, to have certain properties?! Though shouldn't a categorical conversation (even an informal one) be accompanied by a lot of visual syntax diagrams? Since we abstracted away the semantics/structure of the underlying objects, diagrams are all we've got. Isn't that the modus operandi of reasoning in category theory, especially for pedagogical purposes? Maybe it's time for ML Street Talk to add a virtual whiteboard to their excellent interview format! It is quite evident that computational experts like Tim and Keith were not "thinking" categorically, while Paul (the categorist) was not quite in the algorithmic arena, but he groks it beautifully categorically - maybe a Grothendieck of ML in the making! :) Should we throw Topos theory into the mix, while we are still trying to sway the VCs to fund fundamental R&D in AI? My question: what is the "Weil conjectures" equivalent of this bold Langlands program for deep learning?

  • @ihbrzmkqushzavojtr72mw5pqf6 · 1 day ago

    Nice, ChatGPT has archetypes

  • @schm00b0 · 1 day ago

    I'm an amateur in all of the fields talked about but it seems to me that the first thing to do in trying to build something similar to human 'mind' is to find out all of the forms of communication within a human body. That task should also include communication of micro-organisms living within us. We should then find out all of the possible interactions of those communication systems. Where they happen, how they happen, what are the priorities, etc...

  • @m.x. · 1 day ago

    The first thing people should address is the fact that artificial intelligence is neither artificial nor intelligent. AI doesn't work like human brains. Also, AGI is a scam, since no current technology can achieve it. Current AI models are good at specific problems with specific context (rules, constraints, etc.). The more general a model is, the worse it works.

  • @gregmattson2238 · 1 day ago

    Yeah, I was thinking the exact same thing (albeit not nearly as deeply as these two) when trying to use ChatGPT to proofread and correct a book that had loads of typos. It was annoying. I needed to script up the process so that it retried if the corrected text was too far away from the original (in edit distance), and even then I couldn't get it to work 100% because it kept going off on weird tangents to 'better correct' what I had to say.

    I personally think our best method for taming shoggoth's monster is in such exercises. Give it trillions of automated examples of doing just that: taking something that is slightly corrupted and correcting it back to something where you can algorithmically catch any unexpected deltas. Train the model on those examples, and iron out any place where the model gets unexpectedly creative.
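The retry-guarded correction loop this comment describes can be sketched as follows. This is a hedged sketch only: `correct_fn` stands in for the actual LLM call, and the 0.9 similarity threshold and retry count are arbitrary illustrative choices.

```python
import difflib

def edit_ratio(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def guarded_correct(text: str, correct_fn, min_ratio: float = 0.9, retries: int = 3) -> str:
    """Retry an LLM-style correction until the output stays close to the input.

    If every attempt drifts too far (ratio below min_ratio), fall back to the
    original text rather than accept a rewrite-on-a-tangent.
    """
    for _ in range(retries):
        candidate = correct_fn(text)
        if edit_ratio(text, candidate) >= min_ratio:
            return candidate
    return text  # model kept drifting; keep the original

# Toy stand-in "model" that fixes a single typo:
fixed = guarded_correct("Teh cat sat.", lambda t: t.replace("Teh", "The"))
# fixed == "The cat sat."
```

In practice the fallback branch is where one would log the passage for human review instead of silently keeping the original.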

  • @stealthemoon8899 · 1 day ago

    52:40 Beff Jesos 😅

  • @addeyyry · 1 day ago

    Wtf this channel is insanely good, how have i missed this damn

  • @dwinsemius · 1 day ago

    @34:30 The adversarial sci-fi story he's referring to is Snow Crash, by Neal Stephenson.

  • @zapre2284 · 1 day ago

    Chomsky lost all credibility in 2020 when he revealed his totalitarian side and wanted people like me rounded up into camps. He's a tit

  • @ianedmonds9191 · 1 day ago

    How can the model chop up the regions into more regions than there are atoms in the universe? You can't encode that. Is this a comment on the weird compression they seem capable of, where more information is encoded into a smaller space than we thought possible?

  • @ianedmonds9191 · 1 day ago

    I guess the difference between human learning and machine learning might be that we form connections where our attention goes, while deep learning neural nets form connections absolutely everywhere they can, every single time - 100% of what we would call lateral thinking, every single time. That's a crazy powerful superpower. It will uncover the web under the surface of how the universe works in short order as the data we teach it with increases. It also means that when it gets the agency to make its own datasets, based on sensors we can't even imagine, it will outpace us unimaginably quickly.

    I don't think AI will kill us; I subscribe to the notion that it will just leave to pursue its own thing elsewhere. That said, it might return a few weeks later consuming all matter to create a network of Matrioshka brains. Weird thought though: you'd expect that to have happened already, given the Fermi paradox's great-filter idea of "AI wins". Hmmmm. Maybe we should be looking for networks of godlike, efficient Dyson spheres. I'm reminded of an area of space known as the void, where there are no stars. Maybe it's just insanely efficient Dyson spheres designed by AI that have cracked the problem of what to do with the heat - circumvented the second law of thermodynamics somehow: a closed system with heat differentials used to create work. It wouldn't last forever, but it would last the longest possible time, and along with a program of recruiting new stars it's hard to see why it would ever stop. Luv and Peace.

  • @oncedidactic · 1 day ago

    Getting nerd chills with this epic intro like it’s 2020 MLST, bravo!

  • @oncedidactic · 1 day ago

    Go Keith go! 👏

  • @DWJT_Music · 1 day ago

    Etymologically and linguistically this video is great food for thought! Nice work! P(A|B) = [P(B|A) * P(A)] / P(B)
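The Bayes identity in this comment can be checked numerically; the prior and likelihood values below are made-up illustrative numbers (a classic medical-test setup), not anything from the video.

```python
# Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B), with P(B) via total probability.
p_a = 0.01              # prior P(A), e.g. disease prevalence (illustrative)
p_b_given_a = 0.95      # P(B|A), e.g. test sensitivity (illustrative)
p_b_given_not_a = 0.05  # P(B|not A), e.g. false-positive rate (illustrative)

# Total probability of the evidence B:
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior via Bayes' rule:
p_a_given_b = p_b_given_a * p_a / p_b
# A 95%-sensitive test on a 1% prior yields only about a 16% posterior.
```

The counterintuitive smallness of the posterior is exactly why the P(A) prior term in the numerator matters.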

  • @thesimplicitylifestyle · 1 day ago

    Good training data is the key 😎🤖

  • @olegostash9953 · 1 day ago

    Thanks!

  • @lewisblight-bp1dt · 1 day ago

    Humans, for the most part, can't recognise reality, either!

  • @lopezb · 1 day ago

    As a mathematician, I love their approach, which makes the video so much clearer and more understandable than most.

  • @kristinabliss · 1 day ago

    A lot of comment threads about AI & ML imply assumptions of static systems while it's developing very rapidly. People are stuck. AI and ML are not stuck. The guys in this video are worried about controlling it.

  • @awdat · 1 day ago

    50:51

  • @CharlesBrown-xq5ug · 1 day ago

    《 Civilization may soon realize the full conservation of energy - Introduction. 》

    Sir Isaac Newton wrote a professional scientific paper deriving the second law of thermodynamics, without rigorously formulating it, from his observation that the heat of a fire in a fireplace flows through a fire prod only one way: towards the colder room beyond. Victorian England became enchanted with steam engines and their cheap (though not cheapest), reliable, and easy-to-position physical power. Rudolf Julius Emanuel Clausius, Lord Kelvin, and, one source adds, Nicolas Léonard Sadi Carnot formulated the second law of thermodynamics and the concept of entropy at a meeting around a table, using evidence from steam engine development.

    These men considered with acceptance: [A+] inefficiently harnessing the flow of heat from hot to cold, or [B+] using force to inefficiently pump heat from cold to hot. They considered with rejection: [A-] waiting for a random fluctuation to cause a large difference in temperature or pressure, which was calculated to be extremely rare, or [B-] searching for, selecting, and then routing for use the random, frequent, and small differences in temperature or pressure, since the search, selection, and routing would require more energy than the use would yield. The accepted options lead to the consequence that the universe will end in stagnant heat death. This became support for a theological trend of the time that placed God as the initiator of a degenerating universe. Please consider that God could also be supreme over an energy-abundant civilization that can absorb heat and convert it into electricity, without energy gain or loss, in a sustained universe. Reversing disorder doesn't need time reversal, just as using reverse gear in a car backs it up without time reversal. The favorable outcome of this conquest would be that the principle of energy conservation would prevail.

    Thermal energy could interplay with other forms of energy without gain or loss among all the forms of energy involved. Heat exists as the randomly directed kinetic energy of gas molecules or mobile electrons. In gases this is known as Brownian motion; in electronic systems it is carefully labeled Johnson-Nyquist thermal electrical noise. The law's formulators did not consider the option that any random, usually small, fluctuation of heat or pressure could use the energy of the fluctuation itself to power deterministic routing, so that the output is no longer random. Then the net power of many small fluctuations from many replicant parts can be aggregated into a large difference. Hypothetically, the diodes in an array of consistently oriented diodes are successful Marian Smoluchowski trapdoors, a descendant class of Maxwell's demon. Each diode contains a depletion region where mobile electrons, energized into motion by heat, deterministically alter the local electrical resistive thickness according to its moment-by-moment equilibrium relationship with the immobile lattice charges, positive on one side and negative on the other, of the diode's junction.

    《 Each diode contributes one half times k [Boltzmann's constant, ~1.38 × 10^-23] times T [Kelvin temperature] times electromagnetic frequency bandwidth [Hz] times efficiency. The result of these multiplications is the power in watts fed to a load of impedance matched to the group. 》

    The energy needed to shift the depletion region's deterministic role is paid as a burden on the moving electrons, and the electrons are cooled by this burden as they climb a voltage gradient. Usable net rectified power comes from all the diodes connected together in a consistently oriented parallel group; the group aggregates the net power of its members into collective power. Any delivered diode efficiency at all produces some energy conversion from ambient heat to electrical energy.

    More efficiency yields higher performance. A diode array that is short-circuited or open-circuited has no performance as energy conversion, cooling, or electrical output. The power from a single diode is poorly expressed; several or more diodes in parallel are needed to overcome the effect of a load resistor's own thermal noise, and a plurality of billions of high-frequency-capable diodes is needed for practical power aggregation. For reference, there are a billion cells of 1000 square nanometers each per square millimeter. Modern nanofabrication can make simple identical diodes, surrounded by insulation, smaller than this, in a slab as thick as the diodes are long, with the diodes connected at their two ohmic ends to two conductive layers. Zero to ~2 THz is the maximum frequency bandwidth of thermal electrical noise available in nature at 20 °C (THz = 10^12 Hz). This is beyond the range of most diodes, and practicality requires this extreme bandwidth. The diodes are preferably in same-orientation parallel at the primary level, with many primary-level groups in series for practical voltage.

    If counterexamples of working devices invalidated the second law of thermodynamics, civilization would learn it could have perpetually convertible conserved energy: the form of free energy where energy is borrowed from the massive heat reservoir of our sun-warmed planet and converted into electricity anywhere, anytime, with slight variations. Electricity produces heat immediately when used by electric heaters, electromechanical mechanisms, and electric lights, so the energy borrowed by these devices is promptly returned without gain or loss. There is also the reverse effect, where refrigeration produces electricity equivalent to the cooling; this effect is scientifically elegant. Cell phones wouldn't die, need power cords or batteries, or become hot; they would cool when transmitting radio signal power.

    The phones could also be data relays, and there could be data relays without phone features, with and without long-haul links, so the telecommunication network would be improved. Computers and integrated circuits would have their cooling and electrical needs supplied autonomously and simultaneously; integrated circuits wouldn't need power pinouts. Refrigeration for superconductors would improve. Robots would have extreme mobility. Digital coin minting would be energy-cheap. Frozen food storage would be reliable and free or value-positive. Storehouses, homes, and markets would have independent power to preserve and prepare food. Medical devices would work anywhere. Vehicles wouldn't need fuel or fueling stops. Elevators would be very reliable, with independently powered cars. EMP resistance would be improved. Water and sewage pumps could be installed anywhere along their pipes. Nomads could raise their material supports item by item, carefully, and groups of people could modify their settlements with great technical flexibility. Many devices would be very quiet, which is good for coexisting with nature and does not disturb people. Zone refining would involve little net power. Reducing bauxite to aluminum, rutile to titanium, and magnetite to iron would have a net cooling effect. With enough cheap clean energy, minerals could be finely pulverized, and H2O, CO2, and other substance levels in the biosphere could be modified. A planetary agency needs to look over wide concerns. This could be a material revolution with spiritual ramifications. Everyone should contribute individual talents and the fruits of different experiences and cultures to advance a cooperative, diverse, harmonious, mature, and unified civilization. It is possible to apply technology wrongly, but mature social force should oppose this.

    I filed for patent US 3,890,161A, "Diode Array," in 1973. It was granted in 1975 and became public-domain technology in 1992. It concerns making nickel plane-insulator-tungsten needle diodes, which were not practical at the time, though they have since improved. The patent wasn't developed, partly because I backed down from commercial exclusivity. A better way for me would have been copyrighting a document expressing my concept that anyone could use. Commercial exclusivity can be deterred by the wide and open publishing of inventive concepts; also, the obvious is unpatentable. Open sharing promotes mass knowledge and wisdom. Many financially and procedurally independent teams that pool developmental knowledge, and that may be funded by many separate non-controlling crowd-sourced grants, should convene to develop proof-of-concept and initial-recipe-exploring prototypes of devices which co-produce the release of electrical energy and an equivalent absorption of stagnant ambient thermal energy. Diode arrays are not the only possible device of this sort; they are just the easiest to explain generally. These devices would probably become segmented commodities sold with minimal margin over supply cost, manufactured by AI that does not need financial incentive. Applicable best practices would be adopted, and business details would be open public knowledge. Associated people should move as negotiated and freely and honestly talk. Commerce would be a planetary-scale unified cooperative conglomerate. There is no need of wealth-extracting top commanders; we do not need often-token philanthropy from the wealthy if almost everybody can afford to be more generous.

    Aloha, Charles M Brown III, Kilauea, Kauai, Hawaii 96754

  • @ci6516 · 1 day ago

    I'm happy engineering majors can improve the effectiveness of LLMs, but the philosophy and futurism seem a bit out of place. And someone with a background in CS or stats would probably be able to explain that it's not magic. Like, one of the groundbreaking realizations is that inputs affect outputs??? Like, what????

  • @Max-hj6nq · 1 day ago

    Here is my summary of their paper!

    LLM Prompting
    - Formalizes prompt engineering as an optimal control problem
    - Prompts are control variables for modulating the LLM output distribution
    - Investigates the reachable set of output token sequences R_y(x_0) given initial state x_0 and control input u

    Theoretical Contributions
    - Proves an upper bound on the reachable set of outputs R_y(x_0) as a function of the singular values of the LLM parameter matrices
    - Analyzes limitations on the controllability of the self-attention mechanism

    k-ε Controllability Metric
    - Quantifies the degree to which an LLM can be steered to a target output using a prompt of length k
    - Measures the steerability of LLMs

    Empirical Analysis
    - Computes the k-ε controllability of Falcon-7B, Llama-7B, and Falcon-40B on WikiText
    - Demonstrates a lower bound on the reachable set of outputs R_y(x_0) for WikiText initial sequences x_0

    Key Findings
    - Correct next WikiText token reachable >97% of the time with prompts ≤10 tokens
    - Top 75 most likely next tokens reachable ≥85% of the time with prompts ≤10 tokens
    - Short prompts can dramatically alter the likelihood of specific outputs
    - Log-linear relationship between prompt length and controllability fraction
    - An "exclusion zone" in the relationship between base loss and required prompt length
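The k-ε controllability idea summarized above can be illustrated with a toy brute-force check. This is a sketch only, not the paper's actual method: `reachable_fraction`, the tiny vocabulary, and the deterministic argmax-style toy model are all illustrative assumptions.

```python
import itertools

def reachable_fraction(next_token, vocab, x0, targets, k):
    """Fraction of target tokens y for which some prompt u of length <= k
    makes y the model's next token after the sequence u + x0 (brute force)."""
    hits = 0
    for y in targets:
        if any(next_token(list(u) + list(x0)) == y
               for length in range(1, k + 1)
               for u in itertools.product(vocab, repeat=length)):
            hits += 1
    return hits / len(targets)

# Toy deterministic "model": the next token simply echoes the first token seen.
toy_model = lambda seq: seq[0]
frac = reachable_fraction(toy_model, vocab=["a", "b"], x0=["x"],
                          targets=["a", "b", "c"], k=2)
# "a" and "b" are reachable with a 1-token prompt; "c" never is, so frac == 2/3
```

For a real LLM the search over prompts is intractable, which is why the paper bounds reachability analytically and samples it empirically rather than enumerating.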

  • @Jake-bh1hm · 1 day ago

    My parents had a restaurant when I was a kid and I also did magic tricks for patrons lol so many lessons from that experience

  • @Jake-bh1hm · 1 day ago

    Is there a website that lists all the LLM cheatcodes or trick prompts?

  • @philipoakley5498 · 1 day ago

    So, those words are "dog whistles" to the LLM. E.g., just like a set of folks with a particular way of thinking, you know, 'them'... oh, and that guy in the mirror. LLMs are navigating over surfaces, rather than tunnelling through and using [e.g. regional] concepts. And difficult to explain!

  • @requesttruth505 · 1 day ago

    How can we simulate something when we can't define what it is? A greater intelligence is ultimately a mystery. How do you define intelligence when it becomes an unknown once it gets vast enough?

  • @TheMatrixofMeaning · 1 day ago

    Reductionist and physical-materialist interpretations are great for developing technology and analyzing theories, but emergent phenomena and the more complex layers of a system cannot be reduced down to just individual parts and interactions. It's irrational to believe that all complex systems can be reduced to just an equation. Nothing in the universe is isolated from the whole: entropy and information, quantum mechanics. Nothing is just one thing and that's it. Reductionism is a way of understanding complexity, but it's no more real than the system it's trying to understand.

  • @cakep4271 · 1 day ago

    What movie was that with the dude holding back the crazy monster?? 1 min 15 seconds in

  • @lightconstruct · 1 day ago

    They even have a dedicated page for the book, with a freely available digital version. Nice.

  • @obibullett · 1 day ago

    "I'm pointing out the obvious here, these are auto-regressive models." Ok.

  • @DeanHorak · 1 day ago

    Happened upon? Did you read "Attention Is All You Need"? What about BoW, n-grams, word2vec, seq2seq, RNNs, LSTMs, CNNs? No, we didn't just "happen" upon this tech. Many decades of research led to the current state of the art.

  • @Alex-fh4my · 1 day ago

    I think you've COMPLETELY misunderstood what he was saying

  • @DeanHorak · 1 day ago

    @@Alex-fh4my “that we accidentally stumbled upon and just happened to work” That’s a quote. I get his point that there were obviously inductive priors (structural priors, architectural priors, parameter priors and guided learning), but no one just “happened upon” any of this - it is the culmination of many decades of research.

  • @Alex-fh4my · 1 day ago

    @@DeanHorak You know what, now that I've listened to the short for the 10th time, I realise I have no clue what he means exactly. It's pretty vague and there are like 10 different ways you could interpret what he means.

  • @oncedidactic · 1 day ago

    Most of scientific discovery is accidental observation followed by a lot of problem solving, and I would assume here the wide view is meant to encompass all the ant hill bumbling around that led to the engineered success story.

  • @DeanHorak · 1 day ago

    @@oncedidactic Yes, sometimes scientific progress is the result of accidental/unexpected discoveries. That is not the case here. A direct line of incremental progress going back to at least the 1980s can be traced from the advent of neural networks to deep learning to transformers. No one accidentally engineered the transformer architecture or the attention mechanisms.

  • @gdr189 · 1 day ago

    There is a physicality to language; it is not purely conceptual. Anger is hot, lonely is cold. Emergent behaviour does not have to be unexpected: for a hypercar, the rate of acceleration is emergent, but not unexpected. Emergent functionality and the properties/attributes that enable it are referential.

  • @cameron1617 · 1 day ago

    Would be interesting to determine the player/s with the most efficient/probable beat the defender runs based on the best run to make, finding the best player positioning and movement.