LLMs will hit the data wall if they can’t generalize - OpenAI cofounder John Schulman

Science and technology

Full Episode: • John Schulman (OpenAI ...
Apple Podcasts: podcasts.apple.com/us/podcast...
Spotify: open.spotify.com/episode/1ivz...
Transcript: www.dwarkeshpatel.com/p/john-...
Me on Twitter: / dwarkesh_sp

Comments: 44

  • @charleshetterich8514
    @charleshetterich8514 16 days ago

    co-founder ?? i swear they're just writing new characters into this OAI plot-line

  • @MatRuizMat

    @MatRuizMat

    16 days ago

    this guy is a scientific legend in the AI/RL field bro

  • @user-jf5uv9ir5k

    @user-jf5uv9ir5k

    16 days ago

    Exactly, he must be the 10th person to claim cofounder status

  • @kevinamiri909

    @kevinamiri909

    15 days ago

    Bro this is the real person behind all OpenAI innovations I swear.

  • @hdhgdhgdhfhjfjhfjh
    @hdhgdhgdhfhjfjhfjh 16 days ago

    this guy AIs.

  • @oscbit
    @oscbit 16 days ago

    Dwarkesh pls stop uploading teasers before the actual show.. seeing shortform content suggests that the episode exists and there is no way to know until visiting your channel, only to then get disappointed.

  • @daniellawson9894

    @daniellawson9894

    16 days ago

    Could keep it but put teaser / preview in the title

  • @radekwarowny

    @radekwarowny

    16 days ago

    Yeah I hate that too

  • @aazzrwadrf

    @aazzrwadrf

    16 days ago

    The full ep is probably not done editing yet. I don’t mind it tbh.

  • @forthehomies7043

    @forthehomies7043

    16 days ago

    such an entitled take bro. just sub and keep notis on

  • @noone-ld7pt

    @noone-ld7pt

    16 days ago

    @@forthehomies7043 not an entitled take at all, he shared his opinion and a lot of people agreed. that's useful constructive feedback.

  • @hemanthkorrapati1412
    @hemanthkorrapati1412 16 days ago

    When will the full podcast link be uploaded?

  • @user-bp2ol4wi1c
    @user-bp2ol4wi1c 16 days ago

    what is with the sound mixing, something is off

  • @BadWithNames123

    @BadWithNames123

    15 days ago

    they use ai to "clean" the audio track.. I hate it

  • @user-bp2ol4wi1c

    @user-bp2ol4wi1c

    15 days ago

    @@BadWithNames123 it sounds shit, raw would do better i think

  • @craiginzana
    @craiginzana 13 days ago

    Didn't really age well with Claude 3 and GPT 4o

  • @nitap109
    @nitap109 15 days ago

    Wow, great topic

  • @junwang9927
    @junwang9927 16 days ago

    Another legend. This is definitely my go-to AI podcast.

  • @DynamicUnreal
    @DynamicUnreal 16 days ago

    They will never run out of data. What they will likely run out of is captured data. Humans collectively likely produce massive amounts of _text data_ just by talking to each other every day; the question is how to capture it in a voluntary manner. Even if LLMs can’t get us to AGI by themselves, they can serve as a sophisticated foundation on which to train other modalities.

  • @dovekie3437

    @dovekie3437

    16 days ago

    How much of the human corpus of knowledge and history and science and literature are LLMs actually trained on? I would guess that it's less than 1/50th of existing books given the training size vs total amount of terabytes of text data all the books would require.

  • @squamish4244

    @squamish4244

    8 days ago

    @@dovekie3437 Not to mention the five million scientific papers produced every year, a number that has soared in recent years.

  • @dovekie3437

    @dovekie3437

    7 days ago

    @@squamish4244 Hopefully the LLMs put information gained from "scientific" papers from the humanities in the same place in their memory that they put religious texts.

  • @groundcrewz
    @groundcrewz 8 days ago

    and the game is back to algorithms and compute, again!

  • @tusharjain9366
    @tusharjain9366 16 days ago

    My hypothesis (I don’t yet have data to support it): current generative AI technologies (LLMs) will reach a plateau soon (again, I lack data) for at least three reasons. Reason 1: the underlying models zero in on a single value, which makes cross-domain generation of text (or images, videos, or data points) very limited and sometimes awkward. Reason 2: since 2022/23, the distinction between naturally occurring (as well as generating data) and synthetic data has been blurring very fast, which puts learning data in a downward spiral. Reason 3: limited availability of labeled data for niche topics. For example, images of various trees vs. images of a tree.

  • @Hexanitrobenzene

    @Hexanitrobenzene

    14 days ago

    You might be right. Mike Pound on Computerphile discusses a new paper: kzread.info/dash/bejne/lniJpY-FobnYgLg.html

  • @jackbauer322
    @jackbauer322 16 days ago

    it's not the data but the ARCHITECTURE that is a dead end

  • @kraithaywire

    @kraithaywire

    16 days ago

    What do you mean by dead? Will we not see any more progress for quite some time or what? I would really love to know. Thank you.

  • @IcySpicy3

    @IcySpicy3

    16 days ago

    You mean x86?

  • @JackLawrence-dn2jb

    @JackLawrence-dn2jb

    15 days ago

    @@kraithaywire People have been saying the ARCHITECTURE IS A DEAD END for years, but that continues to be disproven time and time again. Don't listen to the doomers and naysayers.

  • @egor.okhterov

    @egor.okhterov

    14 days ago

    @JackLawrence-dn2jb how is it disproven? By fancy UI? 😂

  • @JackLawrence-dn2jb

    @JackLawrence-dn2jb

    14 days ago

    @@egor.okhterov The fact that the models are getting better year by year. Elo scores going up, now we have multi-modality, improved text to video, improved text to image. People like you been saying these are a dead end for years. Clowns lmao

  • @kevinamiri909
    @kevinamiri909 15 days ago

    I found someone that makes sense, please release the full interview, I cannot wait to watch his interview.

  • @vsma6517
    @vsma6517 16 days ago

    "uhm"

  • @CrunchyAI-fu6de
    @CrunchyAI-fu6de 15 days ago

    REALLY don't like seeing clips of a full length interview that doesn't exist. Please stop doing this.

  • @matiasortizxxi
    @matiasortizxxi 16 days ago

    Well this aged like milk.

  • @assgoblin3981

    @assgoblin3981

    16 days ago

    what the fuck happened

  • @aloysius_music

    @aloysius_music

    16 days ago

    Did it? GPT-4o is super impressive (and uncanny), but the core reasoning isn't a massive step up. There's a reason they didn't call it GPT-5.

  • @squamish4244

    @squamish4244

    8 days ago

    @@aloysius_music It reveals the potential of where we can go from here, though. LLMs have a limit, but it's not GPT-4.

  • @Natron1time
    @Natron1time 16 days ago

    uhhhhhh ummmmm uhhhhh

  • @alanrobertson3172
    @alanrobertson3172 15 days ago

    He’s not a good public speaker.

  • @Derick99

    @Derick99

    7 days ago

    I think he's having trouble answering without saying too much
