What does it mean for computers to understand language? | LM1

An introduction to language modeling, followed by an explanation of the N-Gram language model!
Support me on Patreon! / vcubingx
Part 2! • Why Recurrent Neural N...
3blue1brown series on Transformers: • But what is a GPT? Vi...
The source code for the animations can be found here:
github.com/viv...
The animations in this video were made using 3blue1brown's library, manim:
github.com/3b1...
Sources (includes the entire series): docs.google.co...
Chapters
0:00 Introduction
1:39 What is NLP?
2:45 What is a Language Model?
4:38 N-Gram Language Model
7:20 Inference
9:18 Outro
Music (In Order):
GameChops - National Park
Philanthrope, mommy - embrace chll.to/7e941f72
Helynt - Bo-omb Battlefield
Helynt - Underwater
Helynt - Route 10
Helynt - Twinleaf Town
Follow me!
Website: vcubingx.com
Twitter: / vcubingx
Github: github.com/viv...
Instagram: / vcubingx
Patreon: / vcubingx

Comments: 72

@vcubingx 4 months ago
If you enjoyed the video, please consider subscribing :) Part 2: kzread.info/dash/bejne/pIiumMqalLCXfMo.html I'm excited to be starting this new series! NLP is the topic I feel like I have the most to say about, but I'll avoid throwing my personal opinions into these videos :p Stay tuned for the next chapter, which I'll be posting next Monday!! (And the third chapter the week after that.) Also, let me know what other kinds of topics you'd be interested in seeing!!!
@sethdon1100
4 months ago
Strange timing you've got there… 3B1B published basically the same video before you.
@yjp20
4 months ago
The 🐐
@1XxDoubleshotxX1
4 months ago
so hot
@buddhadevbhattacharjee1363
4 months ago
Hi, one of the topics I am struggling to understand is why V is needed in QKV and why the multi-head attention outputs are concatenated rather than combined with some other operation. If you could make a video on concatenation of vectors and how it retains information better, that would be great.
@vcubingx
4 months ago
@@buddhadevbhattacharjee1363 hmmm, interesting question for sure! I believe the reason concatenation is done is just because it's lossless (retains all information). This is just a standard DL practice - for example, we concatenate the positional embeddings too. The chapter after next will be on attention. Let me know if that addresses your question, and if not, I'll look into what a follow-up video could contain. Thanks for your input!
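To make the "lossless" point in the reply above concrete, here is a minimal sketch in plain Python. The helper names (concat_heads, sum_heads) and the toy vectors are made up for illustration and do not come from the video. It contrasts concatenation with element-wise summation as one possible alternative: two different sets of head outputs can sum to the same vector, while their concatenations stay distinct.

```python
def concat_heads(head_outputs):
    """Concatenate per-head output vectors for one token into a single vector."""
    combined = []
    for head in head_outputs:
        combined.extend(head)
    return combined


def sum_heads(head_outputs):
    """Contrast case (hypothetical alternative): element-wise sum of the heads."""
    return [sum(values) for values in zip(*head_outputs)]


# Two hypothetical heads, each producing a 2-dimensional output for one token.
head_a = [1.0, 2.0]
head_b = [3.0, 0.0]

concatenated = concat_heads([head_a, head_b])  # keeps every value from both heads
summed = sum_heads([head_a, head_b])           # mixes the heads into one vector

# A different pair of head outputs that happens to have the same sum.
other_heads = [[3.0, 2.0], [1.0, 0.0]]
other_summed = sum_heads(other_heads)
other_concatenated = concat_heads(other_heads)

print(summed == other_summed)              # the two inputs collide under summation
print(concatenated == other_concatenated)  # but remain distinguishable when concatenated
```

In the standard Transformer, the concatenated vector is then passed through a learned output projection, so the model can still learn how to mix the heads afterwards.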
@mikoal1463 4 months ago
Just got directed here by 3blue1brown! I am excited to learn!❤
@PiercingSight 4 months ago
What timing for this video~ Looking forward to more!
@vcubingx
4 months ago
Thanks!!
@ianthehunter3532 4 months ago
Odd timing 🤔
@jujdj6214
4 months ago
ye.
@vcubingx
4 months ago
Indeed, haha!
@MayaPasricha 4 months ago
This is wonderful, excited for the next video! Also nice choice of music :)
@vcubingx
4 months ago
Thanks! Love Nintendo music :)
@chineduezeofor2481 3 months ago
Found this by watching 3Blue1Brown. Awesome channel!
@cyanurecyanures130 2 months ago
Thanks a lot for the video! Now I understand that a trigram model just takes into account the last three words.
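As a rough illustration of what "three words" means here: in the usual formulation, a trigram model predicts each word from the two words before it, so every counted unit spans three words. The sketch below assumes that formulation; the function names and the toy corpus are invented for this example and are not taken from the video's source code.

```python
from collections import Counter, defaultdict


def train_trigram(tokens):
    """Count how often each word follows each pair of preceding words."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def next_word_probs(counts, w1, w2):
    """Estimate P(next word | w1, w2) as relative frequencies.

    Returns an empty dict if the two-word context was never seen.
    """
    followers = counts.get((w1, w2))
    if not followers:
        return {}
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}


# Toy corpus, made up for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa".split()
model = train_trigram(corpus)

# The context is the previous two words; the model scores candidates for the third.
print(next_word_probs(model, "on", "the"))
print(next_word_probs(model, "the", "cat"))
```

A context that never appeared in the training text gets no prediction at all in this sketch; real systems typically handle that case with smoothing or backoff.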
@johndewey7243 4 months ago
Just came from 3B1B and subbed; this is excellent. Thanks!
@averagemilffan 4 months ago
Great video!! I'm hoping you discuss some of the history in the next episodes too though
@1XxDoubleshotxX1
4 months ago
agree!
@vcubingx
4 months ago
That's the plan! I'm trying to touch on key papers up to 2016
@artahir123 4 months ago
brother, never stop making these videos, they are very interesting
@vcubingx
4 months ago
Glad you like them!
@Randomstiontastic 4 months ago
You uploaded this a minute after 3b1b’s video, how?
@Orillians
4 months ago
IKR. FIRST I WAS WONDERING HOW AND THEN THIS TOO WHAT WHAT
@cwaddle
4 months ago
This dude must be 3b1b's younger bro, or a buddy
@user-fe8hp6jv9f
4 months ago
I was like when did 3b1b release the video about transformers? Turns out same time as this video
@vcubingx
4 months ago
:)
@user-fe8hp6jv9f
4 months ago
@@vcubingx What a troll.
@dattatreyadas 4 months ago
10:14 Brought to you by... 3Blue1Brown!!
@MrWater2 4 months ago
Good one! I'll be waiting for the next one
@vcubingx
4 months ago
Thanks! Currently working on it - should be up on Monday
@jamesking2439 4 months ago
We're eating good today guys.
@amirjutt0 4 months ago
You'll go up boi. Just put in the effort. Make the quality content. People are looking for quality content related to ML.
@PastisPastek 4 months ago
Perfect timing
@vcubingx
4 months ago
Indeed!
@stellastaraj 3 months ago
Hi, love the video - just one thing, the C in the probability equation throws me off. I keep reading it in my mind as "complement" - as in the complement of a set. I'm probably missing the right context for it. I can grasp from what you're saying that it probably signifies occurrences of the event, but I'm uncertain why it's "C". Is it C for condition?
@vcubingx
3 months ago
C stands for count. Sorry! It can be a bit confusing - should’ve explained it. Some of the notation NLP folk use is certainly questionable.
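For readers with the same question, the count notation can be written out. The video's exact equation is not reproduced on this page, so the bigram form below is an assumption about what it looks like. Here w_{n-1} and w_n are the previous and current words, and C(...) is the number of times a word or word sequence occurs in the training text.

```latex
P(w_n \mid w_{n-1}) \approx \frac{C(w_{n-1}\, w_n)}{C(w_{n-1})}
```

As a made-up example, if "the" occurs 4 times in a corpus and "the cat" occurs 2 times, the estimate is P(cat | the) = 2/4 = 0.5.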
@gwonchanjasonyoon8087 28 days ago
From 3b1b!
@alexeypankov8180 4 months ago
great vid frfr
@tomoki-v6o 4 months ago
Still waiting for part 3 on neural networks
@vcubingx
4 months ago
Dang, it's been 4 years already...how time flies by. I'll try and make this my next-to-next-to-next video (After Chapter 3 of this series). Sorry for the delay, and I'm happy you're still around to wait for it :)
@calix-tang 4 months ago
mfv what a great job you have done
@ElliottKobelansky
4 months ago
i couldn't agree more mfc
@vcubingx
4 months ago
mfc + mfe = mfce
@AnmolSharma-ij1ut 4 months ago
Damn bro, it was too good. I didn't know about n-grams
@rohitkavuluru8998 4 months ago
Goated
@skifast_takechances 4 months ago
bro is basically alan turing at this point
@vcubingx
4 months ago
ski fast take chances
@YoussefMohamed-er6zy 4 months ago
You know what? ChatGPT is, unfortunately, the manifestation of the Chinese room paradox, and it is SO humorous that we are taking this much time to realize it.
@Iknowwereyousleep289
4 months ago
You’re stupid: The Chinese room argument doesn't work for complex tasks beyond fixed rule-based symbolic manipulation. AI like ChatGPT goes beyond counting word co-occurrences, making decisions based on intricate feature interactions. We need to clearly define "understanding" first. Understanding involves making functional predictions by compressing data into representations in vector space, synaptic interactions, etc. GPT-4 doesn’t store explicit symbols but extracts features from data, comprehending context rather than concrete content. Fixed translations lack the representational ability to demonstrate understanding.
@ucngominh3354 4 months ago
hi
@vcubingx
4 months ago
Hello!
@1XxDoubleshotxX1 4 months ago
you should make a video on how to get girls
@dannysunginpark7561 4 months ago
breh
@vcubingx
4 months ago
dang retired cuber comes out of the dead only to smash mohanraj's 3x3x3 PR average
@blankboy-ww7jt 4 months ago
Third
@Dhruvbala 4 months ago
First
@fintech1378 4 months ago
This is so Asian
@OBGynKenobi 4 months ago
Computers "understand" languages insofar as they can compute statistics. But they don't really understand like humans do. For example, can they understand the levels of meaning of poetry, or sarcasm, or cynicism?
@panulli4
4 months ago
What makes you think that human brains don’t just compute statistics?
@aaronspeedy7780
4 months ago
@@panulli4 I think the difference is that LLMs compute statistics on words themselves, while humans "perform statistics" on lots of different inputs, and then transform whatever result they get into language
@vcubingx
4 months ago
To be honest, it's really unclear what it even means to "understand" language. I'm fairly certain that we should be able to reach human-level sarcasm detection within the next 10 years. See relevant work: scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=sarcasm+detection&btnG=&oq=sarcasm+detection I feel like 5 years ago, the idea of being able to generate code was unfathomable. Yet, here we are, and Github Copilot knows C++ syntax almost perfectly. Who's to say that everything in our brain is not a type of matrix multiplication? We don't know :)
@JosephParker7
4 months ago
@@panulli4 And that intuition may be a consequence of analogical thinking and overlooking the subtleties involved. Not that it's "wrong", but arguments such as "the brain is definitely like a stack of LSTMs", or "the brain is just a Markov chain" etc. have always existed and they've only focused on certain overlaps to construct a simplistic explanation. Sure, certain submodules of the brain may operate stochastically, but it's also evident that there are a lot of other architectural complexities involved that allow for agentic behavior, continuous learning, inferring priors from observations, meta-awareness and deliberate allocation of attention and cognitive resources, and adapting to highly chaotic and out-of-distribution environments and contexts, to name a few. Qualia itself hasn't been fully explained or understood and it's unclear if it can be; however, there are good reasons to think it's a crucial mechanism that allows for agentic models to operate consistently and develop a coherent world model. It's highly likely it wouldn't simply "emerge" from scaling up statistical models. And equivalently, it's easy to conceptualize why a statistical model can achieve a high level of mastery in specific domains which are already deterministic or statistical in nature, or can at least be brute-force computed and generalized for, but a lot of things aren't. You can, for example, give the impression that you understand quantum mechanics by simply paraphrasing scientific articles, especially if you can do so at scale and very efficiently.
@OBGynKenobi
4 months ago
@vcubingx yes, I'm not saying it can't happen. I'm only saying that at this point it's not there and it may take a while with more tech. And when I say a while, I mean that in the most open sense.