Mamba Language Model Simplified In JUST 5 MINUTES!

Science and Technology

#mamba #ai #llm
Here’s a super simplified explanation of the Mamba language model and its Selective State Space Model (Selective SSM) architecture. In previous videos, I used sequences of words to show how transformers use the Attention Mechanism to process natural language and predict the next word in a sequence, e.g., a sentence. In this video, I show you how Mamba’s architecture uses the Selective State Space Model to figure out which parts of the data, e.g., which words in a word sequence, are connected and how they might affect what happens next, e.g., to predict which word comes next.
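For anyone curious how that selection looks in practice, here is a minimal, illustrative numpy sketch of a selective state-space recurrence (the sizes, weight names, and simplified discretization are assumptions for illustration, not the video's or the paper's actual code): the B and C parameters and the step size delta are computed from each input token, so the model can decide what to write into, and read out of, its hidden state.

    import numpy as np

    rng = np.random.default_rng(0)
    d_model, d_state, seq_len = 4, 8, 6                    # toy sizes (assumed)
    x = rng.standard_normal((seq_len, d_model))            # stand-in for word embeddings

    A = -np.exp(rng.standard_normal((d_model, d_state)))   # fixed state matrix, kept stable
    W_B = rng.standard_normal((d_model, d_state))          # B is computed from the input...
    W_C = rng.standard_normal((d_model, d_state))          # ...and so is C (the "selective" part)
    W_delta = rng.standard_normal((d_model, d_model))      # per-channel step size delta

    h = np.zeros((d_model, d_state))                       # hidden state carried across tokens
    ys = []
    for t in range(seq_len):
        xt = x[t]
        delta = np.log1p(np.exp(xt @ W_delta))             # softplus keeps the step size positive
        B, C = xt @ W_B, xt @ W_C                          # input-dependent B and C
        A_bar = np.exp(delta[:, None] * A)                 # discretize A with delta
        B_bar = delta[:, None] * B[None, :]                # discretize B with delta (simple approximation)
        h = A_bar * h + B_bar * xt[:, None]                # previous state + current input word
        ys.append(h @ C)                                   # read this token's output from the state
    print(np.stack(ys).shape)                              # (seq_len, d_model)

This is the slow, sequential form of the recurrence; the hardware-aware algorithm mentioned at 05:23 computes the same thing much faster on a GPU.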
Don’t forget to subscribe and watch these related videos:
Transformer Language Models Simplified in JUST 3 MINUTES!
• Transformer Language M...
This Is How EXACTLY Language Models Work in AI - NO background needed!
• This is how EXACTLY La...
Backpropagation Simplified in JUST 2 MINUTES! --Neural Networks
• The Concept of Backpro...
www.youtube.com/@analyticsCam...
Key terms and concepts in the video:
00:00 Intro
00:31 Why Mamba?
00:52 State Space Models
01:14 Selectivity
01:25 Two stages of Selective SSM
01:48 Parameters
02:01 First stage: Projecting the Input
02:08 Discretization
02:25 Linear Time Invariance (LTI)
02:50 Dynamic data
03:14 B Parameter
03:19 C Parameter
03:39 Selection Mechanism
03:49 Hidden State update
03:58 Delta Parameter resets itself
04:30 Input Selection
04:41 Collocation
05:04 Each state update
05:09 Predicting the next word
05:23 Hardware-aware algorithm for Selective SSM
05:27 GPU with High Bandwidth Memory
05:34 Mamba’s overall architecture (H3 + Multi-layer Perceptron)
Stick around for more videos on LLMs, Natural Language Processing (NLP), Generative AI, and fun coding and machine learning projects, and follow Analytics Camp on Twitter (X): / analyticscamp

Comments: 12

  • @doublesami (1 month ago)

    Very informative, looking forward to the in-depth video on Vision Mamba (VMamba).

  • @analyticsCamp (1 month ago)

    Thanks for watching and for your suggestion. Stay tuned :)

  • @zagoguic (4 months ago)

    Great video! Keep making them!

  • @analyticsCamp (4 months ago)

    Thanks! Will do!

  • @optiondrone5468 (4 months ago)

    Thanks for this video, keep up the good work.

  • @analyticsCamp (4 months ago)

    Thanks for watching!

  • @kvlnnguyieb9522 (2 months ago)

    A great video. Next video, maybe you can explain the details of the selective mechanism in code.

  • @analyticsCamp (2 months ago)

    Great suggestion! Thanks for watching :)

  • @nidalidais9999 (4 months ago)

    I liked your style and your funny personality.

  • @analyticsCamp (4 months ago)

    Thanks for watching, I love your comment too :)

  • @ln2deep (4 months ago)

    It's a bit unclear to me how the Mamba architecture works recurrently when looking at the architecture at 5:30. What is the input here: the whole sequence or individual tokens? Surely it'd have to be the whole sequence for Mamba to build a representation recurrently, but then it seems strange to have a skip connection on the whole sequence. I think I've missed something.

  • @analyticsCamp (4 months ago)

    Hi, thanks for your comment. I mentioned that delta discretizes the input (the word sequence) into tokens, ..., and that, at every step of the hidden state update, it takes into account the previous hidden state and the 'current input word'. I'll try to make an update on this, maybe reviewing the entire article if I can. Please let me know if you are interested in any particular topic for a video.
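    To make the per-token picture in this reply concrete, here is a tiny illustrative sketch (an assumption for illustration, not code from the video): the block sees one token at a time, carries a hidden state forward, and the skip connection applies to each token rather than to the whole sequence.

        # Toy scalar recurrence: previous hidden state + current input word,
        # with a per-token skip (residual) connection.
        def block_step(x_t, h_prev, a=0.9, b=0.1):
            h_t = a * h_prev + b * x_t   # hidden state update for this token
            return x_t + h_t, h_t        # skip connection applied per token

        h = 0.0
        for x_t in [1.0, 2.0, 3.0]:      # tokens arrive one at a time
            y_t, h = block_step(x_t, h)
            print(y_t)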
