Attention is all you need (Transformer) - Model explanation (including math), Inference and Training

Science and Technology

A complete explanation of all the layers of a Transformer Model: Multi-Head Self-Attention, Positional Encoding, including all the matrix multiplications and a complete description of the training and inference process.
Paper: Attention is all you need - arxiv.org/abs/1706.03762
Slides PDF: github.com/hkproj/transformer...
Chapters
00:00 - Intro
01:10 - RNN and their problems
08:04 - Transformer Model
09:02 - Maths background and notations
12:20 - Encoder (overview)
12:31 - Input Embeddings
15:04 - Positional Encoding
20:08 - Single Head Self-Attention
28:30 - Multi-Head Attention
35:39 - Query, Key, Value
37:55 - Layer Normalization
40:13 - Decoder (overview)
42:24 - Masked Multi-Head Attention
44:59 - Training
52:09 - Inference

Comments: 544

  • @umarjamilai
    @umarjamilai · 11 months ago

    Slides' PDF: github.com/hkproj/transformer-from-scratch-notes

  • @bhaskartripathi

    @bhaskartripathi · 4 months ago

    I am not able to download the PDF file. My friends also tried. Would it be possible to provide a direct download link, please? Your content is too good and needs to be read again and again.

  • @mahek6110

    @mahek6110 · 3 months ago

    @bhaskartripathi It's getting downloaded.

  • @hackie321
    @hackie321 · 2 days ago

    The best Transformer explanation on the internet so far, and I have seen almost all of them. Kudos! You are a true teacher. I dare to compare you with Andrew Ng. Please become a professor and not a corporate slave.

  • @gabrielnsionu8583
    @gabrielnsionu8583 · 5 months ago

    This is arguably the best explanation of multi-head attention on the internet, hands down. Very thorough, and most important to folks like me who use the attention mechanism as the underpinning of a novel neural architecture for deep reinforcement learning. Sir, please never stop making this type of video.

  • @umarjamilai

    @umarjamilai · 5 months ago

    You're welcome! 🤓

  • @csikel22

    @csikel22 · 5 months ago

    I couldn't agree more. Best video on transformers I have seen so far. It doesn't get clearer than this. It would be very interesting to get some insight into why this whole thing works and what the other variations and alternative architectures are.

  • @rkbshiva

    @rkbshiva · 5 months ago

    @umarjamilai bro, you're a legend!!!!

  • @pablofe123

    @pablofe123 · 1 month ago

    There are still a couple of things that are not explained well in the video. Are the Q, K and V matrices the same matrix? And where do the parameter matrices Wq, Wk and Wv come from? Besides that, excellent video.

  • @peregudovoleg

    @peregudovoleg · 24 days ago

    @pablofe123 At 21:25: "QKV are the same matrices". As for the W matrices, he only says that they are "parameter matrices", and parameters are something we learn during the training process.
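To make the exchange above concrete, here is a minimal sketch of single-head self-attention in plain Python. It is an illustration under stated assumptions, not the video's or the paper's code: the helper names (`matmul`, `softmax`, `random_matrix`, `single_head_attention`) and the tiny sizes are hypothetical, and the W matrices are filled with random numbers as a stand-in for values that training would normally learn. It also applies Wq, Wk and Wv to a single head and leaves out the multi-head split and the output projection. The point it shows is that Q, K and V all start from the same input and only differ after being multiplied by their own parameter matrix.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a trained parameter matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def single_head_attention(x, w_q, w_k, w_v):
    # Q, K and V are all derived from the same input x (self-attention);
    # they differ only because each uses its own parameter matrix.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of K
    scores = matmul(q, k_t)                       # shape: seq_len x seq_len
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights            # output: seq_len x d_model

rng = random.Random(0)
seq_len, d_model = 3, 4                           # illustrative sizes only
x = random_matrix(seq_len, d_model, rng)          # pretend embeddings + positions
w_q = random_matrix(d_model, d_model, rng)        # learned in real training
w_k = random_matrix(d_model, d_model, rng)
w_v = random_matrix(d_model, d_model, rng)

output, weights = single_head_attention(x, w_q, w_k, w_v)
print(len(output), len(output[0]))                # rows and columns of the output
```

Each row of `weights` sums to 1, so every output row is a weighted average of the rows of V. In a real model the three W matrices would be updated by backpropagation rather than drawn at random.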

  • @kerrykilian9127
    @kerrykilian9127 · 6 days ago

    best explanation of the paper on the whole internet

  • @JulianHarris
    @JulianHarris · 22 hours ago

    I'm so glad I found this again. Do NOT rely on KZread watch history; it doesn't look at all of your history. This is definitely the best explanation of transformers and attention, and believe me, I've watched quite a few! Kudos again, Umar.

  • @umarjamilai

    @umarjamilai · 20 hours ago

    You should subscribe to the channel to never lose it 😇 thanks for the kind words.

  • @DembaDiop-om3gv
    @DembaDiop-om3gv · 4 months ago

    The best explanation of "Attention is all you need" from my point of view, guys "This explanation is all you need". Thank you very much

  • @ajithshenoy5566
    @ajithshenoy5566 · 6 months ago

    Bless you, Umar. One of the finest tutorials out there. Please don't ever stop. We're willing to support you in every way possible.

  • @calewang3713
    @calewang3713 · 7 months ago

    Oh Man, you deserve a Turing Award.....

  • @silasnginyo7744
    @silasnginyo7744 · 5 months ago

    So far the best laid out presentation of Transformers I have ever walked through

  • @sushantpenshanwar8038
    @sushantpenshanwar8038 · 6 months ago

    You did the best job of describing the complicated details in a fluid manner. Sat, watched and took notes in one sitting. Hands down best one so far.

  • @abc-by1kb
    @abc-by1kb · 9 months ago

    Such a great video! Explained all the key concepts so clearly and precisely while giving very nice intuition!

  • @Patrick-wn6uj
    @Patrick-wn6uj · 2 months ago

    This is the most important channel I have come across on YouTube. Keep creating these long-form videos; you are saving our lives in a huge way.

  • @vrvlbl
    @vrvlbl · 3 months ago

    Amazing explanation. I struggled too long to understand the architecture until I landed on your video. Way to go!!

  • @cristinaballesteros93
    @cristinaballesteros93 · 3 months ago

    I have watched a lot of videos about transformers, and this is by far the best one. I finally understand how they work. Thank you so much!

  • @vitoroliveiradesouza4214
    @vitoroliveiradesouza4214 · 25 minutes ago

    I'm really glad to have found your video! Congratulations on the clean and yet detailed explanation

  • @jdbrinton
    @jdbrinton · 5 months ago

    the clearest description I've found to-date. bravo!

  • @tariqkhan1518
    @tariqkhan1518 · 21 days ago

    TBH, the best explanation of attention on the whole Internet.

  • @ishaanjoshi6959
    @ishaanjoshi6959 · 4 months ago

    The best explanation of the attention mechanism I have found online. Thank you so much, Umar, for making this video.

  • @keithchua1723
    @keithchua1723 · 2 months ago

    Spent days trying to understand this and I wished I had come across this video first because now I understand everything fully. Immediately subscribed, keep it up!!

  • @SagarVibhute
    @SagarVibhute · 5 months ago

    Kudos on the commendable work, and simplified explanation! I appreciate that you are also trying to explain the intuition behind each step and not just math. I'll view and re-view this a few times to understand more with successive passes. Thank you!

  • @utkarshashinde9167
    @utkarshashinde9167 · 1 month ago

    I cannot tell you how grateful I am for this explanation. Nowhere else have I found such a detailed and easy-to-understand description. A go-to video for every student preparing for interviews.

  • @mculabs
    @mculabs · 4 months ago

    Probably the best explanation of the paper and the encoder and decoder sub layers. Kudos!!

  • @megatroneata9911
    @megatroneata9911 · 3 months ago

    After watching this video and the Stable Diffusion video, I can say for sure that you are an amazing teacher. Extremely digestible content and easy to follow along.

  • @AIVidya
    @AIVidya · 5 months ago

    One of the best transformer videos I have encountered so far.

  • @NazerkeSafina
    @NazerkeSafina · 1 month ago

    This is brilliant. Thank you Umar for your hard work. Please keep new videos coming. You are helping immensely. May you live long and happy and healthy

  • @abhilashbalachandran7160
    @abhilashbalachandran7160 · 7 months ago

    super useful. I really loved how you explain this with linear algebra. Very insightful. actually easier to understand than a lot of lectures at universities

  • @zeeshanmehdi3994
    @zeeshanmehdi3994 · 2 months ago

    can't thank you enough, this is the best explanation of transformers i could find after trying for days to understand it. Thank you ❤

  • @Nereus22
    @Nereus22 · 5 months ago

    This is really a great video, exactly what I was searching for! Everything that you mentioned was explained in detail (others skip a lot).

  • @KunalTiwariBCI
    @KunalTiwariBCI · 9 days ago

    Bro, legit the best explanation I have ever seen so far.

  • @ddstar
    @ddstar · 3 months ago

    Excellent. You answered a lot of questions I had about where the weights come from and how they were updated

  • @profyao
    @profyao · 1 month ago

    Absolutely the best explanation for multi-head attention so far!

  • @tgyawali
    @tgyawali · 5 months ago

    Thank you so much for putting together such a detailed video. This helps technical people who do not have a lot of experience in research, but have some background in machine learning, to understand this very important and historic paper in AI.

  • @anirudhjoshi1607
    @anirudhjoshi1607 · 6 months ago

    This is the clearest explanation on this paper I have ever heard. Always had doubts about Multi-Head attention and now finally I can visualise this 100%. Thanks a lot Umar Jamil.

  • @saima6759
    @saima6759 · 2 months ago

    transformer model never got so clear to me! thank you Umar!

  • @sergewilsonmendy9051
    @sergewilsonmendy9051 · 10 months ago

    Thank you man, this is the best transformer video I've seen. Well explained and very detailed.

  • @channel8048
    @channel8048 · 10 months ago

    This is very clear! Better than anything I have read up till now. Grazie!

  • @70152136
    @70152136 · 4 months ago

    Your presentation skills are simply amazing!!! Best video on transformers I've seen so far.

  • @debjyotimukherjee8275
    @debjyotimukherjee8275 · 1 month ago

    Excellent video; it gave a complete description with a great explanation. Looking forward to more such amazing content!

  • @yuk-hoiyiu7023
    @yuk-hoiyiu7023 · 3 months ago

    The only video that explains the difference between training and inference in the Transformer model!

  • @ameyadesai6382
    @ameyadesai6382 · 6 months ago

    The best explanation on this paper, can't wait to see the other videos on this topic.

  • @BritskNguyen
    @BritskNguyen · 2 months ago

    this is the best lecture on transformer one can get, period.

  • @albert4392
    @albert4392 · 9 months ago

    I really appreciate your talent for presenting knowledge. Nice explanation, thank you so much!

  • @juwanyirenda3457
    @juwanyirenda3457 · 5 months ago

    Excellent exposition! Thank you Umar for the great work.

  • @Udayanverma
    @Udayanverma · 6 months ago

    I understand this much more deeply with your explanation. The rest of the world scares people with diagrams and tables without explaining the practical implementation. Thank you, dear!

  • @gauravmalik3911
    @gauravmalik3911 · 3 months ago

    Detailed explanation. You did great work explaining a difficult topic by dividing it into chunks, and I don't think any part was missed. Best explanation.

  • @hamzaomari7052
    @hamzaomari7052 · 2 months ago

    This is the best explanation. It took me 4 hours to take notes, revise, and follow you word by word, with intuitions, and now I feel that I truly understand the transformer architecture and the mathematical intuition behind every detail. That is something you cannot find in any other video. Thank you so much, sir, this is very instructive and helpful.

  • @sudzam
    @sudzam · 1 month ago

    What a wonderful video with clear explanation! Thanks for making this and sharing with the community.

  • @JohnSmith-he5xg
    @JohnSmith-he5xg · 7 months ago

    The best overview I've seen. Great job!

  • @shuchenwu170
    @shuchenwu170 · 2 months ago

    This tutorial translates complex and terse structures into intuitions. A masterpiece of tutorials!

  • @saravanannatarajan6515
    @saravanannatarajan6515 · 3 months ago

    One of the best videos I have seen on this topic. Thanks a lot for making it easy for us. Great effort, hats off!

  • @priyanjaligoel4294
    @priyanjaligoel4294 · 3 months ago

    OMG! I love it. Finally, so many answers to my questions. I had a very abstract version of the process in my head before, but now it's much clearer. Thank you so much!

  • @nirajdesai
    @nirajdesai · 1 month ago

    Brilliant explanation of basics - thanks for putting this video together!

  • @1tahirrauf
    @1tahirrauf · 9 months ago

    Umar! You nailed it. Please make more videos. It was truly helpful. Thank you.

  • @_seeker423
    @_seeker423 · 2 months ago

    The clearest explanation of a very important breakthrough paper that I have seen on KZread. Thank you!

  • @_seeker423

    @_seeker423 · 2 months ago

    One thing that I felt was missing is a logical explanation of the role of the value vector (V).

  • @haoming3430
    @haoming3430 · 1 month ago

    Your video is very helpful and easy to follow. I have to say this is the best tutorial about transformer I've seen.

  • @atrijpaul4009
    @atrijpaul4009 · 4 months ago

    Best explanation of Attention throughout KZread!!!!! Thank you sir for making this video and helping us..

  • @ltbd78
    @ltbd78 · 1 month ago

    You are incredible. Please continue making this type of tutorial.

  • @aurelagbodoyetin3321
    @aurelagbodoyetin3321 · 5 months ago

    This is a masterclass. Thank you for your work

  • @NJCLM
    @NJCLM · 4 months ago

    This video is surely in the top 3 of the 50 videos that I watched to understand this subject. We are very grateful to you. Keep up the energy, the KZread numbers will follow!

  • @marsupilami125

    @marsupilami125 · 2 months ago

    Can you tell me the other 2?🙏

  • @ActualCode0
    @ActualCode0 · 6 months ago

    I like how u used examples and drew out the matrices to show what was going on in the attention block. It rly helped me understand the concept better

  • @danielvillalba4457
    @danielvillalba4457 · 5 months ago

    Lots of new insights about transformers technology, every document and video provides more details, great video sir!

  • @richeek10
    @richeek10 · 2 months ago

    Such a nice explanation with a soothing voice. Thanks so much!

  • @brothachris
    @brothachris · 10 months ago

    Excellent tutorial! Please keep up the great work.

  • @madhuvamsi7055
    @madhuvamsi7055 · 7 months ago

    You've definitely earned a lifelong subscriber bro! Great video.

  • @rkjellbe
    @rkjellbe · 6 months ago

    Thank you, Umar. This was very helpful and I feel I have a much better understanding of the process now. Great work!

  • @brunogatti383
    @brunogatti383 · 1 month ago

    Best video for attention mechanism hands down

  • @dalilabdouraman3557
    @dalilabdouraman3557 · 4 months ago

    Definitely the best explanation of multi-head attention in the transformer... just awesome.

  • @AbhinavSharma-dc3kv
    @AbhinavSharma-dc3kv · 1 month ago

    the best explanation for attention architecture. kudos to you sir!

  • @tipu461
    @tipu461 · 9 months ago

    I really appreciate your efforts to make it understandable for us 👍. Thanks a lot.

  • @lyte69
    @lyte69 · 6 months ago

    Thank you for your great explanation and effort, this was very informative and honestly there are no problems with the video, it's only a preference for me if there was some code alongside each part explained so it's even better understood, but I want you to know that this was a huge help thank you again. ❤

  • @Zineb-ru8bp
    @Zineb-ru8bp · 5 months ago

    I was struggling trying to understand Transformers but you make it easy for me. Thank you so much

  • @sujeethav9885
    @sujeethav9885 · 15 days ago

    This is just perfect! A wholesome video on Transformers!

  • @nadyaabdel5559
    @nadyaabdel5559 · 4 months ago

    Amazing explanation. First time every bit is super clear. Thank you.

  • @skc909887u
    @skc909887u · 7 months ago

    This is the best explanation for an engineer, for sure. Love this.

  • @jeffrey5602
    @jeffrey5602 · 7 months ago

    This is pure gold. Thank you so much for your efforts

  • @andreicristea997
    @andreicristea997 · 7 months ago

    Finally the fancy "black box" called transformer became more understandable for me. Really interested in the other content you are making. Thanks for the explanation.

  • @aeigreen
    @aeigreen · 8 months ago

    Great explanation. Thank you for demystifying the transformer. I came to your explanation after watching countless videos on transformers, and yours is simply the best.

  • @oleksandrasaskia
    @oleksandrasaskia · 2 months ago

    Thank you SO MUCH for your humane, empathic explanation! This means a lot! Keep it up!

  • @rajkrishnamurthy8474
    @rajkrishnamurthy8474 · 7 months ago

    Love it Umar. This is the best explanation of the paper. Thank you very much.

  • @bsuhaib
    @bsuhaib · 9 months ago

    This is called decoding a transformer. What I really liked was explaining each chunk. That was really helpful for this topic and surely taught me the approach to decode any problem. Jazaakallah ul Khair

  • @ciliamadani3046
    @ciliamadani3046 · 2 months ago

    The best explanation I have ever watched, thank you

  • @vincetran6321
    @vincetran6321 · 8 months ago

    Best explanation of transformer ive come across! Thanks so much :)

  • @user-pz5nn2kg2j
    @user-pz5nn2kg2j · 4 months ago

    The clearest video explaining the Transformer I have ever seen. Thanks very much for your efforts. I really appreciate your method of explaining every step with a concrete example and explicitly giving the shapes of all the matrices involved. The shapes of the matrices in each step were the most confusing part of Transformer models for me, and you made them so clear. Thanks a lot, Umar.

  • @umarjamilai

    @umarjamilai · 4 months ago

    You're welcome! You can get in touch on LinkedIn.

  • @srikanthvoleti5942
    @srikanthvoleti5942 · 3 months ago

    Superb video, the best explanation, I have been trying to understand transformers for a long time and this definitely helped me a lot

  • @arnonil
    @arnonil · 4 months ago

    Thank you for the excellent introduction. I'm looking forward to your advanced topic videos on Transformers, especially those that include examples of using Transformers for various tasks or scenarios with only the Encoder.

  • @umarjamilai

    @umarjamilai · 4 months ago

    You should watch my explanation about BERT then ;-)

  • @adrianovr9735
    @adrianovr9735 · 1 month ago

    Best explanation of Transformer, HANDS DOWN

  • @baabakasadi5440
    @baabakasadi5440 · 2 months ago

    Thanks for the beautiful explanation.

  • @arrozenescau1539
    @arrozenescau1539 · 4 months ago

    This is by far the BEST explanation of Transformers I have ever seen, amazing video.

  • @somdubey5436
    @somdubey5436 · 3 months ago

    you have put such a hard work to explain it so clearly.....hats off to you :)

  • @koko-wf8vz
    @koko-wf8vz · 6 months ago

    Thank you so much for this video, hands down the best in-depth video I have seen. I love the graphical explanations; they help a math noob visualize the matrices :) Much love.

  • @Stephanfreund
    @Stephanfreund · 4 months ago

    Awesome explanation for those who seek to truly understand the fundamentals of the most important paper of this decade

  • @AvinashKumar-pb2op
    @AvinashKumar-pb2op · 18 days ago

    Best Explanation Ever Existed in the whole Universe !!

  • @xue8888
    @xue8888 · 11 months ago

    Thank you man, you are amazing. Keep it up ❤ good luck, I have fingers crossed for your success

  • @alexandredamiao1365
    @alexandredamiao1365 · 3 months ago

    Thank you for putting this lesson together! This is great stuff!

  • @bornabiljan1294
    @bornabiljan1294 · 10 months ago

    Excellent video! Thank you for making it.

  • @hugopristauz538
    @hugopristauz538 · 7 months ago

    good job - your single stepping (with remarking) is really helpful

  • @user-xk7dy4nb7w
    @user-xk7dy4nb7w · 5 months ago

    Thank you for the excellent video. It is very illustrative, and you explained each concept very well.

  • @noeloc
    @noeloc · 4 months ago

    Great work, thanks for putting this together!!

  • @shakibyazdani9276
    @shakibyazdani9276 · 4 months ago

    best video on transformers I've seen so far

  • @vassilisworld
    @vassilisworld · 8 months ago

    amazing tutorial Umar. I finally understood the transformer.
