Attention mechanism: Overview
Science & Technology
This video introduces you to the attention mechanism, a powerful technique that allows neural networks to focus on specific parts of an input sequence. Attention is used to improve the performance of a variety of machine learning tasks, including machine translation, text summarization, and question answering.
Enroll in this course on Google Cloud Skills Boost → goo.gle/436ZFPR
View the Generative AI Learning path playlist → goo.gle/LearnGenAI
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
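Since the video describes attention as a weighted focus over an input sequence, here is a minimal NumPy sketch of dot-product attention for one decoder step. The array names (`H_e`, `h_d`, `alpha`) are illustrative assumptions, not notation taken from the video's slides.

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Return a context vector: a weighted sum of encoder hidden states."""
    # Score every encoder hidden state against the current decoder state.
    scores = encoder_states @ decoder_state      # shape: (seq_len,)
    # Softmax turns raw scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: encoder states mixed according to the weights.
    return weights @ encoder_states, weights

rng = np.random.default_rng(0)
H_e = rng.normal(size=(5, 8))   # 5 encoder hidden states, dimension 8
h_d = rng.normal(size=8)        # one decoder hidden state
context, alpha = attention(h_d, H_e)
```

The `alpha` vector here is the set of per-word attention weights that several comments below ask about.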
Comments: 59
@caughtya40
3 months ago
0:12
4:08 "H_b"... I could not find H_b here :-( I don't understand what the H_d7 entities in the diagram are. So confusing.
@aqgi7
3 months ago
I think she meant H_d, with d for decoder. H_d7 would be the 7th hidden state produced by the decoder. But it's not clear why H_d7 appears three times (or more).
Aside from some mistakes, the inversion mechanism is not clear here. Where in the final slide is it shown? All I see is a correctly ordered sequence of words. It would be great to visualize where and how the reordering occurs.
Great video. One tip: include some sort of pointer so you can direct the viewer's attention towards a particular part of the slide. It helps in following your explanation of the information-dense slides.
Still not clear to me. How does the network know which hidden state should have the higher score?
@unknown-otter
8 months ago
I guess the answer you were looking for is the following: the same way the network knows how to classify digits, for example. It learns it by optimizing a loss function through backprop. So attention is not a magic thing that connects inputs with outputs, but just a mechanism for a network to learn what it needs to attend to. One cool thing is that you can think of an attention head as a fully connected layer with weights that change based on the input. While a normal fully connected layer has fixed weights and will process any data with them, an attention head first calculates what would be most beneficial in that input data and then runs it through a fully connected layer!
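The commenter's "fully connected layer with input-dependent weights" view can be sketched in a few lines of NumPy: the mixing matrix `A` is recomputed from each input `X` via softmax(QKᵀ), unlike a fixed weight matrix. All names here are illustrative, not from the video.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, W_q, W_k, W_v):
    # Q, K, V are ordinary fixed linear projections of the input.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # A is recomputed for every input X -- these are the "weights that
    # change based on the input" from the comment above.
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))   # 4 tokens, dimension 6
W_q, W_k, W_v = (rng.normal(size=(6, 6)) for _ in range(3))
Y, A = attention_layer(X, W_q, W_k, W_v)
```

Each row of `A` sums to 1, so every output token is a convex mixture of the value vectors, learned end-to-end by backprop just as the comment says.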
So confusing...😵💫
@alileo1578
7 months ago
Yeah, many concepts come down to neural networks deducing their parameters with back-propagation.
@JohnDoe-pq8yw
16 days ago
This takes place after the base model is trained, and there are fine-tuning training mechanisms as well, so this is not confusing at all; it is part of the information about LLMs.
confusing
Thanks to the creator. Will be coming back to this video which is amazing and well detailed
I watched it almost 4 times and still can't figure it out. Where is alpha in the slide at 3:58?
@dhirajkumarsahu999
6 months ago
she referred to 'a' as alpha
Google should pay attention to simplifying the content for the public; I couldn't completely get the concept.
Thanks for the hidden states, very clear.
Very helpful video, but I got confused at one point and am hoping you can help clarify some points. At timestamp 4:14: You talk of "alpha" representing the attention weight at each time step. I don't see any "alpha" onscreen, so am a bit confused. Is "alpha" a weight that will get adjusted with training and indicates how important that particular word is at time step 1 in the decoding process? I'm also not completely clear on the difference between hidden state and weights, could you explain this? It would help me if while explaining you could point to the value you're referring to onscreen and if it were possible to clarify that when you talk about time step, you are referring to the first decoder time step (is that right?)
@NetworkDirection
5 months ago
I assume by 'alpha' she means 'a'
@m1erva
4 months ago
The hidden state is the RNN's activation computed for each word.
Just a quick question: I'm not able to wrap my head around how the encoder gets the decoder hidden state annotated as H_d?
@kartikthakur-ql9yn
1 year ago
The encoder doesn't get the decoder's hidden states... it's the opposite.
@MrAmgadHasan
1 year ago
What happens is: The encoder encodes the input and passes it to the decoder. For each time step in the output, the decoder gets the hidden states of all time steps concatenated as a matrix. It then calculates the attention weights.
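The flow described above can be sketched as a loop: for each decoder time step, score all encoder hidden states (stacked as a matrix), softmax them into attention weights, and take the weighted sum. A toy NumPy sketch; the array names are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_with_attention(encoder_states, decoder_states):
    """Return one context vector (and its weights) per decoder time step."""
    contexts, all_weights = [], []
    for h_d in decoder_states:           # one output time step at a time
        scores = encoder_states @ h_d    # score against every input step
        w = softmax(scores)              # attention weights for this step
        contexts.append(w @ encoder_states)
        all_weights.append(w)
    return np.array(contexts), np.array(all_weights)

rng = np.random.default_rng(1)
H_e = rng.normal(size=(6, 4))   # 6 input time steps, dimension 4
H_d = rng.normal(size=(3, 4))   # 3 output time steps
C, W = decode_with_attention(H_e, H_d)
```

Note the direction: the decoder reads all encoder states; the encoder never sees decoder states, which answers the question above.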
@thejaniyapa3660
11 months ago
@@MrAmgadHasan Thanks for the explanation. Then how are the encoder hidden states said to be associated with each word (3:26)? It should be the part of the sentence before the nth word plus the nth word.
Where is alpha in this whole diagram?! Why do you guys make it more difficult than it is?
Very complex concepts that were well presented. I may not have understood everything (I didn't, but that is a reflection of my ignorance), yet the overall picture of what occurred is clear. Thank you.
@carloslami9184
4 months ago
1:01
I think you are introducing an interesting angle that hasn’t been presented before. Thanks.
Where’s the alpha on the slide?
Hard to understand the final slide....
Aside from some mistakes, it is still not clear to me how the inverting mechanism operates. All I can observe is an already correctly ordered sequence of words. It would be great to visualize where and how the ordering occurs.
thanks
I think this is an explanation of the general attention mechanism, not attention in Transformers.
confusing😢
Ok got it watched thank you yeah
this is so confusing. Why are Google courses so difficult to understand?
Too high-level, not enough detail... where are the dislikes?
Is this video made by a generative AI 😂?
Yeah okay watched
4:04 there is no alpha but an "a" in the sum on the left.
❤
The explanation is poor; they hide a large number of processes.
I think these tutorials are thrown in the internet to further slow down and confuse people. The video explains nothing. It will only make sense to people who already know attention mechanism.
Confusing video. Very difficult to follow
just a waste of time and memory for youtube servers
You are the example why everyone should not start making youtube videos. You literally made a simple topic look complex.
@jiadong7873
3 months ago
agree
@Dom-zy1qy
3 months ago
Disagree heavily. For me, this was more palatable than other videos I'd seen on the subject. I don't see the point of needlessly harsh criticism.
@Omsip123
2 days ago
You are the example why commenting should be disabled
@Omsip123
2 days ago
Besides, you probably meant to write "not everyone should" instead of "everyone should not" but that might be too complex too.
Regurgitating spoon-fed knowledge... Google has fallen behind.
This is a poor video for someone who does not know this topic.
confusing
@Shmancyfancy536
2 months ago
You’re not gonna learn it in 5 min