Attention mechanism: Overview
Science & Technology
This video introduces you to the attention mechanism, a powerful technique that allows neural networks to focus on specific parts of an input sequence. Attention is used to improve the performance of a variety of machine learning tasks, including machine translation, text summarization, and question answering.
Enroll in this course on Google Cloud Skills Boost → goo.gle/436ZFPR
View the Generative AI Learning path playlist → goo.gle/LearnGenAI
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
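Since the video describes attention as a weighted focus over an input sequence, here is a minimal NumPy sketch of dot-product attention for one decoder step. The array names (`H_e`, `h_d`, `alpha`) are illustrative assumptions, not notation taken from the video's slides.

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Return a context vector: a weighted sum of encoder hidden states."""
    # Score every encoder hidden state against the current decoder state.
    scores = encoder_states @ decoder_state      # shape: (seq_len,)
    # Softmax turns raw scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: encoder states mixed according to the weights.
    return weights @ encoder_states, weights

rng = np.random.default_rng(0)
H_e = rng.normal(size=(5, 8))   # 5 encoder hidden states, dimension 8
h_d = rng.normal(size=8)        # one decoder hidden state
context, alpha = attention(h_d, H_e)
```

The `alpha` vector here is the set of per-word attention weights that several comments below ask about.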
Comments: 59
@caughtya40
3 months ago
0:12
4:08 "H_b"... I could not find H_b here :-( I don't understand what the H_d7 entities in the diagram are. So confusing.
@aqgi7
3 months ago
I think she meant H_d, with d for decoder. H_d7 would be the 7th hidden state produced by the decoder. But it's not clear why H_d7 appears three times (or more).
Aside from some mistakes, the inversion mechanism is not clear here. Where in the final slide is it shown? All I see is a correctly ordered sequence of words. It would be great to visualize where and how the reordering occurs.
Great video. One tip: include some sort of pointer so you can direct the viewer's attention towards a particular part of the slide. It helps in following your explanation of the information-dense slides.
Still not clear to me. How does the network know which hidden state should have the higher score?
@unknown-otter
8 months ago
I guess the answer you were looking for is the following: the same way the network knows how to classify digits, for example. It learns it by optimizing a loss function through backprop. So attention is not a magic thing that connects inputs with outputs, but just a mechanism for a network to learn what it needs to attend to. One cool thing is that you can think of an attention head as a fully connected layer with weights that change based on the input. While a normal fully connected layer has fixed weights and will process any data with them, an attention head first calculates what would be most beneficial in that input data and then runs it through a fully connected layer!
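The commenter's "fully connected layer with input-dependent weights" view can be sketched in a few lines of NumPy: the mixing matrix `A` is recomputed from each input `X` via softmax(QKᵀ), unlike a fixed weight matrix. All names here are illustrative, not from the video.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, W_q, W_k, W_v):
    # Q, K, V are ordinary fixed linear projections of the input.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # A is recomputed for every input X -- these are the "weights that
    # change based on the input" from the comment above.
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))   # 4 tokens, dimension 6
W_q, W_k, W_v = (rng.normal(size=(6, 6)) for _ in range(3))
Y, A = attention_layer(X, W_q, W_k, W_v)
```

Each row of `A` sums to 1, so every output token is a convex mixture of the value vectors, learned end-to-end by backprop just as the comment says.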
So confusing...😵💫
@alileo1578
7 months ago
Yeah, many concepts come down to neural networks deducing their parameters with back-propagation.
@JohnDoe-pq8yw
16 days ago
This takes place after the base model is trained, and there are fine-tuning training mechanisms as well, so this is not confusing at all; it is part of the information about LLMs.
confusing
Thanks to the creator. Will be coming back to this video which is amazing and well detailed
I watched it almost 4 times and still can't figure it out. Where is alpha in the slide at 3:58?
@dhirajkumarsahu999
6 months ago
she referred to 'a' as alpha
Google should pay attention to simplifying the content for the public; I couldn't completely get the concept.
Thanks for the hidden states, very clear.
Very helpful video, but I got confused at one point and am hoping you can help clarify some points. At timestamp 4:14: You talk of "alpha" representing the attention weight at each time step. I don't see any "alpha" onscreen, so am a bit confused. Is "alpha" a weight that will get adjusted with training and indicates how important that particular word is at time step 1 in the decoding process? I'm also not completely clear on the difference between hidden state and weights, could you explain this? It would help me if while explaining you could point to the value you're referring to onscreen and if it were possible to clarify that when you talk about time step, you are referring to the first decoder time step (is that right?)
@NetworkDirection
5 months ago
I assume by 'alpha' she means 'a'
@m1erva
4 months ago
The hidden state is the RNN's activation computed for each word.
Just a quick question: I'm not able to wrap my head around how the encoder gets the decoder hidden state annotated as H_d?
@kartikthakur-ql9yn
1 year ago
The encoder doesn't get the decoder's hidden states... it's the opposite.
@MrAmgadHasan
1 year ago
What happens is: The encoder encodes the input and passes it to the decoder. For each time step in the output, the decoder gets the hidden states of all time steps concatenated as a matrix. It then calculates the attention weights.
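The flow described above can be sketched as a loop: for each decoder time step, score all encoder hidden states (stacked as a matrix), softmax them into attention weights, and take the weighted sum. A toy NumPy sketch; the array names are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_with_attention(encoder_states, decoder_states):
    """Return one context vector (and its weights) per decoder time step."""
    contexts, all_weights = [], []
    for h_d in decoder_states:           # one output time step at a time
        scores = encoder_states @ h_d    # score against every input step
        w = softmax(scores)              # attention weights for this step
        contexts.append(w @ encoder_states)
        all_weights.append(w)
    return np.array(contexts), np.array(all_weights)

rng = np.random.default_rng(1)
H_e = rng.normal(size=(6, 4))   # 6 input time steps, dimension 4
H_d = rng.normal(size=(3, 4))   # 3 output time steps
C, W = decode_with_attention(H_e, H_d)
```

Note the direction: the decoder reads all encoder states; the encoder never sees decoder states, which answers the question above.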
@thejaniyapa3660
11 months ago
@@MrAmgadHasan Thanks for the explanation. Then how are the encoder hidden states said to be associated with each word (3:26)? It should be the part of the sentence before the nth word plus the nth word.
Where is alpha in this whole diagram?! Why do you guys make it more difficult than it is?
Very complex concepts that were well presented. I may not have understood everything (I didn't, but that is a reflection of my ignorance), yet the overall picture of what occurred is clear. Thank you.
@carloslami9184
4 months ago
1:01
I think you are introducing an interesting angle that hasn’t been presented before. Thanks.
Where’s the alpha on the slide?
Hard to understand the final slide....
Aside from some mistakes, it is still not clear to me how the inverting mechanism operates. All I can observe is an already correctly ordered sequence of words. It would be great to visualize where and how the ordering occurs.
thanks
I think this is an explanation of the general attention mechanism, not attention in Transformers.
confusing😢
Ok got it watched thank you yeah
this is so confusing. Why are Google courses so difficult to understand?
Too high-level, not enough detail... where are the dislikes?
Is this video made by a generative AI 😂?
Yeah okay watched
4:04 there is no alpha but an "a" in the sum on the left.
❤
The explanation is poor; they hide a large number of processes.
I think these tutorials are thrown in the internet to further slow down and confuse people. The video explains nothing. It will only make sense to people who already know attention mechanism.
Confusing video. Very difficult to follow
just a waste of time and memory for youtube servers
You are the example why everyone should not start making youtube videos. You literally made a simple topic look complex.
@jiadong7873
3 months ago
agree
@Dom-zy1qy
3 months ago
Disagree heavily. For me, this was more palatable than other videos I'd seen on the subject. I don't see the point of needlessly harsh criticism.
@Omsip123
2 days ago
You are the example why commenting should be disabled
@Omsip123
2 days ago
Besides, you probably meant to write "not everyone should" instead of "everyone should not" but that might be too complex too.
Regurgitating spoon-fed knowledge... Google has fallen behind.
This is a poor video for someone who does not know this topic.
confusing
@Shmancyfancy536
2 months ago
You’re not gonna learn it in 5 min