Neural Networks: Stochastic, mini-batch and batch gradient descent
What is the difference between stochastic, mini-batch and batch gradient descent?
Which is the best? Which one is recommended?
0:00 Introduction
0:20 How do we train a neural network?
1:25 3 types of gradient descent
1:55 My silly training dataset
2:55 Stochastic gradient descent
4:05 Mini-batch gradient descent
5:20 Batch gradient descent
5:45 What is an epoch?
7:10 So why do we not use batch gradient descent?
8:10 What does the literature say on gradient descent in neural networks?
8:12 Goodfellow - stochastic gradient descent
9:10 Wikipedia - stochastic gradient descent
9:25 Lecun - BackProp, stochastic gradient descent
9:58 Andrew Ng - mini-batch and stochastic
10:43 Conclusion
Comments: 29
A good video for student of Machine Learning to get a grip of back propagation, chain rule and gradient descent concept - Thank you very much!
Excellent video with neat and clear explanation for a beginner to learn and motivate towards neural networks.
Short and concise but very understandably explained. Thank you very much!
Not even halfway through the video and you have already answered most of my questions. My deepest gratitude for this video and your clean, smooth explanation. I did increase the playback speed, but your speech rate is perfect. Thank you once again; I have been struggling to understand this terminology, and most bibliography assumes the reader already knows what it means. You could not have explained it better. Just one small suggestion: get rid of the white background and use something more pleasing to the eyes. Most consumers of this type of content live in front of a computer, and a darker background would be appreciated.
Perfectly explained 😌 finally the best explanation ❤ thank you, sir
Great video, really easy to understand. Thanks!
Excellent, sir! Thank you for this awesome tutorial.
awesome explanation, thanks so much ^^
Very good! Thanks for the lesson!
@bevansmithdatascience9580
3 years ago
Glad to help :)
Excellent video
Just amazing, nothing to say, thank you very much...
Just Amazing
Thank you for your videos. Very well explained. However, what does a mini-batch look like in practice? How do we put multiple rows through the network at the same time? Do we need a bigger layer? Could you provide some details on how to do that? Thank you.
Your series is very helpful and easy to understand.
@bevansmithdatascience9580
A year ago
Glad to hear that!
Such an amazing video!!!!! Thanks a lot!
Thx dad
Nice
Thanks for the lesson
@bevansmithdatascience9580
2 years ago
Pleasure
I don't understand: in batch gradient descent you sum the gradients to update the neurons, right? For example, if you have 2 neurons in the output layer, you get 2 different gradients, one for each. If you have 4 training samples, you get 8 gradients in total. To get an overall gradient for output 1, you sum all the gradients of output 1 and divide by 4; to get an overall gradient for output 2, you sum all the gradients of output 2 and divide by 4. Then you update each weight with its respective gradient. Am I right? Because I don't understand how batch gradient descent works.
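The averaging described in the question can be sketched in a few lines. This is a toy linear model with made-up numbers, not the network from the video: per-sample gradients are computed for all 4 training samples, averaged into one batch gradient, and a single weight update is made.

```python
import numpy as np

# Toy setup: a linear model y_hat = x @ w with 4 training samples
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [0.5, 1.5],
              [1.5, 0.5]])
y = np.array([5.0, 4.0, 3.5, 2.5])
w = np.zeros(2)

# Gradient of the squared error 0.5 * (x @ w - t)**2 for each sample
per_sample_grads = np.array([(x @ w - t) * x for x, t in zip(X, y)])

# Batch gradient descent: average the 4 per-sample gradients into one,
# then apply a single update for the whole batch
batch_grad = per_sample_grads.mean(axis=0)
w = w - 0.1 * batch_grad
```

So yes, with a batch of 4 you compute 4 gradients per weight, sum them, divide by 4, and update once, rather than updating after every sample as stochastic gradient descent does.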
Why not keep on iterating back propagation on one input vector until the NN gives the right answer? Did I miss something?
@bevansmithdatascience9580
8 months ago
Not sure I understand your question, but if you only feed in one vector of data, the network will only train on that tiny slice. You need to feed in the entire dataset. What this video is saying is that you can feed the entire dataset in piece by piece. Each time a batch is fed in, the network learns a little, and this continues bit by bit until the entire dataset has been used. One full pass over the training dataset is called an epoch. Then you start again for another epoch, and so on until the model has learned sufficiently.
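The batch-by-batch feeding described in that reply can be sketched as a plain training loop. This is an illustrative skeleton with a made-up toy dataset; the actual gradient computation and weight update are left as a comment, since they depend on the network.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))   # toy dataset: 8 samples, 2 features
y = rng.normal(size=8)

batch_size = 2
n_epochs = 3
batches_per_epoch = 0

for epoch in range(n_epochs):
    order = rng.permutation(len(X))   # reshuffle so batches differ each epoch
    # One epoch = one full pass over the dataset, one mini-batch at a time
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        X_batch, y_batch = X[idx], y[idx]
        # ...compute the gradient on this mini-batch and update the weights...
        if epoch == 0:
            batches_per_epoch += 1
```

With batch_size equal to 1 this loop becomes stochastic gradient descent; with batch_size equal to len(X) it becomes batch gradient descent, which is the spectrum the video walks through.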
Is each mini-batch using the average of the losses in the batch to update the weights and biases? This part is unclear.
@bevansmithdatascience9580
3 months ago
However large the batch size is, it calculates a mean squared error (if regression) over those samples.
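That reply can be made concrete with a one-liner. Using made-up predictions and targets for a mini-batch of 4 regression samples, the mean squared error collapses the batch into a single scalar loss, whatever the batch size:

```python
import numpy as np

# Hypothetical predictions and targets for one mini-batch of 4 samples
y_true = np.array([3.0, 1.0, 2.0, 4.0])
y_pred = np.array([2.5, 1.5, 2.0, 3.0])

# Mean squared error over the mini-batch: squared errors averaged
# into one scalar, regardless of how many samples the batch holds
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```

The gradient of this single averaged loss is what drives the one weight update for that mini-batch.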
We are actually machine-learner-learners...the 2nd derivative.
@HoHoHaHaTV
3 months ago
*Machine Learning Learners
The whole video: "Mini-batch" is like a batch but smaller. No calculus.