Backpropagation in Convolutional Neural Networks (CNNs)

In this video we look at backpropagation in a convolutional neural network (CNN). We use a simple CNN with zero padding (padding = 0) and a stride of two (stride = 2).
► SUPPORT THE CHANNEL
➡ Paypal: www.paypal.com/donate/?hosted...
These videos can take several weeks to make. Any donations towards the channel will be highly appreciated! 😄
► SOCIALS
X: x.com/far1din_
Github: github.com/far1din
Manim code: github.com/far1din/manim#back...
---------- Content ----------
00:00 - Introduction
00:51 - The Forward Propagation
02:23 - The Backpropagation
03:31 - (Intuition) Setting up Formula for Partial Derivatives
06:07 - Simplifying Formula for Partial Derivatives
07:05 - Finding Similarities
08:55 - Putting it All together
---------- Contributions ----------
Background music: pixabay.com/users/balancebay-...
#computervision #convolutionalneuralnetwork #ai #neuralnetwork #deeplearning #neuralnetworksformachinelearning #neuralnetworksexplained #neuralnetworkstutorial #neuralnetworksdemystified #computervisionandai #backpropagation

Comments: 94

  • @louissimon2463
    @louissimon2463 · 1 year ago

    Great video, but I don't understand how we can find the value of the dL/dzi terms. At 7:20 you make it seem like dL/dzi = zi, is that correct?

  • @far1din

    @far1din

    9 months ago

    No, they come from the loss function. I explain this at 4:17. It might be a bit unclear, so I'll highly recommend you watch the video from 3blue1brown: kzread.info/dash/bejne/pn2Zqq6nmtabhZs.htmlsi=Z6asTm87XWcW1bVn 😃

  • @rtpubtube

    @rtpubtube

    6 months ago

    I'm with @louissimon: you show how dL/dw1 is related to dz1/dw1 + ... (etc.), but you never show/explain where dL/dz1 (etc.) comes from. Poof - miracle occurs here. Having a numerical example would help a lot. This "theory/symbology"-only post is therefore incomplete/useless from a learning/understanding standpoint.

  • @mandy11254

    @mandy11254

    2 months ago

    @@rtpubtube It's quite literally what he wrote. He hasn't defined a loss function, so that's just what it is from the chain rule. If you're asking how the actual value of dL/dz1 is computed: the last layer has its own set of weights besides the ones shown in the video, in addition to an activation function. You use those and a defined loss function to compute dL/dzi. It's similar to what you see in standard NNs. If you studied neural networks, you should know this. This is a video about CNNs, not an intro to NNs. Go study that before this. It's not his job to point out every little thing.
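To make the reply above concrete, here is a minimal numeric sketch (all numbers and names made up for illustration; this is not the video's actual network or loss): if the flattened conv outputs z feed a single dense output with weights w2 and a squared-error loss, the dL/dzi values fall out of the chain rule through that last layer.

```python
import numpy as np

# Hypothetical setup: flattened conv outputs z1..z4, one dense output
# y_hat = w2 . z, and a squared-error loss L = 0.5 * (y_hat - y)^2.
z = np.array([1.0, 2.0, 3.0, 4.0])      # flattened conv outputs (toy values)
w2 = np.array([0.5, -1.0, 0.25, 2.0])   # dense-layer weights (toy values)
y = 3.0                                 # target

y_hat = w2 @ z                          # forward through the dense layer
dL_dyhat = y_hat - y                    # derivative of 0.5*(y_hat - y)^2
dL_dz = dL_dyhat * w2                   # chain rule: dL/dz_i = dL/dy_hat * w2_i

print(dL_dz)
```

These dL/dz_i values are exactly the upstream gradients the video's formulas take as given.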

  • @abhimanyugupta532
    @abhimanyugupta532 · 2 months ago

    Been trying to understand backpropagation in CNNs for years, until today! Thanks a ton mate!

  • @yosukesharp

    @yosukesharp

    2 months ago

    It was an obviously primitive algo, dude... people like you are being called "data scientists" now, which is really sad...

  • @JessieJussMessy
    @JessieJussMessy · 1 year ago

    This channel is a hidden gem. Thank you for your content

  • @haideralix
    @haideralix · 9 months ago

    I have seen a few videos before; this one is by far the best. It breaks down each concept and answers all the questions that come to mind. The progression and the explanation are the best.

  • @far1din

    @far1din

    9 months ago

    Thank you! 🔥

  • @DVSS77
    @DVSS77 · 1 year ago

    Really clear explanation and good pacing. I felt I understood the math behind backpropagation for the first time after watching this video!

  • @nizamuddinkhan9443
    @nizamuddinkhan9443 · 1 year ago

    Very well explained. I searched many videos but nobody explained the change in the filter's weights. Thank you so much for this animated, simple explanation.

  • @boramin3077
    @boramin3077 · 18 days ago

    Best video for understanding what is going on under the hood of a CNN.

  • @khayyamnaeem5601
    @khayyamnaeem5601 · 1 year ago

    Why is this channel so underrated? You deserve more subscribers and views.

  • @eneadriancatalin

    @eneadriancatalin

    1 year ago

    Perhaps developers use ad blockers, and as a result, KZread needs to ensure revenue by not promoting these types of videos (that's my opinion)

  • @zemariamm
    @zemariamm · 9 months ago

    Fantastic explanation!! Very clear and detailed, thumbs up!

  • @sourabhverma9034
    @sourabhverma9034 · 2 months ago

    Really intuitive and great animations.

  • @bambusleitung1947
    @bambusleitung1947 · 3 months ago

    great job. this explanation is really intuitive

  • @saikoushik4064
    @saikoushik4064 · 4 months ago

    Great explanation, helped me understand the workings behind it.

  • @DSLDataScienceLearn
    @DSLDataScienceLearn · 6 months ago

    Great explanation: clear, direct and understandable. Sub!

  • @RAHUL1181995
    @RAHUL1181995 · 1 year ago

    This was really helpful... Thank you so much for the visualization... Keep up the good work... Looking forward to your future uploads.

  • @giacomorotta6356
    @giacomorotta6356 · 1 year ago

    Great video, underrated channel. Please keep it up with the CNN videos!

  • @paedrufernando2351
    @paedrufernando2351 · 1 year ago

    Your channel is a hidden gem. My suggestion is to start a Discord and get some crowdfunding and one-on-ones for people who want to learn from you. You are gifted in teaching.

  • @ramazanyel5979
    @ramazanyel5979 · 2 months ago

    Excellent. The exact video I was looking for.

  • @guoguowg1443
    @guoguowg1443 · 3 months ago

    great stuff man, crystal clear!

  • @pedroviniciuspereirajunho7244
    @pedroviniciuspereirajunho7244 · 7 months ago

    Amazing! I was looking for material like this a long time ago and only found it here. Beautiful :D

  • @far1din

    @far1din

    7 months ago

    Thank you my brother 🔥

  • @shazzadhasan4067
    @shazzadhasan4067 · 8 months ago

    Great explanation with cool visual. Thanks a lot.

  • @far1din

    @far1din

    8 months ago

    Thank you my friend 😃

  • @markuskofler2553
    @markuskofler2553 · 1 year ago

    Couldn’t explain it better myself … absolutely amazing and comprehensible presentation!

  • @heyman620
    @heyman620 · 10 months ago

    What a masterpiece.

  • @Peterpeter-hr8gg
    @Peterpeter-hr8gg · 9 months ago

    What I was looking for. Well explained.

  • @Joker-ez2fm
    @Joker-ez2fm · 7 months ago

    Please do not stop making these videos!!!

  • @far1din

    @far1din

    7 months ago

    I won’t let you down Joker 🔥🤝

  • @user-gg2ov3up5k
    @user-gg2ov3up5k · 11 months ago

    Nicely put, thank you so much.

  • @MarcosDanteGellar
    @MarcosDanteGellar · 1 year ago

    the animations were super useful, thanks!

  • @aliewayz
    @aliewayz · 2 months ago

    really beautiful, thanks.

  • @elgs1980
    @elgs1980 · 1 year ago

    Thank you so much!!! This video is so so so well done!

  • @far1din

    @far1din

    1 year ago

    Thank you. Hope you got some value out of this! 💯

  • @aikenkazin4096
    @aikenkazin4096 · 8 months ago

    Great explanation and visualization

  • @far1din

    @far1din

    8 months ago

    Thank you my friend 🔥🚀

  • @jayeshkurdekar126
    @jayeshkurdekar126 · 1 year ago

    You are a great example of fluidity of thought and words. Great explanation!

  • @far1din

    @far1din

    1 year ago

    Thank you my friend. Hope you got some value! :)

  • @jayeshkurdekar126

    @jayeshkurdekar126

    1 year ago

    @@far1din sure did

  • @LeoMarchyok-od5by
    @LeoMarchyok-od5by · 3 months ago

    Best explanation

  • @PlabonTheSadEngineer
    @PlabonTheSadEngineer · 6 months ago

    please continue your videos !!

  • @gregorioosorio16687
    @gregorioosorio16687 · 10 months ago

    Thanks for sharing!

  • @akshchaudhary5444
    @akshchaudhary5444 · 6 months ago

    amazing video thanks!

  • @harshitbhandi5005
    @harshitbhandi5005 · 7 months ago

    great explanation

  • @osamamohamedos2033
    @osamamohamedos2033 · 3 months ago

    Masterpiece 💕💕

  • @objectobjectobject4707
    @objectobjectobject4707 · 1 year ago

    Great example thanks a lot

  • @AsilKhalifa
    @AsilKhalifa · 22 days ago

    Thanks a lot!

  • @ManishKumar-pb9gu
    @ManishKumar-pb9gu · 6 months ago

    Thank you so much for this.

  • @samiswilf
    @samiswilf · 1 year ago

    Well done.

  • @yuqianglin4514
    @yuqianglin4514 · 8 months ago

    Fab video! Helped me a lot.

  • @far1din

    @far1din

    8 months ago

    Glad to hear that you got some value out of this video! :D

  • @r0cketRacoon
    @r0cketRacoon · 6 days ago

    Thank you very much for this video, but it would probably be more helpful if you also added a max pooling layer.

  • @ziligao7594
    @ziligao7594 · 2 months ago

    Amazing

  • @farrugiamarc0
    @farrugiamarc0 · 4 months ago

    This is a topic which is rarely explained online, but it was very clearly explained here. Well done.

  • @SolathPrime
    @SolathPrime · 1 year ago

    Well explained. Now I need to code it myself.

  • @far1din

    @far1din

    1 year ago

    Haha, that’s the hard part

  • @SolathPrime

    @SolathPrime

    1 year ago

    @@far1din I think I came up with a solution. Here:

```python
def backward(self, output_gradient, learning_rate):
    kernels_gradient = np.zeros(self.kernels_shape)
    input_gradient = np.zeros(self.input_shape)
    for i in range(self.depth):
        for j in range(self.input_depth):
            kernels_gradient[i, j] = convolve2d(self.input[j], output_gradient[i], "valid")
            input_gradient[j] += convolve2d(output_gradient[i], self.kernels[i, j], "same")
    self.kernels -= learning_rate * kernels_gradient
    self.biases -= learning_rate * output_gradient
    return input_gradient
```

    First I initialized the kernels gradient as an array of zeros with the kernel shape. Then I iterated through the depth of the kernels and the depth of the input, computing each gradient with respect to the kernel; I did the same to compute the input gradients. Your vid helped me understand the backward method better, so I have to say thank you soooo much for it.

  • @SolathPrime

    @SolathPrime

    1 year ago

    @@far1din I'll document the solution and put it here when I do. Please pin the comment.

  • @far1din

    @far1din

    1 year ago

    @@SolathPrime That’s great my friend. Will pin 💯

  • @OmidDavoudnia
    @OmidDavoudnia · 3 months ago

    Thanks.

  • @simbol5638
    @simbol5638 · 7 months ago

    +1 sub, excellent video

  • @far1din

    @far1din

    7 months ago

    Thank you! 😃

  • @rodrigoroman4886
    @rodrigoroman4886 · 9 months ago

    Great video!! Your explanation is the best I have found. Could you please tell me what software you use for the animations ?

  • @far1din

    @far1din

    9 months ago

    I use manim 😃 www.manim.community

  • @PeakyBlinder-lz2gh
    @PeakyBlinder-lz2gh · 6 months ago

    thx

  • @user-ki3jf6gu6l
    @user-ki3jf6gu6l · 4 months ago

    I've had no trouble learning about the 'vanilla' neural networks. Although your videos are great, I can't seem to find resources that delve a little deeper into the explanations of how CNNs work. Are there any resources you would recommend ?

  • @im-Anarchy
    @im-Anarchy · 8 months ago

    Perfect. One suggestion: make the videos a little longer; 20-30 minutes is a good length.

  • @far1din

    @far1din

    8 months ago

    Haha, most people don't like these kinds of videos when they're too long. Average watch time for this video is about 3 minutes :P

  • @im-Anarchy

    @im-Anarchy

    8 months ago

    @@far1din Oh shii! 3 minutes, that was very unexpected. Maybe it's because people revisit the video to revise a specific topic.

  • @far1din

    @far1din

    8 months ago

    Must be 💯

  • @govindnair5407
    @govindnair5407 · 4 months ago

    What is the loss function here, and how are the values in the flattened z matrix used to compute yhat ?

  • @piyushkumar-wg8cv
    @piyushkumar-wg8cv · 9 months ago

    Great explanation. Can you please tell which tool you use for making these videos?

  • @far1din

    @far1din

    9 months ago

    Thank you my friend! I use manim 😃 www.manim.community

  • @SiddhantSharma181
    @SiddhantSharma181 · 1 month ago

    Is the stride only along the rows, and not along the columns? Is this common or just a simplification?

  • @arektllama3767
    @arektllama3767 · 1 year ago

    1:15 why do you iterate in steps of 2? If you iterated by 1 then you could generate a 3x3 output image. Is that just to save on computation time/complexity or is there some other reason for it?

  • @far1din

    @far1din

    1 year ago

    The reason why I used a stride of two (iterations in steps of two) in this video is partially random and partially because I wanted to highlight that the stride when performing backpropagation should be the same as when performing the forward propagation. In most learning materials I have seen, they usually use a stride of one, hence a stride of one for the backpropagation. This can lead to confusion when operating with larger strides. The stride could technically be whatever you like (as long as you keep it within the dimensions of the image/matrix). I could have chosen another number for the stride as you suggested. In that case, with a stride of one, the output would be a 3 x 3 matrix/image. Some will say that a shorter stride will encapsulate more information than a larger one, but this becomes "less true" as the size of the kernel increases. As far as I know there are no "rules" for when to use larger strides and when not. Please let me know if this notion has changed, as everything changes so quickly in this field! 🙂
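The output sizes discussed in this reply follow (N - K) / stride + 1 for an N x N input and K x K kernel with zero padding. A small sketch (toy numbers and a hand-rolled helper, not the video's code) showing the 2x2 vs 3x3 outputs mentioned above:

```python
import numpy as np

def conv2d(image, kernel, stride):
    """Valid cross-correlation with a given stride (padding = 0)."""
    k = kernel.shape[0]
    out = (image.shape[0] - k) // stride + 1  # output size formula
    result = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            result[i, j] = np.sum(patch * kernel)
    return result

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 input
kernel = np.ones((2, 2))                          # toy 2x2 filter

print(conv2d(image, kernel, stride=2).shape)  # (2, 2), as in the video
print(conv2d(image, kernel, stride=1).shape)  # (3, 3), with a stride of one
```

The same `conv2d` with `stride=2` must then be reused when sliding over the gradients during backpropagation, which is the point the reply makes.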

  • @arektllama3767

    @arektllama3767

    1 year ago

    @@far1din I never considered how stride length could change depending on kernel size. I guess that makes sense; the larger kernel can cover the same data as a small kernel, just in fewer steps/iterations. I also figured you intentionally generated a 2x2 image since that's a lot simpler than a 3x3, and this is an educational video. Thanks for the feedback, that was really insightful!

  • @ItIsJan
    @ItIsJan · 10 months ago

    5:24 does this just mean we divide z1 by w1 and multiply by L divided by z1, and do that for all z's, to get the partial derivative of L with respect to w1?

  • @far1din

    @far1din

    9 months ago

    It’s not that simple. Doing the actual calculations is a bit more tricky. Given no activation function, Z1 = w1*pixel1 + w2*pixel2 + w3*pixel3… you now have to take the derivative of this with respect to w1, then y = z1*w21 + z2*w22… take the derivative of y with respect to z1 etc. The calculus can be a bit too heavy for a comment like this. I’ll highly reccomend you watch the video by 3blue1brown: kzread.info/dash/bejne/pn2Zqq6nmtabhZs.htmlsi=Z6asTm87XWcW1bVn 😃

  • @bnnbrabnn9142
    @bnnbrabnn9142 · 4 months ago

    What about the weights of the fully connected layer?

  • @mandy11254

    @mandy11254

    2 months ago

    No point in adding it to this video since that's something you should know from neural networks. That's why he just leaves it as dL/dzi.

  • @MoeQ_
    @MoeQ_ · 1 year ago

    dL/dzi = ??

  • @far1din

    @far1din

    9 months ago

    I explain the term at 4:17. It might be a bit unclear, so I'll highly recommend you watch the video from 3blue1brown: kzread.info/dash/bejne/pn2Zqq6nmtabhZs.htmlsi=Z6asTm87XWcW1bVn 😃

  • @user-oq7ju6vp7j
    @user-oq7ju6vp7j · 9 months ago

    You have nice videos that helped me better understand the concept of CNNs. But from this video it is not really obvious that the matrix dL/dw is a convolution of the image matrix and the dL/dz matrix, as shown here: kzread.info/dash/bejne/gqJrtK1wpNLMgMo.html. The stride of two is also a little bit confusing.

  • @far1din

    @far1din

    9 months ago

    Thank you for the comment! I believe he is doing the exact same thing (?). I chose a stride of two in order to highlight that the stride should be the same as the stride used during the forward propagation. Most examples stick with a stride of one. I now realize it might have caused some confusion :p
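The claim in this exchange, that dL/dw is a "valid" cross-correlation of the input image with the dL/dz matrix, can be checked numerically. A quick sketch (my own toy numbers, stride 1 and padding 0; G stands in for the assumed dL/dz values): with upstream gradients G fixed, the finite-difference gradient of the loss with respect to the kernel should match the correlation of the image with G.

```python
import numpy as np

image = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
G = np.array([[0.5, -1.0],
              [0.25, 2.0]])  # assumed dL/dz values (toy numbers)

def loss(w):
    # forward conv (stride 1, 2x2 kernel), then a loss whose dL/dz is exactly G
    z = np.array([[np.sum(w * image[i:i+2, j:j+2]) for j in range(2)]
                  for i in range(2)])
    return np.sum(G * z)

# gradient via central finite differences
w0 = np.array([[0.1, 0.2], [0.3, 0.4]])
eps = 1e-6
num_grad = np.zeros((2, 2))
for a in range(2):
    for b in range(2):
        wp, wm = w0.copy(), w0.copy()
        wp[a, b] += eps
        wm[a, b] -= eps
        num_grad[a, b] = (loss(wp) - loss(wm)) / (2 * eps)

# "valid" cross-correlation of the image with G
corr = np.array([[np.sum(G * image[a:a+2, b:b+2]) for b in range(2)]
                 for a in range(2)])

print(np.allclose(num_grad, corr))  # the two gradients agree
```

With a stride of two, the same identity holds as long as the windows slide with that same stride, which is the point made in the reply above.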

  • @burerabiya7866
    @burerabiya7866 · 1 year ago

    Hello, well explained. I need your presentation.

  • @far1din

    @far1din

    11 months ago

    Just download it 😂

  • @int16_t
    @int16_t · 10 months ago

    w^* is an abuse of math notation, but it's convenient.

  • @CorruptMem
    @CorruptMem · 1 year ago

    I think it's spelled "Convolution"

  • @far1din

    @far1din

    1 year ago

    Haha thank you! 🚀