Batch normalization | What it is and how to implement it

In this video, we will learn about Batch Normalization. Batch Normalization is a secret weapon that can solve many problems at once: it is a great tool for dealing with unstable gradients, helps reduce overfitting, and may even make your models train faster.
We will first go into what batch normalization is and how it works. Then we will talk about why you might want to use it in your projects and some of its benefits. Lastly, we will learn how to apply Batch Normalization to your models using Python and Keras. Even though it is fairly simple to apply Batch Normalization with Keras, we will touch on some details that need extra care.
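
For readers who want to follow along, here is a minimal sketch of this kind of Keras model (the layer sizes and input shape are illustrative assumptions, not the exact architecture from the video):

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small illustrative classifier with Batch Normalization between layers.
model = keras.Sequential([
    keras.Input(shape=(28, 28)),        # e.g. MNIST-sized grayscale images
    layers.Flatten(),
    layers.Dense(128, use_bias=False),  # bias is redundant before BN
    layers.BatchNormalization(),        # normalize the pre-activations
    layers.Activation("relu"),
    layers.Dense(64, use_bias=False),
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```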

Comments: 58

  • @ludwigstumpp · 1 year ago

    Thanks for doing these videos. :) As someone who is familiar with Batch Normalization, I was personally missing a few important pieces of information, which is why I add them here for the community:
    - The normalization happens over the batch dimension (in contrast to other variants such as layer normalization, where we normalize over the layer dimension), meaning that we normalize each feature over the mini-batch. This is why it does not work well for smaller batch sizes (usually 16+ is needed).
    - Another advantage of the scale and offset parameters is that they allow the network to undo the BN, meaning that BN can't make your result worse.
    - At test time, with e.g. one sample only, we can't compute the mean and std since we don't have a batch. This is why we use running statistics of the mean and variance calculated during training.
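
A rough NumPy sketch of the mechanics described above (an illustration of the usual textbook formulation, not code from the video): statistics are computed per feature across the batch axis during training, and running estimates are reused at test time.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, running_mean, running_var,
                     momentum=0.9, eps=1e-5):
    """x has shape (batch, features); stats are taken over the batch axis."""
    mean = x.mean(axis=0)   # per-feature mean over the mini-batch
    var = x.var(axis=0)     # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Keep running statistics for inference, where no batch is available.
    running_mean = momentum * running_mean + (1 - momentum) * mean
    running_var = momentum * running_var + (1 - momentum) * var
    return gamma * x_hat + beta, running_mean, running_var

def batch_norm_infer(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """At test time (even with a single sample), reuse the running stats."""
    x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    return gamma * x_hat + beta
```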

  • @AssemblyAI · 1 year ago

    Thank you for the additional information Ludwig!

  • @malayali_thaaram · 1 year ago

    Thank you for clarifying this!

  • @celilkaangungor6570 · 11 days ago

    Congratulations, the channel with the best explanations!

  • @AlexKashie · 9 months ago

    Best explanation of Normalization and Standardization... Thank you

  • @ramiallouch1052 · 1 year ago

    Simple and very useful. Thank you for this great content.

  • @donmiguel4848 · 2 months ago

    The hint to omit the bias when a batch-norm layer comes afterwards is very good. The information that batch norm can be used while omitting the learnable scaling and offset would also be helpful, because that functionality is computationally expensive and not the core feature of batch norm.
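
In Keras, both points correspond to constructor arguments; a short sketch using the standard Keras API:

```python
from tensorflow.keras import layers

# The bias is redundant when BN follows, since BN's offset (beta) replaces it.
dense = layers.Dense(256, use_bias=False)

# The learnable offset (beta) and scale (gamma) can also be disabled;
# the layer then only standardizes to zero mean and unit variance.
bn = layers.BatchNormalization(center=False, scale=False)
```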

  • @mahdighribi4151 · 23 days ago

    Best explanation, thank you!

  • @paulj.murphy7447 · 8 months ago

    great work! thank you.

  • @sajolsajol8393 · 11 months ago

    Thank you... that was awesome...

  • @juliusodunuga8911 · 1 year ago

    Thank you 👍🏻👍🏻👍🏻 The explanation is superb

  • @user-fh7yk8nm6s · 1 year ago

    Very clear explanation!

  • @karthiklogan9384 · 1 year ago

    Amazing, thanks a lot.

  • @Luca-yy4zh · 1 year ago

    No one really knows why BN works at the moment. The best intuition we have is that it counteracts the internal covariate shift problem during training.

  • @codematrix · 1 year ago

    I think you just answered your own question. It's to help keep features and activation values within a finite range, thus avoiding exploding and vanishing gradients. Having said that, though, isn't that what the sigmoid and tanh activation functions are supposed to do?

  • @bwowekeith4472 · 1 year ago

    @codematrix Well, you might end up getting dead neurons without batch normalization.

  • @eyadaiman1559 · 2 years ago

    Thank you, please keep making videos like this. Your explanation is simple and clear.

  • @AssemblyAI · 2 years ago

    Great to hear, thank you!

  • @VritanshKamal · 7 months ago

    Nicely explained! I liked the part where we start from definitions.

  • @mariajoseapolo1989 · 1 year ago

    Thanks for the info, it was super clear and easy to understand. (:

  • @AssemblyAI · 1 year ago

    Great to hear :)

  • @sharadchandakacherla8268 · 1 year ago

    great!

  • @spider853 · 1 year ago

    First, thanks for an amazing video; it answered many questions. It felt like you predicted my questions during the video and gave the answers right after! I have one question: if we don't normalize the input and use BatchNormalization, wouldn't it behave completely differently? For example, say we feed training images with luminance values of 0-200, but during real-world inference or validation we use other images that have full-scale luminance values of 0-255. Given that we know the range of our luminance values during modeling, wouldn't it be better to use pre-normalization, since Batch Normalization would behave incorrectly during the real-world/validation process?
    P.S. To avoid any confusion about the data and why we didn't feed 0-255 before: say we have grayscale images and we don't know whether they're all in range, or what we'll get during validation; basically a random split.
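
One way to address this concern (a sketch of one possible approach, not the video's answer) is to freeze the input statistics from the training set with a preprocessing Normalization layer, so that inference applies the same constants no matter what range the incoming images span:

```python
import numpy as np
from tensorflow.keras import layers

# Hypothetical training images with luminance limited to [0, 200].
train_images = np.random.uniform(0, 200, size=(1000, 28, 28)).astype("float32")

# adapt() computes and stores the mean/variance of the training data, so the
# same statistics are applied at inference even if new images span [0, 255].
norm = layers.Normalization(axis=None)
norm.adapt(train_images)
```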

  • @narendrapratapsinghparmar91 · 4 months ago

    Thanks

  • @CharlyBraga · 7 months ago

    Thanks

  • @anshulzade6355 · 1 year ago

    This is really nice. Please keep up the good work, the world needs it. If possible, could you also share the notebook? Lots of love from India!

  • @AssemblyAI · 1 year ago

    Thank you!

  • @Omsip123 · 4 months ago

    What I learned from other videos is that all of this is applied across the samples in a batch, for each weight, not across all weights. I hope I got it right…

  • @malayali_thaaram · 1 year ago

    Great explanation!

  • @AssemblyAI · 1 year ago

    Thank you!

  • @jeffinkachappilly9708 · 1 year ago

    Thanks a lot.

  • @AssemblyAI · 1 year ago

    You're very welcome!

  • @nayanparnami8554 · 1 year ago

    Superb explanation of one of the important interview questions. Great work! 👍🏻👍🏻👍🏻 Thanks for the video.

  • @AssemblyAI · 1 year ago

    You're very welcome!

  • @tachyon7777 · 1 year ago

    Marvelous.

  • @AssemblyAI · 1 year ago

    Thank you! - Mısra

  • @saikalyangonuguntla594 · 1 year ago

    Your explanation is very good. Keep making more videos on data science concepts.

  • @AssemblyAI · 1 year ago

    Thank you Sai!

  • @nazaninadavoodi3563 · 1 year ago

    perfect explanation 😍

  • @AssemblyAI · 1 year ago

    Thank you :)

  • @mohammedamirjaved8418 · 2 years ago

    Love...

  • @mayankanand111 · 2 years ago

    I have watched this video over and over; you explained it very well, though I am not a fan of Keras. A from-scratch implementation would have been more helpful.

  • @AssemblyAI · 2 years ago

    Hey Mayank, thank you! I'm glad to hear it was helpful. -Mısra

  • @mehrdadkazemi3969 · 2 years ago

    ty

  • @AssemblyAI · 2 years ago

    You're welcome :)

  • @DavidPham86 · 1 year ago

    nice

  • @AssemblyAI · 1 year ago

    Thanks

  • @j7m7f · 1 year ago

    I got the impression that this video is about normalization in general. There is nothing about batches or about what BATCH normalization actually means.

  • @ssnprakash · 5 months ago

    worth it

  • @malikfahadsarwar2281 · 1 year ago

    So if we have three columns, age, weight, and height, all on different scales, we don't have to scale them separately; instead we can use batch normalization to bring them to the same scale?

  • @art4eigen93 · 1 year ago

    This is great! This video needs a much higher view count. What is going on, @youtube? You need to work on your ranking algorithm.

  • @AssemblyAI · 1 year ago

    Hahah that's great to hear that you like the video Aritra!!

  • @donmiguel4848 · 2 months ago

    Why does she begin by distinguishing between normalization and standardization, and then at 4:30 describe standardization (mean = 0, var = 1) under the name "batch normalization"? Does anyone understand this approach?

  • @grownupgaming · 1 year ago

    Batch normalization works on all samples in a batch, but only on one feature at a time, I thought?

  • @grownupgaming · 1 year ago

    2:22 These are two separate features, right? The number of phones, and the amount of money withdrawn from the ATM.

  • @ilkeyigiter · 1 year ago

    Are you Turkish? We are so curious :)

  • @donmiguel4848 · 2 months ago

    There is a difference between squeezing the data between 0 and 1 and, on the other hand, pushing the mean to 0 and the variance to 1. The presenter does not seem to distinguish between the two at 10:20! A normalization layer used this way only sees the statistics of each individual 28x28 image, and the statistics of the other 60,000 images are all different, so each image is normalized with its own metrics, in contrast to just squeezing the data between 0 and 1 by dividing by 255. She misuses the workflow here: the network has to work for all input images, so the input should be adjusted using the average statistics of all images, if at all. Her initial /255 method is far better for the MNIST data than the lazy NormalizationLayer use she is advising, which is much more computationally expensive on top of that.
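
To make the distinction concrete, a sketch of the two preprocessing options being compared (illustrative code, not from the video):

```python
import numpy as np

# A small stand-in for the 60,000 MNIST images (random values for illustration).
images = np.random.randint(0, 256, size=(100, 28, 28)).astype("float32")

# Option 1: rescale every image by the same fixed constant -> range [0, 1].
rescaled = images / 255.0

# Option 2: standardize each image with its own statistics -> mean 0, var 1,
# but the scaling now differs from image to image.
mean = images.mean(axis=(1, 2), keepdims=True)
std = images.std(axis=(1, 2), keepdims=True)
per_image = (images - mean) / (std + 1e-7)
```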