The Bias Variance Trade-Off

The machine learning consultancy: truetheta.io
Want to work together? See here: truetheta.io/about/#want-to-w...
Article on the topic: truetheta.io/concepts/machine...
The Bias Variance Trade-Off is an essential perspective for developing models that will perform well out-of-sample. In fact, it's so important for modeling that most hyperparameters are designed to move you between the high bias-low variance and low bias-high variance ends of the spectrum. In this video, I explain exactly what it says, how it works intuitively, and how it's typically used.
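
For reference, here is one common way to write the decomposition for squared error at a fixed input x_0. It is a sketch in the style of section 2.9 of [1], assuming Y = f(X) + noise, where the noise has mean zero and variance sigma^2. The expectation is taken over training sets and the noise, and the notation is an assumption that may differ from what appears on screen.

```latex
\[
\mathrm{EPE}(x_0)
  = \mathbb{E}\big[(Y - \hat{f}(x_0))^2 \mid X = x_0\big]
  = \underbrace{\sigma^2}_{\text{irreducible noise}}
  + \underbrace{\big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{Var}\big(\hat{f}(x_0)\big)}_{\text{variance}}
\]
```

Hyperparameters that make the learning algorithm more flexible tend to shrink the bias term and grow the variance term, while the noise term is unaffected by the choice of model.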
SOCIAL MEDIA
LinkedIn : / dj-rich-90b91753
Twitter : / duanejrich
Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation
TIMESTAMPS
0:00 The Importance and my Approach
0:46 The Bias Variance Trade-Off at a High Level
2:06 A Supervised Learning Regression Task and Our Goal
3:41 Evaluating a Learning Algorithm
5:39 The Bias Variance Decomposition
7:19 An Example True Function
8:07 An Example Learning Algorithm
9:41 Seeing the Bias Variance Trade-Off
12:59 Final Comments
SOURCES
The explanation I've reviewed the most is in section 2.9 of [1]. Also, I found Kilian Weinberger's excellent lecture [2] useful. If you'd like to learn how this concept generalizes beyond a regression model's squared error, see [3].
[1] Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
[2] Weinberger, K. (2018). Machine Learning Lecture 19, Bias Variance Decomposition, Cornell CS4780 SP17. YouTube, • Machine Learning Lectu...
[3] Tibshirani, R. (1996). Bias, variance and prediction error for classification rules. Department of Preventive Medicine and Biostatistics and Department of Statistics, University of Toronto, Toronto, Canada.

Comments: 80

  • @karanshah1698
    @karanshah1698 3 years ago

    The moment you flashed the decomposed equation, it clicked for me: this looks a lot like the Epistemic and Aleatoric Uncertainty components. P.S.: We need much more quality content like this on high-end academic literature, please keep going full throttle. You earned my subscription!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you very much! I’m not familiar with those components, but I’m glad to hear you are seeing relationships I don’t :) and will do, I have 4-5 videos in the pipeline. New one every 3 weeks!

  • @superman39756
    @superman39756 1 year ago

    This channel will explode soon - quality of content is too good, thank you!

  • @charudattamanwatkar8340
    @charudattamanwatkar8340 1 year ago

    Your videos have the perfect balance between rigor and simplicity. Kudos to you! Keep making such great content. You're destined to be really successful. 🎉

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    I appreciate that! Hope you're right :)

  • @arongil
    @arongil 10 months ago

    I love the humor at the end ("if you make the heroic move of checking my sources in the description"). I'm learning so much from you, thank you!

  • @hspadim
    @hspadim 3 years ago

    Incredible work man! I’m truly looking forward to more content!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you! More coming!

  • @taotaotan5671
    @taotaotan5671 2 years ago

    Wow, this video truly opened my mind. I have heard this term from ML people many, many times, but it remained vague until I watched this video!

  • @juanvelez3889
    @juanvelez3889 2 years ago

    This is sooooo good. Thanks a lot for sharing your knowledge in such an amazing explanation!

  • @ConnectinDG
    @ConnectinDG 2 years ago

    I have been reading a lot on the bias-variance trade-off and have been using it for some time now. But the way you explained it with amazing visuals was mind-blowing and very intuitive to understand. I totally like your content and will keep waiting for more content like this in the future.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Excellent! More coming soon!

  • @Boringpenguin
    @Boringpenguin 1 year ago

    This is probably the best take on the Bias Variance Trade-Off I have ever seen on YouTube; the one from ritvikmath is a close second. Please don't ever stop making videos like this, great stuff :)

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Currently, the plan is to keep going - Thanks!

  • @winoo1967
    @winoo1967 3 years ago

    Great video!! The beginning as a creator in yt is pretty hard, so don't give up

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you! I won’t, especially with the encouragement

  • @kellsierliosan4404
    @kellsierliosan4404 2 years ago

    I am studying for an MSc in Stats at a decent uni and I have to say that your channel is damn amazing. Good job there, the intuition that you manage to put in your videos is mind-blowing. You gained a subscriber :)

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thank you! Very happy to have you. More good stuff coming soon :)

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    And if you think it’d be helpful to your classmates, please share it with them 😁

  • @vladvladislav4335
    @vladvladislav4335 2 years ago

    Hey DJ, the quality of your videos is mind-blowing. I subscribed even before watching the video to the end. I'm 100% sure your channel will blow up in the near future!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thank you brother! I’m very happy to hear you like them and excited to have you as a sub. More to come!

  • @akhaita
    @akhaita 2 years ago

    Beautifully done!

  • @kashvinivini2264
    @kashvinivini2264 2 years ago

    Highly underrated video! Great work

  • @arminkashani5695
    @arminkashani5695 1 year ago

    Great explanation! Thanks so much.

  • @FarizDarari
    @FarizDarari 1 year ago

    Simply awesome explanation!

  • @revooshnoj4078
    @revooshnoj4078 1 year ago

    clearly explained thanks!

  • @orvvro
    @orvvro 3 years ago

    Thank you, very clear video

  • @tirimula
    @tirimula 1 year ago

    Awesome graphical visualization.

  • @NoNTr1v1aL
    @NoNTr1v1aL 2 years ago

    Amazing video!

  • @murilopalomosebilla2999
    @murilopalomosebilla2999 2 years ago

    Well explained! Thanks!!

  • @MP-if2kf
    @MP-if2kf 2 years ago

    Great video!

  • @Kopakabana001
    @Kopakabana001 3 years ago

    Awesome info!

  • @cerioscha
    @cerioscha 1 year ago

    Great video, thanks! I've never seen this explained in a regression context, only for classification in terms of VC dimension.

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Glad you appreciate it. This is an old video, and I've since learned to lighten up on the on-screen text, but I'm glad it still works for some.

  • @antoinestevan5310
    @antoinestevan5310 3 years ago

    Really nice content and intuitions, liked it a lot!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you!

  • @navanarun
    @navanarun 10 months ago

    Thank you! This is amazing content.

  • @akhilezai
    @akhilezai 3 years ago

    This is GOLD

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you very much! :)

  • @loukafortin6225
    @loukafortin6225 3 years ago

    this needs more views

  • @wexwexexort
    @wexwexexort 4 months ago

    fantastic visualizations

  • @AbhishekJain-bv6vv
    @AbhishekJain-bv6vv 2 years ago

    Until 7:22, I thought this was very theoretical, but as soon as you started the animations, everything made more sense and became clear. Truly incredible, amazing work. Lots of love from India, and please keep up the good work. You are the 3blue1brown of data science.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thank you, encouragement like this means a lot. I’ll make sure to keep the good stuff coming :)

  • @AbhishekJain-bv6vv

    @AbhishekJain-bv6vv

    2 years ago

    @@Mutual_Information I am a student at IIT Kanpur (one of the premier institutes of India), and I am currently taking a course, Statistical Methods for Business Analytics. Here is the link to the playlist (the lecture slides are in the description): kzread.info/head/PLEDn2e5B93CZL-T8Srj_wz_5FIjLMMoW- Just play any video in it, and tell me whether you would be willing to learn from these videos. The way of teaching is lagging far behind in our country.

  • @eulefranz944
    @eulefranz944 2 years ago

    Excellent.

  • @peterkonig9537
    @peterkonig9537 6 months ago

    Super cool stuff.

  • @JoseManuel-pn3dh
    @JoseManuel-pn3dh 7 months ago

    Thanks, you're carrying my MSc

  • @Lucifer-wd7gh
    @Lucifer-wd7gh 2 years ago

    I can see a bright future for this channel. Good job man. Keep uploading ❤️. From United States Of India 🇮🇳😆

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Will do!

  • @LvlLouie
    @LvlLouie 2 years ago

    Subscribed! I want to learn this stuff but I'm not sure where to start!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Well I may be biased, but I think this channel is a fine place to start :)

  • @partyhorse420
    @partyhorse420 1 year ago

    Have you seen recent results in deep learning that show larger neural networks have both lower bias and lower variance than smaller models? Past a point, more parameters give less variance, which is amazing! See “Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition”, Adlam et al.

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    I hadn't seen this before but now that I've read some of it, it's quite an interesting idea. Maybe it explains some of the weird behavior observed in the Grokking paper? I still am mystified by how these deep NNs sometimes defy the typical U shape of test error.. wild! Thanks for sharing

  • @bajdoub
    @bajdoub 3 years ago

    Excellent video! One question I have: what is the relationship between EPE and the mean squared error (MSE) loss we usually optimize in practice for regression problems? Is EPE an expected value of MSE? Or is MSE only related to the bias term in EPE? Or are they completely unrelated?

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Glad you enjoyed it! They are certainly related :) To make MSE and EPE comparable, the first thing we'd have to do is integrate EPE(x_0) over the domain of x, which we can call EPE, as you do. In that case, MSE is a biased estimate of EPE (to answer your question, it's an estimate of the whole of EPE, not any one of the terms). The MSE is going to be more optimistic/lower than EPE. This is because when fitting, you choose parameters to make MSE low. If you had many parameters, you could make MSE really low (overfitting!). But EPE measures how good your model is relative to p(x, y), and more parameters don't necessarily mean a better model! To get a better estimate, you could look at MSE out of sample. And that's what we do to determine those hypers.
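
The point in the reply above, that in-sample MSE is an optimistic estimate of EPE while out-of-sample MSE is a better one, can be sketched with a small simulation. Everything here is an illustrative assumption rather than the setup from the video: the sine true function, the noise level, the sample sizes, and the polynomial degree are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
NOISE_SD = 0.3  # assumed noise level; the irreducible error is NOISE_SD ** 2


def sample(n):
    """Draw n (x, y) pairs from a hypothetical data generating process."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, NOISE_SD, n)
    return x, y


def mse(coefs, x, y):
    """Mean squared error of a fitted polynomial on the given data."""
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))


degree = 6  # a fairly flexible model for only 20 training points
train_mse, test_mse = [], []
for _ in range(500):
    x_tr, y_tr = sample(20)      # data used for fitting
    x_te, y_te = sample(1000)    # fresh data from the same process
    coefs = np.polyfit(x_tr, y_tr, degree)
    train_mse.append(mse(coefs, x_tr, y_tr))
    test_mse.append(mse(coefs, x_te, y_te))

avg_train = float(np.mean(train_mse))
avg_test = float(np.mean(test_mse))
print(f"average in-sample MSE:     {avg_train:.3f}")
print(f"average out-of-sample MSE: {avg_test:.3f}")
print(f"noise variance:            {NOISE_SD ** 2:.3f}")
```

In this sketch the average in-sample MSE falls below even the noise variance, which no model can beat out of sample, while the out-of-sample MSE sits above it. That gap is the optimism described above.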

  • @bajdoub

    @bajdoub

    3 years ago

    @@Mutual_Information thanks so much for taking the time to reply! I will need some time, and probably another pass of the video and putting things on paper, before I digest it all :-D but you have given me all the elements of an explanation. Keep up the good work, your videos are some of the best out there, you put the bar very high! :-)

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    @@bajdoub thanks! It means a lot. I’ll try to keep the standard high :)

  • @theleastcreative
    @theleastcreative 4 months ago

    It feels like you're reading out of that textbook on the table behind you

  • @Mutual_Information

    @Mutual_Information

    3 months ago

    The whole channel started b/c I actually wanted to write a book on ML, but then I figured few people would read it, so I might as well communicate the same ideas on a YT channel, where they had a better chance. Literally, I'd say "It's a textbook in video format". But then I realized it can make the videos very dense and a little dry. So I've evolved a bit since.

  • @jadtawil6143
    @jadtawil6143 2 years ago

    Subscribed. Would you mind sharing how to quickly make the visuals with the math equations? I'd love to use a similar resource for my students.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Hey Jad. I have plans to open source my code for this, but it’s not ready yet. I’ll make an announcement when it’s ready.

  • @Throwingness
    @Throwingness 2 years ago

    I love the channel. I have a few topic requests... KL Divergence. Diffusion Networks. Policy Gradient RL models.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Policy Gradient RL methods will be out this summer! Diffusion.. that's a whole beast I don't have plans for right now. I'd need to learn quite a bit to get up to speed. KL Divergence, for sure I'll do that. Possibly later this year.

  • @Throwingness

    @Throwingness

    2 years ago

    @@Mutual_Information Diffusion. Did you see DALL-E 2? It's a milestone. I can't wait for the music and videos a system like this will create.

  • @sathyakumarn7619
    @sathyakumarn7619 2 years ago

    This video clearly deserves a lot more views than this. Keep up the good work.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thanks! Slowly things are improving. I think eventually more people will come to appreciate this one.

  • @manueltiburtini6528
    @manueltiburtini6528 1 year ago

    A masterpiece of yt

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    I'm glad you think so.. I was actually thinking about re-doing this one

  • @MegaSesamStrasse
    @MegaSesamStrasse 1 year ago

    So can I understand bias and variance in terms of a sampling distribution from which my specific model is taken? If the variance is high, the mean of this sampling distribution will be quite close to the true value. But since the variance of this distribution is so large, it is unlikely that my specific model represents the true value (but not impossible?). And if the model is very low in complexity, the variance of the sampling distribution will be quite small. But since the expected value from the sampling distribution is far from the true value, it is very unlikely that my specific model represents the true value?

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    That sounds about right. Think of it this way. There is some true data generating mechanism that is unknown to your model. A complex model is more likely to be able to capture it. In doing so, if you re-sample from the true data generating process, fit the model, and look at the average of those fits, then that average will equal the average of the true distribution. This is what I mean when I say "The complex model can 'capture' the true data generating mechanism". Aka, the model is low bias. However, the cost of such flexibility is that the model produces very different ("high variance") fits over different re-samplings of the data. Does that make sense?
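
The re-sample, fit, and average procedure described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the code behind the video: the sine true function, the noise level, the fixed grid of inputs, and the two polynomial degrees are hypothetical, and for simplicity only the noisy targets are re-drawn on each re-sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_SD = 0.3                      # assumed noise level
X_TRAIN = np.linspace(0.0, 1.0, 30)  # inputs held fixed across re-samplings


def true_f(x):
    """Hypothetical true function, unknown to the learning algorithm."""
    return np.sin(2 * np.pi * x)


def fit_and_predict(degree, x0):
    """Re-draw noisy targets, fit a polynomial, and predict at x0."""
    y = true_f(X_TRAIN) + rng.normal(0.0, NOISE_SD, X_TRAIN.size)
    coefs = np.polyfit(X_TRAIN, y, degree)
    return np.polyval(coefs, x0)


def bias_sq_and_variance(degree, x0=0.25, n_resamples=2000):
    """Estimate squared bias and variance of the prediction at x0."""
    preds = np.array([fit_and_predict(degree, x0) for _ in range(n_resamples)])
    bias_sq = float((preds.mean() - true_f(x0)) ** 2)  # average fit vs. truth
    variance = float(preds.var())                      # spread across fits
    return bias_sq, variance


simple = bias_sq_and_variance(degree=1)    # low complexity
flexible = bias_sq_and_variance(degree=7)  # high complexity
print(f"degree 1: bias^2 = {simple[0]:.4f}, variance = {simple[1]:.4f}")
print(f"degree 7: bias^2 = {flexible[0]:.4f}, variance = {flexible[1]:.4f}")
```

Under these assumptions, the straight line's average fit stays far from the true value at x0 (high bias) but barely moves between re-samplings, while the degree 7 polynomial averages out close to the truth (low bias) at the cost of a larger spread across individual fits.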

  • @melodyparker3485
    @melodyparker3485 3 years ago

    Do you use the Manim Python Library for your animation?

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    No, though I should explore that one day. I use a personal library that leans heavily on Altair, which is a Python static plotting library based on d3.

  • @melodyparker3485

    @melodyparker3485

    3 years ago

    @@Mutual_Information Cool!

  • @AlisonStuff
    @AlisonStuff 3 years ago

    Woo!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Haha thank you sister

  • @AlisonStuff

    @AlisonStuff

    3 years ago

    @@Mutual_Information you're welcome brother. How are you? How was your day?

  • @ilyboc
    @ilyboc 2 years ago

    😮😮😯❤️

  • @ClosiusBeg
    @ClosiusBeg 2 years ago

    Man, please more pictures..

  • @kashvinivini2264
    @kashvinivini2264 2 years ago

    Please provide subtitles for foreign language speakers!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    I have a list of outstanding changes I need to make and this is one of them. I’ll make it a priority! Thanks for the feedback.