The Bias Variance Trade-Off
The machine learning consultancy: truetheta.io
Want to work together? See here: truetheta.io/about/#want-to-w...
Article on the topic: truetheta.io/concepts/machine...
The Bias Variance Trade-Off is an essential perspective for developing models that will perform well out-of-sample. In fact, it's so important for modeling that most hyperparameters are designed to move you between the high bias-low variance and low bias-high variance ends of the spectrum. In this video, I explain exactly what it says, how it works intuitively, and how it's typically used.
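As a rough illustration of the trade-off the video describes, here is a minimal simulation sketch (not from the video; the sine true function, polynomial learners, and all parameters are my own assumptions): refitting models of different complexity on many resampled datasets lets you estimate squared bias and variance of the prediction at a test point directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical "true function" for the simulation -- not from the video.
    return np.sin(2 * np.pi * x)

def bias_variance_at(degree, x0=0.25, n_datasets=200, n_points=30, noise_sd=0.3):
    """Refit a polynomial of the given degree on many resampled datasets and
    estimate the squared bias and variance of its prediction at x0."""
    preds = np.empty(n_datasets)
    for i in range(n_datasets):
        x = rng.uniform(0, 1, n_points)
        y = true_f(x) + rng.normal(0, noise_sd, n_points)
        preds[i] = np.polyval(np.polyfit(x, y, degree), x0)
    bias_sq = (preds.mean() - true_f(x0)) ** 2   # (average prediction - truth)^2
    variance = preds.var()                       # spread of predictions across refits
    return bias_sq, variance

for degree in (1, 3, 9):
    b, v = bias_variance_at(degree)
    print(f"degree {degree}: bias^2 = {b:.4f}, variance = {v:.4f}")
```

Low-degree fits tend to show large squared bias and small variance; high-degree fits the reverse. That is the trade-off in miniature.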
SOCIAL MEDIA
LinkedIn : / dj-rich-90b91753
Twitter : / duanejrich
Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation
TIMESTAMPS
0:00 The Importance and my Approach
0:46 The Bias Variance Trade-Off at a High Level
2:06 A Supervised Learning Regression Task and Our Goal
3:41 Evaluating a Learning Algorithm
5:39 The Bias Variance Decomposition
7:19 An Example True Function
8:07 An Example Learning Algorithm
9:41 Seeing the Bias Variance Trade-Off
12:59 Final Comments
SOURCES
The explanation I've reviewed the most is in section 2.9 of [1]. Also, I found
Kilian Weinberger's excellent lecture [2] useful. If you'd like to learn how this concept generalizes beyond a regression model's squared error, see [3].
[1] Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
[2] Weinberger, K. (2018). Machine Learning Lecture 19: Bias Variance Decomposition, Cornell CS4780 SP17. YouTube, • Machine Learning Lectu...
[3] Tibshirani, R. (1996). Bias, Variance and Prediction Error for Classification Rules. Department of Preventive Medicine and Biostatistics and Department of Statistics, University of Toronto, Toronto, Canada.
Comments: 80
The moment you flashed the decomposed equation, it clicked for me that this looks a lot like the epistemic and aleatoric uncertainty components. P.S.: We need much more quality content like this on high-end academic literature, please keep going full throttle. You earned my subscribe!
@Mutual_Information
3 years ago
Thank you very much! I’m not familiar with those components, but I’m glad to hear you are seeing relationships I don’t :) and will do, I have 4-5 videos in the pipeline. New one every 3 weeks!
This channel will explode soon - quality of content is too good, thank you !
Your videos have the perfect balance between rigor and simplicity. Kudos to you! Keep making such great content. You're destined to be really successful. 🎉
@Mutual_Information
A year ago
I appreciate that! Hope you're right :)
I love the humor at the end ("if you make the heroic move of checking my sources in the description"). I'm learning so much from you, thank you!
Incredible work man! I'm truly looking forward to more content!
@Mutual_Information
3 years ago
Thank you! More coming!
Wow, this video truly opened my mind. I had heard this term from ML people many, many times, but it remained vague until I watched this video!
This is sooooo good. Thanks a lot for sharing your knowledge in such an amazing explanation!
I have been reading a lot on the bias-variance trade-off and have been using it for some time now. But the way you explained it, with amazing visuals, was mind-blowing and very intuitive to understand. Totally like your content and will keep waiting for more content like this in the future.
@Mutual_Information
2 years ago
Excellent! More coming soon!
This is probably the best take on the Bias Variance Trade-Off I have ever seen on YouTube; the one from ritvikmath is a close second. Please don't ever stop making videos like this, great stuff :)
@Mutual_Information
A year ago
Currently, the plan is to keep going - Thanks!
Great video!! The beginning as a creator on YouTube is pretty hard, so don't give up
@Mutual_Information
3 years ago
Thank you! I won’t, especially with the encouragement
I am studying for an MSc in Stats at a decent uni and I have to say that your channel is damn amazing. Good job there; the intuition that you manage to put in your videos is mindblowing. You gained a subscriber :)
@Mutual_Information
2 years ago
Thank you! Very happy to have you. More good stuff coming soon :)
@Mutual_Information
2 years ago
And if you think it'd be helpful to your classmates, please share it with them 😁
Hey DJ, the quality of your videos is mindblowing; I subscribed even before watching the video till the end. I'm 100% sure your channel will blow up in the near future!
@Mutual_Information
2 years ago
Thank you brother! I’m very happy to hear you like them and excited to have you as a sub. More to come!
Beautifully done!
Highly underrated video! Great work
Great explanation! Thanks so much.
Simply awesome explanation!
clearly explained thanks!
Thank you, very clear video
Awesome graphical visualization.
Amazing video!
Well explained! Thanks!!
Great video!
Awesome info!
Great video thanks!. I've never seen this explained in a regression context, only for classification in terms of VC dimension.
@Mutual_Information
A year ago
Glad you appreciate it. This is an old video, and I've since learned to lighten up on the on-screen text, but I'm glad it still works for some.
really nice content and intuitions, liked it a lot !
@Mutual_Information
3 years ago
Thank you!
Thank you! this is amazing content.
This is GOLD
@Mutual_Information
3 years ago
Thank you very much! :)
this needs more views
fantastic visualizations
Until 7:22, I thought this was very theoretical, but as soon as you started the animations, everything made more sense and became clear. Truly incredible, amazing work. Lots of love from India, and please keep up the good work. You are the 3blue1brown of data science.
@Mutual_Information
2 years ago
Thank you, encouragement like this means a lot. I’ll make sure to keep the good stuff coming :)
@AbhishekJain-bv6vv
2 years ago
@@Mutual_Information I am a student at IIT Kanpur (one of the premier institutes of India), and I am currently taking a course, Statistical Methods for Business Analytics. Here is the link to the playlist (lecture slides in the description): kzread.info/head/PLEDn2e5B93CZL-T8Srj_wz_5FIjLMMoW- Just play any video in this and tell me: would you be willing to learn from these videos? The way of teaching is lagging far behind in our country.
Excellent.
Super cool stuff.
thanks, you're carrying my MSc
I can see a bright future for this channel. Good job man. Keep uploading ❤️. From United States Of India 🇮🇳😆
@Mutual_Information
2 years ago
Will do!
Subscribed! Want to learn this stuff but not sure where to start!
@Mutual_Information
2 years ago
Well I may be biased, but I think this channel is a fine place to start :)
Have you seen recent results in deep learning showing that larger neural networks have both lower bias and lower variance than smaller models? Past a point, more parameters give less variance, which is amazing! See "Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition", Adlam et al.
@Mutual_Information
A year ago
I hadn't seen this before but now that I've read some of it, it's quite an interesting idea. Maybe it explains some of the weird behavior observed in the Grokking paper? I still am mystified by how these deep NNs sometimes defy the typical U shape of test error.. wild! Thanks for sharing
Excellent video! One question I have: in practice, what is the relationship between EPE and the mean squared error (MSE) loss we usually optimize for regression problems? Is EPE an expected value of MSE? Or is MSE only related to the bias term in EPE? Or are they completely unrelated?
@Mutual_Information
3 years ago
Glad you enjoyed it! They are certainly related :) To make MSE and EPE comparable, the first thing we'd have to do is integrate EPE(x_0) over the domain of x, which we can call EPE, as you do. In that case, MSE is a biased estimate of EPE (to answer your question, it's an estimate of the whole of EPE, not any one of the terms). The MSE is going to be more optimistic/lower than EPE. This is because when fitting, you choose parameters to make MSE low; if you had many parameters, you could make MSE really low (overfitting!). But EPE measures how good your model is relative to p(x, y), and more parameters don't necessarily mean a better model! To get a better estimate, you could look at MSE out of sample. And that's what we do to determine those hypers.
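To make the reply's point concrete, here is a small hedged sketch (my own toy setup, not from the video): in-sample MSE is computed on the very points used for fitting, so for a flexible model it comes out optimistically low compared with an out-of-sample estimate of EPE drawn fresh from the same p(x, y).

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n, noise_sd=0.5):
    # Hypothetical p(x, y): a quadratic truth plus Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = x ** 2 + rng.normal(0, noise_sd, n)
    return x, y

# Fit a flexible (degree-9) polynomial on a small training set.
x_train, y_train = make_data(20)
coeffs = np.polyfit(x_train, y_train, 9)

# In-sample MSE: evaluated on the very points used for fitting.
mse_in = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)

# Out-of-sample MSE: fresh draws from the same p(x, y), a far better
# estimate of the (integrated) EPE than the optimistic training MSE.
x_test, y_test = make_data(10_000)
mse_out = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

print(f"in-sample MSE: {mse_in:.3f}, out-of-sample MSE: {mse_out:.3f}")
```

With 10 parameters fit to 20 points, the in-sample MSE drops well below the noise floor, while the out-of-sample MSE stays at or above it; that gap is exactly the optimism the reply describes.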
@bajdoub
3 years ago
@@Mutual_Information thanks so much for taking the time to reply! I will need some time, and probably another pass of the video and putting things on paper, before I digest it all :-D but you have given me all the elements of explanation. Keep up the good work; your videos are some of the best out there, you set the bar very high! :-)
@Mutual_Information
3 years ago
@@bajdoub thanks! It means a lot. I’ll try to keep the standard high :)
It feels like you're reading out of that textbook on the table behind you
@Mutual_Information
3 months ago
The whole channel started b/c I actually wanted to write a book on ML.. but then I figured few people would read it, so I might as well communicate the same content on a YT channel, where it had a better chance. Literally, I'd say "It's a textbook in video format". But then I realized it can make the videos very dense and a little dry. So I've evolved a bit since.
Subscribed. Would u mind sharing how to quickly make the visuals with the math equations? I'd love to use a similar resource for my students.
@Mutual_Information
2 years ago
Hey Jad. I have plans to open source my code for this, but it's not ready yet. I'll make an announcement when it's ready.
I love the channel. I have a few topic requests... KL Divergence. Diffusion Networks. Policy Gradient RL models.
@Mutual_Information
2 years ago
Policy Gradient RL methods will be out this summer! Diffusion.. that's a whole beast I don't have plans for right now. I'd need to learn quite a bit to get up to speed. KL Divergence, for sure I'll do that. Possibly later this year.
@Throwingness
2 years ago
@@Mutual_Information Diffusion. Did you see DALL-E 2? It's a milestone. I can't wait for the music and videos a system like this will create.
This video clearly deserves a lot more views than this. Keep up the good work.
@Mutual_Information
2 years ago
Thanks! Slowly things are improving. I think eventually more people will come to appreciate this one.
A masterpiece of yt
@Mutual_Information
A year ago
I'm glad you think so.. I was actually thinking about re-doing this one
So can I understand bias and variance in terms of a sampling distribution from which my specific model is taken? If the model is very complex, the mean of this sampling distribution will be quite close to the true value. But since the variance of this distribution is so large, it is unlikely that my specific model represents the true value (but not impossible?). And if the model is very low in complexity, the variance of the sampling distribution will be quite small. But since the expected value of the sampling distribution is far from the true value, it is very unlikely that my specific model represents the true value?
@Mutual_Information
A year ago
That sounds about right. Think of it this way: there is some true data generating mechanism that is unknown to your model. A complex model is more likely to be able to capture it. If you re-sample from the true data generating process, fit the model, and look at the average of those fits, then that average will match the true distribution's average. This is what I mean when I say "the complex model can 'capture' the true data generating mechanism", aka, the model is low bias. However, the cost of such flexibility is that the model produces very different ("high variance") fits over different re-samplings of the data. Does that make sense?
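A minimal sketch of this resampling picture (my own toy example; the sine truth and the polynomial models are assumptions, not from the video): refit a simple and a complex model on many fresh draws from the same data-generating process, then compare the average fit to the truth (bias) and how much the individual fits scatter around it (variance).

```python
import numpy as np

rng = np.random.default_rng(2)
x_grid = np.linspace(0, 1, 50)
truth = np.sin(2 * np.pi * x_grid)   # hypothetical true function

def mean_fit_and_spread(degree, n_datasets=300, n_points=40, noise_sd=0.3):
    """Refit on many resampled datasets; return the average fitted curve
    and the average pointwise variance of the individual fits."""
    fits = np.empty((n_datasets, x_grid.size))
    for i in range(n_datasets):
        x = rng.uniform(0, 1, n_points)
        y = np.sin(2 * np.pi * x) + rng.normal(0, noise_sd, n_points)
        fits[i] = np.polyval(np.polyfit(x, y, degree), x_grid)
    return fits.mean(axis=0), fits.var(axis=0).mean()

for degree in (1, 7):
    mean_fit, spread = mean_fit_and_spread(degree)
    gap = np.mean((mean_fit - truth) ** 2)   # distance of the average fit from truth
    print(f"degree {degree}: mean-fit gap = {gap:.4f}, fit-to-fit variance = {spread:.4f}")
```

The complex model's average fit hugs the truth (low bias) while its individual fits scatter widely (high variance); the simple model shows the opposite pattern, which is the trade-off described in the reply.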
Do you use the Manim Python Library for your animation?
@Mutual_Information
3 years ago
No, though I should explore that one day. I use a personal library that leans heavily on Altair, which is a Python static plotting library based on d3.
@melodyparker3485
3 years ago
@@Mutual_Information Cool!
Woo!
@Mutual_Information
3 years ago
Haha thank you sister
@AlisonStuff
3 years ago
@@Mutual_Information you're welcome, brother. How are you? How was your day?
😮😮😯❤️
Man, please more pictures..
Please provide subtitles for foreign language speakers!
@Mutual_Information
2 years ago
I have a list of outstanding changes I need to make and this is one of them. I’ll make it priority! Thanks for the feedback