Maximum Likelihood Estimation - THINK PROBABILITY FIRST!

Science & Technology

In this tutorial, we will see why it is important to take a probability-first view when modeling. This view gives us predictive distributions for our target variables instead of focusing on point estimates.
We will also see that linear regression is a special case of maximum likelihood estimation when Gaussian noise is assumed. The sum-of-squared-errors formulation from part 1 also arises naturally in this framework; in other words, it no longer needs separate justification.
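
As a preview, here is a compact sketch of that connection, assuming the standard Gaussian noise model (Bishop-style notation: y(x_n, w) is the model's prediction, beta the noise precision):

```latex
% Noise model: t_n = y(x_n, \mathbf{w}) + \epsilon_n, \quad \epsilon_n \sim \mathcal{N}(0, \beta^{-1})
% Log-likelihood of the targets:
\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
  = -\frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2
    + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi)
% Maximizing over \mathbf{w} is therefore the same as minimizing the
% sum-of-squares error \frac{1}{2} \sum_n \{ y(x_n, \mathbf{w}) - t_n \}^2.
```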

Comments: 42

  • @zgbjnnw9306 · 2 years ago

    I learned both point estimates and MLE in my statistics class, but I had no idea they were related! Thanks for this inspiring video!

  • @KapilSachdeva · 2 years ago

    🙏

  • @GoingData · 5 months ago

    I have been studying stats and maths for years... this guy is INCREDIBLE! It's like Khan Academy... I can understand... this is a real blessing to my thirst for knowledge. Thank you!!!

  • @KapilSachdeva · 5 months ago

    🙏

  • @wish.e.9925 · 1 year ago

    I have just watched two videos and want to say that the logic you use to explain things is the best approach for delivering such knowledge. Thank you so much.

  • @KapilSachdeva · 1 year ago

    🙏

  • @user-tu5kn6ck2y · 4 months ago

    Kapil, I will be forever grateful to you for this lecture series.

  • @KapilSachdeva · 2 months ago

    🙏

  • @ArashSadr · 2 years ago

    I am just falling in love with the way you break down the problem and go after each part!!

  • @KapilSachdeva · 2 years ago

    🙏

  • @shchen16 · 1 year ago

    I've just watched two of your videos and honestly they have saved my life. Thanks for such great videos!

  • @KapilSachdeva · 1 year ago

    🙏

  • @bikinibottom2100 · 1 year ago

    I'm going to watch every video on this channel. I can't wait to see the next ones; I especially love the ones about sampling!

  • @KapilSachdeva · 1 year ago

    🙏

  • @a_j_lifestyle6515 · 4 months ago

    Brilliantly done. Connecting the dots and democratizing ML math.

  • @KapilSachdeva · 4 months ago

    🙏

  • @spyrosp.551 · 5 months ago

    Great work sir, thank you.

  • @KapilSachdeva · 5 months ago

    🙏

  • @mahlatseseneke5894 · 2 years ago

    You are heaven-sent... May the Most High turn everything you touch into gold... You deserve a Nobel Prize :,)

  • @KapilSachdeva · 2 years ago

    🙏 Not sure if I deserve this much appreciation, but many thanks for your kindness and best wishes.

  • @mahlatseseneke5894 · 2 years ago

    @KapilSachdeva Will you be doing any more videos for this book?

  • @KapilSachdeva · 2 years ago

    Yes.

  • @rajibkhan9749 · 2 years ago

    Thanks @Kapil Sachdeva. It's helping a lot.

  • @KapilSachdeva · 2 years ago

    🙏

  • @danmathewsrobin5991 · 3 years ago

    Great series. Waiting for more of it :) Do you solely depend on the mentioned textbook content for creating the video? Could you mention some other similar resources? I wish somebody would make a series like this around Machine Learning: A Probabilistic Perspective by Kevin Murphy :D. Kudos to the good work around Bishop's book anyway. And may I know how you make these kinds of animations in your videos? Thank you, Kapil. ~dan

  • @KapilSachdeva · 3 years ago

    Thanks Dan for the appreciation & interest in this series.

    > Do you solely depend on the mentioned textbook content for creating the video?

    Not at all, except for a few videos that will be part of this series, as I am planning to use the examples from Dr. Bishop's book and hence am providing the deserved citation & credit. Also, I believe it would be helpful to read the book after watching the videos in this series, or vice versa :). I am aiming to publish a few more videos in the series this week.

    > Could you mention some other similar resources?

    My process of learning is quite ad hoc, but if I were to recommend a book on this topic it would be Statistical Rethinking (xcelab.net/rm/statistical-rethinking/) by Dr. McElreath. There are various ports of this book using various toolkits, including one done by me using TensorFlow Probability. IMHO, it would be better to read this book before PRML or Kevin Murphy's book. In terms of resources, read various papers, theses, and books, as it is important to get different perspectives and explanations of the same subject. More importantly, always seek the "why"... question & challenge everything... implement the concepts from scratch and it will do wonders.

    > And may I know how you make these kinds of animations in your videos?

    I use PowerPoint for these animations. I am also quite familiar with manim (by 3blue1brown) now, but have not used it to create any video so far. I keep coming back to PowerPoint; so far it has worked quite well. I use the "morph" transition quite a lot to animate things.

    Hope this answers some of your questions, if not all.

  • @ztoza7158 · 3 years ago

    Awesome! I will need some examples to understand this better, but it's a great start for me. Thanks for putting this together.

  • @KapilSachdeva · 3 years ago

    🙏 The example is the same as that of part 1. If you use the notebook from part 1, all you need to do is compute the precision using the analytical expression shown in this tutorial.
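
For what it's worth, a minimal sketch of that computation in Python (the data, polynomial degree, and names are illustrative stand-ins, not the actual part 1 notebook):

```python
import numpy as np

# Illustrative stand-ins for the part 1 data and fitted polynomial weights.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)
w = np.polyfit(x, t, deg=3)  # maximum likelihood w (identical to least squares)

# Analytical MLE for the noise precision:
#   1 / beta_ML = (1/N) * sum_n (y(x_n, w) - t_n)^2
residuals = np.polyval(w, x) - t
beta_ml = 1.0 / np.mean(residuals**2)
print(f"estimated precision beta_ML = {beta_ml:.2f}")
```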

  • @bodwiser100 · 6 months ago

    Wonderful video! One question: we did not make use anywhere of the fact that the function f is a polynomial in x (unless I am mistaken, embarrassingly). So it means that we can have any generic set of features on the right-hand side instead of only powers of x. Then what was the reason for assuming that f is a polynomial in x?

  • @sanjaythorat · 1 year ago

    Excellent series. It is helping me understand Bayesian stats/regression better. Thanks for the series. Keep doing more. I have a few questions, though:
    - At around 14:00, you are multiplying normal distributions together. Shouldn't it be a multiplication of the probabilities of the target values given the normal distribution?
    - 20:45 - Wouldn't MSE help us derive uncertainty?

  • @KapilSachdeva · 1 year ago

    - In any case, we have assumed the variables to be Gaussian in this example, so the resulting variable after multiplication will be Gaussian.
    - The tutorial showed the relation between MSE and MLE, but by itself it is not quantifying the uncertainty. Maybe I have misunderstood your question? Follow up if my answer is not clear.

  • @sanjaythorat · 1 year ago

    @KapilSachdeva Let me rephrase my questions:
    - At around 14:00, you are multiplying normal distributions together. But I think you should be multiplying the probabilities of the target variables given the parameters of the normal distribution, shouldn't you? It's probably just the notation that I do not understand completely.
    - 20:45 - I think we should be able to compute the uncertainty/variance of the target variables using the mean squared error computed with linear regression; we probably do not need Bayesian regression for that, as you claimed.

  • @KapilSachdeva · 1 year ago

    - It's the notation and inconsistent vocabulary. It's the distribution functions that are getting multiplied (note: in reality you would take the log and add them, not multiply).
    - You could do that, but thinking in probability first is a more natural approach to obtaining that information.
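
A tiny numerical sketch of the log-trick mentioned above (values are illustrative; assumes scipy is available):

```python
import numpy as np
from scipy.stats import norm

# Illustrative predictions y_n and targets t_n under a Gaussian noise model
# with precision beta = 25, i.e. standard deviation sigma = 1/sqrt(beta).
y = np.array([0.1, 0.5, 0.9])
t = np.array([0.2, 0.4, 1.0])
sigma = 1.0 / np.sqrt(25.0)

# Multiplying density values directly underflows for large N ...
likelihood = np.prod(norm.pdf(t, loc=y, scale=sigma))

# ... so in practice you take logs and add; the maximizer is unchanged.
log_likelihood = np.sum(norm.logpdf(t, loc=y, scale=sigma))

print(likelihood, np.exp(log_likelihood))  # equal up to floating-point error
```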

  • @Daydream_Dynamo · 1 month ago

    One stupid question here: why were we interested in finding the maximum joint probability in the first place? Was there any other way to find w and beta?

  • @tobe7602 · 2 years ago

    Thanks for this great video. Can you explain why the probability of a point is not zero in the formula p(t0)·p(t1)...p(tn)? I always learned that p(x) is 0 for a pdf. Shouldn't we rather speak of likelihoods, l(t0)·l(t1)...l(tn), which are associated with N(t0, sigma), N(t1, sigma), ..., N(tn, sigma)? You are a very good pedagogue. Thanks for your time. Antonio

  • @KapilSachdeva · 2 years ago

    🙏 It is the likelihood. Unfortunately, the symbol used is still p :(
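
In symbols (assuming the Gaussian model and Bishop-style notation used in the series), each factor is a density value evaluated at t_n, not a point probability:

```latex
L(\mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}\!\left(t_n \mid y(x_n, \mathbf{w}),\, \beta^{-1}\right)
```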

  • @ssshukla26 · 2 years ago

    One question, sir: for the minimization part, is it the derivative of the NLL w.r.t. w and beta, and then equating them (both equations) to 0?

  • @KapilSachdeva · 2 years ago

    That is correct. Here is one good reference; it uses variance instead of precision, but you will see the full derivation: www.cs.princeton.edu/courses/archive/fall18/cos324/files/mle-regression.pdf
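
For completeness, a sketch of those two stationarity conditions under the Gaussian model (Bishop-style notation with precision beta; the reference above derives the same result using variance):

```latex
% Negative log-likelihood, dropping additive constants:
E(\mathbf{w}, \beta) = \frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 - \frac{N}{2} \ln \beta
% \partial E / \partial \mathbf{w} = 0 yields the least-squares solution \mathbf{w}_{ML};
% \partial E / \partial \beta = 0 then yields
\frac{1}{\beta_{ML}} = \frac{1}{N} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}_{ML}) - t_n \}^2
```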

  • @ericzhang4486 · 2 years ago

    In this baby-step case, since each target has the same sigma/precision, does it mean that the precision of the model's predictions will always be the same, namely beta? In practical problems, we will have different betas for each target, right?

  • @KapilSachdeva · 2 years ago

    Yes, it would be better to get the sigma for each target variable.

  • @Shashank_Shahi1989 · 2 years ago

    Where is part 1 of this video? Link, please. Which playlist on your channel should be watched first? Is there any chronological order, or are they just random videos on ML topics?

  • @KapilSachdeva · 2 years ago

    Here is the link to the playlist; you should be able to see it on the channel page as well: kzread.info/head/PLivJwLo9VCUISiuiRsbm5xalMbIwOHOOn
