Maximum Likelihood Estimation (MLE) | Score equation | Information | Invariance

For all videos see www.zstatistics.com/
0:00 Introduction
2:50 Definition of MLE
4:59 EXAMPLE 1 (visually identifying MLE from Log-likelihood plot)
10:47 Score equation
12:15 Information
14:31 EXAMPLE 1 calculations (finding the MLE and creating a confidence interval)
19:21 Properties of MLE
23:53 Invariance and MLE parameter transformations
27:42 Multiple parameters
30:58 EXAMPLE 2 (finding the MLE for two parameters)
40:37 Other estimation methods

Comments: 69

  • @GuppyPal · a year ago

    I am stunned. This video is about a 1000X clearer than the explanation my professor gave on all this. You are SO clear. It's a life-saver! Thank you!

  • @davidbanahene307 · 4 years ago

    You don't know the number of people you are helping every now and then. Kudos! I do appreciate your great effort to help and, in a way, contribute to our success. #GODBLESSYOU

  • @Elizabeth_Lynch · 5 years ago

    Thank you, so helpful. I appreciate that you touched on MLE with multiple parameters.

  • @markwilson9490 · 3 years ago

    The explanation of the MLE, Score function & Information etc.. here, is unbelievably simple and effective! This alternative perspective really helped my understanding. Thank you.

  • @BabakFiFoo · 4 years ago

    Thank you for this amazing video! It is very informative, and it could be even better if, whenever you use a vector of parameters as "X", you wrote it as bold "X". The notation would then be less confusing.

  • @lucaslopesf · 5 years ago

    You saved my life! Thank you SO much!

  • @jlz5907 · 5 years ago

    Thank you SO much! This really helped me a lot

  • @davidradulovic9034 · 4 years ago

    The progression graph at the beginning of each video might seem like a minor aspect of the whole video to some people, but it's very significant for me. It lets me know what to expect, and that feels good. :)

  • @michaelbaudin · 3 years ago

    Thank you very much for sharing this. There is a possible confusion at 33:03. The equation shows the likelihood depending on (mu, sigma^2) but the plot shows it depending on (mu, sigma) i.e. without square. This is not an error, because the maximum likelihood estimator is for the (mu, sigma^2) vector as well as for (mu, sigma). It does not change much of the graphical meaning of the figure, but introduces a confusion on the intent of this figure. I guess that a clarification might be helpful on this topic. Anyway, your video was very helpful: thanks again for it.

  • @kjyfhjjj · 3 years ago

    Thank you so much! This is so helpful! Can you please make more videos with more proofs and algebra? For example, the proof that the MLE is asymptotically normal, the calculation of the variance estimate, etc.?

  • @cdr.dr.shishirsahay9184 · a year ago

    Very nicely explained. A BIGGG GOD BLESS to you!

  • @srishtigupta9534 · 3 years ago

    Thank you, it was very helpful.

  • @craighennessy3183 · a year ago

    Why can't my textbooks explain it like this? Zed, you are a legend!

  • @abcpsc · 4 years ago

    Thanks for the video. How about the confidence interval in your multivariable example?

  • @ciaranmahon7415 · 3 years ago

    I would be so fucked in my Math Stats class rn without these videos. Thank u

  • @fightwithbiomechanix · 4 years ago

    I'm an engineer in the manufacturing sector. Your videos have been essential in understanding the statistics I use to justify process-improvement designed experiments.

  • @user-dy2sn1ny5t · 3 years ago

    Awesome video. Much better than the disorganized lecture by my prof lol.

  • @Maymona93 · 3 years ago

    Thank you, could you please share the sources that you mentioned could help with calculus & differentiation?

  • @ProfessionalTycoons · 5 years ago

    great video mate.

  • @harikrishnareddygali6244 · a year ago

    You have put a great deal of work into explaining that. Thank you very much.

  • @erich_l4644 · 4 years ago

    42 minutes? yuck, no thanks. Oh wait, he said Saddle Up. I'm IN! LETS GO

  • @mikelmendibeabarrategi1102 · a year ago

    You are crazy good at this

  • @kprao9949 · 4 years ago

    superb lecture

  • @sherlocksilver9392 · 3 years ago

    Does anyone know why, in a score test, we divide by the information at the null parameter values? I know that the information at the MLE represents the "sharpness" of the likelihood function, but what does the information represent at a different parameter value that is not the maximum of the likelihood function?

  • @wtsg1982 · a year ago

    This helped me understand how the likelihood is used to estimate a model, where the maximum is obtained from the score equation. But I might need your help to understand, at ~15:40, how setting the derivative to 0 is transformed.

  • @yelshadaygebreselassie3163 · 3 years ago

    I love your videos. You explain the concepts so clearly. I have one question. In the first example, why would the probability of getting pregnant on the second attempt depend on the first event? Aren't the different attempts independent? Shouldn't the probability of getting pregnant be 0.15 for all individual attempts?

  • @coolblue5929 · 2 years ago

    This part I think I can answer. The probability of getting pregnant on the second attempt must exclude the probability of success on the first attempt, so success on the 2nd attempt means failure on the 1st AND success on the 2nd. Prob of success on 1 = 0.15, so prob of failure on 1 = 1 - 0.15 = 0.85. Therefore, prob of failure on 1st AND success on 2nd = 0.85 * 0.15.
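
    The arithmetic in this reply can be verified with a short sketch (assuming independent attempts with the example's per-attempt success probability p = 0.15, i.e. a geometric model):

    ```python
    # P(first success on attempt k) for independent attempts with success
    # probability p, i.e. the geometric distribution: (1 - p)^(k - 1) * p.
    def prob_first_success_on(k, p):
        return (1 - p) ** (k - 1) * p

    p = 0.15  # per-attempt success probability from the example
    print(prob_first_success_on(1, p))  # success on the 1st attempt: 0.15
    print(prob_first_success_on(2, p))  # fail 1st, succeed 2nd: 0.85 * 0.15 = 0.1275
    ```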

  • @adamkolany1668 · a year ago

    @18:26 So you postulate that θ is normally distributed, with mean obtained from the MLE and variance 1/I(θ)?

  • @mohdirfan-pu8fc · 2 years ago

    Nice lecture, sir. Sir, kindly make a video on MLE for multiple parameters in implicit form, with R code.

  • @ouafaeouaali4676 · a year ago

    Thanks for the course, it's clearly explained... May I know what software or application you use for the course (Beamer? PowerPoint?)

  • @lucarampoldi7743 · a year ago

    Really well done - the examples following the theoretical discussion are especially useful. Thank you so much for uploading this!

  • @BilalTaskin-om6il · 10 months ago

    Life saver...❤

  • @angelzash4u2 · 3 years ago

    Hi, can I get your assistance in solving a problem using the maximum likelihood method?

  • @backerlifan · 3 years ago

    I once heard OLS and MLE yield the same result under a normal distribution. If that's the case, the pros and cons (especially the pros) just seem negligible, don't they?

  • @enjoying-the-ride1295 · 2 years ago

    I'm learning tons from your content, Zed, thank you. Can anyone tell me, at 36:06, why is mu not negative? The log-likelihood function (after removing the constant and the component with log sigma squared) starts with a negative, so shouldn't it be negative?

  • @steffenmuhle6517 · 2 years ago

    If x = 0 then -x = 0 as well. That mu at 36:06 comes from setting the numerator to zero.

  • @joeekstein9174 · a year ago

    Thanks!

  • @arpitanand4693 · a year ago

    Hi, could anyone help me with reading the notation L(theta; y) in the context of the pregnancy example he gave in the video?

  • @whetstoneguy6717 · 3 years ago

    Mr. Justin Z, it would have been helpful if you had gone over the intermediary math steps. Thank you. WhetstoneGuy

  • @youssefdirani · 2 years ago

    17:06 Where did this expectation formula come from?

  • @ruchikalalit1304 · 3 months ago

    Which book is being referred to in this series, or is there any other book for this topic? Anyone who knows, please tell.

  • @Kogsworth · 2 years ago

    If I graph the likelihood function at 10:28, it doesn't look anything like the graph in the video. I get really small values for 0.2 rather than really large ones.

  • @jimjohnson357 · 2 years ago

    18:48 you say that the square root of the variance is the standard error (which is then used to find upper and lower limits of confidence interval). I thought the square root of variance is the standard deviation? And therefore, you would need an extra 1/sqrt(n) factor to take the standard deviation to the standard error which can then be used to find the limits? Why in this case is the square root of the variance = standard error and not standard deviation?
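
    A numeric sketch of why no extra 1/sqrt(n) appears (this assumes Example 1's log-likelihood has the form l(theta) = n·log(theta) + (y - n)·log(1 - theta) with n = 20 successes in y = 100 trials; the information below is computed from the whole sample, so its inverse is already the variance of the estimator):

    ```python
    import math

    # Information for l(theta) = n*log(theta) + (y - n)*log(1 - theta)
    # (assumed form of Example 1): I(theta) = n/theta^2 + (y - n)/(1 - theta)^2.
    # I already grows with the sample size, so 1/sqrt(I) is the standard error
    # of theta-hat directly; no further 1/sqrt(n) factor is needed.
    def information(theta, n, y):
        return n / theta**2 + (y - n) / (1 - theta) ** 2

    n, y = 20, 100
    theta_hat = n / y                                  # MLE: 0.2
    se = 1 / math.sqrt(information(theta_hat, n, y))   # 1/sqrt(625) = 0.04
    print(theta_hat - 1.96 * se, theta_hat + 1.96 * se)  # 95% CI around 0.2
    ```

    Doubling the data (n = 40, y = 200) doubles the information and shrinks the interval by sqrt(2), which is exactly the 1/sqrt(n) behaviour the question is about.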

  • @sdsa007 · a year ago

    I'm going over my notes... and this tutorial is very clear and I enjoy verifying the math... but I got stuck at around 15:24 trying to understand the estimator mathematically... intuitively it totally makes sense that the estimate should be 20/100, but I am not understanding how it comes from the derivative of l(theta)... when I isolate for theta I get theta/(1-theta) on one side... but that is not the same as reducing to a single theta variable...

  • @sdsa007 · a year ago

    Finally got the math right, even though I couldn't isolate theta as a single variable! I got down to n/(y-n) = theta/(1-theta). Substituting, I get 20/(100-20) = theta/(1-theta). Dividing the left side by 100 (top and bottom), I get 0.20/(1-0.20) = theta/(1-theta)... therefore, by visual analogy, theta is 0.20 (the estimate). You can reduce to a single variable by cross-multiplying the denominators, expanding, and reducing, but it's a lot of tedious work... 0.20(1-theta) = theta(1-0.20)... blah blah blah...
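
    The fixed point in this reply can also be checked numerically; a sketch (again assuming the Example 1 log-likelihood l(theta) = n·log(theta) + (y - n)·log(1 - theta), whose derivative is the score below):

    ```python
    # Score (first derivative of the assumed log-likelihood
    # l(theta) = n*log(theta) + (y - n)*log(1 - theta)).
    # Setting it to zero: n/theta = (y - n)/(1 - theta)
    #   =>  n(1 - theta) = (y - n)*theta  =>  theta = n/y.
    def score(theta, n=20, y=100):
        return n / theta - (y - n) / (1 - theta)

    print(score(20 / 100))         # ~0 at theta-hat = n/y = 0.2
    print(score(0.1), score(0.3))  # positive below, negative above: a maximum
    ```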

  • @DJMoSheckles · 8 months ago

    Hi, this video is incredible, as are all of yours, but I'm very confused why the second derivative at 16:39 has both values negative. I've taken it multiple ways and plugged it into Wolfram Alpha, and I get (y-n)/(1-theta)^2 - n/theta^2.

  • @aschiffer · a month ago

    This seems right to me too; the derivative of (y-n)/(1-theta) swapped signs on the first derivative, and there's no reason it wouldn't swap back on the second. You still have to chain-rule d(theta), which is -1, right?
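
    A finite-difference check settles the sign (a sketch assuming l(theta) = n·log(theta) + (y - n)·log(1 - theta) from Example 1): differentiating -(y - n)/(1 - theta), the minus from the power rule and the minus from the chain rule on (1 - theta) cancel, so the leading minus sign survives and both terms of l''(theta) = -n/theta^2 - (y - n)/(1 - theta)^2 stay negative.

    ```python
    import math

    # Numerically check the second derivative of
    # l(theta) = n*log(theta) + (y - n)*log(1 - theta)   (assumed Example 1 form)
    # against the closed form -n/theta^2 - (y - n)/(1 - theta)^2.
    def loglik(theta, n=20, y=100):
        return n * math.log(theta) + (y - n) * math.log(1 - theta)

    def second_derivative(theta, h=1e-5):
        # central finite difference
        return (loglik(theta + h) - 2 * loglik(theta) + loglik(theta - h)) / h**2

    theta = 0.2
    closed_form = -20 / theta**2 - 80 / (1 - theta) ** 2   # -500 - 125 = -625
    print(second_derivative(theta), closed_form)  # both approximately -625
    ```

    The (y-n)/(1-theta)^2 - n/theta^2 version would give about -375 here instead, which the finite difference rules out.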

  • @mightbin · 2 years ago

    convinced again

  • @anindadatta164 · 2 years ago

    Var(RV) = E(RV^2) - (mean of RV)^2 is an easy method. So where is the need to do partial differentiation for two simultaneous equations and set them to zero, when effectively the same result for the variance is thrown up?

  • @Catwomen4512 · 5 years ago

    I don't understand why E(Y) is equal to n/theta.

  • @k.sladkina872 · 5 years ago

    I have the same problem

  • @Catwomen4512 · 5 years ago

    @@k.sladkina872 I found out it is simply related to the distribution you use. Google different distributions (normal, binomial, etc.), and if you look at the Wikipedia page, on the right it states what the mean E(X) and variance V(X) are equal to.
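
    A simulation makes this reply concrete (a sketch; it assumes Y is the total number of attempts needed to reach n successes, i.e. the negative-binomial setup of the example, for which E(Y) = n/theta):

    ```python
    import random

    # Y = number of Bernoulli(theta) attempts until the n-th success
    # (negative binomial counting total trials); its mean is n/theta.
    def draw_y(n, theta, rng):
        attempts = successes = 0
        while successes < n:
            attempts += 1
            if rng.random() < theta:
                successes += 1
        return attempts

    rng = random.Random(0)  # fixed seed for reproducibility
    n, theta = 20, 0.2
    mean_y = sum(draw_y(n, theta, rng) for _ in range(20000)) / 20000
    print(mean_y)  # close to n/theta = 100
    ```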

  • @adamkolany1668 · a year ago

    @13:45 In order to speak about the "expected" value you MUST have a random variable. Where is it?? @13:59 WHY??

  • @Pier_Py · 2 years ago

    You are so f good

  • @minma02262 · 3 years ago

    If there is a god, I want it to be you.

  • @joeyquiet4020 · 8 months ago

    best best best

  • @snackbob100 · 4 years ago

    Why can't uni lectures be like this? I pay so much money for an inferior education.

  • @coolblue5929 · 2 years ago

    Where is the sample data, though?? Aren't we supposed to be fitting the distribution to a sample? Isn't that the whole point? Why do you just say, oh, 15%??

  • @johnmook135 · 4 years ago

    Why does this stuff matter? I'm taking math stats for the second time and I understand zero. I can do the basic stuff described in the videos, but the problems are never just multiply all the pdfs together, take the log, differentiate, and then set to zero... There are always wrinkles. Like one problem where I have to deal with an absolute value, and they start talking about the median in the solution... Aye-yi-yi. I dislike math stats and really want to know how this will help me predict stocks or in any future job.

  • @zedstatistics · 4 years ago

    Pretty sure it's what God created on the 3rd day. He created the heaven and earth, the land and the waters, and then differential calculus.

  • @johnmook135 · 4 years ago

    @@zedstatistics The calculus isn't that bad. I love it. Although I question it. It's a language to explain something, something very complex. Seems like there could be flaws. But these things work time and time again? Crazy. More particularly, I just don't know how all this MLE and Bayes' theorem, sufficient statistics, data reduction, and improving an estimator relates to real-life problems. I'm a data science major. I like sentdex's videos on YouTube. All these advanced stats classes I am taking just don't make sense. Or at least, reading from the book, my teachers just don't relate it to the real world, and it doesn't make sense. Any suggestions, tips, or playlists you could point me to that would help my statistical data science career and understanding? I like math, I like stocks. Not sure how to combine them outside sentdex's videos.

  • @johnmook135 · 4 years ago

    Any playlist that would help me solve problems like this: Suppose that 21 observations are taken at random from an exponential distribution for which the mean μ is unknown (μ > 0), the average of 20 of these observations is 6, and although the exact value of the other observation could not be determined, it was known to be greater than 15. Determine the M.L.E. of μ. My book is Probability and Statistics, 4th edition, by DeGroot; there is a free PDF available online.

  • @lzl4226 · 4 years ago

    On the subject of predicting stocks: I guess you want to build a robot that takes today's stock market data and spits out a distribution of actions you can take that would make you the most money. Let's call this robot π(θ), because it's just a function parameterised by θ. And you want the maximum likelihood of θ that will make you the most money (let's call that Q*, where Q(a|s) is the reward of taking action a at step s). Since you're a data major, you can probably see where this is going. You want a neural net that models π(θ), and you want to train it to solve for -Δlog(π(θ))Q (notice the score function here), where Q is the reward of your trading actions (in practice simulated by another neural net). Notice you want to find the set of θ for π(θ) that maximises Q (Q*), using maximum likelihood and past stock data, possibly flattened by some RNN. Furthermore, you want to incrementally improve π within a confidence interval, so you don't take too big a step and collapse your convergence... and you'll see the Fisher information matrix come up in this calculation if you dig further. So yeah, it probably helps in your future job in stock market prediction, if that's where you're headed.

  • @coolblue5929 · 2 years ago

    @@lzl4226 except, stock prices are not produced by a stationary process.

  • @jhanvitanna4541 · a year ago

    Your content is amazing, but the sound quality is really bad.

  • @CaptZdq1 · 4 years ago

    Fellow 'nerds'?! That's very abusive. You should be imprisoned for that.

  • @zedstatistics · 4 years ago

    perhaps fellow 'sailors' aye captain?