Phil Chan
11 years ago
83,535
1

Difference between the error term and the residual in regression models

Errors and residuals are not the same thing in regression. The confusion between them is not surprising, given the way textbooks seem to use the words interchangeably. Let me introduce you, then, to residuals and the error term.

Comments: 72

@piggystories2272 3 years ago
Wow, 8 years have passed but this video still is the best/simplest explanation on KZread. Cheers mate! Thanks!
@victorystocktv 4 years ago
0:20 residual term, error term(=disturbance term) 4:29 Y = value predicted by line + error (disturbance) 6:14 residual thank you
@siekimting4148 3 years ago
Finally someone who completely solved my confusions! Thanks!
@ShameenFerdinando 3 years ago
Learning something is a skill. Few people have it. They are called good learners. Teaching something is a skill. An absolutely minuscule number of people can do it right. And you, my good sir, are an absolute legend of a teacher.
@mrchristall 8 years ago
phil you my personal hero...please never stop making these kinds of videos
@ambroseezzat2703 5 years ago
Thank you very much. It was helpful to review this basic idea as I started mixing them up for some reason. You talk very slowly, but I put the video on 2x the speed and your explanation was straight and to the point. Thanks!
@ednaT1991 5 years ago
So an error is the difference between a sample and the ground truth model, whereas the residual is the difference between a sample and a model we estimated.
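The distinction in the comment above can be checked with a small simulation. This is only a sketch under stated assumptions: the "true" intercept and slope (`BETA0`, `BETA1`), the noise level, and the helper `fit_ols` are invented for illustration. With real data the true line is never available, so the errors could not be computed like this.

```python
import random

random.seed(0)

# Assumed "true" parameters (hypothetical; unknown in any real data set).
BETA0, BETA1 = 2.0, 0.5


def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


n = 50
x = [random.uniform(0, 10) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]            # error (disturbance) term
y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]

b0, b1 = fit_ols(x, y)                                # estimated line

# Error: observation minus the TRUE line (only computable in a simulation).
errors = [yi - (BETA0 + BETA1 * xi) for xi, yi in zip(x, y)]
# Residual: observation minus the ESTIMATED line (always computable).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse_true = sum(e ** 2 for e in errors)
ssr_fit = sum(r ** 2 for r in residuals)

print("estimated intercept and slope:", b0, b1)
print("sum of residuals:", sum(residuals))
print("sum of squared errors vs residuals:", sse_true, ssr_fit)
```

Because least squares picks the line with the smallest sum of squared residuals, that sum can never exceed the sum of squared errors around the true line.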
@divye.ruhela 2 years ago
I had this doubt for like about a century now!! Thanks for finally resolving it. 😅
@shamuom 7 years ago
Thank you so much Phil Chan... Your way of explanation is so good
@kotsioscoolboy 3 years ago
Mate you are an ABSOLUTE LEGEND!! Thanks for this explanation.
@askarasanov2663 7 years ago
Perfect! Thank you Phil!!!!
@Foblah 9 years ago
Very clear and highly entertaining.
@krasendimov2464 11 years ago
My econ textbook constantly reiterates that errors and residuals are not the same, but at the same time does not give any intuition behind it. Thank you for the video and keep up the good work!
@alfredalademomi7564 8 years ago
This is fantastic. The video is very helpful
@Hiwaradan 1 year ago
Perfect man. Crystal clear.
@markovnikovchung9152 3 years ago
Thanks a lot! A very clear explanation!
@phuonglinh2728 2 years ago
thank you so much for your sharing. You made my study easier.
@pradeepc1 6 years ago
Really Nice.. Thanks for uploading.
@daughterofunicorns3873 2 years ago
really nicely and simply explained :)
@md.tariqujjaman 1 year ago
Great. Thanks a lot for such a nice video.
@wobblebass4562 2 years ago
phenomenal explanation
@maxpercer7119 4 years ago
Please elaborate on this. We never have the 'true' line because typically we don't have the entire population set.
@hippolyte2175
2 years ago
thanks this comment helped me a lot
@damirb6294
2 years ago
Thanks for this comment. That is something that is missing in the video. The rest is just "simple" math.
@mohammadehsansadiq5814 4 years ago
Thanks, nice explanation!
@svea5530 6 years ago
I now hopefully understand the difference between residual and error term! But my new problem has to do with the Gauss-Markov assumptions. One assumption says "the expected value of the error term is zero", but how do you check this when the error terms are unobserved? I'm so confused...
@PhilChanstats
6 years ago
This condition is more technical and depends on an assumption about X. In observational data the Xs, like the Ys, are random, so the zero-mean error assumption is E(u|X)=0. It implies zero correlation between the error and the Xs. If you plot the residuals against the fitted Ys and see a pattern in the scatterplot, such as a linear pattern, that points to this assumption not being met.
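The residual check described in the reply above can be sketched in a few lines. Everything here is an assumption made for illustration: the curved data-generating process, the sample size, and the helper `fit_ols`. A straight line is fitted on purpose to data that are not linear, so the zero-mean condition fails for the fitted model. Note that OLS residuals from a model with an intercept are uncorrelated with the fitted values by construction, so the pattern that shows up in this sketch is a curve rather than a straight line.

```python
import random

random.seed(1)


def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: the true relationship is curved (it contains x squared).
n = 200
x = [random.uniform(-3, 3) for _ in range(n)]
y = [1.0 + 0.5 * xi + xi ** 2 + random.gauss(0, 0.5) for xi in x]

# Misspecified model: a straight line, so x squared ends up in the error term.
b0, b1 = fit_ols(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Compare average residuals in the middle of the x range and at its edges.
inner = [r for xi, r in zip(x, residuals) if abs(xi) < 1]
outer = [r for xi, r in zip(x, residuals) if abs(xi) > 2]
mean_inner = sum(inner) / len(inner)
mean_outer = sum(outer) / len(outer)

print("mean residual, |x| < 1:", mean_inner)
print("mean residual, |x| > 2:", mean_outer)
```

Residuals that are systematically negative in the middle and positive at the edges are the kind of pattern a residual plot would reveal.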
@horaciosalgado 10 years ago
great work mate!!!
@tilarmeister 4 years ago
so is true line Y and estimated Line Y hat?
@harold5534 5 years ago
Thanks! well explained
@yasmineshawn1933 6 years ago
SO useful Thanks
@francis_1_1 1 year ago
Thank you Sir!
@mm22sapphire50 7 years ago
I have a question. I'm doing an OLS analysis on trade policy and I get confused when it comes to including the disturbance term. I know it has a mean of 0 and refers to things outside our measurement abilities that can affect the dependent variable. But do I just include it without giving it any value? Do I just write an epsilon at the end and that's all? I hope you understand what I mean, because it seems there must be a numerical value or something.
@PhilChanstats
7 years ago
simple answer, "yes"
@mm22sapphire50
7 years ago
OK, I guess you're sure of that, right?
@angelaynn 7 years ago
amazing video thanks
@anarki777 10 years ago
So we don't know the true line. Is that because the true line is the real relationship between Y and X and we can only estimate the relationship? I was a bit confused about that part - just need clarification.
@DMaTTh32
9 years ago
anarki777 Yes, that's what he's saying.
@fffffmenicafffff 7 years ago
great thank you very much !! :)
@RashidAli-mc9cz 5 years ago
very useful sir
@oscarwilde399 8 years ago
What software did you use to make this video? It is excellent!
@PhilChanstats
8 years ago
+Oscar Wilde Powerpoint. Took ages.
@user-fu7ps1ei7s 9 years ago
great video
@wedsadun341 7 years ago
Thanks a lot .
@cucuholmes 11 years ago
nice video. thanks
@SamarATaha-nl2yn 10 years ago
What can I do to avoid the error (disturbance) or reduce it?
@DMaTTh32
9 years ago
Samar A.Taha One of the standard assumptions is that the error term has an expected value of zero, so in theory it averages out to 0.
@engelecalex 10 years ago
lovely!
@ShanakaChathurangarocky 10 years ago
Wow, this is awesome, this video solved all my problems. But I learned that errors are not associated with the estimated regression line. Can you tell me what that means?
@masaru444 4 years ago
So can we say that u is the equivalent of û (or the other way round)? In my college u is defined as "any other factor that influences y except for x", which always sounds so abstract. I would like to see u on a graph or with numbers, but when I watch your video it seems to be like û: the difference between the graph and the observations. Am I right?
@PhilChanstats
4 years ago
The error term is not the same as the residual. The error term is not observable (it cannot be computed); the residual can be computed. Yes, you can view the error term as containing all other relevant X variables plus noise. This noise can come from different sources depending on where your data comes from. It could be due in part to measurement errors, or just natural randomness.
@PhilChanstats
4 years ago
But quite often in texts and lectures I see residual used in place of error. So long as you understand the meaning that's ok.
@masaru444
4 years ago
Phil Chan thank you for the answer :) I didn't say that the error term and the residual are the same. They just seem to be similar in their meaning. One is the difference between the observation and the true line, and the other is the difference between the observation and the estimated line. I now have my confirmation :)
@sivaji6362 8 years ago
Difference between Error and Residual?
@RPDBY 8 years ago
but the true line does not exist. i mean, it's never a perfectly linear relationship in reality, so what the true line really means? and if the true line does not exist, what does error term represent? i still don't get it :)
8 years ago
The estimated regression line is the best-fit line that we can compute from the points in our sample. Our sample is limited: we do not know all the points in the whole population. That means our estimated line most probably does not represent the whole population, or said differently, it is not the TRUE regression line. We would get the TRUE LINE only if we calculated it from all points in the population, which is not possible in almost all cases. If you understand the difference between the ESTIMATED regression line and the TRUE fit line, the rest is easy. RESIDUALS = distance between OBSERVED points and the ESTIMATED fit line. ERROR = distance between OBSERVED points and the TRUE fit line (which is unknown). Note that the ERROR is a theoretical, abstract, unknown value: we cannot calculate it because we do not have all the points of the population. We have only the points from our sample. What's the relation? Why is it made so complicated in theory? The answer is really simple: we expect that the RESIDUALS (known values) "APPROXIMATE" the ERRORS (which are unknown).
@RPDBY
8 years ago
Štefan Šimík okay, thank you for the effort. I think I get it better now.
@tqri9795
7 years ago
Thanks for your further explanation!
@rafiullahkhan4622
6 years ago
True line shows the exact relationship between variables. That exact relationship is only known to ALLAH and is beyond the scope of human knowledge. One of the factors is the use of sample data instead of population data. There are many other reasons due to which we can't make the true line.
@vasilis_fr
6 years ago
Yeah, totally agree with Stefan, and I would like to add one last thing. The GM assumptions (needed for the OLS estimator to be BLUE) are about the disturbances, and because we cannot observe the disturbances they are called assumptions. Finally, residuals don't necessarily have to be normally distributed, though in many cases it is convenient.
@goodvibings3566 6 years ago
Let e1, e2 . . . , en be the residual values for the simple linear regression model Yi = β0 + β1xi + εi for i = 1, 2, . . . , n. Using the above model equation, explain why residuals can be used to estimate the unobserved values of the errors ε1, ε2, . . . , εn.
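One way to approach the question above is to substitute the model equation into the definition of the residual. The notation b_0 and b_1 for the OLS estimates of β0 and β1 is an assumption introduced here; the comment itself does not name them.

```latex
\begin{align*}
e_i &= Y_i - b_0 - b_1 x_i \\
    &= (\beta_0 + \beta_1 x_i + \varepsilon_i) - b_0 - b_1 x_i \\
    &= \varepsilon_i - (b_0 - \beta_0) - (b_1 - \beta_1)\, x_i
\end{align*}
```

Each residual equals the corresponding error minus terms that depend only on how far the estimates are from the true parameters. When b_0 and b_1 are close to β0 and β1, e_i is close to ε_i, which is the sense in which residuals can stand in for the unobserved errors.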
@rodrjgue 9 years ago
thx
@michaelrichardson2693 8 years ago
phil can i have your email to ask you about a question i have? thanks
@PhilChanstats
8 years ago
+Michael Richardson Michael - you can try posting your question on youtube.
@michaelrichardson2693
8 years ago
Could you give me a real world example of a theoretical model that is endogenous in OLS? I am studying for my undergraduate degree and am struggling with the concept of endogeneity and exogeneity
@fahadfardan 4 years ago
I read in Wikipedia and it is the exact opposite!!! what is going on ?? A statistical error (or disturbance) is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. For example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the "error" is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the "error" is −0.05 meters. The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either. A residual (or fitting deviation), on the other hand, is an observable estimate of the unobservable statistical error. Consider the previous example with men's heights and suppose we have a random sample of n people. The sample mean could serve as a good estimator of the population mean. Then we have: The difference between the height of each man in the sample and the unobservable population mean is a statistical error, whereas The difference between the height of each man in the sample and the observable sample mean is a residual. www.wikiwand.com/en/Errors_and_residuals
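The height example quoted above can be worked through numerically. In this sketch the population mean of 1.75 m comes from the quoted text, while the five sample heights are made-up values chosen only for illustration. The same logic as in the regression case applies: errors are measured against the (normally unknown) population mean, residuals against the sample mean.

```python
# Population mean from the quoted example; in practice it is unobservable.
POPULATION_MEAN = 1.75

# Hypothetical sample of heights in metres (invented for illustration).
heights = [1.80, 1.70, 1.78, 1.69, 1.83]

sample_mean = sum(heights) / len(heights)

# Statistical errors: deviations from the population mean.
errors = [h - POPULATION_MEAN for h in heights]
# Residuals: deviations from the sample mean.
residuals = [h - sample_mean for h in heights]

print("sample mean:", sample_mean)
print("errors:", errors)
print("residuals:", residuals)
print("sum of errors:", sum(errors))
print("sum of residuals:", sum(residuals))
```

The residuals sum to zero up to floating-point rounding because they are measured from the sample's own mean, while the errors generally do not.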
@damirb6294 2 years ago
The key question is not explained entirely: what is the difference between "True" and "Estimated" line?
@PhilChanstats
2 years ago
The true line has the parameter values (in the example it's the intercept term and slope parameter). These values are not known, so we can't draw the "true" line. Using the data we can get estimates of the parameters. Chances are the estimates are close to but not equal to the true values.
@damirb6294
2 years ago
Thanks a lot! It is completely clear now.
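The point made in this thread, that estimates are close to but not equal to the true parameter values, can be illustrated by drawing many samples from the same process. This is a sketch under assumed values: `BETA0`, `BETA1`, the noise level, the sample size, and the helpers `fit_ols` and `one_sample_slope` are all hypothetical.

```python
import random

random.seed(42)

# Assumed true parameters (hypothetical; never known with real data).
BETA0, BETA1 = 1.0, 2.0


def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def one_sample_slope(n=30):
    """Draw one sample of size n from the true line plus noise; return the estimated slope."""
    x = [random.uniform(0, 10) for _ in range(n)]
    y = [BETA0 + BETA1 * xi + random.gauss(0, 1) for xi in x]
    return fit_ols(x, y)[1]


slopes = [one_sample_slope() for _ in range(500)]
mean_slope = sum(slopes) / len(slopes)

print("true slope:", BETA1)
print("smallest and largest estimate:", min(slopes), max(slopes))
print("average of the estimates:", mean_slope)
```

Each sample gives a slightly different estimated line, and the estimates scatter around the true slope without hitting it exactly.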
@alhabfortnite5297 5 years ago
Putting this video on 1.5x speed is a life saver. You talk soooo unnecessarily slowly, like you're detailing a murder case.
@udcreation92 8 years ago
Explaining very slow, however the video was helpful