Simple Linear Regression: The Least Squares Regression Line
An introduction to the least squares regression line in simple linear regression.
The pain-empathy data is estimated from a figure given in:
Singer et al. (2004). Empathy for pain involves the affective but not sensory components of pain. Science, 303:1157-1162.
The Janka hardness-density data is found in:
Hand, D.J., Daly, F., Lunn, A.D., McConway, K., and Ostrowski, E., editors (1994). The Handbook of Small Data Sets. Chapman & Hall, London.
Original source: Williams, E.J. (1959). Regression Analysis. John Wiley & Sons, New York. Page 43, Table 3.7.
Comments: 90
Why is finding information about statistics that is this clean and organized so hard? Even the textbooks have confusing language, inconsistent notation, etc. Thank you so much for all your hard work. Your videos are invaluable resources.
@user-pl7zr2jm5h
4 months ago
I absolutely agree with you
OH MY GOD! I FOUND HEAVEN ON KZread! I was so scared of failing COMM 215, as I was not understanding anything in this damn course, and then I found your channel! You are a genius! You saved me, as you explain things so well! I am French and not used to technical English, so I wasn't following well in lectures, and thanks to you, I CAN NOW UNDERSTAND THESE DAMN CHAPTERS FOR THE FINALS! THANK YOU SIR! YOU ARE MY HERO!
Hi Dr Jeremy Balka, The entire JB Statistics video series is a truly outstanding work. Many thanks for making your work public so that people like me can benefit from it. Cheers
Thanks man, love the simplicity of your videos! Cheers
Love your channel. Thank you for the hard work!
@jbstatistics
8 years ago
+Jrnm Zqr You are very welcome. I'm glad I could be of help!
Thank you very much for taking the time to do this, it is very much appreciated. All the concepts are perfectly explained and generally done much better than my 2 hour long university lectures!
@jbstatistics
7 years ago
You are very welcome. I'm glad you found my video helpful!
I was so confused about regression, but now it seems very simple. Thank you!
Thanks! It's so much easier to learn statistics with your help!
It's wonderful to have your materials in addition to my lectures. They're simple to understand and very helpful indeed. Thank you
@jbstatistics
9 years ago
You are very welcome! I'm glad you find my videos helpful.
Thank you so very much for making your lectures available. It is very helpful getting these excellent explanations at my own pace.
@jbstatistics
A year ago
You're very welcome. I'm glad to be of help!
God bless you, Sir. You have the most easy-to-follow explanatory statistics channel on KZread. Wish I could rate more than 5 stars. Thank you so much ❤️
Thank you for making these videos; they were extremely helpful in learning the content.
@jbstatistics
8 years ago
+juji432 You are very welcome! All the best.
I've decided to stick with this channel. Very closely related to reality, logically explained, and useful. Wish I had found it sooner.
Thank you for saving my Probability and Statistics course grades!
Great help for finals week
you are a big help! oh my goodness! thank you so much!!! :)
Thank you so much!! I didn't know the error was assumed to be normally distributed, so I was confused for the longest time.
@jbstatistics
6 years ago
I'm glad to be of help!
Outstanding, very clear.
Didn't know about the last part, great explanation!
@jbstatistics
7 years ago
Thanks!
Your videos are very helpful. Big thanks!
@jbstatistics
5 years ago
You are very welcome!
You're an awesome Professor! There are not many people out there who would take the time to do this for their students. Thanks for making statistics easier to understand!
Great video, cheers!
Hi, I'm wondering how to approximate the unknown b in a Rayleigh-distributed random variable using least squares, given some values that the random variable takes. Is it possible to give a short explanation of that?
your voice is so cool
@jbstatistics
7 years ago
Thanks!
Thanks!
Thanks!
7:13 And since two points determine a line: once you know the mean values of x and y, can't you just use the slope to determine the next value of y for a given change in x above the mean of x, so that the y-intercept is not needed to make a second point? Sometimes a value of zero for x is not practical to assume either, as when x is the price of an ounce of gold and y is the price of ten ounces of copper in a scatterplot.
Thanks a lot, sir. The most informative video I've seen on all of YouTube. Please make a video on "The Likelihood Function" as well.
@jbstatistics
8 years ago
+Rupesh Wadibhasme Thanks for the compliment, Rupesh! And thanks for the suggested topic. I do hope to get videos up on the likelihood function and maximum likelihood estimation, but time is a little short these days. All the best.
You're welcome!
Thanks! This project has just about killed me, but it seemed like a good idea at the time :)
Good videos! I learn a lot and clear up my concepts easily.
@jbstatistics
7 years ago
I'm glad to be of help!
Excellent explanation, but what is the interpretation of the model equation?
_You don't yet know how to fit that line, but I do._ Thanks for making statistics kinda fun :)
Hi, thank you for this video. One question: at 6:44 the video says the residuals must sum to zero for least squares regression. Why is that? The sum of squared residuals is just minimized, so couldn't the sum of the residuals be non-zero? Can you explain that?
@randycragun
5 years ago
Suppose the average of the residuals was 2 (the sum would be 2 times however many points there are). That means you could move the line up vertically by 2 and get a better fit to the data points. For a simple example, imagine two points: one with a residual of 4, and another with a residual of 0 (it lies on the regression line). Then the sum of the residuals is 4, and the mean of the residuals is 2. But we can do better than this by moving the regression line up so that it goes between these points rather than directly through one of them. In that case, the residuals become 2 and -2, respectively, and their sum is 0. You can see this also by looking at the sum of the squares of the residuals. In the original case it is 4^2 + 0^2 = 16, which is large compared to what we get if we move the line up by 2 so that it goes between the two points: then the sum of the squares of the residuals is 2^2 + (-2)^2 = 8. This is really easier to illustrate by drawing points and lines, so I hope you try that yourself.
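A quick numerical check of that argument (a minimal sketch using numpy and the same two residuals, 4 and 0, from the example above):

```python
# If the residuals have a nonzero mean m, lowering them all by m
# (i.e., raising the line by m) shrinks the sum of squared residuals
# by n * m**2, and the shifted residuals sum to zero.
import numpy as np

residuals = np.array([4.0, 0.0])   # the two residuals from the example
m = residuals.mean()               # mean residual = 2
print(np.sum(residuals**2))        # 16.0, the original sum of squares
shifted = residuals - m            # raise the line by m
print(np.sum(shifted**2))          # 8.0, a better fit
print(shifted.sum())               # 0.0: the residuals now sum to zero
```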
very nice...thanx.
Could you explain why (Sx)^2, (Sy)^2, and Cor(x,y) are divided by n-1, and not just n? And by the way, your videos are the best explanation of this subject! Definitely a life saver. Keep up the good work =D
@jbstatistics
7 years ago
Thanks for the compliment! I have a video that discusses the one-sample case, "The sample variance: why divide by n-1?" It's available at kzread.info/dash/bejne/a4OCtK-ynbWYdco.html
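For anyone who wants a quick feel for it before watching, here is a small simulation sketch (not from the video; the N(0, 3^2) population and sample size of 5 are arbitrary choices):

```python
# Dividing by n systematically underestimates the population variance;
# dividing by n-1 does not.
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5, 200_000
samples = rng.normal(0, 3, size=(trials, n))   # true variance is 9

print(samples.var(axis=1, ddof=0).mean())   # ~7.2: the n version is biased low
print(samples.var(axis=1, ddof=1).mean())   # ~9.0: the n-1 version is unbiased
```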
@donatorolo2779
7 years ago
Thank you very much for your reply. So kind! I'll watch it =D
Hi, I like your videos. I had a question. I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?
@jbstatistics
5 years ago
The least squares estimators are the least squares estimators: they are the same formulas regardless of the distribution of the errors. The *properties* of the least squares estimators depend on what the distribution of the errors is. Are you asking what would happen if the variance of the epsilons increases with X? If there is increasing variance, and we ignore that, then the resulting least squares estimators (the usual formulas) will still be unbiased, but the reported standard errors will be smaller than they should be.
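A rough simulation sketch of that reply, under an assumed true line y = 1 + 2x with Var(epsilon) = 2x (both made up for illustration):

```python
# With error variance growing in x, the ordinary least squares slope
# formula is unchanged and still unbiased, but the usual standard error
# (which assumes constant variance) no longer matches the slope's
# actual sampling variability.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 50)
Sxx = np.sum((x - x.mean())**2)

slopes, usual_se = [], []
for _ in range(5000):
    y = 1 + 2 * x + rng.normal(0, np.sqrt(2 * x))   # errors ~ N(0, 2x)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    mse = np.sum(resid**2) / (len(x) - 2)           # constant-variance estimate
    slopes.append(b1)
    usual_se.append(np.sqrt(mse / Sxx))

print(np.mean(slopes))                    # ~2: the slope is still unbiased
print(np.std(slopes), np.mean(usual_se))  # true spread vs. the usual SE
```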
thanks
Can you do that in Excel?
Your videos are amazing. May ALLAH bless you.
Hi, I don't understand why b1 is SPxy/SSxx. Can you please explain?
@aslgulyalcndag318
3 years ago
kzread.info/dash/bejne/l6uixZOciK3Td6Q.html
Hi, I like your videos. I had a question. I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?
@jbstatistics
6 years ago
The least squares estimators are still the least squares estimators, regardless of whether the variance of y is constant or has some relationship with x. If we use our regular least squares estimators in a situation where the variance of y is non-constant, then the estimators are still unbiased but the standard errors will be off (and we thus may have misleading conclusions in our statistical inference procedures). If the assumptions of the model are all met, except for the fact that the variance of y is changing with x, then weighted regression will take care of that. In weighted regression, the notion is that points that have a high variance in the random variable y contain less information, and thus should receive less weight in the calculations. We typically weight by the inverse of the variance.
@Delahunta
6 years ago
Okay, thanks. So under the weighted transformation the estimator would be (X'WX)^(-1) X'Wy, where the W matrix has (1/(2xi))^2 for its diagonals?
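A small sketch of that matrix formula on made-up data. One caveat: since the weights are the inverse variances, Var(eps_i) = 2*x_i would put 1/(2*x_i) on the diagonal of W rather than (1/(2*x_i))^2:

```python
# Weighted least squares b = (X'WX)^(-1) X'Wy with inverse-variance weights.
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(1, 10, 40)
y = 1 + 2 * x + rng.normal(0, np.sqrt(2 * x))   # errors ~ N(0, 2x); true line y = 1 + 2x

X = np.column_stack([np.ones_like(x), x])       # design matrix [1, x]
W = np.diag(1 / (2 * x))                        # inverse-variance weights

b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # (X'WX)^(-1) X'Wy
print(b)                                        # roughly [1, 2]
```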
Thanks
4:36 If we can solve for beta0 and beta1 using the equations beta0 = mean(y) - beta1*mean(x) and beta1 = cov(x,y)/var(x), why should we use OLS instead?
@jbstatistics
A year ago
We're not solving for beta_0 and beta_1, as they are parameters whose true values are unknown. We are solving for the least squares estimators of beta_0 and beta_1. At 4:36 I'm referring to the sample covariance of X and Y, and the sample variance of X, and just giving another way of expressing the formula we just derived.
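A minimal sketch of that identity on a made-up data set: the slope is the sample covariance of x and y over the sample variance of x, and the intercept follows from the means:

```python
# b1 = Sxy / Sx^2 and b0 = ybar - b1 * xbar; both S quantities use the
# n-1 denominator, which cancels in the ratio.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # sample covariance / sample variance
b0 = y.mean() - b1 * x.mean()

print(b1, b0)                 # the least squares estimates
print(np.polyfit(x, y, 1))    # same answer from a direct fit: [slope, intercept]
```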
Doesn't everything seem that way at the time? Speaking of which, after all those hours of studying and review, guess who forgot to bring a calculator to the exam last night. This guy.
solidarity for Canadians who call zero 'nought'
What's the difference between *Random error component* and *Residuals*?
@jbstatistics
7 years ago
Epsilon represents the theoretical random error component (a random variable). The residuals are the differences between the observed and predicted values of Y.
@TreBlass
7 years ago
So epsilon is basically a random variable which captures the disturbances (values) from the mean (the regression line), and a residual is an element of the random error component? In other words, a residual is a subset of the random error component? Also, a residual is one of the many disturbances from the regression line for a given X? Please correct me if I am wrong.
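One way to see the distinction is to simulate from a known true line (here y = 1 + 2x, an arbitrary choice): the epsilons are deviations from the true line and are unobservable in practice, while the residuals are deviations from the fitted line and can be computed from the data.

```python
# epsilon_i: deviation from the TRUE line (a random variable, unobservable);
# residual e_i: deviation from the FITTED line (computable from the data).
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 30)
eps = rng.normal(0, 1, size=x.size)   # the random error component
y = 1 + 2 * x + eps                   # data generated from the true line

b1, b0 = np.polyfit(x, y, 1)          # fitted least squares line
resid = y - (b0 + b1 * x)             # residuals: observed minus predicted

print(eps[:3], resid[:3])   # similar, but not identical
print(resid.sum())          # ~0 by construction; eps.sum() need not be
```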
Hi jb, just curious: why don't any of your videos have ads? I love it!!!!!
@jbstatistics
9 years ago
Lixi W I don't enable ads for a number of reasons. The main one is that I'm simply trying to help people learn statistics, and forcing people to watch 5 seconds of an ad before getting some help just feels wrong. And the amount of revenue would be pretty small (forcing people to watch a video ad 3 million times just so I can get $2k or so in taxable dollars just doesn't add up to me).
@luisc212
8 years ago
Shout-out to @jbstatistics for not being a sellout!
@jbstatistics
8 years ago
+Luis C Thanks Luis!
Beauty
I love your videos, which combine a concise knowledge structure with a sexy voice >.
so does the computer just guess at random?
@jbstatistics
7 years ago
I don't know what you're asking. If you clarify, I might be able to answer. Cheers.
@BraveLittIeToaster
7 years ago
@5:00 what formula does the computer use to identify the slope/intercept of y?
@jbstatistics
7 years ago
The software calculates the sample slope and intercept using the formulas I discuss earlier in the video (at 4:09).
I love my jbstatistics, my superhero
the sum of PRODUCTS
My teacher writes b0 & b1 as â and b̂.
we don't know how to fit the line but I DO. LOL
Hahaha, today I found out we go to the same university, Professor Balka xD
@jbstatistics
10 years ago
Yes, that happens a lot :)
Dude sounds like Max Kellerman lol
tmkc
This was not your best
Thanks!
You're welcome!