Simple Linear Regression: The Least Squares Regression Line

An introduction to the least squares regression line in simple linear regression.
The pain-empathy data is estimated from a figure given in:
Singer et al. (2004). Empathy for pain involves the affective but not sensory components of pain. Science, 303:1157-1162.
The Janka hardness-density data is found in:
Hand, D.J., Daly, F., Lunn, A.D., McConway, K., and Ostrowski, E., editors (1994). The Handbook of Small Data Sets. Chapman & Hall, London.
Original source: Williams, E.J. (1959). Regression Analysis. John Wiley & Sons, New York. Page 43, Table 3.7.

Comments: 90

  • @s2ms10ik5 · 3 years ago

    Why is finding information about statistics this clean and organized so hard? Even the textbooks have confusing language, inconsistent notation, etc. Thank you so much for all your hard work. Your videos are an invaluable resource.

  • @user-pl7zr2jm5h · 4 months ago

    I absolutely agree with you

  • @Stalysfa · 9 years ago

    OH MY GOD! I FOUND HEAVEN ON YOUTUBE! I was so scared of failing COMM 215, as I wasn't understanding anything in this damn course, and then I found your channel! You are a genius! You saved me, as you explain things so well! I am French and not used to technical English, so I wasn't following well in lectures; thanks to you, I CAN NOW UNDERSTAND THESE DAMN CHAPTERS FOR THE FINALS! THANK YOU SIR! YOU ARE MY HERO!

  • @probono2876 · 8 years ago

    Hi Dr. Jeremy Balka, the entire JB Statistics video series is truly outstanding work. Many thanks for making your work public so that people like me can benefit from it. Cheers

  • @duartediniz8255 · 9 years ago

    Thanks man, love the simplicity of your videos! Cheers

  • @ezquerzelaya · 8 years ago

    Love your channel. Thank you for the hard work!

  • @jbstatistics · 8 years ago

    +Jrnm Zqr You are very welcome. I'm glad I could be of help!

  • @catherinedumbledore · 7 years ago

    Thank you very much for taking the time to do this, it is very much appreciated. All the concepts are perfectly explained, and generally done much better than my 2-hour-long university lectures!

  • @jbstatistics · 7 years ago

    You are very welcome. I'm glad you found my video helpful!

  • @saifa9456 · 10 years ago

    I was so confused about regression, but now it seems very simple. Thank you!

  • @jojokaleido · 8 years ago

    Thanks! It's so much easier to learn statistics with your help!

  • @bunmeng007 · 9 years ago

    It's wonderful to have your materials in addition to my lectures. They're simple to understand and very helpful indeed. Thank you!

  • @jbstatistics · 9 years ago

    You are very welcome! I'm glad you find my videos helpful.

  • @valeriereid2337 · 1 year ago

    Thank you so very much for making your lectures available. It is very helpful getting these excellent explanations at my own pace.

  • @jbstatistics · 1 year ago

    You're very welcome. I'm glad to be of help!

  • @danielladamian1596 · 9 months ago

    God bless you, Sir. You have the most easy-to-follow explanatory statistics channel on YouTube. Wish I could rate more than 5 stars. Thank you so much ❤️

  • @juji432 · 8 years ago

    Thank you for making these videos; they were extremely helpful in learning the content.

  • @jbstatistics · 8 years ago

    +juji432 You are very welcome! All the best.

  • @renjing · 3 years ago

    I've decided to stick with this channel. Very closely related to reality, logically explained, and useful. Wish I had found it sooner.

  • @puneetkumarsingh1484 · 7 months ago

    Thank you for saving my Probability and Statistics course grades!

  • @cassini4052 · 4 years ago

    Great help for finals week

  • @sarygirl4776 · 8 years ago

    you are a big help! oh my goodness! thank you so much!!! :)

  • @pesterlis · 6 years ago

    Thank you so much!! I didn't know the error was assumed to be normally distributed, so I was confused for the longest time.

  • @jbstatistics · 6 years ago

    I'm glad to be of help!

  • @waawaaweewaa2045 · 11 years ago

    Outstanding, very clear.

  • @CortezPro · 7 years ago

    Didn't know about the last part, great explanation!

  • @jbstatistics · 7 years ago

    Thanks!

  • @mlbbsea6446 · 6 years ago

    Your videos are very helpful. Big thanks!

  • @jbstatistics · 5 years ago

    You are very welcome!

  • @brodeurheaton · 11 years ago

    You're an awesome Professor! There are not many people out there who would take the time to do this for their students. Thanks for making statistics easier to understand!

  • @blink11101 · 9 years ago

    Great video, cheers!

  • @ScilexGuitar · 6 years ago

    Hi, I'm wondering how to approximate the unknown parameter b of a Rayleigh-distributed random variable using least squares, given some values that the random variable takes. Is it possible to give a short explanation of that?

  • @Jean-cu8if · 7 years ago

    your voice is so cool

  • @jbstatistics · 7 years ago

    Thanks!

  • @jbstatistics · 11 years ago

    Merci!

  • @jbstatistics · 11 years ago

    Thanks!

  • @StephenDoty84 · 4 years ago

    7:13 And if two points determine a line, then once you know (x̄, ȳ), the point of means, you can just use the slope to get the value of y for a given change in x above x̄, so the y-intercept isn't needed to make a second point? Sometimes x = 0 isn't practical to assume either, as when x is the price of an ounce of gold and y is the price of ten ounces of copper in a scatterplot.

  • @TheSoundGrid. · 8 years ago

    Thanks a lot, sir. The most informative video I've seen on all of YouTube. Please provide a video on "The Likelihood Function" as well.

  • @jbstatistics · 8 years ago

    +Rupesh Wadibhasme Thanks for the compliment Rupesh! And thanks for the suggested topic. I do hope to get videos up on the likelihood function and maximum likelihood estimation, but time is a little short these days. All the best.

  • @jbstatistics · 11 years ago

    You're welcome!

  • @jbstatistics · 11 years ago

    Thanks! This project has just about killed me, but it seemed like a good idea at the time :)

  • @mushtaqahmad8329 · 7 years ago

    Good videos. I learned a lot, and they clear up my concepts easily.

  • @jbstatistics · 7 years ago

    I'm glad to be of help!

  • @ahmedabdelmaaboud3460 · 4 years ago

    Excellent explanation, but what is the interpretation of the model equation?

  • @ABo-jr8pg · 5 years ago

    _You don't yet know how to fit that line but I do_ Thanks for making statistics kinda fun :)

  • @vivianandlin · 8 years ago

    Hi, thank you for this video. One question: at 6:44, the video says the residuals must sum to zero for least squares regression. Why is that? The sum of squared residuals is just minimized, so couldn't the sum of residuals be non-zero? Can you explain that?

  • @randycragun · 5 years ago

    Suppose that the average of the residuals was 2 (the sum would be 2 times however many points there are). That means you could move the line up vertically by 2 and have a better fit to the data points. For a simple example, imagine two points: one with a residual of 4, and another with a residual of 0 (it is on the regression line). Then the sum of the residuals is 4, and the mean of the residuals is 2. But we can do better than this by moving the regression line up to go between these points (rather than directly through one of them). In that case, the residuals would become -2 and 2, respectively, and their sum would be 0. You can see this also by looking at the sum of the squares of the residuals. In this case, the sum of the squares of the residuals is 0^2+4^2 = 16. That is large compared to what we get if we move the line up by 2 so that it goes between the two points. Then the sum of the squares of the residuals is (-2)^2+2^2 = 8. This is really easier to illustrate by drawing points and lines, so I hope you try that yourself.
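
    To see this numerically, here is a minimal sketch with made-up data (the numbers and seed are arbitrary, not from the video): fit the least squares line and check that the residuals sum to essentially zero.

    ```python
    import numpy as np

    # Fit a least squares line to hypothetical data and verify that
    # the residuals sum to (essentially) zero.
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=50)
    y = 3.0 + 1.5 * x + rng.normal(0, 2, size=50)  # made-up data

    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()

    residuals = y - (b0 + b1 * x)
    print(residuals.sum())  # ~0, up to floating-point error
    ```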

  • @pallavibhardwaj7465 · 11 years ago

    Very nice... thanks!

  • @donatorolo2779 · 7 years ago

    Could you explain why (Sx)^2, (Sy)^2, and Cor(x,y) are divided by n-1, and not just n? And by the way, your videos are the best explanation on this subject! Definitely a lifesaver. Keep up the good work =D

  • @jbstatistics · 7 years ago

    Thanks for the compliment! I have a video that discusses the one-sample case, "The sample variance: why divide by n-1". It's available at kzread.info/dash/bejne/a4OCtK-ynbWYdco.html

  • @donatorolo2779 · 7 years ago

    Thank you very much for your reply. So kind! I'll watch it =D
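
    An aside on the n-1 question above (a standard algebraic point, not something covered in this thread): whichever divisor you use, it cancels in the slope estimate, so b1 is unaffected:

    ```latex
    \[
    b_1 = \frac{\widehat{\operatorname{Cov}}(x,y)}{s_x^2}
        = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})\,/\,(n-1)}
               {\sum_i (x_i-\bar{x})^2\,/\,(n-1)}
        = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}
        = \frac{SP_{xy}}{SS_{xx}}
    \]
    ```

    The n-1 matters for interpreting s_x^2 and s_y^2 as unbiased variance estimates, but not for the fitted line itself.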

  • @nyashanyakuchena786 · 5 years ago

    Hi, I like your videos. I had a question. I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?

  • @jbstatistics · 5 years ago

    The least squares estimators are the least squares estimators -- they are the same formulas regardless of the distribution of the errors. The *properties* of the least squares estimators depend on what the distribution of the errors is. Are you asking what would happen if the variance of the epsilons increases with X? If there is increasing variance, and we ignore that, then the resulting least squares estimators (the usual formulas) will still be unbiased, but the reported standard errors will be smaller than they should be.
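
    A rough simulation sketch of the unbiasedness point (my own illustration with made-up parameters, not from the video): even when the error variance grows with x, the average of the usual least squares slope estimates stays close to the true slope.

    ```python
    import numpy as np

    # Simulate heteroscedastic errors with Var(eps_i) = 2 * x_i and
    # check that the usual least squares slope is still unbiased.
    rng = np.random.default_rng(1)
    true_b0, true_b1 = 1.0, 2.0  # hypothetical true parameters
    x = np.linspace(1, 10, 40)

    slopes = []
    for _ in range(5000):
        eps = rng.normal(0.0, np.sqrt(2 * x))  # sd grows with x
        y = true_b0 + true_b1 * x + eps
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        slopes.append(b1)

    print(np.mean(slopes))  # close to the true slope of 2.0
    ```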

  • @myworldAI · 3 years ago

    thanks

  • @cife94 · 11 years ago

    Can you do that in Excel?

  • @mennaehab2409 · 4 years ago

    Your videos are amazing. May ALLAH bless you.

  • @liftyshifty · 5 years ago

    Hi, I don't understand why b1 is SPxy/SSxx. Can you please explain?

  • @aslgulyalcndag318 · 3 years ago

    kzread.info/dash/bejne/l6uixZOciK3Td6Q.html

  • @Delahunta · 6 years ago

    Hi, I like your videos. I had a question. I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?

  • @jbstatistics · 6 years ago

    The least squares estimators are still the least squares estimators, regardless of whether the variance of y is constant or has some relationship with x. If we use our regular least squares estimators in a situation where the variance of y is non-constant, then the estimators are still unbiased but the standard errors will be off (and we thus may have misleading conclusions in our statistical inference procedures). If the assumptions of the model are all met, except for the fact that the variance of y is changing with x, then weighted regression will take care of that. In weighted regression, the notion is that points that have a high variance in the random variable y contain less information, and thus should receive less weight in the calculations. We typically weight by the inverse of the variance.

  • @Delahunta · 6 years ago

    Okay, thanks. So under the weighted transformation, the estimator would be (X'WX)^(-1) X'Wy, where the W matrix has (1/(2x_i))^2 on its diagonal?
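
    For what it's worth, here is a sketch of that matrix form with made-up data (my own illustration, not from the video). One caveat: if Var(eps_i) = 2x_i, then inverse-variance weighting puts w_i = 1/(2x_i) on the diagonal of W, i.e. the reciprocal of the variance rather than its square.

    ```python
    import numpy as np

    # Weighted least squares in matrix form: b = (X'WX)^(-1) X'Wy,
    # with weights equal to the inverse of the error variance.
    rng = np.random.default_rng(2)
    x = np.linspace(1, 10, 40)
    y = 1.0 + 2.0 * x + rng.normal(0.0, np.sqrt(2 * x))  # made-up data

    X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
    W = np.diag(1.0 / (2.0 * x))               # w_i = 1 / Var(eps_i)

    b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    print(b)  # roughly [1.0, 2.0] = [b0, b1]
    ```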

  • @Malangsufi · 11 years ago

    Thanks

  • @kamzzaa7265 · 1 year ago

    4:36 If we can solve for beta0 and beta1 using the equations beta0 = mean(y) - beta1 * mean(x) and beta1 = cov(x,y)/var(x), why should we use OLS instead?

  • @jbstatistics · 1 year ago

    We're not solving for beta_0 and beta_1, as they are parameters whose true values are unknown. We are solving for the least squares estimators of beta_0 and beta_1. At 4:36 I'm referring to the sample covariance of X and Y, and the sample variance of X, and just giving another way of expressing the formula we just derived.
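
    A quick numerical sketch of that equivalence, with made-up data (arbitrary numbers and seed, not from the video): the estimate (sample covariance)/(sample variance of x) is the same number as SPxy/SSxx, since the n-1 divisors cancel.

    ```python
    import numpy as np

    # Compute the least squares slope two equivalent ways.
    rng = np.random.default_rng(3)
    x = rng.uniform(0, 10, size=30)
    y = 4.0 - 0.7 * x + rng.normal(0, 1, size=30)  # made-up data

    sp_xy = np.sum((x - x.mean()) * (y - y.mean()))  # SPxy
    ss_xx = np.sum((x - x.mean()) ** 2)              # SSxx

    b1_from_sums = sp_xy / ss_xx
    b1_from_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    b0 = y.mean() - b1_from_sums * x.mean()

    print(b1_from_sums, b1_from_cov)  # identical up to floating point
    ```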

  • @brodeurheaton · 11 years ago

    Doesn't everything seem that way at the time? Speaking of which, after all those hours of studying and review, guess who forgot to bring a calculator to the exam last night. This guy.

  • @Cleisthenes2 · 1 year ago

    Solidarity for Canadians who call zero 'nought'.

  • @TreBlass · 7 years ago

    What's the difference between *Random error component* and *Residuals*?

  • @jbstatistics · 7 years ago

    Epsilon represents the theoretical random error component (a random variable). The residuals are the differences between the observed and predicted values of Y.

  • @TreBlass · 7 years ago

    So epsilon is basically a random variable that captures the disturbances (values) from the mean (the regression line), and a residual is an element of the random error component? In other words, is a residual a subset of the random error component? Also, is a residual one of the many disturbances from the regression line for a given X? Please correct me if I am wrong.
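
    To make the distinction concrete, a small simulation sketch with made-up parameters (my own illustration, not from the video): the errors come from the true, unknown line, while the residuals come from the fitted line. Residuals estimate the errors but are not the same numbers, and they are not a subset of them.

    ```python
    import numpy as np

    # epsilon_i = y_i - (beta0 + beta1 * x_i) uses the true line;
    # residual e_i = y_i - (b0 + b1 * x_i) uses the fitted line.
    rng = np.random.default_rng(4)
    beta0, beta1 = 2.0, 0.5  # hypothetical true parameters
    x = rng.uniform(0, 10, size=20)
    eps = rng.normal(0, 1, size=20)  # error component (unobservable)
    y = beta0 + beta1 * x + eps

    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)  # residuals (computable from the data)

    print(eps[:3])    # first few true errors
    print(resid[:3])  # first few residuals -- similar, but not equal
    ```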

  • @jiaxiwang6457 · 9 years ago

    Hi jb, just curious, why do your videos not have ads? I love it!!!!!

  • @jbstatistics · 9 years ago

    Lixi W I don't enable ads for a number of reasons. The main one is that I'm simply trying to help people learn statistics, and forcing people to watch 5 seconds of an ad before getting some help just feels wrong. And the amount of revenue would be pretty small (forcing people to watch a video ad 3 million times just so I can get $2k or so in taxable dollars just doesn't add up to me).

  • @luisc212 · 8 years ago

    S/O to @jbstatistics for not being a sellout!

  • @jbstatistics · 8 years ago

    +Luis C Thanks Luis!

  • @d3thdrive · 4 years ago

    Beauty

  • @captainw6307 · 4 years ago

    I love your videos, which consist of a concise knowledge structure and a sexy voice >.

  • @BraveLittIeToaster · 7 years ago

    So does the computer just guess at random?

  • @jbstatistics · 7 years ago

    I don't know what you're asking. If you clarify, I might be able to answer. Cheers.

  • @BraveLittIeToaster · 7 years ago

    @5:00 What formula does the computer use to identify the slope/intercept of y?

  • @jbstatistics · 7 years ago

    The software calculates the sample slope and intercept using the formulas I discuss earlier in the video (at 4:09).

  • @sebgrootus · 3 years ago

    I love my jbstatistics, my superhero

  • @64FireStorm · 5 years ago

    the sum of PRODUCTS

  • @nostro1940 · 2 years ago

    My teacher writes b0 and b1 as â and b̂.

  • @acinomknip · 5 years ago

    "We don't know how to fit the line, but I DO." LOL

  • @doopydave · 10 years ago

    Hahaha, today I found out we go to the same university, Professor Balka xD

  • @jbstatistics · 10 years ago

    Yes, that happens a lot :)

  • @09ak31 · 6 years ago

    Dude sounds like Max Kellerman lol

  • @vikasindoria4788 · 10 months ago

    tmkc

  • @minabotieso6944 · 3 years ago

    This was not your best

  • @jbstatistics · 11 years ago

    Thanks!

  • @jbstatistics · 11 years ago

    You're welcome!