Assumptions of Linear Regression

In order for the results of a regression analysis to be interpreted meaningfully, certain conditions must be met:
1) Linearity: There must be a linear relationship between the dependent and independent variables.
2) Homoscedasticity: The residuals must have a constant variance.
3) Normality: The residuals must be normally distributed.
4) No Multicollinearity: There must be no high correlation between the independent variables.
Linearity:
In linear regression, a straight line is fitted through the data. This straight line should represent all points as well as possible. If the relationship is nonlinear, a straight line cannot fulfill this requirement.
Normal distribution of the error:
One assumption of linear regression is that the error epsilon must be normally distributed.
There are two ways to check this: the analytical way (a statistical test of normality applied to the residuals) and the graphical way (a histogram or Q-Q plot of the residuals).
Homoscedasticity:
An assumption of linear regression is that the residuals have a constant variance.
Since your regression model never predicts your dependent variable exactly in practice, you always have an error. You can plot the dependent variable on the x axis and the error on the y axis; if the variance is constant, the errors scatter evenly over the whole range.
Multicollinearity:
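In multicollinearity, two or more of the predictors correlate strongly with each other. This makes it hard to separate their individual effects, so the estimated coefficients become unstable.
Test your assumptions for linear regression online: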
datatab.net/statistics-calcul...
And here is more information about regression:
datatab.net/tutorial/linear-r...

Comments: 58

  • @datatab
    @datatab 1 year ago

    If you like, please find our e-Book here: datatab.net/statistics-book 😎

  • @shubhamthote4291
    @shubhamthote4291 1 year ago

    Hello Ma'am, your teaching technique is really awesome. Please make a video lecture on "What if these linear regression assumptions get violated?"

  • @useful6131
    @useful6131 2 years ago

    Amazing! Explained so simply! It saved me a lot of time searching for bad explanations :)

  • @datatab

    @datatab

    2 years ago

    Many thanks 😊! Regards, Hannah

  • @sonalupadhyay555

    @sonalupadhyay555

    1 year ago

    Same here. Absolutely agree.

  • @KitsGravity
    @KitsGravity 2 years ago

    Well explained. Thanks for including the diagnostics, which is by far the most important part and something not often covered in most of the videos.

  • @datatab

    @datatab

    2 years ago

    Glad it was helpful! Regards Hannah

  • @marwatawfik3956
    @marwatawfik3956 1 year ago

    Thanks so much. Do you have some features open (free) for students (i.e. regression)?

  • @ishaqhussain3892
    @ishaqhussain3892 1 year ago

    Presentation of the concept is excellent 👍. Much appreciated 🎉

  • @datatab

    @datatab

    1 year ago

    Thanks a lot 😊

  • @ananyaagarwal6504
    @ananyaagarwal6504 1 year ago

    Well explained!

  • @retenim28
    @retenim28 2 years ago

    Hi, thanks for the video. Regarding the second assumption (residuals must be normally distributed): the histogram represents the distribution of the residuals, right? I didn't understand whether the points in the Q-Q plot are the residuals or the sample data.

  • @datatab

    @datatab

    2 years ago

    Hello, thank you very much! Yes, you are right! The captions are not correct; it is the residuals in both cases! Regards, Hannah

  • @retenim28

    @retenim28

    2 years ago

    @@datatab Thanks for the reply. It's very common to see people checking the normality condition on the sample data and not on the residuals. I suppose it's a mistake. Instead, other people say: "Ok, it's not a "real assumption", but it is preferable that features are normally distributed, not only the residuals". Is there any truth behind this statement?

  • @datatab

    @datatab

    2 years ago

    @@retenim28 Hmm, normally your main assumption is that the residuals are normally distributed! I can't answer that off the top of my head, but maybe the residuals are always normally distributed if all variables are normally distributed, but I don't know that for sure!

  • @bhavaniprasadraoejanthkar2498
    @bhavaniprasadraoejanthkar2498 2 years ago

    Hey! Your videos are awesome! It would be great if you make more videos on Machine Learning concepts.

  • @datatab

    @datatab

    2 years ago

    Many thanks! Yes we will try! Regards Hannah

  • @Gesuselsaviour
    @Gesuselsaviour 2 years ago

    Thanks for the video, found it very helpful. Do we also have to ensure that there are no influential points in the data?

  • @datatab

    @datatab

    2 years ago

    What do you mean by influential points? Personally, I haven't heard of influential points as a requirement, but I haven't looked that up in more detail either! Regards, Hannah and Mathias

  • @Gesuselsaviour

    @Gesuselsaviour

    2 years ago

    @@datatab By influential point I mean an outlier that greatly affects the slope of the regression line. I was just wondering what the rule of thumb regarding them is when it comes to regression. But fair play if you're not sure if they are part of regression assumptions.

  • @datatab

    @datatab

    2 years ago

    @@Gesuselsaviour Well, if the outliers are too large, then the error epsilon will probably no longer be normally distributed and thus the requirements are not met. But as is so often the case, there is unfortunately no fixed threshold that says up to here it is still acceptable and beyond that it is not!

  • @kamleshgya6694
    @kamleshgya6694 3 years ago

    Very helpful. Thank you so much.

  • @datatab

    @datatab

    3 years ago

    Thanks for your feedback!!! Cheers Hannah & Mathias

  • @perpalmgren4786
    @perpalmgren4786 2 years ago

    Hi! Thank you for a great statistics program and wonderful tutorials. One question and one statement: - Why are two other important assumptions not addressed, namely the problem with outliers and the requirement of independence of the residuals? - Maybe it should be made clearer that normality means that the residuals should be normally distributed around the predicted score of the dependent variable. It can be misunderstood to mean that the raw data should be normally distributed.

  • @datatab

    @datatab

    2 years ago

    Hello Per, thank you for your feedback! Yes, that's right! Maybe we can make another video that covers this better! Regards, Hannah

  • @purplegeezer

    @purplegeezer

    1 year ago

    @@datatab The assumption of independence of errors is actually very important. Your video is misleading people by not covering it.

  • @devanshujindal1731
    @devanshujindal1731 2 years ago

    Thank you ma'am for such a simple explanation it really helped me

  • @datatab

    @datatab

    2 years ago

    Glad to hear that! Many thanks! Regards Hannah

  • @WhyMunch69
    @WhyMunch69 6 months ago

    Thank you ma'am!

  • @foucault9978
    @foucault9978 3 years ago

    Thank you very much!

  • @datatab

    @datatab

    3 years ago

    You are welcome!

  • @md.kutubulalamubayed6205
    @md.kutubulalamubayed6205 11 months ago

    Marvelous

  • @k03dz0n3
    @k03dz0n3 1 year ago

    soooo good! tysm

  • @alia4642
    @alia4642 2 years ago

    Thanks so much. What about the assumption: independence of the observations?

  • @datatab

    @datatab

    2 years ago

    Hmm, I have not read about it yet, but it could make sense! Maybe the result is then no longer normally distributed. In that case the assumption of independent observations would be included in the assumption of normally distributed errors. Regards, Hannah

  • @mohsinalam8085
    @mohsinalam8085 3 years ago

    Fantastic. Thanks a lot

  • @datatab

    @datatab

    3 years ago

    Thanks for your Feedback!

  • @AbrarKnowledge
    @AbrarKnowledge 2 years ago

    Nicely explained!!

  • @datatab

    @datatab

    2 years ago

    Many Thanks : )

  • @sanjeevgarg4355
    @sanjeevgarg4355 2 years ago

    Thank you ma'am, helped a lot

  • @datatab

    @datatab

    2 years ago

    Thanks!!!

  • @ipshitaghosh2656
    @ipshitaghosh2656 3 years ago

    Very nice explanation 😄🙌

  • @datatab

    @datatab

    3 years ago

    Thanks!

  • @hannahhordijk5121
    @hannahhordijk5121 8 months ago

    Why is it the case that you should square the determinant in order to check for linearity? If there would be a logistic correlation, it would still be significant if you squared the determinant, right? Or not?

  • @pramodpatil1609
    @pramodpatil1609 3 years ago

    Thanks u so much

  • @datatab

    @datatab

    3 years ago

    Many thanks for your Feedback!

  • @datatab

    @datatab

    3 years ago

    Regards, Hannah and Mathias

  • @gopalakrishnamraju9321
    @gopalakrishnamraju9321 2 years ago

    Loved your accent

  • @datatab

    @datatab

    2 years ago

    🙂

  • @user-kk4og9yv1p
    @user-kk4og9yv1p 1 year ago

    Your content is very touching

  • @mosama22
    @mosama22 2 years ago

    Wow! Thank you :-)

  • @datatab

    @datatab

    2 years ago

    Thanks for your Feedback! Regards, Hannah

  • @zimalkhan1422
    @zimalkhan1422 2 years ago

    Some segments in the video are stamped not adjacent to each other

  • @datatab

    @datatab

    2 years ago

    Hello, what do you mean by that?

  • @amansinghal5908
    @amansinghal5908 1 year ago

    I'm in love

  • @datatab

    @datatab

    1 year ago

    : )

  • @car6120
    @car6120 1 year ago

    so wtf do i do if my data isn't linear? just show a graph saying it's not linear, therefore i haven't bothered to run any stats, and all this data and research is a waste of time?

  • @amardeepyaduvanshi5736

    @amardeepyaduvanshi5736

    5 months ago

    Nope. Then you identify what kind of relationship your dataset follows and try to predict with that. There are a number of other relationships apart from linear. It might be quadratic or logarithmic.