Tutorial 29 - R square and Adjusted R square Clearly Explained | Machine Learning

The adjusted R-squared is a modified version of R-squared that accounts for the number of predictors in the model. The adjusted R-squared increases only if a new term improves the model more than would be expected by chance, and it decreases when a predictor improves the model by less than would be expected by chance.
#RsquareAdjustedRsquare
You can buy my book on Finance with ML and DL from the below link
www.amazon.in/Hands-Python-Fi...

Comments: 142

  • @Emotekofficial · 4 years ago

    Note: the sum of squared residuals, also called the sum of squared errors (SSE), and the sum of squares due to regression (SSR) are easy to mix up, so make sure about this, since new students can get confused. With Y = individual data points, Yreg = predicted regression points, and Ymean = average of the individual data points: SSE = Σ(Y − Yreg)², SSR = Σ(Yreg − Ymean)², so SST = SSE + SSR = Σ(Y − Ymean)².
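The decomposition in the comment above holds exactly for an ordinary least-squares fit with an intercept. A minimal pure-Python sketch (the tiny dataset is made up purely for illustration):

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) \
            / sum((xi - x_mean) ** 2 for xi in x)
    return slope, y_mean - slope * x_mean

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b = fit_line(x, y)
y_pred = [m * xi + b for xi in x]
y_mean = sum(y) / len(y)

sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))  # residual (error) sum of squares
ssr = sum((pi - y_mean) ** 2 for pi in y_pred)          # regression sum of squares
sst = sum((yi - y_mean) ** 2 for yi in y)               # total sum of squares

assert abs(sst - (sse + ssr)) < 1e-9  # SST = SSE + SSR holds for OLS with intercept
```

Note the decomposition is guaranteed only for least-squares fits that include an intercept; for other models SSE + SSR need not equal SST.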

  • @porselvans6172 · 3 years ago

    Thank you, now I understood it well.

  • @porselvans6172 · 3 years ago

    @Ahmed Kellen Didn't they ask for money?

  • @ShashwatAgarwal007 · 3 years ago

    Hey, can you help me with the 'N' here: is it the total number of features or the total number of data points?

  • @GamerBoy-ii4jc · 2 years ago

    @ShashwatAgarwal007 Big N is the total size of the population, and small n is the number of samples we take from the population.

  • @blindprogrammer · 2 years ago

    Initially: N = 1000, R² = 0.85, p = 5, so adjusted R² = 1 − ((1 − 0.85)(1000 − 1)/(1000 − 5 − 1)) = 0.84924. 1. Suppose a new non-correlated variable is added, so R² barely rises: N = 1000, R² = 0.8501 (suppose), p = 6, adjusted R² = 1 − ((1 − 0.8501)(1000 − 1)/(1000 − 6 − 1)) = 0.84919. 2. Suppose a new correlated variable is added: N = 1000, R² = 0.92 (suppose), p = 6, adjusted R² = 1 − ((1 − 0.92)(1000 − 1)/(1000 − 6 − 1)) = 0.91952. As we can notice, on adding a non-correlated predictor the overall adjusted R² has decreased (R² rose by less than the penalty for the extra predictor), while it has increased on adding a correlated predictor. Hope it helps!
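The arithmetic in the worked example above can be wrapped in a small helper. The R² values below are hypothetical placeholders (as in the comment), not outputs of any real model:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1); n = samples, p = predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

base = adjusted_r2(0.85, 1000, 5)     # baseline model with 5 predictors
noise = adjusted_r2(0.8501, 1000, 6)  # tiny R² gain from a useless 6th feature
good = adjusted_r2(0.92, 1000, 6)     # large R² gain from a useful 6th feature

# the useless feature lowers adjusted R²; the useful one raises it
assert noise < base < good
```

With N = 1000 the penalty per extra predictor is small, which is why the non-correlated case needs an R² gain of only about 0.0001 to break even; with small samples the penalty is much harsher.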

  • @tanvipunjani7096 · 2 years ago

    I am glad I came across this tutorial. Very well explained!

  • @praneethcj6544 · 4 years ago

    Very intuitive explanation..!!! You have been such an inspirational instructor ..!!!!

  • @nilupulperera · 4 years ago

    Very interesting Krish. As always you stimulate us to think and learn.

  • @sakshirikhe2869 · 3 years ago

    It's very excellent and detailed explanation for a beginner!!!

  • @mawais2560 · 3 years ago

    What are possible interpretations and justifications for low R-square values in management science?

  • @SandeepKumar-ie1ni · 4 years ago

    Sir, as you said, in order to avoid negative values in the residuals we square the terms in SSres and SStot. But sir, if we apply the absolute value to both instead of squaring the terms, what will be the change in the R value? On squaring, the R value gets larger and reaches towards 1 more easily, which depicts that our model has fitted well. Please answer, sir.

  • @bobbypathak123 · 2 years ago

    Wow, thanks so much Krish. This was the best explanation I found.

  • @tannurohela6192 · 2 years ago

    Hey, I didn't get the term 'penalizing'. In the video, just before explaining adjusted R-square, it was said that "it is not penalizing the newly added features". Can someone please elaborate?

  • @ayushmaheshwari5805 · 3 years ago

    Please tell us why SSres decreases as we increase the number of features. Please explain.

  • @ruchiyadav1334 · 3 years ago

    I didn't understand anything.

  • @adylmanulat2465 · 2 years ago

    Good day sir, I just wanted to ask: if an independent variable is not significant and has no explanatory power in the model, but removing it lowers the adjusted R-square, what does this imply? So far, the only reason I know of is that its t-statistic is greater than one. With this information, what can we infer?

  • @akshaymote3430 · 5 months ago

    I didn't get one thing: even in adjusted R², whether there is correlation or not is not taken into consideration. So, by just considering the number of variables, how does the correlation issue get addressed?

  • @independent7212 · 3 years ago

    Thank you so much sir for your great support by making such videos.

  • @anishchhabra6085 · 1 year ago

    Can you please explain how SSres decreases as we add a new independent variable?

  • @hemachand5617 · 4 years ago

    Let's say I have 10 features and some R-square value is calculated. Later it is found that 4 of the features are uncorrelated with the target. Now the 1 − R² value is not going to change, and so neither does the adjusted R² value. Can you correct me if I'm analyzing it wrong? I'm assuming it follows the simple linear regression model, not lasso.

  • @user-wx9sd7xp2f · 7 months ago

    Very informative and useful content, lucid explanation.

  • @kavururajesh1760 · 3 years ago

    Explained in a detailed manner, keep it up.

  • @reddy764 · 5 years ago

    Can you suggest a good book for Machine Learning?

  • @harisjoseph117 · 3 years ago

    Thank you Krish. Nice explanation.

  • @firta_banjara · 3 years ago

    Hi Krish, if we add features with high error then SSres increases, but if we add features with low error then SSres decreases.

  • @aryanudainiya9486 · 2 years ago

    Best teacher of ML on YouTube.

  • @mohammad.anas7777 · 1 year ago

    Naik sir, is p the total number of independent features, or only those independent features which we have added later? Also, can we say that N is the total number of columns in the data set? If so, should we also count those columns which have irrelevant data, like ticket serial number or passenger name in the Titanic dataset?

  • @mahalerahulm · 4 years ago

    Wonderful Explanation !!

  • @amitanand8485 · 4 years ago

    Thanks .. Explained beautifully

  • @ayantikabhowmik1261 · 4 years ago

    Great explanation Sir!

  • @srinagabtechabs · 3 years ago

    Excellent explanation, thank you very much.

  • @anuradhadevi1414 · 2 years ago

    You explained it very well, sir. Thank you, sir.

  • @alishaparveen5226 · 2 years ago

    Could you please explain with an example from scratch for multi-output regression? I want to predict 2 outputs (distance travelled and velocity) from the dataset.

  • @durgakorde3589 · 1 year ago

    Thanks a lot Krish 🙂its really helpful

  • @pratiknabriya5506 · 4 years ago

    Thanks...very well explained.

  • @MrPrashanth55 · 5 years ago

    SSR means Sum of the Squares of the Residuals; SST means Sum of the Squares Total.

  • @pranjalgupta9427 · 4 years ago

    Awesome video and explanation.

  • @sagarpandya7865 · 3 years ago

    Great explanation Thank you

  • @balaramg89 · 2 years ago

    N, the total sample size, indicates the number of rows used in the model?

  • @abhinavjain5561 · 3 years ago

    In adjusted R² there is R². But whether the feature is correlated or not, the R² value will increase, so how are we able to say something about adjusted R²?

  • @seemaarya598 · 3 years ago

    How can we say whether the adjusted R-square is significant or not?

  • @woblogs2941 · 3 years ago

    Thank you sir, you made things very easy.

  • @voramb123 · 3 years ago

    Very interesting and excellent, but I would request examples to evaluate such situations.

  • @hakkamadan9941 · 3 years ago

    beautiful explanation sirji

  • @shaz-z506 · 5 years ago

    Thank you Krish, that's a good explanation.

  • @prateeksachdeva1611 · 1 year ago

    Very well explained

  • @mayurisagiraju7928 · 5 years ago

    thank you so much...It helped

  • @harishgoud6772 · 5 years ago

    Sir, SSR means the sum of squares of the residuals.

  • @rajeshdhyani3114 · 2 years ago

    Well Explained

  • @sahilbhatia2671 · 3 years ago

    very well explained

  • @kumarvaibhav5325 · 3 years ago

    Sir, it would be great if you could complement this with an example.

  • @Kmrabhinav569 · 2 years ago

    Well done

  • @Priyadarshan123 · 3 years ago

    Hello sir, I am making a project on income and health expenses, and my R-squared value comes out to less than 1%. What should I interpret from this? Should I change my linear model or try another one? What should I do?

  • @kitagrawal3211 · 2 years ago

    You should add another feature which is correlated with the target variable. A low R-squared means that your independent feature and target variable are not correlated. You can confirm this by computing the correlation between them.
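As a sketch of the reply's suggestion, here is a plain-Python Pearson correlation check between a feature and the target (the names and data points are made up for illustration):

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

feature = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [2.0, 4.1, 5.9, 8.2, 9.8]  # roughly 2 * feature, so nearly linear

assert pearson_r(feature, target) > 0.99  # strongly correlated
```

In practice the same number comes from `numpy.corrcoef` or `pandas.DataFrame.corr`; a value near 0 would support the "not correlated" diagnosis above (for linear relationships, at least).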

  • @bhavanasree7573 · 4 years ago

    What do we do next if we find that the R-square is small? Yeah, it says the model isn't a good fit, but is there any way to improve the model after learning that the R-squared is low, or do we use some other method to solve this?

  • @manishsharma2211 · 3 years ago

    Hyperparameter tuning

  • @SACHINKUMAR-px8kq · 2 years ago

    Thank you so much, sir.

  • @kalyanreddy6260 · 2 years ago

    R-square means SSR/SST only, right? Why is there a '1 −' before it? Just to know: some Excel videos show only SSR/SST.

  • @gopakumar138 · 4 years ago

    very useful video

  • @adipurnomo5683 · 2 years ago

    Fantastic course! I hope you are doing well, sir.

  • @shubhamprasad6910 · 3 years ago

    Which variable in the adjusted R² equation is related to correlation? It is not R², and all the other variables have nothing to do with correlation. Is it the ratio (N − 1)/(N − p − 1)?

  • @akshaymote3430 · 5 months ago

    Even I have the same question. There should be something more in the formula of adjusted R² which takes correlation into account.

  • @anubhavgupta8146 · 4 years ago

    Brother, you are amazing. How can anyone teach this simply?👍

  • @sangitakhade1730 · 3 years ago

    What is the meaning of 'penalize'?

  • @tonnysaha7676 · 3 years ago

    Thank you sir🙏

  • @sardarsahib3993 · 4 years ago

    superb

  • @ravitadiboina6065 · 3 years ago

    Why does the R² value not decrease when features are added? Is there any theory behind it?

  • @kitagrawal3211 · 2 years ago

    Yes. Each added feature contributes either 0 or a small positive amount (because of the square), so R² will either remain the same or increase.

  • @biswajitnayaak · 2 years ago

    I am not 100% sure this is correct. When you say (Actual − Predicted) needs to be squared because of negative values, I suspect it is also about the outliers.

  • @hanman5195 · 4 years ago

    I have never found this kind of explanation anywhere. I will not follow any hollow heroes except Sadhguru and you.

  • @vjukulkarni6057 · 4 years ago

    Hi Krish, can you please suggest how to explain the algorithm in an interview?

  • @bhavyaparikh6933 · 4 years ago

    Do they ask about algorithms in interviews?

  • @manishsharma2211 · 3 years ago

    @bhavyaparikh6933 Yes.

  • @subhamsaha2235 · 2 years ago

    Still not clear to me; can anyone help me out? In the case of an un-correlated or correlated variable, if p increases then N will also increase, and R² obviously increases, so how is it penalizing?

  • @kitagrawal3211 · 2 years ago

    N is constant here because it is the number of samples, whereas p is the number of predictors.

  • @emilyme9478 · 3 years ago

    Awesome

  • @snigdharay8847 · 4 years ago

    If these two are different, then why does everyone say that R-square and adjusted R-square are the same, and why do we always look at the adjusted R-square in the output?

  • @generationwolves · 4 years ago

    R-Squared and Adj. R-Squared are NOT the same. For simple linear regression, the R-Squared and Adj. R-Squared values will be almost identical, and you can just check the R-Squared value to evaluate your model's goodness of fit. For multiple linear regression, you will find that no matter what, the R-Squared value keeps increasing as you add new features (even if the new feature is not correlated with the dependent variable). This leads you to believe that the new feature (independent variable) you've added is contributing to a better model, which is not the case. The adjusted R-Squared provides a penalty mechanism that reduces the overall value if the new feature is not contributing to the model. This metric is usually preferred for evaluating goodness of fit in multiple linear regression, especially when you're using a feature-selection method like step-wise regression.
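This behaviour can be checked numerically. Below is a pure-Python sketch on synthetic data; to stay dependency-free, the two-predictor fit uses the Frisch-Waugh residualization trick (regress out the first predictor, then fit on the residuals) instead of a regression library. Adding a pure-noise column never lowers R², while the adjusted value always sits below it:

```python
import random

def resid(x, y):
    """Residuals of a simple least-squares fit of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) \
            / sum((a - mx) ** 2 for a in x)
    inter = my - slope * mx
    return [b - (slope * a + inter) for a, b in zip(x, y)]

def adjusted(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

random.seed(42)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]   # feature unrelated to y
y = [3 * a + random.gauss(0, 1) for a in x1]

ybar = sum(y) / n
ss_tot = sum((v - ybar) ** 2 for v in y)

e1 = resid(x1, y)                    # residuals of model: y ~ x1
e2 = resid(resid(x1, noise), e1)     # residuals of model: y ~ x1 + noise (Frisch-Waugh)
r2_one = 1 - sum(e * e for e in e1) / ss_tot
r2_two = 1 - sum(e * e for e in e2) / ss_tot

assert r2_two >= r2_one - 1e-12          # R² cannot drop when a column is added
assert adjusted(r2_two, n, 2) < r2_two   # the adjustment pulls the value down
```

In real projects the same comparison is one line with `sklearn.linear_model.LinearRegression` plus `r2_score`; the pure-Python version is only meant to make the mechanics visible.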

  • @manzarabbas6312 · 3 years ago

    Amazing!!!!

  • @ParallelUniverse550 · 3 years ago

    Can R square be considered as training accuracy?

  • @kitagrawal3211 · 2 years ago

    Yes, it is a performance metric. In practice, the adjusted R² score is used more often.

  • @gauravjoshi9764 · 3 years ago

    I just want to know whether this total sample size is the total number of columns or the total number of rows.

  • @kitagrawal3211 · 2 years ago

    The sample size is the total number of rows; the predictors are the columns.

  • @rachanagovekar1683 · 3 years ago

    What are these 33 dislikes for? Is your language different? :-D Awesome explanation Krish, hats off.

  • @adityasagar9293 · 3 years ago

    Maybe in search of Hindi content.

  • @keerthanpu808 · 4 months ago

    How did you take the average line in the graph (on what basis)?

  • @AkashRusiya · 1 month ago

    It's simply the arithmetic mean of the target variable's "actual" values.

  • @utkarshsalaria3952 · 2 years ago

    Sir, at the end of the video you said that R² will never decrease when independent features are added, even if the feature is not correlated. Then how can you say that adjusted R² will decrease when R² is less (at 14:16)? That can never be true, given the fact that R² is always increasing, so how can it be less? It has actually confused me. Please help if anyone knows.

  • @rohandogra5421 · 2 years ago

    Yup, I also have the same problem.

  • @tiverekarrahul · 2 years ago

    1) If the added features are correlated with the target, R² grows much faster than the denominator term containing the number of features (p); hence adjusted R² also increases. 2) If the added features are not correlated, or only weakly correlated, with the target, then R² grows more slowly than the penalty term containing p; hence adjusted R² rises only a little, and when the R² gain is below the chance-level gain it can even decrease. That is what is called being penalized: it is not allowed to grow at the same rate as in the correlated-features case.

  • @ankursingh5969 · 2 years ago

    Krish, R-square will increase in both cases, whether or not the variable is correlated with the dependent variable. The penalty term then pulls the adjusted R-square down in both cases; however, the magnitude will be different, so it falls for a useless variable and still rises for a useful one.

  • @datascience6718 · 4 years ago

    Sir, what is the meaning of 'penalize' in terms of machine learning?

  • @ayushmishra-sw4po · 4 years ago

    Here 'penalize' means we are adding an extra predictor which is of no use, so it will decrease the value of adjusted R².

  • @datascience6718 · 4 years ago

    @ayushmishra-sw4po Thank you so much.

  • @kishanpandey4798 · 4 years ago

    If I have 10 features and I need to know which features affect the output y and which do not, do I need to find the correlation between y and each feature separately? If yes, then how? If not, then what should I do? Krish, please reply. Thanks.

  • @deepakgehani · 4 years ago

    You can do EDA: do a pairplot, check the correlations on a heatmap, and later you can apply a machine learning algo.

  • @kishanpandey4798 · 4 years ago

    @deepakgehani Thanks a lot. I will apply this and revert back to you in case I face any other issue. Thanks again.

  • @praneethcj6544 · 4 years ago

    You need to perform a chi-square test if both the input and output variables are categorical, ANOVA for a categorical/continuous pair, and finally Pearson correlation when both are continuous!

  • @praneethcj6544 · 4 years ago

    You can write a loop over all the variables and check each correlation.

  • @mranaljadhav8259 · 4 years ago

    You have many ways to find this. First, you can find the correlation between them using a heatmap or the corr method; second, you can find the VIF value of the features; lastly, you can check the standard errors by using the OLS method.

  • @hemantdas9546 · 4 years ago

    What does it mean that R-square will always increase when a feature is added? Does it mean that predictions get better as features are increased?

  • @kulpreetsingh9064 · 4 years ago

    No bro, that will depend on whether the added features are correlated or not. If the added features are not correlated with the target variable, then the adjusted R-square will decrease; however, if they are correlated, then naturally the adjusted R-square will also increase.

  • @ayushmishra-sw4po · 4 years ago

    Adding more features will automatically increase R-square, since adding features decreases the value of SSres, even if the feature is not related to the output variable. A model with many features can perform better in-sample than out-of-sample, so in such cases adjusted R-square works.

  • @akshaykrishnan7985 · 4 years ago

    Good morning sir. Please upload a video explaining what exactly the p-value is. I'm getting confused by it. I hope your explanation will give more clarity.

  • @generationwolves · 4 years ago

    www.wikihow.com/Calculate-P-Value

  • @abhi9029 · 2 years ago

    Hi Krish, please keep the same rhythm of speech at the end of each sentence. What happens is that at the end of a sentence your voice becomes very low, and this creates confusion while listening.

  • @kewalagrawal6539 · 3 years ago

    This is the problem with our education system... everything is just formula-based... you started off with the formula without even giving any intuition about what R² and adjusted R² actually mean... what does a 50% R² tell you?... formulas and maths should always come last... you should first make your students visualize what these terms mean without using any maths at all... once they are good with that, then you bring in the formula.

  • @kinnaryraval · 3 years ago

    Hi Krish, nicely explained, but I have a query. R-square will always increase, whether it is calculated against a significant or an insignificant feature. So it's not that R² will be less for non-correlated features and more for correlated ones; it will increase blindly. So how can you say that adjusted R² will decrease when the added attributes are non-correlated, given that R² will still increase, making adjusted R² = 1 − smaller_number? I hope my question is somewhat clear. Thanks and respect, sir!

  • @harishkumaars9753 · 2 years ago

    I too have this doubt

  • @saifsalim6084 · 4 years ago

    In which condition will SSR be greater than SST?

  • @ayushmishra-sw4po · 4 years ago

    As we increase the number of independent features, the value of SSR will increase.

  • @nilaykushawaha2666 · 3 years ago

    If the model's prediction is worse than the average prediction we assumed in SST.

  • @richasharma5949 · 3 years ago

    Good explanation, but it would be better to add an example. That way it will become clearer :)

  • @deepknowledge2505 · 3 years ago

    Please see if this could help you: kzread.info/dash/bejne/ZYejrZtsYKu9fJM.html

  • @anubhasinha2557 · 4 years ago

    Nicely explained... Can you help me with the difference between the sum of residuals and the cost function? It looks like both have the same formula.

  • @ayushmishra-sw4po · 4 years ago

    Actually both are essentially the same: the sum of squared residuals is the sum of the squared differences between the predicted and actual data points, and the usual regression cost function is the same quantity (often averaged over the data points).

  • @anubhasinha2557 · 4 years ago

    @ayushmishra-sw4po Thanks Ayush!!!

  • @burhanuddinraja7209 · 3 years ago

    Sir, but if p increases, won't N also increase, since both involve the independent variables? Then the denominator would always be zero.

  • @kitagrawal3211 · 2 years ago

    N is the number of samples, not the number of predictors. For a dataframe of shape (m, n), the number of samples is m and the number of predictors is n.

  • @nileshsuryavanshi8792 · 4 years ago

    very well explained, thank you sir.

  • @ganesanr2307 · 4 years ago

    Since R-square is the squared value of r, how can it be a negative value? R-square is always 0 to 1; it can never be a negative number.

  • @linuxrhel6107 · 4 years ago

    There is no such standalone value of R here; "R squared" is just the terminology used for this formula. Check out the formula for R squared.

  • @ganesanr2307 · 4 years ago

    R is the Correlation Coefficient

  • @meetmeraj2000 · 4 years ago

    R squared can be a negative value if the model is worse than the average (horizontal) best-fit line.
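A quick numeric illustration of this point, using the definition R² = 1 − SSres/SStot (the toy numbers are made up):

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot; negative when the model is worse than the mean."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]
bad = [4.0, 3.0, 2.0, 1.0]  # anti-correlated "model"

assert r_squared(y, [sum(y) / len(y)] * 4) == 0.0  # predicting the mean gives R² = 0
assert r_squared(y, bad) < 0                        # worse than the mean gives R² < 0
```

This matches the convention used by `sklearn.metrics.r2_score`, where arbitrarily bad predictions can drive the score below zero.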

  • @varunkukade7971 · 4 years ago

    Using the first formula, you said that even if an independent feature is not related, the R² value increases; that was the drawback. But at 14:18 of the video you are saying that if the feature is not related, then we would get a smaller R² value from the first formula. I got confused here. Please resolve my confusion; I will be glad. Please 🙌🙌🙏

  • @ayushmishra-sw4po · 4 years ago

    No. Even if the feature is not correlated with the output variable, the value of R² will increase; that's why we use adjusted R². If the feature is not correlated, that value will decrease... maybe he said it by mistake.

  • @kitagrawal3211 · 2 years ago

    He meant that for the same features, if they are correlated with the target variable you will get a higher R² value, and a smaller value if they are uncorrelated.

  • @shubhamkundu2228 · 3 years ago

    The use of adjusted R-square is a little confusing! So when we add more independent variables to the model, R-square will always increase; adjusted R-square then checks whether an independent variable is not correlated with the target variable and pulls the value down. Does that mean that during feature selection we should keep those independent features that are correlated with the target/output variable and drop the others? Aren't we supposed to keep independent variables that are not correlated with each other, so why penalize the ones not correlated with the target? For independent variables that are correlated with each other, we could drop them!

  • @tejas4054 · 1 year ago

    When will you stop saying "particular"?

  • @Dyslexic_Neuron · 1 year ago

    Not a satisfactory explanation of how adjusted R² takes care of non-correlated variables; just manipulating the formula doesn't make it very clear. The intuition and the reason for including the sample size are not explained properly. Overall not a good explanation.

  • @machinelearningchefs3525 · 4 years ago

    Correct yourself: R-squared = SumSquareRegression/SumSquareTotal, and this quantity cannot be negative. SST = SSR + SSE, so SST > SSE, and there is no chance of R-squared being negative. This is what happens when you teach without a good understanding of the concepts behind the material. You have more than 150K subscribers; do not mislead them. From a mathematical standpoint, R-square is the ratio of the variation explained by the model to the variation in the data.

  • @jagannathgirisaballa · 4 years ago

    R² compares the fit of the chosen model with that of a horizontal straight line (the null hypothesis). If the chosen model fits worse than a horizontal line, then R² is negative. Note that R² is not always the square of anything, so it can have a negative value without violating any rules of math. R² is negative only when the chosen model does not follow the trend of the data and so fits worse than a horizontal line.

    Example: fit data to a linear regression model constrained so that the Y intercept must equal 1500: i.stack.imgur.com/CHpzE.png. The model makes no sense at all given these data; it is clearly the wrong model, perhaps chosen by accident. The fit of the model (a straight line constrained to go through the point (0, 1500)) is worse than the fit of a horizontal line. Thus the sum-of-squares from the model (SSreg) is larger than the sum-of-squares from the horizontal line (SStot). R² is computed as 1 − SSreg/SStot; when SSreg is greater than SStot, that equation computes a negative value for R².

    With linear regression with no constraints, R² must be positive (or zero) and equals the square of the correlation coefficient, r. A negative R² is possible with linear regression only when either the intercept or the slope is constrained so that the "best-fit" line (given the constraint) fits worse than a horizontal line. With nonlinear regression, R² can be negative whenever the best-fit model (given the chosen equation and its constraints, if any) fits the data worse than a horizontal line.

    Bottom line: a negative R² is not a mathematical impossibility or the sign of a computer bug. It simply means that the chosen model (with its constraints) fits the data really poorly.

  • @jagannathgirisaballa · 4 years ago

    This person has put in a great deal of time and effort, which is an indication of his passion. The reason he has 150K subscribers is that his followers are able to make sense of what he is saying. And dude, logically, what would he gain by misleading them? Is he preaching some religion???? I checked your KZread channel... surprised that you are commenting without having uploaded a single video?? I recommend that first of all we learn to appreciate the person, and even if there is a mistake in something he is saying (to err is human!), let's show some humility in pointing it out.

  • @machinelearningchefs3525 · 4 years ago

    @jagannathgirisaballa Hi, I understand that you have no idea about ML or stats. I don't need to upload videos in order to comment on others' videos. Anyway, I have a PhD in ML/computer vision. I don't want to get into a fight with you. Chill and follow his videos.

  • @krishnaik06 · 4 years ago

    Buddy, chill... whatever I explain is based on practical experience, so I have proof of everything I do. Anyhow, you are highly qualified; I think you should share your knowledge with everyone, and I would also love to see some implementations from your end. And yes, I do not mislead anyone. You can check my LinkedIn profile; these videos have helped people clear interviews. Anyhow, they have not helped you, and I am sorry about that. So, in conclusion, "misleading" is a very wrong term to use here; for a highly qualified guy like you, it doesn't suit you at all. Cheers, stay safe and healthy. I would also suggest you go through this link: stats.stackexchange.com/questions/12900/when-is-r-squared-negative

  • @jagannathgirisaballa · 4 years ago

    @machinelearningchefs3525 Bro, I will be the first person to accept that I have no idea about ML or stats, and that's my excuse for being here and watching the video. So, bro with a PhD, what's your excuse for being here and watching the video? Checking out the opposition? :-) Anyways, peace, brother. I am here for learning and would love to learn from anyone. Apologies if my comment hurt your feelings; it was not intentional.