Logistic Regression Using Excel Video 1 (Finding Heart Disease Probabilities)
In this video I give a quick introduction to logistic regression and build a logistic regression model using only Excel. I used the heart disease dataset provided in the following link (Cleveland data): archive.ics.uci.edu/ml/datase...
This has a follow-up video in which I finish the model, create a confusion matrix, and find the accuracy, precision, and recall measures. I finish the second video by creating a ROC (receiver operating characteristic) curve.
The link for the Excel file I used is provided below as a Google Sheets file. To download it, please open it and go to File - Download As - Excel.
docs.google.com/spreadsheets/...
Comments: 22
With this video, I implemented logistic regression at my job using variables to describe the percentage of approval for financial proposals. I applied the same principle but divided my results into three clusters instead of two, making the model significantly stronger. Thank you for this video!
@mcanbolat
6 months ago
Although I am happy that the video helped, I recommend that you start using R or Python for your future work. I posted this video to explain the basics of the model. You need to check the optimal number of clusters using an elbow diagram, for which Excel is going to be very inefficient.
Thanks, I was looking for a detailed explanation so I can backtest my trading data. These videos are of great help.
Thanks, Mustafa! My professor is incapable of teaching this kind of information to his own class. He almost exclusively outsources other people's work to teach his students in a Master's Program. Very helpful and many thanks!
Thanks a lot for this video. It really helped me a lot, at just the right time.
Very nice and useful video, thank you very much for that!
Nice interpretation!
Hello Dr. Canbolat, I want to use a different data set from the UCI repository, but there are no direct spreadsheets available. Could you please help me with that?
Excellent video thanks
Good explanation, but how do you get b0 (the intercept)?
Why did you use log() instead of ln()?
I have a question: how did you choose the intercept to be -2, and why is the intercept up to you to choose?
@mcanbolat
1 month ago
I just used arbitrary numbers first. The optimal values were found later.
Hi, I am trying to apply this in a school research project. But when I launch the Solver, it sets all the coefficients to zero! Why does this happen? It seems I cannot maximize the log-likelihood with any coefficients other than zero...
@mcanbolat
1 year ago
Sometimes Excel Solver is not able to find the optimal values if the model has a lot of variables and/or records. You can try with some arbitrary values in the coefficients to increase your chances. Otherwise, you will have to rely on other software such as R, Minitab, Stata, etc. to get the results very quickly. I used Excel here to give insights about how logistic regression classification works.
Did you do a regression to get the slopes and the intercept using a training model first? I'm not sure how you got b1, b2, b3, etc. Thank you.
@mcanbolat
11 months ago
I maximized the sum of the log likelihoods. In regression you minimize the sum of the squared errors. In logistic regression you maximize the joint probability of success.
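As a rough illustration of the objective described in this reply, the Python sketch below computes the sum of the log-likelihoods for a given set of coefficients. The function name `sum_log_likelihood`, the toy data, and the coefficient values are all made up for illustration and are not taken from the video's spreadsheet. Because the log of a product is the sum of the logs, maximizing this sum is the same as maximizing the joint probability of the observed outcomes.

```python
import math

def sum_log_likelihood(coeffs, rows, labels):
    """Sum of per-record log-likelihoods for a logistic model.

    coeffs[0] is the intercept b0; coeffs[1:] are b1, b2, ...
    """
    total = 0.0
    for x, y in zip(rows, labels):
        logit = coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))
        p = 1.0 / (1.0 + math.exp(-logit))  # modeled probability that y == 1
        # Probability the model assigns to the outcome that was actually observed
        p_observed = p if y == 1 else 1.0 - p
        total += math.log(p_observed)
    return total

# Hypothetical toy data: one predictor, four records
rows = [[1.0], [2.0], [3.0], [4.0]]
labels = [0, 0, 1, 1]

ll_zero = sum_log_likelihood([0.0, 0.0], rows, labels)     # every probability is 0.5
ll_better = sum_log_likelihood([-5.0, 2.0], rows, labels)  # closer to 0, a better fit
print(ll_zero, ll_better)
```

The sum is always negative or zero; coefficients that fit the data better push it closer to zero.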
@shaunkiyabu2707
11 months ago
Hello, thanks for the reply. I think I'm still confused. Around the 7 min 30 second mark (or a little bit after), you state “assume you know the numbers from the log of best fit”. This implies that you got b1, b2, b3 from a previous regression, correct?
@mcanbolat
11 months ago
What I mean is that we need to enter formulas to calculate the logit, odds, and probabilities that refer to the coefficient cells, which are empty (not known yet). Later, by optimizing the sum of the log-likelihoods, we will find the correct coefficients, resulting in correct probabilities.
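The workflow in this reply (write the logit, odds, and probability formulas against placeholder coefficients, then let an optimizer find the coefficients) can be sketched in Python. This is only a minimal sketch under stated assumptions: plain gradient ascent stands in for Excel Solver, whose internal algorithm is different, and the function names, toy data, step size, and starting values are all hypothetical.

```python
import math

def predict(coeffs, x):
    """Logit -> odds -> probability for one record."""
    logit = coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))
    odds = math.exp(logit)
    return odds / (1.0 + odds)

def fit(rows, labels, start, step=0.01, iterations=20000):
    """Gradient ascent on the sum of log-likelihoods (a stand-in for Solver)."""
    coeffs = list(start)
    for _ in range(iterations):
        grad = [0.0] * len(coeffs)
        for x, y in zip(rows, labels):
            err = y - predict(coeffs, x)  # observed outcome minus modeled probability
            grad[0] += err                # contribution to the intercept
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi   # contribution to each slope
        coeffs = [b + step * g for b, g in zip(coeffs, grad)]
    return coeffs

# Hypothetical toy data: one predictor, classes overlap so a finite optimum exists
rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
labels = [0, 0, 1, 0, 1, 1]

start = [-2.0, 0.5]  # arbitrary placeholder coefficients
fitted = fit(rows, labels, start)
print(fitted)
```

The starting values only need to be a reasonable first guess; the optimizer replaces them with the coefficients that maximize the sum of the log-likelihoods.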
What are the values in b0, b1, b2, b3...?
@mcanbolat
2 years ago
Those are the coefficients for each variable in the dataset (and the intercept). Each coefficient represents the expected change in the log of odds for a one-unit change in that variable. If a coefficient is negative, an increase in that variable decreases the probability of success; if it is positive, an increase in that variable increases the probability of success.
Y = log(odds) = b0 + b1x1
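To make the interpretation of the coefficients concrete, here is a small Python sketch for a model with one variable. It assumes the log of odds is the natural log (the usual convention for the logit), and the values of `b0`, `b1`, and `x1` are hypothetical, not fitted to the heart disease data.

```python
import math

def probability(b0, b1, x1):
    """Invert the logit: log of odds -> probability."""
    logit = b0 + b1 * x1
    return 1.0 / (1.0 + math.exp(-logit))

def odds(p):
    """Odds of success for a given probability."""
    return p / (1.0 - p)

b0, b1 = -2.0, 0.5  # hypothetical values, not fitted to any data

p_before = probability(b0, b1, 3.0)
p_after = probability(b0, b1, 4.0)  # x1 increased by one unit

# The log of odds moves by b1, so the odds are multiplied by exp(b1)
odds_ratio = odds(p_after) / odds(p_before)
print(odds_ratio, math.exp(b1))
```

With a positive `b1` the probability rises as `x1` increases; flipping the sign of `b1` makes it fall.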