Logistic Regression for Classification | Working with a real-world dataset from Kaggle
💻 For real-time updates on events, connections & resources, join our community on WhatsApp: jvn.io/wTBMmV0
In this lesson, we learn how to use logistic regression for classification. Logistic regression is a commonly used technique for solving binary classification problems. You can experiment with the notebook used in the video here 👉 jovian.ai/aakashns/python-skl...
🔗 Check out this playlist for the complete lecture series on Gradient Boosting Machines: • Machine Learning with ...
🎯 Topics Covered
• Downloading a real-world dataset from Kaggle
• Splitting a dataset into training, validation & test sets
• Imputing and scaling numeric features
• Encoding categorical columns as one-hot vectors
• Training a logistic regression model using Scikit-learn
• Evaluating a model using a validation set and test set
❓ Ask Questions here: jovian.ai/forum/t/lesson-2-lo...
classification/17915
⌚ Time Stamps:
00:00 Introduction
05:16 Problem Statement
25:43 Downloading a real-world dataset from Kaggle
35:35 Exploratory data analysis and visualization
47:06 Splitting a dataset into training, validation & test sets
01:03:04 Filling/Imputing missing values in numeric columns
01:21:55 Scaling numeric features to a (0, 1) range
01:28:10 Encoding categorical columns as one-hot vectors
01:39:02 Training a logistic regression model using Scikit-learn
01:53:41 Evaluating a model using a validation set and test set
02:19:38 Saving a model to disk and loading it back
02:36:28 Summary and Conclusion
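The steps listed in the time stamps can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the notebook from the video: the data is synthetic, the column names (temp, humidity, region, target) are hypothetical, and every preprocessing object is fitted on the training set only.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in for a real dataset (hypothetical column names).
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "temp": rng.normal(20, 5, n),
    "humidity": rng.uniform(20, 100, n),
    "region": rng.choice(["north", "south", "east"], n),
})
df["target"] = np.where(df["humidity"] + rng.normal(0, 10, n) > 60, "Yes", "No")
df.loc[rng.choice(n, 20, replace=False), "temp"] = np.nan  # inject missing values

numeric_cols, categorical_cols = ["temp", "humidity"], ["region"]

# 60/20/20 split into training, validation and test sets.
train_val_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
train_df, val_df = train_test_split(train_val_df, test_size=0.25, random_state=42)

# Fit each preprocessing step on the training set only.
imputer = SimpleImputer(strategy="mean").fit(train_df[numeric_cols])
scaler = MinMaxScaler().fit(imputer.transform(train_df[numeric_cols]))
encoder = OneHotEncoder(handle_unknown="ignore").fit(train_df[categorical_cols])


def prepare(frame):
    """Impute and scale numeric columns, one-hot encode categorical ones."""
    numeric = scaler.transform(imputer.transform(frame[numeric_cols]))
    onehot = encoder.transform(frame[categorical_cols]).toarray()
    return np.hstack([numeric, onehot])


model = LogisticRegression(solver="liblinear")
model.fit(prepare(train_df), train_df["target"])

val_acc = accuracy_score(val_df["target"], model.predict(prepare(val_df)))
test_acc = accuracy_score(test_df["target"], model.predict(prepare(test_df)))
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")

# Save the model with its preprocessing objects, then load it back.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump({"model": model, "imputer": imputer, "scaler": scaler, "encoder": encoder}, path)
loaded = joblib.load(path)
```

Saving the imputer, scaler and encoder alongside the model means new inputs can be prepared in exactly the same way after loading.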
⚡ Free Certification Course
"Machine Learning with Python: Zero to GBMs(Gradient Boosting Machine)" is a practical and beginner-friendly introduction to supervised machine learning, decision trees, and gradient boosting using Python. You will solve 3 coding assignments & build a course project where you'll train ML models using a large real-world dataset. Enroll now: zerotogbms.com
🔗 Visit the logistic regression lecture page here: jovian.ai/learn/machine-learn...
🎤 About the speaker
Aakash N S is the co-founder and CEO of Jovian - a community learning platform for data science & ML. Previously, Aakash has worked as a software engineer (APIs & Data Platforms) at Twitter in Ireland & San Francisco and graduated from the Indian Institute of Technology, Bombay. He’s also an avid blogger, open-source contributor, and online educator.
#GBM #MachineLearning #Python #Certification #Course #Jovian
Comments: 73
That was intense! This is probably the first time I have watched a tutorial this long without a break. You are awesome, sir.
This video is still one of the best. A literal game changer!
Thanks a lot Aakash for the fabulous explanations and infectious passion to empower others. These tutorials are simply unmatched! Bravo!
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
@bongogappo38
1 year ago
@@jovianhq Sir, what can we do if there is a column of string-type values, like disease names and symptoms?
Nicely explained, Aakash and the Jovian team. This was probably the most thorough and clearly explained tutorial I have come across.
Great video! I learned a lot! Thank you!
great explanation with reasonable depth for this topic, such a great video...
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
Really, a lecture full of knowledge
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
Great content, Aakash sir, and free too. Really amazed and impressed by Jovian!
@jovianhq
2 years ago
Glad you liked it!
Thank you, this was very beginner friendly and it helped me understand a lot of practical topics.
@jovianhq
1 year ago
You're very welcome! Glad it was helpful.
Salute Boss. This is wholesome 💝💝
Thank you for such a detailed lecture. Very very helpful. Would love to know about more.
@jovianhq
2 years ago
Glad it was helpful! Go to zerotogbms.com for more lectures on Machine Learning
Thank you very much.🙏
excellent brother!
Very good tutorial. Elaborate and detailed. Thanks.
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
Great content
Nice lecture
Thanks a lot, bro. It's a nice dataset and you covered it very nicely from start to end.
I was working on a mini data science project in which I was given test.csv and train.csv datasets. I trained my model using the training data. Now, if I want to find the accuracy score of my model on the test data, what should I do? If I write model.predict(test_data), how do I compare the predicted test values to the true values? There are no target values in the test dataset.
Nice Video....Really appreciated. Can we also include the topic of setting up data pre processing pipelines in future sessions.
@jovianhq
3 years ago
Thanks for the feedback and suggestion!
Hello. I have a question. Should we scale the features after the imputation or before because here you imputed the raw_df dataframe which is not imputed? Thanks
1:45:00 whilst you fitted the transformed cols in to your model, I am still getting a type error
Hey, also, isn't it common practice to fit the scaler only on the training dataset and then use it to transform the validation and test data?
At 1:35:35, in the encoder transform step, I am getting an error that columns must be the same length as key. Please tell me how to resolve it.
finished watching
thanks u so good! thanks again
@jovianhq
2 years ago
You're welcome!
(1:53:40) When you plot the weights, the negative weights are not considered. But negative weights also affect the model, just in the opposite direction. What are your thoughts? Should the negative weights be considered?
@jovianhq
2 years ago
Yes, the negative weights should be considered. In fact, you can try ignoring the columns with very small weights, i.e. those whose weights are close to 0. Both negative and positive weights affect the model.
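One way to take negative weights into account is to rank features by the absolute value of their weights. The sketch below is only an illustration on synthetic data with hypothetical feature names a, b and c; it is not the plot from the video.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy data: "a" pushes toward class 1, "b" pushes toward class 0,
# and "c" is pure noise. All three are on the same (0, 1) scale.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(0, 1, size=(500, 3)), columns=["a", "b", "c"])
y = (2 * X["a"] - 3 * X["b"] + rng.normal(0, 0.3, 500) > -0.5).astype(int)

model = LogisticRegression(solver="liblinear").fit(X, y)

weights = pd.DataFrame({"feature": X.columns, "weight": model.coef_[0]})
# Rank by magnitude so strongly negative weights are not overlooked.
weights["abs_weight"] = weights["weight"].abs()
weights = weights.sort_values("abs_weight", ascending=False)
print(weights)
```

Sorting on abs_weight keeps the sign visible in the weight column while placing the near-zero noise feature at the bottom.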
would you mind switching to dark mode? TIA
I have a doubt. When we do imputation, we replace the missing values with the mean, taken from each column of the entire dataset. The per-column means of the entire dataset will differ from the means computed from train_df, val_df and test_df separately, which will create some discrepancy in the final result. What's your position on this? Should we impute based on the entire dataframe or on its subsets?
@jovianhq
2 years ago
A sample of the data should represent the entire dataset. Also, the validation and test sets should be independent of the training set. So imputation can be done differently in the validation set and the training set.
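One widely used convention for keeping the sets independent, which is not necessarily what the video does, is to learn the column means from the training set only and reuse them for the other sets. A minimal sketch with a hypothetical single column x:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Tiny hypothetical frames; "x" has one missing value in each.
train_df = pd.DataFrame({"x": [1.0, 2.0, np.nan, 3.0]})
val_df = pd.DataFrame({"x": [10.0, np.nan]})

imputer = SimpleImputer(strategy="mean")
imputer.fit(train_df[["x"]])  # the mean is computed from training rows only

train_filled = imputer.transform(train_df[["x"]])
val_filled = imputer.transform(val_df[["x"]])

print(imputer.statistics_)  # mean learned from the training set
print(val_filled.ravel())   # validation gap filled with the training mean
```

The missing validation value is filled with the training mean (2.0), not with the validation mean (10.0), so nothing about the validation set influences the fitted imputer.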
Thanks, sir... but how do we deploy this on a website?
Thank you🙂
@jovianhq
2 years ago
Welcome!
1:26:54 I can't understand why the max value in some columns is not 1. It should be 1...
3 hrs worth watching
@jovianhq
3 years ago
Thanks for watching!
FINISHED CODING FULL
So the higher the weight, the more important the column is (but only if the numerical columns are scaled)? If the data is not scaled, we cannot draw this conclusion?
@jovianhq
2 years ago
True! Also, it's not just higher weights: the more negative the weight, the more important the column is, i.e. the weights closest to 0 have the least importance.
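The dependence on scaling can be illustrated with a small synthetic experiment. In this sketch (hypothetical features x_small and x_large, no train/validation split since only the weight magnitudes matter here), both features contribute equally to the target, but one is recorded on a 0 to 100 range.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(1)
n = 1000
x_small = rng.uniform(0, 1, n)    # values in [0, 1]
x_large = rng.uniform(0, 100, n)  # same kind of signal, but in [0, 100]

# Both features contribute equally once they are put on the same scale.
signal = (x_small - 0.5) + (x_large - 50) / 100
y = (signal + rng.normal(0, 0.2, n) > 0).astype(int)

X_raw = np.column_stack([x_small, x_large])
X_scaled = MinMaxScaler().fit_transform(X_raw)

raw_w = LogisticRegression(max_iter=10_000).fit(X_raw, y).coef_[0]
scaled_w = LogisticRegression(max_iter=10_000).fit(X_scaled, y).coef_[0]

# Ratio of |weight of x_large| to |weight of x_small| in each model.
raw_ratio = abs(raw_w[1]) / abs(raw_w[0])
scaled_ratio = abs(scaled_w[1]) / abs(scaled_w[0])
print(f"unscaled ratio: {raw_ratio:.3f}, scaled ratio: {scaled_ratio:.3f}")
```

On the unscaled data the weight of x_large comes out much smaller simply because its values are larger, so comparing weight magnitudes is only meaningful once the features share a scale.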
amazing
@jovianhq
1 year ago
THANKYOU!
Information leakage timestamp: 1:25:10. He fitted the scaler on the whole numerical dataset and used it to transform the train, validation and test sets. But isn't that information leakage, because the scaler saw the test and validation data while fitting?
@jovianhq
2 years ago
Well, if you have access to the validation dataset, you can fit the scaler on both the training and validation data. Generally, you won't be able to touch the test dataset, so we shouldn't fit the scaler/encoder on the test dataset.
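A stricter variant, sketched here as an assumption and not as the approach taken in the video, fits the scaler on the training data alone and only applies it to the other sets. The arrays below are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[0.0], [5.0], [10.0]])
X_val = np.array([[2.5], [12.0]])  # 12.0 lies outside the training range

# The min and max are learned from the training data only.
scaler = MinMaxScaler().fit(X_train)

train_scaled = scaler.transform(X_train)
val_scaled = scaler.transform(X_val)

print(train_scaled.ravel())
print(val_scaled.ravel())
```

With this approach, a validation or test value outside the training range maps outside (0, 1), so the maximum of a scaled column is not guaranteed to be exactly 1 on those sets.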
waoo
1:39:11
Sir, will you continue these videos?
@shreeyansjavangula8908
3 years ago
Yeah, this is a course on ML. The structure of the new videos is provided on his website: jovian.ai/learn/machine-learning-with-python-zero-to-gbms
0:58:00
1:06:09
1:00:55
Hi, I noticed that at 1:53:44 you are making a prediction using the train inputs (X_train), but shouldn't you be making a prediction using the validation inputs instead? I don't think you have passed X_val into any of the logistic regression model predictions. Or am I just confused? Haha.
@jovianhq
2 years ago
Please check kzread.info/dash/bejne/pZ593Mh8ZKS1eZM.html. At first we predict with the training set; later we also predict with the validation and test sets.
@rabbitazteca23
2 years ago
@@jovianhq I am sorry, haha. You are right. I must have missed this part.
Bookmark 1:03:15. For me, the important part starts here.
What's a solver?
@jovianhq
3 years ago
Hey, please go through this blog post to learn more about solvers: towardsdatascience.com/dont-sweat-the-solver-stuff-aea7cddc3451
1:56:49 nice
Please add subtitles.
@jovianhq
3 years ago
Hey, we are in the process of adding subtitles to our videos; they will be added soon. Thanks!
@rubayetalam8759
3 years ago
@@jovianhq Thanks! You are doing great!
Here is another simplified Logistic Regression tutorial if you are a beginner: kzread.info/dash/bejne/ppeetJqDibbIaag.html
1:18:01
1:08:30