Logistic Regression for Classification | Working with a real-world dataset from Kaggle

💻 For real-time updates on events, connections & resources, join our community on WhatsApp: jvn.io/wTBMmV0
In this lesson, we learn how to use logistic regression for classification. Logistic regression is a commonly used technique for solving binary classification problems. You can experiment with the notebook used in the video above here 👉 jovian.ai/aakashns/python-skl...
🔗 Check out this playlist for the complete lecture series on Gradient Boosting Machines: • Machine Learning with ...
🎯 Topics Covered
• Downloading a real-world dataset from Kaggle
• Splitting a dataset into training, validation & test sets
• Imputing and scaling numeric features
• Encoding categorical columns as one-hot vectors
• Training a logistic regression model using Scikit-learn
• Evaluating a model using a validation set and test set
❓ Ask Questions here: jovian.ai/forum/t/lesson-2-lo...
classification/17915
⌚ Time Stamps:
00:00 Introduction
05:16 Problem Statement
25:43 Downloading a real-world dataset from Kaggle
35:35 Exploratory data analysis and visualization
47:06 Splitting a dataset into training, validation & test sets
01:03:04 Filling/Imputing missing values in numeric columns
01:21:55 Scaling numeric features to a (0, 1) range
01:28:10 Encoding categorical columns as one-hot vectors
01:39:02 Training a logistic regression model using Scikit-learn
01:53:41 Evaluating a model using a validation set and test set
02:19:38 Saving a model to disk and loading it back
02:36:28 Summary and Conclusion
⚡ Free Certification Course
"Machine Learning with Python: Zero to GBMs (Gradient Boosting Machines)" is a practical and beginner-friendly introduction to supervised machine learning, decision trees, and gradient boosting using Python. You will solve 3 coding assignments and build a course project where you'll train ML models using a large real-world dataset. Enroll now: zerotogbms.com
🔗 Visit the logistic regression lecture page here: jovian.ai/learn/machine-learn...
🎤 About the speaker
Aakash N S is the co-founder and CEO of Jovian, a community learning platform for data science & ML. Previously, Aakash worked as a software engineer (APIs & Data Platforms) at Twitter in Ireland & San Francisco. He graduated from the Indian Institute of Technology, Bombay. He’s also an avid blogger, open-source contributor, and online educator.
#GBM #MachineLearning #Python #Certification #Course #Jovian
-
Learn Data Science the right way at www.jovian.ai
Interact with a global community of like-minded learners jovian.ai/forum/
Get the latest news and updates on Machine Learning at / jovianml
Connect with us professionally on / jovianml
Follow us on Instagram at / jovian.ml
Subscribe for new videos on Artificial Intelligence / jovianml

Comments: 73

@anuragthakur5787 3 years ago
That was intense!!! This is probably the first time I have watched a tutorial this long without a break. You are awesome, sir.
@SillyLittleMe 1 year ago
This video is still one of the best. A literal game changer!
@kizzavincent 3 years ago
Thanks a lot, Aakash, for the fabulous explanations and infectious passion to empower others. These tutorials are simply unmatched! Bravo!
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
@bongogappo38
1 year ago
@jovianhq Sir, what can we do if there is a column of string-type values, like disease name and symptoms?
@TheAnugupta 3 years ago
Nicely explained, Aakash and Jovian team. This was probably the most thorough and clearly explained tutorial I have come across.
@parastooaghr 3 years ago
Great video! I learned a lot! Thank you!
@hemangdhanani9434 3 years ago
Great explanation with reasonable depth for this topic. Such a great video.
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
@rlm3574 3 years ago
Really, a lecture full of knowledge.
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
@ektakumari4496 2 years ago
Great content, Aakash sir, and free too. Really amazed and impressed by Jovian!
@jovianhq
2 years ago
Glad you liked it!
@sahilmalhotra7295 1 year ago
Thank you, this was very beginner friendly and it helped me understand a lot of practical topics.
@jovianhq
1 year ago
You're very welcome! Glad it was helpful.
@mdalamgirhossain6192 2 years ago
Salute, boss. This is wholesome 💝💝
@tapomayeebasu3047 2 years ago
Thank you for such a detailed lecture. Very, very helpful. Would love to learn more.
@jovianhq
2 years ago
Glad it was helpful! Go to zerotogbms.com for more lectures on Machine Learning.
@danielm5729 2 years ago
Thank you very much.🙏
@anuphp3432 1 year ago
Excellent, brother!
@foodforthought8415 3 years ago
Very good tutorial, elaborate and detailed. Thanks.
@jovianhq
3 years ago
Thanks for the feedback, help us spread the word :)
@NehaSingh-fb8kj 4 months ago
Great content
@harshvardhansalve8537 4 months ago
Nice lecture
@dataninjaa 11 months ago
Thanks a lot, bro. It's a nice dataset, and you covered it very well from start to end.
@UsmanKhan-tc4sk 1 year ago
I was working on a mini data science project in which I was given test.csv and train.csv datasets. I trained my model using the training data. Now, if I want to find the accuracy score of my model on the testing data, what should I do? If I write model.predict(test_data), how do I compare the predicted test values with the true values, given that there are no target values in the testing dataset?
@gurjeet333 3 years ago
Nice video, really appreciated. Could you also cover setting up data preprocessing pipelines in future sessions?
@jovianhq
3 years ago
Thanks for the feedback and suggestion!
@mayankraj4763 3 months ago
Hello. I have a question. Should we scale the features after the imputation or before? I ask because here you imputed the raw_df dataframe, which is not imputed. Thanks.
@thakurprathiksinghrajput7135 5 months ago
1:45:00 While you fitted the transformed columns into your model, I am still getting a type error.
@anuphp3432 1 year ago
Hey, also, isn't it common practice to fit the scaler only on the training dataset and then use it to transform the validation and test data?
@sarimahsan6341 5 months ago
At 1:35:35, in the encoder transform, I am getting an error that columns must be the same length as key. Please tell me how to resolve it.
@sandipansarkar9211 2 years ago
Finished watching
@sharkk2979 2 years ago
Thanks, you're so good! Thanks again.
@jovianhq
2 years ago
You're welcome!
@siddharthsahu5048 2 years ago
(1:53:40) When you plot the weights, the negative weights are not considered. But negative weights also affect the model, just in the opposite direction. What are your thoughts: should the negative weights be considered?
@jovianhq
2 years ago
Yes, the negative weights should be considered. In fact, you can try ignoring the columns whose weights are very small, i.e. close to 0. Both negative and positive weights affect the model in some way.
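One way to act on this reply is to rank features by the absolute value of their weights, so that strongly negative weights are not hidden below small positive ones. The sketch below is plain Python; the feature names and weight values are invented for illustration and are not taken from the lesson's model.

```python
# Hypothetical feature names and weights, invented for illustration only.
weights = {
    "humidity_3pm": 5.8,
    "pressure_3pm": -9.2,
    "temp_9am": 0.03,
    "wind_gust_speed": 6.5,
    "sunshine": -1.6,
}


def rank_by_magnitude(feature_weights):
    """Return (feature, weight) pairs sorted by absolute weight, largest first."""
    return sorted(feature_weights.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_by_magnitude(weights)

for name, weight in ranked:
    # The sign tells the direction of the effect, the magnitude its strength.
    direction = "raises" if weight > 0 else "lowers"
    print(f"{name:>16}: {weight:+.2f} ({direction} the predicted probability)")
```

With these made-up numbers the most negative weight comes first and the weight nearest 0 comes last. The comparison is only meaningful when the input features are on a similar scale.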
@asifsaad5827 3 years ago
Would you mind switching to dark mode? TIA
@georgevavolil7005 2 years ago
I have a doubt. When we do imputation, we replace the missing values with the mean of each column, computed over the entire data. The mean of each column over the entire data will differ from the means computed from train_df, val_df and test_df separately, which will create some discrepancy in the final result. What's your position on this? Should we impute based on the entire dataframe or on its subsets?
@jovianhq
2 years ago
A sample of the data should represent the entire dataset. Also, the validation and test sets should be independent of the training set. So imputation can be done differently in the validation set and the training set.
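A common convention, shown here as a hedged sketch rather than as the lesson's own code, is to compute the fill value from the training column only and reuse it for the validation and test columns. The helper names and the tiny columns below are hypothetical; scikit-learn's `SimpleImputer` follows the same fit-then-transform pattern.

```python
import math


def column_mean(values):
    """Mean of the non-missing entries; missing entries are represented as NaN."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present)


def impute(values, fill_value):
    """Replace every NaN in the column with fill_value."""
    return [fill_value if math.isnan(v) else v for v in values]


nan = float("nan")
train_col = [10.0, nan, 14.0, 12.0]  # hypothetical training column
val_col = [nan, 30.0]                # hypothetical validation column

# "Fit" on the training data only ...
train_mean = column_mean(train_col)

# ... then apply the same fill value to every split.
train_filled = impute(train_col, train_mean)
val_filled = impute(val_col, train_mean)

print(train_mean, train_filled, val_filled)
```

The validation value 30.0 never influences the fill value, so the validation set stays independent of what the model saw during training.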
@datahistory2411 1 year ago
Thanks, sir... but how do we deploy this on a website?
@shantanusingh2198 2 years ago
Thank you🙂
@jovianhq
2 years ago
Welcome!
@krupamehta8705 1 year ago
1:26:54 I can't understand why the max value in some columns is not 1. It should be 1...
@rlm3574 3 years ago
3 hrs worth watching
@jovianhq
3 years ago
Thanks for watching!
@sandipansarkar9211 2 years ago
Finished coding in full
@lion87563 2 years ago
So the higher the weight, the more important the column is (but only if the numerical columns are scaled)? If the data is not scaled, we cannot draw this conclusion?
@jovianhq
2 years ago
True! Also, it's not just higher weights: the more negative the weight, the more important the column is too, i.e. the weights closest to 0 have the least importance.
@user-ds2vu7uy2q 1 year ago
Amazing
@jovianhq
1 year ago
Thank you!
@truptpatel2597 2 years ago
Information leakage, timestamp 1:25:10: he fitted the scaler on the whole numerical dataset and used it to transform the train, validation and test sets. But isn't that information leakage, because the scaler saw the test and validation data while fitting?
@jovianhq
2 years ago
Well, if you have access to the validation dataset, you can fit the scaler on both the training and validation data. Generally, you won't be able to touch the test dataset, so we shouldn't fit the scaler/encoder on the test dataset.
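To make the leakage question concrete, here is a minimal plain-Python sketch of min-max scaling in which the range is learned from the training column only. The function names and numbers are hypothetical and only illustrate the idea; scikit-learn's `MinMaxScaler` separates `fit` and `transform` in the same way.

```python
def fit_min_max(values):
    """Learn the (min, max) range from one column of data."""
    return min(values), max(values)


def transform_min_max(values, lo, hi):
    """Map values so that lo becomes 0 and hi becomes 1."""
    return [(v - lo) / (hi - lo) for v in values]


train_col = [2.0, 4.0, 6.0, 10.0]  # hypothetical training column
val_col = [1.0, 8.0, 12.0]         # hypothetical validation column

# Fit on the training data only, so nothing from validation leaks in.
lo, hi = fit_min_max(train_col)

train_scaled = transform_min_max(train_col, lo, hi)
val_scaled = transform_min_max(val_col, lo, hi)

print(train_scaled)  # stays inside [0, 1]
print(val_scaled)    # may fall outside [0, 1]
```

The training values land exactly in the (0, 1) range, while unseen values can fall slightly below 0 or above 1. The reverse also holds: if the range is fitted on the full dataset, the maximum within a single split can be less than 1.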
@neurax6688 1 year ago
Wow
@arjunbhandari5554 4 months ago
1:39:11
@pallapothubhargavramfromib2244 3 years ago
Sir, will you continue these videos?
@shreeyansjavangula8908
3 years ago
Yeah, this is a course on ML. The structure of the new videos is provided on his website: jovian.ai/learn/machine-learning-with-python-zero-to-gbms
@fet_hsc2300 3 years ago
0:58:00
@fet_hsc2300 3 years ago
1:06:09
@fet_hsc2300 3 years ago
1:00:55
@rabbitazteca23 2 years ago
Hi, I noticed that at 1:53:44 you are making a prediction using the train inputs (X_train), but shouldn't you be making a prediction using the validation inputs instead? I don't think you have passed X_val into any of the logistic regression model predictions... or am I just confused? Hahaha.
@jovianhq
2 years ago
Please check kzread.info/dash/bejne/pZ593Mh8ZKS1eZM.html. At first we predict with the training set; later we also predict with the validation and test sets.
@rabbitazteca23
2 years ago
@jovianhq I am sorry, hahaha. You are right. I must have missed this part.
@kmishy 1 year ago
Bookmark 1:03:15. For me, the important part starts here.
@debojitmandal8670 3 years ago
What's a solver?
@jovianhq
3 years ago
Hey, please go through this blog post to learn more about solvers: towardsdatascience.com/dont-sweat-the-solver-stuff-aea7cddc3451
@adityabenere6004 2 years ago
1:56:49 nice
@rubayetalam8759 3 years ago
Please add subtitles.
@jovianhq
3 years ago
Hey, we are in the process of adding subtitles to the videos; they will be added soon. Thanks!
@rubayetalam8759
3 years ago
@jovianhq Thanks! You are doing great!
@imaksinsights7202 2 years ago
Here is another simplified Logistic Regression tutorial if you are a beginner: kzread.info/dash/bejne/ppeetJqDibbIaag.html
@yskasells4014 2 years ago
1:18:01
@fet_hsc2300 3 years ago
1:08:30
