Credit Scoring Project using Machine Learning | Risk Modelling | Logistic Regression | ML Project#1

🔥 In this video, we shall build a Credit Scoring Model for a bank, enabling it to make data-driven lending decisions. We shall use a Logistic Regression classifier to build our model and the decile methodology to formulate a lending strategy.
We are Skillcate! We are on a mission to bring you application-based machine learning education. We launch new machine learning projects every month, so make sure to subscribe to our channel to get access to all of our ML projects.
At Skillcate, our endeavour is to take you through the end-to-end journey of solving real business problems, of which coding is just one part. So, even if you have little to no coding experience, don't worry. As part of this free course, we are giving you a free Do-It-Yourself toolkit with ready-to-use Python code for your hands-on learning and reuse.
Enjoy data science!
🔥 Course Sections
00:00 Introduction
03:19 Client's Business Case
05:22 Solutioning Intuition
06:11 ML Model Building
13:34 Decile Methodology
22:21 Solution Delivery to Client
🔥 Resources
Google Drive: drive.google.com/drive/folder...
Data Normalization: developers.google.com/machine...
Credit Scoring: www.investopedia.com/terms/c/...
ROC Curve: towardsdatascience.com/unders...
Do like, share & subscribe to our channel.
🔥 Keep in touch
Website: www.skillcate.com
LinkedIn: / 67209084
Email: skillcate@gmail.com

Comments: 46

  • @skillcate
    1 year ago

    Hey guys!! Glad to see such amazing feedback on this ML Project🤗 Need your support in reaching out to more learners by subscribing to my channel 🙂 Also, join me on my Skillcate Discord Server: discord.gg/GyMBfD4ER5 🙂 Let's talk Machine Learning ❤❤

  • @suattuncer8727
    11 months ago

    Just for those who don't know about scaling: In the video, the standard scaling method is used, which scales values based on the data's own mean and variation. On the other hand, scaling between 0 and 1 is Min/Max scaling, where the maximum value is set to 1 and the minimum value to 0. It's important to note that these boundaries are learned from the data the scaler is fitted on and, in common implementations such as scikit-learn's MinMaxScaler, are computed separately for each column.
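
A minimal sketch of the two scaling methods mentioned in this comment, written with plain NumPy so each step is visible. The toy matrix is hypothetical and is not taken from the project's dataset. In common libraries such as scikit-learn, both scalers work column by column (each feature gets its own mean and standard deviation, or its own minimum and maximum), which is what `axis=0` does here.

```python
import numpy as np

# Hypothetical toy feature matrix: two columns on very different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 5000.0]])

# Standard scaling: subtract each column's mean, divide by its standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Min/Max scaling: map each column's minimum to 0 and its maximum to 1.
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X_std)
print(X_minmax)
```

In a real project the statistics would normally be computed on the training set only and then reused for the test set and for new applications.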

  • @luizfelipevercosa
    1 year ago

    Thanks a lot! I loved the threshold analysis!

  • @skillcate
    1 year ago

    Glad you liked it Luiz 😊

  • @MayankDayal1234
    1 year ago

    Good stuff.. thanks for sharing.

  • @user-qq8bc8km7e
    11 months ago

    Thanks for the awesome explanation! I still couldn't understand how you came up with the decision to take 79.73%, keeping profitability and expansion in mind. Why not 72.45%, for example?

  • @xolanitarence8853
    1 year ago

    Wow it gives a good exposure

  • @skillcate
    1 year ago

    Glad you liked it!!

  • @shahwaheedullah4799
    2 years ago

    Hi Sir, thanks for sharing this video, lots of knowledge and information. But Sir, how can we use a financial ratios dataset of industries in logistic regression to predict credit and investment risk for financial institutions? Please share some videos/links to learn! Thanks & Regards

  • @skillcate
    2 years ago

    Dear Waheed, thank you for your comment & apologies for the delay in response. It's really an excellent question. Financial ratios are a solid proxy for gauging the creditworthiness of FIs. The key here would be to get or build a good labelled dataset of historical numbers. From an ML standpoint, the process essentially remains the same, with a bunch of independent variables (financial ratios in this case, plus other features) and the dependent variable Good / Bad. Hope this answers your query :-)

  • @rolfjohansen5376
    1 year ago

    I sort of expected something more after the logistic regression, something like an advanced ML technique such as gradient boosting, neural nets, etc.

  • @skillcate
    1 year ago

    Hey Rolf, thanks for the feedback. On Neural Networks specifically, I have done a couple of projects: one on Movie Review Sentiment Analysis using LSTM: kzread.info/dash/bejne/oYujm7WHk9zenKw.html, and a second on Age-Gender-Emotion Detection using CNN: kzread.info/dash/bejne/p6Oq0ZOsYcXHorg.html. If your request is specific to Credit Scoring, I am making a note to do an advanced approach on this project in the coming days. Happy learning to you :)

  • @mathslearningmadefun7226
    2 years ago

    It was again a great video especially the last part i.e., Decile concept. I have a doubt regarding how to find the 'Target (predicted)' value for a new observation, other than what is already present in train and test data. Could you please help me with this?

  • @skillcate
    2 years ago

    Thanks for your feedback! We have added Prediction code and supporting files in the Credit Scoring Toolkit folder: drive.google.com/drive/folders/1Rz7w2FuC9CegZ-o6SeJoBivj08iaLGer. Here's a crisp Read Me document on using the new files: docs.google.com/document/d/14FJ137z-zdFSmatOZa1etE34_n0-i1oqOmMXGTqdJws/edit?usp=sharing. With this new prediction file, you may learn how to predict Probability_of_Good for new loan applications. And once you have the probability values, you may check for decile specific Cut-off Probabilities to figure out whether to Approve or Reject a loan. Happy Learning! :)
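
A rough sketch of the idea described in this reply: score a new application with the fitted model, then compare its probability of being good against a decile cut-off. This is not the toolkit's prediction file. The synthetic training data, the `decide` helper and the `CUTOFF_PROBABILITY` value are all assumptions made for illustration, and the sketch assumes scikit-learn is installed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the training data (assumption): 1 = good loan, 0 = bad loan.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

# Hypothetical cut-off; in practice it would come from the chosen decile.
CUTOFF_PROBABILITY = 0.60

def decide(applications):
    """Return (decision, probability_of_good) for each new application."""
    good_col = list(model.classes_).index(1)  # column of predict_proba for class 1
    prob_good = model.predict_proba(applications)[:, good_col]
    return [("Approve" if p >= CUTOFF_PROBABILITY else "Reject", round(float(p), 3))
            for p in prob_good]

# Two hypothetical new applications.
new_applications = np.array([[2.0, 0.0, 0.0],
                             [-2.0, 0.0, 0.0]])
decisions = decide(new_applications)
print(decisions)
```

If the model was trained on scaled features, the same fitted scaler would have to be applied to new applications before calling `predict_proba`.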

  • @mathslearningmadefun7226
    2 years ago

    @@skillcate Thank You for making the extra effort in uploading new files to drive. It helped me a lot.

  • @skillcate
    2 years ago

    @@mathslearningmadefun7226 Happy to help :) Do like, share and subscribe if you like our content. This helps us reach more folks like yourself! Keep learning!!

  • @karthikharisamy5367
    2 years ago

    Can you share how to build a credit risk model end to end under IFRS 9 regulations? It would be of great help if you could share any links as well. Thanks

  • @skillcate
    2 years ago

    Hi Karthik! Really appreciate your feedback. Please reach out to us at skillcate@gmail.com with details. Thank you!!

  • @rahulshukla7380
    1 year ago

    Could you please help me with how you created the pivot table and added all the column fields to it?

  • @skillcate
    1 year ago

    Dear Rahul, hope you are well! Were you able to access our Skillcate Toolkit, the link to which is shared in the video description? We have kept the datasheet there for easy reference on the Excel formulas used. Anyway, you may drop me an email at skillcate@gmail.com and I can take a quick 1:1 session with you (no charges, of course), where we can prepare this sheet together :) Toolkit link: drive.google.com/drive/folders/1Rz7w2FuC9CegZ-o6SeJoBivj08iaLGer?usp=sharing

  • @igorgomez1002
    1 year ago

    Hi, did you change the dataset? The IDs you are using in the example don't appear in the Drive's dataset (for example, the ID 66, first row), and the results I get are totally different.

  • @skillcate
    1 year ago

    Dear Igor, I just checked the code myself. The dataset is unchanged. For the Train-Test-Split function, I have added a new parameter, "stratify=y", to balance out good and bad loans across the train and test sets. With this, I'm getting an even better accuracy of 83.3%. Do try it out yourself. I'm sure your results will improve.
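
A small illustration of what `stratify=y` does in scikit-learn's `train_test_split`. The arrays below are synthetic (an 80/20 mix of good and bad labels chosen for the example), and `test_size` and `random_state` are illustrative values rather than the notebook's settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 100 rows, 80% good (1) and 20% bad (0).
X = np.arange(200).reshape(100, 2)
y = np.array([1] * 80 + [0] * 20)

# stratify=y keeps the good/bad ratio the same in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print("share of good loans in train:", y_train.mean())
print("share of good loans in test: ", y_test.mean())
```

Without stratification the split is purely random, so a small test set can end up with noticeably more or fewer bad loans than the training set, which may also explain run-to-run differences in accuracy.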

  • @vishwav5753
    2 years ago

    Can you please provide a feature selection algorithm for this credit scoring project, bro?

  • @skillcate
    2 years ago

    Dear Vishwa, that's a great question. In logistic regression, you may check for multicollinearity among input features. VIF, or Variance Inflation Factor, is a measure used to gauge multicollinearity. In simple words, multicollinearity describes the state where the independent variables used in a study exhibit a strong relationship with each other. Sharing this comprehensive article that has the details and code snippets for you to implement this functionality in your Credit Scoring project: towardsdatascience.com/targeting-multicollinearity-with-python-3bd3b4088d0b. Keep learning!! :)
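
A hedged sketch of how VIF can be computed, using only NumPy. It is not the code from the linked article. The `vif` helper and the synthetic columns are assumptions for illustration. For feature j, the helper regresses that column on all the others and returns 1 / (1 - R²).

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of a 2-D array (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    factors = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares of column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        r_squared = 1.0 - residual.var() / target.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Synthetic features: c is almost a copy of a, b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)

vifs = vif(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], vifs.round(1))))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of strong multicollinearity; dropping or combining one of the correlated features is one way to handle it.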

  • @NEWLIFESTYLE3
    9 months ago

    Hi, please, I have a small problem with my dataset. When I concatenate the 'prob_0' and 'prob_1' columns with the 'Actual_target' (which is my 'y_test' values), the 'Actual_target' column contains NaN values. Why is that?

  • @kevinmugo2662
    5 months ago

    When declaring X and y, you left out (.values) at the end.
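
A small reproduction of the NaN issue discussed in this exchange, assuming pandas is used as in the question. The values and index labels are made up. When `y_test` is a pandas Series it keeps the row labels from the original dataset, and `pd.concat` aligns on those labels rather than on position.

```python
import pandas as pd

# y_test as a Series keeps the original (shuffled) row labels from the split.
y_test = pd.Series([1, 0, 1], index=[17, 4, 42], name="Actual_target")

# Predicted probabilities built from a NumPy array get a fresh 0..n-1 index.
probs = pd.DataFrame({"prob_0": [0.2, 0.7, 0.1],
                      "prob_1": [0.8, 0.3, 0.9]})

# Index labels do not match, so pandas pads both sides with NaN.
misaligned = pd.concat([probs, y_test], axis=1)

# Fix: drop the old labels so both objects share the 0..n-1 index.
aligned = pd.concat([probs, y_test.reset_index(drop=True)], axis=1)

print(misaligned)
print(aligned)
```

Declaring `y` with `.values` has the same effect, because `y_test` is then a plain NumPy array with no index to align on.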

  • @retiredman5722
    2 years ago

    You need to show the result on the 20% validation sample, based on the scorecard built on your 80% sample, right?

  • @skillcate
    2 years ago

    Hey buddy!! Thank you for your comment & apologies for the delay in response. That's correct!! 20% is basically unseen data, that was kept separate from the remaining 80% training data used for Model Building. So, this 20% is the one we use for generating labels from our model + formulating lending strategy..

  • @saurabhmeshram7315
    1 year ago

    How did you calculate the profit to business values?

  • @skillcate
    1 year ago

    Hey Saurabh, we used this formula for each decile: (Cumm Good * Profit from a good loan) - (Cumm Bad * Loss from a bad loan)
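
A worked example of the formula in this reply, in plain Python. The per-loan profit and loss figures and the decile counts are hypothetical and are not the numbers from the video. Deciles are assumed to be ordered from the highest to the lowest probability of good.

```python
# Hypothetical business inputs (assumptions, not the video's figures).
PROFIT_PER_GOOD_LOAN = 100
LOSS_PER_BAD_LOAN = 500

# (decile, good loans, bad loans), best decile first; counts are made up.
deciles = [(1, 20, 0), (2, 19, 1), (3, 17, 3), (4, 12, 8)]

cum_good = 0
cum_bad = 0
rows = []
for decile, good, bad in deciles:
    cum_good += good
    cum_bad += bad
    # Profit if the bank approves every application down to this decile.
    profit = cum_good * PROFIT_PER_GOOD_LOAN - cum_bad * LOSS_PER_BAD_LOAN
    rows.append((decile, cum_good, cum_bad, profit))

for row in rows:
    print(row)

# The decile with the highest cumulative profit is one candidate cut-off.
best = max(rows, key=lambda r: r[3])
print("most profitable cut-off decile:", best[0])
```

Approving further down the list adds more good loans but also more bad ones, so cumulative profit typically rises and then falls. Picking a cut-off is then a trade-off between profit and the number of customers approved.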

  • @mansimishra7089
    1 year ago

    Where did you get this dataset from? Please let me know. Thank you!!

  • @skillcate
    1 year ago

    Hi Mansi. As an example, we got this specific dataset from the UCI repository: archive.ics.uci.edu/ml/index.php. However, there are plenty of other credit scoring datasets available where this approach is applicable as well: github.com/JLZml/Credit-Scoring-Data-Sets.

  • @omdivyatej3818
    1 year ago

    What is "No.of trade lines Unes 60 days worse or ever" in the feature part? What is Unes here?

  • @omdivyatej3818
    1 year ago

    And what is pct?

  • @skillcate
    1 year ago

    Dear Om, hope you are well. Sharing clarifications here:
    1. Trade lines are essentially the credit lines you have availed. For example, if you have a credit card and an education loan, you have 2 trade lines open.
    2. For a credit scoring problem, the timeline of trade lines is important as well. For example, suddenly opening many trade lines in a short period is a red flag.
    3. Read 'Unes' as 'Lines'. There's a typo there :-)
    4. 'Pct' is percentage. 'No. of trade lines 50pct utilised' means the count of credit lines of which you have consumed over 50%.
    5. 'No. of trade lines 60 days or worse ever' is the count of all active/inactive credit lines in the last 60 days, plus the credit lines that were active prior to 60 days where you defaulted.

  • @JoanaOdtojan
    1 year ago

    Hi, wondering how do I apply this for financial crime scoring?

  • @skillcate
    1 year ago

    Dear Joana, I'm actually planning to do a project on Credit Card Fraud Detection by next week. Stay tuned for that :)

  • @kevinmugo2662
    5 months ago

    Kindly, could someone help with how the decile formula was applied to all cells?

  • @nelya.kulch11
    3 months ago

    It's shown in the video: in cell 61 we enter the formula: first-row cell + 1. Then drag this formula down, copy it to all cells below, or double-click the cell's fill handle.
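
For readers who prefer code to spreadsheet formulas, here is a rough Python equivalent of assigning deciles. It assumes, as a simplification, that rows are ranked by probability of good from highest to lowest and cut into ten equally sized groups. The `assign_deciles` helper and the sample probabilities are hypothetical.

```python
def assign_deciles(probabilities):
    """Return a decile number (1 = highest probabilities) for each input row."""
    n = len(probabilities)
    # Row positions ordered from the highest to the lowest probability.
    order = sorted(range(n), key=lambda i: probabilities[i], reverse=True)
    deciles = [0] * n
    for rank, i in enumerate(order):
        # The first n/10 ranked rows get decile 1, the next n/10 get decile 2, ...
        deciles[i] = rank * 10 // n + 1
    return deciles

# Hypothetical probabilities of good for 20 test-set applications.
probs = [i / 20 for i in range(20)]
deciles = assign_deciles(probs)
print(list(zip(probs, deciles)))
```

This mirrors the spreadsheet approach of repeating a decile number for a fixed block of sorted rows and adding 1 at the start of the next block.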

  • @user-mn8qu8lv6v
    11 months ago

    I'm getting this error: could not convert string to float: '$2,327'. Please help.

  • @nelya.kulch11
    3 months ago

    Maybe it's because of the $ sign; try deleting it first.
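
A minimal sketch of the suggested fix in plain Python: strip the currency symbol and the thousands separator before converting. The `parse_currency` helper is hypothetical, and which columns need it depends on your copy of the dataset.

```python
def parse_currency(text):
    """Convert a string such as '$2,327' to a float (hypothetical helper)."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned)

print(parse_currency("$2,327"))
print(parse_currency(" 1,050.75 "))
```

For a whole pandas column, a similar clean-up can be done with `df[col].str.replace(r"[$,]", "", regex=True).astype(float)`, where `df` and `col` stand for your own DataFrame and column name.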

  • @vinayaktalukder2381
    1 year ago

    Hello Sir, I am getting this message " FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/1_LiveProjects/Project1_Credit_Scoring/a_Dataset_CreditScoring.xlsx' " Could you please help me out

  • @skillcate
    1 year ago

    Hi Vinayak, can you check the following? 1. Is your Google Drive mounted (check Cell 2 in the notebook)? 2. If yes, is the "a_Dataset_CreditScoring.xlsx" file present when you browse to /content/drive/My Drive/1_LiveProjects/Project1_Credit_Scoring/? If not, make sure to copy the file to that location.
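
The two checks above can also be run from a notebook cell. This is a sketch under assumptions: the `check_dataset` helper is hypothetical, and the path is the one quoted in the error message. On Google Colab, Drive is usually mounted first with `from google.colab import drive` followed by `drive.mount('/content/drive')`; that step is left as a comment so the snippet also runs outside Colab.

```python
from pathlib import Path

# On Colab, mount Drive first (not executed here):
#   from google.colab import drive
#   drive.mount('/content/drive')

DATA_PATH = ("/content/drive/My Drive/1_LiveProjects/"
             "Project1_Credit_Scoring/a_Dataset_CreditScoring.xlsx")

def check_dataset(path):
    """Report whether the file exists and, if not, how far the path is valid."""
    p = Path(path)
    if p.is_file():
        return f"Found: {p}"
    # Walk up the folder chain to find the deepest folder that does exist.
    for parent in [p.parent, *p.parent.parents]:
        if parent.exists():
            return f"Missing: {p.name}; deepest existing folder is {parent}"
    return f"Missing: {p}"

print(check_dataset(DATA_PATH))
```

If the deepest existing folder is `/content` or `/`, Drive is probably not mounted; if it is `My Drive` or one of the project folders, the file or a subfolder still needs to be copied there.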

  • @pratipadakhatode1146
    1 year ago

    Hello sir, I have a query about the source code that you provided. How can I contact you?

  • @skillcate
    1 year ago

    Hi Pratipada, Sure. Reach out to us at skillcate[AT]gmail.com for any query.
