Feature selection in machine learning | Full course

Science & Technology

Full source code on GitHub: github.com/marcopeix/youtube_...
Introduction - 0:00
Initial code setup - 2:19
Variance threshold - 11:04
Variance threshold (code) - 13:02
Filter method - 19:39
Filter method (code) - 21:27
RFE - 29:08
RFE (code) - 30:42
Boruta - 37:12
Boruta (code) - 41:21
Thank you - 46:35
A full course on feature selection in machine learning projects.
We first cover a naive method based on variance. Then we move on to filter methods and wrapper methods such as recursive feature elimination (RFE). Finally, we implement the Boruta algorithm.
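The techniques listed above can be sketched with scikit-learn. This is a minimal illustration, not the video's actual notebook; the wine dataset and the specific thresholds are assumptions (Boruta, which needs the separate boruta package, is omitted here):

```python
from sklearn.datasets import load_wine
from sklearn.feature_selection import RFE, SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

X, y = load_wine(return_X_y=True)

# Naive method: drop low-variance features. Scale to [0, 1] first so
# variances are comparable across features with different units.
X_scaled = MinMaxScaler().fit_transform(X)
X_vt = VarianceThreshold(threshold=0.01).fit_transform(X_scaled)

# Filter method: rank features by a univariate ANOVA F-score.
kbest = SelectKBest(f_classif, k=5).fit(X, y)

# Wrapper method: recursive feature elimination with an estimator.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5).fit(X_scaled, y)
print(rfe.support_)  # boolean mask of the 5 retained features
```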

Comments: 37

  • @samuelliaw951 (5 months ago)

    Really great content! Learnt a lot. Thanks for your hard work!

  • @shwetabhat9981 (a year ago)

    Woah, much awaited 🎉. Thanks for all the efforts put in, sir. Looking forward to more such amazing content 🙂

  • @claumynbega1670 (5 months ago)

    Thanks for this valuable work. It helps me learn the subject.

  • @michaelmecham6145 (2 months ago)

    Sensational video, thank you so much!

  • @babakheydari9689 (12 days ago)

    It was great! Thanks for sharing your knowledge. Hope to see more of you.

  • @tanyaalexander1460 (26 days ago)

    I am a noob to data science and feature selection. Yours is the most succinct and clear lesson I have found... Thank you!

  • @abhinavreddy6451 (25 days ago)

    Please do more data science content! It was very helpful. I searched everywhere for feature selection videos and finally landed on this one, and it was all I needed. The content is awesome, and so is the explanation!

  • @Loicmartins (5 months ago)

    Thank you very much for your work!

  • @paramvirsaini2806 (7 months ago)

    Great explanation. Easy hands-on as well!!

  • @datasciencewithmarco (7 months ago)

    Thank you!

  • @maythamsaeed533 (5 months ago)

    Very helpful video and an easy way to explain the content. Thanks a lot!

  • @oluwasegunodunlami7360 (5 months ago)

    Wow, this video is really helpful; a lot of interesting methods were shown. Thanks a lot. I'd like to ask you to make a future video covering how you perform feature engineering and model fine-tuning 1:49

  • @tongji1984 (2 months ago)

    Dear Marco, thank you. 😀

  • (a year ago)

    I am currently reading your book and it's amazing

  • @jmagames2766 (6 months ago)

    What is the name of the book, please?

  • @imadsaddik (2 months ago)

    Thank you for sharing

  • @roba_trend (a year ago)

    Interesting content, much love 🥰

  • @dorukucar7105 (5 months ago)

    pretty helpful!

  • @scott6571 (3 months ago)

    Thank you! It's helpful!

  • @datasciencewithmarco (3 months ago)

    Glad it helped!

  • @chiragsaraogi363 (6 months ago)

    This is an incredibly helpful video. One thing I noticed is that all the features are numerical. How do we approach feature selection with a mix of numerical and categorical features? Also, when we have categorical features, do we convert them to numerical features first, or do feature selection first? A video on this would be really helpful. Thank you!

  • @haleematajoke4794 (5 months ago)

    You will need to convert the categorical features into numerical format, either with label encoding, which automatically converts them to numerical values, or with a custom mapping, where you can manually assign your preferred values to the categories. I hope it helps.

  • @haleematajoke4794 (5 months ago)

    You will have to do the conversion before feature selection, because machine learning models only learn from numerical data.
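As a concrete sketch of the replies above (the column and category names here are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})

# Custom mapping: manually assign a numeric code to each category.
df["color_code"] = df["color"].map({"red": 0, "green": 1, "blue": 2})

# Alternatively, one-hot encoding avoids implying an order between categories,
# which matters for models that treat larger numbers as "greater".
df_onehot = pd.get_dummies(df["color"], prefix="color")
print(df_onehot.columns.tolist())
```

Either encoded form can then be fed to the feature-selection steps, since they operate on numeric arrays.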

  • @alfathterry7215 (4 months ago)

    In the variance threshold technique, if we use StandardScaler instead of MinMaxScaler, the variance would be the same for all variables... Does that mean we can eliminate this step and just use StandardScaler?
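The premise of this question is easy to verify: StandardScaler forces every feature to unit variance, so a variance threshold applied afterwards can no longer distinguish features at all (which is why MinMax scaling is used before thresholding). A small check, not from the video:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Three features with wildly different spreads.
X = rng.normal(size=(100, 3)) * [1.0, 10.0, 0.01]

X_std = StandardScaler().fit_transform(X)
print(X_std.var(axis=0))  # all ~1.0: the threshold has nothing to discriminate
```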

  • @eladiomendez8226 (3 months ago)

    Awesome video

  • @datasciencewithmarco (3 months ago)

    Thanks!

  • @pooraniayswariya997 (9 months ago)

    Can you teach how to do MRMR feature selection in ML?

  • @cagataydemirbas7259 (a year ago)

    Hi, when I use RandomForest, DecisionTree, and XGBoost with RFE, even though all of them are tree-based models, they return completely different orders. My dataset has 13 columns; with XGBoost one feature's importance rank is 1, the same feature's rank with DecisionTree is 10, and with RandomForest it is 7. How can I trust which feature is better than the others in general? If a feature is more predictive than the others, shouldn't it have the same rank across all tree-based models? I am so confused about this. It's the same with SequentialFeatureSelector.

  • @datasciencewithmarco (a year ago)

    That's normal! Even though they are tree-based, they are not the same algorithm, so the ranking will change. To decide on the best feature set, you simply have to predict on a test set and measure the performance to make a decision.
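The reply above can be sketched in code: run RFE with two different estimators, then judge each resulting feature set by held-out performance rather than by the rankings themselves. A minimal illustration with an assumed dataset and estimators, not the video's notebook:

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Two tree-based selectors will usually retain different feature sets.
for est in (DecisionTreeClassifier(random_state=0),
            RandomForestClassifier(random_state=0)):
    mask = RFE(est, n_features_to_select=5).fit(X, y).support_
    # Compare the feature sets by downstream cross-validated accuracy.
    score = cross_val_score(LogisticRegression(max_iter=5000), X[:, mask], y).mean()
    print(type(est).__name__, round(score, 3))
```

The feature set with the better held-out score is the one to keep, regardless of how the individual rankings disagree.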

  • @therevolution8611 (a year ago)

    Can you explain how to perform feature selection for a multilabel problem?

  • @user-cl3ej5mi7k (3 months ago)

    You can convert the labels to numerical features by replacing them with numbers. If a feature has 3 labels, you could represent them with 0, 1, 2. There are different methods to use; a simple one is .replace({}).

  • @mrthwibble (3 months ago)

    Excellent video, however I'm preoccupied trying to figure out if having wine as a gas would make dinner parties better or worse. 🤔

  • @roba_trend (a year ago)

    I tried searching under your GitHub but couldn't find the data. Where is the data you work on?

  • @datasciencewithmarco (a year ago)

    The dataset comes from the scikit-learn library! We are not reading a CSV file. As long as you have scikit-learn installed, you can get the same dataset! That's what we do in cell 3 of the notebook, and it's also on GitHub!
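For reference, loading a built-in scikit-learn dataset looks like this (which dataset the notebook actually uses is an assumption here; the wine dataset is shown as an example):

```python
from sklearn.datasets import load_wine

# No CSV needed: the data ships with scikit-learn itself.
data = load_wine(as_frame=True)
X, y = data.data, data.target
print(X.shape)  # (178, 13)
```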

  • @nikhildoye9671 (a month ago)

    I thought feature selection is done before model training. Am I wrong?

  • @keerthana7353 (a month ago)

    Yes, correct.
