Hyperparameter Tuning of Machine Learning Model in Python

Science and Technology

In this video, I will show you how to tune the hyperparameters of a machine learning model in Python using the scikit-learn package.
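As a rough sketch of the workflow covered in the video: a random forest's max_features and n_estimators are tuned with GridSearchCV on a synthetic dataset. The dataset parameters and grid ranges below are illustrative assumptions, not necessarily the exact values used in the video.

```python
# Sketch only: grid-search tuning of a random forest's two main
# hyperparameters. Dataset and ranges are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic classification data (200 rows)
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=5, random_state=42)

# Candidate values for the two hyperparameters being tuned
param_grid = {
    "max_features": np.arange(1, 6, 1),       # 1, 2, 3, 4, 5
    "n_estimators": np.arange(50, 250, 50),   # 50, 100, 150, 200
}

# 5-fold cross-validated grid search over all 5 x 4 combinations
grid = GridSearchCV(RandomForestClassifier(random_state=42),
                    param_grid, cv=5)
grid.fit(X, y)

print(grid.best_params_)
print(round(grid.best_score_, 3))
```

The full set of scores behind best_params_ is available in grid.cv_results_, which is what a contour plot over the two hyperparameters can be built from.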
🌟 Buy me a coffee: www.buymeacoffee.com/dataprof...
📎CODE: github.com/dataprofessor/code...
⭕ Playlist:
Check out our other videos in the following playlists.
✅ Data Science 101: bit.ly/dataprofessor-ds101
✅ Data Science KZreadr Podcast: bit.ly/datascience-youtuber-p...
✅ Data Science Virtual Internship: bit.ly/dataprofessor-internship
✅ Bioinformatics: bit.ly/dataprofessor-bioinform...
✅ Data Science Toolbox: bit.ly/dataprofessor-datascie...
✅ Streamlit (Web App in Python): bit.ly/dataprofessor-streamlit
✅ Shiny (Web App in R): bit.ly/dataprofessor-shiny
✅ Google Colab Tips and Tricks: bit.ly/dataprofessor-google-c...
✅ Pandas Tips and Tricks: bit.ly/dataprofessor-pandas
✅ Python Data Science Project: bit.ly/dataprofessor-python-ds
✅ R Data Science Project: bit.ly/dataprofessor-r-ds
⭕ Subscribe:
If you're new here, it would mean the world to me if you would consider subscribing to this channel.
✅ Subscribe: kzread.info...
⭕ Recommended Tools:
Kite is a FREE AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite and I love it!
✅ Check out Kite: www.kite.com/get-kite/?...
⭕ Recommended Books:
✅ Hands-On Machine Learning with Scikit-Learn : amzn.to/3hTKuTt
✅ Data Science from Scratch : amzn.to/3fO0JiZ
✅ Python Data Science Handbook : amzn.to/37Tvf8n
✅ R for Data Science : amzn.to/2YCPcgW
✅ Artificial Intelligence: The Insights You Need from Harvard Business Review: amzn.to/33jTdcv
✅ AI Superpowers: China, Silicon Valley, and the New World Order: amzn.to/3nghGrd
⭕ Stock photos, graphics and videos used on this channel:
✅ 1.envato.market/c/2346717/628...
⭕ Follow us:
✅ Medium: bit.ly/chanin-medium
✅ FaceBook: / dataprofessor
✅ Website: dataprofessor.org/ (Under construction)
✅ Twitter: / thedataprof
✅ Instagram: / data.professor
✅ LinkedIn: / chanin-nantasenamat
✅ GitHub 1: github.com/dataprofessor/
✅ GitHub 2: github.com/chaninlab/
⭕ Disclaimer:
The recommended books and tools above are affiliate links that give me a portion of sales at no cost to you, which helps support the improvement of this channel's content.
#dataprofessor #hyperparameter #machinelearning #datascienceproject #tuning #optimization #optimisation #randomforest #decisiontree #svm #neuralnet #neuralnetwork #supportvectormachine #python #learnpython #pythonprogramming #datascience #datamining #bigdata #datascienceworkshop #dataminingworkshop #dataminingtutorial #datasciencetutorial #ai #artificialintelligence #tutorial #dataanalytics #dataanalysis #machinelearningmodel

Comments: 78

  • @Ghasforing2
    3 years ago

    This was a lucid and complete discussion of hyperparameter tuning. Thanks for sharing, Professor.

  • @DataProfessor
    3 years ago

    Thank you for watching and glad it was helpful 😊

  • @AI_Boy99
    7 months ago

    Wow, this was amazing. I'm working on machine learning models to diagnose early leakage of valves in piston diaphragm pumps. Thanks Chanin. Really love your videos.

  • @WaliSayed
    2 days ago

    Very clear, and the details are explained in a simple way. Thank you!

  • @aimenbaig6201
    3 years ago

    I love your calm teaching style! It's relaxing.

  • @DataProfessor
    3 years ago

    Thank you! 😊

  • @neeshi1176
    2 years ago

    It may be late for me to be watching this, but it was worth it, sir. Thank you so much for this gem; keep teaching. A very elaborate explanation.

  • @ajifyusuf7624
    3 years ago

    This video, I think, is one of the best explanations of hyperparameter tuning.

  • @DataProfessor
    3 years ago

    Thanks for the kind words 😊

  • @aiuslocutius9758
    2 years ago

    Thank you for explaining this concept in an easy-to-understand manner.

  • @DataProfessor
    2 years ago

    You're very welcome!

  • @jgubash100
    3 years ago

    Liked the contour plots, I'll have to try those too.

  • @MarsLanding91
    3 years ago

    Superb video. Very insightful. Question: how are you picking the numbers for the parameters? max_features_range = np.arange(1,6,1): why did you decide to start at 1 and end at 6? Why are you incrementing by 1 and not by 2, for example? Would love to hear your thoughts on this.

  • @CatBlack01
    3 years ago

    Clear explanation and presentation. Love the analogies and error fixing.

  • @DataProfessor
    3 years ago

    Much appreciated! Glad to hear!

  • @amiralx88
    3 years ago

    Really nice and clean code. I've learned a lot from your video about how to optimize mine. Thanks!

  • @dearcharlyn
    2 years ago

    Another amazing tutorial, well explained and comprehensible! Thank you data professor! I am currently working on COVID-19 predictor models. :)

  • @DataProfessor
    2 years ago

    Thanks! Appreciate the kind words!

  • @madhawagunathilake8304
    1 year ago

    Thank you Prof. for your very insightful and helpful lecture!!

  • @GeraldTalton
    1 year ago

    Great video, always helps to see the visualization

  • @jorge1869
    4 years ago

    Hello Dr.! I have read many of your works, because I also have a line of research related to the development of machine-learning-based tools, mainly prediction of peptides with different activities. Currently I use Python for development and of course to publish my papers; I am also learning R because I have noticed this language has good libraries for calculating molecular descriptors, for instance "Protr". I would appreciate a video tutorial explaining key steps such as data splitting, training, cross-validation, and testing in R using the "CARET" library, if possible of course. Greetings, and success to this awesome youtube channel!

  • @DataProfessor
    4 years ago

    Thanks JF for the comment and for reading my research work. How did you discover this KZread channel? (so that I can use this information to better promote the channel) Yes, we also use protr package in R for some of our peptide/protein QSAR work. In that case, I might make a video about calculating the descriptors of peptide/protein or even compounds in future videos. In the meantime, please check out the following video "Machine Learning in R: Building a Classification Model" as well as 13 other R video tutorials explaining the machine learning model building process in a step-by-step manner. kzread.info/dash/bejne/loal1q6xirm4pdo.html

  • @jorge1869
    4 years ago

    Dr., thank you so much for your reply. I discovered your channel here on youtube while looking for machine learning tutorials in R. When you mentioned your name in one of your videos, where you give an excellent lecture on drug discovery, I quickly looked up your profile on ResearchGate, and that's how I realized it was you.

  • @DataProfessor
    4 years ago

    JF Thanks for the insights, it is very helpful.

  • @donrachelteo9451
    3 years ago

    Yes indeed, this is one of the best explanations of hyperparameter tuning. I just needed clarification: how do we decide the range of values to run in the grid search? Hope you can also do a video on manual tuning vs. automatic grid search tuning. Thanks 👍

  • @DataProfessor
    3 years ago

    Thanks for the suggestion! I'll put it on my to do list.

  • @donrachelteo9451
    3 years ago

    @@DataProfessor thanks professor

  • @gabrielcornejo2206
    2 years ago

    Great tutorial, thank you very much. I have a question: how could I know which are the best 3 features to use to build the best model with 140 n_estimators?

  • @sudhakarsingha283
    3 years ago

    This is a video with a detailed discussion of hyperparameter tuning.

  • @DataProfessor
    3 years ago

    Thank you for watching!

  • @infinitygeospatial1972
    2 years ago

    Great video. Very Explanatory. Thank you

  • @limzijian98
    2 years ago

    Hi, I just wanted to ask: how do you determine n_estimators for a dataset of 2 million records?

  • @muskanmishra6625
    1 year ago

    Very well explained, thank you so much 🙂

  • @josiel.delgadillo
    2 years ago

    How do you use GridSearchCV with a custom estimator? I can't seem to make it work.

  • @eyimofep
    2 years ago

    Nice tutorial. Now that I've done all this, how can I apply the model, i.e., use what we've done to predict the X_test data, or predict on new data if we create an API?

  • @joseluisbeltramone599
    2 years ago

    Fantastic explanation, Sir (as always). Thank you very much!

  • @DataProfessor
    2 years ago

    You are very welcome

  • @geoffreyanderson4719
    2 years ago

    Thank you for making good content; that is what attracted me to the channel, Data Professor. I say the following only with constructive purpose. There is no signal to find in a random dataset like the one sampled by make_classification. Is this correct? Thus the RF is fitting itself to noise only. It's using completely spurious associations. You would prefer to avoid fitting to noise components in real life as much as possible. Fitting to noise is pure variance error.

  • @cahayasatu9201
    3 years ago

    Thanks for a great tutorial. May I know how to see/identify the 2 features that produce the best accuracy?

  • @DataProfessor
    3 years ago

    Hi, if using random forest, the feature importance plot will allow us to see which features contributed the most to the prediction. The shap library also adds this capability to any ML algorithm.
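    A minimal sketch of that suggestion (not code from the video; the dataset here is an illustrative stand-in), using a random forest's feature_importances_ attribute to rank features:

    ```python
    # Sketch: ranking features of a fitted random forest by importance.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=200, n_features=10,
                               n_informative=5, random_state=42)
    rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

    # Higher importance = the feature contributed more to the node splits
    ranking = np.argsort(rf.feature_importances_)[::-1]
    top3 = ranking[:3]
    print("Top 3 feature indices by importance:", top3)
    # For a model-agnostic alternative, the shap library (e.g. its
    # TreeExplainer for tree ensembles) provides per-prediction attributions.
    ```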

  • @geoffreyanderson4719
    2 years ago

    A thought experiment: If the generating process continued a lot longer and made far more than 200 examples, what would this do to the tuned final model's predictions? I am talking about the model that was developed on the 200 examples. That is, what happens when it is tried on that new data? Keep in mind that sklearn's make_classification() by design produces noise only, no signal.

  • @sofluzik
    4 years ago

    Lovely. How relevant are the confusion matrix, classification report, AUC score, and ROC curve compared with the score mentioned above?

  • @DataProfessor
    4 years ago

    Hi Rajaram, this article does a good job of providing a detailed distinction between the various classification metrics: neptune.ai/blog/f1-score-accuracy-roc-auc-pr-auc

  • @sofluzik
    4 years ago

    @@DataProfessor thank you sir

  • @nibrad9712
    16 days ago

    Why did you choose max_features to be 5 while setting n_estimators to 200? More specifically, how do I choose these parameters?

  • @budoorsalem1168
    3 years ago

    Thank you for your great video. Have you done hyperparameter tuning for different algorithms like decision trees, ANNs, or GBR?

  • @DataProfessor
    3 years ago

    The first step is to figure out which hyperparameters you want to optimize. You can do that by going to the API documentation, looking up the algorithm function that you want to use, seeing which hyperparameters there are, and adapting accordingly as shown in this video. For example, in random forest, the 2 hyperparameters that we chose for optimization are max_features and n_estimators. For an ANN, you may choose to optimize the learning rate, the momentum, the number of nodes in the hidden layer, etc.
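    A sketch of that idea: each estimator exposes its own tunable names, which can be checked against get_params() before building a grid. The specific MLP values below are illustrative assumptions, not values from the video.

    ```python
    # Sketch: per-algorithm hyperparameter grids, validated against each
    # estimator's actual parameter names via get_params().
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier

    search_spaces = {
        RandomForestClassifier(): {
            "max_features": [1, 2, 3, 4, 5],
            "n_estimators": [50, 100, 150, 200],
        },
        MLPClassifier(): {                      # illustrative ANN grid
            "learning_rate_init": [1e-3, 1e-2],
            "momentum": [0.8, 0.9],
            "hidden_layer_sizes": [(10,), (50,)],
        },
    }

    # Every grid key must be a real parameter of its estimator
    for est, grid in search_spaces.items():
        assert set(grid) <= set(est.get_params()), est
    ```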

  • @budoorsalem1168
    3 years ago

    @@DataProfessor thank you so much, this really helped me

  • @AbhishekSingh-vl1dp
    1 year ago

    How do we decide how to split the data into the train set and the test set?

  • @DM-py7pj
    1 year ago

    When GridSearch tells you the optimal number of features, is it not important to know which features those are? And what then when, over different runs, you get a different n_features?

  • @SyedZion
    3 years ago

    Can you please explain the same concept with RandomizedSearch?

  • @kailee3491
    1 year ago

    Where can I find the environment requirements?

  • @hejarshahabi114
    3 years ago

    Thanks for your video. I also have a question regarding the max features you mentioned at 11:48. What do you mean by max features? Do you mean the maximum number of independent elements, like x1, x2, ..., xn?

  • @DataProfessor
    3 years ago

    Thanks for watching! Yes, max_features can be set to all of the features. The max_features parameter is what scikit-learn uses to determine how many features to consider when performing a node split. More details are provided here: scikit-learn.org/stable/modules/ensemble.html#random-forest-parameters
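    A small sketch of the distinction (not code from the video; the dataset is an illustrative stand-in): max_features caps the candidate features per node split, while the forest as a whole still sees every feature.

    ```python
    # Sketch: max_features limits candidates per split, not the model's inputs.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=200, n_features=10, random_state=42)

    # Consider only 2 candidate features at each node split ...
    rf_subset = RandomForestClassifier(max_features=2, random_state=42).fit(X, y)
    # ... versus all 10 features at each split (max_features=None)
    rf_all = RandomForestClassifier(max_features=None, random_state=42).fit(X, y)

    # Both forests were trained on all 10 input features
    print(rf_subset.n_features_in_, rf_all.n_features_in_)
    ```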

  • @hejarshahabi114
    3 years ago

    @@DataProfessor thank you very much for your quick response. Please keep making videos on such topics; you are doing great, and I've learned many things from your channel. BIG LIKE

  • @DataProfessor
    3 years ago

    @@hejarshahabi114 Thanks, and greatly appreciate the support 😊

  • @dennislam1501
    10 months ago

    What is the minimum sample size (number of data rows) for decent tuning? 1,000? 10,000? 100,000?

  • @dreamphoenix
    2 years ago

    Thank you

  • @budoorsalem8378
    3 years ago

    Thank you so much, Professor, for this good information; it helped a lot. I am wondering if we can do hyperparameter tuning for random forest regression on continuous data.

  • @DataProfessor
    3 years ago

    Hi, by continuous data are you referring to the Y variable? If so, then the answer would be yes.

  • @budoorsalem1168
    3 years ago

    @@DataProfessor yes, the target (dependent) variable is not categorical; it takes numerical values

  • @DataProfessor
    3 years ago

    @@budoorsalem1168 Hyperparameter tuning can be performed for both categorical and numerical Y variables (classification and regression, respectively).
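    A minimal sketch of the regression case (the dataset and grid values here are assumptions): swapping in RandomForestRegressor lets the same GridSearchCV recipe work for a numerical Y, with scoring defaulting to R^2.

    ```python
    # Sketch: the same grid-search recipe applied to a regression target.
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV

    # Synthetic data with a continuous Y variable
    X, y = make_regression(n_samples=200, n_features=10, noise=0.1,
                           random_state=42)

    param_grid = {"max_features": [2, 4, 6], "n_estimators": [50, 100]}
    grid = GridSearchCV(RandomForestRegressor(random_state=42),
                        param_grid, cv=3)   # scoring defaults to R^2
    grid.fit(X, y)

    print(grid.best_params_)
    ```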

  • @budoorsalem1168
    3 years ago

    @@DataProfessor ok thank you so much

  • @isaacvergara6792
    3 years ago

    Awesome video!

  • @bryanchambers1964
    2 years ago

    I have a very large dataset: 356 columns, which I reduced to 75 using PCA while retaining 99.8% of the variance. I built a clustering model and it works outstandingly well; I identified 3 clusters out of 8 to which potential customers belong. But my machine learning model is garbage: a ROC-AUC score of barely more than 0.5. I am surprised, because if the clustering model works very well, shouldn't the machine learning model work well too? I was wondering if you had any suggestions?

  • @DanielRong795
    2 years ago

    may I ask what's ROC-AUC?

  • @user-ku4vf2mk8t
    3 years ago

    Awesome video thanks

  • @DataProfessor
    3 years ago

    Thank you

  • @guoqiang7215
    3 years ago

    I am working on a spam mail dataset and am now trying to apply hyperparameter tuning to the model.

  • @DataProfessor
    3 years ago

    Thanks for sharing, sounds like an interesting project.

  • @franklintello9702
    2 years ago

    I am still trying to find a tutorial with real data, because these automatically generated datasets are sometimes hard to apply.

  • @MinhHua-zu2pl
    2 months ago

    Please make the screen font bigger. Thank you!

  • @levithanprimal2410
    3 years ago

    How am I watching this for free? Thanks Professor!

  • @DataProfessor
    3 years ago

    Glad it was helpful. Yes, we have free data science content here; I would appreciate it if you shared it with a friend or two 😆

  • @shivamkrathghara3340
    3 years ago

    Why 81k? It should be more than 810k. Thank you, professor.

  • @DataProfessor
    3 years ago

    Haha, thanks for the support!

  • @yingzisilver9085
    1 year ago

    Thank you
