StatQuest: Random Forests Part 1 - Building, Using and Evaluating
Random Forests make a simple, yet effective, machine learning method. They are made out of decision trees, but don't have the same problems with accuracy. In this video, I walk you through the steps to build, use and evaluate a random forest.
NOTE: Random Forests are made from Decision Trees, so if you don't know about those, here's the Quest: • Decision and Classific...
ALSO NOTE: This StatQuest is based on Leo Breiman's (one of the creators of Random Forests) website: www.stat.berkeley.edu/~breima...
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buy The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
KZread Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
0:31 Motivation for using Random Forests
1:17 Step 1, create a bootstrapped dataset
2:23 Step 2, create a decision tree using a random subset of variables at each step
4:00 Step 3, repeat steps 1 and 2 a bunch of times
4:40 Classifying a new sample with a Random Forest
5:41 Definition of Bagging
6:03 Evaluating a Random Forest
8:34 Optimizing the Random Forest
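The classification step in the chapter list above (4:40) boils down to a majority vote across the trees. Here is a minimal sketch in Python, assuming each tree can be represented by a hypothetical function that returns "Yes" or "No" for heart disease. The three stand-in trees and the sample's field names are made up for illustration and are not taken from the video.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Run the sample down every tree and return the label with the most votes."""
    votes = Counter(tree(sample) for tree in trees)
    label, _count = votes.most_common(1)[0]
    return label, dict(votes)

# Hypothetical stand-in "trees": each one is just a rule on a single feature.
trees = [
    lambda s: "Yes" if s["chest_pain"] else "No",
    lambda s: "Yes" if not s["good_circulation"] else "No",
    lambda s: "Yes" if s["weight"] > 180 else "No",
]

sample = {"chest_pain": True, "good_circulation": False, "weight": 168}
label, votes = forest_predict(trees, sample)
print(label, votes)
```

With an even number of trees a tie is possible; this sketch simply returns whichever tied label was seen first, which is an arbitrary choice.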
Corrections:
3:18 I should have said the same feature (or variable) can be selected multiple times in a tree. Every time we select a subset of features to choose from, we choose from the full list of features, even if we have already used some of those features. Thus, a single feature can appear multiple times in a tree.
9:28 I say "square" when I meant to say "square root".
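Both corrections can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names are placeholders, `candidate_features` is a hypothetical helper, and rounding the square root to the nearest whole number is one reasonable choice rather than a fixed rule.

```python
import math
import random

def candidate_features(all_features, rng, n_candidates=None):
    """Pick the candidate features for ONE node, always from the full list."""
    if n_candidates is None:
        # Common starting point: the square root of the number of features.
        n_candidates = max(1, round(math.sqrt(len(all_features))))
    return rng.sample(all_features, n_candidates)

features = ["chest_pain", "good_circulation", "blocked_arteries", "weight"]
rng = random.Random(0)

# Three nodes of the same tree: each draw ignores what earlier nodes used,
# so the same feature can be offered (and chosen) again further down.
for node in ("root", "left child", "right child"):
    print(node, candidate_features(features, rng))
```

With 4 features each node gets 2 candidates, and with 25 features it would get 5.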
#statquest #randomforest #ML
Comments: 1,300
Corrections: 3:18 I should have said the same feature (or variable) can be selected multiple times in a tree. Every time we select a subset of features to choose from, we choose from the full list of features, even if we have already used some of those features. Thus, a single feature can appear multiple times in a tree. 9:28 I say "square" when I meant to say "square root". Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
@monishaathikesavanpremalat7587
4 years ago
StatQuest with Josh Starmer - Can you explain more on this? I mean, how does it work when the same variable is selected more than once? Let's say, for example, our root node is Good Blood Circulation: if it is true we go to the left node, and if it is false we go to the right node. Then let's say the internal node on the right side is Chest Pain, and the next internal node below it is Weight. Then, as you said, we randomly select Good Blood Circulation again. How does that work, given that the right-side nodes already have bad blood circulation? Selecting a variable multiple times makes sense when it is continuous, because it can be used again with a different threshold value, but how does it work with a categorical variable?
@lprashanthi7298
4 years ago
For the same bootstrap data set we can make different trees with randomly picked variables?
@statquest
4 years ago
No, you only make one tree per bootstrapped dataset. What I was trying to say was that all columns/variables/features in that bootstrapped dataset are considered at each node.
@statquest
4 years ago
It’s true. There’s no use in visiting a boolean variable more than once. However, as a general rule, all variables, at least in theory, can be visited more than once in a single tree. The actual implementation may optimize this by omitting boolean variables if they have already been visited.
@lprashanthi7298
4 years ago
@@statquest So here 2 variables are picked at random at each node, without taking Gini into account? Is that right?
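On the question above: in the usual random forest procedure, the random draw only narrows down the candidates, and the split is then chosen among those candidates by impurity. Here is a minimal Python sketch of that idea, assuming boolean features, a binary label and Gini impurity as the score. The four rows, the feature names and the helper functions are all made up for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of binary labels (True/False)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p**2 - (1.0 - p)**2

def split_impurity(rows, feature, target):
    """Weighted Gini impurity after splitting rows on a boolean feature."""
    left = [r[target] for r in rows if r[feature]]
    right = [r[target] for r in rows if not r[feature]]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def choose_split(rows, features, target, n_candidates, rng):
    """Randomly pick candidates, then keep the one with the lowest impurity."""
    candidates = rng.sample(features, n_candidates)
    best = min(candidates, key=lambda f: split_impurity(rows, f, target))
    return candidates, best

# Made-up toy data; "hd" stands for heart disease.
rows = [
    {"chest_pain": True,  "good_circulation": False, "blocked": True,  "hd": True},
    {"chest_pain": True,  "good_circulation": True,  "blocked": True,  "hd": True},
    {"chest_pain": False, "good_circulation": True,  "blocked": False, "hd": False},
    {"chest_pain": False, "good_circulation": False, "blocked": True,  "hd": False},
]
features = ["chest_pain", "good_circulation", "blocked"]
candidates, best = choose_split(rows, features, "hd", 2, random.Random(0))
print(candidates, "->", best)
```

In this toy data `chest_pain` separates the labels perfectly, so it wins whenever it is among the candidates. When it is not drawn, the best of the remaining candidates is used instead.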
Please continue what you're doing. KZread users are blessed to have you here. Your contents are not long, right to the point, clear and your way of teaching is amazing. If there will be a heaven you will be right in the middle of it.
@statquest
3 years ago
Thank you very much! :)
@2DReanimation
3 years ago
Beats the Wikipedia articles on these subjects by miles! Perhaps if I was more versed in math, I could just read it like the daily newspaper, but as it is, having this clear visualization and examples is a waay superior learning method!
This video is so well made and so well explained that I comprehend the subject matter even while sitting here drunk AF at my keyboard... *hic*
@statquest
4 years ago
Dang! :)
@hichamsalimlyoussi8539
4 years ago
Well said ! Understanding a subject is more relying on the teacher's skills to explain clearly than to the student's efforts to trying to understand.
Your "BAM!!" always bring me back to the reality when, eventually, my brain takes a nap during the class haha. It makes me go back and rewatch the issue. I really appreciate your art, sir.
@statquest
3 years ago
Awesome! :)
@DEEPAKSV99
2 years ago
Very true. He is so good at teaching!
I'm studying Data Science at MIT, you really can't imagine Josh how much StatQuest is helping me, and a couple more channels, before I start any topic I like to tackle it first or just take a general idea, and you can't imagine how much your videos helped! Short, concise, and to the point! I even started to like mice man for god sack! Thank you Josh 🙂
@statquest
2 years ago
Thank you! I'm glad my videos are helpful.
@jarosawszyc8287
1 year ago
Hello, can you share what other channels you watch? I'm a student too, and I wish to gain as much knowledge before looking for a job. Have a wonderful day!
@mosama22
1 year ago
@@jarosawszyc8287 No one is perfect in everything! Look up by the topic, not by the channel, you'll ALWAYS find someone who explains what you looking for in a very simple and very straight forward way.
@mosama22
1 year ago
@@atharvambokar573 Massachusetts Institute of Technology 🙂
@mosama22
1 year ago
@@atharvambokar573 I wanted to change my career to a Data Analyst, and got accepted for post graduate at MIT. I literally started from scratch. Plus what is wrong in creditting people for their efforts.
I have no background in Machine Learning and was attempting to understand random forests. Watched a bunch of videos but this was the one that actually made it clear to me. Thanks a lot!
@statquest
3 years ago
Glad it was helpful!
"Sir, after your examination we found that you've got heart disease. BAAM"
@phan9995
4 years ago
"All 100 trees of the random forest predicted that you have a heart disease. HOORAY! "
@user-tw4br4ti1h
4 months ago
It is actually "BAAM?" :D
@jayeshparmar2603
1 month ago
@@phan9995Double baam!!
Your videos are unbelievably simple to follow and intuitive. You know that someone is a master of their craft when they can explain it to someone else in a concise and easy-to-understand way. Well done, and thank you for all of your videos!
@statquest
1 year ago
Wow, thank you!
This is absolutely awesome and so clearly explained. I wish all textbooks were like this. Keep it up !!
@2DReanimation
3 years ago
Indeed! Wikipedia is usually a way to get the information more simply than textbooks, but they bombard you with math formulae right from the beginning on these subjects! ^^ So this is so much more intuitive!
Words cannot express how much I love StatQuest!!!!!! Thanks Josh for the amazing videos! I'm definitely gonna recommend StatQuest to my classmates!
@statquest
2 years ago
Thank you very much! :)
I was required to give a PPT on random forests in my class, and after going through such a lucid video I am feeling great. Thanks StatQuest
@statquest
4 years ago
Hooray! :)
@knowledgeexchange428
4 years ago
dont forget the BAAMMM!!!
Your service to the data science community is very very much valuable...!!! Your videos on basics leaves a long lasting memory..!! Thank you..!!
@statquest
2 years ago
Thank you!
I love your content! It's super helpful to have all the information presented verbally and written at the same time. I usually use captions, but those often cover up the content which is annoying and the way you do it is so much better. Plus your intros make me giggle. Thank you so much!
@statquest
1 year ago
Hooray! I'm glad you like my style! :)
One of the best tutorial channels. Keep up the good work and helping us all who are in need.
@statquest
5 years ago
Thank you so much!!! :)
Isn't it 'square root' of the number of variables at the end, instead of 'square'?
@statquest
6 years ago
You are correct. I made a slight mistake.
@AshisMohanty87
5 years ago
Absolutely, thats what I was breaking my head to understand and thankfully got it clarified from this comment !!!
@rajeshs2840
4 years ago
Yes You are absolutely Correct..
@ugomuohtochukwu1105
4 years ago
I was just about to write up this query. Absolutely correct
@MikhailGavryuchkov
4 years ago
@@statquest I am pretty sure StatQuest will be canonized as a "Video Bible of Machine Learning" some time soon. Sooooo, any slight mistake will have a probability (or likelihood?) of huge changes in the course of humankind history. :))
sir, you are always so helpful, when I have problems, the first thing came to my mind is to search the relative topic in your videos, looking forward more videos from you, thank you very much!
You're the best! Out of many tutorials on Internet...this best explains Random forest and its working! Thank you so much :)
the presentation is so simple and adorable, and what's more the topic is truly clearly explained! Thanks!! I love this animation and color choice
@statquest
3 years ago
Thank you very much! :)
Can I say BAAM in an interview after explaining what is Random Forests?
@imrankhanissm
4 years ago
Yes but then interviewer will also say DOUBLE BAAM after kicking you out
@prat-man
4 years ago
@@imrankhanissm XD
@shihyusung
4 years ago
@@imrankhanissm lmao
@roh_95
4 years ago
@@imrankhanissm 😂🤣😁
@amrutajahagirdar7438
4 years ago
haha lol
Wow! Thank you Josh, only after watching your videos about decision tree and random forest, I really get the concepts of them!
@statquest
5 years ago
Hooray!!! I glad you like the videos and that they were helpful! :)
i searched for so many tutorials but no one explained it like you. Thank you!!
@statquest
3 years ago
Thanks! :)
Your videos deserve a million kudos. They should be a legit applied statistics class for biologists like myself. Please continue your amazing work! I invite everybody to support Josh with as little as $1 a month.
@statquest
4 years ago
Thank you very much for supporting StatQuest! It means a lot to me.
I must say,this is the best video! You made it so easy to understand. And the way you explain it is perfect!
@statquest
5 years ago
Thank you!!! :)
You are the only one who clearly explains how the *size of bootstrap* can be the same as the *size of the original train data* . Thanks!
@statquest
6 years ago
You're welcome!!! Hooray! I'm glad it was clearly explained. :)
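The point about the bootstrap size can be checked with a short Python sketch. It assumes rows are identified by their index, and `bootstrap_indices` is a hypothetical helper name. Because rows are drawn with replacement, the bootstrapped dataset can be as large as the original while still leaving some rows out.

```python
import random

def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices WITH replacement from range(n_rows)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

rng = random.Random(42)
n_rows = 1000
picked = bootstrap_indices(n_rows, rng)

unique = set(picked)
out_of_bag = set(range(n_rows)) - unique

print("bootstrapped size:", len(picked))       # same size as the original
print("distinct rows used:", len(unique))
print("out-of-bag rows:", len(out_of_bag))
```

For large datasets roughly a third of the rows (about 1/e, or 36.8%) are expected to end up out-of-bag, which is what makes the out-of-bag evaluation possible.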
Thank you, Josh, for making things much easier. This 10 minutes tutorial is almost tantamount to reading one whole book about random forest decision tree. I heard the piece at 9:39 as '.... square root of the number of variables...."
@statquest
4 years ago
Thanks! Yes, I made a typo at the end of the video and it is mentioned in a pinned comment.
I've seen multiple videos and articles explaining Random Forests and I must say this is the best so far. Awesome work!
@statquest
5 years ago
Thank you very much!!! :)
Heavily relying on these videos to understand the techniques relevant for my Master Thesis in experimental particle physics. Thank you so much, you are the best!
@statquest
3 years ago
Thank you! I'm glad the videos are helpful. :)
@KamiK4ze
1 year ago
I'm also using this for my Master Thesis :D
This is the best video I've ever seen to explain Random Forest! Thanks so much for making this!! Please keep making videos! Love the humor also :)
@statquest
5 years ago
You're welcome!!! I'm glad you like the video (and my silly jokes)! :)
These graphical examples are awesome. I love your videos!!
I really like the pace of these. It helps me get around the terms and have chance for my brain to keep up
@statquest
4 years ago
Awesome! :)
Very good approach, genius simplicity and to the point, helps anyone learn (even me). I even stopped skipping the "music" to pay my respects to you Josh!
@statquest
5 years ago
Thanks so much! I love that you put "music" in quotes. That cracked me up. As long as I get to have my fun, I'm happy to make these videos. ;)
You're so cool Josh, by only first listening to your song, I knew you'd nail this topic just as usual. You're much appreciated Sir Starmer
@statquest
2 years ago
Thanks!
This is one lovely presentation on Random Forests! Thanks a ton for making it easy for us to understand.
@statquest
3 years ago
Thank you very much! :)
Great tutorial, as always!
easily the best channel for data science/machine learning. respect
@statquest
4 years ago
Thank you very much! :)
@simonweppe7131
1 year ago
@@statquest agreed 100%
Thank you very much for this video! It was a great use of visuals to explain the progression of various aspects of this topic (e.g. eventually using the out-of-bag samples to calculate accuracy)!
@statquest
3 years ago
Glad you enjoyed it!
I just love your videos. I'm attending a bootcamp about Data Science, and for all the concepts that the instructor doesn't explain in a way I understand, I come to your channel to learn them better. Thanks man ❤❤
@statquest
9 months ago
Glad to help!
Man you're a legend. LOVE Your content especially the music.
@statquest
3 years ago
Glad you enjoy it!
The "Oh no! Terminology/Jargon alert !" always gets me
@statquest
3 years ago
:)
These videos are so amazing, I absolutely love them! Thank you so much!
@statquest
5 years ago
Hooray!!! I'm so happy to hear that you like the videos! :)
Really good tutorial. It's always the best to explain using real examples. Good job!!!
Great video. Very simple and easy to understand. Nice job!
@statquest
5 years ago
Thank you! I'm glad to hear you like the video. :)
Everytime he starts "Bo pi do pi do pi doo" i inadvertently start laughing
I found this on top of the list while searching and now I don't need any other video to understand it. Thanks
@statquest
2 years ago
Glad it helped!
I love StatQuest. You really make statistics enjoyable. BAM!!!!!!!! Thank you for your time and effort in making these videos. You have helped out a lot of students. Please keep up the good work. Keep them coming.
@statquest
4 years ago
Thank you very much! :)
@L.-..
4 years ago
@@statquest hello there.. could you please clarify the doubt that was asked in your pinned comment.. about choosing the same features again at different level in the decision tree..
In grad school and am planning on applying the RF method to my air quality data. Your videos have been a life-saver!
@statquest
4 years ago
Awesome and good luck! :)
@MrSpiritmonger
4 years ago
@@statquest Hi, I am doing my PhD, I am inspired to do a project just because I learned so much from your video. I learn more from one single video than an entire year of my coursework.
I'm in grad school and I'm supposed to "quote" literature, yet I just watch StatQuest's videos because they are easy and fast.
@statquest
4 years ago
BAM!
Thank you for another great video, i love how you are to the point and explain everything in such an easy way
@statquest
2 years ago
Glad you like them!
This is the best video that very well explains random forests.Very Helpful!
@statquest
3 years ago
Thank you! :)
After learning so much from your videos and then listening to your beautiful sentimental songs makes me cry! ;)
@statquest
4 years ago
Thank you very much!!!! :)
You are a boon to humanity...Hope you make a lot of money:)
@statquest
4 years ago
That would be awesome! Maybe one day that will happen. :)
Just discovered your channel. I really like your work. Love the funny moments you add in it. Thank you for clarifying those concepts!
@statquest
2 years ago
Awesome, thank you!
This makes it so easy. I understand it well enough to explain it to others. Thanks.
@statquest
3 years ago
Thank you!
This video is totally "Out-of-Boot" among all other Random Forest videos on KZread 😀
@statquest
3 years ago
BAM! :)
Thank you so much StatQuest. In my "quest" to build a random forest from scratch in C++, I found these resources soooo useful, including the video on the decision tree classifier.
@statquest
4 months ago
Glad I could help!
U have got great skills in explaining concepts. Thank you!
In evaluating the random forest, you mention using the trees to classify samples that were not used to build them. As a result, the number of trees used to test an out-of-bag sample differs from one sample to another, doesn't it? E.g. one sample can be tested by 4 trees (there are 4 trees that were built without that sample), while another sample can be tested by 10 trees.
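The situation described in this question can be simulated directly. The Python sketch below is an illustration, not the video's own procedure: it draws a bootstrap for each tree and counts, for every sample, how many trees never saw it. `oob_tree_counts` and `oob_vote` are hypothetical helper names, and no actual trees are trained.

```python
import random

def oob_tree_counts(n_samples, n_trees, rng):
    """For each sample, count the trees whose bootstrap did NOT include it."""
    counts = [0] * n_samples
    for _ in range(n_trees):
        # Row indices that made it into this tree's bootstrapped dataset.
        in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
        for i in range(n_samples):
            if i not in in_bag:
                counts[i] += 1
    return counts

def oob_vote(votes):
    """Majority label among the trees that never saw the sample (None if there are none)."""
    if not votes:
        return None
    return max(set(votes), key=votes.count)

counts = oob_tree_counts(n_samples=20, n_trees=10, rng=random.Random(1))
print(counts)  # one entry per sample; the entries generally differ
print(oob_vote(["Yes", "No", "Yes"]))
```

Each sample is voted on only by the trees for which it is out-of-bag, and the out-of-bag error is the fraction of samples whose majority vote disagrees with the true label. Ties in `oob_vote` are broken arbitrarily in this sketch.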
my project is about using random forests on accelerometer data from goats, but Josh is the real goat
@statquest
4 months ago
bam! That sounds like a pretty cool project!
@aristide_F
4 months ago
Oh Really, This is somewhat what i am working on. I am studying different speedup processes and strategies and I am specifically working on stroke prediction..
@big_snake431
4 months ago
@@aristide_F Nice. I am working on an activity tag to classify behavior automatically without observing the animal directly.
Great job StatQuest! I am in love with this channel.
@statquest
5 years ago
Hooray!!! I’m so happy to hear that you like StatQuest!
Probably the best video for a beginner on the internet. Thanks!
9:36 Typically we start by using the square [root?]
@nataliag5045
3 years ago
Was wondering the same thing. If we have 5 features we can't bootstrap with 25. It makes more sense if we have 25 variables, to start with 5 for bootstrapping.
Finally found an ML tutorial without accent.
@digit432
5 years ago
He has an American accent
@fiddinyusfida5356
5 years ago
@@digit432 lol
Oh good..! I understood random forest concept after watching this video. Thanks so much
@statquest
4 years ago
Hooray!! I'm glad the video was helpful. :)
I like this, quite clear explanation and hit the points we need to know. Thanks.
@statquest
3 years ago
Thank you! :)
3:37 looks like a warning on a cigarette pack
Double out-of-bam
@statquest
5 years ago
I like it! :)
Hey Josh, your videos are very interesting and easy to understand. Keep up the good work, man. The gini impurity could be the 'Gain Ratio Impurity'. The Information Gain Ratio is an important parameter in decision tree learning, so seems plausible to me.
best explanation so far on KZread. thank you!!
@statquest
4 years ago
Thank you! :)
There are 7 idiot heaters who down-voted this amazing video... They probably live in pain!!
@statquest
5 years ago
:)
This patient has heart disease. BAM!! Congrats
@statquest
4 years ago
:)
OMG you saved my day ! I am currently preparing a phd exam, and I was lost about random forest, now I can easily breath thanks to your amazing video ! Thanks a lot
@statquest
2 months ago
Good luck!
This was such an awesome and a cool video !! Found this helpful :) Kudos for making this video, and hope you continue with your Quest !!
@statquest
3 years ago
Thank you! Will do!
Respect, the way you have presented Random Forest
@statquest
5 years ago
Thank you! :)
I enjoy the tutorial and your album a lot. Math + music, what a combination. thanks for the content this helps me more than my couple hundred dollar textbooks.
@statquest
2 years ago
Thank you very much!!!! :)
I'll keep saying stat-quest is the best stat tutorials channel ever
@statquest
6 months ago
bam!
I love the way you explain. so clear and helpful ! Thanks
@statquest
5 years ago
You're welcome!!! I'm glad to hear you like this video. :)
This guy is a living legend, giving us quality content for free
@statquest
1 year ago
Thank you!
You are the cutest stat teacher i have came across in my life . Lots of LOVE and Respect.
@statquest
1 year ago
Wow, thank you!
Thank you for sharing this easy way to understand RF!
@statquest
4 years ago
Thanks! :)
Excellent! You definitely have to continue making educational videos. Congratulations for your work.
I trust the information you provide so much, to an extent that I press the like button even before I watch the video
@statquest
4 years ago
Awesome!!!
Your songs are also amazing. I loved the last album you shared on Bandcamp. It motivates me to continue my passion for music as well as my studies in Computer Science.
@statquest
3 years ago
Happy to hear that!
It just hit me that its called random forest because its a forest of random trees. This channel is pure gold for bioinformatics students
@statquest
2 years ago
Thanks!
thank you very much for this well-crafted and excelllent video!! finally understand bagging
@statquest
2 years ago
bam!
I can't wait to work through all your videos!
@statquest
2 years ago
Hope you enjoy!
Wow man,, best explanation on earth,, we love you,, I watched and read alooooot about it but was ending up more confused,,, I wish your way of explanation becomes the standard
@statquest
4 years ago
Thank you very much! :)
Bam Bam Double Bam!!! Awesome Tutorial Videos and very helpful!!
I can't be any happier lol! Thanks a ton for the vid, the explanation is awesome!
@statquest
4 years ago
Hooray! :)
Excellent tutorial! His tutorials are underrated.
@statquest
6 years ago
Thank you!! I'm glad you like this one! :)
I love your opening song, it makes the content less stressful!
@statquest
2 years ago
bam!
You guys are amazing!!!! So clear and easy to understand especially for me who hate math!!! Thank u!!!
@statquest
2 years ago
Thanks!
The best explanation for Random forests!
@statquest
3 years ago
Thank you! :)
Oh my god you made it easy to understand. Please keep this process.
@statquest
2 years ago
Thanks!
I am loving all these videos. The clarity and simplicity in explaining the concepts. Just want to highlight a small error. I guess it is the square root of number of variables (and not the square) that are typically started with.
@statquest
1 year ago
Thanks! That error is mentioned in a pinned comment.
I must say I went through a few mathematical books in order to understand it, but this explanation is just BAMMMM!! :)
@statquest
5 years ago
Hooray! :)
I read lots of blogs on the Medium app, and none of them explain it as well as you do, Mr. Starmer. Thanks for your effort to make it so easy
@statquest
4 years ago
Glad it was helpful!
@nmana9759
4 years ago
I wish more medium writers explain more in detail just like statquest's videos
Your channel has saved my research career
@statquest
5 years ago
Hooray! :)
I can’t thank you enough!!! Your videos saved me 🫶🏽🫶🏽🫶🏽
@statquest
1 year ago
Thank you!
Great explanation, made concepts simple and easy to understand!
@statquest
3 years ago
Thank you!
Thank you, you r great. I really enjoy your music and tutorial
@statquest
6 years ago
Thank you so much!!! :)