Lesson 3: Practical Deep Learning for Coders 2022
00:00 Introduction and survey
01:36 "Lesson 0" How to fast.ai
02:25 How to do a fastai lesson
04:28 How to not self-study
05:28 Highest voted student work
07:56 Pet breeds detector
08:52 Paperspace
10:16 JupyterLab
12:11 Make a better pet detector
13:47 Comparison of all (image) models
15:49 Try out new models
19:22 Get the categories of a model
20:40 What’s in the model
21:23 What does model architecture look like
22:15 Parameters of a model
23:36 Create a general quadratic function
27:20 Fit a function by good hands and eyes
30:58 Loss functions
33:39 Automate the search of parameters for better loss
42:45 The mathematical functions
43:18 ReLU: Rectified linear function
45:17 Infinitely complex function
49:21 A chart of all image models compared
52:11 Do I have enough data?
54:56 Interpret gradients in unit?
56:23 Learning rate
1:00:14 Matrix multiplication
1:04:22 Build a regression model in spreadsheet
1:16:18 Build a neuralnet by adding two regression models
1:18:31 Matrix multiplication makes training faster
1:21:01 Watch out! it’s chapter 4
1:22:31 Create dummy variables of 3 classes
1:23:34 Taste NLP
1:27:29 fastai NLP library vs Hugging Face library
1:28:54 Homework to prepare you for the next lesson
Many thanks to bencoman, wyquek, Raymond Wu, and fmussari on forums.fast.ai for writing the transcript.
Timestamps thanks to "Daniel 深度碎片" on forums.fast.ai.
Comments: 82
Wow, this guy is a deep learning/ML genius! I've been studying deep learning for 2 months now, and I consider myself quite good at math and coding. I've been looking for an explanation of what is happening under the hood when the model is training - an "explain like I'm 5" type of explanation. But the only things I could find were academic explanations of how a deep neural network trains with matrix multiplication of weight, bias, backpropagation, etc. I've probably watched 30 videos of those that are all copycats of each other, and I think those people don't know what they are talking about, just spitting out what they saw or read in academic papers/courses. This video was an eye-opener; the guy really knows what is happening behind the scenes, and his 30 years of expertise in the field really shows in those simple yet very easy-to-understand explanations. Thank you! 🙏
I’ve watched so many videos, read so many blogs and books, trying to understand what a neural network is and how it learns. You explained it perfectly, making all the words just fit. The meanings become obvious when presented like this, and you did this in 15 minutes 🔥
I "knew" that deep learning models used the sum of wi*xi + b function, I "knew" that it supposedly was used because it was an "all purpose" function, but now thanks to you Jeremy I know WHY it's an "all purpose" function. 10/10 explanation. Math should always be explained like this; it's actually beautiful to see it all unfold.
The quadratic section is a beautifully crafted example. Thanks
@d14drums
1 year ago
yeah that made it fully click for me
I greatly appreciate this effort to uplift the community worldwide
The quadratic example was a really good illustration of how gradient descent works - it is really good for building intuition. Then, the Excel example cements the understanding really well with a solid dataset. This is my favourite of the 3 lectures so far.
Great lesson!! Jeremy deciding to approach chapter 4 differently after seeing many students quit at this point really shows that he cares about students' learning. The effort is greatly appreciated! 🙏
Great foundational lecture. Jeremy has a relaxed, non-intimidating approach that works for me. Brilliant step by step walk into the deep end of the pool without getting us lost or scared :) Thank you for taking the time to put this together.
@howardjeremyp
1 year ago
Glad you enjoyed it!
I had watched hundreds of deep learning tutorials and read too many DL books, yet I couldn't form a clear intuition of what exactly was happening under the hood. Then I watched this video and 29:00 was my aha moment. Suddenly everything fell into place. Thanks Jeremy
I've gone through many great courses in all sorts of subjects, but I think this course might be the best. Kudos for putting this fantastic content out there for free for everyone to learn.
@howardjeremyp
1 year ago
Great to hear!
I couldn't understand why ReLU was needed and now I understand. I'm a programmer and I think this is the DL course for me. The explanation is very easy to understand. Thank you!
For those following along, there was a mistake in the spreadsheet range when calculating total loss, at both 1:14:27 and 1:17:40: it selects from row 662 instead of row 4. The correct solved losses are 0.144 and 0.143.
Amazing talk! Thanks thanks thanks! You're making the machine learning field so much easier to understand, and that's something invaluable.
I had already heard many terms, like loss function, fitting a model, activation function, and ReLU. JH is such an amazing teacher that these things are now crystal clear in my mind. Thank you so much, JH
Unbelievable content! Thanks to all who have made it possible!
This is god-tier educational content, sir. Thanks for sharing it!
Probably the most easy to digest material I've seen on the subject, thank you.
This is mind blowing! Great job explaining all these concepts.
Skip 10 minutes to start the lesson
Thank you so much jeremy for making this course, I am going slow but learning a lot everyday, you are a very patient teacher. Thank you.
New didactic and methodological ideas - like them very much - still a bit rough in execution - but discovers amazing new territory to approach neural networks - deep learning ... well done!
As always, an excellent video Jeremy.
Simply amazing! Excellent lecture.
The excel example blew my mind. Loved this lesson. Thank you.
I was lucky to have good math teachers in high school. Jeremy explaining the concepts reminded me of them. Thanks.
what a great lesson. mind blown! Thank you so much! You are a great teacher!
The explanation of deep learning foundations given here is too good! As Jeremy said, one has to remind oneself: that is it, there is no more.
I am a newbie in machine learning, but the approach you took in this lesson to explain difficult concepts makes it so easy to understand. Great work.
@howardjeremyp
1 year ago
Great to hear!
Wow, great explanation! Thanks!
Thanks Jeremy, great tutorial.
Quadratic example was just superb. 🎉
1:05:02 - "There's a competition I've actually helped create many years ago called Titanic" Biggest flex ever.
i am in love with this course
loved the excelTorch!!
I think one way to address the slow/fast issue is to recognize that it is sometimes both. The parts that need to go faster should go faster, or have unnecessary material trimmed. The parts that are complicated could slow down a bit. Then add a very short, fast "teaching" for each topic, and go into details after each one; a short teaching is not a summary. That way, people who get it can move ahead to the next topic.
great course! so weird that the videos have less than 100k views.
Excellent!
17:27 minor correction: it's error rate going down instead of accuracy
Thanks, Jeremy! Great lecture. I never got into NLP, but now I am understanding it.
@howardjeremyp
2 years ago
Excellent!
@jordankuzmanovik5297
2 years ago
@howardjeremyp Hi Jeremy, you mentioned that there will be a part 2 of this course. When can we expect those videos? Thanks
@mohdsadik1784
1 year ago
@jordankuzmanovik5297 You can see the videos now
great content.
Excellent tutorial! I have one question: in the Excel spreadsheet, why are Parch and SibSp not normalized? Because they are not "big enough" to negatively interfere?
basically we have data, now let's create a general function (from those data) that can kind of produce those data and also predict what the next data would be.
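The idea in the comment above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data points are made up from a known quadratic (3x² + 2x + 1), and the names `quad`, `lr` and the step count are hypothetical choices, not the notebook's code. The lesson uses PyTorch's automatic gradients; here the gradients of the mean squared error are written out by hand.

```python
def quad(a, b, c, x):
    """General quadratic a*x^2 + b*x + c."""
    return a * x**2 + b * x + c

# Made-up "data", generated from known parameters (assumption for illustration).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [quad(3.0, 2.0, 1.0, x) for x in xs]

a = b = c = 0.0   # start from an arbitrary guess
lr = 0.02         # learning rate (hypothetical value)
n = len(xs)

for _ in range(2000):
    # Prediction errors for the current parameters.
    errs = [quad(a, b, c, x) - y for x, y in zip(xs, ys)]
    # Hand-derived gradients of the mean squared error.
    grad_a = sum(2 * e * x**2 for e, x in zip(errs, xs)) / n
    grad_b = sum(2 * e * x for e, x in zip(errs, xs)) / n
    grad_c = sum(2 * e for e in errs) / n
    # Step each parameter a little in the direction that lowers the loss.
    a -= lr * grad_a
    b -= lr * grad_b
    c -= lr * grad_c

# The fitted function can now be evaluated at a point not in the data.
prediction_at_3 = quad(a, b, c, 3.0)
```

With these settings the parameters end up very close to the ones that generated the data, so the fitted function also predicts unseen points.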
Thank you for providing this insightful course, which has been instrumental in cementing my intuition. I have a question regarding the update loop at the 41:30 mark. It appears that there may be a minor oversight. Shouldn't we reset all gradients to zero before each subsequent call of the backward() function? PyTorch, by default, accumulates gradients from previous iterations, which eventually leads to inaccurate gradients.
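The accumulation behaviour described in this comment can be checked with a small PyTorch sketch. It is a toy example, not the lesson's loop: the tensor `p` and the function `loss_fn` are hypothetical stand-ins. For a plain tensor the in-place reset is `p.grad.zero_()`; `zero_grad()` is the corresponding method on optimizers and modules.

```python
import torch

def loss_fn(p):
    # Toy loss whose gradient is known to be 2*p.
    return (p ** 2).sum()

p = torch.tensor([1.0, 2.0], requires_grad=True)

loss_fn(p).backward()
first = p.grad.clone()         # gradient from one backward pass

loss_fn(p).backward()          # no reset: the new gradient is added to the old one
accumulated = p.grad.clone()   # twice the true gradient

p.grad.zero_()                 # reset the stored gradient in place
loss_fn(p).backward()
after_reset = p.grad.clone()   # back to the true gradient
```

Because `p` is unchanged between the calls, the second `backward()` doubles the stored gradient, and resetting it first restores the correct value.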
At 28:40 I believe you run the cell again and it changes the tensors slightly - drove me a bit mad trying to figure out why my results were different.
I just made a NN in Excel. Wow. If you want to predict two different things, do you just have a separate set of weights and Lins for the second item?
👏👏👏 applause from online
So Paperspace appears to not be free. When I try starting a notebook, it forces me to upgrade to $8/month. Is this still the recommended platform? Is it worth it?
@toromanow
1 year ago
Looks like it's not worth it at all. I purchased the subscription only to get an error message that 'The VM I selected is currently not available please select another'. They did show me a list of available VMs. The available ones were at an additional cost of 0.7-3.50 USD per hour. Yes, that's on top of the 8 USD/month subscription.
(around 17:40) Is taking the ratio of the two `error_rate`s standard practice? I find the "30% improvement" statistic a little misleading? The original error rate is 7.2% and the new error rate is 5.6% (rounding of 5.548 but this is a detail). In other words the accuracy goes from 92.8% to 94.4%. This can be seen as significant or not depending on which scale you adopt: a linear or a logarithmic one.
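The arithmetic behind this question can be laid out explicitly. The sketch below uses only the two error rates quoted in the comment (7.2% and 5.548%); the variable names are hypothetical, and reading "30%" as the ratio of the two error rates is an assumption about how the figure was obtained.

```python
old_err = 0.072     # original error rate quoted in the comment
new_err = 0.05548   # new error rate quoted in the comment

# Ratio of the two error rates: the old error is about 1.30x the new one.
ratio = old_err / new_err

# Relative reduction in error: about 23% fewer mistakes.
relative_reduction = (old_err - new_err) / old_err

# Absolute change in accuracy: about 1.65 percentage points.
accuracy_gain = (1 - new_err) - (1 - old_err)

print(f"ratio={ratio:.3f}  relative_reduction={relative_reduction:.3f}  "
      f"accuracy_gain={accuracy_gain:.4f}")
```

The three numbers describe the same change on different scales, which is why the improvement can look large or small depending on the one quoted.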
1:00:50 How did we go from trying to fit a function to computer vision's pixels? The jump from ReLU functions applied to linear functions to speaking about pixels in an image is not clear. Can you please elaborate? Why did you say each pixel will have a variable of its own? What is the mapping from computer vision to function fitting in this context? Why is every single pixel in an image a single variable? What is the rationale?
building a neural net in spreadsheet. Heck yea!
I'm slightly confused about the intuition behind how multiple ReLUs can lead to a squiggly line. Wouldn't it more specifically lead to a line that is always either stagnant or gradually increasing because of how the output must be >=0 ?
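A small sketch may help with this question. Each ReLU outputs a value of at least zero, but its slope can be negative, so a sum of ReLUs can fall and then rise. The function names below follow the style used in the lesson but are written here as plain-Python assumptions, and the slopes and intercepts are made-up values.

```python
def rectified_linear(m, b, x):
    # A line m*x + b with everything below zero clipped to zero.
    return max(m * x + b, 0.0)

def double_relu(m1, b1, m2, b2, x):
    # Sum of two rectified lines.
    return rectified_linear(m1, b1, x) + rectified_linear(m2, b2, x)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

# First ReLU has slope -1 (active for x < 0), second has slope +1 (active for x > 0).
ys = [double_relu(-1.0, 0.0, 1.0, 0.0, x) for x in xs]
# ys falls and then rises: [2.0, 1.0, 0.0, 1.0, 2.0]
```

Adding more ReLUs with different slopes and intercepts adds more bends. Values below zero become possible once a later layer multiplies the ReLU outputs by negative weights or adds a bias.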
Just a quick question: by "reproduce the code", does it mean that one should be able to write out the code from memory/understanding, knowing all of the parameters within the arguments as well as the defined functions? Of course that would be the best-case scenario, but I feel it would get in the way of moving through the course, as one does not need to be able to reproduce the code perfectly, just understand what the parameters are doing, right?
48:55 the computer draws the owl :)
At 1:14:13 Jeremy describes calculating a loss. Can anyone explain this more, i.e. why subtracting whether the passenger survived (0 or 1) squared from the output of the linear equation for each row equates to a loss or error? It seems arbitrary and I'm not understanding why this is how we judge an error rate.
@curiousboy7015
6 months ago
We want to make the prediction equal to the actual value, so we don't want a large gap between the actual and predicted values. Thus we define the loss as the square of the distance between the actual and predicted values (squaring makes the loss grow faster when the distance is large). Now we just have to minimize the loss, which is done by changing the weights and biases.
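The reply above can be made concrete with a short sketch of a mean squared error. The survival labels and the two sets of predictions are made-up numbers for illustration, and `mse` is a hypothetical helper, not a function from the spreadsheet or notebook.

```python
def mse(preds, targets):
    # Average of the squared gaps between predictions and actual values.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

survived = [1, 0, 1]             # actual outcomes (made up)
close_preds = [0.9, 0.1, 0.8]    # predictions near the actual values
far_preds = [0.2, 0.9, 0.1]      # predictions far from the actual values

loss_close = mse(close_preds, survived)   # about 0.02
loss_far = mse(far_preds, survived)       # about 0.75

# Squaring penalises big misses disproportionately:
# a gap of 0.1 contributes 0.01, while a gap of 0.9 contributes 0.81.
```

A lower number means the predictions sit closer to the actual values, which is what makes the quantity usable as something to minimize.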
At 43:00, isn’t there supposed to be abc.zero_grad() to zero out the gradients?
@solaxun
1 year ago
I was wondering the same... otherwise wouldn't each backward call be accumulating progressively larger gradients, from keeping around the prior gradient before the updates occurred?
@greatfate
1 year ago
@solaxun Yup, exactly. It's one of the worst bugs (it's bitten me in the neck several times)
How large a difference between train_loss and validation_loss is acceptable?
❤
Where can I find the walk through of Gradio?
I don't quite see how the Excel example qualifies as a "deep" neural network, since the layers were not stacked on top of each other but added together. The example is still great, though, and I could see how to stack the layers.
@elnur0047
1 year ago
Hi, can you elaborate a bit more on this? How does stacking differ from the approach in the video?
@yaptor0
1 year ago
@elnur0047 Rather than both multiplying the same inputs, the 2nd one would multiply the products from the previous output. I was also a little confused when he just added them up at the end instead of feeding one into the other.
@tungo96
1 year ago
yeah I have exactly the same doubt when I saw that, these are still 2 independent layers.
@lifthrasir1609
1 year ago
Jeremy actually confirms that at 1:16:15
@JayPinho
1 year ago
@yaptor0 How would that calculation work? Doesn't he have to first sum up all the products from a given layer and ReLU them (i.e. take the max of the sumproduct and 0)? If the 2nd layer simply accepted the individual products as inputs, wouldn't this 2-layer network just be a linear function?
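The difference discussed in this thread can be sketched in plain Python. The input row and all weights are made-up values, and `linear` and `relu` are hypothetical helpers. In this sketch each linear model is summed and passed through ReLU before anything else happens, which is the step the last reply asks about; without a nonlinearity in between, two linear layers would collapse into a single linear function.

```python
def relu(z):
    return max(z, 0.0)

def linear(weights, bias, inputs):
    # Sum of products plus a bias (a SUMPRODUCT in spreadsheet terms).
    return sum(w * x for w, x in zip(weights, inputs)) + bias

x = [1.0, 2.0]                  # one made-up input row
w_a, b_a = [0.5, -1.0], 0.25    # made-up weights for the first linear model
w_b, b_b = [-0.5, 1.0], 0.5     # made-up weights for the second linear model

# "Added": both models read the same inputs, each is ReLU'd, and the results are summed.
h_a = relu(linear(w_a, b_a, x))   # -1.25 is clipped to 0.0
h_b = relu(linear(w_b, b_b, x))   # 2.0 stays 2.0
added = h_a + h_b                 # 2.0

# "Stacked": a further layer with its own weights takes the ReLU outputs as its inputs.
w_out, b_out = [1.5, -2.0], 1.0   # made-up weights for the next layer
stacked = linear(w_out, b_out, [h_a, h_b])   # 1.5*0.0 - 2.0*2.0 + 1.0 = -3.0
```

Seen this way, the summed version is the special case of the stacked one in which the next layer's weights are fixed at 1 and its bias at 0.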
=IF([@Embarked]="S" , 1, 0) and other IF statements like this seem not to work for me. Has anyone experienced the same thing?
1:11:41 was a nice contradiction :D
Lesson 1: needing math is a myth. Awesome, let's continue. Lesson 3: here are all these math terms/equations, and you have no idea what they are or what you are looking at. Now I'm overwhelmed and feel defeated.
@curiousboy7015
6 months ago
Try doing at least high school math
I tried to make a Paperspace account and accidentally mistyped the phone verification, so they decided that I'm no longer allowed to verify with my phone number. Disappointing.
@romainrouiller4889
1 year ago
VPN.
I don't even know how to use Excel.