PyTorch Tutorial 14 - Convolutional Neural Network (CNN)

New Tutorial series about Deep Learning with PyTorch!
In this part we will implement our first convolutional neural network (CNN) that can do image classification based on the famous CIFAR-10 dataset.
We will learn:
- Architecture of CNNs
- Convolutional Filter
- Max Pooling
- Determine the correct layer size
- Implement the CNN architecture in PyTorch
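The "correct layer size" item comes down to the usual output-size formula for a convolution or pooling layer, (W - F + 2P) / S + 1, with input width W, filter size F, padding P and stride S. A minimal sketch in plain Python, assuming the layer settings quoted in the comments below (Conv2d(3, 6, 5), MaxPool2d(2, 2), Conv2d(6, 16, 5) on 32x32 CIFAR-10 images). The helper names out_size and trace_shapes are hypothetical and not part of PyTorch:

```python
def out_size(w, f, p=0, s=1):
    """Spatial output size of a conv or pooling layer: (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1


def trace_shapes(channels=3, size=32):
    """Follow a (channels, height, width) shape through the tutorial's layers."""
    shapes = [(channels, size, size)]
    # Each entry: (kind, output channels or None, kernel size, stride)
    layers = [
        ("conv", 6, 5, 1),
        ("pool", None, 2, 2),
        ("conv", 16, 5, 1),
        ("pool", None, 2, 2),
    ]
    for kind, out_channels, kernel, stride in layers:
        size = out_size(size, kernel, p=0, s=stride)
        if kind == "conv":
            channels = out_channels  # only convolutions change the channel count
        shapes.append((channels, size, size))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)

c, h, w = shapes[-1]
flat_features = c * h * w  # input size of the first fully connected layer
print(flat_features)
```

Pooling only shrinks height and width, which is why the second convolution still takes 6 input channels and the first linear layer takes 16 * 5 * 5 inputs.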
Part 14: Convolutional Neural Network (CNN)
Official website:
pytorch.org/
Part 01:
• PyTorch Tutorial 01 - ...
More about CNNs:
deeplizard channel: • Convolutional Neural N...
Stanford Lecture: • Lecture 5 | Convolutio...
cs231n.github.io/convolutional...
machinelearningmastery.com/co...
Code for this tutorial series:
github.com/patrickloeber/pyto...
You can find me here:
Website: www.python-engineer.com
Twitter: / patloeber
GitHub: github.com/patrickloeber
#Python #DeepLearning #Pytorch

Comments: 185

@normalperson1130 4 years ago
Dude. Please continue to upload. Ik you don't get that many views. But there is a shortage of PyTorch videos and your videos are helpful for me. I hope the algorithm kicks in and your videos are suggested to more people.
@patloeber
4 years ago
Thank you! Yes I will continue :)
@ChaojianZhang
3 years ago
Honestly, this is the best I have seen on CNN so far. Short and concise. Clear and straightforward.
@ChaojianZhang
3 years ago
Serves as a good reference video for the programming aspect. Some of the convolution math stuff is clearly skipped in this video.
@H3K36ME3
3 years ago
This is an amazingly useful channel, thanks for your awesome work!
@ngunyi101
1 year ago
you said he's not getting that many views? :D how times change. consistency is key
@juvanthomas7022 4 years ago
This series of tutorials is my foundation in PyTorch. These tutorials stand above all the others I have watched. Thank you very much, author. Waiting for more uploads. :)
@patloeber
4 years ago
Thank you so much for the feedback! I'm really glad that you like it and it is helpful!
@iEdp526_01 1 year ago
Hey, just wanted to let you know how much these videos helped me. I started working to learn ML three years ago and now, as I'm about to graduate, have come to the point of independently building and training nets for my Undergrad Senior Project. I don't think I ever would have gotten off the ground if not for these and even now reference them when I'm starting with new types of nets or data prep. Thanks for all the time and effort you put into these.
@Hazarth 4 years ago
Your videos are hands down the best step-by-step explanation of PyTorch, machine learning and the math behind it! I'm very thankful that you make this series, you're amazing and I wish you a great day!
@patloeber
4 years ago
Thank you so much :)
@scoburto1 3 years ago
Especially liked the explanation of how the size of the torch tensor changes through the layers of the ConvNet. Thanks for sharing!
@patloeber
3 years ago
Thanks, happy to hear that!
@ferencfeher7094 3 years ago
This equation saved me. I am literally in a masters program and I was struggling with getting the right number of dimensions. Not anymore thanks to you!!
@patloeber
3 years ago
Glad to hear that :)
@conlanrios 6 months ago
Thank you! This finally helped me understand what was going on between convolutional layers.
@sanderg9106 3 years ago
I am starting with pytorch and this video saved me from anxiety and despair :)
@aidankennedy6973 3 years ago
This, as with all videos on this channel, needs more views. Every time I need to learn something on ML, this channel has the best and most enjoyable videos.
@patloeber
3 years ago
thanks so much :)
@MarcinAKGaming 3 years ago
Awesome tutorial. Helped me understand so many concepts I need for a college level ML course in 20 minutes. Thanks!
@patloeber
3 years ago
Glad you enjoyed it!
@porkfisher1030 1 month ago
You have saved my Nature Inspired Computing assignment!!! Thank you soooo much! Fantastic demonstration and clarity! Amazing 😆
@saruaralam2723 4 years ago
Your teaching style/flow is great (theory and coding at the same time). Kindly upload more on other DL frameworks/platforms like TensorFlow, Keras, etc.
@patloeber
4 years ago
Thank you! I'm glad that you like it!
@jh6643 4 years ago
Great tutorials. Making sure that I don't leave without liking these videos.
@patloeber
4 years ago
thank you!
@summerxia7474 2 years ago
The best CNN python video!!! Thank you so much!!!
@erfanshayegani3693 1 year ago
You are the best considering the strength of explanation!
@user-vp2jc7fi5q 11 months ago
You uploaded 3 years ago and I'm so glad you did; university didn't teach this much, istg. THANKS A LOT!!!! KEEP UPLOADING MORE. And please suggest a toolkit other than CUDA for Intel UHD graphics.
@polouabcoite 3 years ago
Thank you so much! Your videos are helping me a lot. Congratulations!!
@patloeber
3 years ago
thanks a lot :)
@longnguyenhoang764 2 years ago
Your course is saving my life, EVERY SINGLE VIDEO is gold material
@amiprogramming4897
1 year ago
Hey, I just came across your comment on the PyTorch Geometric tutorial lol
@aytida754
1 month ago
@@amiprogramming4897 Hey, I just came across your reply to a comment on the PyTorch Geometric tutorial lol
@twahirabasi9765 3 years ago
The best tutorial! Thank you so much!
@patloeber
3 years ago
Thanks 🙏🏻
@paulntalo1425 2 years ago
Awesome, precise and insightful tutorials, indeed the best about PyTorch and CNNs. Thank you
@patloeber
2 years ago
Glad you like them!
@suryavaraprasadalla8511 2 years ago
keep going. Please continue to upload. Great Content and support.
@BalajiOm 3 years ago
Very helpful! Thanks for the video
@nougatschnitte8403 2 years ago
Writing my Bachelor thesis about this, you are a life saver :-)
@MontanaPreston 3 years ago
Very helpful, thank you!
@yannickleroy7419 2 years ago
Awesome video thank you very much!
@user-pt9lb4rz7u 5 months ago
Thank you for this video!
@tianzongwang6832 4 years ago
Very clear implementation!
@patloeber
4 years ago
Thank you!
@georgianaorbeanu9179 2 years ago
Great video! Keep it up!
@michaelmuolokwu5039 2 years ago
I really love your videos
@parthkandwal8343 2 years ago
Great work Thank you very much
@jieluo3736 3 years ago
your videos are really good, thank you
@patloeber
3 years ago
thanks for watching :)
@summerpiao2299 3 years ago
Hi, I just want to thank you for your work. I think those videos are really helpful to me and we are very appreciative of those. :-D They are really useful and you have a clear explaining structure. Thank you a lot!
@patloeber
3 years ago
Glad you like them!
@fahadaslam820 4 years ago
Do you have an example of a CNN implementation on 1D data? For example, a CNN model for the Wine dataset you used in your tutorial? Thanks!
@user-fk1wo2ys3b 3 years ago
Superb job!
@patloeber
3 years ago
Thank you! Cheers!
@Deathend 2 years ago
Thank you for the tutorial as well as the github. I need to mess around with things to get a solid grasp of them so I greatly appreciate this. :D
@patloeber
2 years ago
Glad I could help!
@TusharFaroque 3 years ago
*Wow, Thanks a lot brother*
@asrafpatoary4127 2 years ago
I am studying at FAU and watching your videos to crack the coding part of DL exam ✌
@sinemozdemir3884 2 years ago
thank you, very good explanation.
@patloeber
2 years ago
thanks!
@anuragshrivastava7855 2 years ago
Please upload more advanced PyTorch videos and projects, and keep doing great work
@mohaiyedin 3 years ago
Great video... Thanks for sharing...
@patloeber
3 years ago
glad you liked it :)
@diegocassinera 1 year ago
Great video. One simple question: you explain very well how the hardcoded values came to be. Could the values for the inner layers (pool, conv2, fc1, fc2, ...) be obtained programmatically from the previous layer?
@saltanatkhalyk3397 2 years ago
Thank you good man
@CPjonesn 2 years ago
Loved the video as always, thank you! Short question: I was wondering how you came (or have been coming) up with the simple CNN architecture(s). Is this, for example, a common vanilla network, or do you maybe have a paper at hand that you use? Would be interesting to know. Thanks ahead - big fan!
@johnparker2486 4 years ago
You are amazing!
@patloeber
4 years ago
thank you!
@user-er3vj8cl9l 1 year ago
wow its very awesome thx :)
@rutvikjaiswal4986 3 years ago
This video really deserves to be on the trending page, sir. Your teaching style is awesome! You are too cool. Thank you for this video. I fell in love with your teaching.
@patloeber
3 years ago
thanks a lot! happy to hear this
@amankushwaha8927 2 years ago
Thanks
@davidwu3247 2 years ago
you are a LIFESAVER
@patloeber
2 years ago
happy to hear that :)
@hjr0021 3 years ago
Please do a video on the implementation of Conv1D for multi class classification.
@anthonynguyen6293 4 years ago
can you explain a little bit more on how you decide the output channel and the kernel size? And also the input/output sizes of the fully connected layers please.
@patloeber
4 years ago
Very good question. The architecture in my video is taken from the popular LeNet-5 network. You can read more here: medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@sergiomurilo758 3 years ago
Dude...can't thank you enough....You saved my life hehe
@patloeber
3 years ago
haha glad to hear that :)
@user-xp4uw2kc3n 5 months ago
Thanks, it was very helpful! If we want to add one more convolution layer, what will its arguments be?
@tumultuousgamer 2 years ago
Could you please clarify why you flatten to columns instead of rows i.e. x.view(-1, 16 * 5 * 5) instead of x.view(16*5*5, -1). In my program, I noticed that there are errors like NaN happening when I flatten to rows (with a higher learning rate of 0.5), rather than columns. Seems like you have done this for some reason, could you please explain it?
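On the view(-1, 16 * 5 * 5) question: a view keeps the elements in row-major order and only regroups them, so the batch dimension has to stay first. A plain-Python stand-in with 2 samples of 6 features each; the regroup helper is hypothetical and only mimics a 2D row-major reshape, it is not a PyTorch call:

```python
def regroup(flat, rows, cols):
    """Row-major regrouping of a flat list into rows x cols, like a 2D view."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


features = 6
# Two samples stored back to back: sample A is 0..5, sample B is 100..105.
flat = [0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105]

# Counterpart of view(-1, features): the row count is inferred, one row per sample.
per_sample = regroup(flat, len(flat) // features, features)
print(per_sample)

# Counterpart of view(features, -1): still a valid shape, but each row is now
# only a fragment of one sample.
fragments = regroup(flat, features, len(flat) // features)
print(fragments)
```

A linear layer expects one row per sample, which matches the first layout.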
@canernm 3 years ago
Hello! Thanks for the videos. Quick question: I've seen people use the methods "model.train()" and "model.eval()". Can you tell me why they are not necessary here? Thank you in advance!
@chandanagrawal2399 4 years ago
Very clear explanation. Please also consider making a tutorial on using GPUs with PyTorch. It would be very helpful.
@patloeber
4 years ago
Thank you! All the code in my tutorials should work on GPUs, too, since we are sending model and tensors to the GPU device if it is available.
@ranjanrajdahal3557 5 months ago
This is outstanding. Does anyone know how earthquake time series data can be used to train a CNN? Any video would help.
@prajganesh 4 years ago
I have a basic question. When forward and backward propagation happen, do they go back and forth any number of times to minimize the loss, or do we need to iterate in a loop? So the training loop is for each image, but then the forward and backward passes run any number of times to optimize, correct?
@patloeber
4 years ago
The training loop runs for the number of epochs we specify. In each epoch we iterate over our data in batches, and for each batch we do one forward and one backward pass.
@ottorocket03 2 years ago
At 14:03, what if we have multiple filters, e.g. 4 filters of size 3 x 3? Does the equation change?
@aomo5293 1 year ago
Hi, thank you for the great video. Have you previously made an example showing how to load images from a local directory plus labels from an extra CSV or pkl file? Thank you
@userwheretogo 2 years ago
What is the difference between view and reshape? reshape was used in FFN video and view is used here. Thanks!
@TheOraware 3 years ago
Thanks for such a detailed video. Why did you choose an output channel size of 6 at 8:32? Is it just arbitrary?
@alperensonmez6875
2 years ago
I couldn't get it either
@anonymousanon4822
2 years ago
Yes, it's arbitrary. Basically, the number of output channels determines how many different convolutional filters are used. More filters allow the neural network to implement, say, a vertical edge filter, one for horizontal edges, two for diagonals, and more. The downside is that more channels mean more weights, which makes the network harder/slower to train.
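To put numbers on "more channels mean more weights": each output channel of a 2D convolution has one filter of shape in_channels x kernel x kernel, plus one bias. A small plain-Python count, assuming the Conv2d(3, 6, 5) and Conv2d(6, 16, 5) settings quoted elsewhere in the comments; conv2d_params is a hypothetical helper, not a PyTorch function:

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters of a 2D convolution with a square kernel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


conv1 = conv2d_params(3, 6, 5)     # 6 filters, each 3 x 5 x 5, plus 6 biases
conv2 = conv2d_params(6, 16, 5)    # 16 filters, each 6 x 5 x 5, plus 16 biases
doubled = conv2d_params(3, 12, 5)  # same first layer with twice the output channels

print(conv1, conv2, doubled)
```

Doubling the output channels of a layer doubles its own parameter count and also widens every filter in the layer that follows.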
@tobi9668 3 years ago
Why do you choose 6 and 16 for the output size in the conv layers? Is this just trying out what works best? I read that when the image has more features the output size should be greater. Is this correct? Would be nice if you did some more content about CNNs or GANs.
@divymohanrai 3 years ago
Really great tutorial. I had a question regarding the values for mean and std(). How did you choose the value of mean to be 0.5 for all channels and the same for standard deviation? Did you precompute it?
@patloeber
3 years ago
This is approximately the mean over each channel of the training data set (yes precomputed).
@R3nxt 4 years ago
Great
@skyacaniadev2229 3 months ago
@patloeber Is it a typo in the learning rate? I used 0.01 (instead of your 0.001), and the accuracy is much better (65%).
@tristanc.6598 8 months ago
Why was the output channel size on the second conv layer 16?
@amareshdhal516 3 years ago
Why is the output channel 6 at 8:35?
@juanete69 2 years ago
Hello. Is a "train_loader" the same as a minibatch?
@elise8619 1 year ago
May I ask why you don't just use nn.Sequential to define the model? It is much more straightforward and easier to read I think. Or perhaps this is a newer feature? Anyway, for anyone interested, I just replaced the class definition with:
model = nn.Sequential(
    nn.Conv2d(3, 6, 5, stride=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Conv2d(6, 16, 5, stride=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120),
    nn.ReLU(),
    nn.Linear(120, 84),
    nn.ReLU(),
    nn.Linear(84, 10),
).to(device)
@epistemicompute 1 year ago
I am confused, why doesn't max pooling change the input dimension of the next convolution layer?
@Chiro13 2 years ago
Hi, does Conv2d include the ReLU activation?
@valarmorghulisx 3 years ago
Hi! Thank you so much for these awesome tutorials. We calculate n_total_steps = len(train_loader). Why is the train loader length 12500? Where did we define it?
@builder_Max
1 year ago
It's defined at train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True). Since batch_size is set to 4 here, the total number of samples (50000) is divided by 4, which gives 12500.
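The same arithmetic as a small plain-Python sketch. batches_per_epoch is a hypothetical helper that mirrors how the length of a DataLoader follows from the dataset size and batch size; drop_last is the DataLoader option that discards a final incomplete batch:

```python
def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the dataset produces."""
    if drop_last:
        return num_samples // batch_size
    # Round up so a final, smaller batch is still counted.
    return (num_samples + batch_size - 1) // batch_size


n_total_steps = batches_per_epoch(50000, 4)
print(n_total_steps)

# With a batch size that does not divide 50000 evenly, the last batch is smaller.
print(batches_per_epoch(50000, 64))
print(batches_per_epoch(50000, 64, drop_last=True))
```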
@sailfromsurigao 3 years ago
Why not use flatten layer?
@prajganesh 4 years ago
Is it possible to show how to train our own images and identify? For fun, I want to load all my local pictures and separate into folder based on the images it sees. Do we have any examples?
@patloeber
4 years ago
Have a look at tutorial 15. There I load saved images from folders.
@igor-policee 2 years ago
Hello! I always look at your work carefully and I want to thank you for what you do! I have one question about the code. Please explain why you use exactly such parameters in: transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)). Thank you!
@patloeber
2 years ago
these are the mean and std dev that were calculated previously from the training data
@tarekradwan8661 3 years ago
when you use transforms.Normalize(.....) shouldn't each channel in the image be normalized to [0,1] before you can set a mean and std of 0.5??
@patloeber
3 years ago
Good point! All torchvision datasets are PILImage images of range [0, 1], so it's already scaled :)
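For reference, transforms.Normalize computes (x - mean) / std per channel. A small plain-Python check of what mean 0.5 and std 0.5 do to values already scaled to [0, 1]; the normalize helper is a hypothetical stand-in for the per-value arithmetic, not the torchvision transform itself:

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-value arithmetic of a mean/std normalization."""
    return (x - mean) / std


low = normalize(0.0)   # darkest possible pixel value
mid = normalize(0.5)
high = normalize(1.0)  # brightest possible pixel value

print(low, mid, high)
```

With these settings the input range [0, 1] is mapped to [-1, 1].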
@nateshtyagi 3 years ago
Excellent work but I want to know will this model work on a dataset that has classes that aren't mutually exclusive? For ex: Street View House Number Dataset (SVHN).
@patloeber
3 years ago
nope for SVHN you have to adapt the model and probably use object detection first, then classify each digit separately
@teetanrobotics5363 4 years ago
you missed RNN and LSTM. But still an amazing playlist
@patloeber
4 years ago
I know. I plan to do them in the future
@iposipos9342 4 years ago
Thanks for your video. I find this a little confusing. Please what is the difference between conv1D, conv2D and conv3D and in what context should we use each of them? thank you
@patloeber
4 years ago
Good question! Most of the time we are dealing with conv2D since our images are most likely 2-dimensional. Maybe this link is helpful : stackoverflow.com/questions/42883547/intuitive-understanding-of-1d-2d-and-3d-convolutions-in-convolutional-neural-n
@iposipos9342
4 years ago
@@patloeber thanks
@moshoodolawale3591 4 years ago
What theme are you using in Visual Studio Code, and do you have any tips and tricks for running the code within your environment in general?
@patloeber
4 years ago
It's the night owl theme. I'm planning to do a tutorial about my vs code setup
@moshoodolawale3591
4 years ago
@@patloeber That would be great
@nicolasgabrielsantanaramos291 4 years ago
Is it possible to use time series as input data? Can you point me to any link to read more about it? And thanks a lot for the class, it helped me a lot.
@patloeber
4 years ago
Sure you can. machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/ towardsdatascience.com/how-to-use-convolutional-neural-networks-for-time-series-classification-56b1b0a07a57
@nicolasgabrielsantanaramos291
4 years ago
@@patloeber thanks!!!!
@MRexlit3 3 years ago
Hi, I am currently doing work which involves creating a CNN. We have to give it an input channel of 3*128*128. Does this just mean I set the Channel parameter in the Conv2d to 3, and the images are 128*128? Or do I need to set parameters as 128 somewhere
@nothinghere3702
3 years ago
Images are 128*128 pixel And 3 indicates it’s a colored images (R,G,B).
@raminessalat9803 3 years ago
Hey, great video! I have a question: what is the reason for the 0.5 normalization in the transform option?
@patloeber
3 years ago
That’s roughly the mean value of the training dataset (which I pre-calculated). Using this will normalize the whole dataset to have the same mean
@raminessalat9803
3 years ago
@@patloeber Great! thanks!
@popamaji 4 years ago
13:20 Why did you increase the number of colour channels, and what does that even mean?
@patloeber
4 years ago
The architecture is taken from the popular LeNet-5. It means we get 6 feature maps as output. medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@chootzesien9315 3 years ago
Hi! May I know why optimizer.zero_grad() is placed before optimizer.step()? In the previous episode it was placed after optimizer.step().
@patloeber
3 years ago
does not really matter as long as it's called before the next iteration
@yashvander-bamel 2 years ago
There is one question though, do we need to keep track of the shapes after each convolution and/or pooling layer? So that we can enter the correct amount of input neurons in the first Linear layer. Isn't there a convenient method for this? BTW Thanks for the awesome tutorial !!
@pratyushsingh7062
2 years ago
Yes, you need to calculate that manually
@yashvander-bamel
2 years ago
@@pratyushsingh7062 Have a look at lazy layers in pytorch. You might want to change your opinion then.
@VarunKumar-pz5si 3 years ago
Why did you normalize the data from [0,1] to [-1,1]?
@aliikram4993 3 years ago
what if i want to do this but with a data which is not one of the torchvision datasets how would I load it then
@patloeber
3 years ago
probably implement your own Dataloader like I explained in lesson 9
@chakra-ai 3 years ago
Hi, a request: can you please add an NLP use case to this series of PyTorch implementations?
@patloeber
3 years ago
Definitely want to do this. For now I already have a chat bot tutorial (4 videos) with PyTorch that teaches some beginner NLP techniques
@theonethatcant 4 years ago
Why do you perform optimizer.zero_grad() before the loss.backward() and optimizer.step()? It conflicts with your previous videos and seems counter-intuitive, as I assumed the backward step uses the gradients resulting from the forward step.
@patloeber
4 years ago
It does not matter if you call zero_grad at the end or at the beginning of the for loop. Just make sure that the gradients are empty before the next backward() call. I should have been more consistent in my code...
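PyTorch adds to a parameter's .grad on every backward() call instead of overwriting it, which is why the position of zero_grad() inside the loop is flexible as long as it runs once per iteration. A toy plain-Python stand-in for that behaviour; ToyParam, backward and train are hypothetical names, and the model is a single weight fitted to y = 2x, not the CNN from the video:

```python
class ToyParam:
    """A single weight with an accumulating gradient, like a tensor's .grad."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(p, x, y):
    # Gradient of (w*x - y)^2 with respect to w, added to the existing grad.
    p.grad += 2 * x * (p.value * x - y)


def train(data, lr, zero_at):
    p = ToyParam(0.0)
    for x, y in data:
        if zero_at == "start":
            p.grad = 0.0            # like optimizer.zero_grad() before backward()
        backward(p, x, y)
        p.value -= lr * p.grad      # like optimizer.step()
        if zero_at == "end":
            p.grad = 0.0            # like optimizer.zero_grad() after step()
    return p.value


data = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0)]
w_start = train(data, 0.1, "start")
w_end = train(data, 0.1, "end")
w_never = train(data, 0.1, "never")  # stale gradients leak into later steps

print(w_start, w_end, w_never)
```

Zeroing at the start or at the end of each iteration gives the same weight; skipping it gives a different one, because old gradients keep adding up.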
@back81192 4 years ago
I was wondering when did you call the forward function that you defined in the class? It seems that you didn't call it...
@patloeber
4 years ago
The forward pass will be executed for you when you call outputs = model(images). For this you have to define it in your model class
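The mechanism behind this is Python's __call__ method: calling the model object dispatches to forward. A simplified plain-Python sketch of that idea; ToyModule and Doubler are hypothetical stand-ins, and the real nn.Module does more around the call (for example running registered hooks):

```python
class ToyModule:
    """Simplified stand-in for nn.Module: calling the object runs forward()."""

    def __call__(self, *args, **kwargs):
        # The real nn.Module also runs registered hooks around this call.
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Doubler(ToyModule):
    def forward(self, x):
        return [2 * v for v in x]


model = Doubler()
outputs = model([1, 2, 3])  # no explicit model.forward(...) call needed
print(outputs)
```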
@back81192
4 years ago
@@patloeber thanks
@HanWang_ 1 year ago
Thank you so much! Everything is so clear. And even though English is not my mother tongue, I can catch up without caption. (*^_^*)
@kerrsv16 days ago
Did you forget a 2nd pooling layer?
@prashantsharmastunning 4 years ago
So we can randomly choose output_channel for each CNN layer?!! Does it affect the accuracy?
@patloeber
4 years ago
Hi! Different architectures of course affect the accuracy. The architecture in this video is taken from the popular LeNet-5. I did not go into much detail when talking about the architecture. If you are interested you can read more here: medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@prashantsharmastunning
4 years ago
@@patloeber thanks this was really helpful..
@lakeguy65616 3 years ago
I followed your code exactly. I trained for 20 epochs and achieved overall accuracy of 63%. So I trained for 100 epochs and the accuracy went down to 60.75%. What accuracy can be achieved? what is the highest accuracy you have reached? thank you for responding.
@patloeber
3 years ago
I used just a basic model in this tutorial. I recommend to follow tutorial #15 and use transfer learning on CIFAR10 and then see how well it performs
@lakeguy65616
3 years ago
I have tested the simple ff model from #13 with different hidden layers 1 through 4 and different numbers of neurons per layer (25 - 3000).
@manalihiremath2805 3 years ago
I am getting this error: Given groups=1, weight of size [20, 15, 3, 3], expected input[32, 3, 256, 256] to have 15 channels, but got 3 channels instead
@patloeber
3 years ago
compare with my code on github. somewhere you have an error with the wrong size
@saurrav3801 3 years ago
Bro, how do I find the standard deviation and mean of the image channels?
@patloeber
3 years ago
This is just the pre-calculated mean and stddev of the training dataset.
@aleenasuhail4309 3 years ago
I have a conv network:
net = nn.Sequential(
    nn.Conv2d(3, 10, kernel_size=5, padding=0),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(10, 16, kernel_size=5, padding=0),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120),
    nn.ReLU(),
    nn.Linear(120, 10),
)
for param in net.parameters():
    print(param.shape)
but I am getting an error when trying to train it. The error is: mat1 and mat2 shapes cannot be multiplied (64x13456 and 400x120). Could you please help?
@robinswamidasan
3 years ago
Clearly the size of the output from the 2nd MaxPool2D is not 16*5*5. What is the size of your input image? It's clear that it has 3 channels, but what is size of the data per channel (e.g. # of pixels: m x n). The input to Linear will depend on this size.
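The numbers in the error message are enough to work backwards. A plain-Python sketch, assuming square input images and the layer settings from the question above; out_size and flat_features are hypothetical helpers, and 128 is only one input size that fits, not something the question states:

```python
import math


def out_size(w, f, p=0, s=1):
    """Spatial output size of a conv or pooling layer: (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1


def flat_features(size, channels=16):
    """Flattened feature count after the two conv/pool stages in the question."""
    size = out_size(size, 5)        # Conv2d(3, 10, kernel_size=5)
    size = out_size(size, 2, s=2)   # MaxPool2d(kernel_size=2, stride=2)
    size = out_size(size, 5)        # Conv2d(10, 16, kernel_size=5)
    size = out_size(size, 2, s=2)   # MaxPool2d(kernel_size=2, stride=2)
    return channels * size * size


reported = 13456                    # from "64x13456" in the error message
side = math.isqrt(reported // 16)   # feature-map side after the second pooling
print(side)

print(flat_features(32))    # what nn.Linear(16 * 5 * 5, 120) was written for
print(flat_features(128))   # a candidate input size that matches the error
```

So the flattened size is 16 * 29 * 29, which fits images of roughly 128x128 pixels rather than 32x32; the first linear layer would need 13456 input features for that data, or the images would need resizing to 32x32.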
@akhileshsingh569 3 years ago
How do you calculate that there are 6 output channels?
@patloeber
3 years ago
the architecture in this video is taken from the popular LeNet-5. I did not go too much into detail when talking about the architecture. If you are interested you can read more here: medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@barath_ 4 years ago
autoencoders bro!