Reinforcement Learning Jump Start | Complete Deep Learning Course

In this complete reinforcement learning course you will learn everything from implementing Double Q Learning and SARSA with just NumPy, all the way up to implementing Deep Q Learning and Policy Gradient methods in TensorFlow.
No prior knowledge is needed, other than basic proficiency with Python.
This was cross-posted with the freeCodeCamp channel, which is an awesome community for developers of all stripes. They are THE premier online bootcamp. You can check them out at
youtube.com/@freecodecamp
#ReinforcementLearning #DeepQLearning #PolicyGradients
Learn how to turn deep reinforcement learning papers into code:
Get instant access to all my courses, including the new Prioritized Experience Replay course, with my subscription service. $29 a month gets you 42 hours of instructional content plus future updates, added monthly.
Discounts available for Udemy students (enrolled longer than 30 days). Just send an email to sales@neuralnet.ai
www.neuralnet.ai/courses
Or, pick up my Udemy courses here:
Deep Q Learning:
www.udemy.com/course/deep-q-l...
Actor Critic Methods:
www.udemy.com/course/actor-cr...
Curiosity Driven Deep Reinforcement Learning:
www.udemy.com/course/curiosit...
Natural Language Processing from First Principles:
www.udemy.com/course/natural-...
Reinforcement Learning Fundamentals:
www.manning.com/livevideo/rei...
Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: bit.ly/3fXHy8W
Grokking Deep Learning: bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: bit.ly/2VNAXql
Come hang out on Discord here:
/ discord
Need personalized tutoring? Help on a programming project? Shoot me an email! phil@neuralnet.ai
Website: www.neuralnet.ai
Github: github.com/philtabor
Twitter: twitter.com/mlwithphil

Comments: 33

  • @MachineLearningwithPhil · 5 years ago

    This content is sponsored by my Udemy courses. Level up your skills by learning to turn papers into code. See the links in the description. Timestamps for all the modules:
    00:00:00 Intro
    00:01:30 Intro to Deep Q Learning
    00:08:56 How to Code Deep Q Learning in Tensorflow
    00:52:03 Deep Q Learning with Pytorch Part 1: The Q Network
    01:06:21 Deep Q Learning with Pytorch Part 2: Coding the Agent
    01:28:54 Deep Q Learning with Pytorch Part 3: Coding the Main Loop
    01:46:39 Intro to Policy Gradients
    01:55:01 How to Beat Lunar Lander with Policy Gradients
    02:21:32 How to Beat Space Invaders with Policy Gradients
    02:34:41 How to Create Your Own Reinforcement Learning Environment Part 1
    02:55:39 How to Create Your Own Reinforcement Learning Environment Part 2
    03:08:20 Fundamentals of Reinforcement Learning
    03:17:09 Markov Decision Processes
    03:23:02 The Explore-Exploit Dilemma
    03:29:19 Reinforcement Learning in the OpenAI Gym: SARSA
    03:39:56 Reinforcement Learning in the OpenAI Gym: Double Q Learning
    03:54:07 Conclusion

  • @alvarorodriguez1592 · 4 years ago

    Hi newcomer! Don't be put off by the 4-hour-long video! It's really just several lessons concatenated, the first one containing a whole program in 52 minutes. You also have Phil's GitHub in the video description if you prefer to study the code and go to the video only when you have a hard time figuring something out. Thank you Phil for such substantial content!

  • @anantasin · 4 years ago

    This is the best lecture on RL ever! Thank you so much!

  • @fadop3156 · 4 years ago

    Thank you so much!

  • @anubhav2198 · 5 years ago

    Thank you sooo much 👍

  • @seth8141 · 5 years ago

    Wow awesome Phil. I'll take a look someday XD

  • @dheerendratomar2725 · 4 years ago

    Hey Phil, I want to thank you for sharing such good content for free. I have one question for you: are you planning to do a series on imitation learning techniques for continuous action and state spaces? An overview of how to approach the task would also be great.

  • @MachineLearningwithPhil · 4 years ago

    I hadn't planned on it, but I can add it to the list.

  • @dheerendratomar2725 · 4 years ago

    @MachineLearningwithPhil Thanks! That would be great of you.

  • @tamirtsogbayar3912 · 1 month ago

    Hello Phil, I appreciate your great videos. I'm planning to develop an AI game bot for Dota 2 based on the method DeepMind used for its StarCraft bot, but I still have no idea how to start or what the components are. Could you help me with that, please?

  • @RedShipsofSpainAgain · 4 years ago

    Hey Phil, do you have a video on how to set up your virtual environment for these tutorials? Conda/pip/gym/PyTorch/TensorFlow, etc. packages, plus a linter and IntelliSense in Visual Studio Code? Thanks

  • @MachineLearningwithPhil · 4 years ago

    I don't, sorry. I run Linux, which puts me in the minority, I think.

  • @RedShipsofSpainAgain · 4 years ago

    This is a great vid Phil, thank you! BTW, at 2:44 I know it's just an example, but are those ballpark salaries accurate? Amazon for $350,000?!?

  • @MachineLearningwithPhil · 4 years ago

    Hah! Nope, I just pulled them out of thin air. Glassdoor indicates starting compensation of around $170,000 with stocks included.

  • @anirban123321 · 5 years ago

    Thanks for the tutorials. They really helped. I saw this tutorial on YouTube and went on to get your intro to RL course at O'Reilly. I am really enjoying the course, especially how you create simple quizzes on the study material, which make it much easier to understand the subject. I have a question regarding the "maze running robot" topic. The actionSpace is {'U': (-1,0), 'D': (1,0), 'L': (0,-1), 'R': (0,1)} and the maze is of size (6,6) in (x,y) coordinates. The "state" variable has (x,y) coordinates in it. However, when we define the function isAllowedMove(self, state, action), we take y, x = state (essentially reversing x and y). I am not able to understand why we need to invert maze[x,y] to maze[y,x]. Regards, Anirban

  • @MachineLearningwithPhil · 5 years ago

    Great question. Sorry for the delayed reply; I was out of town and this comment escaped me. It's because the x coordinate represents the columns and the y coordinate represents the rows. In a right-handed coordinate system, x is on the horizontal axis and y is on the vertical axis; hence, x is the column and y is the row. Since numpy's indexing is row, column, we have to switch the two indices. I really should have used i and j instead of x and y to avoid confusion.
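
    In code, the swap looks something like this minimal sketch (hypothetical maze contents and function names loosely following the course's, not its exact code):

    import numpy as np

    # A hypothetical 6x6 maze: 0 = open cell, 1 = wall.
    maze = np.zeros((6, 6), dtype=int)
    maze[2, 3] = 1  # a wall at row 2, column 3

    # Each action is a (row delta, column delta) pair, as in the course's actionSpace.
    action_space = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}

    def is_allowed_move(state, action):
        # state is stored as (x, y): x = column, y = row.
        x, y = state
        dy, dx = action_space[action]
        new_y, new_x = y + dy, x + dx
        # Stay inside the grid.
        if not (0 <= new_y < maze.shape[0] and 0 <= new_x < maze.shape[1]):
            return False
        # numpy indexes arrays as [row, column], i.e. [y, x] -- hence the swap.
        return maze[new_y, new_x] == 0

    print(is_allowed_move((3, 1), 'D'))  # moving down from column 3, row 1 hits the wall -> False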

  • @Gottii92 · 4 years ago

    I rewatched this video and didn't really understand why we need 2 NNs at 5:55, or what "eliminating bias in the estimates of the actions" means 🤔

  • @MachineLearningwithPhil · 4 years ago

    Good questions! We need 2 neural networks because, if we use 1, we are effectively chasing a moving target: we use the same network to learn the value of states as well as to choose the actions. The bias comes in because we are taking a max over actions, which implicitly biases the estimates.
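
    As a rough PyTorch sketch of that idea (hypothetical layer sizes and a toy td_target helper, not the video's exact code):

    import torch
    import torch.nn as nn

    def make_q_net():
        # Toy Q-network: 4-dim state in, one Q-value per action out (2 actions here).
        return nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

    q_eval = make_q_net()    # online network: picks actions, receives gradient updates
    q_target = make_q_net()  # target network: supplies the bootstrapped targets
    q_target.load_state_dict(q_eval.state_dict())  # start them identical

    def td_target(reward, next_state, gamma=0.99):
        # Keeping q_target frozen between periodic copies stops the regression
        # target from shifting on every gradient step (the "moving target").
        with torch.no_grad():
            # The max over actions is where the overestimation bias creeps in.
            return reward + gamma * q_target(next_state).max(dim=-1).values

    # Every few hundred learning steps, copy the online weights across:
    # q_target.load_state_dict(q_eval.state_dict())

    Double Q Learning refines this further: one network selects the argmax action while the other evaluates it, which damps the upward bias from the max.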

  • @portiseremacunix · 3 years ago

    Great course, though TF is now TF2...

  • @MachineLearningwithPhil · 3 years ago

    Actor critic in tf2 dropping today.

  • @liangyumin9405 · 5 years ago

    Could you tell me your development environment? I use Win10 & Python 3.7 (Anaconda), but I cannot install all the gym environments... [cry]

  • @MachineLearningwithPhil · 5 years ago

    I'm running Ubuntu 18.04 and Python 3.6.7. Which environments are giving you issues?

  • @liangyumin9405 · 5 years ago

    @MachineLearningwithPhil gym does not support Python 3.7 very well...

  • @MachineLearningwithPhil · 5 years ago

    @liangyumin9405 You can do: conda create -n NewEnvironment python=3.6. Then activate the environment and try installing gym to see if it works.

  • @liangyumin9405 · 5 years ago

    @MachineLearningwithPhil Using a py3.6 virtual env may be a good idea~ thank you

  • @Gottii92 · 5 years ago

    Hello, did you look into my problem?

  • @MachineLearningwithPhil · 5 years ago

    I haven't forgotten you :) I'm working on it now. I finished up another project with a DQN and made some improvements that may benefit you. I'll do a new video on that this weekend, and will initiate a pull request on your repo if I get it working.

  • @MachineLearningwithPhil · 5 years ago

    OK, I've gotten it to run on my local machine with some improvements. I've forked the repo and sent in a pull request with some suggestions on how to push the project forward. Let me know what you think!

  • @Gottii92 · 5 years ago

    @MachineLearningwithPhil Hello man, it's really exciting that you looked into my project. As I write this, I'm training it myself. Your answer was very long, so I'll probably have to read it several times while trying different things. Is there some way to chat with you directly, for example on Discord? That would make things easier, if you're not busy. I'll try a smaller action space; I've already tried it with 32x32, since the environment itself is somewhat generic. I might also try totally rebuilding the environment and other ways of approaching my bigger problem. If you've heard of tf.agents, they offer a policy gradient agent; I tried that on my environment but also didn't get a very good result :D I don't fully understand policy gradients yet. Also, to be honest, sorry for my messy structure; I am not very experienced with programming in a team or with git/GitHub. I propose you make a Discord server: there you could probably reach more people, combined with Twitter, your website, and YouTube, and it takes like 2 minutes to set up. It's somewhat unpleasant to write in the YouTube comment section :P If you add me on Discord under the tag "Gotti#0140" I could probably communicate better with you. Thank you for looking into my project and for the pull request!

  • @MachineLearningwithPhil · 5 years ago

    I can set up a discord server, no problem. I'll get to that later this weekend. We can collab on the project and maybe something cool will come of it. Thanks!

  • @Gottii92 · 5 years ago

    @MachineLearningwithPhil nice 😅