I take complex topics in programming, Machine Learning, and Reinforcement Learning and make them simpler for you ;D
I'm a Walking Dead fan (back when it was good), as you can guess from my profile pic, which my wife drew. If my stuff helped you, consider buying me a coffee (or brainz) :D
Comments
When you put query commands from one table into another table, it will show the same data as the first table if you have different columns (like months) with different data but in the same order as the first table. Solution: if you have multiple sheets, change the data source number to 0, 1, 2, 3, 4, and so on, according to the number of sheets or workbooks available.
Oh my god, this helped me a lot
I am doing it in VS Code. I have installed Streamlit, but when running, it gives a "module not found" error.
Which module?
Do you have a Discord or something where we can ask questions? I am getting some very strange outputs for my average reward graph. Additionally, my new best reward is struggling to pass 200. Also, would you be able to post your source code somewhere so we can compare? I just want to see where our code deviates (if at all). Thanks! Great series so far :D Edit: I tweaked my hyperparameters and now I am getting a much better result! I am new to machine learning, so I am getting used to having to play with things like hyperparameters to get better results.
I don't have a Discord. You are welcome to drop your questions here. Glad you are able to make some progress. There is a lot of randomness involved in Reinforcement Learning, which makes it extra difficult. Here is the final code: github.com/johnnycode8/dqn_pytorch
Johnny, you are amazing, seriously. I am so happy I found your channel, it is a gold mine.
thank you, man, really
On what basis do we assign these values to the hyperparameters?
Based on trial and error, or a process called hyperparameter tuning.
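For example, a brute-force grid search is the simplest form of tuning. Here's a minimal sketch; train_and_evaluate is a hypothetical stand-in for your own training loop, which should return the average reward:

```python
from itertools import product
import random

def train_and_evaluate(learning_rate, discount_factor):
    # Hypothetical stand-in: train the agent with these hyperparameters
    # and return its average reward. random.random() is just a placeholder.
    return random.random()

learning_rates = [1e-2, 1e-3, 1e-4]
discount_factors = [0.9, 0.95, 0.99]

best_score, best_params = float("-inf"), None
for lr, gamma in product(learning_rates, discount_factors):
    score = train_and_evaluate(lr, gamma)
    if score > best_score:
        best_score, best_params = score, (lr, gamma)

print(f"Best: lr={best_params[0]}, gamma={best_params[1]}, reward={best_score:.2f}")
```

In practice, libraries like Optuna automate this kind of search more efficiently than a plain grid.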
I am running in a WSL conda env and I am getting this error when I run the same code as you:
```
X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 148 (GLX)
Minor opcode of failed request: 3 (X_GLXCreateContext)
Value in failed request: 0x0
Serial number of failed request: 146
Current serial number in output stream: 147
```
Try my other video that installs Gymnasium on WSL: kzread.info/dash/bejne/q6yHl7l-os2_qMY.html
Hey do you still use your lenovo for doing these tasks?
Yes, my Lenovo (Slim 7 Pro X) is still solid.
You explained better than my university lecturer!!!
How do I start learning reinforcement learning? I know pandas, NumPy, Matplotlib, and basic ML algorithms.
The best teachers are those who teach a difficult lesson simply. Thank you!
What should I do to see the Q-table?
The Q-table is a regular Python array, so you can just use a loop to print the values. In my other video, you can visually see the values on the map: kzread.info/dash/bejne/Y4uTrrF7XZOvdbw.html
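For example, a quick sketch assuming the table is a NumPy array named q (names and sizes are placeholders for your own setup):

```python
import numpy as np

# e.g., FrozenLake 4x4: 16 states x 4 actions (left, down, right, up)
q = np.zeros((16, 4))

# Print one row of Q-values per state.
for state in range(q.shape[0]):
    print(f"state {state:2d}: {q[state]}")
```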
Hello! I am struggling to implement a continuous action space agent with a different algorithm than DQN (TD3, using the continuous mountain car environment from Gymnasium). I appreciated how you went through this series in simple and easy to understand lessons, so I wanted to ask if you have any suggestions for further resources on a similar implementation to what you have here, or even some more of the general concepts behind TD3. Thank you for what you do and any time you spent reading this!
Thanks for the feedback and the super thanks! If the dqn tutorial series gets popular enough, I may continue this format on other algorithms.
@@johnnycode I honestly would be super interested in that too. I thought your DQN videos were super clear! Keep it up!
Appreciate your feedback!
How can we solve the problem using a single DQN instead of 2?
You can use the same policy network in place of the target network. Training results may be worse than using 2 networks.
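Here's a toy PyTorch sketch of the single-network variant, just to show where the target network would normally be used (the network size and tensors are made up):

```python
import torch
import torch.nn as nn

# Toy policy network; in the 2-network setup, a separate target_dqn
# (a slowly-updated copy of this network) would compute the target below.
policy_dqn = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
gamma = 0.99

state = torch.rand(4)
new_state = torch.rand(4)
action, reward, terminated = 1, 1.0, 0.0

with torch.no_grad():
    # Single-network variant: bootstrap from the policy network itself.
    target_q = reward + (1 - terminated) * gamma * policy_dqn(new_state).max()

current_q = policy_dqn(state)[action]
loss = nn.functional.mse_loss(current_q, target_q)
loss.backward()
```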
not working
Is there a possibility that, due to the slippery flag, the agent chooses the best action (knowing the best path) but still falls in the hole?
Yes, the agent can fall into the hole when slippery is on even if it knows the best path. Think of "best path" as the path with the highest chance of success.
What a great series. I always watch your videos with my ad blocker disabled :)
Thanks for the feedback! And enduring some KZread ads😁
Thank you so so much for this. I'm not sure why these instructions aren't on the main page or on GitHub.
Great video, thanks!
Perfect, all I was interested in was persisting the database. Thanks for adding that at the end.
Thanks bro, good content
Hey Johnny, I was wondering if you know how to make the algorithm learn some already-known states? I have a challenge that requires making a DQN learn and start with already-known states stored in a CSV file, and I am struggling because I have no idea how to do that. Is it possible?
I'm guessing if you know those states, then you would know what action to take or not take in relation to those states. For example, a pawn on a chess board can't go backwards, since you know that state is impossible. If my interpretation of your question is correct, then you might want to look into "action masking", which prevents the agent from taking illegal actions. You can start with this SB3 reference, but the concept is not limited to PPO: sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html
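A rough sketch of how the mask plugs into MaskablePPO; the mask_fn below allows every action, so you'd replace it with rules derived from the known states in your CSV file:

```python
import numpy as np
import gymnasium as gym
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

def mask_fn(env) -> np.ndarray:
    # Return a boolean array: True means the action is allowed in the
    # current state. Everything is allowed here; plug in your own rules.
    return np.ones(env.action_space.n, dtype=bool)

env = gym.make("CartPole-v1")
env = ActionMasker(env, mask_fn)

model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```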
Awesome!! Is it better to divide data into chunks before embedding?
It depends on the problem that you're trying to solve. The semantic meaning of a line of text vs a paragraph of text vs a page of text vs the whole document are different. Which level to divide into chunks depends on what your needs are when querying the database.
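As a sketch, paragraph-level chunking before adding to the database could look like this (the collection name and document text are just placeholders):

```python
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="docs")

document = "First paragraph about cats.\n\nSecond paragraph about dogs."
# One chunk per paragraph; each chunk gets its own embedding.
chunks = [c.strip() for c in document.split("\n\n") if c.strip()]

collection.add(
    documents=chunks,
    ids=[f"chunk-{i}" for i in range(len(chunks))],
)
```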
this tutorial series is awesome! looking forward to actor critic series!
Hi, your videos are great and helped me a lot since you were using the latest version of Stable Baselines3. But I am facing an issue: the verbose values are not getting printed in the output. I have set verbose=1 and even tried verbose=2, but I'm not getting the desired outputs (like rewards, loss, iterations, ep_len_mean, etc.) the way they were printed in your videos. Can you please help me? Is this due to the custom environment I am using, or something else? Also, the tensorboard logs are not working...
You should try creating a new conda environment and then install SB3 again. In my SB3 introduction video, I just ran pip install stable-baselines3[extra] and didn't do anything else special: kzread.info/dash/bejne/gaWquqqij7TahJM.html
@@johnnycode Hi, I will try this one again...Thanks a lot for the reply and your time! Might need your help again...
Hi @johnnycode, I tried reinstalling stable-baselines3[extra], but I am not getting the monitor data, and the tensorboard logs are also not getting displayed... Is there some issue with the new version of stable-baselines3[extra]? Can you please give me the version you installed when making the video?
stable-baselines3 2.0.0 and tensorboard 2.13.0
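For reference, verbose and tensorboard_log are just passed to the model constructor. A minimal sketch (the log directory name is a placeholder); note that SB3 should wrap the environment in a Monitor automatically, which is what produces ep_len_mean and ep_rew_mean:

```python
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="./tb_logs")
model.learn(total_timesteps=10_000)
# View the logs with: tensorboard --logdir ./tb_logs
```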
Really good video. I have a big issue with understanding how to delete and add files to ChromaDB. I have also done it with a Docker container, and I have trouble getting the file names of the documents stored in the DB. I am also using the metadata fields to reference files for specific chunks, etc. If there is any explanation of those concepts, that would be awesome!
I use the metadata fields in these videos and also use the Where filter to query, let me know if these solve your problem: kzread.info/dash/bejne/g4eLlK5xlM7His4.html kzread.info/dash/bejne/p5OCk9Zpc6XboaQ.html
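A quick sketch of the Where filter idea; the "source" field name and values are placeholders for your own reference-file metadata:

```python
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="docs")

collection.add(
    documents=["chunk one", "chunk two"],
    metadatas=[{"source": "report.pdf"}, {"source": "notes.txt"}],
    ids=["c1", "c2"],
)

# Query only chunks that came from a specific file.
results = collection.query(
    query_texts=["what does the report say?"],
    n_results=1,
    where={"source": "report.pdf"},
)
print(results["documents"])

# collection.get() with a where filter lists stored entries,
# e.g. to recover the file names of the stored documents.
print(collection.get(where={"source": "notes.txt"})["metadatas"])
```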
Fantastic
Now I can put ChromaDB on my resume. Thank you for creating such a crisp and straightforward tutorial.
nice
Waiting for the next video
Working on it :D
Could you do a double inverted pendulum balancing example?
Great video Johnny :) Thank you so much for the series
I do not understand dude, how do you always make the exact video I need? Are you reading my mind!?!?
HAHA I wish!
Excellent!
Thanks!!!
Thank you so much. Can you please implement an environment with DQN that shows the forgetting problem?
I can't believe this is free
Great video, I like how you graphically show us the network, nodes, policy, etc. Good job!
Thanks for the feedback!
Very helpful! Thank you
Thank you!!
Awesome vid. Can you please do a video/tutorial on reading driver's licenses from different states?
Great video ❤
Great tutorial! Can you upload more videos on custom environments for multi-robot navigation?
Hi, good job. Do you know if it's possible with Ollama Embeddings? It seems to use only one process in my code.
Sorry, I have not worked with that so I don’t know.
@@johnnycode It works with Ollama, thank you. I just can't exceed a batch size of 166.
That is great! Thank you for the coffee!
Hey Johnny! Thanks so much for these videos! I have a question, is it possible to apply this algorithm to a continuous action space? For example, select a number in a range between [0, 120] as an action, or should I investigate other algorithms?
Hi, DQN only works on discrete actions. Try a policy gradient type algorithm. My other video talks about choosing an algorithm: kzread.info/dash/bejne/ZHV6zo-ih6q3qsY.html
Thanks!
I want to know how to use the GPU when creating and querying a collection. Is it possible?
If you enable device=cuda for the sentence transformer like I did in the video, then run a query, it might use the GPU. You'll probably need a very large database before you can see much activity on the GPU though.
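A minimal sketch, assuming your chromadb version's SentenceTransformerEmbeddingFunction accepts a device argument (the model name is just an example):

```python
import chromadb
from chromadb.utils import embedding_functions

ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2",
    device="cuda",  # embeddings are computed on the GPU
)

client = chromadb.Client()
collection = client.create_collection(name="docs", embedding_function=ef)
```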
Do you have any problems with the laptop's hinge? I heard that the Yoga line-up has bad hinges.
I have not had any hinge problems with my 10-year-old Yoga.
Great video! Just a suggestion: Can you maybe do a small Quickstart tutorial on Unity ML-Agents? I feel like that's the future after Gym for most of us RL developers LOL. Just a suggestion, don't work on it if you don't want to, no hard feelings.