Long Short-Term Memory with PyTorch + Lightning

In this StatQuest we'll learn how to code an LSTM unit from scratch and then train it. Then we'll do the same thing with PyTorch's nn.LSTM() function. Along the way we'll learn two cool tricks that Lightning gives us that make our lives easier: 1) how to add more training epochs without starting over, and 2) how to easily visualize the training results to determine whether you need to do more training or are done.
English
This video has been dubbed using an artificial voice via aloud.area120.google.com to increase accessibility. You can change the audio track language in the Settings menu.
Spanish
This video has been dubbed into Spanish using an artificial voice via aloud.area120.google.com to increase accessibility. You can change the audio track language in the Settings menu.
Portuguese
This video has been dubbed into Portuguese using an artificial voice via aloud.area120.google.com to improve its accessibility. You can change the audio language in the Settings menu.
If you'd like to support StatQuest, please consider...
Patreon: / statquest
...or...
KZread Membership: / @statquest
...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
statquest.org/statquest-store/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
4:25 Importing the modules
5:39 An outline of an LSTM class
6:56 init(): Creating and initializing the tensors
9:09 lstm_unit(): Doing the LSTM math
12:25 forward(): Make a forward pass through an unrolled LSTM
13:42 configure_optimizers(): Configure the...optimizers.
14:00 training_step(): Calculate the loss and log progress
16:40 Using and training our homemade LSTM
20:43 Evaluating training with TensorBoard
23:22 Adding more epochs to training
26:18 Using and training PyTorch's nn.LSTM()
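
Below is a minimal sketch of how the pieces in the outline above fit together. It is not the exact notebook code (get that from the link further down) and the training details are simplified, but the parameter names and methods follow the video's LSTM-by-hand approach. Older installs may need "import pytorch_lightning as L" instead of "import lightning as L".

import torch
import torch.nn as nn
import lightning as L                      # or: import pytorch_lightning as L
from torch.optim import Adam
from torch.utils.data import TensorDataset, DataLoader

class LSTMbyHand(L.LightningModule):
    def __init__(self):
        super().__init__()
        # one pair of weights plus one bias per gate, initialized from a standard normal
        mean, std = torch.tensor(0.0), torch.tensor(1.0)
        self.wlr1 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.wlr2 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.blr1 = nn.Parameter(torch.tensor(0.), requires_grad=True)
        self.wpr1 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.wpr2 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.bpr1 = nn.Parameter(torch.tensor(0.), requires_grad=True)
        self.wp1 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.wp2 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.bp1 = nn.Parameter(torch.tensor(0.), requires_grad=True)
        self.wo1 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.wo2 = nn.Parameter(torch.normal(mean, std), requires_grad=True)
        self.bo1 = nn.Parameter(torch.tensor(0.), requires_grad=True)

    def lstm_unit(self, input_value, long_memory, short_memory):
        # percentage of the long-term memory to keep (the "forget gate")
        long_remember_percent = torch.sigmoid(short_memory * self.wlr1 + input_value * self.wlr2 + self.blr1)
        # percentage of the potential memory to keep (the "input gate")
        potential_remember_percent = torch.sigmoid(short_memory * self.wpr1 + input_value * self.wpr2 + self.bpr1)
        # the potential long-term memory itself
        potential_memory = torch.tanh(short_memory * self.wp1 + input_value * self.wp2 + self.bp1)
        updated_long_memory = long_memory * long_remember_percent + potential_remember_percent * potential_memory
        # percentage of the updated long-term memory that becomes the new short-term memory (the "output gate")
        output_percent = torch.sigmoid(short_memory * self.wo1 + input_value * self.wo2 + self.bo1)
        updated_short_memory = torch.tanh(updated_long_memory) * output_percent
        return [updated_long_memory, updated_short_memory]

    def forward(self, input):
        long_memory, short_memory = 0., 0.
        for day in input:                                  # unroll the LSTM over the sequence
            long_memory, short_memory = self.lstm_unit(day, long_memory, short_memory)
        return short_memory                                # the prediction for the day after the sequence

    def configure_optimizers(self):
        return Adam(self.parameters())

    def training_step(self, batch, batch_idx):
        input_i, label_i = batch
        output_i = self.forward(input_i[0])
        loss = (output_i - label_i) ** 2                   # squared residual
        self.log("train_loss", loss)
        return loss

# Company A: [0, 0.5, 0.25, 1] -> 0 and Company B: [1, 0.5, 0.25, 1] -> 1
inputs = torch.tensor([[0., 0.5, 0.25, 1.], [1., 0.5, 0.25, 1.]])
labels = torch.tensor([0., 1.])
dataloader = DataLoader(TensorDataset(inputs, labels))

model = LSTMbyHand()
trainer = L.Trainer(max_epochs=2000)                       # epoch count here is just an example
trainer.fit(model, train_dataloaders=dataloader)
print(model(torch.tensor([0., 0.5, 0.25, 1.])).detach())   # should end up close to 0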
#StatQuest

Comments: 177

  • @statquest
    @statquest Жыл бұрын

    Get the code/Jupyter Notebook here: lightning.ai/lightning-ai/studios/statquest-long-short-term-memory-lstm-with-pytorch-lightning?view=public&section=all To learn more about Lightning: lightning.ai/ Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @duttaoindril

    @duttaoindril

    Жыл бұрын

    Still waiting on the last one in the series - attention.

  • @statquest

    @statquest

    Жыл бұрын

    @@duttaoindril I'm still working on it.

  • @naruto-yy4xk

    @naruto-yy4xk

    Жыл бұрын

    @@statquest LLM ? BAM??? Please.....😅😅

  • @statquest

    @statquest

    Жыл бұрын

    @@naruto-yy4xk I'm working on it.

  • @naruto-yy4xk

    @naruto-yy4xk

    Жыл бұрын

    @@statquest Bam....

  • @DeanRGAnderson
    @DeanRGAnderson Жыл бұрын

    I am 71 yr old engr. grad from UCLA in 1975. Binge watched first 21 videos from Josh Starmer's Neural Networks/Deep Learning playlist in 2 days. Wonderful experience. Josh is an excellent teacher. I have no previous experience with neural networks, but after watching these videos, I feel ready to experiment with NN.

  • @statquest

    @statquest

    Жыл бұрын

    BAM!!! Enjoy! :)

  • @DeanRGAnderson

    @DeanRGAnderson

    Жыл бұрын

    @@statquest Just told my grandson (EE BYU and now grad student at BYU for MSEE) to watch this same playlist. Are you a univ. professor? My grandson is doing his 3rd summer intern with me. We will be doing on-device ASR with a new type of microphone enhancing SNR for lips to microphone distances up to 8 meters. (Scotty: "Hello computer" - Star Trek 4)

  • @statquest

    @statquest

    Жыл бұрын

    @@DeanRGAnderson That's really cool!!! My next video (coming out on Monday) is about how neural networks can be used to translate one language (like English) to another language (like Spanish). I'm pretty excited about it. I'm not a professor - I used to be one, in genetics at UNC-Chapel Hill - but now I try to spend as much time making videos as possible. I visited Utah (Salt Lake City) for the first time last summer - it was one of the most beautiful places I've ever been. I loved hiking in the hills that surrounded the city. Good luck with your project! It sounds great.

  • @markfchapmani
    @markfchapmani3 ай бұрын

    This is just great Josh. You have a real ability to explain these complex concepts in an understandable way.

  • @statquest

    @statquest

    3 ай бұрын

    Thank you!

  • @jessicas2978
    @jessicas2978 Жыл бұрын

    This tutorial is amazing, and I finally know how to code LSTM. Super helpful to my projects. Thank you so much!

  • @statquest

    @statquest

    Жыл бұрын

    Glad it helped!

  • @hbb21st
    @hbb21st Жыл бұрын

    StatQuest also works as an English-teaching video - it has been running successfully for a long time, clear and logical. My son and I like it. :)

  • @statquest

    @statquest

    Жыл бұрын

    BAM! :)

  • @DanteNoguez
    @DanteNoguez Жыл бұрын

    An incredible video, and the Spanish audio version sounds really good as well. DOUBLE BAM!!

  • @statquest

    @statquest

    Жыл бұрын

    HOORAY!!! I'm so glad that is working out. I hope to add that to more of my videos.

  • @ppradhan
    @ppradhan Жыл бұрын

    Thank you Josh. I owe you a lot. My binge watching ML series ended today. Holy smoke, I almost spent the whole month of March! Soon I will buy your book "The StatQuest Illustrated Guide To Machine Learning". I am inspired by your work. e^BAM!!!

  • @statquest

    @statquest

    Жыл бұрын

    Thank you very much! I hope you enjoy the book. I'm super proud of it because it incorporates a lot of lessons I learned from making the videos and also provides more of a coherent flow from topic to topic.

  • @wesleyfman
    @wesleyfman Жыл бұрын

    What a nice surprise! Your videos are incredible, keep it up! A big hug from Brazil! DOUBLE BAM!

  • @statquest

    @statquest

    Жыл бұрын

    Thank you very much!!! :)

  • @caiyu538
    @caiyu5388 ай бұрын

    Great, great, great. Thank you so much for your great lectures.

  • @statquest

    @statquest

    8 ай бұрын

    Glad you like them!

  • @danielbadawi5623
    @danielbadawi5623 Жыл бұрын

    What a video WOW. Very useful. Please never ever stop making videos.

  • @statquest

    @statquest

    Жыл бұрын

    More to come!

  • @danielbadawi5623

    @danielbadawi5623

    Жыл бұрын

    @@statquest A special request: could you please do a tutorial on ConvLSTM?

  • @statquest

    @statquest

    Жыл бұрын

    @@danielbadawi5623 I'll keep that in mind! However, for now the next steps are 1) word embedding, 2) attention, and then 3) transformers.

  • @SaschaRobitzki
    @SaschaRobitzki4 ай бұрын

    Great video! Especially the LSTMbyHand; I need more of that.

  • @statquest

    @statquest

    4 ай бұрын

    Thanks!

  • @wjchicago
    @wjchicago6 ай бұрын

    cannot be cleaner and clearer than this!

  • @statquest

    @statquest

    6 ай бұрын

    Thanks!

  • @aliaamir9778
    @aliaamir97788 ай бұрын

    Just loving every one of your videos - you clearly explain every concept of ML and DL. Please make a video on the GRU algorithm as well.

  • @statquest

    @statquest

    8 ай бұрын

    Thanks! I'll keep that in mind.

  • @laythherzallah3493
    @laythherzallah3493 Жыл бұрын

    You make life easy - thank you, thank you, thank you!

  • @statquest

    @statquest

    Жыл бұрын

    Thanks!

  • @bleakmess
    @bleakmess10 ай бұрын

    Hi Josh! Thank you for such a wonderful video series. Now that I am familiar with all these concepts, what's my next step? I have gone through your book as well. I wish there were some problem sheets or coding exercises to get a feel for the methods you discussed. Thanks a lot, man, and please guide me if possible.

  • @statquest

    @statquest

    10 ай бұрын

    I'm working on more PyTorch videos.

  • @bleakmess

    @bleakmess

    10 ай бұрын

    @@statquest Awesome.

  • @supermandrew88
    @supermandrew88 Жыл бұрын

    Hi Josh! In your introduction for the sample problem you mention that we're looking at stock prices for two different companies: A and B. If A and B didn't share the same values for days 2, 3, and 4, would you still use one LSTM model for both stocks? In other words, when would you instead choose to create a separate model for each stock? I'm sort of looking at a similar example problem: sales reps performances over time. Would you include all the reps performances in your dataset, or would you create a different model for each rep? Thank you for your time! Your book just arrived at my house a few days ago.😄

  • @statquest

    @statquest

    Жыл бұрын

    It really depends on what you want. One thing that is nice about fitting the LSTM to multiple stocks is that it helps prevent overfitting, and this can help with making predictions in the long run.

  • @reeljojo9229
    @reeljojo9229 Жыл бұрын

    Thanks Josh!!! You are incredible! I learned a lot from your tutorial! Now I understand how to use PyTorch and Lightning to optimize LSTM by hand, but I couldn't use these methods well for the dataset. Maybe you have any tutorials to recommend? Thanks again!

  • @statquest

    @statquest

    Жыл бұрын

    What dataset are you referring to?

  • @reeljojo9229

    @reeljojo9229

    Жыл бұрын

    @@statquest Like stock dataset, with open, low, high, close columns etc.

  • @statquest

    @statquest

    Жыл бұрын

    @@reeljojo9229 Ok. I'll keep that in mind.

  • @reeljojo9229

    @reeljojo9229

    Жыл бұрын

    @@statquest You are the best!!!!

  • @user-fi3ru9qn7f
    @user-fi3ru9qn7f Жыл бұрын

    Another incredible video by this incredible educator. Josh...any videos on transformer models in the pipeline?

  • @statquest

    @statquest

    Жыл бұрын

    they are definitely in the pipeline.

  • @user-fi3ru9qn7f

    @user-fi3ru9qn7f

    Жыл бұрын

    @@statquest Looking forward to it!!

  • @duttaoindril

    @duttaoindril

    Жыл бұрын

    Literally waiting with bated breath since my assignment is due soon 😂

  • @statquest

    @statquest

    Жыл бұрын

    @@duttaoindril I'm working as quickly as I can, but it's still at least a month away.

  • @magicfox94
    @magicfox94 Жыл бұрын

    Infinite BAM for you!

  • @statquest

    @statquest

    Жыл бұрын

    Thank you! :)

  • @Eduardo_Trader_Investidor
    @Eduardo_Trader_Investidor3 ай бұрын

    Very good!

  • @statquest

    @statquest

    3 ай бұрын

    Thank you very much!

  • @kareemullaashrafali7476
    @kareemullaashrafali74764 ай бұрын

    simply wow

  • @statquest

    @statquest

    4 ай бұрын

    Thanks!

  • @Bbdu75yg
    @Bbdu75yg7 ай бұрын

    AWESOMEEEEEEE

  • @statquest

    @statquest

    7 ай бұрын

    You're welcome 😊

  • @Tomas-kv7mw
    @Tomas-kv7mw Жыл бұрын

    Great videos! Will there be a transformer/attention video?

  • @statquest

    @statquest

    Жыл бұрын

    Yes, soon!

  • @terryliu3635
    @terryliu3635Ай бұрын

    Thanks, Josh, for another great video! Another quick question: why is it that with the 2nd approach we get good prediction accuracy with only 300 epochs, but the 1st approach has to take more epochs?

  • @statquest

    @statquest

    Ай бұрын

    This question is answered at 30:51

  • @supermandrew88
    @supermandrew88 Жыл бұрын

    Hi Josh! Sorry for another question. I've got a dataset that is slightly more complex than the stock example data you used here. I'm trying to use LSTM to predict the result of a sales lead before it runs. For each data point, I've got: the sales rep, the date of the lead, the product pitched, the zip code, and the result (sold or not sold). I've already got my program set up to evaluate different encoding methods for each of those variables. After watching this video, I'm trying to visualize how I would use my data in an LSTM. You created two arrays of data: one for company A and one for company B, where each index in the array corresponded to a date in your dataset. So for my dataset, I can create n arrays of data, where n is my number of sales reps. My issue arises in the fact that I have an uneven amount of data points per date. For instance, sales rep Bob was able to run 3 appointments on May 1 and 1 appointment on May 2. Likewise, sales rep Alice ran 0 appointments on May 1 (I guess she's off on Mondays), and 4 appointments on May 2. How would I preprocess this data for an LSTM? I'm assuming each data point would be an array of encodings: [sales rep, product, ...], but if one rep ran 4 leads on a day and another ran 1 (or even 0), do I just "pad" my data with a value like -1? So on day 2, Bob might be: [[encoded data point 1],-1,-1,-1], and Alice is: [[data point 1],[data point 2],...] ? Finally, I noticed sequence length wasn't really included in this video. Is that because our sequence length is 1 in this case? Do we still capture temporal relationships with a sequence length equal to 1? Sorry for the lengthy question. I appreciate any insights! P.S. I hope you create a 2nd StatQuest book that includes topics like LSTMs, Transformers, etc. I'd love to buy it! 😊

  • @statquest

    @statquest

    Жыл бұрын

    The whole idea of using LSTMs (or any recurrent neural network) is to allow for different sequence lengths for the input values. So "Bob" can have a sequence of 10 values and "Alice" can have a sequence of 5 values. That's fine. What I would encourage you to do is to look at my word embedding video: kzread.info/dash/bejne/qJ2O1LGnesbSiZM.html and consider adding an embedding layer to the inputs to the LSTMs. You could have one embedding layer encode the employee etc. NOTE: You don't have to pre-train the embedding layer as suggested in the video. You can just add the embedding layer (with nn.Embedding()) to your model and train everything at once.
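    To make the suggestion above concrete, here is a minimal sketch of putting an nn.Embedding layer in front of an nn.LSTM and training them together. The class name (LeadLSTM) and all the sizes (num_reps, embedding_dim, number of numeric features, hidden_size) are made up for illustration and are not from the video or notebook.

    import torch
    import torch.nn as nn

    class LeadLSTM(nn.Module):
        def __init__(self, num_reps=10, embedding_dim=2, num_numeric_features=3, hidden_size=8):
            super().__init__()
            # learns a dense vector for each sales rep; trained together with the LSTM
            self.rep_embedding = nn.Embedding(num_embeddings=num_reps, embedding_dim=embedding_dim)
            self.lstm = nn.LSTM(input_size=embedding_dim + num_numeric_features,
                                hidden_size=hidden_size, batch_first=True)
            self.fc = nn.Linear(hidden_size, 1)   # e.g. predict sold / not sold

        def forward(self, rep_ids, numeric_features):
            # rep_ids: (batch, seq_len) integer IDs; numeric_features: (batch, seq_len, num_numeric_features)
            embedded = self.rep_embedding(rep_ids)                 # (batch, seq_len, embedding_dim)
            lstm_in = torch.cat([embedded, numeric_features], dim=-1)
            lstm_out, (h_n, c_n) = self.lstm(lstm_in)
            return torch.sigmoid(self.fc(lstm_out[:, -1, :]))      # prediction from the last time step

    # example: a batch containing one rep with a sequence of 4 leads
    model = LeadLSTM()
    rep_ids = torch.tensor([[0, 0, 0, 0]])
    numeric = torch.rand(1, 4, 3)
    print(model(rep_ids, numeric))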

  • @supermandrew88

    @supermandrew88

    Жыл бұрын

    @@statquest Thanks, I will take a look now! I actually used your encoding video in order to implement the weighted mean target encoding for the names of the reps. You mentioned that it's okay for Bob to have a sequence of 10 values and Alice to have a sequence of 5. Just to confirm, this would be like Stock A having multiple and a different amount of values on day 1 compared to Stock B. This should be okay?

  • @panayiotisgeorgiou1609
    @panayiotisgeorgiou16095 күн бұрын

    Hey, I have just watched a bunch of your videos. Very, very nice work - really to the point and super informative. I am currently doing a statistical model analysis for a machine learning algorithm, but I am super confused about what I should consider a True Negative and a False Negative, because the algorithm makes only positive claims. A patient comes in and has disease A, but the system shows disease B. What's that? A FN or a TN? I am absolutely confused, and I'm taking a long shot here.

  • @statquest

    @statquest

    5 күн бұрын

    To be honest, an algorithm that only makes positive claims doesn't sound very useful. Instead, you should consider 3 possible outputs - disease A, disease B, and neither disease A nor B. To learn how to interpret 3 possible outputs, see: kzread.info/dash/bejne/fZin0pisn9SnZ9I.html?si=Yei-8SWZiY42tcwt&t=288

  • @BruceHartpence
    @BruceHartpence Жыл бұрын

    Nice video as always, Josh. I have a quick question: the example seems to break the dimension requirements of nn.LSTM ("input must have 3 dimensions, got 2"), where the expected shape is [batch_size, seq_len, nb_features]. Any thoughts?

  • @statquest

    @statquest

    Жыл бұрын

    Are you using my jupyter notebook or your own code?

  • @BruceHartpence

    @BruceHartpence

    Жыл бұрын

    @@statquest Good morning! I am using your code but in a standard Python39 Windows install with the latest torch and lightning. As one might expect, LSTMbyHand works fine without the call to nn.LSTM. I was just puzzling through creating a single example as in your model(torch.tensor([0.,.5,.25,1.])).detach(). Training has a similar problem.

  • @statquest

    @statquest

    Жыл бұрын

    ​@@BruceHartpence The code you copied and pasted looks different from mine, which is model(torch.tensor([0., 0.5, 0.25, 1.])).detach() (where I explicitly add the 0's before each decimal point.) I know that's not the problem with your code, but it suggests that you are writing your own rather than using my jupyter notebook. Is that correct?

  • @BruceHartpence

    @BruceHartpence

    Жыл бұрын

    @@statquest Well, it's your code from the video. If I typed something wrong I am going to feel silly all day. I'll check later today.

  • @statquest

    @statquest

    Жыл бұрын

    @@BruceHartpence Let me know - I use a mac and it's possible there are differences that need to be worked out between OSs (I hope not).
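    A hedged guess at what is going on in this exchange: older versions of torch.nn.LSTM only accept a batched, 3-D input of shape (seq_len, batch, features) (or (batch, seq_len, features) with batch_first=True), while newer versions also accept an unbatched 2-D input. If you hit "input must have 3 dimensions, got 2", adding a batch dimension usually resolves it, for example:

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=1, hidden_size=1)

    seq = torch.tensor([0., 0.5, 0.25, 1.])
    input_2d = seq.view(len(seq), 1)      # (seq_len=4, features=1) - fails on older PyTorch
    input_3d = input_2d.unsqueeze(1)      # (seq_len=4, batch=1, features=1) - works everywhere

    lstm_out, (h_n, c_n) = lstm(input_3d)
    print(lstm_out.shape)                 # torch.Size([4, 1, 1])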

  • @user-kz6jr6gw7t
    @user-kz6jr6gw7t8 ай бұрын

    Hey Josh, Thanks for the video. I got one question: in the training step, for the self.forward(), why do you take input_i[0] as input instead of just input_i?

  • @statquest

    @statquest

    8 ай бұрын

    That makes sure that the "dimensions" of the tensor are correct. I hope to cover this topic soon.

  • @user-kz6jr6gw7t

    @user-kz6jr6gw7t

    8 ай бұрын

    Does it mean you take only one feature (e.g. price) of the input? @@statquest

  • @statquest

    @statquest

    8 ай бұрын

    @@user-kz6jr6gw7t In this case, the whole LSTM is set up to only accept a single input, however, an LSTM can accept multiple inputs if we configure it that way. No, the reason we have input_i[0] is simply to remove some extra brackets from the data so that the tensor has the correct dimension (and I'll explain this better in a new video soon).
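    A small sketch of the point above, assuming the TensorDataset/DataLoader setup and 4-day training sequences used in the video: the DataLoader adds a batch dimension, so each batch arrives as a 2-D tensor with one row, and indexing [0] just strips that extra set of brackets before the values are fed, one day at a time, to the unrolled LSTM.

    import torch
    from torch.utils.data import TensorDataset, DataLoader

    inputs = torch.tensor([[0., 0.5, 0.25, 1.], [1., 0.5, 0.25, 1.]])
    labels = torch.tensor([0., 1.])
    dataloader = DataLoader(TensorDataset(inputs, labels))

    input_i, label_i = next(iter(dataloader))
    print(input_i.shape)     # torch.Size([1, 4]) - a batch containing 1 sequence
    print(input_i[0].shape)  # torch.Size([4])    - just the 4 daily values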

  • @user-kz6jr6gw7t

    @user-kz6jr6gw7t

    8 ай бұрын

    well understood. Thank you Josh!@@statquest

  • @nancyboukamel442
    @nancyboukamel4422 ай бұрын

    Thank you, Josh Starmer!! Can you please do a video on a transformer encoder using an LSTM?

  • @statquest

    @statquest

    2 ай бұрын

    Hmm...Transformers don't use LSTMs...so are you thinking of something else? Here's a video about transformers: kzread.info/dash/bejne/rKyF27aEaNTbqbw.html

  • @ethansmith7608
    @ethansmith7608 Жыл бұрын

    can you make a video breaking down what constitutes a BAM, and what characteristics can qualify said BAM as DOUBLE or even TRIPLE BAM?

  • @statquest

    @statquest

    Жыл бұрын

    Sure! Here it is: kzread.info/dash/bejne/m2idt9ijo6qpfcY.html

  • @ethansmith7608

    @ethansmith7608

    Жыл бұрын

    @@statquest legend!

  • @brahimmatougui1195
    @brahimmatougui11956 ай бұрын

    Hope to see a new video soon, and hope you are doing well too.

  • @statquest

    @statquest

    6 ай бұрын

    The next video in this series comes out... right now! :)

  • @brahimmatougui1195

    @brahimmatougui1195

    6 ай бұрын

    @@statquest I have that feeling that the new video is coming out hhhhh

  • @statquest

    @statquest

    6 ай бұрын

    @@brahimmatougui1195 Here's the link: kzread.info/dash/bejne/g5pkmLp9ibupiKw.html

  • @huseyngorbani6544
    @huseyngorbani6544 Жыл бұрын

    Hi, thanks for the video. Been waiting for a video about transformers and their implementation. Please kindly share.

  • @statquest

    @statquest

    Жыл бұрын

    I'm working on it.

  • @ChristosKaskouras
    @ChristosKaskouras Жыл бұрын

    What you are doing out there cannot be described! Thanks a lot for all the videos! I have a couple of questions though. The first is about model training: let's suppose that I want to create a list of the losses for every epoch. Is that possible? Can I somehow have the trainer in a for loop? The second: when I try to load the TensorBoard, an error page appears saying "No dashboards are active for the current data set."

  • @statquest

    @statquest

    Жыл бұрын

    You should be able to make a list of the losses...and I'm bummed you are getting an error. Are you using my specific jupyter notebook or have you written your own code?

  • @ChristosKaskouras

    @ChristosKaskouras

    Жыл бұрын

    @@statquest Even when I am using your code I am getting the error. I guess it might be something with the installation of lightning since I cannot execute the line "from pytorch_lightning.utilities.seed import seed_everything", I receive ImportError. I tried to reinstall the packages and to run the code from IDE (Spyder) but still it did not work

  • @ChristosKaskouras

    @ChristosKaskouras

    Жыл бұрын

    @@statquest Finally I managed to open the TensorBoard when I ran Anaconda as administrator.

  • @ChristosKaskouras

    @ChristosKaskouras

    Жыл бұрын

    @@statquest I am trying to modify your code to adapt it to my problem. I need to design a NN which will take a time series as input and predict another time series, which means that for each input value there should be an output value (I have data for both to do the training). But when I change the inputs and labels, it seems to work, except that all the predictions are 'tensor([0.])'. What I am doing is setting a nested list of my input data as the inputs and a list of outputs as the labels. Is that something you can help me with?

  • @statquest

    @statquest

    Жыл бұрын

    @@ChristosKaskouras The way I debug stuff like this is that I add a bunch of print statements to the "forward" method and call it directly with some data and make sure everything is working as anticipated.
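    A tiny, self-contained sketch of that print-debugging approach. The toy forward() below is a stand-in, not the real LSTM math; in practice you would add the print() calls inside your own model's forward() and then call the model directly with a known sequence.

    import torch

    def forward(input, w=0.5):
        short_memory = 0.
        for i, day in enumerate(input):
            short_memory = torch.tanh(w * day + short_memory)   # stand-in for the real LSTM math
            print(f"step {i}: input={day.item():.2f}, short_memory={short_memory.item():.4f}")
        return short_memory

    forward(torch.tensor([0., 0.5, 0.25, 1.]))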

  • @x11y22z33me
    @x11y22z33me11 ай бұрын

    Hi Josh. I have a question about why these predictions work so well. I have seen your LSTM video as well, but don't have an intuitive feel for this. For example, company A goes up from 0 to 1 in 4 steps. Why would a model expect it to come down to zero the next day? If I were to guess, I would expect the value to remain high for day 5, and even if it reduces maybe go to 0.5 from 1 since the previous jump down was from 0.5 to 0.25.

  • @statquest

    @statquest

    11 ай бұрын

    The model was trained specifically with this data, so it's just replaying what it was trained on.

  • @x11y22z33me

    @x11y22z33me

    10 ай бұрын

    @@statquest Oh okay, makes sense. Thanks for your reply, and thanks for all you do.

  • @z4br4k98
    @z4br4k98 Жыл бұрын

    Would it not be better to initialize the parameters using Xavier? Otherwise we might introduce vanishing gradients

  • @statquest

    @statquest

    Жыл бұрын

    Perhaps. However, things work as-is in this example.

  • @macknightxu2199
    @macknightxu2199 Жыл бұрын

    Hi, will there be new videos in this series of NN? BR

  • @statquest

    @statquest

    Жыл бұрын

    Yes, a lot more.

  • @tribuiduonguc7788
    @tribuiduonguc7788 Жыл бұрын

    Is it true that for the number of output time steps (e.g., if I want to predict values for the next 5 days), we need to specify the corresponding hidden size, as the video mentions (hidden size = 5)?

  • @statquest

    @statquest

    Жыл бұрын

    If you want to predict a value for the next 5 days, you just need to unroll the LSTM 5 times. This is different from wanting 5 outputs per day, which is what the hidden size parameter determines.
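    A small sketch of that distinction, with made-up shapes: hidden_size controls how many output values the LSTM produces per time step, while the number of days you unroll it for is just the sequence length of the input.

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=1, hidden_size=5, batch_first=True)

    five_days = torch.rand(1, 5, 1)     # batch=1, seq_len=5 (unrolled 5 times), 1 feature per day
    out, (h_n, c_n) = lstm(five_days)

    print(out.shape)   # torch.Size([1, 5, 5]) - 5 time steps, each with 5 output values (hidden_size)
    print(h_n.shape)   # torch.Size([1, 1, 5]) - the final short-term memory, one value per hidden unit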

  • @tribuiduonguc7788

    @tribuiduonguc7788

    Жыл бұрын

    @@statquest thank you for your rapid response

  • @henryhsu9517
    @henryhsu95172 ай бұрын

    Thank you Josh. This tutorial is amazing. I have a question about the number of parameters. In the LSTMbyHand model, there are 12 parameters. In contrast, there are 16 parameters in the LightningLSTM model. My understanding is that [wlr1, wpr1, wp1, wo1] could be viewed as lstm.weight_ih_l0, [wlr2, wpr2, wp2, wo2] could be viewed as lstm.weight_hh_l0, and [blr1, bpr1, bp1, bo1] could be viewed as lstm.bias_ih_l0. Is that correct? If true, how do we realize lstm.bias_hh_l0 in the LSTMbyHand model?

  • @statquest

    @statquest

    2 ай бұрын

    Instead of adding the input and short-term memory (the input and hidden state) terms together and then adding a single bias, you can have each one get its own bias term before adding them together.
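    A quick way to check the parameter counts discussed here: nn.LSTM keeps two bias vectors per gate (bias_ih and bias_hh), which are simply added together, so with input_size=1 and hidden_size=1 it has 4 gates x (1 input weight + 1 hidden weight + 2 biases) = 16 parameters, while the by-hand version uses one bias per gate and therefore has 12.

    import torch.nn as nn

    lstm = nn.LSTM(input_size=1, hidden_size=1)
    for name, param in lstm.named_parameters():
        print(name, param.shape)
    # weight_ih_l0 torch.Size([4, 1])
    # weight_hh_l0 torch.Size([4, 1])
    # bias_ih_l0   torch.Size([4])
    # bias_hh_l0   torch.Size([4])
    print(sum(p.numel() for p in lstm.parameters()))   # 16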

  • @henryhsu9517

    @henryhsu9517

    2 ай бұрын

    @@statquest BAMs!!! Thanks Josh. I learned a lot of details about LSTM from your tutorial!

  • @chandanbp
    @chandanbp8 ай бұрын

    All this content for free. Triple BAM!!!

  • @statquest

    @statquest

    8 ай бұрын

    Yes!

  • @azmyin
    @azmyin7 ай бұрын

    Dr. Starmer, I manually wrote the code following your tutorial, but when I get to 19:50 I am getting the "grad can be implicitly created only for scalar outputs" error and it's stopping the training process. I have PyTorch with CUDA 12.1 support and the latest version of Lightning installed.

  • @statquest

    @statquest

    7 ай бұрын

    Please download and try the code that I wrote first.

  • @ptcita16
    @ptcita16 Жыл бұрын

    Amazing video, thanks Josh! I wanted to work with the TensorBoard but I keep getting an error :s It does not generate any URL. Does that ever happen to you? Thanks in advance!

  • @statquest

    @statquest

    Жыл бұрын

    I'm sorry you're having trouble... Are you using my notebook or have you written your own code? Have you correctly navigated to where the "lightning_logs" directory is before running tensorboard?

  • @ptcita16

    @ptcita16

    Жыл бұрын

    @@statquest Thanks for replying! I was using your code and made sure I was in the folder that has "lightning_logs". I was searching and it seems to be related to a TensorFlow issue on my machine. Is there a specific version I must have for it to work?

  • @statquest

    @statquest

    Жыл бұрын

    @@ptcita16 That I don't know, but you can try to update tensorboard with "pip install tensorboard --upgrade"

  • @statquest

    @statquest

    Жыл бұрын

    @@ptcita16 Also, can you give me the command line that you are using to get tensorboard running?

  • @statquest

    @statquest

    Жыл бұрын

    @@ptcita16 Is it possible that you don't have TensorBoard installed to begin with? This seems strange, but it might be the case. You can type "which tensorboard" on the command line to find out.

  • @suzhenkang
    @suzhenkang Жыл бұрын

    Could you make a transformer video? Can't wait to see it.

  • @statquest

    @statquest

    Жыл бұрын

    I'm working on it.

  • @suzhenkang

    @suzhenkang

    Жыл бұрын

    @@statquest Cool cant wait

  • @suzhenkang

    @suzhenkang

    Жыл бұрын

    @@statquest And GAN

  • @suzhenkang

    @suzhenkang

    Жыл бұрын

    @@statquest It's more complicated - your explanation is better.

  • @prashlovessamosa
    @prashlovessamosa Жыл бұрын

    Hey man, can you update your playlists? Some of your latest videos aren't in them.

  • @statquest

    @statquest

    Жыл бұрын

    I'll do that today.

  • @jingwentang6768
    @jingwentang67685 ай бұрын

    Thank you for making the video. Does anyone know where I can download the Jupyter notebook? (I did not find it at the given link.)

  • @statquest

    @statquest

    5 ай бұрын

    Sorry about that. Everything just recently changed and I need to update things today. For now, you can find the code here: github.com/StatQuest/pytorch_lightning_tutorials/blob/main/README.md

  • @statquest

    @statquest

    5 ай бұрын

    github.com/StatQuest/pytorch_lightning_tutorials/blob/main/lstm_with_pytorch_and_lightning_v1.0.zip

  • @jingwentang6768

    @jingwentang6768

    5 ай бұрын

    It is amazing how you promptly replied to me. Thank you so much!@@statquest

  • @shamshersingh9680
    @shamshersingh968029 күн бұрын

    Hi Josh, in this line of code in the training_step method --> output_i = self.forward(input_i[0]), why have we passed input_i[0] and not input_i? I presume we are passing the data of both companies in a single batch. If we pass input_i[0], we are passing only the first input into the forward method in each training step.

  • @statquest

    @statquest

    28 күн бұрын

    Why do you presume that we are passing data from both companies in a single batch?

  • @statquest

    @statquest

    28 күн бұрын

    Adding "print(input_i)" to the training_step() shows that each batch consists of just the values from a single company. See... batch_idx: 0 tensor([[0.0000, 0.5000, 0.2500, 1.0000]]) batch_idx: 1 tensor([[1.0000, 0.5000, 0.2500, 1.0000]]) To be honest, the best way to answer these questions is to fiddle with the code. You can learn a lot more that way much faster. The reason why I've put the code in a Lightning Studio is that you can make a copy of it and then run it and play around with it. And if it completely breaks, you just make another copy. It's super easy.

  • @shamshersingh9680

    @shamshersingh9680

    27 күн бұрын

    @@statquest Thanks a lot, Josh. To be honest, when I started to dive deep into deep learning, it scared me, but your videos have just ignited my curiosity all over again. Thanks a lot, Josh. Thanks a lot.

  • @statquest

    @statquest

    27 күн бұрын

    @@shamshersingh9680 Happy to help! But now you need to get your feet wet and dive in. :) (Just trying to encourage you!)

  • @shamshersingh9680

    @shamshersingh9680

    26 күн бұрын

    @@statquest Yeah, done that already. Happy to say that I'm now pretty comfortable with neural networks and related work. Recently I created image classification models (although pretty basic ones, using the MNIST, CIFAR10, and CIFAR100 datasets). Just an update: if someone is using a Jupyter notebook, then the following magic commands load TensorBoard in the notebook itself:
    %load_ext tensorboard
    %tensorboard --logdir=lightning_logs
    No need to go to Home --> File --> New --> Terminal --> tensorboard --logdir=lightning_logs. The magic commands will show the TensorBoard display in the notebook itself. Hope it helps.

  • @ZealotfeathersGorgonoth
    @ZealotfeathersGorgonoth9 ай бұрын

    I've been trying to get TensorBoard to work in VSCode and am having trouble. Does this setup not work for newer versions of Python?

  • @statquest

    @statquest

    9 ай бұрын

    Hmmm.... It should work, even with newer versions of Python.

  • @ZealotfeathersGorgonoth

    @ZealotfeathersGorgonoth

    9 ай бұрын

    @@statquest I got it to work! I am new to the data science / Machine Learning world and I was wondering if you would make a video about your coding set up, what IDE's you recommend or how to set up a productive environment and the choices between using Juypter Notebooks, VSCode, etc. Would be really interesting to see how you do it!

  • @statquest

    @statquest

    9 ай бұрын

    @@ZealotfeathersGorgonoth Awesome! BAM! I'll keep that in mind for a topic. Most people I know use VSCode (I use notebooks since I think they are good for teaching and learning.) But there are a ton of other things that go into a good environment.

  • @namyashah
    @namyashah2 ай бұрын

    In this exact same code, what changes would I have to make if I want a bidirectional LSTM and I want to predict more than 2 classes?

  • @statquest

    @statquest

    2 ай бұрын

    If you want to predict more than 2 classes, you can run the output through a fully connected layer and then through a softmax layer.
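    A minimal sketch of that idea (LSTM output -> fully connected layer -> softmax). The number of classes, hidden size, and the bidirectional flag below are made-up illustration values, not from the video.

    import torch
    import torch.nn as nn

    num_classes, hidden_size = 4, 8
    lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True, bidirectional=True)
    fc = nn.Linear(2 * hidden_size, num_classes)   # 2x because bidirectional concatenates both directions

    x = torch.rand(1, 5, 1)                        # batch=1, 5 time steps, 1 feature
    lstm_out, (h_n, c_n) = lstm(x)
    final = torch.cat([h_n[0], h_n[1]], dim=-1)    # final hidden state from each direction
    logits = fc(final)
    probs = torch.softmax(logits, dim=-1)          # one probability per class
    print(probs)                                   # shape (1, 4), rows sum to 1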

  • @asjadnabeel
    @asjadnabeel Жыл бұрын

    BAMs!! Is there a StatQuest video on transformers? I couldn't locate one in the playlists. Waiting for it...

  • @statquest

    @statquest

    Жыл бұрын

    Not yet. I'm working on it.

  • @asjadnabeel

    @asjadnabeel

    Жыл бұрын

    @@statquest Ok .. Thanks brother..

  • @cheynin
    @cheynin Жыл бұрын

    what "temp" using for in line "lstm_out, temp = self.lstm(input_trans)"?

  • @statquest

    @statquest

    Жыл бұрын

    It's a tuple that contains the final hidden state (short-term memory) and the final cell state (long-term memory). In other words, we could have used the first part of that tuple instead of lstm_out[-1] if we wanted to.
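    A small sketch of what that tuple holds, for a single-layer, unidirectional nn.LSTM: the last time step of lstm_out and the final hidden state contain the same value.

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=1, hidden_size=1)
    input_trans = torch.tensor([0., 0.5, 0.25, 1.]).view(4, 1, 1)   # (seq_len, batch, features)

    lstm_out, (h_n, c_n) = lstm(input_trans)   # "temp" in the video is the (h_n, c_n) tuple
    print(lstm_out[-1])   # output at the last time step
    print(h_n[0])         # final hidden state (short-term memory) - the same value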

  • @Shahawir
    @Shahawir11 ай бұрын

    Hello, if you have panel data, an imbalanced one - I mean like the companies you have, but suppose you have 500 companies, and the companies' data are not equal (some have 11 data points, some have 20, some have 34) - will an LSTM be good here? And what if you want to incorporate some other categorical variables - does an LSTM allow this? I need help with this. If anyone knows any resources/keywords, or anything that can help me solve this problem, please do not hesitate to comment. Thanks in advance.

  • @statquest

    @statquest

    11 ай бұрын

    The whole idea of LSTMs (and all Recurrent Neural Networks) is to be able to work with different amounts of data associated with each sample or company or whatever it is you are using. If you have categorical data, then you probably need to one-hot-encode the data. For details, see: kzread.info/dash/bejne/Z2xt0KWAlbqtYdo.html
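    A small sketch of the one-hot idea, using a made-up categorical feature with 3 categories observed over 4 time steps; the resulting columns can then be concatenated with the numeric inputs at each time step.

    import torch
    import torch.nn.functional as F

    categories = torch.tensor([0, 2, 1, 2])            # e.g. one categorical feature over 4 time steps
    one_hot = F.one_hot(categories, num_classes=3).float()
    print(one_hot)
    # tensor([[1., 0., 0.],
    #         [0., 0., 1.],
    #         [0., 1., 0.],
    #         [0., 0., 1.]])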

  • @Shahawir

    @Shahawir

    11 ай бұрын

    @@statquest Thanks a lot for taking from your time to answer my question. You saved me, literally…🤝🤝

  • @statquest

    @statquest

    11 ай бұрын

    @@Shahawir bam!

  • @MrShreyansh502
    @MrShreyansh502 Жыл бұрын

    Hello statquest, statquest website seems to be down. It says "error establishing data connection".

  • @statquest

    @statquest

    Жыл бұрын

    Yep. It went down last night. It's back up.

  • @MrShreyansh502

    @MrShreyansh502

    Жыл бұрын

    @@statquest Thanks 👍

  • @luizcarlosazevedo9558
    @luizcarlosazevedo9558 Жыл бұрын

    I could only import Lightning with pytorch_lightning - is it the same as the lightning package?

  • @statquest

    @statquest

    Жыл бұрын

    Hmm... Try it and see if it works. However, I suspect it is different. You may need to update Lightning and PyTorch Lightning.

  • @duttaoindril
    @duttaoindril Жыл бұрын

    Still waiting on the last one in the series - attention.

  • @statquest

    @statquest

    Жыл бұрын

    Still working on it.

  • @michaeldouglas7641
    @michaeldouglas764110 ай бұрын

    When will we hear from the GOAT ML/DS educator on the topics of 1) transformers, 2) GNNs, and 3) VAEs (in descending order of importance)...?

  • @statquest

    @statquest

    10 ай бұрын

    My video on transformers is currently available for early access to channel members and patreon supporters.

  • @SaschaRobitzki
    @SaschaRobitzki4 ай бұрын

    Why is the LSTMbyHand's training_step not using the batch_idx?

  • @statquest

    @statquest

    4 ай бұрын

    Because we don't need to know the index.

  • @SaschaRobitzki

    @SaschaRobitzki

    4 ай бұрын

    @@statquest I was just wondering because you specifically mentioned batch_idx in the video, so I thought you actually had planned making use of it.

  • @statquest

    @statquest

    4 ай бұрын

    @@SaschaRobitzki I mention it because you need to include it, not because you need to use it.

  • @LuizHenrique-qr3lt
    @LuizHenrique-qr3lt Жыл бұрын

    Hey Josh send a "salve" tô the cível people ouro squad ia called darthcivel. Great vídeo like everyone else, congratulations!

  • @statquest

    @statquest

    Жыл бұрын

    Thank you! However, I can't quite make out what your comment is. What is a "salve"?

  • @LuizHenrique-qr3lt

    @LuizHenrique-qr3lt

    Жыл бұрын

    @@statquest "Salve" is like a greeting in Brazil, when you arrive somewhere with other people you say "salve" it's like "hi, how are you?"

  • @statquest

    @statquest

    Жыл бұрын

    @@LuizHenrique-qr3lt Muito obrigado! :)

  • @user-ih8kn3ji7k
    @user-ih8kn3ji7k4 ай бұрын

    Hi Josh, I have now watched your 20+ videos on NN and I have learned a lot. Thanks a lot for a very good setup!!! As I am new to this, there are still many things I do not understand or cannot figure out myself, so I will ask you: when I use your nn.LSTM model on the 'stock' data and print the daily LSTM output after 300 epochs, I see that the output values are very different from the input data, i.e. [0, 0.5, 0.25, 1], [1.0, 0.5, 0.25, 1.0]. The output I get for the same days is: Epoch 299: 0%| | 0/2 [00:00

  • @statquest

    @statquest

    4 ай бұрын

    The goal is to only predict what happens on day 5, so that is the only value we use in the loss function.