Meet the first OS for AI.
Use Lightning to build high-performance PyTorch models without the boilerplate. Scale them with Lightning Apps (end-to-end ML systems), which can be anything from production-ready, multi-cloud ML systems to simple research demos.
Want to know more? Visit our website - lightning.ai/
And don't forget to subscribe!
Comments
Nice intro to Thunder and DL compilers in general
Thank you for this 🤗🤗
Great introductory video for such a complex topic. Looking forward to one about distributed training.
Benefits of using cosine annealing learning rate scheduler
Cool. I hope you'll continue doing these livestreams.
Thank you! Yes we will, see you next Friday!
Is there a template for comfyui?
Yes! We do have templates using comfyui and more templates being added regularly.
@@PyTorchLightning can you please link one in this post? It would be really helpful.
@@Lily-wr1nw Visit Lightning.ai to browse the studio templates available! Here's a link to one to get you started: lightning.ai/mpilosov/studios/stable-diffusion-with-comfyui
I’ve been trying to understand the stable diffusion unet in detail for a while. This video added a few pieces of information I was missing from other material. Thanks!
I hope you solve this problem in PyTorch Lightning: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock. self.pid = os.fork()
We are always working to alleviate problems people have while training. Join our Discord to connect with a wide variety of experts in all things ML: lnkd.in/g63PCKBN
My dream: I wish I had the 8-GPU mode unlocked so I could use a script to recover my password, but I'm poor.
I want to train a model to detect text similarity between 2 questions, scored between 0 and 1. My dataset is unlabeled; how should I proceed? Can you guide me?
Good question! Join our discord and get advice from a wide variety of experts in all things ML, including a special channel dedicated to this course. lnkd.in/g63PCKBN
if overfit_batches uses the same batches for training and validation, shouldn't the validation loss == the training loss ?? I see the training loss getting reduced but the validation loss is increasing !! 😳
I have a guess, but I'd appreciate some confirmation: overfit_batches doesn't use the same batches in training and validation, BUT the same batch count! So if the DataModule provides val_dataloader and train_dataloader, both are called, and the same number of batches is sampled from each.
@@osamansr5281 The answer you arrived at is correct. :) Join the Lightning AI Discord for continued discussion with the ML community: discord.gg/zYcT6Yk9kw
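To make the semantics discussed in this thread concrete, here's a minimal pure-Python sketch (this is not Lightning's actual implementation; the loaders and batch contents are made up) showing that the same batch *count*, not the same batches, is taken from each loader:

```python
# Rough sketch of the overfit_batches semantics: the SAME batch count is
# taken from each loader, but the batches themselves differ.
def take_batches(loader, n):
    """Take the first n batches from an iterable of batches."""
    return [batch for i, batch in enumerate(loader) if i < n]

train_loader = [[0, 1], [2, 3], [4, 5], [6, 7]]   # hypothetical train batches
val_loader = [[10, 11], [12, 13], [14, 15]]       # hypothetical val batches

overfit_batches = 2
train_used = take_batches(train_loader, overfit_batches)
val_used = take_batches(val_loader, overfit_batches)

print(len(train_used) == len(val_used))  # True: same count...
print(train_used != val_used)            # True: ...but different batches
```

This is also consistent with the observation earlier in the thread that the training and validation losses can diverge even with overfit_batches set.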
omg this is horrible
did I misunderstand something, or is the graph presented in the overfitting section of the video from [0:22] to [1:00] mislabeled 🧐 overfitting occurs when the train accuracy *RED* increases while the test accuracy *BLUE* decreases, correct? 🤔 aren't the colors swapped? btw, thanks for the amazing tutorials and special thanks for updating them <3
Good question. I think your question arises because this graph shows the training and test accuracy in a slightly different context: here, we are looking at the performance for different portions of the dataset. The overall idea is still true: the larger the gap, the bigger the degree of overfitting. But the reason you see the training accuracy go down is that with more data, it becomes harder to memorize (because there's simply more data to memorize). And if there is more data (and it's harder to memorize), it becomes easier to generalize, hence the test accuracy goes up.
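A tiny numeric illustration of the point in the reply above, with made-up accuracy numbers: as the training set grows, train accuracy drops (harder to memorize) while test accuracy rises (easier to generalize), so the train/test gap, which measures the degree of overfitting, shrinks.

```python
# Hypothetical accuracies for increasing training-set sizes (made-up numbers).
dataset_sizes = [100, 1000, 10000]
train_acc = [1.00, 0.95, 0.90]  # harder to memorize as the dataset grows
test_acc = [0.60, 0.80, 0.85]   # easier to generalize with more data

# The train/test gap is the degree of overfitting; it shrinks with more data.
gaps = [round(tr - te, 2) for tr, te in zip(train_acc, test_acc)]
print(gaps)  # → [0.4, 0.15, 0.05]
```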
Love it thanks a lot Linus //
This was very clear and informative
my plot_loss_and_acc():

def plot_loss_and_acc(log_dir) -> None:
    import pandas as pd
    import matplotlib.pyplot as plt

    metrics = pd.read_csv(f"{log_dir}/metrics.csv")
    # Group metrics by epoch and calculate mean for each metric
    df_metrics = metrics.groupby("epoch").mean()
    # Add epoch as a column (the index is the grouping key)
    df_metrics["epoch"] = df_metrics.index
    print(df_metrics.head(10))
    df_metrics[["train_loss", "val_loss"]].plot(
        grid=True, legend=True, xlabel="Epoch", ylabel="Loss", title="Loss Curve"
    )
    df_metrics[["train_acc_epoch", "val_acc_epoch"]].plot(
        grid=True, legend=True, xlabel="Epoch", ylabel="ACC", title="Accuracy"
    )
    plt.show()

plot_loss_and_acc(trainer.logger.log_dir)
Couldn't you use 8-bit precision during training by using double weights, hence enabling more error tolerance and more speed-up options?
You just lost $510 because I'm not waiting 2 to 3 days to have my email "verified".
Thanks
why are you blinking like that are you ok
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=1, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.1, random_state=1, stratify=y_train)

For this code I'm getting this error: ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2. Is there anything I can do to fix this?
Perhaps there is an issue in the data not getting loaded correctly and so there's a truncated dataset, which could cause that issue. If you open an issue on our course GitHub, we could help you debug and get to the bottom of it: github.com/Lightning-AI/dl-fundamentals/issues
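As a quick sanity check before opening an issue, here's a small stdlib-only sketch (the labels are made up for illustration) for finding which classes trigger that error: stratified splitting requires at least 2 examples per class.

```python
from collections import Counter

# Hypothetical labels; "bird" appears only once and would break stratify=y.
y = ["cat", "dog", "dog", "bird", "cat"]

counts = Counter(y)
too_rare = [label for label, n in counts.items() if n < 2]
print(too_rare)  # → ['bird']  (drop, merge, or collect more of these classes)
```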
I think we need more tutorial videos on Lightning Studio.
Thanks for the feedback! Check out Lightning's founder William Falcon's youtube channel for more videos featuring Lightning Studio: www.youtube.com/@WilliamAFalcon
That is quite a unique and nice functionality! I faced the issues of OOM at higher batch sizes, and I think this is a good solution to it! Keep the good work going 😁
Isn't MLE used in Logistic regression and not Gradient descent?
Hi there, so in this example, we perform maximum likelihood estimation (MLE) using gradient descent
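To illustrate that reply, here's a minimal pure-Python sketch (not the course code; the toy data and learning rate are made up) of maximum likelihood estimation for logistic regression via gradient descent: minimizing the negative log-likelihood with gradient steps is the same as maximizing the likelihood.

```python
import math

# Toy 1-D binary classification data (made up for illustration).
X = [0.0, 1.0, 2.0, 3.0]
y = [0, 0, 1, 1]

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    grad_w = grad_b = 0.0
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-(w * xi + b)))  # sigmoid(w*x + b)
        # Gradient of the negative log-likelihood (logistic loss):
        grad_w += (p - yi) * xi
        grad_b += (p - yi)
    w -= lr * grad_w
    b -= lr * grad_b

preds = [1.0 / (1.0 + math.exp(-(w * x + b))) for x in X]
print([round(p) for p in preds])  # should recover the labels [0, 0, 1, 1]
```

So MLE is the objective and gradient descent is the optimizer that pursues it; they are not competing alternatives.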
Sebastian, I have recently started to watch your videos on AI. I find the material relatively easy to follow and very interesting. I do have a question related to section 3.6. In the code we are looping over the minibatches with 'for batch_idx, (features, class_labels) in enumerate(train_loader):'. At first I thought I understood this, but when I inserted a line in the code to print out the class_labels, I expected the output on every second minibatch to be the same. However, they are not. Does this mean that every time we run the line - for batch_idx, (features, class_labels) in enumerate(train_loader): - the data is being shuffled?? Ivar
Hi there. Yes, the data is being shuffled via the data loader. This is usually recommended -- I did experiments many years ago with and without shuffling, and neural networks learn better if they see the data in a different order in each epoch. You can turn off the shuffling via `shuffle=False` in the data loader if you want (in the code here it's set to shuffle=True).
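A tiny pure-Python illustration of the behavior described above (this is not a real DataLoader; the data and seed are made up): with shuffling on, each epoch visits the same samples in a fresh random order, which is why the printed class_labels differ between epochs.

```python
import random

data = list(range(20))  # stand-in for dataset indices

def epoch_order(data, shuffle, rng):
    """Return the order in which one epoch visits the samples."""
    order = data[:]
    if shuffle:
        rng.shuffle(order)
    return order

rng = random.Random(123)  # arbitrary seed for reproducibility
epoch1 = epoch_order(data, shuffle=True, rng=rng)
epoch2 = epoch_order(data, shuffle=True, rng=rng)

print(sorted(epoch1) == sorted(epoch2))  # True: same samples each epoch...
print(epoch1 != epoch2)                  # True: ...in a different order
```

With shuffle=False, both epochs would produce the identical order.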
HATE PYTORCH LIGHTNING
missing unit 6.5
It's true! Unit 6.5 was great while it lasted, but in order to not share outdated material, we retired that subsection.
French accent is classy but also unfortunately hard to understand
Passing just `overfit_batches` to the trainer also outputs validation metrics even if `limit_val_batches=0`. Any ideas?
Good question! These are not meant to be used together. Overfit batches is probably overwriting limit val batches, but if you feel like there's a bug please open an issue on GitHub.
Great Ideas. Thank u @thesephist!
Why not leaky ReLU with a relatively steep slope (say 0.5)? It seems like all these activation functions tend toward almost no slope before 0 (which slows training). There must be a reason?
Great video. Also, the speed at which you do things is just right, so I can follow and write the code at the same time. (I'm only 12 min into the video, but so far it's great.)
I think a great video, if you don't have one already, would be on interfacing with React to reproduce Jupyter-Notebook-like embeds or standalone webviews.
Shouldn't you have a Sigmoid activation for it to be a true Logistic Regression?
can i change my weights ?
I am confused: how can I set reload_dataloaders_every_epoch to True? The lines you said to change belong to which class or function? You gave only 3 lines, so how do I know where they go?
this video is underrated!
I like the run-through we get in every video, but is there a GitHub/Colab file for the code we are using in each video? I would like to test the code myself for better understanding.
Good question! You can check out the course site at lightning.ai/ai-education/ for relevant links to code files for each unit in Deep Learning Fundamentals.
Do I have to be expert in programming to build my LLM?
Check out Lightning Studios at Lightning.ai to find Studio Templates that can help jumpstart your LLM building and become an expert as you do it. It's more accessible than ever.
When using `tuner.lr_find` in Lightning 2.2.0 with PyTorch 2.2.0, I get the warning below:

...\Lib\site-packages\torch\optim\lr_scheduler.py:143: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "

Is that an issue with Lightning?
`The batch size 65536 is greater or equal than the length of your dataset. Finished batch size finder, will continue with full run using batch size 65536` It's just a small test dataset, but still, does that necessarily mean this is the optimal batch size?
That would be based on the assumption that larger batch sizes are always better, which has to be taken with a grain of salt. In this case, the dataset is so small that it fits memory-wise, but it is probably not ideal, because you would then run full-batch gradient descent instead of minibatch gradient descent. I would reduce it.
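A quick arithmetic sketch of why a batch size at or above the dataset length amounts to full-batch gradient descent (the dataset length here is hypothetical): the number of parameter updates per epoch collapses to 1, so you lose the extra updates and gradient noise that minibatch training provides.

```python
import math

dataset_len = 60000  # hypothetical small dataset

# Updates per epoch = ceil(dataset size / batch size).
for batch_size in (64, 1024, 65536):
    updates_per_epoch = math.ceil(dataset_len / batch_size)
    print(batch_size, updates_per_epoch)  # 65536 -> just 1 update per epoch
```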
Sorry, but your sound is horrible.
For some reason I can't get the trainer to run the fit without error, I always get `RuntimeError: DataLoader worker (pid(s) 42256, 36008, 33348, 43200) exited unexpectedly` with different pids at every run.
Setting `num_workers=0` in the `DataLoader` fixed the issue. Though, it would be cool to have multiprocessing enabled without crashes.
@@SaschaRobitzki I sometimes get the same issue (it's a PyTorch issue, not a PyTorch Lightning one, as it appears in either case, whether I use plain PyTorch or PyTorch Lightning). Interestingly, I only observe something like this when I work on small teaching code using MNIST or small text files. I suspect it has something to do with the DataLoader opening and closing too many files too quickly, because the files are so small in this case. I am not 100% certain, but that's my best guess since, like you said, changing to num_workers=0 usually works.
`datasets` is pretty picky when it comes to the `fsspec` version. I could get datasets 2.16.1 only to work with fsspec 2023.5.0, even though newer versions up to 2023.10.0 are supposed to be compatible.
Thanks for the comment. Arg, yeah, with PyTorch in general I also use a Python version that is 1-2 versions behind the most recent release. PyTorch is a pretty complex code base, so it usually takes a bit of time to fully support the next Python version.
In Unit 7.4 and 7.5 I sometimes get the "RuntimeError: Detected more unique values in `preds` than `num_classes`. Expected only 10 but found 11 in `preds`." Any idea how to fix that?
Arg sorry to hear, that sounds like a frustrating one. I must say that I never encountered this issue and thus can't say much about the root cause. I am suspecting there's maybe some parsing issue in the PyTorch Dataset class. Maybe it's operating system depending. I wish I could tell you more here.