AI Coffee Break with Letitia

Lighthearted bite-sized ML videos for your AI Coffee Break! 📺 Mostly videos about the latest technical advancements in AI, such as large language models (LLMs), text-to-image models and everything cool in natural language processing, computer vision, etc.!

We try to post twice a month, if not more often! 🤞 But, you know, a PhD thesis still has to be worked on.

Disclaimer: Opinions expressed are solely my own and do not express the views or opinions of my employer.


Impressum: aicoffeebreak.com/impressum.html

Comments

  • @ais3153 (4 hours ago)

    Congrats! 🎉🎊🎈 Do you have any advice for students on how to read research papers in the AI field? Sometimes, reading a single paper can take a very long time. Also, how can we identify research gaps?

  • @zerotwo7319 (9 hours ago)

    Thanks for the inspiration, DOCTOR Letitia!

  • @aflah7572 (20 hours ago)

    Congratulations, Dr. Letitia! It was great meeting you at ACL, and all the best for the future :)

  • @AICoffeeBreak (18 hours ago)

    Thank you, great meeting you too! Good luck to you too!

  • @aflah7572 (17 hours ago)

    @AICoffeeBreak Thank you!!

  • @chinonyeigwe2351 (22 hours ago)

    Congratulations, Letitia! Thank you for being open and sharing your journey with us. Your vulnerability offers consolation to some of us going through tough phases, and I'll hold the nuggets of experience you've shared close, e.g. being bold enough to approach people, and recognising that we're more than what appears (e.g. our reviewed papers), etc.

  • @Deepia-ls2fo (22 hours ago)

    Hi! I discovered your channel last year; many useful explanations here :) I'm a first-year AI / image processing PhD student in France, and I also launched my channel recently. Many thanks for your insights; it's so nice to have people who inspire us share their stories. At 32:20, when you say you mentioned authors in your posts, how did you go about doing that? I've shared some of my content on Reddit but never tried reaching out to the actual authors of the papers I used... I would absolutely love showing them how I animate some of their ideas, I just don't really know how to do so without it being annoying. Maybe an email? Edit: and of course congrats on your PhD, Dr. Letitia!

  • @AICoffeeBreak (17 hours ago)

    Wow, I just checked out your channel, incredible. You use manim; that's a lot of work, which pays off in the beautiful visualisations! Subscribed! How long do you need to make a video? Asking for a friend coffee bean. 😅 Coming to your question now: it's easier to tag authors when they've just released a paper and the video is specifically about that paper. For broader concepts like "autoencoders", which are widely taught and cited, tagging someone like Goodfellow wouldn't be as effective; he'd probably just ignore it. 😅 On Twitter, you can simply @mention an author at the end of the tweet. However, emailing doesn't really help with generating views, since authors can't easily share your video with a simple "Retweet" button; most people are unlikely to take the time to compose their own social media post based on an email. Are you on Twitter?

  • @Deepia-ls2fo (16 hours ago)

    @AICoffeeBreak Thank you for your reply and the kind words! I would say it takes me around 60 hours to produce a video, but of course it depends on the length, etc. Indeed, I don't see myself tagging legends aha. I don't have Twitter and never thought of creating an account for the channel. That would be a good idea!

  • @AICoffeeBreak (16 hours ago)

    @Deepia-ls2fo yes, it definitely would make sense to make channel accounts for popular social networks. 😅

  • @user-xk6rg7nh8y (22 hours ago)

    Congratulations Dr. CoffeeBean 🎉🎉

  • @AICoffeeBreak (17 hours ago)

    Thank you!

  • @s11-informationatyourservi44 (a day ago)

    Congratulations, Dr. Letitia ❤

  • @franzbohmisch6365 (a day ago)

    Congratulations, and thank you for starting the channel. I saw almost all your videos and learned a lot!

  • @AICoffeeBreak (23 hours ago)

    Wow, thank you very much! Hope to see you from now on as well.

  • @AndreyKurenkov (a day ago)

    Congrats!!!

  • @AICoffeeBreak (23 hours ago)

    Thanks, Andrey!

  • @theosalmon (a day ago)

    Congratulations on graduating with highest honors... and highest hat!

  • @AICoffeeBreak (23 hours ago)

    Yes, indeed, my colleagues built the highest hat 🎓 I've seen so far.

  • @raghavshandilya1949 (a day ago)

    Thanks, Dr., this helped! First, congratulations are in order for this feat; people in the field will agree that this was not an easy task to accomplish. I have searched the internet, and not many people are talking about their journey into a PhD in AI or ML (as much as there should be). I am currently pursuing my master's in DS & ML, and I have a goal of finishing a PhD in the coming years.

    I know you are occupied with many tasks at hand, but I would like to suggest a topic for videos which people like me might find really helpful: a playlist or series where you break down the essential parts of getting into a PhD. For example: the prerequisites one needs to start a PhD, how to find an advisor, how to consider schools, and what kind of preparation goes into even approaching a lab for a PhD. I know this is very specific, but these are the pain points where people like me are dumbfounded and have no idea how to proceed; such a series would really help. I will also write to you personally over email for guidance and hope to hear from you; any guidance would be really appreciated. Love this channel; tbh I discovered it only recently, but the videos you have are really good. 👍

  • @TemporaryForstudy (2 days ago)

    Where is the party? 😂

  • @thenoblerot (2 days ago)

    Wonderful! Congratulations, Doctor! 🍾 🥂

  • (2 days ago)

    Congratulations on successfully defending your PhD! I know firsthand how challenging it is to reach this milestone; I just defended mine this May, so I really appreciate the dedication and hard work you've put into it. I think it's fantastic that you started this YouTube channel. It's not only a great way to share your expertise but also a valuable resource for people like me who want to stay up to date with the latest in AI and ML. Your ability to simplify and highlight the most important developments in the field is truly appreciated. Keep up the amazing work, and I look forward to seeing more of your content!

  • @emekaobiefuna4509 (2 days ago)

    Congrats Dr. Letitia 🎉🎉🎉🎉

  • @AliMoeeny (2 days ago)

    Congratulations

  • @AICoffeeBreak (23 hours ago)

    Thank you!

  • @Smoshfaaaaaaaaaaan (2 days ago)

    Congratulations!!

  • @DistortedV12 (2 days ago)

    Okay so that's why you know so much about this area lol

  • @deepfirecat (2 days ago)

    Congratulations! This was really a good video. Very inspiring.

  • @AICoffeeBreak (2 days ago)

    So glad to hear this!

  • @lw4423 (2 days ago)

    I love it when women are in science

  • @juanmanuelcirotorres6155 (2 days ago)

    Congrats

  • @AICoffeeBreak (23 hours ago)

    Thank you!

  • @scottmiller2591 (2 days ago)

    Congratulations, you beautiful giraffe.

  • @AICoffeeBreak (2 days ago)

    🦒

  • @maloukemallouke9735 (2 days ago)

    Congratulations! I think you should start with teaching us about AI with your videos and an FAQ.

  • @ikr4epi307 (2 days ago)

    Congrats! Can you share the GitHub page and, if possible, your thesis with me?

  • @IbrahimSobh (2 days ago)

    Congratulations my friend Dr. Letitia :)

  • @AM-yk5yd (2 days ago)

    Congrats!

  • @batukaanozen7922 (2 days ago)

    Thanks a lot!!

  • @CipherOne (2 days ago)

    Thank you for sharing! This was really enlightening.

  • @kenchang3456 (2 days ago)

    Congratulations!

  • @makhalid1999 (2 days ago)

    "AI Coffee Break with Dr. Letitia"

  • @AICoffeeBreak (2 days ago)

    I still need to publish the thesis to get the title. That's how it is in Germany. 😅 The defense is not enough.

  • @makhalid1999 (2 days ago)

    @AICoffeeBreak Haha, good luck :)

  • @AICoffeeBreak (2 days ago)

    @makhalid1999

  • @ahmadalis1517 (2 days ago)

    Very inspiring video! Thanks for sharing! Could you tell me which software and hardware you used to make your videos?

  • @AICoffeeBreak (23 hours ago)

    Sure, do you mean this video, or the usual ones? For the usual ones, I use good old PowerPoint for the visualisations and Adobe Premiere for editing and the Coffee Bean. I record my usual videos with an old phone (Samsung Galaxy A40) linked as a webcam to the computer. When I'm travelling and recording outside my office, I use the DJI Pocket.

  • @AICoffeeBreak (23 hours ago)

    Also, a microphone is important. I have the Trust Emita Plus. Of course, there are more expensive ones out there.

  • @smartinezai (2 days ago)

    Do you have any tips on how to find PhD programmes in ML?

  • @AICoffeeBreak (18 hours ago)

    Besides the usual searching on the internet, maybe attend a conference, perhaps virtually to save a lot on the participation fees. People usually post a lot of open positions (such as PhD positions) in the conferences' social channels. Just an idea.

  • @smartinezai (17 hours ago)

    @AICoffeeBreak Thank you, I'll try it out and hope I can find one with free tuition or some way to fund it.

  • @SaketNam (2 days ago)

    Congratulations

  • @ruiguo1294 (2 days ago)

    I think we met at ACL🎉🎉

  • @cosmic_reef_17 (3 days ago)

    Marvellous video with a lot of helpful insights and advice! Congratulations on successfully finishing your PhD journey and good luck to you for what comes up next!

  • @AICoffeeBreak (3 days ago)

    Thank you so much!

  • @DerPylz (3 days ago)

    Wow, epic video!

  • @Youkouleleh (3 days ago)

    So Latent Diffusion Models (LDMs) add Gaussian noise to the latent code, while Paella replaces random entries of the latent code with random entries from the dictionary. Why is this better? Because the "noise" added for Paella comes only from "the dictionary", which means you don't need to deal with the diffusion model predicting a "mean" representation of the image? Diffusion models' intermediate steps look like an "average" representation of what the model thinks the current prediction is (because MSE performs mode averaging, I guess).
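
    A minimal sketch to make the contrast concrete (a toy linear noise schedule and hypothetical names and shapes, not the papers' actual code): LDM-style noising perturbs a continuous latent with Gaussian noise, while Paella-style noising replaces a random fraction of discrete latent tokens with random codebook entries.

    ```python
    import torch

    def ldm_noise(z, t):
        # LDM-style: blend the continuous latent with Gaussian noise.
        # (Illustrative linear schedule; real models use a tuned alpha/beta schedule.)
        alpha = 1.0 - t  # t in [0, 1]
        return alpha**0.5 * z + (1 - alpha)**0.5 * torch.randn_like(z)

    def paella_noise(tokens, t, vocab_size):
        # Paella-style: replace a random fraction t of the discrete latent tokens
        # with tokens drawn uniformly from the codebook ("the dictionary").
        replace = torch.rand(tokens.shape) < t
        random_tokens = torch.randint(0, vocab_size, tokens.shape)
        return torch.where(replace, random_tokens, tokens)

    # Usage: a 16x16 grid of codebook indices, 40% of entries corrupted.
    tokens = torch.randint(0, 8192, (16, 16))
    noised = paella_noise(tokens, t=0.4, vocab_size=8192)
    ```

    Note that every corrupted entry is still a valid dictionary token, which is the point the comment makes: the noised latent never leaves the discrete codebook.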

  • @rannyrh (7 days ago)

    I love your analyses so much, very thoughtful and thorough. Thank you!

  • @AICoffeeBreak (7 days ago)

    Thank you for this wonderful comment!

  • @kamiboy (10 days ago)

    Do I understand correctly that in MAMBA, to predict the current token, the architecture only has access to an embedding influenced by all previous tokens, with no influence from future tokens? This is kind of a weakness, isn't it? I think LLMs are constructed so that their classification tasks can take into account all tokens in the sequence when predicting any token in the sequence.

  • @AICoffeeBreak (10 days ago)

    Yes, exactly. If the summary / history token misses something, it is gone forever. The hope is to learn to retain everything (important).

  • @kamiboy (10 days ago)

    @AICoffeeBreak No, what I actually meant was something else. Let me explain it like this. Let's say the input sequence is "I LIKE TO [CLS] A BURGER", and I want to predict which token most likely fits into the [CLS] position. In an LLM, the classification of [CLS] would be able to take into account all tokens before [CLS] ("I LIKE TO") as well as all tokens after it ("A BURGER"). But from what I understand, MAMBA would only use "I LIKE TO" for the prediction, because the embedding used to predict the current token only depends on what came before it in the sequence. Am I understanding this right? This is certainly the case for plain SSMs, but I am not sure about the selective SSMs of MAMBA.

  • @AICoffeeBreak (10 days ago)

    @kamiboy Ah, I understand: you mean bidirectional models like BERT that can classify things in the middle of the sequence. Yes, encoder models like BERT can look into the future. GPT models don't, because their attention is causally masked. There, the art is to frame / prompt the problem such that the answer is at the end of the sequence. This is quite simple, and it works for MAMBA too: "I like to eat a [CLS] burger. What word does [CLS] stand for? The answer is:" But this is not a hard limitation of MAMBA, as one can make it bidirectional as well, like bidirectional LSTMs used to be: run the model once forwards to summarise the sequence to the left of [CLS], then run it once again from the right until the [CLS]. Concatenate the forward and backward summaries (token embeddings) and classify the [CLS] token based on that.
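
    A minimal sketch of this bidirectional trick, assuming `ssm_fwd` and `ssm_bwd` are any causal sequence modules (e.g. Mamba blocks) mapping (batch, seq_len, hidden) to the same shape; all names here are illustrative, not an official API.

    ```python
    import torch
    import torch.nn as nn

    class BidirectionalSSMClassifier(nn.Module):
        def __init__(self, ssm_fwd, ssm_bwd, hidden_dim, num_classes):
            super().__init__()
            self.ssm_fwd = ssm_fwd  # reads the sequence left-to-right
            self.ssm_bwd = ssm_bwd  # reads the sequence right-to-left
            self.classifier = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, x, cls_pos):
            # x: (batch, seq_len, hidden_dim); cls_pos: index of the [CLS] token
            fwd = self.ssm_fwd(x)                        # summaries of the left context
            bwd = self.ssm_bwd(torch.flip(x, dims=[1]))  # run on the reversed sequence
            bwd = torch.flip(bwd, dims=[1])              # re-align with original positions
            # Concatenate the forward and backward summaries at [CLS] and classify:
            combined = torch.cat([fwd[:, cls_pos], bwd[:, cls_pos]], dim=-1)
            return self.classifier(combined)
    ```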

  • @kamiboy (10 days ago)

    @AICoffeeBreak Excellent, thanks for the clarification. I was already thinking that, if I was right, the bidirectional trick would be a solution. So that is how BERT works; neat to know.

  • @subusrable (13 days ago)

    this video is a gem. thanks!

  • @xxlvulkann6743 (13 days ago)

    Very well and succinctly explained! This channel is a great educational resource!

  • @science.20246 (14 days ago)

    mamba for video

  • @encapsulatio (16 days ago)

    Are there any LLMs out there that are specialized in teaching (potential students) and optimized in collaboration with specialists in pedagogical frameworks and pedagogical tools?

  • @AICoffeeBreak (5 days ago)

    I haven't heard of anything like this, except Khan Academy, who announced they will use LLMs in teaching. Also, check out Karpathy's new company. x.com/karpathy/status/1813263734707790301?t=1Fng5bSgI4Bm5tcoJdHIYA&s=19

  • @AICoffeeBreak (18 days ago)

    ERRATUM: At 3:44, showing the two equations: the second one should have the derivative with respect to x_t rather than w_t, to increase the loss as much as possible by travelling in the direction of the gradient of the loss with respect to the INPUT rather than with respect to the weights. Thanks to Hannes Whittingham for pointing this out! 🎯

  • @yingjiawan2514 (18 days ago)

    It is not clear to me from the video how FGSM modifies the input, since the graph shows the weight update calculated from the loss and the input x is not on the axes. Why can changing the input play the same role as a weight update?

  • @AICoffeeBreak (18 days ago)

    Thanks for the question. What an old video; yes, I could have made it clearer. The idea is to backpropagate the loss through the weights all the way to the input neurons (the input x), and then update the input x in the same way SGD updates the weights. I showed it for the weights because we can consider the input x, which is now variable, as an additional set of weights.
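
    A minimal FGSM sketch of exactly this idea: treat the input as extra "weights", backpropagate the loss down to it, and take one step in the direction that increases the loss. The epsilon value and the [0, 1] clamp are illustrative choices for image inputs.

    ```python
    import torch

    def fgsm_attack(model, loss_fn, x, y, epsilon=0.03):
        # Make the input a differentiable leaf, like a set of weights
        x = x.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x), y)
        loss.backward()  # gradients now flow all the way back to x
        # One step that INCREASES the loss: x' = x + eps * sign(dL/dx)
        x_adv = x + epsilon * x.grad.sign()
        return x_adv.clamp(0, 1).detach()  # keep pixels in a valid range
    ```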

  • @PaulVautravers (18 days ago)

    Thank you for the good video! Just watched this as part of the BlueDot AI safety fundamentals course and excited to learn more about adversarial examples

  • @anluifb (20 days ago)

    So you came up with a method, didn't have time to explain the method to us, and didn't show us that it works. Great. If you still have time before Bangkok I would suggest rerecording and focusing on the implementation and interpretation of results rather than the context and wordy descriptions.

  • @AICoffeeBreak (20 days ago)

    Thanks for your feedback. The method is in the video, just not the tiny details:
    1. Interpret prediction and explanation with SHAP (mentioned in the video).
    2. Measure their alignment (mentioned), after:
       - normalisation, to bring the values to the same range (mentioned; I did not mention that SHAP's properties make the values very different between output tokens of different probabilities),
       - aggregation, to collect the many values from the many outputs (mentioned; I did not mention that we use the mean for this).
    For the results, I've summarised in words what we see, together with the main takeaways; for the lengthy tables, please check the paper and its appendix. I'm not sure what you mean by the video not showing that it works: I've also shown an individual example before the takeaways. The problem that there is no ground truth exists for us as well as for previous work. But for the first time in the literature, we now *compare* existing works to each other, and our method to them. This is why the context is important: our paper's contribution is to evaluate and clarify the state of the field and, as a follow-up contribution, to propose this new method by solving the shortcomings of existing tests.
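
    A loose illustrative sketch of the normalise-then-aggregate pipeline described above. The normalisation and the mean aggregation are mentioned in the reply; the cosine similarity as the per-token alignment measure and all function names are assumptions for illustration, not the paper's actual implementation.

    ```python
    import numpy as np

    def alignment_score(pred_attributions, expl_attributions):
        # One SHAP vector per output token, for the prediction and for the
        # generated explanation respectively.
        def normalise(v):
            # Bring SHAP values for different output tokens to a comparable range
            n = np.linalg.norm(v)
            return v / n if n > 0 else v

        scores = []
        for p, e in zip(pred_attributions, expl_attributions):
            # Cosine similarity after normalisation (an assumed alignment measure)
            scores.append(float(np.dot(normalise(p), normalise(e))))
        # Aggregate the per-output-token values with the mean, as mentioned above
        return float(np.mean(scores))
    ```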

  • @koiRitwikHai (21 days ago)

    The authors say LoRA is about low-rank weight updates, which is a bad idea since weight updates are not always low-rank, and that low-rank gradients are a better alternative. My question: isn't the only difference between weight-update matrices and gradient matrices the multiplication by the learning rate, i.e. weight update matrix = learning rate * gradient matrix? So how come weight-update matrices are not always low-rank, but gradient matrices are? PS: congratulations on your defense :)

  • @AICoffeeBreak (20 days ago)

    Thank you! The trick is not that gradient matrices ARE low-rank, but that the training process *converges* with low-rank gradient matrices too, and this is what the authors also prove theoretically. Think of it this way: you want to move in Manhattan from A to B, but you can only do low-rank updates, meaning you can only move up-down and left-right (in a subspace of the low-rank gradient update), not diagonally. Eventually you can still get to any B, by moving once left and once up instead of once diagonally.

  • @AICoffeeBreak (20 days ago)

    This is a great question and if you have a follow-up, let me know. It is not easy to explain in a comment without any drawing. :)

  • @koiRitwikHai (20 days ago)

    @AICoffeeBreak Thank you so much for replying :) Big fan of your channel. You said "the training process converges with low-rank gradient matrices"; does that also mean that the training process converges with low-rank weight updates?

  • @AICoffeeBreak (20 days ago)

    Great to see you're following up! No, it does not mean that convergence is guaranteed by low-rank *weight* updates. Convergence refers to finding the set of weights that minimise the loss. If the possible updates are restricted because the weights themselves are kept low-rank, the optimal weights might never be found. However, similar to how movements in different directions can collectively bring you to your desired location, low-rank *gradient* updates can still lead to convergence.
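
    A minimal sketch of the distinction (illustrative names; the per-step SVD is for clarity only, and real methods use cheaper, periodically refreshed projections): the weights stay full-rank, but each individual update applied to them is low-rank.

    ```python
    import torch

    def low_rank_gradient_step(weight, lr=1e-3, rank=8):
        # Unlike LoRA, which restricts the WEIGHTS to a fixed low-rank form,
        # here only each step's GRADIENT is projected onto a low-rank subspace;
        # full-rank weight changes can still accumulate over many steps.
        U, S, Vh = torch.linalg.svd(weight.grad, full_matrices=False)
        low_rank_grad = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]
        with torch.no_grad():
            weight -= lr * low_rank_grad  # one "up/left" move in the Manhattan analogy
    ```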