Herman Kamper

A Git workflow

Comments

  • @rahilnecefov2018 · 2 days ago

    I learned a lot as an Azerbaijani student. Thanks a lot <3

  • @ramilsabirov6591 · 5 days ago

    Really great explanations. I also really like your calm way of explaining things. I get the feeling that you distill everything important before recording the video. Keep up the great work!

  • @kamperh · 4 days ago

    Thanks a ton for this!! I enjoy making the videos, but it definitely takes a bit of time :)

  • @liyingyeo5920 · 6 days ago

    Thank you

  • @rahilnecefov2018 · 7 days ago

    bro just keep teaching, that is great!

  • @josephengelmeier9856 · 11 days ago

    These videos are sorely underrated. Your explanations are concise and clear, thank you for making this topic so easy to understand and implement. Cheers from Pittsburgh.

  • @kamperh · 10 days ago

    Thanks so much for the massive encouragement!!

  • @Aruuuq · 13 days ago

    Working in NLP myself, I very much enjoy your videos as a refresher on current developments. Continuing from your epilogue, will you cover the DPO process in detail?

  • @kamperh · 12 days ago

    Thanks for the encouragement @Aruuuq! Jip I still have one more video in this series to make (hopefully next week). It won't explain every little detail of the RL part, but hopefully the big stuff.

  • @OussemaGuerriche · 21 days ago

    your way of explanation is very good

  • @shylilak · 21 days ago

    Thomas 🤣

  • @MuhammadSqlain · 25 days ago

    good sir

  • @TechRevolutionNow · 25 days ago

    thank you very much professor.

  • @ozysjahputera7669 · 27 days ago

    One of the best explanations on PCA relationship with SVD!

  • @martinpareegol5263 · 1 month ago

    Why is it preferred to minimize the cross-entropy rather than minimizing the NLL? Does that approach have more useful properties?

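A minimal numpy sketch of why, for the usual one-hot classification targets, the two objectives coincide (an illustration by way of reference, not code from the video; the distribution and class index are made up): with a one-hot target, the cross-entropy sum collapses to the negative log-probability of the correct class.

    import numpy as np

    # Predicted distribution q over 3 classes, true class index c.
    q = np.array([0.1, 0.7, 0.2])
    c = 1

    # Cross-entropy with a one-hot target p: H(p, q) = -sum_k p_k * log(q_k).
    p = np.zeros_like(q)
    p[c] = 1.0
    cross_entropy = -np.sum(p * np.log(q))

    # Negative log-likelihood of the correct class.
    nll = -np.log(q[c])

    print(cross_entropy, nll)  # both are ~0.357: the two objectives are identical here
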
  • @chetterhummin1482 · 1 month ago

    Thank you, really great explanation, I think I can understand it now.

  • @zephyrus1333 · 1 month ago

    Thanks for the lecture.

  • @adosar7261 · 1 month ago

    With regards to the clock analogy (0:48): "If you know where you are on the clock then you will know where you are in the input". Why not just a single clock with very small frequency? A very small frequency will guarantee that even for large sentences there will be no "overlap" at the same position in the clock for different positions in the input.

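For reference, the standard sinusoidal encoding (assuming the video follows the original Transformer formulation) uses many clocks at geometrically spaced frequencies rather than one slow clock. A single very slow clock would indeed avoid wrap-around, but nearby positions would then receive almost identical values; mixing fast and slow frequencies gives both fine local resolution and unambiguous codes over long ranges. A minimal numpy sketch (shapes and names are made up for illustration):

    import numpy as np

    def sinusoidal_positional_encoding(num_positions, d_model):
        # One frequency per (sin, cos) pair, from fast-varying to slow-varying.
        positions = np.arange(num_positions)[:, None]      # (num_positions, 1)
        dims = np.arange(0, d_model, 2)[None, :]           # (1, d_model / 2)
        angles = positions / (10000.0 ** (dims / d_model))
        pe = np.zeros((num_positions, d_model))
        pe[:, 0::2] = np.sin(angles)
        pe[:, 1::2] = np.cos(angles)
        return pe

    pe = sinusoidal_positional_encoding(num_positions=50, d_model=16)
    print(pe.shape)  # (50, 16)
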
  • @ex-pwian1190 · 1 month ago

    The best explanation!

  • @frogvonneumann9761 · 1 month ago

    Great explanation!! Thank you so much for uploading!

  • @Le_Parrikar · 1 month ago

    Great video. That meow from the cat though

  • @kobi981 · 1 month ago

    Thanks! Great video.

  • @harshadsaykhedkar1515 · 2 months ago

    This is one of the better explanations of how the heck we go from maximum likelihood to using NLL loss to log of softmax. Thanks!

  • @shahulrahman2516 · 2 months ago

    Great Explanation

  • @shahulrahman2516 · 2 months ago

    Thank you

  • @yaghiyahbrenner8902 · 2 months ago

    Sticking to a simple Git workflow is beneficial, particularly using feature branches. However, adopting a 'Gitflow' working model should be avoided as it can become a cargo cult practice within an organization or team. As you mentioned, the author of this model has reconsidered its effectiveness. Gitflow can be cognitively taxing, promote silos, and delay merge conflicts until the end of sprint work cycles. Instead, using a trunk-based development approach is preferable. While this method requires more frequent pulls and daily merging, it ensures that everyone stays up-to-date with the main branch.

  • @kamperh · 2 months ago

    Thanks a ton for this, very useful. I think we ended up doing this type of model anyway. But good to know the actual words to use to describe it!

  • @basiaostaszewska7775 · 2 months ago

    It's a very clear explanation, thank you very much!

  • @bleusorcoc1080 · 2 months ago

    Does this algorithm work with negative instances? I mean, can I use vectors with both negative and positive values?

  • @kundanyalangi2922 · 2 months ago

    Good explanation. Thank you Herman

  • @niklasfischer3146 · 2 months ago

    Hello Herman, first of all a very informative video! I have a question: How are the weight matrices defined? Are the matrices simply randomized in each layer? Do you have any literature on this? Thank you very much!

  • @kamperh · 2 months ago

    This is a good question! These matrices will start out being randomly initialised, but then -- crucially -- they will be updated through gradient descent. Stated informally, each parameter in each of the matrices will be wiggled so that the loss goes down. Hope that makes sense!

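To make the reply above concrete, here is a minimal PyTorch sketch (an illustration only; the shapes, loss and learning rate are made up): the matrix starts out random, and each gradient step nudges every entry in the direction that lowers the loss.

    import torch

    # A single weight matrix, randomly initialised.
    W = torch.randn(8, 4, requires_grad=True)

    x = torch.randn(16, 8)       # toy batch of inputs
    target = torch.randn(16, 4)  # toy targets

    optimizer = torch.optim.SGD([W], lr=0.1)
    for step in range(100):
        loss = ((x @ W - target) ** 2).mean()  # any differentiable loss
        optimizer.zero_grad()
        loss.backward()   # gradient of the loss w.r.t. every entry of W
        optimizer.step()  # "wiggle" each entry so that the loss goes down
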
  • @anthonytafoya3451 · 2 months ago

    Great vid!

  • @electric_sand · 2 months ago

    6:23 Your face need not be excused :)

  • @kamperh · 2 months ago

    :)

  • @ChrisNorulak · 2 months ago

    Had to basically learn git in 10 minutes and cook it down to 5 minutes for a group project at school - glad to see something so visual and well explained (and code included!)

  • @kamperh · 2 months ago

    Wasn't sure this video was worth posting, so very happy this helped someone! :)

  • @delbarton314159 · 2 months ago

    so in Q = XW, every single entry on the right side of this calculation needs to be learned?

  • @delbarton314159 · 2 months ago

    Q, K and V are all populated with parameters all of which need to be learned?

  • @delbarton314159 · 2 months ago

    D sub k is the dimensionality of the embeddings?

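On the three questions above, in standard scaled dot-product attention notation (a minimal numpy sketch for illustration, not the video's exact code; all shapes are made up): the learned parameters are the projection matrices W_Q, W_K and W_V, while Q, K and V themselves are computed from the input X and change with every input. d_k is the dimensionality of the queries and keys; it can equal the embedding dimensionality in single-head examples, but in multi-head attention it is typically d_model divided by the number of heads.

    import numpy as np

    d_model, d_k, seq_len = 16, 8, 5

    # Learned parameters (updated by gradient descent during training).
    W_Q = np.random.randn(d_model, d_k) * 0.02
    W_K = np.random.randn(d_model, d_k) * 0.02
    W_V = np.random.randn(d_model, d_k) * 0.02

    X = np.random.randn(seq_len, d_model)  # input token representations

    # Q, K, V are computed, not stored as parameters.
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V

    scores = Q @ K.T / np.sqrt(d_k)  # scaled dot-product scores
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    output = weights @ V             # (seq_len, d_k)
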
  • @delbarton314159 · 2 months ago

    Also, at 10:36 you refer to a relevant Google AI blog post, but I cannot find that reference in the notes below this video. Could you post it?

  • @kamperh · 2 months ago

    Happy to help! On p. 4 of the notes, you can just click the link in blue.

  • @delbarton314159 · 2 months ago

    at the very beginning of this video, you mention "watch the videos on RNNs". I have been unable to find them....

  • @darh78 · 2 months ago

    What a great explanation of DTW!

  • @delbarton314159 · 2 months ago

    great stuff! would have liked to see the RNN lectures as well, but they don't seem to be in your channel.

  • @kamperh · 2 months ago

    Really happy that the videos are helping! The RNN videos are the last videos on my list; they have been recorded, but I still need to edit them substantially. I need to have them released before the middle of July, in case that helps. Sorry for the delays!

  • @vivi412a8nl · 3 months ago

    I have a question regarding the u and v vectors. If I understand correctly (hopefully), then a word will have 2 embeddings, one for when it is a center word (which is v), and one for when it is a context word (which is u)? If so, which embedding will be used to represent the word after we trained the network? Let's say we initialize the matrices V and U at random, then we'd train the network to update both V and U? Then which matrix do we use for our embeddings? Sorry if the question doesn't make sense I'm very new to NLP.

  • @kamperh · 3 months ago

    Have a look at my other videos in the playlist (kzread.info/head/PLmZlBIcArwhPN5aRBaB_yTA0Yz5RQe5A_). I believe it is answered in one of them. Hope that helps!

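For readers who do not want to jump to the playlist straight away, a minimal numpy sketch of the skip-gram setup being asked about (an illustration only; the sizes and indices are made up): there are indeed two matrices, one for centre words and one for context words, both updated during training, and after training a common convention is to keep the centre-word matrix as the word embeddings, or to average the two.

    import numpy as np

    vocab_size, embed_dim = 10000, 100

    # Two embedding matrices, both randomly initialised and both updated during training.
    V = np.random.randn(vocab_size, embed_dim) * 0.01  # centre-word ("input") vectors
    U = np.random.randn(vocab_size, embed_dim) * 0.01  # context-word ("output") vectors

    # Score of context word o given centre word c (fed into a softmax during training).
    c, o = 42, 7
    score = U[o] @ V[c]

    # Common conventions after training: use V, or average the two matrices.
    embeddings = (V + U) / 2.0
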
  • @sw_2421 · 3 months ago

    Thanks for the explanation

  • @guestvil · 3 months ago

    Thanks! Best explanation on this that I've seen so far, and I've seen a lot.

  • @equationalmc9862 · 3 months ago

    I am learning and completely fascinated... but the cat interrupting was hilarious as well.

  • @richsajdak · 3 months ago

    Fantastic job! This is one of the best explanations of DTW I've seen

  • @adrianjohn8111 · 3 months ago

    Wow. Thank you

  • @sauravgahlawat9077 · 3 months ago

    GOATed explanation!

  • @delbarton314159 · 3 months ago

    K is ~5,000? (stated around 6:00) I thought K was the number of "states", which, in turn, I thought was the POS tags. The number of parts of speech does not seem to be anywhere near 5,000. More like a handful... 7? 10? 20? What am I missing?

  • @manoharmishra8172 · 3 months ago

    Thanks a ton, HK. I followed this whole NLP series and it's truly great; Google and the references helped as well, and your explanations are fresh and easily graspable. The classroom talks were the best part. I did struggle a bit with HMMs, but eventually I got better there as well. Thanks for the great course. Any chance I could get a question paper or something to test myself on the course?

  • @delbarton314159 · 3 months ago

    best explanation of positional encoding that I've seen. TY

  • @MarcoColangelo-mu6de · 3 months ago

    Thank you very much, I found your explanation one of the clearest ones on the web, very useful

  • @EzraSchroeder · 3 months ago

    4:49 if anyone asks what you're doing: watching cat videos on the Internet

  • @kamperh · 3 months ago

    🤣

  • @Charles_Reid · 3 months ago

    Thanks, this is a very helpful video. One question: in the video you mentioned that since probabilities are between 0 and 1 and sum to 1, you need to raise e to the power of each score and divide by the sum to obtain a probability. Is there a reason that you choose e as the base of the exponent? Why not choose another number? My confusion is that if I chose a number like 10 as the base, I'm pretty sure my softmax model would classify everything the same as if I had chosen e, but the probabilities calculated would be different. I'm wondering if softmax is actually returning the real probability, or just a number between 0 and 1 that behaves like the real probability. Thanks!

  • @kamperh · 3 months ago

    This is a really good question that I hadn't thought about before. First, using base 10 will probably work fine, for all the reasons you say. If you were training a neural network, you could probably use any number and the network would just adjust the logits to do what it must do. I see there are some practical reasons to use e: forums.fast.ai/t/why-does-softmax-use-e/78118. And finally I want to ask, tongue-in-cheek: what does it mean when you say "real probability"? :) No one knows the real probability except the Creator, and all we're doing is trying to model it ;)

  • @Charles_Reid · 3 months ago

    @kamperh Yeah, maybe the "real probability" can only be 0 or 1, since the data point either does belong to the class or does not. But we don't know which class it belongs to, so softmax gives us a probability that is different from the so-called "real probability" but that helps us make a guess. Thank you for your help!
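As a quick numerical check of the base-e versus base-10 discussion above (a minimal numpy sketch, not from the video; the scores are made up): softmax with base b is the same as the usual base-e softmax applied to logits scaled by ln(b), so the probabilities change but the ranking, and hence the predicted class, does not.

    import numpy as np

    def softmax(z, base=np.e):
        z = np.asarray(z, dtype=float)
        p = base ** (z - z.max())  # subtract the max for numerical stability
        return p / p.sum()

    scores = np.array([2.0, 1.0, 0.1])
    print(softmax(scores))             # base e:  approx. [0.659, 0.242, 0.099]
    print(softmax(scores, base=10.0))  # base 10: approx. [0.899, 0.090, 0.011]
    print(softmax(scores).argmax() == softmax(scores, base=10.0).argmax())  # True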