Activation Functions - Softmax

Science & Technology

Comments: 45

  • @finneggers6612
    3 years ago

    So, for anyone wondering: there is an issue with the derivative in this one! It is not as simple as I stated. I was a bit younger when I did this, so please have mercy on me :) The rest should still be correct.
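
For readers following along, the forward pass itself can be sketched in a few lines of Python (the video's series uses Java, but the math is identical; the max-shift is a standard numerical-stability trick, not something from the video):

```python
import math

def softmax(a):
    """Softmax: exponentiate each entry, then normalize so the outputs sum to 1.
    Shifting by the max first avoids overflow for large inputs."""
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
# the outputs sum to 1 and keep the ordering of the inputs
```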

  • @earltan739
    5 years ago

    Thanks for the most concise and forthright explanation of the softmax activation function I've ever seen!

  • @dor00012
    4 years ago

    What are you talking about? I can barely understand what he's saying.

  • @giovannipizzato6888
    4 years ago

    Amazing explanation. I loved the fact that you took some example numbers, did the calculations, and showed how the values are modified by the function. Really got the point home. Keep it up!

  • @alinamanmedia
    3 years ago

    The best explanation I've heard.

  • @jays907
    5 years ago

    Thank you so much for the explanation!

  • @GamelessAnonim
    4 years ago

    Damn, why couldn't everyone explain it like this? I am dumb and I need an explanation like I am a 5-year-old but most of the explanations on the internet assume that we are all smart as fuck. Thank you!

  • @pankaj_pundir
    5 years ago

    Great, found the difference between softmax and sigmoid. Thanks

  • @99strenth3
    5 years ago

    Good explanation 10/10

  • @nuriel3
    a year ago

    GREAT VIDEO, thank you!

  • @guillemguigocorominas2898
    5 years ago

    I think what you mean in the last part, about the difference between using a sigmoid or a softmax for classification, is that for a binary classification problem you only need the probabilities of the two outcomes and a threshold: say, if I predict A with over 50% probability then my prediction is A, otherwise my prediction is B. For a multi-class classification task, however, you want to normalize over all possible outcomes to obtain a prediction probability for each class.

  • @finneggers6612
    5 years ago

    Yeah, exactly. I might not have pointed that out well enough.
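
The distinction in this exchange can be sketched in a few lines of Python (a hedged illustration; the 0.5 threshold and the example logits are made up):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(a):
    exps = [math.exp(v) for v in a]
    total = sum(exps)
    return [e / total for e in exps]

# Binary classification: one sigmoid output and a threshold are enough.
logit = 0.8
prediction = "A" if sigmoid(logit) > 0.5 else "B"

# Multi-class classification: normalize over all logits
# to get one probability per class.
class_probs = softmax([2.0, 1.0, 0.1])
```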

  • @ShermanSitter
    4 years ago

    At 5:30 the light bulb went on. THANK YOU! :)

  • @clubgothica
    5 years ago

    Excellent explanation.

  • @optimusagha3553
    2 years ago

    Thanks, easy to follow👏🏾👏🏾

  • @feridakifzade9070
    4 years ago

    Perfect ✌🏻✌🏻

  • @DamianReloaded
    4 years ago

    I wonder in which cases it's advantageous to use softmax over using percentages of the total sum? Numerically it seems softmax is good for separating big values from smaller ones. EDIT: *googles* Apparently it's exactly that: to make high values more evident.
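
The comparison raised here is easy to check directly (a small sketch; the example values are arbitrary). Dividing by the plain sum keeps the original ratios, while softmax exaggerates the share of the largest value, and softmax also keeps working when inputs are negative, where plain sum-normalization can fail:

```python
import math

def normalize(a):
    """Plain percentages of the total sum."""
    total = sum(a)
    return [x / total for x in a]

def softmax(a):
    exps = [math.exp(x) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

values = [1.0, 2.0, 4.0]
plain = normalize(values)  # ratios match the raw values: 1/7, 2/7, 4/7
soft = softmax(values)     # the largest value gets a much bigger share
```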

  • @omaral-janabi9186
    5 years ago

    I love it :) thanks

  • @familypart2617
    4 years ago

    I love your videos!!!!! They helped me create my very first AI ever! Your tutorials are so concise! I was wondering, if you know how to do it, could you make a tutorial on Q-learning in Java, then deep Q-learning in Java? The deep Q-learning is something I have been struggling to implement.

  • @finneggers6612
    4 years ago

    What part are you struggling with? I have implemented it. Basically, you should first implement a version with a table, without neural networks. After that, you replace the table with a neural network and add a replay buffer. I have code which works. You can also add me on Discord (Luecx@0540) and we can talk about it in detail.

  • @familypart2617
    4 years ago

    @finneggers6612 I have the concept down for basic Q-learning; however, I cannot figure out how to even begin to train the AI, like what inputs to give it and how to train it with the reward and punishment. I tried to send a friend request to chat with you on Discord about it, but it didn't work. I can give you my Discord real quick: KRYSTOS THE OVERLORD#4864
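
The path suggested above (tabular first, neural network later) can be sketched like this. This is a Python toy version rather than the Java one discussed in the thread, and the state count, action count, and hyperparameters are placeholders:

```python
import random

n_states, n_actions = 5, 2             # placeholder sizes for a toy environment
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

# The Q-table: one row per state, one value per action.
Q = [[0.0] * n_actions for _ in range(n_states)]

def choose_action(state):
    """Epsilon-greedy: explore randomly sometimes, else take the best known action."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[state][a])

def update(state, action, reward, next_state):
    """The Q-learning update rule; this table is what a neural network later replaces."""
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# One illustrative transition: reward 1.0 for taking action 1 in state 0.
update(0, 1, 1.0, 1)
```

The training loop the commenter asks about just repeats: observe state, `choose_action`, step the environment, then `update` with the observed reward and next state.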

  • @idobooks909
    3 years ago

    This (3:20) little thing tells a lot about you and is the way to reach more subs. Thanks!

  • @ax5344
    5 years ago

    At @2:06 there seems to be a typo: x should be [0,1,2,3,4,5] instead of [1,2,3,4,5,6]. f(0) = 1/(1 + e^0) = 0.5; f(1) != 0.5.

  • @finneggers6612
    5 years ago

    Yeah you are right. My bad. Thanks for noticing!
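
The typo is easy to verify numerically (a quick Python check, not code from the video):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))  # 0.5, so the point that maps to 0.5 is x = 0, not x = 1
print(sigmoid(1))  # about 0.731
```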

  • @swaralipibose9731
    3 years ago

    You just got a new like and subscriber

  • @okonkwo.ify18
    a year ago

    There’s no problem with sigmoid; all activation functions have their uses.

  • @dogNamedMerlin
    5 years ago

    Thanks, Finn- helpful! I don't think you mention why you need the exponential functions in the Softmax definition. If you showed some negative example values as components of your a-vector (totally legitimate outputs of e.g. a layer with a tanh activation function) it would be easier to see that without them, you wouldn't be guaranteed probabilities bounded by zero and one.

  • @finneggers6612
    5 years ago

    You are correct! I did not think about that when I made the video, but your critique is 100% correct. Thank you for pointing this one out. The exponential function has nice derivative behaviour, and its output value is always > 0. Anything else would not make sense in this context.
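
The point about negative inputs can be demonstrated directly (a sketch; the example vector is made up). Because exp(x) > 0 for every real x, even negative tanh-style activations come out as valid probabilities:

```python
import math

def softmax(a):
    exps = [math.exp(x) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

# Outputs of a tanh layer can be negative, yet the probabilities stay in (0, 1).
probs = softmax([-0.9, -0.2, 0.7])
```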

  • @ahmedelsabagh6990
    4 years ago

    Excellent explanation

  • @ccuuttww
    5 years ago

    The problem with softmax is the form of its derivative; see math.stackexchange.com/questions/945871/derivative-of-softmax-loss-function?rq=1. You must consider two cases: 1. i not equal to k, and 2. i equal to k. I have calculated it myself, but I'm not sure if it is right. Can you go through it once?
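
For reference, the two cases the comment mentions are, with s = softmax(a): ds_i/da_k = s_i * (1 - s_i) when i = k, and -s_i * s_k when i != k. A sketch with a finite-difference sanity check (the test vector is arbitrary):

```python
import math

def softmax(a):
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(a):
    """Entry (i, k) is d softmax_i / d a_k = s_i * (delta_ik - s_k)."""
    s = softmax(a)
    n = len(s)
    return [[s[i] * ((1.0 if i == k else 0.0) - s[k]) for k in range(n)]
            for i in range(n)]

# Finite-difference check of one off-diagonal entry (i = 0, k = 1).
a = [1.0, 2.0, 0.5]
h = 1e-6
bumped = a[:]
bumped[1] += h
numeric = (softmax(bumped)[0] - softmax(a)[0]) / h
analytic = softmax_jacobian(a)[0][1]
```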

  • @ahmedelsabagh6990
    3 years ago

    Super excellent

  • @Ip_man22
    5 years ago

    thanks a lot

  • @yusuferoglu9287
    5 years ago

    Thanks for the explanation, Finn!! I have a question: whenever I google "derivative of softmax function" I always find something like this: eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/. I am working on a project that is a pure Java implementation of a multi-layered NN. Can you help me with how I can use the derivative of the softmax function?

  • @TheFelipe10848
    3 years ago

    Congrats on making this so simple to understand; you actually know what the function does. I sometimes wonder if people actually understand the content they reproduce, or are just too lazy to try to put things in a way others can understand. Einstein famously said: "If you can't explain it simply, you don't understand it well enough."

  • @finneggers6612
    3 years ago

    Well, I hope I explained it right. I was a lot younger, and I feel like the derivative might not be that simple... Hope it's still remotely correct :)

  • @american-professor
    4 years ago

    Why do we use e?

  • @omarmiah7496
    3 years ago

    My understanding is that we use e because it doesn't change the probability by much, as opposed to multiplying by a constant such as 100; it's a form of normalizing the data. Check out kzread.info/dash/bejne/aqSnwax-h5eYqNY.html, where he goes deeper into the actual use of Euler's constant e.

  • @alaashams8137
    3 years ago

    respect

  • @alexkatz9047
    4 years ago

    why do we use "e"?

  • @finneggers6612
    4 years ago

    I am not 100% sure, but maybe it's a combination of "the derivative is pretty simple" and "we need something exponential". The latter so that the probabilities make a little bit more sense.
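
One way to see why the base barely matters mathematically: b^x = e^(x * ln b), so switching from e to any other base b > 1 is the same as rescaling the inputs by ln b (a "temperature"). e is the convenient choice because d/dx e^x = e^x. A quick check (the inputs are arbitrary):

```python
import math

def softmax(a):
    exps = [math.exp(x) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_base(a, b):
    """Softmax computed with base b instead of e."""
    powers = [b ** x for x in a]
    total = sum(powers)
    return [p / total for p in powers]

a = [1.0, 2.0, 3.0]
# Base-2 softmax equals the usual softmax on inputs scaled by ln 2.
p_base2 = softmax_base(a, 2.0)
p_scaled = softmax([math.log(2.0) * x for x in a])
```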

  • @privacyprivate9330
    4 years ago

    What is "e"? How can I get the value of "e"?

  • @privacyprivate9330
    4 years ago

    At 4:24 in the video.

  • @finneggers6612
    4 years ago

    @privacyprivate9330 It's probably Euler's number. It's about 2.7, but in every programming language it should be defined somewhere.

  • @privacyprivate9330
    4 years ago

    @finneggers6612 Thank you

  • @RamVelagala
    4 years ago

    Thanks for the explanation.
