L18.3: Modifying the GAN Loss Function for Practical Use
Science & Technology
Slides: sebastianraschka.com/pdf/lect...
-------
This video is part of my Introduction to Deep Learning course.
Next video: • L18.4: A GAN for Gener...
The complete playlist: • Intro to Deep Learning...
A handy overview page with links to the materials: sebastianraschka.com/blog/202...
-------
If you want to be notified about future videos, please consider subscribing to my channel: / sebastianraschka
Comments: 15
In the end everything is clear... I have gone through many videos but didn't get a proper understanding... but this video explains what's going on inside those equations!! Great
@SebastianRaschka
2 years ago
awesome! glad to hear!
Thank you so much for this beautiful and detailed discussion. After reading and getting confused in so many places, I landed on your channel, and it cleared up all the ambiguities. 🙏😇
@SebastianRaschka
A year ago
awesome, I am happy to hear this!
well explained. thanks
Hi, I have a question. For the second equation at 7:30, how did you get the [0, inf] range? log(0) = -inf. Or was the equation changed to the negative log-likelihood, so it became +inf? Thank you.
@SebastianRaschka
3 years ago
Good catch, looks like I forgot the minus sign at the bottom.
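For completeness, here is a quick numeric check (a small sketch of my own, not from the video) showing why the minus sign matters: the negative log maps probabilities in (0, 1] onto [0, inf), which is where the [0, inf] range comes from, while the plain log of those same values lies in (-inf, 0].

```python
import math

# -log maps probabilities in (0, 1] onto [0, inf):
for p in (1.0, 0.5, 0.01):
    print(round(-math.log(p), 4))  # grows toward +inf as p -> 0

# Without the minus sign, log(p) <= 0 on (0, 1], heading to -inf instead.
print(math.log(1.0))  # 0.0
```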
Thank you for the great video! At 14:19, I understand that the first equation is the gradient ascent from the original paper, and the second equation is the negative log-likelihood, which, as I understand it, just adds negative signs to the normal log-likelihood. But how did you transform equation 1 into the normal log-likelihood to begin with? Integration? Thank you!
4:08 How does D(G(z)) --> 0 maximize the gradient descent equation? Please help, I'm completely confused!
@adityarajora7219
2 жыл бұрын
By "minimize", do you mean only the magnitude? I'm assuming you are minimizing the magnitude. Is that the case?
@SebastianRaschka
2 жыл бұрын
"How does D(G(z)) --> 0 maximize the gradient descent equation?" Actually, we are not trying to maximize it but to minimize it. So the equation at the top is gradient descent with log(1 - D(G(z))). Since we feed fake images, D(G(z)) will be close to 0 at the beginning, so the loss is log(1 - 0) = log(1) = 0. With gradient descent, it then moves toward -inf as D(G(z)) -> 1.
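The behavior described in that reply can be sketched numerically (my own illustration in plain Python, with y_hat standing in for D(G(z))): the generator loss starts at log(1) = 0 when the discriminator easily rejects fakes and decreases toward -inf as the generator fools it.

```python
import math

def generator_loss(y_hat):
    """Original minimax generator loss term: log(1 - D(G(z)))."""
    return math.log(1.0 - y_hat)

# Early in training the discriminator spots fakes easily: D(G(z)) ~ 0
print(generator_loss(0.0))  # log(1 - 0) = 0.0

# As the generator improves, D(G(z)) -> 1 and the loss heads toward -inf
for y_hat in (0.5, 0.9, 0.99):
    print(round(generator_loss(y_hat), 4))
```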
The explanation of gradient vanishing with a strong discriminator is quite vague. The gradient of -1 is not weak by itself; the loss gradient vanishes because, by the chain rule, that -1 is multiplied by the gradient of D w.r.t. G, which is tiny for a confident/saturated discriminator. Once you change the formulation to 1/y_hat, the small gradient is now multiplied by a very large number.
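To make that chain-rule argument concrete, here is a small sketch (my own illustration, with y_hat = D(G(z))) comparing the d(loss)/d(y_hat) factor of the two formulations: in the saturating loss log(1 - y_hat), the tiny discriminator gradient is multiplied by roughly -1, while in the non-saturating loss -log(y_hat) it is multiplied by -1/y_hat, which is huge when the discriminator confidently rejects the fake.

```python
def grad_saturating(y_hat):
    """d/dy [log(1 - y)] = -1 / (1 - y); roughly -1 when y ~ 0."""
    return -1.0 / (1.0 - y_hat)

def grad_non_saturating(y_hat):
    """d/dy [-log(y)] = -1 / y; very large in magnitude when y ~ 0."""
    return -1.0 / y_hat

y_hat = 0.01  # confident discriminator rejecting a fake
print(grad_saturating(y_hat))      # ~ -1.01
print(grad_non_saturating(y_hat))  # ~ -100
```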
This video is a bit complicated.
@SebastianRaschka
A year ago
Thanks for the feedback!
@Belishop
A year ago
@@SebastianRaschka Thank you for the video!