Understanding Variational Autoencoder | VAE Explained

In this video I deep dive into the Variational Autoencoder (VAE). If you're interested in understanding the inner workings of Variational Autoencoders and how they differ from traditional autoencoders, you're in the right place.
🔍 In this video, we'll cover the following key points:
What is a Variational Autoencoder (VAE) and how does it work?
Difference between Autoencoder and Variational Autoencoder.
The loss function used in Variational Autoencoders to optimize their training.
Building your very own Variational Autoencoder
Conditional VAE (Conditional Variational Autoencoder)
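For readers who want a preview of the loss function covered in the video, here is a minimal PyTorch sketch. This is illustrative only, not the exact code from the video (the full implementation is in the Github repo linked in the comments): it assumes a Gaussian encoder that outputs `mu` and `log_var`, and uses the standard reparameterization trick.

```python
import torch

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    # Reconstruction term: how well the decoder rebuilds the input
    recon = torch.nn.functional.mse_loss(x_recon, x, reduction='sum')
    # Closed-form KL divergence between N(mu, sigma^2) and the prior N(0, I)
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + beta * kl

def reparameterize(mu, log_var):
    # Sample z = mu + sigma * eps so gradients can flow through mu and log_var
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps
```

When the encoder's output matches the prior (mu = 0, log_var = 0), the KL term vanishes, which is one way to sanity-check the formula.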
Resources Used in Making Video
1. Understanding Variational Autoencoders (VAEs) - tinyurl.com/vae-link1
2. Ali Ghodsi, Lec: Deep Learning, Variational Autoencoder, Oct 12 2017 [Lect 6.2]
3. Variational Autoencoders (VAEs): Generative AI I - tinyurl.com/vae-link2
Helpful Links
KL Divergence
1. Wikipedia - tinyurl.com/vae-link3
2. tinyurl.com/vae-link4
Computing P(x)
1. Sec 2.1 - tinyurl.com/vae-arxiv-link
2. tinyurl.com/vae-stack-1
3. tinyurl.com/vae-stack-2
Background Track - Fruits of Life by Jimena Contreras
Email : explainingai.official@gmail.com

Comments: 16

  • @Explaining-AI
    9 months ago

    Github Code - github.com/explainingai-code/VAE-Pytorch

  • @bhanutejanellore9609
    9 months ago

This is the best content on VAE I have seen on YouTube. You must do more videos!

  • @Explaining-AI
    9 months ago

    Thank you! Yes, I do plan to post a video every 7-10 days for the foreseeable future. Do consider subscribing if you like the content :)

  • @ronnieleon7857
    2 months ago

    From the video it is evident that a variational autoencoder works better for image de-noising than a plain autoencoder: the autoencoder's encoder maps each image to a single point in latent space, whereas the VAE's encoder maps each image to a distribution over the latent space.
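The distinction this comment draws can be sketched in a few lines of PyTorch (illustrative toy layers, not the repo's code): an autoencoder's encoder emits one deterministic latent point, while a VAE's encoder emits the parameters of a distribution from which each forward pass samples a different latent.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 8)              # a toy "image" vector (hypothetical sizes)
enc = torch.nn.Linear(8, 2)        # toy autoencoder encoder
enc_mu = torch.nn.Linear(8, 2)     # toy VAE encoder heads
enc_logvar = torch.nn.Linear(8, 2)

# Autoencoder: the same input always maps to the same latent point
z_ae_1 = enc(x)
z_ae_2 = enc(x)

# VAE: the input maps to a distribution; each pass samples a different z
mu, log_var = enc_mu(x), enc_logvar(x)
z_vae_1 = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
z_vae_2 = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
```

Two encodings of the same input coincide for the autoencoder but (almost surely) differ for the VAE, which is what gives the VAE a smoother, better-populated latent space.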

  • @douwedb
    5 months ago

    Really nice video, thanks a lot for this great explanation!

  • @Explaining-AI
    5 months ago

    Thank you :)

  • @random_op
    13 days ago

    Great video sir. Sir, can you make a video explaining all the probability and statistics required to thoroughly understand generative models? Or can you suggest some beginner-friendly resources to master those? Lacking those, I am unable to grasp the whole concept of VAEs, GANs, diffusion models...

  • @Explaining-AI
    6 days ago

    Hello, really sorry for the late reply. I will try to make a video on this some time later, but for now I am just adding resources which I found beneficial. Not saying these are the best (or the only resources you should use), but these helped me.
    1. Linear Algebra course by Prof. Gilbert Strang, and videos on Lem.ma (www.lem.ma/books/Ai_Km5W5WPfsPZqqV2XWGg/landing) by Prof. Pavel Grinfeld
    2. 3blue1brown linear algebra videos - www.3blue1brown.com/topics/linear-algebra
    3. StatQuest videos - kzread.info/head/PLblh5JKOoLUK0FLuzwntyYI10UQFUhsY9
    4. Machine Learning: A Probabilistic Perspective
    5. Mathematics for Machine Learning book (mml-book.github.io/)
    Obviously these would take some time to go through (I think it's worth it), so in parallel you can start with autoencoders and then VAEs. While going through them, whenever you stumble on some topic/math that is unclear, use the resources above to get a thorough understanding of it, then go back and see if the topic is clearer. Feel free to email me if you think a discussion or call would be helpful; I will try to get a better understanding of exactly what the missing gaps are and suggest accordingly.

  • @nitishgupta143
    8 months ago

    I need a super like button for this explanation. Just one suggestion: you should consider using a more unique name for the channel. It's tough to find it by a simple search.

  • @Explaining-AI
    8 months ago

    Thank you so much for saying that. And I agree: when I was thinking about the channel name, I just used whatever my creativity (or rather lack of it) could stumble upon quickly so that I could start uploading videos, without giving any thought to searchability. Now, because it's such a common term, reaching the channel page via search is extremely difficult. I will soon be changing it to a more unique (and hopefully catchy) name.

  • @hientq3824
    4 months ago

    7:49 I don't really understand why the underlined part is constant, can you explain a bit?

  • @Explaining-AI
    4 months ago

    @hientq3824 Since p(x) is the distribution of our data (images), no matter what we do to q(z), log(p(x)) is not impacted by it. To simplify further, you can bring the log(p(x)) term outside the sum, and because Sum(q(z)) over all possible z is just 1, the last term is just log(p(x)).
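The step being described in the reply above can be written out explicitly, using the same q(z) and p(x) notation: log p(x) does not depend on z, so it factors out of the sum, and q(z) sums to 1 because it is a probability distribution.

```latex
\sum_{z} q(z)\,\log p(x)
  \;=\; \log p(x) \sum_{z} q(z)
  \;=\; \log p(x) \cdot 1
  \;=\; \log p(x)
```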

  • @inkxmpetentertyp7047
    4 months ago

    Can you also implement this model for music generation? I assume you can, but since you talk only about image generation I thought I'd better ask.

  • @Explaining-AI
    4 months ago

    Yes. Only the choice of how we encode the input into the latent and how we decode the latent into the reconstruction (the encoder and decoder) might change based on whether it's music/image/speech/language. I should have mentioned in the video that the input representation can be of any modality. If you are interested in using VAEs for music, you can check out this paper, which is one of the early works to do the same - arxiv.org/pdf/1803.05428.pdf

  • @alexijohansen
    5 months ago

    Thank you, but it assumes a lot of math knowledge.

  • @Explaining-AI
    5 months ago

    Thank you so much for that feedback. Next time, wherever the video makes such assumptions, I will try to give a brief overview of them.