Jensen's Inequality

The machine learning consultancy: truetheta.io
Want to work together? See here: truetheta.io/about/#want-to-w...
Article on the topic: truetheta.io/concepts/machine...
Jensen's Inequality appears multiple times in any rigorous machine learning textbook. It's essential for the key principles and foundational algorithms that make this field so productive. In this video, I state what it is, explain why it's important and show why it's true.
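For reference, here is a sketch of the statement in standard textbook notation (the symbols f and X are assumed here, not taken from the video). For a convex function f and a random variable X with finite expectation:

```latex
f\big(\mathbb{E}[X]\big) \;\le\; \mathbb{E}\big[f(X)\big]
```

For a concave f the inequality is reversed.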
SOCIAL MEDIA
LinkedIn : / dj-rich-90b91753
Twitter : / duanejrich
Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation
Sources and Learning More
To see Jensen's Inequality used in the justification for the EM algorithm, see section 11.4.7 of [1]. For its use in Information Theory, see section 2.6 of [2].
[1] Murphy, K. P. (2012). Machine Learning: a Probabilistic Perspective. MIT Press, Cambridge, MA, USA.
[2] Cover, T. M. & Thomas, J. A. (2006). Elements of Information Theory, 2nd Edition. Wiley-Interscience, New York, NY, USA.
Contents
00:00 - Why Jensen's Inequality is important
02:01 - Stating the Inequality
03:30 - Showing the Inequality
06:36 - Outro

Comments: 128

  • @willbutplural
    @willbutplural 1 year ago

    You simply condensed a complex topic that I spent 3 hours reading about in a textbook into a 10-minute video. Your ability to condense and concisely explain these topics in your videos has been phenomenal. Great job!

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Thank you! Glad this is working for you

  • @Form74
    @Form74 3 years ago

    Thanks for the intuition-nurturing graphics. Very helpful!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Glad it was helpful!

  • @SumayaKazi
    @SumayaKazi 3 years ago

    Congrats on the launch of your channel and first video, DJ! This was awesome!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thanks Sumaya! More coming :)

  • @jiayangcheng
    @jiayangcheng 2 years ago

    Intuition is indeed what helps me, at least, to understand a concept (not just hold it in short-term memory). Great work, thank you!

  • @sskhdsk

    @sskhdsk

    1 month ago

    Humans transform short-term memory into long-term memory through understanding and prediction.

  • @PolyRocketMatt
    @PolyRocketMatt 4 months ago

    Probably one of the most underrated inequalities... Shows up everywhere (mostly Machine-Learning these days, but I also encountered this in neutron transport and rendering of images)

  • @karanshah1698
    @karanshah1698 1 year ago

    You have no idea how often your explanations blow my mind. It is an "aha" moment every single time, a concept clicks so well! Please keep up this amazing work.

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Thank you, and I will! I've got something big in the works :)

  • @karanshah1698

    @karanshah1698

    1 year ago

    @@Mutual_Information Do you plan on doing one for EM derivation of GMMs?

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    @@karanshah1698 EM yes, GMMs, yes eventually. Using them together?? No I didn’t think of that, hm

  • @rangjungyeshe
    @rangjungyeshe 1 year ago

    You sure have a gift for teaching ! Plus, what a slick production . It takes a lot of hard work and skill to make something look as simple and obvious as you do. Awesome.

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    I appreciate that!

  • @hansenmarc
    @hansenmarc 1 year ago

    I was curious about Jensen’s inequality, having seen it in the context of EM. You did a great job of providing even more context and explaining the intuition. The animation makes it so easy to understand why it is true. Simply outstanding. This is hands-down the best video I’ve seen on the subject. Thank you! Just subscribed.

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Thanks - great to have you! This was my first vid. I've gotten a lot of useful feedback since, but glad this one still lands

  • @ananthakrishnank3208
    @ananthakrishnank3208 3 months ago

    Then for a concave function, I expect 'greater than or equal to' instead of 'less than or equal to'.

  • @debatirthadeb6632
    @debatirthadeb6632 3 months ago

    Great explanation. Instead of an average, it's better to think of a weighted average; this will easily convey the idea of the formal definition of a convex function : )
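The weighted-average view mentioned here corresponds to the usual definition of convexity. As a sketch in standard notation (the symbols \lambda, w_i and x_i are assumed for illustration), the two-point definition and its extension to any finite set of weights read:

```latex
f\big(\lambda x + (1-\lambda) y\big) \le \lambda f(x) + (1-\lambda) f(y),
  \qquad \lambda \in [0, 1]

f\Big(\sum_{i=1}^{n} w_i x_i\Big) \le \sum_{i=1}^{n} w_i f(x_i),
  \qquad w_i \ge 0,\ \sum_{i=1}^{n} w_i = 1
```

Reading the weights w_i as probabilities gives Jensen's inequality for a discrete random variable.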

  • @chnaka7518
    @chnaka7518 1 month ago

    Wow. The way you simplified the concept. I was amazed.😍

  • @user-vr1so7tc7x
    @user-vr1so7tc7x 3 years ago

    Great explanation! I am in the process of figuring out cross-entropy, and your video helped me with the concept of Jensen's inequality! Keep it up!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you! Happy to help

  • @LittleBigVlad25
    @LittleBigVlad25 1 year ago

    Great visualisation, really good job! Thank you very much!

  • @BGWee
    @BGWee 2 years ago

    As a tired econometrics student with a dull lecturer, this helped a bunch, thanks

  • @tferrerd
    @tferrerd 3 years ago

    Very interesting. Keep ‘em coming DJ!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    First comment :) will do!

  • @brankojangelovski3105
    @brankojangelovski3105 2 years ago

    nice quality and explanation, really helped me out

  • @anandseth1772
    @anandseth1772 4 months ago

    Nicely and Intuitively explained! Thanks

  • @user-fx1hs8qr6h
    @user-fx1hs8qr6h 7 months ago

    Great visualization, thank you very much for your effort to break it down so well!!! :)

  • @EverlastingsSlave
    @EverlastingsSlave 2 years ago

    Thanks for such great visuals

  • @manoeanna
    @manoeanna 2 years ago

    Great video! Thanks for sharing your knowledge!

  • @manueljenkin95
    @manueljenkin95 2 years ago

    Thank you very much for this wonderful presentation. A lot of effort must have gone into getting an animation that feels intuitive.

  • @mCoding
    @mCoding 3 years ago

    Great intuition, great visualizations! Mathematicians will also say that you need to assume that X is integrable in order for Jensen's inequality to hold. Jensen also has far reaching consequences in theoretical probability, and even analysis in general. Can't wait for more!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    This means a lot getting your comment here. Much appreciated! And yes! There are unfortunately rigor qualifications that I omit to keep the vid light. In the case of integrability, I hadn’t thought of that, so thanks for pointing it out :)

  • @shubhamjoshi449
    @shubhamjoshi449 9 months ago

    Great Video ...Thanks for the efforts you put in these Videos ..🙂

  • @Soedmaelk
    @Soedmaelk 2 years ago

    This was an awesome explanation. Thank you! Out of curiosity, how did you make the animation? By the way, that was also really well made!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Hey, thanks! To answer your question, I stick together a bunch of graphs made in Altair using a personal library. Altair is a very nice plotting library.

  • @asdfasdfuhf

    @asdfasdfuhf

    1 year ago

    Looks like he is using manim made by 3blue1brown

  • @vasanthakumarg4538
    @vasanthakumarg4538 2 months ago

    Very clear explanation. Keep up the good work

  • @xxxxxfirefoxxxxx
    @xxxxxfirefoxxxxx 2 years ago

    You basically took an esoteric formula and explained it in a stupid-people friendly way. Thank you

  • @jacoboribilik3253

    @jacoboribilik3253

    10 months ago

    There's no need to put yourself down in such a way. You can be thankful for the content this guy is putting out on YT by liking, subscribing and hitting the share button... stop dragging yourself over the coals.

  • @hardy8488
    @hardy8488 3 years ago

    Great video, hopefully there will be follow-up videos on how Jensen's Inequality becomes an important part of EM, KL divergence and so on.

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Yes! The EM algorithm will be covered, but later this year. If you're curious right now, I linked to some sources in the description where Jensen's Inequality is used. In Cover's book, there is a section "Jensen's Inequality and Its Consequences" which shows how foundational it is for Information Theory.
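One such consequence, sketched here for discrete distributions p and q (the notation is assumed for illustration, with sums taken over outcomes where p(x) > 0), is that the KL divergence is never negative. Since log is concave, Jensen's inequality gives:

```latex
-D_{\mathrm{KL}}(p \,\|\, q)
  = \sum_{x} p(x) \log \frac{q(x)}{p(x)}
  \le \log \sum_{x} p(x)\, \frac{q(x)}{p(x)}
  = \log \sum_{x} q(x)
  \le \log 1 = 0
```

So D_KL(p || q) >= 0, which is the kind of step that also appears in justifications of the EM algorithm.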

  • @NoNTr1v1aL
    @NoNTr1v1aL 2 years ago

    Amazing video! Subscribed.

  • @MrEmilosen
    @MrEmilosen 2 years ago

    Perfect presentation, thank you!!

  • @emanuelhuber4312
    @emanuelhuber4312 2 years ago

    Awesome video! Really easy to follow

  • @jmbrjmbr2397
    @jmbrjmbr2397 2 months ago

    Your channel looks great, thanks!

  • @jarvis-yu
    @jarvis-yu 3 months ago

    Nice animation making things a lot more intuitive, thanks.

  • @partyhorse420
    @partyhorse420 1 year ago

    Amazing explanation!

  • @rafaelladeira6049
    @rafaelladeira6049 2 years ago

    Outstanding explanation!

  • @forughghadamyari8281
    @forughghadamyari8281 3 months ago

    Thanks...you've explained it clearly

  • @chiragvashist8415
    @chiragvashist8415 1 year ago

    You are awesome. I have been binge watching your videos.🖖

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Excellent, thank you!

  • @yennefer415
    @yennefer415 2 years ago

    Huh.. it clicked in like 3 seconds after seeing the comparison with line. And why it's true is so obvious now. Amazing. Thanks.

  • @KapilSachdeva
    @KapilSachdeva 2 years ago

    Brilliant explanation!

  • @descent21iri89
    @descent21iri89 2 years ago

    incredibly clear and helpful! thx a lot

  • @sastryanjaneya5863
    @sastryanjaneya5863 7 months ago

    very clear explanation with high energy .. I like it

  • @Mutual_Information

    @Mutual_Information

    7 months ago

    lol old video with lots of energy.. I've chilled out a bit since, but thank you

  • @santroma1
    @santroma1 2 years ago

    Great explanation!

  • @olegmonogarov7219
    @olegmonogarov7219 2 years ago

    Excellent explanation indeed!

  • @kadenhesse9777
    @kadenhesse9777 3 years ago

    This was awesome! Could you make a video about where this is applied? You talked about how it affects ML, but could you show an example? Thank you! Btw, the algorithm showed me this video, so hopefully you're on the rise! Honored to be this early

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Yea I’ll do a video on the EM algorithm, where this shows up. Also, variational inference, eventually. And I’m happy to hear that! I hope you’re right but we’ll see.

  • @AKASHSOVIS
    @AKASHSOVIS 2 years ago

    You deserve more likes!

  • @alexanderk5835
    @alexanderk5835 2 years ago

    Very good explanation without making it complicated, thanks a lot!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Of course, man - glad you liked it

  • @thorgexyz
    @thorgexyz 3 years ago

    Thanks. Very interesting. I listened to a talk by Nassim Taleb where he talked about Jensen's Inequality.

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Yea it shows up a lot. Glad you enjoyed it

  • @Glassful
    @Glassful 3 years ago

    You are awesome...this explanation is so cool.

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Glad you like it!

  • @davidjohnston4240
    @davidjohnston4240 3 years ago

    This is relevant to extractor theory used in cryptography (usually as min-entropy, not the Shannon entropy you assume here) - how does your function change the entropy per bit of the input data?

  • @damiangames1204
    @damiangames1204 2 years ago

    Nice visualization!

  • @simonhradetzky7055
    @simonhradetzky7055 1 year ago

    GREAT VISUALISATION TY

  • @salgsalgglas
    @salgsalgglas 1 year ago

    This is so good. Thank you.😊

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    And thank you too Satish!

  • @connorshorten6311
    @connorshorten6311 3 years ago

    Awesome video!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you! More coming - one every 3 weeks.

  • @za012345678998765432
    @za012345678998765432 2 years ago

    Just found your channel through your comment on 3b1b's video, very nice explanation. Btw, is the opposite inequality true for concave functions?

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Glad you're here! His channel is a huge inspiration. And yep, for concave functions you get the opposite. The negative of a convex function is a concave function, and that reverses the inequality.

  • @Throwingness
    @Throwingness 2 years ago

    Liked and commented. Thank you and more please!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    I got something good cookin :)

  • @Kopakabana001
    @Kopakabana001 3 years ago

    Love the videos!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thanks!

  • @r.hazeleger7193
    @r.hazeleger7193 15 days ago

    Great vid bruv

  • @danialdunson
    @danialdunson 2 years ago

    great channel

  • @veri_bilimi
    @veri_bilimi 1 year ago

    Amazing! Thank you very much!

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Thank you very much!

  • @pedramhaqiqi7030
    @pedramhaqiqi7030 7 months ago

    Most OP explanation of all time. My intuition so far had come from showing it with the definition of convexity. This was awesome; relating it to N samples was the key. I'm learning Online Convex Optimization, any tips? :p The prof is planning a midterm with a 50% average.

  • @selinacarter8849
    @selinacarter8849 2 years ago

    What software did you use for this cool moving graph thingy??

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thanks for the love! Answered you on Twitter :)

  • @kiarashgeraili8595
    @kiarashgeraili8595 2 years ago

    Very Very nice!

  • @randomvideos3628
    @randomvideos3628 2 years ago

    Gem of a video...

  • @angelinag5076
    @angelinag5076 2 years ago

    Thanks !

  • @simpleworld542
    @simpleworld542 2 years ago

    Thank You boss

  • @line8748
    @line8748 10 months ago

    If you have a concave function, does the inequality sign just get flipped? Thank you very much for the content!

  • @Mutual_Information

    @Mutual_Information

    10 months ago

    Yes it does!

  • @ramirolopezvazquez4636
    @ramirolopezvazquez4636 4 months ago

    Awesome! So ... given a random variable X I can use Jensen's inequality to estimate the local curvature of a function 'f' ?

  • @egoreremeev9969
    @egoreremeev9969 3 years ago

    So what would happen if the space of points below(!) the function is convex? Will we get the different inequality?

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Yep, then that would be a concave function and the inequality would be reversed. A pretty common example of that is the log(x) function.
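The reversed inequality for log(x) can be checked numerically. The sketch below uses a small made-up discrete distribution (the values and probabilities are assumptions chosen for illustration, not taken from the video) and compares E[log X] with log E[X]:

```python
import math


def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(p * v for v, p in zip(values, probs))


# Hypothetical distribution over positive values (illustrative assumption).
values = [1.0, 2.0, 4.0, 8.0]
probs = [0.4, 0.3, 0.2, 0.1]

mean_of_log = expectation([math.log(v) for v in values], probs)  # E[log X]
log_of_mean = math.log(expectation(values, probs))               # log E[X]

# For the concave log function, Jensen's inequality is reversed:
# E[log X] <= log E[X].
print(mean_of_log, log_of_mean, mean_of_log <= log_of_mean)
```

Any other distribution over positive values should show the same ordering, with equality only when X is constant.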

  • @get_youtube_channel_name
    @get_youtube_channel_name 2 years ago

    great vid! but it would be even better if you can show the formal proof and connect it with the visualization u showed

  • @gjcamacho
    @gjcamacho 10 months ago

    Is it possible you could make a video on Variational Inference and the intuition on the loss function?

  • @Breeezn
    @Breeezn 4 months ago

    Cool inequality! If we take f(x) = x² then this inequality tells us that for any real a, b: a² + 2ab + b²

  • @omololaomotalade8105
    @omololaomotalade8105 1 year ago

    Thank you for this..

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    For sure - If you'd like, you can do me a solid and tell anyone into ML/stats about the channel :)

  • @sadiyaahmad6680
    @sadiyaahmad6680 1 year ago

    Can you upload some content about expectation maximization with a mixed Poisson?

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Hm, sorry but that's unlikely. It's just too specific. The topics I've picked are already fairly niche. If I go into a very specific subtopic, it'll appeal to very few folks (unless there is something particularly fascinating about it)

  • @alphamikeomega5728
    @alphamikeomega5728 3 years ago

    3:19 in and I get it. Thanks!

  • @prakhyatshankesi3749
    @prakhyatshankesi3749 1 year ago

    Subscribed

  • @inordirection_
    @inordirection_ 3 years ago

    I never understood why this inequality was true or what it really meant the first time I saw it, and my engineering prof said don't worry about the intuition just know how to use it (bleh), but KZread somehow knew to suggest this to me months later! Thanks for the clear explanation

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Glad you enjoyed it!

  • @omridrori3286
    @omridrori3286 2 years ago

    My friend, you are amazing. I really feel bad about how much time I wasted trying to understand this when you explain it so well in 5 minutes. Can you also make a video on VAEs, variational inference, the ELBO and all that? Please, it is a topic that is hard for a lot of people, and it looks like it is exactly in your domain.

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thanks! And I do have plans to make a video on variational inference but it may take awhile. There are a few videos in front of it. But it’s coming!

  • @WilliamDye-willdye
    @WilliamDye-willdye 3 years ago

    Heh. 4:55 "what mathematicians will ask, and engineers probably won't, is 'why?'." That definitely matches my own experience.

  • @lenishpandey192
    @lenishpandey192 2 years ago

    Can't thank you enough!

  • @akhilezai
    @akhilezai 3 years ago

    Holyshit, this is a great video!

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Thank you - it means a lot !

  • @nikoskonstantinou3681
    @nikoskonstantinou3681 3 years ago

    Hey, nice video. I thought we would see the f((a+b)/2)

  • @Mutual_Information

    @Mutual_Information

    3 years ago

    Glad you enjoyed it!

  • @cnidariantide4207
    @cnidariantide4207 2 years ago

    I wish I had friends like you! What's on the bookshelf? =) The intuition follows immediately from grokking the idea of convex functions! Some of the coolest tricks in mathematics come from manipulation of inequalities. There's a brilliant little maths book named "The Cauchy-Schwarz Master Class" which I recommend for anyone wanting to master these dark arcana. Sometimes, like in this case, an animation is just unbeatable, however!

  • @Mutual_Information

    @Mutual_Information

    2 years ago

    Thank you! I’ve heard that book recommended a few times but have never checked it out. I’ll order it!

  • @carlaparla2717
    @carlaparla2717 1 year ago

    I don't like the function chosen for the visualization because it is increasing everywhere. From this video I'm not convinced that this reasoning would be valid for, say, a parabola segment.

  • @rocamonde
    @rocamonde 2 years ago

    The visual explanation is excellent. However, it might seem that it does not hold if one picks a convex function that is below the straight line. For completeness, it is worth remarking that the equality holds for any straight line. Because of this, for all convex functions f(x) there is always a straight line ax+b for which f(x)>=ax+b for all x. (Namely, the line you pick has to be the tangent of the convex function at E[X]).
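The tangent-line argument described in this comment can be written out as a short derivation. This is a sketch under the usual assumptions (f convex, \mu = E[X] finite, and c the slope of a supporting line at \mu, which equals f'(\mu) when f is differentiable there); the symbols \mu and c are introduced here for illustration:

```latex
f(x) \ge f(\mu) + c\,(x - \mu) \quad \text{for all } x

\mathbb{E}\big[f(X)\big]
  \ge f(\mu) + c\,\big(\mathbb{E}[X] - \mu\big)
  = f(\mu)
  = f\big(\mathbb{E}[X]\big)
```

The second line takes expectations of both sides of the first and uses the linearity of expectation, which is why the argument does not depend on f being increasing.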

  • @erikysilvagomes5496

    @erikysilvagomes5496

    1 year ago

    My question was not answered by the author, so maybe you can help me. My doubt is precisely about the equality holding for any straight line: why is it true? I can understand that it holds for linear functions of the type f(x) = c.x, in the sense of preserving addition and multiplication, hence f(E(x)) = E(f(x)). Functions of the type f(x) = ax+b are straight lines, but not linear functions in this sense, so we cannot prove that f(E(x))=E(f(x)). Try to use f(x) ~ exp(k) and the transformation Y = ax + b, you'll see that f(E(x))=a/k+b and E(f(x))=exp(kb/a)*a/k, which do not hold for any straight line, but just for b=0.

  • @rocamonde

    @rocamonde

    1 year ago

    I’m not sure what you’re saying. The equality is satisfied when f is an affine transformation. This is trivial because E is a linear operator: E(f(X)) = E(aX+b) = a E(X)+b=f(E(X)) where the first step applies the definition that f is affine (a straight line), the second, that E is linear, and the third using again the definition of f being affine. In your example you’re mixing up letting f be an exponential or letting it be an affine transformation, so you get something weird. You can’t apply both transformations at the same time, that is not something that Jensen inequality talks about in any way. If the function is convex, you get an inequality. If the function is strictly convex (unless X is constant a.s.) you get strict inequality. And in the opposite edge case that the function is affine (which is also convex), you get strict equality.

  • @erikysilvagomes5496

    @erikysilvagomes5496

    1 year ago

    @@rocamonde Of course you are correct. In my example I made a mistake calculating E[f(X)], and for that reason the equality did not hold. My approach was to see the theorem from the properties of affine functions, but in fact it is much simpler to see it from the linearity of expectation. Thank you, your explanation was clear!
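The affine case discussed in this thread can also be checked numerically. The following is a minimal sketch with a made-up discrete distribution and a line with non-zero intercept (all numbers are assumptions chosen for illustration); it compares f(E[X]) with E[f(X)] for f(x) = ax + b:

```python
def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(p * v for v, p in zip(values, probs))


# Hypothetical distribution and line parameters (illustrative assumptions).
values = [0.0, 1.0, 3.0]
probs = [0.5, 0.25, 0.25]
a, b = 2.0, 5.0  # b is deliberately non-zero


def f(x):
    """Affine function: a straight line with slope a and intercept b."""
    return a * x + b


f_of_mean = f(expectation(values, probs))                # f(E[X])
mean_of_f = expectation([f(v) for v in values], probs)   # E[f(X)]

# By linearity of expectation the two sides agree, even with b != 0.
print(f_of_mean, mean_of_f)
```

Equality here follows from E[aX + b] = aE[X] + b, so it holds for any slope and intercept, not only for b = 0.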

  • @rangjungyeshe
    @rangjungyeshe 1 year ago

    Thanks!

  • @Mutual_Information

    @Mutual_Information

    1 year ago

    Thank you! 3rd donation ever :)