Robust Principal Component Analysis (RPCA)

Science & Technology

Robust statistics is essential for handling data with corruption or missing entries. This robust variant of principal component analysis (PCA) is now a workhorse algorithm in several fields, including fluid mechanics, the Netflix prize, and image processing.
Book Website: databookuw.com
Book PDF: databookuw.com/databook.pdf
These lectures follow Chapter 3 from:
"Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz
Amazon: www.amazon.com/Data-Driven-Sc...
Brunton Website: eigensteve.com
This video was produced at the University of Washington

Comments: 104

  • @kyrilo1993
    3 years ago

    I like how he thanks us at the end of every video when WE should be the ones thanking him.

  • @reocam8918

    3 years ago

    That's how a master differs from ordinary teachers. He treats teaching more as a performance 😉

  • @QuantizedFields
    3 years ago

    Finally people can now distinguish Clark Kent from Superman! I thought it is never gonna happen

  • @xXxdomygxXx

    3 months ago

    Imagine doing a lot of math to reveal Superman's face while he just uses his X-Ray view to undress you at O(1) computational cost

  • @debu478
    3 years ago

    Need a detailed lecture series on RPCA. You are a gem, sir. Thank you for such an amazing explanation.

  • @adamsoffer5040
    2 years ago

    i love your explanations, they are so eloquent and fluent! thank you!

  • @mrboyban
    3 years ago

    Fluid mechanics is certainly a very interesting topic! Many thanks for sharing it.

  • @dbracale
    2 years ago

    You are amazing! Your explanations are impeccable! Thank you!

  • @aanchaldogra9802
    3 years ago

    Huge fan Mr Steve.

  • @williamgomez6087
    3 years ago

    The world needs more people like you

  • @andrewgibson7797
    3 years ago

    I like how you can view this as reconstructing the missing data on one hand, or filtering out the outliers on the other. Naively, those seem like conceptually very different tasks to me, but I guess they're really not.

  • @lena191
    3 years ago

    YAY!! I was waiting for the video on RPCA

  • @maydin34
    3 years ago

    Very informative. Great video.

  • @autumnreed2079
    1 year ago

    Thank you so much for this video. It was very eye opening for getting into ML

  • @BangNguyen-fe6fl
    3 years ago

    Great video, Sir!

  • @abc3631
    3 years ago

    Awesome as usual

  • @user-fi4ob8kv7f
    2 years ago

    I really appreciate your help!

  • @ChrisMcAce
    2 years ago

    Thanks for these videos! Nitpick: The error lines at 3:50 should be vertical as you're talking about regression. (For PCA they would be perpendicular as shown.)

  • @Eigensteve

    2 years ago

    Good catch! Agreed, should be vertical for standard SVD. Updated in my most recent slides :)

  • @amnn8507
    2 years ago

    Thank you for the great video. I am very interested in the Netflix example (sounds like a missing value imputation problem) but couldn't find any resources/papers explaining it. I am mostly interested in using RPCA for missing value imputation in time-series. Could you please share some materials on that subject?

  • @jhonportella5618
    3 years ago

    Nice video. It is amazing how RPCA introduces robustness in the face of huge differences. I have a question regarding your choice of mu. In your code you are choosing mu as mu = n1*n2/(4*sum(abs(X(:)))); where does this expression come from?
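
For readers following this question, here is a minimal Python sketch of the kind of alternating scheme the quoted line belongs to, assuming the usual augmented Lagrangian form: singular value thresholding for L, entrywise shrinkage for S, then a multiplier update. The expression for mu is the one quoted in the comment; it appears to be the heuristic suggested in the Candès, Li, Ma, and Wright RPCA paper. The names `shrink`, `svt`, and `rpca`, the weight lambda = 1/sqrt(max(n1, n2)), the tolerance, and the test matrix are assumptions for illustration, not a transcription of the book's code.

```python
import numpy as np


def shrink(X, tau):
    """Entrywise soft threshold: move every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svt(X, tau):
    """Singular value thresholding: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca(X, max_iter=1000, rel_tol=1e-7):
    """Split X into a low-rank part L and a sparse part S (sketch only)."""
    X = np.asarray(X, dtype=float)
    n1, n2 = X.shape
    mu = n1 * n2 / (4.0 * np.sum(np.abs(X)))  # expression quoted in the comment
    lam = 1.0 / np.sqrt(max(n1, n2))          # assumed sparsity weight
    thresh = rel_tol * np.linalg.norm(X, "fro")
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    Y = np.zeros_like(X)                      # Lagrange multiplier estimate
    for _ in range(max_iter):                 # cap so the loop always ends
        L = svt(X - S + Y / mu, 1.0 / mu)
        S = shrink(X - L + Y / mu, lam / mu)
        Y = Y + mu * (X - L - S)
        if np.linalg.norm(X - L - S, "fro") <= thresh:
            break
    return L, S


# Hypothetical test matrix: rank 2 plus two gross outliers.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
X = low_rank.copy()
X[3, 4] += 10.0
X[17, 9] -= 10.0

L, S = rpca(X)
print(np.linalg.norm(X - L - S) / np.linalg.norm(X))  # relative residual
```

In this sketch mu only sets the step size and the two thresholds (1/mu for the singular values, lambda/mu for the entries), so scaling it by the average magnitude of X keeps those thresholds comparable to the data. The iteration cap is a safeguard for cases where the tolerance is never reached.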

  • @mohammadfateh2023
    5 months ago

    Thanks a lot for sharing your knowledge.

  • @Eigensteve

    5 months ago

    My pleasure

  • @nikhileshnatraj331
    3 years ago

    Great content. But aren’t there more effective solutions? There is a whole field called robust statistics; one can for example estimate the covariance matrix using Maronna’s M estimators prior to computing the eigenvectors and eigenvalues

  • @zacmac
    3 years ago

    Hi, I stumbled upon this video randomly but ended up watching all of it! I like the way you explain this complex topic in an easy-to-understand way without bombarding us with maths. I've always wondered how Netflix and KZread are so good at recommendations, and now I know: they find a solution to an ill-posed inverse problem by minimizing rank(L) + abs(S) in a convex minimization regime. I have a question: what is the origin of the 'low rank' terminology?

  • @Eigensteve

    3 years ago

    Glad you liked it!

  • @ashiktm4188
    3 years ago

    Thanks for the video

  • @user-or7ji5hv8y
    3 years ago

    Wow great video

  • @tdoge
    3 years ago

    Danke great video!

  • @anilcelik16
    2 years ago

    Thanks. Then is there any reason to use regular PCA at all?

  • @raaedalmayali3685
    3 years ago

    Hello Mr. Steve, please, what are the features that RPCA extracts from an image?

  • @fengliu7904
    3 years ago

    From a math perspective, why can the low-rank decomposition handle the outlier shown at 4:40?

  • @vicktorioalhakim3666
    3 years ago

    Is the L1 norm PCA considered RPCA? In essence, is RPCA a subclass of robust optimization?

  • @studybooks3395
    3 years ago

    I studied PCA last week. And now this. 😆

  • @fatihakkentli8034
    3 years ago

    thanks

  • @sriphanikrishnakarri9150
    3 years ago

    Great video as always, but every time I want to know your recording setup and software

  • @JoelRosenfeld

    3 years ago

    You can glean a lot from the video itself. He is behind a glass window of some kind and he has a lapel mic to capture sound. He records everything and then flips the video in post. You can tell that because his part is opposite in his whiteboard videos early on. To ensure legibility he wears a black sweater and has a black background, which is clever. In his book he uses Python and MATLAB. I’m guessing everything is assembled in Adobe Premiere or Final Cut Pro. Though, there are lots of free options out there.

  • @raaedalmayali3685
    3 years ago

    Hello Mr. Brunton, please, in your book "Data Driven Science & Engineering", on page 124, in the RPCA code, in the "while" statement, why do you use "count < 1000"? What do you mean by 1000?

  • @gabrielshultz5872
    2 years ago

    How do you create "allFaces.mat" from the Yale database so I can follow along in the book? I got the database, but am not sure how to easily import it to MATLAB.

  • @IceTurf
    3 years ago

    How do I go about improving my mathematical know-how? I can do mathematical operations, but I struggle with high level stuff to intuitively understand it sometimes.

  • @avatar098

    3 years ago

    When you do your reading, look up anything you don't know how to do, or any terms that may feel unfamiliar. It's easier to do this in a university course setting, but it's also possible to do when self studying. Don't give up! The more you read and study, the more common certain themes come up. If you don't understand anything, treat that concept as a black box and try to understand at a high level what we are trying to achieve. Then slowly work through the black box until you get to a level that you feel satisfied with :)

  • @somebody198
    2 years ago

    Do I understand correctly that this method does not help reduce the data dimension?

  • @iskhezia
    1 month ago

    I can't download or open the PDF book. Is anyone else having the same problem?

  • @hannahvo
    1 year ago

    Why does a low-rank matrix represent normal data?

  • @chymoney1
    3 years ago

    Have you done any topological data analysis? It’s very intriguing

  • @satyamprakash7030

    3 years ago

    Hey, this may sound creepy, but I looked at the channels you have subscribed to in order to find channels I might like. I am basically interested in all kinds of fields that require advanced math in one way or another. If you have some time, would you answer some of my questions regarding channels you might recommend? Thanks.

  • @zae5pm
    3 years ago

    I'm doing POD, which is based on PCA. Is there a constrained PCA?

  • @Eigensteve

    3 years ago

    Sure, in lots of these least-squares regression algorithms it is possible to add constraints. I think of POD and PCA as being essentially the same algorithmically.

  • @sidali126
    1 year ago

    Is there any available implementation in Python? Kind regards.

  • @mohamedmeskini1650
    3 years ago

    ||A||_* + λ||E||_1 is convex; can you explain that?

  • @somebody198
    2 years ago

    How exactly is this algorithm trained? I mean, nowhere in the given calculations it was required to have several observations, one matrix was enough. Why can't we just take a picture and extract the right components from it?

  • @jasonwhite6463
    3 years ago

    Is a 2011 pub, recent? Appreciate video but couldn't help but ask.

  • @Eigensteve

    3 years ago

    Depends on the field. In applied math, definitely. In computer graphics or deep learning, maybe not. Although, the seminal works from both fields are still important.

  • @alessandrobitetto2361
    3 years ago

    By 0-norm do you mean the number of non-zero entries? Thanks

  • @skeletonrowdie1768

    3 years ago

    Yes. When he talks about ||S||, which should be a *sparse* matrix. The non zero entries act as a loss for the algorithm.

  • @Eigensteve

    3 years ago

    @@skeletonrowdie1768 Yes indeed
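
To make the exchange above concrete, here is a small illustration on a made-up matrix (the values are arbitrary and chosen only for this example). The "0-norm" counts the non-zero entries, while the entrywise 1-norm sums their absolute values; the 1-norm is the convex stand-in commonly used because counting non-zeros directly is hard to optimize.

```python
import numpy as np

# Hypothetical sparse matrix, values chosen only for illustration.
S = np.array([[0.0, 3.0, 0.0],
              [0.0, 0.0, -2.0]])

l0 = np.count_nonzero(S)  # "0-norm": how many entries are non-zero
l1 = np.abs(S).sum()      # entrywise 1-norm: sum of absolute values

print(l0, l1)  # 2 5.0
```

Making an outlier larger changes the 1-norm but not the count, which is one way to see that the two penalties are related but not identical.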

  • @scienteer3562
    3 years ago

    Could this be used to solve sudokus? Teach it with lots of completed puzzles and the uncompleted puzzle is just a sparse sampling.

  • @erkintek

    3 years ago

    there are more robust ways to solve sudoku :d

  • @Eigensteve

    3 years ago

    Cool idea!

  • @JoelRosenfeld
    3 years ago

    Nice video. I like the Netflix and POD examples, and I’ll give it a go in my own DMD work. I think the first example could be better motivated by discussing the difficulties that FaceID is having with identifying masked faces and unlocking phones in this pandemic. There have been suggestions that Apple will bring back Touch ID with the iPhone 13 because of this. Not just cops and robbers but issues with everyday people and technology in their pocket. One question, would you get similar results regularizing by the TV norm? Just a hot take. But I feel you should get similar results.

  • @JoelRosenfeld

    3 years ago

    Maybe I’m getting my wires crossed here. Is the TV norm the same as l1? Coming in to this field from pure operator theory has me consulting glossaries more often than not

  • @Eigensteve

    3 years ago

    Good point. I actually filmed this before the pandemic ;) who knew that partially masked faces were going to be such a thing!

  • @Eigensteve

    3 years ago

    @@JoelRosenfeld TV norm is a bit different. But it does have some similarities in how it tries to regularize problems by reducing overfitting. TV regularization made the most sense to me in the context of differentiation (great paper by Chartrand: www.hindawi.com/journals/isrn/2011/164564/)

  • @JoelRosenfeld

    3 years ago

    @@Eigensteve I had wondered if that was the case. :) How far ahead do you record these? That’s quite the lead time!

  • @Eigensteve

    3 years ago

    @@JoelRosenfeld Totally depends... some videos are recorded a while in advance and I just sit on them, and others come out a bit faster... looks like now I am ahead about 2-3 months on most videos, but I have at least one from a year ago. :)

  • @zhichaozhao172
    3 years ago

    can i ask what is the brand of black T-shirt? I am searching for a good quality T-shirt and stick with it

  • @userou-ig1ze
    3 years ago

    A paper from 10 years ago is recent? Thanks for this very illuminating series

  • @JoelRosenfeld

    3 years ago

    Yes, a paper from 10 years ago is fairly recent. It takes time for algorithms and methods to be adopted by the greater community.

  • @Eigensteve

    3 years ago

    Yep, depends on the field. In applied mathematics and statistics, a decade isn't that long. In computer vision and deep learning, a decade feels like a longer time

  • @three-min-to-go
    3 years ago

    Hi Professor you are so handsome that I really enjoy your video like a TV drama!

  • @appa609
    3 years ago

    I don't think the L1 regression is actually uniquely defined... you can shift it up and down and as long as the line doesn't cross any data points the norm doesn't increase.

  • @vicktorioalhakim3666

    3 years ago

    Indeed, L1-norm minimization is not unique, as shown in Boyd's book "Convex Optimization".

  • @JoelRosenfeld

    3 years ago

    @@vicktorioalhakim3666 Right, I would agree with that, but here aren't we just using L1 regularization of a least squares problem? Or am I missing something?

  • @vicktorioalhakim3666

    3 years ago

    ​@@JoelRosenfeld Not sure why you're talking about L1 regularization, as the original poster is talking about L1 *regression*, however L1 regularization is just adding a L1-norm term to the original objective function -> not unique on the Pareto curve.

  • @vicktorioalhakim3666

    3 years ago

    @@JoelRosenfeld BTW, great videos!

  • @JoelRosenfeld

    3 years ago

    Thanks! I’m enjoying putting my videos together. :) I mention regularization because I thought that’s how these sparse regression approaches worked. Maybe it’s late and I’m just not connecting the dots right now.
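
The non-uniqueness point raised in this thread can be checked numerically. The sketch below uses a made-up toy data set (two points at each of two x locations) and compares horizontal lines y = c; the helper names `l1_cost` and `l2_cost` are hypothetical. Every c between 0 and 1 gives the same L1 cost, whereas the squared-error cost singles out c = 0.5.

```python
import numpy as np

# Made-up data: two points at x = 0 and two at x = 1.
x = np.array([0.0, 0.0, 1.0, 1.0])
y = np.array([0.0, 1.0, 0.0, 1.0])


def l1_cost(a, b):
    """Sum of absolute vertical residuals for the line y = a + b*x."""
    return np.abs(y - (a + b * x)).sum()


def l2_cost(a, b):
    """Sum of squared vertical residuals for the same line."""
    return ((y - (a + b * x)) ** 2).sum()


# Shift a horizontal line up and down between the two rows of points.
for c in (0.2, 0.5, 0.8):
    print(c, l1_cost(c, 0.0), l2_cost(c, 0.0))
```

The L1 cost stays flat under a vertical shift only while equal numbers of points lie above and below the line, which is why this example uses an even number of points.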

  • @bhargav7476
    3 years ago

    What even is that? Calculus? Statistics? Geometry? What do I google If I wanna learn that maths?

  • @chaser27

    3 years ago

    Yes

  • @Eigensteve

    3 years ago

    Probably linear algebra first, statistics second, and for high-dimensional data, it is related to geometry. And all of the algorithms involve optimization.

  • @bhargav7476

    3 years ago

    @@Eigensteve Thank you, will start with Linear Algebra.

  • @Turcian
    3 years ago

    Haha, video compression failed due to the salt and pepper noise after 15:40. Not very robust.

  • @Eigensteve

    3 years ago

    That is so cool! Nice meta observation!

  • @saeedsaimonable
    3 years ago

    Could you talk about architecture, interactive/creative robots, and AI?

  • @stevelk1329
    3 years ago

    "very cool, a little bit alarming, but I'm going to walk you through it." Wait, doesn't that mean he might be admitting he's irresponsible?? Good grief. What does he expect people to think?.. "well he's telling us how to do potentially really bad stuff but that's okay cuz he's also telling us it might be bad."

  • @Eigensteve

    3 years ago

    Lots of powerful technologies have good and bad uses. And the cat's out of the bag with this one... really this is standard linear algebra. But I thought it was important to point out that we should at least be aware of the implications.

  • @tantzer6113
    7 months ago

    Someone does not like The Big Lebowski?

  • @darkmath100
    3 years ago

    1:10 The math behind this is intriguing but if the goal is to build more robust surveillance technology is it really worth it? Sure the money's probably good but is helping build out an Orwellian police state morally sound?

  • @Eigensteve

    3 years ago

    There are good and bad applications of most powerful technologies and algorithms. This is by no means the only application of robust statistics, but is one of the easiest to understand and relate to.

  • @darkmath100

    3 years ago

    @@Eigensteve All new technology is revolutionary but when humanity discovered nuclear fission it took a long time to rein it in. I suspect AI is in the same predicament: www.vice.com/en/article/y3gjjw/the-nypd-sent-a-creepy-robotic-dog-into-a-bronx-apartment-building

  • @WhenThoughtsConnect
    3 years ago

    its like a ship but one person is absurdly fat

  • @Zxymr
    3 years ago

    Brilliant! Would this work with kernel PCA as well?

  • @Eigensteve

    3 years ago

    Good question... I found this interesting NeurIPS paper on this topic: papers.nips.cc/paper/2008/file/8f53295a73878494e9bc8dd6c3c7104f-Paper.pdf
