Lecture 4 - Perceptron & Generalized Linear Model | Stanford CS229: Machine Learning (Autumn 2018)

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: stanford.io/ai
Anand Avati
PhD Candidate and CS229 Head TA
To follow along with the course schedule and syllabus, visit:
cs229.stanford.edu/syllabus-au...

Comments: 83

  • @badassopenpolling
    @badassopenpolling · a year ago

    In my opinion, real examples were missing from this lecture. Examples help clarify the ideas.

  • @PingCheng-wf2pb
    @PingCheng-wf2pb · 4 months ago

    I think this lecture is well organized and easy to follow, which helped me immensely. I started by reading the notes, but I still didn't understand GLMs. After checking the comments here, I decided to watch Ng's video (2008 version); unfortunately, it was pretty much the same as the notes. Then I came back to this video and found it amazing! For instance, the conclusion about the "learning update rule" is handy but missing from the notes. Also, the explanation of the "assumptions/design choices" is clearer than the notes, which gave me a more concrete feel. The examples around the 59-minute mark are also incredibly great. I hope you enjoy this video and don't get swayed by the negative comments.

  • @GaneshPvt-xg4cc
    @GaneshPvt-xg4cc · 12 days ago

    lecture be so good that Optimus Prime gets curious and asks questions

  • @samurai_coach
    @samurai_coach · a year ago

    51:40 Memo: in conclusion, just pick the pdf that matches the type of model (Gaussian, Bernoulli, and so on) and plug in the value of h_theta(x) into the function to train.

  • @samurai_coach
    @samurai_coach · a year ago

    59:30 A good question that sums up what he's been explaining in the lecture.

  • @wingedmechanism
    @wingedmechanism · 4 months ago

    For anyone who needs it, the updated link for the notes is here: cs229.stanford.edu/lectures-spring2022/main_notes.pdf

  • @Dimi231
    @Dimi231 · 4 months ago

    Have you seen the PS1 homework they have to do anywhere? The programming and math that they upload on scope?

  • @vedikagupta407
    @vedikagupta407 · a month ago

    Thank you very much :)

  • @amplified8706
    @amplified8706 · 15 days ago

    Thank you very much. I was following Anand's lectures for ML 2019 and I felt lost sometimes and needed the structured notes to revise after the lecture. Once again, thanks a lot!

  • @dimensionentangled4514
    @dimensionentangled4514 · 2 years ago

    We begin by learning about perceptrons. This is motivated by the previous discussions on logistic regression, where we use the sigmoid function. In the case of perceptrons, we use a modified function in place of the sigmoid. Next, we look at what exponential families are, along with some related examples; this is more of a statistics topic. Then we learn about GLMs (Generalised Linear Models), from which the appearance of the sigmoid function in logistic regression becomes apparent. Finally, we study softmax regression via the use of cross-entropy (defined therein).
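The perceptron modification summarized above swaps the sigmoid for a hard threshold while keeping the same update rule, theta := theta + alpha * (y - h_theta(x)) * x. A minimal sketch in Python; the toy data, learning rate, and epoch count are made up for illustration:

```python
def predict(theta, x):
    # Perceptron hypothesis: hard threshold g(z) = 1{z >= 0} in place of the sigmoid
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

def train(data, alpha=0.1, epochs=50):
    # Same update form as logistic regression: theta := theta + alpha * (y - h(x)) * x
    theta = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            err = y - predict(theta, x)
            theta = [t + alpha * err * xi for t, xi in zip(theta, x)]
    return theta

# Toy linearly separable data: x = [1 (intercept), x1, x2], label y
data = [([1.0, 2.0, 2.0], 1), ([1.0, 1.5, 2.5], 1),
        ([1.0, -1.0, -1.5], 0), ([1.0, -2.0, -0.5], 0)]
theta = train(data)
```

On separable data like this the perceptron converges to weights that classify every training point correctly.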

  • @McAwesomeReaper
    @McAwesomeReaper · 8 months ago

    Perceptron is one of my favorite Autobots.

  • @McAwesomeReaper
    @McAwesomeReaper · 8 months ago

    Hopefully in the last 5 years someone at Stanford took it upon themselves to attach some magnets to the bottom of these panes.

  • @user-pf8pe2ed1y
    @user-pf8pe2ed1y · a year ago

    At about 1:07, "These "Ys" and "xs" would have been sampled". I thought for Sufficient Statistics, the Bernoulli distribution would not need to be sampled, it is assumed to have enough data, as a GLM?

  • @saraelshafie
    @saraelshafie · 4 months ago

    How do I get the updated lecture notes? It gives a 404 page on the Stanford website.

  • @finnfinn2002
    @finnfinn2002 · 2 months ago

    cs229.stanford.edu/main_notes.pdf

  • @judychen9693
    @judychen9693 · 5 months ago

    People are too harsh on this lecturer. Even if he's a senior student, he's still learning. He hasn't got decades of experience like Dr. Ng. I think he did a fine job delivering this lecture.

  • @RakibHasan-ee2cd
    @RakibHasan-ee2cd · 5 months ago

    He was actually quite clear for two-thirds of the lecture, except maybe the start.

  • @ian-haggerty
    @ian-haggerty · 3 months ago

    Does anyone have the problem sets available?

  • @PingCheng-wf2pb
    @PingCheng-wf2pb · 4 months ago

    42:00 "Given an x, we get an exponential family distribution, and the mean of that distribution will be the prediction that we make for a given new x"!!!!!

  • @vikrantkhedkar6451
    @vikrantkhedkar6451 · 11 months ago

    How can I get the problem sets?

  • @mrpotatohed4
    @mrpotatohed4 · a year ago

    Great lecture, really helped clarify GLMs for me.

  • @alpaslankurt9394
    @alpaslankurt9394 · 5 months ago

    How can we find the lecture notes? Is there a chance I could get them? Your lecture notes page says 404 Not Found.

  • @studybuddy8307
    @studybuddy8307 · 21 days ago

    cs229.stanford.edu/lectures-spring2022/main_notes.pdf

  • @MAS-cz4mf
    @MAS-cz4mf · 5 months ago

    At 9:14 it should be theta transpose that is perpendicular to the line.

  • @Minato-gn1tz
    @Minato-gn1tz · a year ago

    After the probabilistic interpretation topic in the last lecture, everything has just gone over my head. Can anyone please point me to a good resource for learning statistics and probability at this level?

  • @TusharAnandfg
    @TusharAnandfg · 11 months ago

    Look up some books used in first-year probability and statistics courses.

  • @anubhavkumarc
    @anubhavkumarc · 6 months ago

    It's a Master's-level course, so it makes sense that it assumes you know undergrad statistics. Probably go through some undergrad stats courses.

  • @anubhavkumarc
    @anubhavkumarc · 6 months ago

    I'll add that they list prerequisites on the course website, with specific courses mentioned, so go through that; it'll help.

  • @harshagarwal2517
    @harshagarwal2517 · a month ago

    Can anyone please send the problem sheets, as I am unable to log in through Piazza? Thanks for your help.

  • @OK-lj5zc
    @OK-lj5zc · 7 months ago

    Omg, this course. Lectures 1-3: easy-breezy positivity with Andrew Ng. Lecture 4: getting hit in the head with a textbook. Hope it doesn't keep escalating like this...

  • @user-ut3fk8gw3t
    @user-ut3fk8gw3t · 5 months ago

    At 26:36 in the video, there's a mistake in the expression for phi: the numerator should be e to the power eta, not 1. Just cross-multiply the fraction to check it.

  • @studybuddy8307
    @studybuddy8307 · 13 days ago

    Where do I learn this level of statistics?

  • @rahulpadhy6325
    @rahulpadhy6325 · a year ago

    What is the practical use of the properties explained in exponential families?

  • @fahyen6557
    @fahyen6557 · a year ago

    finding the expected value and variance. duh

  • @anubhavkumarc
    @anubhavkumarc · 6 months ago

    Optimizing the likelihood is much easier in exponential families (that is, you train the model more easily), and the expectation (our hypothesis/prediction) and variance are also much easier to compute (because derivatives are generally less computationally expensive than integrals).
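As a concrete illustration of the reply above: for the Bernoulli written in exponential-family form, the first derivative of the log-partition a(eta) = log(1 + e^eta) gives the mean (the sigmoid of eta), and the second derivative gives the variance, with no integration needed. A rough numeric check in Python (eta = 0.7 is an arbitrary test point):

```python
import math

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def a(eta):
    # Log-partition of the Bernoulli in exponential-family form
    return math.log(1.0 + math.exp(eta))

eta, h = 0.7, 1e-4
# Central finite differences approximating a'(eta) and a''(eta)
mean_fd = (a(eta + h) - a(eta - h)) / (2 * h)
var_fd = (a(eta + h) - 2 * a(eta) + a(eta - h)) / (h ** 2)

phi = sigmoid(eta)        # a'(eta) should match the mean, phi
var = phi * (1.0 - phi)   # a''(eta) should match the variance, phi * (1 - phi)
```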

  • @closingtheloop2593
    @closingtheloop2593 · 4 months ago

    Him writing an expression saying "sum over classes triangle, square, circle" is comedic gold. I died. 1:21:00

  • @adityak7144
    @adityak7144 · 10 months ago

    27:18 Wouldn't the a(η) function (log-partition) be log(1 + e^(-η)) + η, instead of just log(1 + e^(-η))?

  • @ShaluSarojKumar
    @ShaluSarojKumar · 10 months ago

    exactly my question! 😵‍💫
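For what it's worth, the form proposed in this thread is algebraically the same as the standard Bernoulli log-partition, since log(1 + e^(-η)) + η = log((1 + e^(-η)) · e^η) = log(1 + e^η); only log(1 + e^(-η)) on its own would be off. A quick numeric check (Python, purely illustrative):

```python
import math

def a_short(eta):
    # log(1 + e^(-eta)) on its own, without the + eta term
    return math.log(1.0 + math.exp(-eta))

def a_comment(eta):
    # Form proposed in the comment above: log(1 + e^(-eta)) + eta
    return math.log(1.0 + math.exp(-eta)) + eta

def a_bernoulli(eta):
    # Standard Bernoulli log-partition: log(1 + e^eta)
    return math.log(1.0 + math.exp(eta))

# The proposed form agrees with the standard one at every test point
diffs = [abs(a_comment(e) - a_bernoulli(e)) for e in (-3.0, -0.5, 0.0, 1.2, 4.0)]
```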

  • @marcoreichel5194
    @marcoreichel5194 · a year ago

    Man, this guy's lecture was really disorganised and confusing. Why didn't he just follow Andrew Ng's notes?

  • @PingCheng-wf2pb
    @PingCheng-wf2pb · 4 months ago

    Disagree! I think this lecture is well organized and clear. For example, the conclusion about the "learning update rule" is particularly useful but unfortunately missing from the notes. Also, the explanation of the "assumptions/design choices" is clearer than the notes, which gives me a more concrete sense.

  • @projectbravery
    @projectbravery · a year ago

    This was really good; it did take a while compared to the previous lectures with Andrew, though.

  • @calebvantassel1936
    @calebvantassel1936 · a year ago

    How? Its duration is within five minutes of the other lectures.

  • @fahyen6557
    @fahyen6557 · a year ago

    @@calebvantassel1936 what the hell are you talking about

  • @UsmanKhan-tb5zy
    @UsmanKhan-tb5zy · 10 months ago

    @@fahyen6557 the duration of the lecture is almost the same

  • @creativeuser9086
    @creativeuser9086 · a year ago

    What is h(theta) for the last example of softmax regression?

  • @rijrya
    @rijrya · a year ago

    It's the vector of all c different logits, normalized, i.e. the thing he writes on the board at 1:17:21. Search up softmax regression and click the first stanford.edu link for a better explanation.

  • @creativeuser9086
    @creativeuser9086 · a year ago

    @@rijrya Correct. I'm also wondering why we wouldn't train k different binary logistic classifiers instead of the softmax, especially since we can't train the model to take an input that is not in any of the k classes. Say we want to classify the input as either dog, cat, or mouse, and we input a horse: a binary classifier would output 0 for each of the dog, cat, and mouse classifiers, but for softmax p(y) = 0, which makes the likelihood 0 no matter what, so we can't train.

  • @rijrya
    @rijrya · a year ago

    @@creativeuser9086 It probably comes down to efficiency, as creating a binary classification model for each class would be very inefficient, especially as the number of classes increases. Also, since there would be a lot of redundancy in the data used to train each model (i.e. the same data is used multiple times for separate models), I think you might run into overfitting issues. For the example you suggested, I think the solution that still uses softmax regression would be to have the classes dog, cat, mouse, and none of the above; then this would classify a horse with better results.

  • @rijrya
    @rijrya · a year ago

    I searched it up, and it seems that k binary classifiers are typically only preferred over softmax when the classes aren't mutually exclusive, e.g. {dog, cat, animal}; in that case softmax would not work very well.

  • @creativeuser9086
    @creativeuser9086 · a year ago

    @@rijrya I see. Regarding efficiency, there shouldn't be a difference between k binary classifiers and softmax, since we use the data once to train all k classifiers in parallel. The number of parameters and gradient computations are the same.
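To make the thread concrete: under softmax regression, h_theta(x) for a single input is the vector of exponentiated logits normalized to sum to 1. A minimal sketch (the three logit values below are made up, e.g. for dog/cat/mouse):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result sums to 1
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits theta_c^T x for one input over 3 classes
probs = softmax([2.0, 1.0, -1.0])
```

The prediction is the class with the largest probability; unlike k independent binary classifiers, the probabilities are coupled through the shared normalizer.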

  • @dsazz801
    @dsazz801 · a year ago

    A great lecture! Thank you!

  • @haoranlee8649
    @haoranlee8649 · 7 months ago

    great lectures

  • @nanunsaram
    @nanunsaram · 2 years ago

    Thank you!

  • @akintoyefelix5124
    @akintoyefelix5124 · 11 months ago

    Kind of....

  • @priyapandey8951
    @priyapandey8951 · a year ago

    The course schedule and syllabus links provided contain notes links that are not working. Can anybody provide the correct link?

  • @bernarddanice1294
    @bernarddanice1294 · a year ago

    Do you still need it?

  • @priyapandey8951
    @priyapandey8951 · a year ago

    @@bernarddanice1294 Yes, definitely; I need it for my exams.

  • @user-bv4eh6tt9e
    @user-bv4eh6tt9e · a year ago

    @@bernarddanice1294 Sorry, do you still have these links?

  • @harshgupta8936
    @harshgupta8936 · a year ago

    @@bernarddanice1294 Yes, if you have them.

  • @harshgupta8936
    @harshgupta8936 · a year ago

    If you get it from somewhere, please send it.

  • @gracefulmango1234
    @gracefulmango1234 · 10 months ago

    lovely

  • @guiuismo
    @guiuismo · 5 months ago

    What is bro waffling about

  • @AditiYadav-jm8zc
    @AditiYadav-jm8zc · 6 months ago

    wowwwww lecture

  • @yong_sung
    @yong_sung · 10 months ago

    1:00:35

  • @5MrSlavon
    @5MrSlavon · a month ago

    Softmax regression: kzread.info/dash/bejne/m46Ix9iaYLq5hLQ.htmlsi=1cHP27fQImm027xh&t=4101

  • @kmishy
    @kmishy · 2 years ago

    23:02 Sir, it's a PMF, not a PDF.

  • @antonyprinz4744
    @antonyprinz4744 · a year ago

    probability density function = PDF

  • @kmishy
    @kmishy · a year ago

    @@antonyprinz4744 The Bernoulli distribution is for a discrete random variable, and the PMF is defined for discrete random variables.

  • @joshmohanty585
    @joshmohanty585 · 6 months ago

    The distinction between PMF and PDF is entirely artificial; they are both Radon-Nikodym derivatives.

  • @amulya1284
    @amulya1284 · a year ago

    This lecture sure does disappoint 😢

  • @badassopenpolling
    @badassopenpolling · a year ago

    I disagree with your comment. The professor has done a good job and explained the algorithms very well. You got free lectures from a reputed university that has the best people. Show some respect!!

  • @amulya1284
    @amulya1284 · a year ago

    @@badassopenpolling You are right! I got overwhelmed at the beginning when I commented, and the topic he is teaching is very mathematical... but I personally preferred other profs! This guy knows his stuff, but I have seen better explanations of the same content :/🫠

  • @alienfunbug
    @alienfunbug · a year ago

    @@badassopenpolling He's entitled to not being thrilled with the content delivery. Just because it's free, reputable, and in-depth doesn't mean there is no room for improvement. I can 100% assure you the instructor would agree; there's always room for growth. A perfect message means nothing if it is not received by its audience.

  • @creativeuser9086
    @creativeuser9086 · a year ago

    @@badassopenpolling What is h(theta) for the last example of softmax regression?

  • @nowornever7990
    @nowornever7990 · a year ago

    @@creativeuser9086 Theta is the set of parameters (here it's a 2D case because he considered two features, x1 and x2) that defines a straight line, theta1*x1 + theta2*x2 + constant (or a plane in n dimensions). This line (or plane) helps us decide whether a point belongs to a given class or not: if not, then plugging the values of a point x (here x1 and x2) into the equation gives an output less than zero; otherwise the value will be greater than zero if that point is a possible candidate for the class.

  • @nabeel123ful
    @nabeel123ful · a year ago

    This dude obviously didn't have the same level of in-depth knowledge of this important topic as his professor; he just wrote down whatever was in the lecture notes. He couldn't provide insightful comments on what he wrote, which I guess is mainly because he hadn't done any real-world projects/research on these topics and thus didn't deeply understand the material he was lecturing on. I mean, he knows the stuff, so what he lectured was right, but he didn't fully understand it, so his audience would feel confused and bored. Sorry for being too harsh on him, but this topic deserves a much better lecture.

  • @odedgilad9761
    @odedgilad9761 · a year ago

    There was a point where a student asked for one sentence to be written again ("more bigger"), and it wasted 10 seconds of my time. Do you realize that the total time wasted across all the viewers of this video is about 45 days? O-: This man needs to go to jail!

  • @chinthalaadireddy2165
    @chinthalaadireddy2165 · 10 months ago

    🤣

  • @OK-lj5zc
    @OK-lj5zc · 7 months ago

    🤣 👏

  • @vishnumahesh5988
    @vishnumahesh5988 · 5 months ago

    Dude, seriously? You wasted your valuable 10 seconds by commenting here.

  • @leeris19
    @leeris19 · a month ago

    @@vishnumahesh5988 Right? AHAHHAH