Lecture 4 - Perceptron & Generalized Linear Model | Stanford CS229: Machine Learning (Autumn 2018)
For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: stanford.io/ai
Anand Avati
PhD Candidate and CS229 Head TA
To follow along with the course schedule and syllabus, visit:
cs229.stanford.edu/syllabus-au...
Comments: 83
In my view, real examples were missing in this lecture. Examples help clarify the ideas.
I think this lecture is well-organized and easy to follow, which helps me immensely. I started by reading the Notes, but I still didn't understand GLMs. After checking out the comments here, I decided to watch Ng's video (2008 version). Unfortunately, it was pretty much like the Notes. Then I went back to this video and found it amazing! For example, the conclusion of the "learning update rule" is handy but missing from the Notes. Also, the explanation of "assumptions/design choices" is clearer than the Notes, which gives me a more concrete feel. The examples at 59 minutes are also incredibly great. I hope you dig this video and stop getting swayed by the negative comments.
lecture be so good that optimus prime gets curious and asks questions
51:40 Memo. In conclusion, just pick the distribution that fits your model (Gaussian, Bernoulli, and so on) and plug the corresponding value of h_theta(x) into the update rule to train.
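That memo can be sketched in code. This is a minimal illustration of my own (the function names are made up, not from the lecture): the stochastic gradient update has the same form for every GLM member, theta := theta + alpha * (y - h_theta(x)) * x, and only the hypothesis h changes between distributions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def glm_sgd_update(theta, x, y, h, lr=0.1):
    """One stochastic gradient step; the same form for every GLM,
    only the hypothesis h (the canonical response function) changes."""
    return theta + lr * (y - h(theta @ x)) * x

theta = np.zeros(2)
x, y = np.array([1.0, 2.0]), 1.0

# Linear regression (Gaussian): h is the identity.
theta_lin = glm_sgd_update(theta, x, y, h=lambda z: z)

# Logistic regression (Bernoulli): h is the sigmoid.
theta_log = glm_sgd_update(theta, x, y, h=sigmoid)
```

With theta initialized to zero, the Gaussian step moves by lr * y * x, while the Bernoulli step moves by half that (since sigmoid(0) = 0.5), illustrating that only h differs.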
59:30 A good question that sums up what he's been explaining in the lecture.
For anyone who needs it the updated link for the notes are here: cs229.stanford.edu/lectures-spring2022/main_notes.pdf
@Dimi231
4 months ago
Have you seen the PS1 homework they have to do anywhere? The programming and math that they upload on scope?
@vedikagupta407
A month ago
Thank you very much :)
@amplified8706
15 days ago
Thank you very much. I was following Anand's lectures for ML 2019 and I felt lost sometimes, and needed the structured notes to revise after the lecture. Once again, thanks a lot!
We begin by learning about Perceptrons. This is motivated by the previous discussions on logistic regression, where we use the sigmoid function. In the case of perceptrons, we use a modified function in place of the sigmoid. Next, we look at what exponential families are, with some related examples. This is more of a statistics thing. Next, we learn about GLMs (Generalised Linear Models). The appearance of the sigmoid function in logistic regression becomes apparent from this discussion. Finally, we study Softmax regression via the use of Cross-Entropy (defined therein).
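The perceptron piece of this summary can be sketched as follows. This is a minimal sketch of my own, assuming the threshold-at-zero convention: the sigmoid is swapped for a hard step function, but the update rule keeps the same form as logistic regression.

```python
import numpy as np

def step(z):
    """Perceptron activation: a hard threshold in place of the sigmoid."""
    return 1.0 if z >= 0 else 0.0

def perceptron_update(theta, x, y, lr=1.0):
    """Same update form as logistic regression, with g = step."""
    return theta + lr * (y - step(theta @ x)) * x

# Train on a trivially separable dataset (bias feature plus one input).
data = [(np.array([1.0, 2.0]), 1.0), (np.array([1.0, -2.0]), 0.0)]
theta = np.zeros(2)
for _ in range(10):
    for x, y in data:
        theta = perceptron_update(theta, x, y)

predictions = [step(theta @ x) for x, _ in data]
```

On this separable toy set the updates stop once both points are classified correctly.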
@McAwesomeReaper
8 months ago
Perceptron is one of my favorite autobots.
Hopefully in the last 5 years someone at Stanford took it upon themselves to attach some magnets to the bottom of these panes.
At about 1:07, "These Ys and xs would have been sampled". I thought that, for sufficient statistics, the Bernoulli distribution would not need to be sampled; it is assumed to have enough data, as a GLM?
How do I get the updated lecture notes? It gives a 404 page on the Stanford website.
@finnfinn2002
2 months ago
cs229.stanford.edu/main_notes.pdf
People are too harsh on this lecturer. Even if he's a senior student, he's still learning. He hasn't got decades of experience like Dr. Ng. I think he did a fine job delivering this lecture.
@RakibHasan-ee2cd
5 months ago
He was actually quite clear for 2/3 of the lecture, except maybe the start.
Does anyone have the problem sets available?
42:00 "Given an x, we get an exponential family distribution, and the mean of that distribution will be the prediction that we make for a given new x"!!!!
How can I get the problem sets?
Great lecture, it really helped clarify GLMs for me.
How can we find the lecture notes? Is there a chance I can get them? Your lecture notes page says 404 Not Found.
@studybuddy8307
21 days ago
cs229.stanford.edu/lectures-spring2022/main_notes.pdf
At 9:14, it should be theta transpose that is perpendicular to the line.
After the probabilistic interpretation topic in the last lecture, everything has just gone over my head. Can anyone please tell me a good resource to learn statistics and probability at this level?
@TusharAnandfg
11 months ago
Look up some books used in first-year probability and statistics courses.
@anubhavkumarc
6 months ago
It's a Masters-level course, so it makes sense that it assumes you know undergrad statistics. Probably go through some undergrad stats courses.
@anubhavkumarc
6 months ago
I'll add that they have prerequisites on the course website, with specific courses mentioned, so go through that; it'll help.
Can anyone please send the problem sheets, as I am unable to log in through Piazza? Thanks for your help.
omg this course.
Lectures 1-3: easy breezy positivity with Andrew Ng.
Lecture 4: getting hit in the head with a textbook.
Hope it doesn't keep escalating like this...
At 26:36 in the video you made a mistake in the expression for phi: the numerator should be e to the power eta, not 1. Just cross-multiply the fractions to check it.
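For what it's worth, whether the board form was wrong depends on the sign in the exponent: phi = e^eta / (1 + e^eta) and phi = 1 / (1 + e^-eta) are the same quantity (multiply numerator and denominator by e^-eta), whereas 1 / (1 + e^eta) would be 1 - phi. A quick numeric check of the identity (my own sketch, not from the video):

```python
import math

def phi_form_a(eta):
    """phi written as e^eta / (1 + e^eta)."""
    return math.exp(eta) / (1.0 + math.exp(eta))

def phi_form_b(eta):
    """The sigmoid form: multiply top and bottom of form a by e^-eta."""
    return 1.0 / (1.0 + math.exp(-eta))

# The two forms agree at every eta.
checks = [abs(phi_form_a(eta) - phi_form_b(eta)) < 1e-12
          for eta in (-3.0, 0.0, 2.5)]
```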
Where do I learn this level of statistics?
What is the practical use of the properties explained in exponential families?
@fahyen6557
A year ago
finding the expected value and variance. duh
@anubhavkumarc
6 months ago
Optimising the likelihood is much easier in exponential families (that is, you train the model more easily), and the expectation (that is, our hypothesis/prediction) and variance are also much easier to find computationally (because derivatives are in general less computationally expensive than integrals).
Him writing an expression saying " sum of class triangle, square, circle" is comedic gold. I died. 1:21:00
27:18 Wouldn't the a(eta) function (log-partition) be log(1+e^-eta) + eta, instead of just log(1+e^-eta)?
@ShaluSarojKumar
10 months ago
exactly my question! 😵💫
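For what it's worth, log(1+e^-eta) + eta and log(1+e^eta) are the same function (factor e^eta out of the sum inside the log), so the answer depends on which exponent sign was on the board. A quick numeric check of the identity (my own sketch):

```python
import math

def a_direct(eta):
    """log(1 + e^eta), the usual Bernoulli log-partition form."""
    return math.log(1.0 + math.exp(eta))

def a_rewritten(eta):
    """Factor e^eta out: log(1 + e^eta) = log(e^eta * (e^-eta + 1)) = eta + log(1 + e^-eta)."""
    return eta + math.log(1.0 + math.exp(-eta))

agree = [abs(a_direct(e) - a_rewritten(e)) < 1e-12 for e in (-2.0, 0.0, 3.0)]
```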
Man, this guy's lecture was really disorganised and confusing. Why didn't he just follow Andrew Ng's notes?
@PingCheng-wf2pb
4 months ago
Disagree! I think this lecture is well-organized and clear. For example, the conclusion of the "learning update rule" is particularly useful but unfortunately missing from the Notes. Also, the explanation of "assumptions/design choices" is clearer than the Notes, which gives me a more concrete sense.
This was really good, it did take a while compared to the previous lectures with Andrew
@calebvantassel1936
A year ago
How? Its duration is within five minutes of the other lectures.
@fahyen6557
A year ago
@@calebvantassel1936 what the hell are you talking about
@UsmanKhan-tb5zy
10 months ago
@@fahyen6557 the duration of the lecture is almost the same
What is h(theta) for the last example of softmax regression?
@rijrya
A year ago
It's the vector of all c different logits, normalized, i.e. the thing he writes on the board at 1:17:21. Search up softmax regression and click the first stanford.edu link for a better explanation.
@creativeuser9086
A year ago
@@rijrya Correct. I'm also wondering why we wouldn't train k different binary logistic classifiers instead of the softmax, especially since we can't train the model to take input that is not in any of the k classes (say we want to classify the input as either dog, cat, or mouse, and we input a horse); a binary classifier would output 0 for each of the dog, cat, and mouse classifiers, but for softmax p(y)=0, which makes the likelihood 0 no matter what, so we can't train.
@rijrya
A year ago
@@creativeuser9086 It probably comes down to an efficiency issue, as creating a binary classification model for each class would be very inefficient, especially as the number of classes increases. Also, since there would be a lot of redundancy in the data used to train each model (i.e. the same data is used multiple times for separate models), I think you might run into overfitting issues. For the example you suggested, I think the solution that still incorporates softmax regression would be to have the classes dog, cat, mouse, and none of the above; then this would classify a horse with better results.
@rijrya
A year ago
I searched it up, and it seems that k binary classifiers are typically preferred over softmax only when the classes aren't mutually exclusive, e.g. {dog, cat, animal}; in that case softmax would not work very well.
@creativeuser9086
A year ago
@@rijrya I see. Regarding efficiency, there shouldn't be a difference between k binary classifiers and softmax, since we use the data once to train all k classifiers in parallel. The number of parameters and gradient computations are the same.
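The hypothesis being discussed, the normalized vector of c logits, can be sketched as follows (a minimal illustration of my own; the names and numbers are made up):

```python
import numpy as np

def softmax(logits):
    """Exponentiate and normalize so the outputs form a probability distribution."""
    z = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return z / z.sum()

def h_theta(Theta, x):
    """Softmax hypothesis: Theta has one row of parameters per class."""
    return softmax(Theta @ x)

Theta = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])   # 3 classes, 2 features (made-up parameters)
x = np.array([2.0, 0.0])
probs = h_theta(Theta, x)        # one probability per class, summing to 1
```

The output is a length-c probability vector; the predicted class is its argmax.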
A great lecture! Thank you!
great lectures
Thank you!
Kind of....
The course schedule and syllabus links provided contain notes links that are not working. Can anybody provide the correct link?
@bernarddanice1294
A year ago
Do you still need it?
@priyapandey8951
A year ago
@@bernarddanice1294 Yes, definitely. I need it for my exams.
@user-bv4eh6tt9e
A year ago
@@bernarddanice1294 Sorry, do you still have these links?
@harshgupta8936
A year ago
@@bernarddanice1294 Yes, if you have them.
@harshgupta8936
A year ago
If you get it from somewhere, please send it.
lovely
What is bro waffling about
wowwwww lecture
1:00:35
Softmax regression: kzread.info/dash/bejne/m46Ix9iaYLq5hLQ.htmlsi=1cHP27fQImm027xh&t=4101
23:02 Sir, it's a PMF, not a PDF.
@antonyprinz4744
A year ago
probability density function = PDF
@kmishy
A year ago
@@antonyprinz4744 The Bernoulli distribution is for a discrete random variable, and the PMF is defined for discrete random variables.
@joshmohanty585
6 months ago
The distinction between PMF and PDF is entirely artificial. They are both Radon-Nikodym derivatives.
This lecture sure does disappoint 😢
@badassopenpolling
A year ago
I disagree with your comment. The professor has done a good job and explained the algorithms very well. You got free lectures from a reputed university that has the best people. Show some respect!!
@amulya1284
A year ago
@@badassopenpolling You are right! I got overwhelmed in the beginning when I commented, and the topic he is teaching is very mathematical... but I personally preferred the other profs! This guy knows his stuff, but I have seen better explanations of the same content :/🫠
@alienfunbug
A year ago
@@badassopenpolling He's entitled to not being thrilled with the content delivery. Just because it's free, reputable, and in depth doesn't mean there is no room for improvement. I can 100% assure you the instructor would agree; there's always room for growth. A perfect message means nothing if it is not received by its audience.
@creativeuser9086
A year ago
@@badassopenpolling what is h(theta) for the last example of softmax regression?
@nowornever7990
A year ago
@@creativeuser9086 Theta is the set of parameters (here in a 2D plane, because he has considered two features, x1 and x2) which defines a straight line *theta1 * x1 + theta2 * x2 + constant* (or a plane for n dimensions). This straight line (or plane) helps us decide whether a point belongs to a class or not. If we plug the values of a point x (here x1 and x2) into this equation, the output will be less than zero if the point does not belong, and greater than zero if that point is a possible candidate for that class.
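That reply can be illustrated with a tiny sketch (toy numbers of my own): the sign of theta1*x1 + theta2*x2 + constant tells you which side of the decision line a point falls on.

```python
def decision_value(theta, x, bias):
    """theta1*x1 + theta2*x2 + bias; positive means the positive-class side."""
    return theta[0] * x[0] + theta[1] * x[1] + bias

theta = (1.0, -1.0)   # made-up parameters for the line x1 - x2 = 0
bias = 0.0

side_a = decision_value(theta, (2.0, 1.0), bias)   # x1 > x2: positive side
side_b = decision_value(theta, (1.0, 2.0), bias)   # x1 < x2: negative side
```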
This dude obviously didn't have the same level of in-depth knowledge on this important topic as his professor, but just wrote down whatever was in the lecture notes. He couldn't provide insightful comments on what he wrote down, which I guess is mainly because he hadn't done any real-world project/research on the topics and thus didn't deeply understand the stuff he was lecturing on. I mean, he should know the material, so what he lectured was right, but he didn't fully understand it, so his audience would feel confused and bored. Sorry for being too harsh on him, but this topic deserves a much better lecture.
There was a student who was asked to write one sentence again ("more bigger"), and it wasted 10 seconds of my time. Do you realize that the total amount of time wasted across all the viewers of this video is about 45 days? O-: This man needs to go to jail!
@chinthalaadireddy2165
10 months ago
🤣
@OK-lj5zc
7 months ago
🤣 👏
@vishnumahesh5988
5 months ago
Dude, seriously? You wasted your valuable 10 seconds by commenting here.
@leeris19
A month ago
@@vishnumahesh5988 right ? AHAHHAH