Dealing with nonlinear data: Polynomial regression and log transformations

Come take a class with me! Visit simplistics.net
Here's the video on transformations: • Transformations in Sta...
Here's the video on diagnostics plots: • Diagnostics: What to l...
Here's the video on Poisson regression: • Poisson Regression in ...
Here are the videos on generalized linear models: • Understanding Generali...
Here's the link to the blog post I wrote: quantpsych.net/some-thoughts-...
My Multivariate playlist: • Multivariate Statistics

Comments: 50

  • @toryreads (4 months ago)

    Consider this me asking nicely (BEGGING) for the non-linear regression/Bayesian video! :D Also, arm twist! Arm twist!

  • @QuantPsych (4 months ago)

    Noted!

  • @derekcaramella8730 (19 days ago)

    How have I not found this channel sooner? Amazing stuff; binge-watching this channel.

  • @jackelsey7656 (4 months ago)

    Here's another vote for a nonlinear regression analysis video. That approach made sense for my dissertation research (inverse problems with mechanistic time-series models), and I'm curious what your perspective is. It seems to me like weighted least squares can work well in many heteroscedastic contexts if you assume residuals are independent and have a constant CoV.
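
    A minimal R sketch of the weighted-least-squares idea above, assuming a constant coefficient of variation so that Var(e_i) is proportional to E[y_i]^2 and the weights are 1/mu_i^2 (the data are simulated purely for illustration):

        # Simulate data whose error SD grows with the mean (constant CoV)
        set.seed(42)
        x <- runif(200, 1, 10)
        y <- 2 * x + rnorm(200, sd = 0.2 * (2 * x))

        # First pass (OLS) estimates the conditional mean; then reweight by 1/mu^2
        ols <- lm(y ~ x)
        wls <- lm(y ~ x, weights = 1 / fitted(ols)^2)
        summary(wls)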

  • @galenseilis5971 (4 months ago)

    I agree; there are some heteroskedastic processes whose parameters can be estimated with weighted least squares.

  • @user-zf7tk7io6c (4 months ago)

    Hello, I am a statistician. I live in Africa and really appreciate these lessons. So fun; thanks to the teacher.

  • @Lello991 (4 months ago)

    Hey Dustin! Speaking of non-linear data, what about a video on Generalized Additive (Mixed) Models? GA(M)Ms?! I'm sure it'd be sooooo useful for many of us!!!

  • @galenseilis5971 (4 months ago)

    Data itself is almost never linear, although in special cases it can be. For a relation (a subset of a Cartesian product of two sets) to be linear, it must be a function (i.e. left-total and right-unique) satisfying homogeneity of degree one and additivity. The only data sets I have encountered that were linear were synthetic examples.

  • @galenseilis5971 (4 months ago)

    I've used GAMs with time series data, and I can readily see GAMMs being similarly useful in hierarchical time series.
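
    A hedged sketch of the GAM idea with the mgcv package, on simulated time-series-like data (the variable names here are invented for illustration):

        library(mgcv)

        # A smooth trend over time plus noise
        set.seed(5)
        d <- data.frame(t = 1:300)
        d$y <- sin(d$t / 30) + rnorm(300, sd = 0.2)

        # s(t) lets the conditional mean be an arbitrary smooth function of time
        m <- gam(y ~ s(t), data = d)
        plot(m)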

  • @galenseilis5971 (4 months ago)

    I take it back. I don't think any finite sample can be linear, since for any point of maximal magnitude there will exist a scalar multiple of it in the real numbers that is not in the data set.

  • @jackelsey7656 (4 months ago)

    Bottom of the heap has a nice video on it.

  • @galenseilis5971 (4 months ago)

    I look at non-linear regression and Bayesian regression as logically independent classes of models. Both can involve working from more fundamental principles, rather than just grabbing a recipe off the shelf at the other extreme, which I think is a valuable skill for a statistician to have.

  • @dominicl6712 (4 months ago)

    Well done! Looking forward to the GLM video. I still did not fully understand the link functions there.

  • @galenseilis5971 (4 months ago)

    FWIW the Wikipedia page on generalized linear models discusses the role of the link function explicitly.

  • @QuantPsych (4 months ago)

    I've already made a video on GLMs: kzread.info/dash/bejne/haWCj9OlgbKzZaQ.html

  • @galenseilis5971 (4 months ago)

    Plotting the residuals can be very beneficial for learning about the performance of a predictive model. There is a common pitfall worth mentioning, though: the distribution of the residuals is not in general the likelihood distribution. Take, for example, an observation Y ~ Poisson(lambda) and a prediction Yhat ~ Poisson(mu) that is independent of Y. The residual Y - Yhat follows a Skellam distribution rather than a Poisson distribution.
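
    A small simulation of that pitfall: the difference of two independent Poisson variables is Skellam-distributed, so the residuals need not look Poisson at all (the rates here are made up for illustration):

        set.seed(1)
        y     <- rpois(1e5, lambda = 5)   # observations
        y_hat <- rpois(1e5, lambda = 5)   # an independent Poisson "prediction"
        r     <- y - y_hat                # Skellam(5, 5): symmetric, can be negative

        c(mean(r), var(r))                # approx 0 and 10 (= 5 + 5); no Poisson does that
        hist(r, breaks = 40)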

  • @StatisticsSupreme (4 months ago)

    On the log transform: you say the estimate b is now on a log scale, yes. But that is not a problem for interpretation once you transform it back to the original scale. Exponentiate that value and you can interpret it as a multiplicative effect. So there is no real "cost" there. But overall a nice video, as always :D
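
    A short sketch of that back-transformation on simulated data: with log(y) regressed on x, exponentiating the slope gives a multiplicative effect per unit of x (on the geometric mean of y):

        set.seed(7)
        x <- runif(300, 0, 5)
        y <- exp(0.4 * x + rnorm(300, sd = 0.3))   # true effect: exp(0.4) ~ 1.49

        fit <- lm(log(y) ~ x)
        exp(coef(fit)["x"])   # ~1.49: each extra unit of x multiplies y by ~1.5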

  • @galenseilis5971 (4 months ago)

    I agree.

  • @galenseilis5971 (4 months ago)

    Also, since the logarithm is monotonic we can readily anticipate the direction of change in the conditional expectation when we consider a change in one of the predictors.

  • @galenseilis5971 (4 months ago)

    Supposing, for example, the conditional expectation is E[Y|X=x] = exp(m*x + b), then it is straightforward to differentiate with respect to x via the chain rule: dE[Y|X=x]/dx = m * exp(m*x + b). Thus we can calculate how much Y changes on average with respect to a change in x by knowing m, x, and b.
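
    A quick numeric version of that derivative, with made-up values of m and b:

        m <- 0.4; b <- 1
        marginal_effect <- function(x) m * exp(m * x + b)   # dE[Y|X=x]/dx
        marginal_effect(c(0, 1, 2))   # the average change in Y grows with x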

  • @QuantPsych (4 months ago)

    Except that it's not a constant change anymore, meaning we can't say "for every change in our predictor, there is an x-point change in y."

  • @galenseilis5971 (4 months ago)

    @@QuantPsych That's true. Since not everything is linear, or even most things, it is wise not to recoil from introducing non-linearity into models when it is warranted. Reading Kit Yates's "How to Expect the Unexpected", I came across the term "linearity bias", informally defined as the cognitive bias of tending to assume that changes are linear. One concern I have about only (or predominantly) teaching models that are linear in the predictors is that it may inculcate or reinforce linearity bias in students. But I'm not read up enough on the psychology or education literature to say whether that concern has been addressed; it's just a concern for now.

  • @i9iveup (4 months ago)

    I would recommend Fractional Polynomial Models that identify the best transformations of the covariates, with the obvious risk of overfitting and ambiguity in the interpretation of the coefficients.
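
    A minimal sketch of the FP1 idea on simulated data: pick the best single power of x from the conventional fractional-polynomial set by AIC. (The CRAN mfp package automates this and more; here it is done by hand.)

        set.seed(3)
        x <- runif(200, 0.5, 10)
        y <- 3 * sqrt(x) + rnorm(200, sd = 0.5)

        powers <- c(-2, -1, -0.5, 0, 0.5, 1, 2, 3)   # 0 denotes log(x) by FP convention
        fits <- lapply(powers, function(p) {
          xt <- if (p == 0) log(x) else x^p
          lm(y ~ xt)
        })
        powers[which.min(sapply(fits, AIC))]          # best-fitting power (the truth here is 0.5)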

  • @galenseilis5971 (4 months ago)

    Oh neat, I didn't know that approach by name. I agree that overfitting is the largest risk with fractional polynomial models, since they're a natural superset of polynomial models.

  • @galenseilis5971 (4 months ago)

    It has been a long while since I have really thought about semi-partial correlation coefficients. But if memory serves, the partial correlation is not in general equal to the conditional correlation except under certain families of distributions; joint multivariate normality is a standard sufficient condition for the two to coincide.

  • @galenseilis5971 (4 months ago)

    I have noticed the word "line" in "linear", but unfortunately the terminology is more complicated than Dustin presented. I'll give a couple of reasons. The first is that they are not synonyms in mathematics: all lines are linear, but not all linear functions are lines. For example, the derivative operator is linear on the space of analytic functions, but it is not a line per se. The second is that statisticians were focused on the parameters when they coined the term "linear model". Conventionally, "linear model" refers to a regression model whose conditional expectation is linear in the unknown parameters. This makes both the polynomial regression and the log-transformed regression model in the video special cases of linear models, as the sketch below illustrates.
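
    A small sketch of the "linear in the parameters" point: lm() fits the curved quadratic by ordinary linear least squares, and solving the normal equations by hand gives the same coefficients (data simulated for illustration):

        set.seed(9)
        x <- runif(150, 0, 4)
        y <- 1 + 2 * x - 0.5 * x^2 + rnorm(150, sd = 0.3)

        quad <- lm(y ~ x + I(x^2))     # a curve, but linear in the three parameters
        X <- model.matrix(quad)
        solve(t(X) %*% X, t(X) %*% y)  # normal equations reproduce coef(quad)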

  • @TheMrSodo91 (4 months ago)

    Thanks for your work, as always. I am approaching Bayesian statistics, so it would be great to see you go into Bayesian regression. Please please please!

  • @QuantPsych (4 months ago)

    It's on my to-do list :)

  • @adrianor.397 (4 months ago)

    What package has the visualize() function? Great explanations, as usual!

  • @QuantPsych (4 months ago)

    flexplot
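
    For anyone finding this later, a hedged usage sketch, assuming the flexplot package is installed and that d, x, and y stand in for your own data:

        # install.packages("flexplot")   # if needed
        library(flexplot)

        model <- lm(y ~ x + I(x^2), data = d)
        visualize(model)   # plots the fitted model over the raw data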

  • @igoryakovenko1343 (4 months ago)

    Great video! Any chance you'd be open to sharing a link to the dataset used, so we can re-create the exercise and try it ourselves? Thank you!

  • @galenseilis5971 (4 months ago)

    If the data are not available, you can readily simulate data suitable for these cases if you just need them for practice.
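
    In that spirit, a minimal simulation with a curved relationship (the variable names are invented; this is not the video's dataset):

        set.seed(123)
        n <- 200
        stress     <- runif(n, 0, 10)
        depression <- 20 - 3 * stress + 0.25 * stress^2 + rnorm(n, sd = 2)
        d <- data.frame(stress, depression)   # ready for lm(depression ~ stress + I(stress^2))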

  • @QuantPsych (4 months ago)

    Most of my datasets are here: quantpsych.net/data/
    That particular one is called depression_wide.

  • @igoryakovenko1343 (4 months ago)

    @@QuantPsych Thank you for directing me to the link. Am I completely blind, or does the depression_wide set not contain any of the variables in the video (i.e., cancer-related or rizz)? Sorry if I'm missing it somewhere.

  • @1982757 (4 months ago)

    How do you interpret the coefficients of the polynomial model?

  • @galenseilis5971 (4 months ago)

    Hint: the conditional expectation of the response given a predictor will be monotonic in the coefficients under mild assumptions.

  • @QuantPsych (4 months ago)

    It's not intuitive. It's the expected change in Y when the square of X increases by one unit, which isn't something you can vary on its own. The only thing that's really intuitive is the sign (positive versus negative, indicating whether the curve is concave upward or downward, respectively). I usually don't bother interpreting it; I just look at the plot.

  • @galenseilis5971 (4 months ago)

    @@QuantPsych In the case of a quadratic polynomial, the sign of the leading coefficient tells us about concavity/convexity. This works because a twice-differentiable function of a single variable is convex if and only if its second derivative is nonnegative on its entire domain, and a similar result holds for concavity. Polynomials are always twice-differentiable, but many polynomials are neither convex nor concave over their entire domain. The second derivative of a quadratic is constant, equal to twice the leading coefficient, which is why the inference is straightforward in this case. The leading term of higher-degree polynomials cannot reliably be used this way.

    For single-variable functions I'd give the same recommendation: just look at a plot. When you get into multivariable systems (which is typical of realistic systems) it is much more difficult to eyeball convexity/concavity; trying to infer it visually from PCA plots or parallel-axis plots is unlikely to be reliable, for example. If you're lucky enough to have a function that is twice-differentiable in all its inputs, you can generalize the single-variable result: find the stationary points using the gradient, then use the Hessian to (1) determine which stationary points are optima and (2) evaluate convexity/concavity/neither at those points from the signs of its eigenvalues.
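
    A quick visual check of the quadratic case, with made-up coefficients:

        # f(x) = a*x^2 + b*x + k has f''(x) = 2*a everywhere,
        # so sign(a) settles concavity once and for all
        a <- -0.5; b <- 2; k <- 1
        curve(a * x^2 + b * x + k, from = -5, to = 5)   # opens downward since a < 0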

  • @1982757 (4 months ago)

    @@QuantPsych In other words, it depends...

  • @galenseilis5971 (4 months ago)

    The biggest limitation of polynomial regression is overfitting. Via the Stone-Weierstrass theorem, a polynomial with sufficiently many terms can approximate any continuous function on a closed interval as closely as we like. In fact, many functions (including the exponential function; wink wink) have a Taylor series, which is essentially a polynomial with infinitely many terms.
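
    A small demonstration of that risk on simulated data: a high-degree polynomial chases the noise that a low-degree one ignores, even though its in-sample fit looks better:

        set.seed(11)
        x <- seq(0, 1, length.out = 30)
        y <- sin(2 * pi * x) + rnorm(30, sd = 0.3)

        lo <- lm(y ~ poly(x, 2))
        hi <- lm(y ~ poly(x, 15))
        c(low = summary(lo)$r.squared, high = summary(hi)$r.squared)   # higher degree "wins" in-sample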

  • @1997aaditya (2 months ago)

    Why don't you use poly(var_name, n) instead, for orthogonal polynomials?

  • @QuantPsych (2 months ago)

    Because I can never remember how to do that.
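
    For reference, a small sketch of the syntax in question on simulated data: poly() builds orthogonal polynomial terms by default, while raw = TRUE reproduces y ~ x + I(x^2); either way the fitted model is the same.

        set.seed(2)
        x <- runif(100, 0, 3)
        y <- 1 + x - 0.4 * x^2 + rnorm(100, sd = 0.2)

        fit_orth <- lm(y ~ poly(x, 2))                 # orthogonal polynomials
        fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE))     # raw powers of x
        all.equal(fitted(fit_orth), fitted(fit_raw))   # TRUE: same fit, different coefficients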

  • @bobbrian1641 (4 months ago)

    You ARE hot. And entertaining. Great video. I will have to learn this stuff eventually... Subscribed.

  • @QuantPsych (4 months ago)

    Ha! Flattered again :)

  • @galenseilis5971 (4 months ago)

    The phrasing "y=x^2 gives a polygon" must have been a brain fart. It happens. Polygons and polynomials are distinct mathematical concepts.

  • @QuantPsych (4 months ago)

    Yes, the brain did indeed fart.