How to Visualize Data (Part 1)

Do you want more structured and personalized information? Come take a class with me! Visit simplistics.net and sign up for self-guided or live classes.
Additional Resources:
• Univariate Visualizati...
• Controlling/Conditioni...
• Understanding Generali...
• Bivariate Visualizatio...
• Dealing with nonlinear...

Comments: 47

  • @samj.vizcaino-vickers8512
    4 months ago

    Thank you for the Visual Partitions article!! :)

  • @QuantPsych
    4 months ago

    Of course!

  • @galenseilis5971
    4 months ago

    I take the adjective "multivariate" to mean that there are multiple (often taken as "more than two") statistically dependent variables in the full joint distribution of the variables being considered. This definition mostly agrees with what people more broadly call "multivariate analysis", which might include MAN(C)OVA, GLM, PCA, factor analysis, CCA, CA, PCoA, discriminant analysis (e.g. LDA), clustering, recursive partitioning, ANNs, parallel coordinate plots, simultaneous equations, vector autoregression, and many others. But it does disagree with people who say that "multivariate regression" merely means multiple predicted variables, as my view allows for further statistical dependence among the predictors. I've even seen people say that you cannot perform regression when there are statistically dependent predictors (often citing perfect multicollinearity or variance inflation). I don't agree with that at all. In my experience, what is often required is to include a latent covariance matrix over those variables.

  • @janja7471
    4 months ago

    Good to see some new stuff coming up.

  • @galenseilis5971
    4 months ago

    The marginal plots are a neat idea.

  • @galenseilis5971
    4 months ago

    I've never taught an introduction to categorical variables, but I expect that the visualization of "drawing a slope" helps students see that this is part of the space of linear models. It is metaphorical as there are no intermediate values in the data between the two levels of a binary (categorical) variable.

  • @galenseilis5971
    4 months ago

    If you're going to plot integer-valued data using a histogram, I recommend taking care to set the bin widths to unity and decide whether you want the bin boundaries to be left/right/center-located.
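One way to make the bin-placement choice explicit is sketched below in plain Python. The helper names `unit_bin_edges` and `unit_bin_counts` and the `align` argument are hypothetical, invented for this illustration; they are not part of flexplot or any other library.

```python
from collections import Counter

def unit_bin_edges(values, align="center"):
    """Return unit-width bin edges covering integer-valued data.

    align="center": each integer k sits in the middle of [k - 0.5, k + 0.5).
    align="left":   each integer k is the left edge of [k, k + 1).
    align="right":  each integer k is the right edge of (k - 1, k].
    """
    offsets = {"center": -0.5, "left": 0.0, "right": -1.0}
    lo, hi = min(values), max(values)
    start = lo + offsets[align]
    # One bin per integer from lo to hi, so hi - lo + 2 edges in total.
    return [start + i for i in range(hi - lo + 2)]

def unit_bin_counts(values):
    """Count how many observations fall on each integer from min to max."""
    tally = Counter(values)
    return [tally.get(k, 0) for k in range(min(values), max(values) + 1)]

data = [1, 2, 2, 3, 3, 3, 5]
edges = unit_bin_edges(data, align="center")
counts = unit_bin_counts(data)
```

With centered edges no observation can land exactly on a boundary, so the counts do not depend on whether a plotting library treats its bins as left-closed or right-closed.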

  • @marketingresearchmethods3078
    3 months ago

    These videos are not for the first-timers. Cool for those who already know how to chat about statistics.❤

  • @QuantPsych
    3 months ago

    Alas, this is true.

  • @galenseilis5971
    4 months ago

    One thing I would like to do more of is use colour palettes in my plots that are still distinguishable to people with colour blindness. Using different line and marker styles also helps.
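A minimal sketch of pairing a colour-blind-friendly palette with redundant line and marker styles follows. It assumes the Okabe-Ito palette as the colour set; the `make_styles` helper is hypothetical, and the style strings follow matplotlib's naming conventions only as an assumption about the plotting library in use.

```python
from itertools import cycle

# Okabe-Ito palette, widely recommended as distinguishable under common
# forms of colour vision deficiency.
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#000000"]
LINE_STYLES = ["-", "--", "-.", ":"]
MARKERS = ["o", "s", "^", "D", "v"]

def make_styles(labels):
    """Give each label a colour plus a line style and a marker.

    The line style and marker are redundant cues, so groups stay
    distinguishable even when two colours look alike to a reader.
    """
    colours = cycle(OKABE_ITO)
    lines = cycle(LINE_STYLES)
    markers = cycle(MARKERS)
    return {
        label: {"color": next(colours),
                "linestyle": next(lines),
                "marker": next(markers)}
        for label in labels
    }

styles = make_styles(["control", "treatment A", "treatment B"])
```

Each entry in `styles` is a dictionary of keyword arguments, so with matplotlib it could be passed along as `plt.plot(x, y, **styles["control"])`.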

  • @dominicl6712
    4 months ago

    Very well done with the three-way interaction. I currently have to deal with it 😢

  • @faruqueazamwalid219
    2 months ago

    Thanks a bunch! I have a random question: how can I give a numeric variable a proportional weight on the outcome variable in an lm/glm and visualise it? It is much like weighting the outcome by each study's sample size "n", as in a meta-analysis. Thanks in advance.

  • @jseen9568
    2 months ago

    Your last example is essentially a mediation or a moderation effect, correct? And a 4-way interaction, or changing variable D, alters the effect of variable A on variable B in the presence of variable C, which is essentially a moderated-mediation effect.

  • @lzuo123
    4 months ago

    really enjoy the video, thanks!

  • @aza6513
    4 months ago

    Oh, thanks for coming back! Still waiting for you to make tutorials on latent variable modeling, factor analysis, PCA optimal scaling, SEM, IRT as GLMM, DCM, etc.

  • @QuantPsych
    4 months ago

    I'll put them on my list :)

  • @toad8427
    4 months ago

    Goated

  • @galenseilis5971
    4 months ago

    N-way interaction effects can be well understood through a combination of pure mathematics and simulation experiments. I don't tend to find them completely on their own, but if you do, they should tend to maximize the 'multilinear' product-moment correlation coefficient that I developed in my MSc thesis (section 3.1, if I recall correctly). These days I feel lukewarm about the correlation functions I defined, but my discussion of parity, signum and orthants might provide some intuition.

  • @galenseilis5971
    4 months ago

    What precisely is meant by "curvilinearity" in this video? Just a synonym for a model which is nonlinear in its predictors? The examples of log transform and polynomial predictors in the recent video are still just linear models in the conventional sense of being linear in the parameters. That would seem to match.
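The distinction raised here can be written out as a sketch in generic notation (the symbols below are illustrative and not taken from the video). A polynomial predictor and a log-transformed predictor both give models that are curved in $x$ yet linear in the parameters:

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i
\qquad\text{and}\qquad
y_i = \beta_0 + \beta_1 \log(x_i) + \varepsilon_i
```

By contrast, a model such as the following is nonlinear in its parameters, because $\beta_1$ sits inside the exponential:

```latex
y_i = \beta_0 \, e^{\beta_1 x_i} + \varepsilon_i
```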

  • @robyndasilva4609
    4 months ago

    Love the videos! request for a future video about moderation mediation analysis in R? :)

  • @QuantPsych
    4 months ago

    I'll add it to my list :)

  • @jessicaray1762
    2 months ago

    Hey can you do a video explaining the dispformula in glmmTMB? Thanks ahead of time.

  • @ambiirnyc
    4 months ago

    this is great

  • @mariogallego9678
    4 months ago

    Looking forward to watching the second part! How could I plot a 3-way interaction model with flexplot if I have two random factors?

  • @QuantPsych
    4 months ago

    Meaning two categorical variables? The same way you would with two numeric variables.

  • @mariogallego9678
    4 months ago

    @@QuantPsych exactly, the 2 random factors are categorical. Additionally, one of the predictors of the 3-way interaction term is numerical. Ideally, it should be the x axis and plot the trends. So far, I am using emtrends.

  • @davidpiterman1100
    4 months ago

    Hi, thank you for all your videos. Do you have a video explaining how to handle outliers (e.g., omit z-scores > 3, 1.5 × IQR, etc.)?

  • @QuantPsych
    4 months ago

    Not specifically that. My approach is to analyze it as is, then analyze it with the outlier deleted and see if it matters. If it doesn't matter, I don't have to worry about it. If it does, I report the results of both approaches.
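The with-and-without comparison described above can be sketched as follows. This is only an illustration for a single-predictor least-squares line; the helper names and the `drop_index` argument are hypothetical, and deciding which observation counts as the outlier remains the analyst's judgment.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def outlier_sensitivity(x, y, drop_index):
    """Fit the model with and without one suspected outlier."""
    full = fit_line(x, y)
    x_kept = [v for i, v in enumerate(x) if i != drop_index]
    y_kept = [v for i, v in enumerate(y) if i != drop_index]
    reduced = fit_line(x_kept, y_kept)
    return {
        "with_outlier": full,
        "without_outlier": reduced,
        "slope_change": reduced[0] - full[0],
    }

# Made-up data: the last point sits far above the trend of the others.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]
result = outlier_sensitivity(x, y, drop_index=5)
```

If `slope_change` is negligible the outlier can be ignored; if not, both fits are reported.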

  • @galenseilis5971
    4 months ago

    Lines and line segments are not actually the same thing. Colloquially it is fine to call a line segment a line, but when careful reasoning is required I think that conflating these terms can create confusion. Despite the word "line" in "line segment", line segments are not lines and they are not linear due to their end points. Lines by definition must go on forever in both directions. You have thousands of years of geometry to thank for that. If I recall correctly, Euclid was the first mathematician to axiomatize Euclidean geometry. Among those axioms is the parallel postulate. Interestingly, rejecting the parallel postulate leads to elliptic and hyperbolic geometry. And just before you think that's just math trivia, both of these non-Euclidean geometries are used in physics. And if you think that's weird, wait till you see pseudo-Riemannian manifolds.

  • @anne-katherine1169
    4 months ago

    You could make up a better word to replace "multivariate" for multiple predictors. Then it gets a Wikipedia entry and you get one for inventing it. *Plus one for marginal plots; why not.

  • @galenseilis5971
    4 months ago

    My experience with Wikipedia is that the term would usually need to be somewhat accepted by a community of people 'before' moderators would allow such a Wikipedia page to exist.

  • @QuantPsych
    4 months ago

    I should :). Maybe just a "dustin" analysis. "Oh, I see you have multiple predictor variables. Have you ever taken a dustin class? You need to use a dustin analysis for this."

  • @galenseilis5971
    4 months ago

    Some of this approach seems to assume that if you have evidence of non-linearity in the predicted variable (as a function of the predictors) that it is presumably from an interaction effect. While interaction effects are non-linear (specifically multilinear), they are just one of an infinite number of ways for something to be non-linear. If I were to say that a non-linearity isn't an interaction effect it would be like saying that something isn't a banana; it doesn't narrow it down very much. I think that interaction effects are worth considering, but when I see evidence of non-linearity I ask a broader question about what form of non-linearity rather than defaulting to the assumption of an interaction effect.
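The "multilinear" point can be checked numerically with a small sketch. The coefficients and function names below are made up for illustration: with a product term, the prediction is still a straight line in x1 for any fixed x2 (zero second difference), but the slope of that line depends on x2. A quadratic term, by contrast, curves in x1 itself.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=-1.0, b3=0.5):
    """Two-predictor model with a product (interaction) term."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_in_x1(x2):
    """Change in the prediction per unit of x1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)

def second_difference(f, x, h=1.0):
    """Discrete curvature of f at x; zero for any straight line."""
    return f(x + h) - 2 * f(x) + f(x - h)

slope_low = slope_in_x1(0.0)    # equals b1 + b3 * 0
slope_high = slope_in_x1(2.0)   # equals b1 + b3 * 2

# Straight in x1 when x2 is held at 3.0, despite the interaction term.
interaction_curvature = second_difference(lambda x1: predict(x1, 3.0), 2.0)
# A squared term is a different kind of non-linearity: it bends in x1.
quadratic_curvature = second_difference(lambda x1: x1 ** 2, 2.0)
```

So a plot that bends within a single panel points to something other than a pure interaction, while straight lines whose slopes differ across panels are what an interaction would produce.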

  • @QuantPsych
    4 months ago

    I see why you're saying that, but that's not what I meant. In retrospect, I should have clarified that a bit. If you read the paper, it's clearer why you're looking for nonlinear effects.

  • @galenseilis5971
    4 months ago

    Fair enough.

  • @tatjanajak
    4 months ago

    Yay!

  • @QuantPsych
    4 months ago

    Double yay!

  • @zimmejoc
    4 months ago

    When looking at the intercepts to see if there's a main effect, shouldn't we mean-center things? Even the smallest change in slope can make a big difference in the intercept if our X range is far from zero rather than close to it. I love geeking out about stats, but nobody else I know does :(

  • @QuantPsych
    4 months ago

    Unless I'm missing something, I don't see how that will change anything. The axis labels will change, but the angle of the slopes won't. See the R code below (if YouTube will allow me to post it??).

    require(tidyverse)
    require(flexplot)
    a = matrix(.4, nrow=3, ncol=3); diag(a) = 1
    a = fifer::cor2cov(a, c(5,5,5))
    d = MASS::mvrnorm(n=555, mu=c(50,50,50), Sigma = a) %>%
      data.frame %>%
      set_names(c("y", "x1", "x2"))
    flexplot(y~x1 + x2, data=d, method="lm")
    flexplot(y~x1 + x2, data=d %>% mutate(across(everything(), scale)), method="lm")
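The two sides of this exchange can be illustrated with a small numeric sketch in plain Python. It is not a translation of the R code, and the data and helper name are made up: centering a predictor leaves the fitted slope unchanged, while the intercept moves from the fitted value at x = 0 to the fitted value at the mean of x.

```python
def ols(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data whose x values sit far from zero.
x = [48, 49, 50, 51, 52]
y = [10, 12, 13, 15, 17]

mean_x = sum(x) / len(x)
x_centered = [xi - mean_x for xi in x]

raw_slope, raw_intercept = ols(x, y)
centered_slope, centered_intercept = ols(x_centered, y)
```

The line on the plot is the same either way; only the location of zero on the x axis, and therefore the number reported as the intercept, differs.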

  • @zimmejoc
    4 months ago

    @@QuantPsych I'm getting an error when I try to install your fifer package. Welp, that means I need to sign up for your simplistics R course. I know just enough R to be dangerous, and ChatGPT has made me very dangerous with R. Time to learn R for realz... I was wanting to hold off until I had my latest R&R back at the journal and my current paper submitted before pulling that trigger.

  • @galenseilis5971
    4 months ago

    @@zimmejoc I think you're coming to a reasonable conclusion that ChatGPT is not really a substitute for knowing how to code.

  • @jeevacation
    4 months ago

    I really like these videos but the background music is distracting sadly :(

  • @determinedsalmon178
    4 months ago

    Could you please turn up the volume of the background music? Like a lot?

  • @QuantPsych
    4 months ago

    Ha! Somebody still complained and I can only hear it when I am not talking *at all*. The BG music helps disguise the noise of cars driving outside my window, and it gives a "mood" to the videos I quite like :)

  • @galenseilis5971
    4 months ago

    @@QuantPsych FWIW I like the music itself, but there are various tools and associated tutorials for removing background noises such as vehicle traffic.

  • @determinedsalmon178
    4 months ago

    @@QuantPsych I was jk. Love your videos.