Quant Psych

Visual Partitions

Advanced dplyr practice in R

Practice with dplyr in R

Comments

  • @ALI_B
    1 day ago

    Great stuff as usual. Keep it up.

  • @BobbyBrown-n3h
    1 day ago

    Awesome! Well done

  • @thejll
    1 day ago

    Nice videos! Do you have one where you explain why we should use mixed-effect modelling instead of doing separate regressions on data subsets?

  • @QuantPsych
    1 day ago

    Kind of. You can check out this video: kzread.info/dash/bejne/Z6iDy8iGZZTAf84.html

  • @Dodong-pf8dc
    3 days ago

    No Taylor series expansion?

  • @babak0203
    5 days ago

    Sir, what should we do in negatively skewed cases? Gamma?

  • @gabrielbrandao9857
    5 days ago

    Guy! You're amazing. Good job!

  • @vinitalec
    6 days ago

    Your videos are excellent! Thanks for helping me understand this subject.

  • @kiwanukajoseph6812
    7 days ago

    So can we conclude that tobit models, truncated models, and the Heckman model (tobit II model) follow a Gamma distribution?

  • @tomaswust3505
    8 days ago

    Extremely helpful video! Thank you for your clear explanations

  • @IArkProject
    9 days ago

    Lol this was awesome, despite super intense delivery it was really awesome with the old school 40s music, and made it feel wholesome. And very helpful metaphor.

  • @jorgeluizdesantanajunior6775
    9 days ago

    Excellent video. I'd just add that you have to be careful about how you interpret interaction terms when both variables are continuous. In a model with an interaction of two continuous variables, the coefficient on x is the effect of x when the moderator equals 0, and the interaction term is how much that effect changes when the moderator goes from 0 to 1 (similar to a dummy). However, 0 and 1 may not make much sense in your analysis. For instance, if you are interested in a relationship (y ~ x) that changes with age and your sample comprises people between 18 and 60 years old, you're measuring how the relationship changes as someone goes from 0 to 1 year old. That is not very useful if you're studying people between 18 and 60. It could be that the relationship is negative for people who are zero years old and increases until it turns positive at 10 years old. As a result, you may be misled into thinking that the relationship changes from negative to positive, when in your sample (18-60) it is never negative. In other words, you can change the sign of the x coefficient just by changing the scale of age, because the coefficient on x always captures the effect of x when age is zero. So if you have a continuous variable in an interaction, I usually recommend rescaling it to mean 0 and standard deviation 1. Then you have a useful interpretation: the coefficient on x captures the relationship at the average value of the moderating variable, and the interaction term captures how the relationship changes if you increase or decrease the moderator by 1 standard deviation.

  • @QuantPsych
    2 days ago

    Centering is a good idea if you're interested in interpreting the numbers. I usually don't bother with interpreting them and instead just plot them.
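
The rescaling point in this exchange can be sketched numerically. The snippet below is a minimal illustration in plain Python (not the R workflow used on the channel), and everything in it is an assumption made for the example: the data are hypothetical and noise-free, the helper names `solve` and `fit_interaction` are invented here, and the true slope of x is set to -1 + 0.1 * age so that it changes sign at age 10, as in the comment above.

```python
from statistics import mean, stdev


def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_interaction(x, moderator, y):
    """OLS fit of y ~ 1 + x + moderator + x:moderator via the normal equations."""
    cols = [[1.0] * len(x), x, moderator, [a * m for a, m in zip(x, moderator)]]
    XtX = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(XtX, Xty)  # [intercept, b_x, b_moderator, b_interaction]


# Hypothetical, noise-free data: the slope of x is -1 + 0.1 * age,
# so it is negative at age 0 but positive for everyone aged 18 to 60.
x = [float(v) for v in range(5) for _ in (18, 30, 45, 60)]
age = [float(a) for _ in range(5) for a in (18, 30, 45, 60)]
y = [2.0 - 1.0 * xi + 0.1 * ai + 0.1 * xi * ai for xi, ai in zip(x, age)]

b_raw = fit_interaction(x, age, y)

# Rescale age to mean 0 and standard deviation 1, then refit.
age_mean, age_sd = mean(age), stdev(age)
age_z = [(a - age_mean) / age_sd for a in age]
b_z = fit_interaction(x, age_z, y)

print("coefficient on x, raw age:   ", round(b_raw[1], 3))
print("coefficient on x, scaled age:", round(b_z[1], 3))
```

With raw age the coefficient on x is the slope at age 0 (negative here, even though nobody in the sample is that young); with rescaled age it is the slope at the sample mean age. The fitted relationship is identical in both cases, which is why plotting, as suggested in the reply, sidesteps the issue.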

  • @dimitrioskioroglou4316
    11 days ago

    I totally agree with you... ranks are not that useful. The way I think of it is that ranks result from an underlying latent process. We need to understand and properly model the process, not the ranks, which represent a snapshot. It is not the easiest thing to do. But better to try tricky stuff than to chase ghosts.

  • @carylelainecastaneda5924
    13 days ago

    Dang. This might be the only time I have come to appreciate econometrics. Thanks! You're such a great teacher!

  • @English_Sauce
    14 days ago

    Someone commented that your channel is like a gold mine hahaha. They are right

  • @anne-katherine1169
    15 days ago

    Ok, I love flexplot (and I'm watching your oldest videos because I love your videos), but something that bothers me is that I don't know how to get the same graphs without it, so I don't know all the details of what it is actually doing, so how would I know if something was wrong, or distorted, or if there were additional details that I'm missing because I skipped many steps by just having a plot right away?

  • @QuantPsych
    2 days ago

    Are you saying you'd like the underlying ggplot code? There is an option in flexplot to return a string of the ggplot code. I think it's something like "flexplot(y~x, data=d, plot.string=T)"

  • @sjrigatti
    17 days ago

    Are you teaching statistics in COMIC SANS??

  • @QuantPsych
    2 days ago

    Nope! Moon flower bold.

  • @tilakbhusal2577
    19 days ago

    My thesis is based on HLM and you just saved my life.

  • @cameronhanna367
    19 days ago

    These videos are excellent

  • @kriegsmandot1
    20 days ago

    Hey! Thanks so much for these videos! Where might I find this 3 hospitals data set you used?

  • @ClaptarTV
    20 days ago

    A very nice video! Thank you for sharing

  • @yulia6354
    24 days ago

    As a Russian person I think you nailed the Russian accent! Well done :D and thanks for your videos! As a medical doctor and a big fan of statistics I really love your way of teaching people complicated stuff)

  • @QuantPsych
    2 days ago

    High praise from a native :)

  • @calenwu
    25 days ago

    mf explains this in 6 minutes while my computational statistics professor can't explain this in 2 lectures

  • @criticallyunderfunded3707
    26 days ago

    Hi, I have recently stumbled upon your great channel and have been binging your videos. To preface this, I am not anyone special. I am currently doing my Masters in Psych/Neuroscience, and stats and stats communication is my side job. With that being said, I vehemently disagree with the perspective that everything should be taught as a linear model. Much of what we do in Psychology is linear modelling, but I have found that it is simpler to teach things like the t-test, the F-test or the ANOVA independently and then bring them all together as little facets of the GLM. It frustrates me to no end when people come to me for help and their professors have decided to just teach everything under the guise of a linear model. For them it just seems completely overwhelming. In principle, I get your argument. It would be better if people had a full understanding of the GLM or even the GLIM, but purely didactically speaking this is just too much. Let me give an example. I can teach someone the basics of ANOVA in around 15 minutes. They do not need any special knowledge for my explanation. If they understand, I can teach them about the F-test, which gives me a vehicle to repeat some things about the chi-squared and the F distribution. If I want to later integrate it into the GLM, they will have heard the logic of ANOVA thrice (for one factor, for multiple factors and for ANCOVA) and the transition into the GLM is easier. If I teach the ANOVA purely through the GLM lens I need to do so much more preparation. My pupils would have to have a somewhat firm grasp of dummy coding, the basics of regression analysis and the different kinds of linear model parameters. Then on top of this I have to explain the logic of ANOVA, but not the native logic of ANOVA. Instead I have to explain ANOVA in GLM, which on its own is (imo) a bit more difficult to explain than pure ANOVA.
    I could, of course, just explain ANOVA and then explain ANOVA in GLM, but this is just the same thing that I initially outlined, just crammed into one session with more effort and less time spent on what ANOVA really does. As an aside, I think ANOVA is a brilliant procedure and gives a good introduction to quantifying questions that are not straightforward. However, in my experience, if ANOVA is being taught from the GLM perspective students get the impression that ANOVA is part of the GLM and falls to the same assumptions, which is just not true. ANOVA is exceedingly robust to almost all violations, which cannot be said of the GLM. I

  • @QuantPsych
    2 days ago

    Agree to disagree. "I can teach someone the basics of ANOVA in around 15 minutes." Yes, but then you have to take another 15 minutes to explain a one-sample t. Then another 15 for an independent t. Then another for a related t. And so on and so on. I can teach someone the basics of GLM in 15 minutes, then they never have to learn another procedure. Instead, they build on what I've already taught them. "My pupils would have to have a somewhat firm grasp of dummy coding." I wouldn't consider that a prerequisite. I do teach dummy coding, but tell them, "YOU don't have to do this. The computer will do this for you. But I'm showing you what's happening in the background so you know it's not magic." "if ANOVA is being taught from the GLM perspective students get the impression that ANOVA is part of the GLM and falls to the same assumptions, which is just not true. ANOVA is exceedingly robust to almost all violations, which cannot be said of the GLM." I very much disagree. ANOVA is just a different way to rearrange the math, so they're equivalent. The assumptions and robustness are identical. I've got a bunch of videos on this and a textbook. You're welcome to check them out to see how it works.

  • @anangelsdiaries
    28 days ago

    Great video, subscribed!

  • @RichmondDarko-qo2me
    1 month ago

    Thank you very much for such informative videos. I spent several years in class and didn't understand all these concepts, but watching this video has made things easier for my comprehension. I have a few questions I would like to ask: When performing a statistical test, we use a parametric test if the data or variable in question is normally distributed, and a non-parametric alternative if the data or variable is not normally distributed. My question is: when does the central limit theorem come into play here? Also, a colleague of mine told me to always use parametric tests even if the data is not normally distributed. His explanation was that parametric tests are more powerful than non-parametric tests. So, should I straightforwardly use the non-parametric alternative when I observe that my data is not normally distributed, or should I take the CLT into consideration and use the parametric test?

  • @QuantPsych
    2 days ago

    The central limit theorem makes linear models very robust to violations of normality. That means your inferences will probably be sound (i.e., p-values and confidence intervals will be fairly accurate). But inference is just *one* thing I'm trying to do with stats; I also want to accurately model the data. If the distribution isn't normal, I shouldn't assume a normal distribution. I instead use generalized linear models (not non-parametric tests). Your colleague is wrong about power: parametric tests are only more powerful if you meet the assumptions. But your colleague is right to use parametric models (though the parametric model may be a negative binomial regression rather than a typical regression).

  • @Cluless02
    1 month ago

    What advances have been made since Erich Fromm??

  • @idodlek
    1 month ago

    Hello Mr. Fife 😀 Does, for example, running a general linear model as a t-test versus a Mann-Whitney U test and comparing their results count as sensitivity analysis? Or would only transformations, bootstrapping and trimming count as sensitivity analysis?

  • @QuantPsych
    2 days ago

    Yes, that could count as a sensitivity analysis. I do wonder though if you might run into a situation where MW and t-tests agree, but modern robust methods would disagree.

  • @goktugmk
    1 month ago

    You're amazing and you explain very clearly. Please keep making videos.😊

  • @stephenomenal1245
    1 month ago

    This guy is so f*cking awesome! Very informative as well as great energy!

  • @fruithillfarm6113
    1 month ago

    Diagnostic criteria require an optimal cutoff. Those cutoffs are not arbitrary or determined by one dataset (the focus of researchers). Clinicians often conceptualize the data continuously (e.g., pre-diabetic, higher risk for cardiovascular disease, pre-clinical risk for stress-mediated chronic disease development), but patients want to know if they have a condition or not (category). Clinical scientists don't categorize everything because we only know how to use ANOVAs, but what a condescending standpoint. Eliminating categorical cutoffs eliminates diagnoses. I'm good with that, but really, as a patient, are you?

  • @galenseilis5971
    1 month ago

    Eventually, mutually exclusive choices about whether or how to treat have to be made, which induces some amount of discreteness.

  • @QuantPsych
    2 days ago

    Did I say that *all* clinical scientists categorize things for an ANOVA? I don't believe I said that. Some do. I have known many to do it. It's not condescending to accurately state that some people categorize so they can use their ANOVAs. I think you're missing my point. I recognize (and say as much in the video) that sometimes, at the end of the day, you need to make a decision (e.g., a diagnosis). I do not object to discretizing data at that point (provided we keep in mind the data are continuous). Rather, my objection is discretizing before doing analyses (and analyzing categorical versions of our continuous variables).

  • @DistortedV12
    1 month ago

    If all you know is ANOVA, what would you do instead?

  • @galenseilis5971
    1 month ago

    What alternatives are appropriate to ANOVA depends on the analysis problem; there isn't a one-size-fits-all approach. Saying that something isn't ANOVA is like saying something isn't a banana; it doesn't narrow things down very much. Start with the problem you want to solve and search for or develop the best method you can for it.

  • @QuantPsych
    2 days ago

    Hire a statistician :)

  • @dragcot9677
    1 month ago

    As an ecologist in progress I can say, in ecology EVERYONE is using GLM all the time, even when they could be using other simpler methods, so here I am trying to actually understand them ahjhahaha

  • @QuantPsych
    2 days ago

    Ha! Sounds like you're better off in ecology than here in psych.

  • @sprachenwelt
    1 month ago

    Or you could just drop it all and go fishing!

  • @QuantPsych
    2 days ago

    I'm always in favor of fishing.

  • @qwerty11111122
    1 month ago

    Rowan University! I was in the first class of freshmen to go all 4 years majoring in bioinformatics!! Edit: negative binomial mentioned at 15:15

  • @QuantPsych
    1 month ago

    A fellow prof!

  • @nl7247
    1 month ago

    Please also discuss the problems when categorical data are analysed as continuous data. Thank you for your videos.❤

  • @QuantPsych
    1 month ago

    What problem? It's very common to do that. For example, male/female becomes 1/0 and we can use regression to do a t-test. Unless you mean something else?

  • @nl7247
    1 month ago

    @QuantPsych I mean using continuous numbers to analyze categories, e.g., we don't really consider that there could be a 0.73 in the gender range when we use 1 or 0 (or 2) to represent only two genders (not going to get into the recent gender classification discussion here). Or something that should only take integer values, so that making it continuous does not make sense in the real world, although we often say or hear that people have an average of 0.83 cars... Thank you for your thoughts and reply.

  • @galenseilis5971
    1 month ago

    @nl7247 One of the ways that models can be less realistic is to ignore the set of possible outcomes. If I have a count variable, e.g. Poisson, the expected value is not in general an observable outcome. That's okay if you are truly interested in the expected value. If you are not interested in the expected value then you should use something else like a distribution over the observables.
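
The 1/0 coding mentioned earlier in this thread can be illustrated with a small sketch. It is written in plain Python rather than R, the scores are made up for the example, and `simple_ols` is a hypothetical helper name. It shows that regressing an outcome on a 0/1 dummy gives an intercept equal to the mean of the group coded 0 and a slope equal to the difference between the two group means, which is the quantity an independent-samples t-test evaluates.

```python
from statistics import mean


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical scores for two groups, with group membership coded 0/1.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1]
score = [4.0, 5.0, 6.0, 5.0, 7.0, 8.0, 6.0, 9.0, 7.0]

intercept, slope = simple_ols(group, score)
mean0 = mean([s for g, s in zip(group, score) if g == 0])
mean1 = mean([s for g, s in zip(group, score) if g == 1])

print("intercept:", intercept, "| mean of group 0:", mean0)
print("slope:    ", slope, "| difference in means:", mean1 - mean0)
```

The model is only ever evaluated at 0 and 1, so a value such as 0.73 never has to be interpreted as a group; the fitted line simply connects the two group means.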

  • @tulipped
    1 month ago

    Myanmese (or Burman, depending on who you ask).

  • @QuantPsych
    1 month ago

    Excellent! I was hoping I'd get someone who knows :)

  • @Break_down1
    1 month ago

    1:04..or maybe we measure people who share the same gender. Why can’t I see a clear reason that “gender” is not a common candidate for nesting variable (ie people usually just control for it), but classroom always is?

  • @QuantPsych
    1 month ago

    With gender we generally exhaust the categories we're interested in (e.g., male, female, nonbinary). With classrooms we do not because we can't possibly sample all classrooms out there.

  • @hamidjess
    1 month ago

    This is a Nobel Prize in Languages right here.

  • @freddytackos
    1 month ago

    I am not in any way involved with doing statistics. I just love hearing a man be real about things.

  • @brazilfootball
    1 month ago

    Thank you for this video!! Can you go into the differences between linear regression vs. the “decision tree” of tests in more detail? Is it a matter of pros and cons of methods or just old vs new techniques? One obvious thing that comes to my mind is one can’t account for repeated sampling with a t-test or ANOVA, right?

  • @QuantPsych
    1 month ago

    I think this video will address that: kzread.info/dash/bejne/fauKzsGEj7eyqNI.html

  • @brazilfootball
    1 month ago

    @QuantPsych Doesn't get much better than that! Thank you! 😅

  • @qwerty11111122
    1 month ago

    10:00 consider the bumblebee

  • @QuantPsych
    1 month ago

    TIL about bumblebee languages :) Fascinating stuff!

  • @paulyoung3897
    1 month ago

    This was great

  • @pianofortissima4410
    1 month ago

    Why does he shout the whole time? 😮

  • @swinginkeke
    1 month ago

    Totally agree in theory, but docs love ORs and the Titanic turns slowly. How can I better communicate interpretability of betas if I keep the outcome continuous? “For each year older the kiddo is, we see delay to initial imaging increase by 1.6 days.” The blank stares haunt my dreams.

  • @QuantPsych
    1 month ago

    True. Probably better to show them a plot.

  • @galenseilis5971
    1 month ago

    Are you equating the random variables with the (conditional) expected values of the variables? They are not the same in aspects that are important for planning.

  • @1997aaditya
    1 month ago

    Why don't you use poly(var_name, n) instead, for orthogonal polynomials?

  • @QuantPsych
    1 month ago

    Because I can never remember how to do that.
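
For readers wondering what orthogonal polynomials buy you, here is a rough sketch of the idea behind R's `poly(var_name, n)`, written in plain Python. It is an illustration only: it uses Gram-Schmidt orthogonalization, which is not claimed to be R's exact numerical procedure, so signs and scaling may differ from what `poly()` returns, and the names `dot` and `orthogonal_poly` are invented for the example.

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_poly(x, degree):
    """Return `degree` polynomial columns in x that are orthonormal and
    orthogonal to the constant (intercept) column, via Gram-Schmidt."""
    basis = [[1.0] * len(x)]  # start from the intercept column
    for d in range(1, degree + 1):
        col = [float(v) ** d for v in x]
        for b in basis:  # remove the part explained by lower-order columns
            coef = dot(col, b) / dot(b, b)
            col = [c - coef * bv for c, bv in zip(col, b)]
        norm = sqrt(dot(col, col))
        basis.append([c / norm for c in col])
    return basis[1:]  # drop the intercept column


x = list(range(1, 11))
linear, quadratic = orthogonal_poly(x, 2)

# Raw x and x**2 overlap heavily; the orthogonal versions do not.
print("raw x . x^2:         ", dot(x, [v ** 2 for v in x]))
print("orthogonal lin . quad:", round(dot(linear, quadratic), 12))
```

Because the columns are uncorrelated with each other, adding a higher-order term does not change the estimates for the lower-order terms, which is the usual reason for preferring them over raw powers such as x and x^2.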

  • @trini-rt6xn
    1 month ago

    I'm not a Statistician or a Biostatistician, and I'm not even good at Math, but your explanation was so crystal clear even I can understand it. Sweet! And I've had Senior Level Management folk - VPs, SVPs - from major Big Pharma companies ask me to keep hacking away at data that is as plain as daylight, like the Continuous Variable Distribution you showed in this video, and I keep asking myself: "Am I so stupid? Am I missing something obvious?" After all, the data is being summarized and showing whatever it's showing, but somehow the big folks want it to show something else. And I'm always like "what else do you want it to show? It is what it is!" Of course, I swallow my pride and hide my impatience because maybe, just maybe, I'm really stupid. But after months of slicing and dicing data into invisible chunks, it always comes back to where I started. Scary! Thanks again for making advanced topics palatable for myself and others like me. It gives us hope.

  • @QuantPsych
    1 month ago

    Thanks!

  • @royals2013
    1 month ago

    “But previous literature did” hm ok yeah let’s shy away from that excuse

  • @QuantPsych
    1 month ago

    Seriously!

  • @galenseilis5971
    1 month ago

    What excuse? Fife cited a paper. What's wrong with that? If you read the paper and find problems with it that's one thing, but citing a source for a claim isn't an excuse as I understand it. Please elaborate.

  • @royals2013
    1 month ago

    @galenseilis5971 The PI only wants to do something because previous literature did it. Statistics evolves, better practices emerge. Bad statistics are replicated far too often by people assuming the original methods are appropriate.

  • @galenseilis5971
    1 month ago

    @royals2013 I understand the context of your comment better now. Thank you. Yeah, that's a really good point. A lot of literature has false claims in it, and that is something to have some vigilance about. Assuming uncritically that previous literature is *de facto* correct is unwise. It sounds like Fife has read the paper, albeit a substantial amount of time ago. If we want to further evaluate the paper, that is up to us. In this context I think Fife is just making a claim with a citation. You're right that we should not take the conclusions of the paper at face value, but I don't think it is excuse-making to cite previous work as evidence for a claim.

  • @royals2013
    1 month ago

    @galenseilis5971 Ah, I wasn't meaning it as a quote from QuantPsych btw lol, just a quote that I hear a lot from PIs. Totally agree with everything said in the vid

  • @antoniobarros3415
    1 month ago

    As always, the vlog is excellent. It brings to mind a quote from Frank Harrell on categorisation: "Employ it when the intention is to mislead the reader" ;-)

  • @galenseilis5971
    1 month ago

    I've enjoyed perusing Harrell's biostatistics book.

  • @McDreamyn_mdphd
    1 month ago

    I've found that many people create categorical variables to use as predictors in logistic regression models so that the value on the logit scale can be easily interpreted as an odds ratio. What they don't realize is that the variable can be recoded to keep its continuous distribution but transformed so that a value of 0 corresponds to, say, the 25th percentile and a value of 1 corresponds to the 75th percentile. In effect you still interpret the coefficient as if the variable were binary, but at least you do not lose statistical power by capping the natural variability of an informative covariate.

  • @swinginkeke
    1 month ago

    Can you walk me through a practical example of this? I’m a biostatistician at a hospital and our docs ALWAYS want odds, even at the expense of losing data/power/etc. I like this idea, but haven’t come across it before.

  • @galenseilis5971
    1 month ago

    Hmm, I cannot say that I find this use case compelling. The canonical logistic regression is already clear enough to interpret as-is without further tinkering. Not that I think other categorizing strategies are appealing here either.

  • @antoniobarros3415
    1 month ago

    @swinginkeke They should probably take responsibility for the decision. They should have a look at DCA (decision curve analysis).

  • @McDreamyn_mdphd
    1 month ago

    @swinginkeke Recode X as X*_i = (X_i − X_25th percentile) / (X_75th percentile − X_25th percentile), where 0 = 25th percentile and 1 = 75th percentile for person i on variable X.

  • @McDreamyn_mdphd
    1 month ago

    @galenseilis5971 Well, I agree, but for publication purposes in medical journals, there is often less interest in understanding a single unit-by-unit increase in the log odds of, say, some diagnostic test, and instead there is a desire to transform the interpretation into something that is clinically meaningful. If I have a patient and I want to understand the dose-effect of a statin on an inflammatory marker (troponin), the transformation I outlined above is a very straightforward approach for translating the odds ratio into a very easy and understandable metric, especially for clinicians who may not necessarily be adept at reading medical literature. Over my career, I have learned that success is less about what I know and more about what I can do to demystify the numbers and make them clinically relevant to my peers in the publication process.
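
The percentile recoding described in this thread can be sketched as follows. This is a minimal Python illustration under stated assumptions: the marker values and the raw coefficient `beta_raw` are made up, `iqr_rescale` is a hypothetical helper name, and the "inclusive" quantile method is just one of several percentile definitions. Since X = Q25 + (Q75 − Q25) · X*, a logistic coefficient on the recoded variable equals the raw coefficient multiplied by (Q75 − Q25), so its exponent is the odds ratio comparing the 75th percentile with the 25th.

```python
import math
from statistics import quantiles


def iqr_rescale(values):
    """Recode values so the 25th percentile maps to 0 and the 75th to 1."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    scaled = [(v - q25) / (q75 - q25) for v in values]
    return scaled, q25, q75


# Hypothetical continuous predictor.
marker = [1, 2, 3, 4, 5, 6, 7, 8, 9]
scaled, q25, q75 = iqr_rescale(marker)

# Hypothetical log-odds change per one raw unit of the predictor.
beta_raw = 0.05

# The same model on the recoded predictor has this coefficient, and its
# exponent is the odds ratio for the 75th versus the 25th percentile.
beta_scaled = beta_raw * (q75 - q25)
or_iqr = math.exp(beta_scaled)

print("25th and 75th percentiles:", q25, q75)
print("odds ratio, 75th vs 25th percentile:", round(or_iqr, 3))
```

The predictor stays continuous, so no information is discarded; only the unit in which the odds ratio is reported changes.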