Welcome to my VLOG! My name is Yury Zablotski & I love to use R for Data Science = "yuzaR Data Science" ;)
This channel is dedicated to data analytics, data science, statistics, ML and AI! Join me as I dive into the world of data analysis & coding. Whether you're interested in business analytics, data mining, data visualization, or pursuing an online degree in data science, I've got you covered. If you are curious about Google Data Studio, data centers, or certified data analyst & data scientist programs, you'll find the necessary knowledge right here. You'll greatly increase your odds of earning an online master's degree in data science & data analytics. Boost your knowledge & skills with my engaging content. Subscribe to stay up to date with the latest & most useful data science programming tools. Let's embark on this data-driven journey together!
If you wish to support me, please join the channel 🙏 kzread.info/dron/cGXGFClRdnrwXsPHi7P4ZA.htmljoin
Comments
Hey there! I'm wondering where you get the information about the conventional thresholds for interpretation (like for p-values, Bayes, etc). There are so many different versions from different authors out there, which one should we trust? I'm really struggling to make up my mind! I already know about the effectsize package, but should we trust their frames of reference? Thanks in advance sir.
The code link in the description does not seem to be working anymore, do you have an updated one by any chance?
Absolutely amazing
Thanks so much 🙏 hope you enjoy other topics too!
Hello Sir. Is it possible to display only one category of the dichotomous variable provided with the "by" parameter? For example, I only want to display the percentage of people who said Yes (sport - Oui).

library(gtsummary)
library(questionr)
data("hdv2003")

hdv2003 %>%
  tbl_summary(
    include = c("sexe", "relig"),
    by = "sport",
    percent = "row",
    statistic = all_categorical() ~ "{p}%"
  )

The objective is that I would subsequently like to combine several tables, where the columns will hold the percentages of several variables.
Yes, it's possible. Please check the function's arguments yourself; I am away from my computer for a week. And you can easily combine several tables via tbl_merge.
@@yuzaR-Data-Science Thank you
I found the modify_column_hide() function, which can do it.
Hi professor. Is there any way we can dynamically adjust the line in a quantile regression plot so that it also affects the other plots?
Which line do you mean? And which other plots?
Amazing
Thanks 🙏
Amazing video. Thank you.
Glad you liked it! Thanks for watching!
Great video
Glad you enjoyed it! Thanks for watching!
Excellent video🙌 A video on Bayesian analysis would be great!
Noted! Thanks for your positive feedback! 🙏
Thank you for this video. I just have two questions. 1. You showed the plot of the variable importance (plot(model_name, type = "s")). Is there a way to extract the names of the variables and/or interactions using a threshold (e.g. 75%)? I need them as a list, e.g. "variab_interactions_plus_75 <- ????". 2. I used the example data "mtcars". Is an interaction "hp:cyl" equal to an interaction "cyl:hp"?
Hi Hendrik, not that I know of. Once, when I needed it for my paper, I created the data frame manually and only put in the things I wanted. But I think it's not that much more work after the algorithm has done the heavy lifting ;) cheers
I also subscribed. Your videos are always very informative and helpful. Thank you.
Thanks for the sub! And for watching! I am happy you like my content!
Wow! Thank you! So much important information. I have to watch this video several times. But one question: in which order would you use the packages "janitor" and "dlookr"? It would be interesting to teach people how to load and handle a "dirty" Excel table, fix some Excel problems (e.g. dates stored as numbers or entries like "no data" in numeric columns etc.) and, once those problems are fixed, use "dlookr" to diagnose, explore and repair the data.
Thanks a lot for your nice reply, Hendrik! I would use janitor first and dlookr on top. I guess you have already seen the janitor video on my channel. If not, feel free ;) I also have one video on tidy data, where I show a dirty table, but there is not much R programming in it. Thanks for watching!
Wow! One of the best videos I have ever seen. Very informative.
Wow, thanks for such generous feedback! If you know some folks who would also benefit from it, feel free to share it! I wish I had had something like this video when I started to learn R. I hope the other videos are helpful too! Thanks again! Cheers!
Thanks for sharing your knowledge. How can I add the Yates correction to the plot?
Unfortunately, it's either not possible, or I don't know how. But thanks for the good feedback!
This is an excellent video!! I was thinking: a nonparametric alternative for linear regression could be LOESS regression, and bootstrapping could be done without problem. But, because LOESS is nonparametric, could the means properly be used instead of the medians, or should the medians be used in this case as well?
While resampling allows for a better use of means, I am a big fan of medians: if the distribution of anything after bootstrapping does not become normal, as in the case of p-values, I would trust the median but not the mean. So, I would use the median as much as I can.
@@yuzaR-Data-Science Ok, I really thank you for answer!
you are very welcome!
@@yuzaR-Data-Science I had another question. Although bootstrapping is not exactly an option for handling outliers, could it be the case that the more resamples are used, the more robust the model is to outliers?
Yes, because then you resample the most frequent cases more often, so their density in the bootstrap distribution is higher, while the outliers ... hmm, we don't get rid of them, but they are resampled very rarely. Hope that helps. Cheers
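A quick base-R sketch of this intuition (my own simulated numbers, not from the video; the value 50 plays the extreme outlier):

```r
set.seed(1)
x <- c(rexp(49, rate = 1), 50)  # skewed sample plus one extreme outlier

# bootstrap: resample with replacement and recompute the statistic
boot_means   <- replicate(2000, mean(sample(x, replace = TRUE)))
boot_medians <- replicate(2000, median(sample(x, replace = TRUE)))

# the resampled medians barely notice the outlier, while the means
# are dragged towards it in almost every resample
median(boot_medians)
median(boot_means)
```

The outlier actually lands in roughly 63% of the resamples (sometimes more than once), but the median ignores it even then, which is why the bootstrap distribution of medians stays tight while the distribution of means is shifted and stretched.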
man u just won another subscriber. This is one of the best videos about dplyr that I have ever seen :). Congrats dude!
Glad you enjoyed! Hope other videos are helpful too! Thanks for a nice feedback!
Is it possible that, along with the p-value, we also get the chi-square statistic, df, etc. in a gtsummary table? If yes, what's the code?
Not straightforward, unfortunately. You could create a data frame, use flextable, and then export it to Word or PNG.
I'm doing better analysis of my PhD data because of your videos, thanks a lot!
I am glad my content is helpful! Thanks for watching and commenting! That’s the best support 🙏
why not also use the chi-square goodness-of-fit test?
You actually can. Here is what I think about them:

Close relatives: connect the dots
Ironically, there is nothing exact about the Exact Binomial test. It is called "exact" because it calculates the p-value directly from the probability, not from any kind of statistic, like the chi-square. The Chi-Square Goodness-of-Fit test, however, only approximates the p-value, which is why the exact binomial test is recommended.

Proportion test
If you have lots of data (N > 30) or more than two outcomes, use a proportion test, which is highly similar to the exact binomial test. In fact, the Exact Binomial test is exactly the same as the proportion test with Yates continuity correction, which prop.test uses by default. Below, I explicitly wrote down such a correction:

prop.test(x = 7, n = 10, p = 0.5, correct = T)

	1-sample proportions test with continuity correction
data:  7 out of 10, null probability 0.5
X-squared = 0.9, df = 1, p-value = 0.3428
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.3536707 0.9190522
sample estimates:
  p
0.7

One sample Chi-Square test
However, as you can see above, the proportion test calculates the chi-squared statistic, so it is actually calling a chi-squared test.
And interestingly, a proportion test without Yates continuity correction gives identical results to a Goodness-of-Fit One-sample Chi-Square test:

prop.test(x = 7, n = 10, p = 0.5, correct = F)

	1-sample proportions test without continuity correction
data:  7 out of 10, null probability 0.5
X-squared = 1.6, df = 1, p-value = 0.2059
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.3967781 0.8922087
sample estimates:
  p
0.7

chisq.test(c(7, 3))  # total is 10 and p = 0.5 for both numbers by default

	Chi-squared test for given probabilities
data:  c(7, 3)
X-squared = 1.6, df = 1, p-value = 0.2059

(Simplest) Logistic regression
If you are not overwhelmed yet, you can go further and conduct the simplest logistic regression possible (no need to understand it now! I'll cover it in other videos). Below you'll find the log-odds output of the logistic regression (0.847), which can be converted to a probability with the plogis function. You'll see that the probability is exactly 0.7, as in the tests above, and the p-value is similar. By the way, the p-value represents the probability of observing a result as extreme or more extreme than the one you got, assuming the null hypothesis is true.

m <- glm(c(rep(1, 7), rep(0, 3)) ~ 1, family = binomial())
broom::tidy(m)

# A tibble: 1 × 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)    0.847     0.690      1.23   0.220

plogis(0.8472979)
[1] 0.7
@@yuzaR-Data-Science Thank you for your attention and the complete answer. This was very insightful. Also thanks for your video about regressions vs statistical tests, which I had wondered about for months or more (since I don't have a strong background in statistics; I'm a biologist who (tries to ^^) enjoy statistics). So, the proportion test calculates the p-value from the chi-square value, as in the chi-square test, unlike the binomial test? At least I can see a clear advantage, given that the proportion test gives confidence intervals. I would have expected the Exact Binomial test to be similar to the binomial GLM, but in fact it's similar to the chi-square p-value. Is this due to the Yates continuity correction? I found some sources saying it can be a bit conservative. For some reason, when I tried the chi-square test it gave me the same result with continuity correction (which seems to be the default, though it doesn't say which) as without it, equal to the proportion test without correction. When I used the p-value by Monte Carlo simulation, it gave something closer to the Yates continuity correction:

	Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates)
data:  c(7, 3)
X-squared = 1.6, df = NA, p-value = 0.3543

What is best? Using continuity corrections or alpha adjustments for multiple outcomes, or both?
Monte Carlo simulation is the best I think and adjustment for multiple comparisons is always a must.
Great video! Is it possible to only show the pairwise comparisons between one group e.g. 'Original' and Synthetic1, Synthetic2, Synthetic3 ... etc? It also does pairwise comparisons between those synthetic groups which I don't want to show and also don't want to conduct tests for except with the original one. Having a separate one by one figure takes up a lot of space so wondering if this is possible?
I think it's difficult with this function, although not impossible. But it's much more practical to model it, for example with quantile regression, and use the tab_model function from the sjPlot package 📦 I have videos on both if you need some assistance to get started.
Very nice video! It helped me a lot.🤓 I have a question: when I try to build a function for mixed-effect models and continue to the next step of glmulti, it warns me: "Error during wrapup: unused argument (REML = F) Error: no more error handlers available (recursive errors?); invoking 'abort' restart". Could you please tell me how this might happen and how to solve this problem? That would be very appreciated!😳
Hi, thanks for the feedback! First, have you installed all the important packages (lme4, lmerTest ... etc.)? And secondly, have you tried this?

glmer.glmulti <- function(formula, data, random = "", ...) {
  glmer(paste(deparse(formula), random), data = data, REML = F, ...)
}

mixed_model <- glmulti(
  y = response ~ predictor_1 + predictor_2 + predictor_3,
  random = "+(1|random_effect)",
  crit = aicc,
  data = data,
  family = binomial,
  method = "h",
  fitfunc = glmer.glmulti,
  marginality = F,
  level = 1
)
@@yuzaR-Data-Science Thanks for your reply! Yes, I followed these steps, and still got the error warning.
Then I guess there are too many predictors. If I try it with ca. 20, it collapses and I need to restart RStudio. So, reduce the number of predictors to the most sensible ones and then run glmulti.
@@yuzaR-Data-Science I'll try this. Thank you soooo much for your kind suggestion!
Sure, let me know whether it worked. By the way, if you use "d" instead of "h", you can see how many models you are going to fit, and if that goes over six figures, I would reduce the number of predictors first and then use glmulti.
Thank you sir
So nice of you! Thank you for watching and commenting!
I think you're the best at explaining statistics in such a smooth way. I'm wondering, is your blog out of service?
Thanks, man! Greatly appreciate your positive feedback. My blog was shut down since they wanted me to pay, and I refuse to pay because I actually do something useful for the world for free. So, I hope for your understanding. The good news is that youtube is still free and will stay free, so you can just stop the video at any time and type the code for free. However, if you want to see the whole code from any of the videos, you can join the channel (kzread.info/dron/cGXGFClRdnrwXsPHi7P4ZA.htmljoin), because I provide the whole code for members.
Great work. I look forward to your videos on Survival Analysis in R
Thank you very much! Will definitely do. Just made some videos on linear regression in R. Logistic will follow, and then the rest of the models, including Survival and ML one day. 3 years ago I made two videos on survival already, but they are old, theoretical and low quality. I'll redo them in a more concise and R-focused way. Thanks for the nice feedback and for watching!
Great work. Please do videos on Survival Analysis in R as well😊
Thank you very much! Will definitely do. Just made some videos on linear regression in R. Logistic will follow, and then the rest of the models, including Survival and ML one day. 3 years ago I made two videos on survival already, but they are old, theoretical and low quality. I'll redo them in a more concise and R-focused way. Thanks for the nice feedback and for watching!
brilliant stuff. Very easy to follow. Can you create a dedicated playlist regarding machine learning models ?
Great suggestion! I'm on it; just made some videos on linear regression in R. Logistic will follow, and then the rest of ML one day. Thanks for the nice feedback and for watching!
@@yuzaR-Data-Science you have no idea how easy these are to follow. thanks again. keep it up.
Thanks! I'll do my best to upload more, but this summer is pretty busy.
Excellent content quality! You, sir, always find a way to keep us interestingly attached to the video. You are like our statistics and data science dealer. Thanks for your labor. We'll keep growing.
Wow, thank you! Your comments, my friend, are the most supportive and motivating! So, after reading them I just wanna jump straight into creating a new video on one of the 1000 ideas I have :) For instance, I am finally starting the logistic regression series. One of my favorite topics ;) Genuinely thankful for your continuous support! Cheers!
Thanks for this, but is this not the classic case where p-value adjustments for multiple testing need to be applied? Why wasn't it applied? Also, in a situation where we have only 2 possible outcomes, when should the binomial test be used versus chi-square? Would this be based on preference, or is one objectively more appropriate than the other?
Man, you ask good questions! ;) First, yes, you are absolutely right, p-value adjustment would be the right thing to do, but at the time I did the paper I used it irregularly. For the video I also try to keep the focus, so it's concise.
And since I try to keep the videos short, some info does not end up in them but was considered while I was writing the script. For instance, check these parts below; I hope you'll find them useful:

Intuition
Ironically, there is nothing exact about the Exact Binomial test. It is called "exact" because it calculates the p-value directly from the probability, not from any kind of statistic, like the chi-square. The Chi-Square Goodness-of-Fit test, however, only approximates the p-value, which is why the exact binomial test is recommended.

Proportion test
If you have lots of data (N > 30) or more than two outcomes, use a proportion test, which is highly similar to the exact binomial test. In fact, the Exact Binomial test is exactly the same as the proportion test with Yates continuity correction, which prop.test uses by default. Below, I explicitly wrote down such a correction:

prop.test(x = 7, n = 10, p = 0.5, correct = T)

One sample Chi-Square test
However, as you can see above, the proportion test calculates the chi-squared statistic, so it is actually calling a chi-squared test. And interestingly, a proportion test without Yates continuity correction gives identical results to a Goodness-of-Fit One-sample Chi-Square test:

prop.test(x = 7, n = 10, p = 0.5, correct = F)
chisq.test(c(7, 3))  # total is 10 and p = 0.5 for both numbers by default

(Simplest) Logistic regression
If you are not overwhelmed yet, you can go further and conduct the simplest logistic regression possible (no need to understand it now! I'll cover it in other videos). Below you'll find the log-odds output of the logistic regression (0.847), which can be converted to a probability with the plogis function. You'll see that the probability is exactly 0.7, as in the tests above, and the p-value is similar. By the way, the p-value represents the probability of observing a result as extreme or more extreme than the one you got, assuming the null hypothesis is true.

m <- glm(c(rep(1, 7), rep(0, 3)) ~ 1, family = binomial())
broom::tidy(m)
plogis(0.8472979)
❤
🙏
Nice lecture on resampling. Please make a video on simulation studies.
Thanks 🙏 Sunil, I'll do it. But it'll take some time, because I first want to cover frequentist stats, then come to simulations.
Nice video and excellent explanation 👍
Thanks 🙏 Sunil! Glad you enjoyed it!
Nice video as usual, keep up the good work
Thanks, will do! Appreciate your continuous support!!! Commenting, watching and liking is really the best support! So, thanks again!
Your videos are intuitive. Can you start a playlist on machine learning in R?
thanks man! yes, that's the plan, but first I would cover stat models, like logistic regression etc. after that I would go full ML and AI ;)
Very insightful. Thank you very much. Please can I get your email? I would like to ask you about some stuff I find confusing, since you are the expert :)
Hi Richmond, thanks a lot for your nice feedback! I do not share my email, but that's no problem, because you can ask anything here in the comments section of the videos, and I'll do my best to answer as quickly and as well as I can. The channel members get quicker and more insightful responses though, and the higher their level, the more time I can invest into answering questions. Thus, if you wish, join my channel: kzread.info/dron/cGXGFClRdnrwXsPHi7P4ZA.htmljoin
@@yuzaR-Data-Science Thank you very much for such informative videos. I spent several years in class and didn't understand all these concepts, but watching this video has made things easier for my comprehension. I have a few questions I would like to ask: When performing a statistical test, we use a parametric test if the data or variable in question is normally distributed, and a non-parametric alternative if the data or variable is not normally distributed. My question is: when does the central limit theorem come into play here? Also, a colleague of mine told me to always use parametric tests even if the data is not normally distributed. His explanation was that parametric tests are more powerful than non-parametric tests. So, should I straightforwardly use the non-parametric alternative when I observe that my data is not normally distributed, or should I take the CLT into consideration and use the parametric test?
I am not sure the CLT helps too much, but using a parametric test for highly skewed data is absolute nonsense. The power difference is minimal and overrated. I also have colleagues who use non-parametric tests by default. Another extreme of nonsensicality and laziness. Just for the sake of the learning effect, please take skewed data and calculate the mean and the median to see how much difference you'll get. And if your colleague really cared about power, he/she would use multivariable models, not univariable tests. And this is what I would recommend to you: the tests are fine in the beginning, but try to learn models and their assumptions when you want to go to the next level. Cheers and thanks again for the nice feedback!
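If you want to try that mean-vs-median exercise right away, here is a one-minute base-R version (simulated skewed data; the numbers are my own example):

```r
set.seed(42)
skewed <- rexp(1000, rate = 1)^2  # squaring makes the data strongly right-skewed

mean(skewed)    # pulled far up by the long right tail
median(skewed)  # stays with the bulk of the observations
```

On data like this the mean typically lands several times higher than the median, which is exactly why summarising heavily skewed data with a mean (or a test built around the mean) is misleading.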
@@yuzaR-Data-Science I'm really grateful that you find time in your busy schedule to reply to me. So please, when should I use the CLT, or shouldn't I use it at all? Thank you
But what exactly do you mean by CLT? Bayesian methods?
🙏👍💪😎
thanks!
best R content out-there!
Glad you still enjoy it ;)
Thanks but why is it called ‘exact’?
Amazing question! Thanks :) I was actually thinking of putting it into the video, but deleted it as a "boring" and "less useful" part of the script :) Here is what I was going to say: Ironically, there is nothing exact about the Exact Binomial test. It is called "exact" because it calculates the p-value directly from the probability, not from any kind of statistic, like the chi-square. The Chi-Square Goodness-of-Fit test, however, only approximates the p-value, which is why the exact binomial test is recommended. Hope that answers the question :)
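To make "directly from the probability" concrete, here is a tiny base-R sketch (7 successes out of 10 is just a made-up example):

```r
# exact binomial p-value by hand: add up the probabilities of all
# outcomes that are at least as unlikely as the observed 7 out of 10
probs <- dbinom(0:10, size = 10, prob = 0.5)
p_manual <- sum(probs[probs <= dbinom(7, size = 10, prob = 0.5)])

p_manual                            # 0.34375
binom.test(7, 10, p = 0.5)$p.value  # the same exact p-value, no test statistic involved
```

No chi-square statistic, no approximation: the p-value is just a sum of binomial probabilities, which is why the test is called "exact".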
Sir please make a video lecture on Simulation study.
Thanks 🙏 Sunil, I'll do it. But it'll take some time, because I first want to cover frequentist stats, then come to simulations.
@@yuzaR-Data-Science Sure Sir I will wait the video. Thank you so much Sir 🙏
You are very welcome! :)
🙏👍💪😎
thanks a lot!
🙏👍💪😎
thanks!
great practical content! thanks
Glad it was helpful!
top content, very concise and to the point! thanks!
Wow, thanks for such generous feedback!
I can't stop watching your videos ;) please produce more of them, It's really fun to learn from your content.
learning addiction is the best addiction ever! ;)
I have a question please: why did we put "temp" as a predictor to impute missing values in the Ozone variable?
simply as an example of a predictor
@@yuzaR-Data-Science I'm sorry, I can't get it!
sorry, what exactly can't you get?
@@yuzaR-Data-Science Do we have to put a predictor to impute missing values in a variable?
I love this channel. As always high quality content. I am happy I have also found you on LinkedIn.
Glad you enjoy it! And thanks sooo much for your positive feedback! It's very encouraging! 🙏🙏🙏
Good quality content as always, thanks! Keep up the great work!
thanks for your continuous support! :)
Excellent demonstration in R.
thanks a lot Jeff, glad you enjoyed it! :)
I love your channel, I think it's one of the best channels on KZread right now
Wow, thanks! That's the best feedback ever! Greatly appreciate! Hope my future content will be useful too!