8. Sampling and Standard Error

MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
View the complete course: ocw.mit.edu/6-0002F16
Instructor: John Guttag
Prof. Guttag discusses sampling and how to approach and analyze real data.
License: Creative Commons BY-NC-SA
More information at ocw.mit.edu/terms
More courses at ocw.mit.edu

Comments: 55

  • @leixun
    @leixun · 3 years ago

    *My takeaways:*
    1. Probability sampling: simple random sampling 0:58
    2. A data analysis example 6:11
       - How to tighten the standard deviation: take larger samples, not more samples 14:40
    3. How to visualise and understand the data: error bars 17:30
       - When confidence intervals don't overlap, we can conclude that the means are statistically significantly different 18:30
    4. Standard error 25:04
       - Standard error vs standard deviation 29:33
       - Problem with standard error: we don't have the population standard deviation 30:46, but we can use the sample standard deviation as a close estimate
       - Three different distributions and their skews 36:15: when we use the sample standard deviation to estimate the population standard deviation, larger samples are needed for more skewed distributions
    5. Good results are always consistent with the confidence intervals 43:35

  • @xingnanzhou8628

    @xingnanzhou8628

    3 years ago

    Thanks Lei. You did a great job.

  • @leixun

    @leixun

    3 years ago

    @@xingnanzhou8628 You are welcome

  • @aidenigelson9826

    @aidenigelson9826

    2 years ago

    @@leixun Quick question, when do we NEED to have more than one sample? And when can we just use one?

  • @coolwinder

    @coolwinder

    2 years ago

    14:00 - 95% Confidence Interval

  • @kavishkadilharawickramasin4726

    @kavishkadilharawickramasin4726

    2 years ago

    Thank you ☺️❤️
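
The "larger samples, not more samples" and standard error points in the takeaways at the top of this thread can be checked numerically. What follows is a minimal sketch, not the lecture's code: the population (20,000 values from a normal distribution with mean 16 and standard deviation 9), the sample sizes, and the helper name `sample_means` are all assumptions made for illustration. It compares the observed spread of the sample means with the standard error formula, population standard deviation divided by the square root of the sample size.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, rng):
    """Draw num_samples simple random samples and return the mean of each."""
    return [statistics.fmean(rng.sample(population, sample_size))
            for _ in range(num_samples)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population, made up for this sketch.
population = [rng.gauss(16.0, 9.0) for _ in range(20_000)]
pop_sd = statistics.pstdev(population)

results = {}
for sample_size in (50, 200, 800):
    means = sample_means(population, sample_size, 500, rng)
    observed = statistics.pstdev(means)       # spread of the 500 sample means
    predicted = pop_sd / sample_size ** 0.5   # standard error formula
    results[sample_size] = (observed, predicted)
    print(f"n={sample_size}: observed {observed:.3f}, predicted {predicted:.3f}")
```

Raising the number of samples (500 here) only makes the estimate of the spread smoother. Raising the sample size is what shrinks the spread itself.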

  • @chasecolin22
    @chasecolin22 · 4 years ago

    Great video. I had always wondered about how the size of the population affects the size of the sample. Was surprised to see that it doesn't!

  • @NitinPasumarthy

    @NitinPasumarthy

    3 years ago

    Yeah! It doesn't, as long as the distribution stays the same.
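
The claim in this thread, that population size does not drive the required sample size, can be tried out with a small simulation. This is a hedged sketch under assumed values: two hypothetical uniform populations on [0, 100], one of 5,000 values and one of 500,000, both sampled with the same sample size of 100. The function name `spread_of_sample_means` is made up for this example.

```python
import random
import statistics


def spread_of_sample_means(pop_size, sample_size, num_samples, seed):
    """Standard deviation of sample means drawn from a uniform population."""
    rng = random.Random(seed)
    # Hypothetical population: pop_size values uniform on [0, 100].
    population = [rng.uniform(0, 100) for _ in range(pop_size)]
    means = [statistics.fmean(rng.sample(population, sample_size))
             for _ in range(num_samples)]
    return statistics.pstdev(means)


small = spread_of_sample_means(5_000, 100, 1_000, seed=1)
large = spread_of_sample_means(500_000, 100, 1_000, seed=1)
print(f"population   5,000: spread of sample means = {small:.3f}")
print(f"population 500,000: spread of sample means = {large:.3f}")
```

Both spreads should land near the same value even though one population is 100 times larger, because the sample size and the shape of the distribution are the same in both runs.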

  • @coolwinder
    @coolwinder · 2 years ago

    Standard deviation: the symmetric distance on both sides of the mean that accounts for ~68% of the values (for a normal distribution). 95% confidence interval: mean ± 1.96*sd (a symmetric interval around the mean that accounts for 95% of the values). Standard error (of the mean): the population standard deviation divided by the square root of the sample size; it can be estimated by using the sample standard deviation instead, and it is approximately equal to the standard deviation of the sample means. 37:00 - The bigger the extremes in the population, the larger the error between sample and whole population for small sample sizes. 40:00 - The size of the population doesn't matter, but the skew and the size of the step do.

  • @delcapslock100
    @delcapslock100 · 6 years ago

    It would be useful to carefully explain the difference between a sample (a random draw of size n from the population), individual sample elements (each member of the sample), and replications (the number of samples drawn). Otherwise it could be easy to confuse which one you're talking about.

  • @anacosta5447

    @anacosta5447

    3 years ago

    All 3 concepts have been explained in previous lectures, from this same playlist.

  • @surajregmi11
    @surajregmi11 · 6 months ago

    I would also recommend reading the recommended book and taking some time to digest the information there (like thinking and staring at the wall). It can be really confusing if we just watch the lecture. Especially this lecture requires reading from students' side too. It builds upon a few lectures before. The lectures before this particular one were easy to follow. Nonetheless, the material is very important and extremely interesting if you understand it well.

  • @nashsok

    @nashsok

    6 months ago

    +1 for reading the text - I've been reading the assigned readings after each lecture and the way it covers the same material in a slightly different way with different examples has really helped to set the knowledge in my mind.

  • @manuel56354
    @manuel56354 · 5 years ago

    This is pure gold.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w · 3 months ago

    this is such an amazing lecture.

  • @NitinPasumarthy
    @NitinPasumarthy · 3 years ago

    Thank you MIT for making these courses open! To verify that we chose the right sample size, we identify the fraction of times we break the empirical rule. But I'm not clear on why it is fine to use the estimated standard error (and not the population standard error) while computing this fraction. Say the population is skewed and we chose a small sample, won't the estimated standard error be more inaccurate? If that is so, how can it be used to verify the distance between population and sample means?
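
The question above can be explored empirically. The sketch below is an illustration under stated assumptions, not the lecture's code: it draws directly from an exponential distribution with rate 1 (a skewed distribution whose true mean is exactly 1.0), uses the standard error estimated from each sample alone, and counts how often the sample mean falls more than 1.96 estimated standard errors from the true mean. The function name `fraction_outside_95` and the sample sizes 10 and 200 are made up for this example.

```python
import random
import statistics


def fraction_outside_95(sample_size, num_trials, rng):
    """Fraction of trials whose sample mean misses the true mean by more
    than 1.96 standard errors, with the SE estimated from the sample."""
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    outside = 0
    for _ in range(num_trials):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        # Estimated standard error: sample sd over sqrt(sample size).
        est_se = statistics.stdev(sample) / sample_size ** 0.5
        if abs(statistics.fmean(sample) - true_mean) > 1.96 * est_se:
            outside += 1
    return outside / num_trials


rng = random.Random(2)  # fixed seed so the run is repeatable
frac_small = fraction_outside_95(10, 2_000, rng)
frac_large = fraction_outside_95(200, 2_000, rng)
print(f"sample size  10: {frac_small:.3f} of trials outside the interval")
print(f"sample size 200: {frac_large:.3f} of trials outside the interval")
```

If the estimated standard error were a poor stand-in, the fraction outside would sit well above 5%. That is what a small sample from a skewed distribution tends to show, while a larger sample should land much closer to 5%. Note that `num_trials` only controls how precisely the fraction is measured; it is `sample_size` that changes the fraction itself.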

  • @studywithjosh5109
    @studywithjosh5109 · 3 years ago

    I hope more people learn about this free lecture

  • @sicabicruz8204
    @sicabicruz8204 · 3 years ago

    Thanks a lot for publishing this video. Did someone else also replicate this code in R? Thumbs up!

  • @SKyrim190
    @SKyrim190 · 3 years ago

    I am a bit surprised he didn't comment on how the standard deviation of the sample has a bias as an estimator for the standard deviation of the population, and how you should divide by (n-1) instead of n (n being the sample size) when doing this estimation...

  • @sharan9993

    @sharan9993

    3 years ago

    I thought so too, because of degrees of freedom, but since we have libraries that do this, maybe he skipped it.

  • @aidenigelson9826

    @aidenigelson9826

    2 years ago

    @@sharan9993 Quick question, when do we NEED to have more than one sample? And when can we just use one?

  • @SKyrim190

    @SKyrim190

    2 years ago

    @@aidenigelson9826 I don't think I understand your question...you always need more than one sample to meaningfully calculate mean and deviation

  • @sharan9993

    @sharan9993

    2 years ago

    @@aidenigelson9826 Actually, we always take just one sample, but the idea is that if you keep taking infinitely many samples, the mean of these sample means is the true population mean. For experiments we always take one sample and calculate the p-values and thus the confidence interval to get an idea of how well that sample represents the true population.
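
The n versus n-1 point raised at the top of this thread can be seen in a short simulation. This is a minimal sketch under assumed values (samples of size 5 from a standard normal distribution, 5,000 trials), using only Python's `statistics` module: `pvariance` divides by n, and `variance` divides by n-1. It compares variances rather than standard deviations, because dividing by n-1 makes the variance estimate unbiased, while the standard deviation remains slightly biased even after the correction.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed so the run is repeatable
TRUE_VARIANCE = 1.0      # variance of the standard normal we draw from
SAMPLE_SIZE = 5          # small on purpose, so the bias is visible
NUM_TRIALS = 5_000

biased, unbiased = [], []
for _ in range(NUM_TRIALS):
    sample = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    biased.append(statistics.pvariance(sample))    # divides by n
    unbiased.append(statistics.variance(sample))   # divides by n - 1

mean_biased = statistics.fmean(biased)
mean_unbiased = statistics.fmean(unbiased)
print(f"true variance:               {TRUE_VARIANCE:.3f}")
print(f"average, dividing by n:      {mean_biased:.3f}")
print(f"average, dividing by n - 1:  {mean_unbiased:.3f}")
```

Dividing by n underestimates the variance by a factor of about (n-1)/n, which matters for a sample of 5 and is negligible for the sample sizes of several hundred discussed elsewhere on this page. In NumPy the same choice is made through the `ddof` argument of `numpy.std`, which defaults to 0 (divide by n).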

  • @aidenigelson9826
    @aidenigelson9826 · 2 years ago

    Quick question, when do we NEED to have more than one sample? And when can we just use one?

  • @ahmedennajari5392
    @ahmedennajari5392 · 1 year ago

    Good job sir very helpful

  • @raymondjiii
    @raymondjiii · 7 months ago

    I am wondering if the numTrials plays a part at the end in calculating the confidence interval. Sample size of 200 gave 95% but how does numTrials affect this?

  • @carinazh
    @carinazh · 6 years ago

    Thank you very much.

  • @tungdinh3664
    @tungdinh3664 · 2 years ago

    do we need stats and probability before this course? I'm a bit lost T_T

  • @dieterzucker72
    @dieterzucker72 · 3 years ago

    Well, when he is comparing different distributions vs population size... why does the uniform distribution show a difference of roughly 7.5% at sample size 25 (37:21) but about 25% on the next slide (39:12)?

  • @JCResDoc94
    @JCResDoc94 · 4 years ago

    *☼ **31:00** THE CATCH!*

  • @blackbuddhaa
    @blackbuddhaa · 2 years ago

    17:15 Increase the size of the sample rather than the number of samples. What does he mean?

  • @videofountain
    @videofountain · 7 years ago

    Thanks.

  • @jongcheulkim7284
    @jongcheulkim7284 · 2 years ago

    Thank you.

  • @tsuba666
    @tsuba666 · 1 year ago

    23:20 So 100 samples of size 600 each gives us 600 000 samples in total ? And no one pipes up ? Alright, my brain must have fried somewhere along the line, then.

  • @bunorah1335
    @bunorah1335 · 3 years ago

    Where is the link for lecture 9?

  • @mitocw

    @mitocw

    3 years ago

    kzread.info/dash/bejne/qH16ral_nJSpnps.html

  • @nishanpoudel3817
    @nishanpoudel3817 · 4 years ago

    Can anyone tell me why the states are independent in an election?

  • @studywithjosh5109

    @studywithjosh5109

    3 years ago

    Because one state's voting cannot affect how another state votes

  • @studywithjosh5109

    @studywithjosh5109

    3 years ago

    But he stated that this was false

  • @jinruifoo7087
    @jinruifoo7087 · 3 years ago

    There are so many errors in the notes...

  • @ArunKumar-yb2jn
    @ArunKumar-yb2jn · 2 years ago

    29:35 When I felt I was human like the rest.

  • @quocvu9847
    @quocvu9847 · 8 months ago

    20:29

  • @programmer1010
    @programmer1010 · 1 year ago

    23:21. So 100*600=600k 😂