An intuitive introduction to Difference-in-Differences

Difference-in-Differences is one of the most widely applied methods for estimating causal effects of programs when the program was not implemented as a randomized controlled trial.
In this video I describe the situations where the method is applicable and give you the intuition behind it. I also explain how and why you might want to use regression to estimate diff-in-diff effects. Throughout, I talk about the key assumption required for the diff-in-diff estimate to be valid.
Intended audience: Folks who have had some exposure to linear regression models, but want to learn more statistical methods.
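
As a compact sketch of the regression version mentioned above: the dummy names $D^{tr}$ and $D^{post}$ follow the notation commenters use below, and the zero-conditional-mean error is an assumption of this sketch rather than something stated in the description. The model and the four cell means it implies can be written as:

```latex
\begin{align*}
y &= \beta_0 + \beta_1 D^{post} + \beta_2 D^{tr} + \beta_3 \, D^{post} D^{tr} + \varepsilon \\
E[y \mid D^{tr}=0,\ D^{post}=0] &= \beta_0 \\
E[y \mid D^{tr}=0,\ D^{post}=1] &= \beta_0 + \beta_1 \\
E[y \mid D^{tr}=1,\ D^{post}=0] &= \beta_0 + \beta_2 \\
E[y \mid D^{tr}=1,\ D^{post}=1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\
(\text{treated change}) - (\text{control change}) &= (\beta_1 + \beta_3) - \beta_1 = \beta_3
\end{align*}
```

Read this way, $\beta_3$ is the diff-in-diff estimate. It can be interpreted as a causal effect only if the treated group would have followed the same trend as the control group without the program (usually called the parallel trends assumption).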

Comments: 106

  • @pedrocolangelo2458
    @pedrocolangelo2458 3 years ago

    This is probably one of the best videos on this subject that I've ever seen. Thanks!!

  • @SalehBabazadeh
    @SalehBabazadeh 9 years ago

    Thank you so much Doug! I just wanted to encourage you to keep up this great job. Your videos are awesome, and I believe they are being used by different people in different fields.

  • @dougmckee673

    @dougmckee673

    9 years ago

    +Saleh Babazadeh Thanks so much for the kind words! I really should post more of these!

  • @TaroQuispe

    @TaroQuispe

    4 years ago

    @@dougmckee673 thanks from my side too, very clear and easy to understand. Do consider posting similar vids on regression techniques and similar, cheers!

  • @anuvaagarwal3492
    @anuvaagarwal3492 3 years ago

    One of the best and most lucid explanations of the DID method. Thank you for this, Doug. Especially how you explain the intuition behind why the DID estimate calculated by hand is the same as the one estimated by the regression model. And the part where you elaborate on the simple benefits of using a regression for a DID model is great. I really appreciate you sharing your understanding here.

  • @monicamu8013
    @monicamu8013 4 years ago

    When I watched the video for the first time, I was totally lost. The second time, I paused in between to give myself more time to understand your super intelligent and super long sentences. It is so much clearer now. Thank you so much!

  • @lawrencecobb2107
    @lawrencecobb2107 2 years ago

    This is such a clear and helpful video. I’m taking an exam in an hour and doing last minute double checks. This makes me feel more confident, thank you

  • @sharonie
    @sharonie 3 years ago

    Best Diff-in-diff course I have learned. Thanks!

  • @bl.l1506
    @bl.l1506 4 years ago

    Your videos have been vital for understanding the contents of my statistics course for me! So far, I've supplemented every new concept with your videos. Sometimes, I even watch your video first and then do the readings. Please keep doing these videos!

  • @dianaadamczyk5273
    @dianaadamczyk5273 6 years ago

    Can't tell you how useful your videos are. Thanks for passing on the knowledge!

  • @zaraazami4936
    @zaraazami4936 8 years ago

    Thank you so much! This video was waaay more helpful than reading pages and pages on DD! Very clear and to the point! Thank you!!

  • @brothermalcolm
    @brothermalcolm 3 years ago

    Absolutely brilliant tutorial, first result returned, wish youtube was always this helpful!

  • @xb2856
    @xb2856 1 year ago

    Way more intuitive than I previously thought. Well put, thanks.

  • @Josefk40
    @Josefk40 8 years ago

    Excellent explanation in 12 minutes. Thank you

  • @Itachi0567
    @Itachi0567 4 years ago

    Thanks a lot for this clear explanation, you don't know how much it helped me.

  • @Non-disjunction
    @Non-disjunction 3 years ago

    You are such a legend mister McKee

  • @thefadingmoonlight
    @thefadingmoonlight 8 years ago

    Thank you so much for uploading this! I had looked online at DID and was confused. This made it so easy to understand and apply.

  • @techierealestate
    @techierealestate 5 years ago

    Clear and right to the point. I always wondered why the coefficient on the multiplication (interaction) term is the DD coefficient. Now I know :D
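
One way to see this numerically is to fit the regression by ordinary least squares and read off the interaction coefficient. The sketch below is only an illustration: the eight records are made-up numbers (not the data from the video), and `solve` and `ols` are small hypothetical helpers written here so that nothing beyond plain Python is needed.

```python
# Hypothetical records: (treated, post, score). Not the data from the video.
records = [
    (0, 0, 60), (0, 0, 64),   # control group, before
    (0, 1, 66), (0, 1, 70),   # control group, after
    (1, 0, 50), (1, 0, 54),   # treatment group, before
    (1, 1, 71), (1, 1, 75),   # treatment group, after
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Columns: constant, Dpost, Dtr, Dpost * Dtr (the interaction term).
x = [[1, post, treated, post * treated] for treated, post, _ in records]
y = [score for _, _, score in records]

b0, b_post, b_tr, b_did = ols(x, y)
print(b0, b_post, b_tr, b_did)
```

With only the two dummies and their interaction on the right-hand side, the model has one parameter per cell, so the fitted coefficients reproduce the four cell means exactly and the interaction coefficient equals the by-hand difference of differences.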

  • @marben7062
    @marben7062 8 years ago

    Thank you very much Doug. It helped me to analyse my data (pooled cross section).

  • @yading9202
    @yading9202 5 years ago

    Very clear, easy to understand. Great job!

  • @anglofranses8205
    @anglofranses8205 3 years ago

    This is pure gold. Thanks!

  • @huekim589
    @huekim589 2 years ago

    Very good and funny videos bring a great sense of entertainment!

  • @kevinvandenbrink8214
    @kevinvandenbrink8214 9 years ago

    Thanks for the video, it really helped me in my finance research. Just one thing: when you talk about the dummy variable Dtr, I think it takes 1 if the person is in the treatment group and 0 if the person is in the control group.

  • @dougmckee673

    @dougmckee673

    9 years ago

    Kevin van den Brink You're exactly right--When (if) I re-record this I'll fix that. Thanks!

  • @digray6732
    @digray6732 2 years ago

    Thank you for this! I didn't quite understand the very last point, i.e. the difference between the points made for when DD is 'ok' (appropriate) and 'not ok'

  • @zhouchen7682
    @zhouchen7682 8 years ago

    Very useful, wait for more.

  • @sembilanbereguler2602
    @sembilanbereguler2602 9 years ago

    Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis (that is, for saying that the effect of the lunch program is statistically significant)?

  • @emeraldwei6672
    @emeraldwei6672 1 year ago

    Thank you! I would like to know, if there isn't a comparable group, like Rio, then how can one figure out the effect of this programme?

  • @Run4un
    @Run4un 4 months ago

    In this example, are the y-scores the post-scores or the pre-post differences? I'm guessing just post-scores? Thanks for clarifying!

  • @braddoremus588
    @braddoremus588 6 years ago

    Thank you - very good explanation. Helped clear a lot up for me.

  • @GradualReportSerbia
    @GradualReportSerbia 4 years ago

    Abrupt ending, good video

  • @VikramSingh-sf1ev
    @VikramSingh-sf1ev 3 years ago

    Very clear to the point

  • @rheabanerjee4938
    @rheabanerjee4938 5 years ago

    I wish you would post more, you're great!

  • @Non-disjunction
    @Non-disjunction 3 years ago

    Amazing video

  • @leopan54321
    @leopan54321 2 years ago

    Dude. This saved me thanks :)

  • @DavidLihm
    @DavidLihm 8 years ago

    Thank you so much, this has been really useful!

  • @libbyalthea3061
    @libbyalthea3061 7 years ago

    Hello! Thank you for a great video! Do you have any advice for estimating the necessary sample size before implementing treatment? Thanks!

  • @wisuraweerathunga2188
    @wisuraweerathunga2188 4 years ago

    Thanks for this one ! You made it clear !

  • @linearseller2835
    @linearseller2835 8 years ago

    What a great video. I did miss conclusions about the example, though. Beta3 is 30, but it has a p-value equal to 0.228. So we can conclude that this free lunch plan didn't have a statistically significant effect (at the 95% level), right? Those 30 points could have been by chance, right?

  • @dougmckee673

    @dougmckee673

    8 years ago

    +Linear Seller Absolutely correct and not that surprising given there were only 10 observations in this sample.

  • @oldtree700
    @oldtree700 7 years ago

    Hi, Doug! Thank you so much for your great video. I have a quick question. At the end of the video you mentioned the example for the case where DiD is not ok. If the free lunch program has already been implemented in the control group, is there any way I can still use it as a control group? Can semiparametric DiD be used?

  • @xingu7561
    @xingu7561 5 years ago

    It is really helpful! This video is easy to understand for new learners like me! I really appreciate your help! If I can survive my PhD program, I hope I can make videos like this in the future!

  • @inferno9004
    @inferno9004 8 years ago

    Great video Doug!!! If there is just 1 treatment group and 1 control group with pre vs. post time data and we want to include many control variables, say 5, how do we fit a model with 5 control variables? What does the regression equation look like?

  • @dougmckee673

    @dougmckee673

    8 years ago

    +inferno9004 It looks just like the regression model shown in the video with the addition of your control variables.

  • @thej1091
    @thej1091 2 years ago

    Thank you kind sir! :)

  • @tuhinurrahmanchowdhury9705
    @tuhinurrahmanchowdhury9705 3 years ago

    Great video. It saved me!

  • @vedantss
    @vedantss 2 years ago

    Very useful!

  • @Ytremz
    @Ytremz 8 years ago

    Brilliant

  • @saraly2
    @saraly2 2 years ago

    Thank you!

  • @hd81504
    @hd81504 7 years ago

    First off, thanks for the great video, Doug! I have a follow-up question to one of the comments below. One person commented: "So do I understand correctly that an extension of the model for 3 treatment groups and 1 control with pre and post could look like the following: y = β0 + β1 * Dpost + β2 * Dtr1 + β3 * Dtr2 + β4 * Dtr3 + β5 * Dpost * Dtr1 + β6 * Dpost * Dtr2 + β7 * Dpost * Dtr3 + β8 * X, where β5 is the DiD effect for Treatment 1, β6 is the DiD effect for Treatment 2, and β7 is the DiD effect for Treatment 3?" And you replied that this is correct. So my question is: can you do this same procedure in logistic regression when your dependent variable is dichotomous (e.g., disease vs. no disease)?

  • @dougmckee673

    @dougmckee673

    7 years ago

    Interpreting coefficients on interaction terms in nonlinear models (like logistic) is tricky. If it were me, I would just estimate a linear probability model, but there's a much longer (and better) answer here: stats.stackexchange.com/questions/89513/difference-in-differences-estimator-for-logistic-regressions

  • @lemoncobra2563

    @lemoncobra2563

    5 years ago

    To respond to Doug: a word of caution on using an LPM is that you can get predicted probabilities outside the [0, 1] range, and your errors will be heteroskedastic. The latter can be fixed with an extra option, but the former is a fundamental issue with the estimator itself. I would argue the point of using DiD is to examine the magnitude of change from a program, etc., and with a logit regression you can get your coefficients, calculate the margins, and use the margins to calculate the change in probability that the DD had on your dependent variable. You're kind of muddling the point of using a logit in this regard, but it still works. It loses some explanatory power and some of the charm. Still doable though.

  • @bright1402
    @bright1402 5 years ago

    Thank you for your video! But at the time 8:06, what is the difference between \beta_0 and \epsilon?

  • @jotaeleoh

    @jotaeleoh

    5 years ago

    Beta_0 is the intercept: the expected value of the outcome "y" when the other variables are all zero (here, the control group before the program). Epsilon is the error term, which basically contains all the other components of "y" that the model does not explain.

  • @fritzlouw8434
    @fritzlouw8434 8 years ago

    Much appreciated. Keep it up man!

  • @sarapluviano410
    @sarapluviano410 7 years ago

    Hi, thanks for the video. In the beginning you say that DID is useful for estimating causal effects of programs when the program is not implemented as a randomized controlled trial. So, in a randomized controlled trial, is DID not necessary? Thanks!

  • @zeinebouni8764
    @zeinebouni8764 8 years ago

    Hi Mr Doug, thank you for this interesting video. Is it possible to do DID with ordinal outcomes? My variables: firm rating (Y); D1 (D1 == 1 if treated sample; 0 if control sample); D2 (D2 == 1 if after treatment; 0 if before). I didn't find any examples showing whether it is possible and how to interpret the estimators. Your response is very important to me. Thank you.

  • @dougmckee673

    @dougmckee673

    8 years ago

    +Zeineb Ouni I haven't seen it done, but I believe you could estimate an ordered logit model (ologit) with the same covariates shown above (D1, D2, and D1*D2 in your case). You have to be careful with interpreting interactions in the ordered logit, but I think the basic idea is valid.

  • @zeinebouni8764

    @zeinebouni8764

    8 years ago

    +Doug McKee Thank you so much.

  • @shubrathak.p.7198
    @shubrathak.p.7198 8 years ago

    Hi Doug. Please help me! Can I use DID if my data does not follow the assumption of normality? If not..is there a non-parametric DID?!

  • @dougmckee673

    @dougmckee673

    8 years ago

    If you have a large enough number of observations (at *least* 25, and I'd feel comfortable over 100), then your outcome doesn't need to be normal--The Central Limit Theorem says your estimate of the treatment effect will be approximately normal. I believe there are nonparametric DiD-like methods when you have a continuous treatment and you believe the effect is nonlinear, but I don't know much about them.

  • @shubrathak.p.7198

    @shubrathak.p.7198

    8 years ago

    Thank you Doug!

  • @josephdover6822
    @josephdover6822 8 years ago

    Hi Doug! Thank you so much for your video. I just wanted to ask you a small question: I am also planning to use the difference-in-differences model. I am looking at the impact of the EURO (introduced in 1998 and in circulation in 2002) on trade flows between countries in Europe, and I am new to STATA, so I am not too sure how to proceed. I ran the following regression: regress Tradeflow Governmenteffectiveness1 Unemployment1 GDPpercapita1 Populationsize1 Governmenteffectiveness2 Unemployment2 GDPpercapita2 Populationsize2 Distance1-2. But I am not sure what I should do next. Any help would be very much appreciated! :) Best, Joseph

  • @dougmckee673

    @dougmckee673

    8 years ago

    +joseph dover To apply a difference in difference, you'll need to divide your trade flows into some set that might be affected by the introduction of the Euro (treatment) and another set that definitely would not be (control). You will also need to reshape your data so you have observations of each trade flow before and after the Euro was introduced. Then you should be able to apply the regression method shown in the video. Good luck!

  • @bright1402
    @bright1402 5 years ago

    Thank you so much for your video! But in the last slide, I could not understand the Not OK case...

  • @alfonsoga95
    @alfonsoga95 5 years ago

    Thanks, I have one question though, what's the name of the program you're using for the regression? I'm not familiar with it, I find it quite practical

  • @oyvsni6679

    @oyvsni6679

    4 years ago

    Doug is using Stata

  • @GoonieFridkin
    @GoonieFridkin 8 years ago

    Hi. Thanks so much for this! Quick question though. I've just run a DD regression on my data. The DD beta score isn't significant, but the group (test vs control) beta is. What does this mean?

  • @dougmckee673

    @dougmckee673

    8 years ago

    The insignificant DD beta means there is no significant effect of the treatment. The significant group beta means you have significant pre-treatment differences between the groups.

  • @eiinre
    @eiinre 7 years ago

    Hi Doug, how do I add additional controls (i.e. X) into the model? I am using SPSS to do the DiD. Do I just add the control variable and regard it as an independent variable?

  • @hassanmurtzakhan
    @hassanmurtzakhan 9 years ago

    I am trying to run this in STATA and it omitted Beta3 because of multicollinearity between variables. Can you guide me on how to handle it? Thanks

  • @dougmckee673

    @dougmckee673

    9 years ago

    Hassan Murtza Khan I don't usually answer Stata questions on YouTube, but I'll make an exception just this once. :) There are two possibilities. The first is that you don't have observations for each group (treatment and control) in both the before and after periods. Tabulate your treatment dummy against your post-period dummy and make sure all four cells have observations. The second possibility is that you made a mistake constructing the interaction variable. Check this by tabulating the interaction with each of the dummies to make sure the result makes sense. Now your job is to try these and report back so everyone can learn!
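
The two checks described in this reply can be sketched outside Stata as well. The snippet below is only an illustration in plain Python: the rows are made-up, and `empty_cells` and `bad_interactions` are hypothetical helper names, not Stata commands.

```python
from collections import Counter

# Hypothetical rows: (treated, post, interaction). Not the commenter's data.
rows = [
    (0, 0, 0), (0, 0, 0),
    (0, 1, 0),
    (1, 0, 0),
    (1, 1, 1), (1, 1, 1),
]


def empty_cells(rows):
    """Return the (treated, post) cells that contain no observations."""
    counts = Counter((treated, post) for treated, post, _ in rows)
    all_cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return [cell for cell in all_cells if counts[cell] == 0]


def bad_interactions(rows):
    """Return rows whose interaction is not the product of the two dummies."""
    return [row for row in rows if row[2] != row[0] * row[1]]


print("empty cells:", empty_cells(rows))
print("mis-built interactions:", bad_interactions(rows))
```

If either list is non-empty, the interaction term cannot be estimated as intended: an empty cell makes the interaction collinear with the other regressors, and a mis-built interaction is simply measuring something else.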

  • @homayoungerami4176
    @homayoungerami4176 4 years ago

    thanks, it was easy to digest

  • @johndupont8596
    @johndupont8596 8 years ago

    Hi Doug, thanks a lot for the video! I just have a question. I want to fit a difference-in-differences model in STATA comparing students who received maths lessons with those who didn't. I would like to test whether having extra maths lessons helps students achieve higher marks. My variables are: "StudentID" "TIME" "MATHS_LESSON" "MARKS". The problem I have is that not every student received maths lessons over the period, and I would like to create 2 groups, one "maths_lesson" and one "Nomaths_lesson", by adding them to the variable column "StudentID". How should I proceed? Let me recap: what I am now trying to obtain is a graph with "time" on the x axis and "marks" on the y axis with two lines (one for the group of students who took maths classes and one for the group that didn't), but I am struggling a bit to achieve this. Hope I am clear in describing my problem! Best regards, John

  • @dougmckee673

    @dougmckee673

    8 years ago

    +John Dupont Using your TIME variable, you should divide your observations into "before" and "after" groups. You've already divided your students into those that got the treatment (MATHS_LESSON) and those that didn't. Once you have that, you can compute means of the four cells and subtract them to get the DD estimate. I advise first understanding your data and computing the required numbers before worrying about communicating those numbers with a graph. Hope this helps!
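
The "compute means of the four cells and subtract them" step can be sketched as follows. This is a minimal illustration in plain Python with made-up numbers (not the commenter's data or the data from the video); `cell_mean` and `did_estimate` are hypothetical helper names.

```python
from statistics import mean

# Hypothetical records: (treated, post, score).
records = [
    (0, 0, 60), (0, 0, 64),   # no lessons, before
    (0, 1, 66), (0, 1, 70),   # no lessons, after
    (1, 0, 50), (1, 0, 54),   # lessons, before
    (1, 1, 71), (1, 1, 75),   # lessons, after
]


def cell_mean(rows, treated, post):
    """Average outcome for one of the four group-by-period cells."""
    return mean(y for t, p, y in rows if t == treated and p == post)


def did_estimate(rows):
    """(treated after - treated before) - (control after - control before)."""
    change_treated = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    change_control = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return change_treated - change_control


print(did_estimate(records))
```

Note that each cell is averaged before any subtraction; summing the raw scores instead would give a different (and wrong) number whenever the cells differ in size.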

  • @rohangopalakrishnan7417
    @rohangopalakrishnan7417 3 years ago

    Big from you Doug

  • @ec.juanfranulcuangolee3294
    @ec.juanfranulcuangolee3294 4 years ago

    Any impact evaluation is supposed to start with #Building the #DataBase, and only then should a methodology such as DID be analyzed, isn't it?

  • @sembilanbereguler2602
    @sembilanbereguler2602 9 years ago

    Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis?

  • @monicabraga4344
    @monicabraga4344 2 years ago

    How did you do it? Can you share it with me? Thank you.

  • @tarpinianmt
    @tarpinianmt 9 years ago

    Thank you so much for this, I had never heard of difference in differences until a reading I had for economic development. I'm actually planning to reference this video in a paper; do you have anything you'd want me to include for a citation? Thanks again.

  • @dougmckee673

    @dougmckee673

    9 years ago

    Matthew Tarpinian I'm really glad you've found the video helpful, but it's probably not appropriate for a citation in your paper. If you want a good reference for the method, I suggest using Angrist and Pischke's _Mostly Harmless Econometrics_ instead.

  • @tjahangon7286
    @tjahangon7286 9 years ago

    Thank you very much. This video really helps me. What statistic program did you use in this video? Stata?

  • @dougmckee673

    @dougmckee673

    9 years ago

    ***** I did use Stata to get some of the numbers shown, but the content is fairly independent of the software in this video. Stata plays a bigger role in some of my other videos.

  • @tjahangon7286

    @tjahangon7286

    9 years ago

    Thank you very much.

  • @tjahangon7286

    @tjahangon7286

    9 years ago

    Doug McKee May I ask one more question? I am using a binary dependent variable (dummy). I have searched for information on the internet and found that it is possible to have a regression model with a binary dependent variable (in STATA: the .probit and .logit commands). In your opinion, can this also be used for the regression of a DD model (I mean, using the command .logit y DTr DPost DTrXDPost)?

  • @dougmckee673

    @dougmckee673

    9 years ago

    ***** Short answer: Yes. Longer answer: If you use your binary dependent variable in a linear regression model exactly as shown here, you are estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one. Most economists would do this. You *could* estimate a logistic model with the same variables on the right hand side, but it is much harder to interpret the magnitude of the coefficient on the interaction.

  • @tjahangon7286

    @tjahangon7286

    9 years ago

    Doug McKee Do you mean that if y is a binary dependent variable and: 1. I use command [regress y DTr DPost DTrXDPost], then I am "estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one." 2. I use command [.logit y DTr DPost DTrXDPost], then "it is much harder to interpret the magnitude of the coefficient on the interaction." I hope your answer is "yes".
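
For the first case in this exchange (a binary y in the plain linear regression), the interpretation can be sketched as below. The notation is an assumption of this sketch: $\Pr(y=1 \mid a, b)$ is shorthand for the probability that $y = 1$ given $D^{tr} = a$ and $D^{post} = b$.

```latex
\begin{align*}
\Pr(y = 1 \mid D^{tr}, D^{post})
  &= \beta_0 + \beta_1 D^{post} + \beta_2 D^{tr} + \beta_3 \, D^{post} D^{tr} \\
\beta_3
  &= \big[\Pr(y=1 \mid 1, 1) - \Pr(y=1 \mid 1, 0)\big]
   - \big[\Pr(y=1 \mid 0, 1) - \Pr(y=1 \mid 0, 0)\big]
\end{align*}
```

In the linear probability model, $\beta_3$ is therefore directly a difference in differences of probabilities. In a logit with the same right-hand side, the same expression describes log-odds rather than probabilities, which is one reason the magnitude of the interaction coefficient is harder to interpret there.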

  • @Dniem
    @Dniem 3 years ago

    Hello Professor Armstrong!

  • @JM-fr9bc
    @JM-fr9bc 3 years ago

    What are the assumptions of dif in dif?

  • @lauramendezcarvajal5149
    @lauramendezcarvajal5149 8 years ago

    Douglas, thanks for this amazing video, it helped me so much! I just have a question: why does (y) have only one test score? I am a little bit confused about the pre-test and post-test information. If I have the test scores before the implementation and the scores after, how do I compute them? Thanks

  • @dougmckee673

    @dougmckee673

    8 years ago

    The key is to have (or be able to compute) the average test score of both groups before AND after the intervention.

  • @vegasastras9194
    @vegasastras9194 3 years ago

    What is that program at 8:17? It looks very neat.

  • @donasp5391

    @donasp5391

    3 years ago

    Stata

  • @ahmedseliem3201
    @ahmedseliem3201 3 years ago

    how to do a difference in difference method using SPSS? need practical steps

  • @brucelee7782
    @brucelee7782 5 years ago

    I didn't get the DiD effect of 30 from 7:35. Somebody help please! 😓

  • @liveybeha

    @liveybeha

    4 years ago

    I didn't either at first! Remember to average (rather than add) each set of observations before doing the DiD calculation.

  • @Nem3siS4o
    @Nem3siS4o 7 years ago

    Thanks!

  • @ursulapulyer916
    @ursulapulyer916 8 years ago

    thank you!

  • @chocolateyum678
    @chocolateyum678 6 years ago

    thank . you!!!!!!!!

  • @matinhewing1
    @matinhewing1 6 years ago

    Who down voted this video? Someone who didn't get a free lunch?

  • @weoweoteo

    @weoweoteo

    6 years ago

    lol! this vid was super helpful. especially for my econometrics exam tomorrow xd

  • @brothermalcolm
    @brothermalcolm 3 years ago

    everything made sense until @7:55 help!

  • @joaoluistbarroso6917
    @joaoluistbarroso6917 3 years ago

    Show

  • @sjhoenen
    @sjhoenen 9 years ago

    Thanks!