When Should You Adjust Standard Errors for Clustering?
The Quarterly Journal of Economics (IF 13.7), Pub Date: 2022-09-28, DOI: 10.1093/qje/qjac038
Alberto Abadie, Susan Athey, Guido W Imbens, Jeffrey M Wooldridge

Clustered standard errors, with clusters defined by factors such as geography, are widespread in empirical research in economics and many other disciplines. Formally, clustered standard errors adjust for the correlations induced by sampling the outcome variable from a data-generating process with unobserved cluster-level components. However, the standard econometric framework for clustering leaves important questions unanswered: (i) Why do we adjust standard errors for clustering in some ways but not others, for example, by state but not by gender, and in observational studies but not in completely randomized experiments? (ii) Is the clustered variance estimator valid if we observe a large fraction of the clusters in the population? (iii) In what settings does the choice of whether and how to cluster make a difference? We address these and other questions using a novel framework for clustered inference on average treatment effects. In addition to the common sampling component, the new framework incorporates a design component that accounts for the variability induced on the estimator by the treatment assignment mechanism. We show that, when the number of clusters in the sample is a nonnegligible fraction of the number of clusters in the population, conventional clustered standard errors can be severely inflated, and propose new variance estimators that correct for this bias.
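To make the abstract's starting point concrete, the following is a minimal sketch of the conventional adjustment it refers to: a heteroskedasticity-robust variance next to the usual cluster-robust (sandwich) variance for an OLS slope. It is not one of the new variance estimators the paper proposes. The simulated data, the function name `ols_with_ses`, and all parameter values are illustrative assumptions, and no small-sample correction is applied.

```python
import numpy as np

def ols_with_ses(y, d, cluster_ids):
    """OLS of y on an intercept and d (hypothetical helper).

    Returns the slope, its heteroskedasticity-robust SE, and its
    conventional cluster-robust SE (no small-sample correction).
    """
    X = np.column_stack([np.ones_like(d, dtype=float), d])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    scores = X * resid[:, None]  # per-observation score x_i * e_i

    # Robust "meat": sum over observations of (x_i e_i)(x_i e_i)'
    meat_robust = scores.T @ scores

    # Clustered "meat": sum the scores within each cluster first,
    # then take outer products of the cluster totals
    _, inverse = np.unique(cluster_ids, return_inverse=True)
    cluster_scores = np.zeros((inverse.max() + 1, X.shape[1]))
    np.add.at(cluster_scores, inverse, scores)
    meat_cluster = cluster_scores.T @ cluster_scores

    se_robust = np.sqrt(np.diag(XtX_inv @ meat_robust @ XtX_inv))[1]
    se_cluster = np.sqrt(np.diag(XtX_inv @ meat_cluster @ XtX_inv))[1]
    return beta[1], se_robust, se_cluster

# Assumed toy design: treatment assigned at the cluster level and an
# unobserved cluster-level component in the outcome.
rng = np.random.default_rng(0)
n_clusters, n_per = 50, 40
cluster_ids = np.repeat(np.arange(n_clusters), n_per)
cluster_effect = rng.normal(size=n_clusters)[cluster_ids]
d = rng.integers(0, 2, size=n_clusters)[cluster_ids].astype(float)
y = 1.0 * d + cluster_effect + rng.normal(size=n_clusters * n_per)

slope, se_robust, se_cluster = ols_with_ses(y, d, cluster_ids)
print(f"slope={slope:.3f}  robust SE={se_robust:.3f}  clustered SE={se_cluster:.3f}")
```

In this toy design the clustered standard error is larger than the robust one because both the treatment and part of the outcome noise vary only across clusters. If every observation is placed in its own cluster, the two formulas coincide.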


Updated: 2022-09-28
