Sumhdfe is a Stata package that produces summary and diagnostic information for linear fixed-effects models. It shows:
- The frequency of fixed effects
- How many groups (e.g., firms) have no variation within fixed effects
- The residual within-fixed-effect variation of the regression variables
It is currently in beta, so all comments and suggestions are welcome.
For a discussion of within-fixed-effect variation, and the underlying issues that sumhdfe addresses, see deHaan (2021). Similarly, if you find these diagnostics to be useful, please cite:
**deHaan, Ed. (2021). Using and Interpreting Fixed Effects Models.** Available at SSRN: https://ssrn.com/abstract=3699777.
Sumhdfe requires the latest development versions of `reghdfe` and `ftools` to be installed first.
To install these packages and `sumhdfe`, follow the steps below:
cap ado uninstall ftools
cap ado uninstall reghdfe
cap ado uninstall sumhdfe
net install ftools, from("https://raw.githubusercontent.com/sergiocorreia/ftools/groupreg/src/")
net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/reghdfe6/src/")
net install sumhdfe, from("https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/src/")
`sumhdfe` can be used in one of two ways:
- As a postestimation command following `reghdfe`
- As a standalone command
Post-estimation version
First run `reghdfe` and then run `sumhdfe`. A simple example is shown below; see the Stata help file for additional examples.
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2 , a(firm year)
sumhdfe
Standalone version
Run `sumhdfe` directly.
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
sumhdfe y x1 x2 , a(firm year)
The `sumhdfe` command provides four panels by default.
Panel A provides summary statistics for the sample used in `reghdfe`.
Example:
Notes:
- It can be customized similarly to `estat summarize`
- N includes singletons, so it differs from the N shown in the `reghdfe` output
Panel B provides summary statistics for the fixed effects themselves.
Example:
Notes:
- Interpretation of the above example:
  - There are 189 unique firms within the firm fixed effects, 28 of which are singletons (i.e., appear just once). An individual firm has between 1 and 8 observations.
  - There are 39 unique years within the year fixed effects, 8 of which are singletons.
  - Iterating across both firm and year eliminates 2 more "joint singletons," for a total of 38 singletons eliminated from the `reghdfe` output.
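The first-pass singleton counts described above can be approximated by hand. The following is a minimal sketch, assuming the demo data contains the variables `firm`, `year`, `y`, `x1`, and `x2` used elsewhere in this README; the names `n_firm` and `n_year` are hypothetical helpers. It does not iterate across fixed effects, so it will not find the "joint singletons" that `reghdfe` drops, and its counts may not match Panel B exactly.

```stata
* Sketch only: count first-pass singletons by hand (not how sumhdfe does it internally)
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear

* Keep observations with complete data for the regression (assumed sample definition)
keep if !missing(y, x1, x2, firm, year)

* Number of observations in each firm and in each year (hypothetical helper variables)
bysort firm: gen n_firm = _N
bysort year: gen n_year = _N

* A singleton group has exactly one observation
count if n_firm == 1
count if n_year == 1
```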
Panel C quantifies how often each variable is constant within a given fixed effect group (such as within a given firm). These observations can have unexpected effects on regression coefficients and, if numerous, should be carefully evaluated.
Example:
Notes:
- Interpretation of the above example:
  - Variable x1 has (623-38=) 585 observations excluding singletons.
  - Within the non-singleton data, 58 firms have no variation in x1; i.e., each firm has the same x1 in all years. Those 58 firms relate to 217 observations.
  - x1 is constant within 4 years, relating to 28 observations.
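A rough manual check of which firms have no variation in x1 might look like the sketch below. It assumes the same demo-data variables as the rest of this README, and the helper names (`n_firm`, `min_x1`, `max_x1`, `firm_tag`) are hypothetical. It only excludes firm-level singletons rather than the full set of singletons removed iteratively, so the counts can differ slightly from Panel C.

```stata
* Sketch only: find firms in which x1 never changes
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
keep if !missing(y, x1, x2, firm, year)

* Group size, and the smallest and largest x1 within each firm
bysort firm: gen n_firm = _N
bysort firm: egen double min_x1 = min(x1)
bysort firm: egen double max_x1 = max(x1)

* Flag one observation per firm so that firms can be counted
egen firm_tag = tag(firm)

* Non-singleton firms with constant x1
count if firm_tag & n_firm > 1 & min_x1 == max_x1

* Observations belonging to those firms
count if n_firm > 1 & min_x1 == max_x1
```

Comparing the minimum and maximum avoids relying on a computed standard deviation being exactly zero.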
Panel D shows how much variation in each variable is lost (or absorbed) due to the fixed effects, in terms of both standard deviations and r-squared.
Example:
Notes:
- Interpretation of the above example:
  - The standard deviation of x1 is 79.7 in the pooled sample (as also shown in Panel A), but the within-fixed-effect standard deviation of x1 is 22.7. Thus, the within-fixed-effect variation of x1 is roughly 28.4% of that in the pooled sample.
  - In terms of r-squared, the firm fixed effects explain roughly 87% of the variation in x1 while the year fixed effects explain roughly 13%. Combined, the fixed effects explain 92.4% of the variation in x1.
  - Technical note: the r-squared is relative to the sample including singletons, for which the r-squared is mechanically equal to 100%.
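For intuition, the pooled and within-fixed-effect standard deviations can be approximated with `reghdfe` directly. This is a sketch under stated assumptions, not the package's own computation: the pooled statistic here uses the `reghdfe` estimation sample (which excludes singletons, unlike Panel A), and `summarize` applies no degrees-of-freedom adjustment for the absorbed fixed effects, so the numbers may differ from those in Panel D. The scalar names are hypothetical.

```stata
* Sketch only: approximate the pooled vs. within-FE standard deviation of x1
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear

* Define the estimation sample
qui reghdfe y x1 x2, a(firm year)

* Pooled standard deviation of x1 in that sample
qui summarize x1 if e(sample)
scalar sd_pooled = r(sd)

* Residualize x1 on the fixed effects; residuals are stored in _reghdfe_resid
qui reghdfe x1 if e(sample), a(firm year) resid
scalar r2_fe = e(r2)

* Standard deviation of the residualized x1
qui summarize _reghdfe_resid
scalar sd_within = r(sd)

display "within/pooled SD ratio = " %6.3f scalar(sd_within)/scalar(sd_pooled)
display "share of x1 variation explained by the FEs = " %6.3f scalar(r2_fe)
```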
The `histogram(#)` option tabulates the frequencies of observations within a fixed effect grouping.
Example:
For example, `sumhdfe, histogram(1)` shows the frequencies of observations for the first fixed effect grouping listed within `a(firm year)`, which in this case is firm. You can also specify the fixed effect name; for example, `sumhdfe, histogram(year)`.
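Put together with the post-estimation workflow shown earlier, a session using this option might look like the following sketch. It only combines commands that already appear in this README; see the help file for the exact behavior of each option.

```stata
* Sketch: histogram option used after reghdfe, with the demo data
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2, a(firm year)

* Frequencies for the first fixed effect listed in a(), here firm
sumhdfe, histogram(1)

* The same option, naming the fixed effect explicitly
sumhdfe, histogram(year)
```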
For additional examples and options, see the Stata help file with `help sumhdfe`, or its online version.
- Allow for easy export of each table to csv/excel/tex
- Tutorial/documentation with real-world example
- Add an option to visually compare the pooled- and within-fixed-effect variation in a variable. In the meantime, it can be manually done as follows:
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
qui: reghdfe y x1 x2, a(firm year)
qui: reghdfe x1 if e(sample), a(firm year) resid
twoway (histogram x1, fcolor(green%75) lcolor(none)) (histogram _reghdfe_resid, ///
fcolor(navy%70) lcolor(none)), legend(on order(1 "x1" 2 "within-FE x1"))
(will be added as new versions are posted)
If you have questions or experience problems, please use the issues tab of this repository.
Known bugs: