
s3alfisc / summclust

R module for cluster specific information (as in the Stata summclust module)

Home Page: https://s3alfisc.github.io/summclust/

License: Other


summclust's Introduction

summclust


{summclust} is an R module for cluster-level measures of leverage and influence. It also implements the CRV3 and CRV3J cluster-robust variance estimators.

For an introduction to the package, take a look at its vignette.

For a quick overview of different CRV estimators, take a look at the cluster robust variance estimation vignette.

For a very detailed description of the implemented methods, in particular a discussion of the different leverage and influence metrics, see:

MacKinnon, J.G., Nielsen, M.Ø., Webb, M.D., 2022. Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust. QED Working Paper 1483. Queen’s University.

For the Stata version of the package, see here.

Installation

You can install summclust from CRAN, or install the development version from r-universe or GitHub:

# install from CRAN
install.packages('summclust')

# from r-universe (windows & mac, compiled R > 4.0 required)
install.packages('summclust', repos ='https://s3alfisc.r-universe.dev')

# development version from GitHub (requires devtools)
# install.packages("devtools")
devtools::install_github("s3alfisc/summclust")

summclust's People

Contributors

kylebutts, s3alfisc


summclust's Issues

fixest: sparse matrix support

For very high-dimensional fixed effects problems, using sparse matrices should make the MNW jackknife estimator computationally feasible (based on a suggestion by @kylebutts).
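
As a rough illustration of the idea (not the package's code), the sketch below builds a sparse design matrix with fixed-effect dummies and solves the normal equations on it. It assumes the Matrix package is available; the data and variable names are made up for the example.

```r
library(Matrix)

set.seed(1)
n  <- 2000
df <- data.frame(id = factor(sample(200, n, replace = TRUE)), x = rnorm(n))
df$y <- 0.5 * df$x + rnorm(nlevels(df$id))[df$id] + rnorm(n)

# Sparse design matrix: intercept, x, and one dummy per non-reference id level
X_sparse <- sparse.model.matrix(~ x + id, data = df)

# OLS via the normal equations, keeping X'X sparse
beta_sparse <- solve(crossprod(X_sparse), crossprod(X_sparse, df$y))

as.vector(beta_sparse)[2]  # coefficient on x
```

In a jackknife, the same solve would be repeated once per omitted cluster, which is where the sparse representation is expected to help most.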

summclust() overrides type = "CRV3J"

If type = "CRV3J" is specified via summclust(), the function still returns a CRV3 variance covariance matrix. No downstream errors, but misleading output.
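
To make the difference between the two estimators concrete, here is a minimal base-R sketch of the cluster jackknife. It is not the package's implementation, and all data and helper names (`ols`, `jack_vcov`) are made up. CRV3 centers the leave-one-cluster-out estimates at the full-sample estimate, while CRV3J centers them at their own mean.

```r
set.seed(1)
G  <- 8                                  # number of clusters (illustrative)
cl <- rep(seq_len(G), each = 10)         # cluster id of each observation
x  <- rnorm(length(cl))
y  <- 1 + 0.5 * x + rnorm(G)[cl] + rnorm(length(cl))
X  <- cbind(intercept = 1, x = x)

# Hypothetical helper: OLS coefficients from the normal equations
ols <- function(X, y) drop(solve(crossprod(X), crossprod(X, y)))

beta_hat <- ols(X, y)

# Leave-one-cluster-out estimates, one column per omitted cluster
beta_jack <- sapply(seq_len(G), function(g) {
  keep <- cl != g
  ols(X[keep, , drop = FALSE], y[keep])
})

# Jackknife variance around a given center (length-k vector)
jack_vcov <- function(center) {
  dev <- beta_jack - center              # center is recycled down each column
  (G - 1) / G * tcrossprod(dev)
}

vcov_crv3  <- jack_vcov(beta_hat)             # centered at the full-sample estimate
vcov_crv3j <- jack_vcov(rowMeans(beta_jack))  # centered at the jackknife mean
```

The two matrices differ by `(G - 1)` times the outer product of `rowMeans(beta_jack) - beta_hat`, so the CRV3 variances are never smaller than the CRV3J ones.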

add center option to autoplot

Add an option to center the beta_kj's around beta_k for every covariate k, so that the beta_kj's of all covariates are shown on a comparable scale.
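
A small sketch of what such centering could look like, assuming the jackknife coefficients are stored as a matrix with one row per covariate and one column per omitted cluster. The numbers and object names are invented for illustration.

```r
# Hypothetical k x G matrix of leave-one-cluster-out coefficients
beta_jack <- rbind(
  x1 = c(0.48, 0.52, 0.55, 0.45),
  x2 = c(102, 98, 105, 95)
)
beta_hat <- c(x1 = 0.50, x2 = 100)   # full-sample estimates (also made up)

# Subtract beta_k from every beta_kj in row k
beta_centered <- sweep(beta_jack, 1, beta_hat, FUN = "-")

beta_centered
```

After centering, every row is measured as a deviation from its own full-sample estimate, so rows with very different coefficient levels can share one axis.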

Support for models with unit specific time trends?

Hi, thanks for a great package!

I am trying to calculate CR3 standard errors for a model that includes unit specific time trends. Do you have any advice on how / if this can be implemented in summclust?

Here's a minimal example, showing the error that I get when I try to use the summclust() function on a model that includes unit specific time trends.

library(summclust)
library(fixest)
df <- fixest::trade

fit  <- feols(Euros ~ dist_km | Year + Origin + Origin[Year], data = df)
summ <- summclust(fit, params = ~dist_km, cluster = ~Origin)

The error is pasted below:

Error in `[[.default`(Origin, Year) : 
  attempt to select more than one element in vectorIndex

Thanks so much for your time and your help

fix fixed effects

Only fixed effects that are nested within clusters are allowed to be projected out in the bootstrap. For numerical stability, it is further recommended to project these fixed effects out. Therefore

  • add an option to summclust.lm and summclust.fixest, e.g. absorb_cluster_fe and set it to TRUE by default.
  • for regression objects that project out multiple fixed effects that are non-nested, make sure to properly add them as dummies to the design matrix X (see fwildclusterboot code)
  • potentially, experiment with out-projection of nested fe's in every jackknife iteration
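
One way to test the nesting condition is to check that each fixed-effect level occurs in exactly one cluster. The base-R sketch below uses a hypothetical helper `is_nested()` and toy data; it is not taken from the package.

```r
# A fixed effect is nested within the clustering variable if every level of
# the fixed effect occurs in exactly one cluster.
is_nested <- function(fe, cluster) {
  clusters_per_level <- tapply(cluster, fe, function(z) length(unique(z)))
  all(clusters_per_level == 1, na.rm = TRUE)
}

df <- data.frame(
  state  = c("A", "A", "A", "B", "B", "B"),
  county = c("a1", "a1", "a2", "b1", "b2", "b2"),
  year   = c(2000, 2001, 2000, 2000, 2000, 2001)
)

is_nested(df$county, df$state)  # county levels never span two states
is_nested(df$year,   df$state)  # the years occur in both states

# Non-nested fixed effects can instead enter the design matrix as dummies
X_year <- model.matrix(~ factor(year), data = df)
```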

Add extractor methods

For

  • leverage
  • partial leverage
  • p-values
  • t-statistics
  • confidence intervals
  • jackknife beta's

add measures

Add measures of

  • leverage
  • partial leverage
  • coefficient of variation
  • $\hat{\beta}_{j}^{G}$
  • a_harm, a_geo, a_quad
  • effective number of clusters as in Carter et al

As described in MacKinnon, Nielsen & Webb (2022).
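
For orientation, the first three measures can be sketched in a few lines of base R. This is an illustrative computation on simulated data, not the package's code. It takes cluster leverage as the trace of X_g'X_g (X'X)^{-1}, and partial leverage as each cluster's share of the squared residuals from regressing one covariate on the others.

```r
set.seed(42)
cl <- rep(1:5, times = c(5, 10, 15, 20, 50))   # unbalanced clusters (illustrative)
X  <- cbind(intercept = 1, x1 = rnorm(length(cl)), x2 = rnorm(length(cl)))
XtX_inv <- solve(crossprod(X))

# Cluster leverage L_g = trace(X_g' X_g (X'X)^{-1})
leverage <- sapply(split(seq_along(cl), cl), function(idx) {
  Xg <- X[idx, , drop = FALSE]
  sum(diag(crossprod(Xg) %*% XtX_inv))
})

# Partial leverage for x1: residualize x1 on the other regressors, then take
# each cluster's share of the residual sum of squares
x1_resid <- lm.fit(X[, c("intercept", "x2")], X[, "x1"])$residuals
partial_leverage <- tapply(x1_resid^2, cl, sum) / sum(x1_resid^2)

# Coefficient of variation of the leverages across clusters
cv_leverage <- sd(leverage) / mean(leverage)
```

The leverages sum to the number of regressors and the partial leverages sum to one, which gives a quick sanity check on any implementation.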

Adjust to CRAN submission comments

  • do not start the description with "This package", package name,

  • The Description field is intended to be a (one paragraph) description of what the package does and why it may be useful.
    Please add more details about the package functionality and implemented methods in your Description text.

  • Please add \value to .Rd files regarding exported methods and explain the functions results in the documentation. Please write about the structure of the output (class) and also what the output means. (If a function does not return a value, please document that too, e.g. \value{No return value, called for side effects} or similar)

    • coeftable.Rd: \value
    • coeftable.summclust.Rd: \value
    • plot.summclust.Rd: \value
    • summary.summclust.Rd: \value
    • summclust.fixest.Rd: \value
    • summclust.lm.Rd: \value
    • summclust.Rd: \value
    • vcov_CR3J.fixest.Rd: \value
    • vcov_CR3J.lm.Rd: \value
    • vcov_CR3J.Rd: \value
  • \dontrun{} should only be used if the example really cannot be executed (e.g. because of missing additional software, missing API keys, ...) by the user. That's why wrapping examples in \dontrun{} adds the comment ("# Not run:") as a warning for the user.
    Does not seem necessary. Please replace \dontrun with \donttest.

  • Please unwrap the examples if they are executable in < 5 sec, or replace \dontrun{} with \donttest{}.

  • Please wrap examples that need packages in ‘Suggests’ in if(requireNamespace("pkgname")){} instead.

  • Finally, resubmit.
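
The `requireNamespace()` pattern from the comments above could look roughly like this in an example section. The model and data are placeholders chosen for illustration, not the package's actual examples.

```r
# Run the example only if the suggested package is installed
if (requireNamespace("fixest", quietly = TRUE)) {
  fit <- fixest::feols(mpg ~ wt | cyl, data = mtcars)
  print(coef(fit))
} else {
  message("Package 'fixest' is not installed; skipping example.")
}
```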

Additional comments from my side:

  • improve docs
  • add @noRd tags to internal functions

pkgcheck results - main

Checks for summclust (v0.2)

git hash: f66d35ff

  • ✔️ Package name is available
  • ✖️ does not have a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✖️ Package has no HTML vignettes
  • ✖️ These functions do not have examples: [coeftable, coeftable.summclust, model_matrix.fixest, model_matrix.lm, model_matrix, plot.summclust, summary.summclust, summclust].
  • ✖️ Function names are duplicated in other packages
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage is 67.8% (should be at least 75%).
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

Support WLS

Support summclust and the cluster jackknife for weighted least squares (WLS).

Bug in tidy.summclust

Change the following:

From

  se <- diag(sqrt(vcov[param_, param_, drop = FALSE]))

to

  se <- sqrt(diag(vcov[param_, param_, drop = FALSE]))

in order to avoid the following warning:

Warning message:
In sqrt(vcov[param_, param_, drop = FALSE]) : NaNs produced

Note that this bug has no consequences for the displayed results.
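
The following base-R sketch, with a made-up covariance matrix, shows why the order matters. Applying `sqrt()` to the full matrix hits any negative covariance and warns, while the diagonal extracted afterwards is the same either way.

```r
# Made-up covariance matrix with a negative off-diagonal element
vcov <- matrix(c( 4, -1,
                 -1,  9), nrow = 2, byrow = TRUE,
               dimnames = list(c("a", "b"), c("a", "b")))
param_ <- c("a", "b")

# Corrected order: extract the variances first, then take square roots
se <- sqrt(diag(vcov[param_, param_, drop = FALSE]))

# Original order: sqrt() is applied to the negative covariance and warns;
# the warning is suppressed here only to keep the example quiet
se_old <- suppressWarnings(diag(sqrt(vcov[param_, param_, drop = FALSE])))

all.equal(se, se_old)
```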
