rddapp's People

Contributors

felixthoemmes, kimberlywebb, papsti, wliao229, zejin, zwenyu

rddapp's Issues

mfrd_est

Currently mfrd_est has an option for tr (treatment), but this option is never used (because we don't implement a fuzzy MRDD frontier approach). Best to delete this option for now.

download plot buttons are broken for PC

These buttons mysteriously attempt to download files with the names simulate_power_chart_FMT where FMT is the requested format (e.g. .png) on PC, but this string never appears in the script simulate_power.R, where the button functions are coded!

mfrd_est

When boot is larger than 10, we often get an error message about an extremely bad integrand. We need to investigate what is happening.

est.cov and est.itt in frontier models

  • est.cov appears not to be working with frontier - notify user either in help file or output
  • est.itt appears not to be working with frontier - notify user

demo using CARE data

Should we create a demo of the software using the CARE data? I'm thinking the simplest would be a text demo (like a vignette in R, but just presented as a linked HTML document).

If we did this, I would remove the material related to the CARE data from the help manual and instead just link to the demo from the help manual where relevant (e.g. in the Data section).

Thoughts?

Vignette

Write vignette using CARE data

est.cov and est.itt in summary output

When est.cov (effect of RDD on covariate) is used, the estimates are just dumped into additional rows, without any labels.
Probably need a complete replicate of the full table.

This happens in mfrd_est and rd_est.

mrd_impute

mrd_impute needs stop.on.error just like mrd

ngrid default should be 250, not 2500

mfrd_est - arguments

mfrd_est has a cluster option, but by itself it never computes clustered standard errors (it only does bootstraps). Maybe remove this option? Would removing it break other parts?

mfrd_est non-parametric

mfrd_est frontier approach should have non-parametric version (see here: https://stats.stackexchange.com/questions/206476/how-to-choose-appropriate-bandwidth-for-kernel-regression/208071)

This describes using cross-validation to choose bandwidth (which seems feasible). We could implement cross-validation, then estimate the model with chosen bandwidth h, and then use the exact same code (that forms all these predicted values along the grid of the frontier) to estimate effects. Inferential stats would come through bootstrapping again. I would suggest to choose the bandwidth once via CV, and then bootstrap with a fixed bw.
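A minimal sketch of the idea above, on a one-dimensional toy problem rather than the actual frontier code: choose the bandwidth once by leave-one-out cross-validation, then bootstrap with that bandwidth held fixed. The helper names (nw_fit, loocv_error), the Gaussian kernel, the simulated data, and the bandwidth grid are all assumptions made for illustration; none of this is taken from mfrd_est.

```r
# Nadaraya-Watson estimate at points x0 with a Gaussian kernel and bandwidth h
nw_fit <- function(x, y, x0, h) {
  sapply(x0, function(p) {
    w <- dnorm((x - p) / h)
    sum(w * y) / sum(w)
  })
}

# Leave-one-out cross-validation error (mean squared prediction error)
loocv_error <- function(x, y, h) {
  pred <- sapply(seq_along(x), function(i) nw_fit(x[-i], y[-i], x[i], h))
  mean((y - pred)^2)
}

# Toy data (hypothetical, for illustration only)
set.seed(1)
x <- runif(200, -1, 1)
y <- sin(2 * x) + rnorm(200, sd = 0.3)

# Choose the bandwidth once via CV over a candidate grid
h_grid <- seq(0.05, 0.5, by = 0.05)
cv <- sapply(h_grid, function(h) loocv_error(x, y, h))
h_cv <- h_grid[which.min(cv)]

# Bootstrap the estimate at x0 = 0 with the bandwidth fixed at h_cv
boot_est <- replicate(200, {
  idx <- sample(seq_along(x), replace = TRUE)
  nw_fit(x[idx], y[idx], 0, h_cv)
})
se_boot <- sd(boot_est)
```

Holding h_cv fixed inside the bootstrap loop keeps each replicate cheap, at the cost of ignoring the variability that comes from selecting the bandwidth.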

plot.mfrd help file

  • plot.mfrd help file needs to specify what "m_s", "m_h", and "m_t" are
  • plot.mfrd help file needs to specify what granularity is

Errors when running an RDD with two assignment variables using CARE data

I'm trying to run an RDD with two assignment variables in the shiny app with the CARE data, so that I can refer to the interface while editing that part of the help manual, but I'm getting errors trying to set everything up.

I used the "Sample data" section of this doc to set the RDD up.

I first set the outcome variable to SBIQ48, the treatment receipt variable to DC_TRT, and the treatment design to MOMWAIS0 <= 85 and APGAR5 <= 3, as specified in the above doc. However, it turns out that the sample of the CARE data currently included in the shiny beta app does not include any observations with APGAR5 <= 3.

I increased the assignment threshold for APGAR5 to 8 (so APGAR5 <= 8 gets assigned to treatment), and while the tables in the Data tab all load properly, and I can check assumptions for the MOMWAIS0 variable, I get an error for figure 2.1 saying "intitialization failed" when I try to perform McCrary's sorting test on the APGAR5 variable. Moreover, all model estimation fails in the Estimates tab. I suspect there may be a problem using DC_TRT variable for treatment assignment in the case of this multiple assignment, and perhaps we'd need to dig up the old RDD2_TR column of data to get the RDD with two assignment variables to run.

I only bothered to write this up in detail since it also concerns the vignette/demo stuff, but for right now, I just need to be able to run an RDD with two assignment variables to edit the associated section in the help manual. Is there some quick fix you can think of?

Covariate in model should not show Cohen's d

If a covariate is added, the user can display the coefficient of that covariate. By default a Cohen's d is displayed.
This should be suppressed. It is likely not meaningful for the researcher, and importantly, it is also wrong, because it is currently estimated based on the regression coefficient, and not a treatment effect estimate.

have plot download buttons (png, svg, pdf) prompt user for plotting parameters

When the user clicks a plot download button like download to .png for instance, they should see a prompt that asks them to put in arguments for the plotting function,

png(filename = "Rplot%03d.png",
    width = 480, height = 480, units = "px", pointsize = 12,
    bg = "white", res = NA, ...,
    type = c("cairo", "cairo-png", "Xlib", "quartz"), antialias)

so that they can control plot size, aspect ratio, resolution, etc.
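One possible shape for this, sketched under assumptions: a small helper that takes the user-supplied size and resolution and opens the matching graphics device. The name save_plot and its arguments are hypothetical and do not come from the app's code. Width and height are taken in inches so that the same values work for png, svg, and pdf.

```r
# Hypothetical helper: write a plot to file using user-chosen parameters.
# plot_fun is a function that draws the plot when called.
save_plot <- function(plot_fun, file, format = c("png", "svg", "pdf"),
                      width = 7, height = 5, res = 150) {
  format <- match.arg(format)
  switch(format,
    png = grDevices::png(file, width = width, height = height,
                         units = "in", res = res),
    svg = grDevices::svg(file, width = width, height = height),
    pdf = grDevices::pdf(file, width = width, height = height)
  )
  # Close the device even if plotting fails
  on.exit(grDevices::dev.off(), add = TRUE)
  plot_fun()
  invisible(file)
}

# Usage: a 6 x 4 inch PDF of a simple scatterplot
out <- tempfile(fileext = ".pdf")
save_plot(function() plot(1:10), out, format = "pdf", width = 6, height = 4)
```

In the Shiny app, a helper like this could be called from the content function of a downloadHandler, with width, height, and res read from input fields shown in the prompt.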

display help page with table of contents

I'll start by writing some basic code that takes text from the master help page file Felix is writing (help_page_full.Rmd) and generates a page_help.R file that Shiny reads. I'd also like to eventually have the page_help.R file automatically re-generated when help_page_full.Rmd is updated... I'll have to think about how to do this.

mfrd_est - example code

The example code in mfrd_est does not make sense because y is not a function of a discontinuous x1, but only of x2.
Replace with "y <- 3 + 2 * (x1 >= 0) + 3 * cov + 10 * (x2 >= 0) + rnorm(1000)"
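A sketch of what the full simulated example could look like with the proposed outcome model. Only the formula for y comes from this issue; the distributions of x1, x2, and cov are assumptions, since the current example code is not shown here. The lm() call is just a quick check that both discontinuities are present in the data, not a substitute for mfrd_est.

```r
set.seed(123)
n <- 1000

# Assumed distributions for the assignment variables and the covariate
x1 <- runif(n, -1, 1)
x2 <- runif(n, -1, 1)
cov <- rnorm(n)

# Outcome with a jump of 2 at x1 = 0 and a jump of 10 at x2 = 0
y <- 3 + 2 * (x1 >= 0) + 3 * cov + 10 * (x2 >= 0) + rnorm(n)

# Sanity check: recover both jumps with a simple regression
d1 <- as.numeric(x1 >= 0)
d2 <- as.numeric(x2 >= 0)
fit <- lm(y ~ d1 + d2 + cov)
round(coef(fit), 2)
```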

later package

It seems that the "later" package is needed, but I don't see it listed as a dependency. Please investigate.

Timeout in dc_est

The timeout in dc_est should default to 30 seconds, but the user should be able to pass an argument that extends it (e.g., ..., timeout = 60). The argument should also accept NULL, meaning "run to completion". When the timeout is reached, the error message should tell the user to choose different bin and bw settings or to increase the timeout. The help file of dc_est should be updated as well.
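A rough sketch of how such a timeout could be wired up with base R's setTimeLimit(). The wrapper name run_with_timeout is hypothetical and this is not how dc_est is currently written. It matches on the English text of R's time-limit error, which is an assumption that may not hold in other locales, and the limit only triggers at points where R checks for interrupts.

```r
# Hypothetical wrapper: run fun() with an elapsed-time limit in seconds.
# timeout = NULL means "run to completion".
run_with_timeout <- function(fun, timeout = 30) {
  if (is.null(timeout)) {
    return(fun())
  }
  setTimeLimit(elapsed = timeout, transient = TRUE)
  # Always lift the limit again when leaving this function
  on.exit(setTimeLimit(elapsed = Inf), add = TRUE)
  tryCatch(
    fun(),
    error = function(e) {
      # Assumes the English message produced by R's time-limit check
      if (grepl("reached elapsed time limit", conditionMessage(e))) {
        stop("Timed out after ", timeout, " seconds. ",
             "Choose different bin and bw settings, or increase timeout.",
             call. = FALSE)
      }
      stop(e)
    }
  )
}

# Usage: a fast call finishes normally; NULL disables the limit
run_with_timeout(function() sum(1:10), timeout = 5)
run_with_timeout(function() sum(1:10), timeout = NULL)
```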

mfrd_est - helpfile

Help file says boot is an option to obtain "standard deviation" of estimates - I think this should say "standard error"

Missing CARE data description from help_page.Rmd

After May 2, the CARE data description (and other edits) conspicuously disappeared from help_page.Rmd; I suspect that's because an older version of the file was being worked on after that point.

There's no real issue with this since we decided to migrate the CARE data description from the help page to the vignette anyway. I just wanted to note here for @felixthoemmes that everything he already wrote for the CARE data (and that I edited) can be found in this version of help_page.Rmd so that it can be copy+pasted into the vignette doc when the time comes.

Correct expression for the Gaussian kernel in the help manual?

I don't think I understand the typesetting of the Gaussian kernel in help_page.Rmd, as written:

Gaussian: $K(u) = 1/\sqrt(2/\pi^-1/2u^2)$ .

I'm specifically confused by what's supposed to be in the exponent on pi (\pi^)... If you reply with a quick snap of the handwritten equation (or a screenshot of it typeset somewhere), I can typeset it properly in help_page.Rmd. Thanks!
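For reference, the standard Gaussian kernel (the density of a standard normal) is shown below. Whether this is exactly what the help page intends is an assumption, but it is the usual definition, and the stray exponent on pi looks like the exponential term having slipped into the square root.

```latex
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}u^2\right)$$
```

In the inline style used in help_page.Rmd this would read: Gaussian: $K(u) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}u^2}$.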

power function

It seems that the power estimation relied on a default for t.design (or maybe even other defaults).
When this was changed in the rd_est or mfrd_est, the power function does not seem to work anymore.
The assumption is that the power function relied on a default, but the power function should rely on the actual user input of the treatment assignment.

mrd_power output

In the mrd_power output, it is unclear which row corresponds to which effect. The rows should be labeled.

Alternatively, have a summary function for it.

column labels mrd_power rd_power

Column labels in the output object of mrd_power and rd_power should be improved.

Alternatively, maybe have a summary function for these two objects.

download svg / jpeg

Currently all plots can only be downloaded as PDFs. Users should be able to also download SVG or JPEGs - thoughts on which formats would be best?

Labels in "Parametric Model for Outcome" panel of Power page are unintuitive

Some parameter labels in this panel are a little unintuitive, especially for the RDD with two assignment variables.

First, the 1D case should be clarified. For example, "Treatment" should perhaps be replaced by "Treatment effect". Then the 2D case should be made consistent with the clarified 1D labels (e.g. "Treatment effect (T1)"). It might also be good to put "Slope" and "Intercept" under a new heading called "Regression" so it's clear what the slope and intercept refer to.

Lastly, it should also perhaps be clarified whether "Slope" and "Intercept" pertain to the regression on the treatment data or the control data (i.e. to either side of the threshold). This may be obvious in context, but as an outsider just learning about RDD, it is hard for me to infer from the current interface. I also realize that the last point might be better clarified in something like the vignette than in the actual interface, to avoid making the GUI too clunky. Just jotting it down here to remind myself of this suggestion next term!

Somewhat related: why does the "Treatment" field persist in this panel when switching to an RDD with two assignment variables when, in the two var case, there are two new fields called "Treatment (T1)" and "Treatment (T2)", which are ostensibly the treatment effects for each assignment variable... In the case of a two var RDD, what does the old "Treatment" field specify?

rd_type example

rd_type has no example in help file, add one that is based on examples in rd_est

est.itt

est.itt is correctly displayed, but the user should be made aware in the output that the ITT estimate is shown, and not the one that is based on actual fuzzy assignment.
Maybe an additional heading is needed.

This affects both rd_est and mfrd_est.

mfrd_est - ngrid default

The ngrid default seems rather large (2500). Maybe change the default, or check the paper by Wong et al. to see why 2500 was chosen. A value of 2500 makes this very slow; with fewer grid points, bootstrapping is actually pretty fast.

ITT estimate has opposite sign of complier average

In fuzzy RDDs, both the ITT and the complier average treatment effect are estimated.
At least in the CARE dataset, I noticed that the sign switches. I assume that in the ITT case, the program simply returns the regression coefficient, and does not do the internal coding of what is treatment and what is control, and just relies on the standard factor coding.

Wenyu, can you look into this? If this gets confusing, I believe Wang should be able to point out to you quickly where in the code this happens.

dc_test

  • output from dc_test needs to be cleaned, formatted like summary function
  • dc_test default should be verbose=TRUE
  • plot from dc_test needs labels
  • error message when bin is too large needs to be improved
  • hangs when bin is too small - maybe potential fix for this

mfrd_est - t.design default

t.design is too important to have a default: the risk of users fitting the wrong model is too high. Change the default to NULL and return a proper error message when it is not specified.
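A small sketch of the kind of check this would need, assuming t.design is a character vector of length 2 with entries from "g", "geq", "l", and "leq". The helper name check_t_design and the set of allowed values are assumptions for illustration and should be checked against the actual mfrd_est code.

```r
# Hypothetical validation helper for a t.design argument defaulting to NULL
check_t_design <- function(t.design = NULL) {
  allowed <- c("g", "geq", "l", "leq")  # assumed option names
  if (is.null(t.design)) {
    stop("t.design must be specified: a character vector of length 2 ",
         "with entries from ", paste(allowed, collapse = ", "), ".",
         call. = FALSE)
  }
  if (length(t.design) != 2 || !all(t.design %in% allowed)) {
    stop("t.design must have length 2 with entries from ",
         paste(allowed, collapse = ", "), ".", call. = FALSE)
  }
  invisible(t.design)
}

# Usage: a valid specification passes silently
check_t_design(c("leq", "leq"))
```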
