neuroshepherd / ordinalsimr

R package and Shiny app for simulating ordinal endpoint results

Home Page: https://neuroshepherd.github.io/ordinalsimr/
License: Other
Inputs are likely to be repeated many times over, either in a Shiny session or when using the functions directly. For a performance improvement, consider using {memoise}.
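A minimal sketch of the {memoise} idea, using a hypothetical stand-in function (the package's own functions, e.g. the simulation runner, could be wrapped the same way). The counter just demonstrates that repeated calls hit the cache:

```r
library(memoise)

# Hypothetical expensive helper; names are illustrative, not the package's code
calls <- 0
slow_square <- function(x) { calls <<- calls + 1; x^2 }
fast_square <- memoise::memoise(slow_square)

fast_square(4)  # computes and caches the result
fast_square(4)  # served from the cache; slow_square() is not called again
```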
CRAN (and general good practice) discourages code that automatically and directly saves output to users' folders, unless the user has specified this by entering an outdest with e.g. a file path or GUI. Saving to a scratch/temporary folder is a possibility, but users will presumably want to save to a more permanent location.
Related to #15
Allow either a function or set of seed values that users can input. In either case, this vector of custom seeds would need to be the same length as the number of trials.
Unclear if this is a good idea or even necessary.
Can be accomplished via Zenodo
https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content
(This is a doozy of a problem.)
In short, users may want to supply their own ordinal tests for use in the app. I think the best way to accommodate this is to have a text entry box that is just a container for a list (i.e. enter functions R list-style).
Current code employs a variety of statistical tests but relies entirely on their default arguments. This is acceptable for the MVP, but it will eventually be necessary to open these up for user access. This will consist of two parts: the ordinal_tests() function is comprised of wrapper functions for each of the tests in use (or possibly some other approach); the main idea is that function arguments need to be accessible in a way that does not pass arguments to the wrong function.

Creating an .rds file containing a set of sample parameters to run within the Shiny app (or outside of it in the stats functions), plus a button to activate a "demo mode", could be useful for instructing users on how to use the app.
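One way to sketch the argument-routing idea above: one named argument list per test, so user-supplied arguments can only reach the test they name. All names here are illustrative, not the package's actual interface:

```r
# Sketch: route user-supplied arguments to the correct test without leaking
# them into the others. Function and argument names are assumptions.
ordinal_tests <- function(outcome, group, test_args = list()) {
  defaults <- list(
    wilcox = list(correct = TRUE),
    chisq  = list(correct = TRUE)
  )
  # User-supplied arguments override the defaults for the matching test only
  args <- utils::modifyList(defaults, test_args)
  list(
    wilcox = do.call(stats::wilcox.test,
                     c(list(x = outcome[group == 0], y = outcome[group == 1]),
                       args$wilcox)),
    chisq  = do.call(stats::chisq.test,
                     c(list(x = table(outcome, group)), args$chisq))
  )
}

set.seed(42)
outcome <- sample(1:4, 200, replace = TRUE)
group   <- rbinom(200, 1, 0.5)
res <- suppressWarnings(
  ordinal_tests(outcome, group, test_args = list(chisq = list(correct = FALSE)))
)
```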
Currently, run_simulations() checks that the probabilities for each group sum to 1. However, the check tests for exact equality without accounting for floating point representation, which produces an error in even simple scenarios such as the one below. Use a more robust equivalence check such as dplyr::near(), or just allow for typical machine tolerance.
prob0 = c(.1,.2,.3,.4)
prob1 = c(.4,.3,.2,.1)
sum(prob0)
#> [1] 1
sum(prob0) == 1
#> [1] TRUE
sum(prob1)
#> [1] 1
sum(prob1) == 1
#> [1] FALSE
Created on 2023-06-24 with reprex v2.0.2
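For reference, the base-R alternatives below avoid a new dependency (dplyr::near() would work equivalently):

```r
prob1 <- c(.4, .3, .2, .1)

sum(prob1) == 1                                # FALSE: exact comparison fails

# Tolerance-based checks that pass
isTRUE(all.equal(sum(prob1), 1))               # TRUE (base R, no extra dependency)
abs(sum(prob1) - 1) < .Machine$double.eps^0.5  # TRUE (explicit machine tolerance)
```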
See Morris et al. 2019 for details on this issue. In short, successive seeds will yield correlated results across repetitions thus yielding biased results.
This can help users keep track of which probabilities align with which labels for an ordinal outcome. Current table only has two columns for entering probabilities. Example:
| Label | Null Prob | Int. Prob |
|---|---|---|
| Very Likely | .25 | 0 |
| Likely | .25 | .33 |
| Unlikely | .25 | .33 |
| Very Unlikely | .25 | .33 |
Writing to and reading from an .RData file is faster than a .csv, but the latter is more flexible and broadly portable. Potentially allow the user to choose between these formats, although consider limiting the ability to use .csv after a user reaches a certain number of rows, due to likely performance issues with e.g. Excel.
In practice, very few trials end up with perfectly balanced sample sizes by the end of a trial due to chance, adverse side effects, etc. The stats functions and app will need to be able to handle such unequal groupings; update the assign_groups() and ordinal_tests() functions to accept unequal N.

Create a module that allows a user to save an .RData file with results and input parameters to a location of their choice on their system. Superordinate to #10.
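A sketch of allocation that naturally produces unequal group sizes; the function name mirrors assign_groups() but is an illustrative stand-in, not the package's code:

```r
# Sketch: independent per-patient randomization, so group counts drift from
# an exact split by chance. Names are assumptions, not the package's API.
assign_groups_unequal <- function(n_total, p_treat = 0.5) {
  rbinom(n_total, size = 1, prob = p_treat)
}

set.seed(1)
grp <- assign_groups_unequal(235)
table(grp)  # the two groups will rarely be an exact 117/118 split
```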
Some (which?) of the currently used stats functions fail or produce warnings when any of the outcome categories has a zero probability. I will likely set up tryCatch() calls to prevent total failures, but there should also be a check or notification on data entry so that users are aware some results may be NA all the way through.
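A minimal sketch of the tryCatch() idea: wrap each test so a hard failure yields NA rather than aborting the whole run (wrapper name is illustrative):

```r
# Sketch: convert a failing test into NA instead of an error
safe_p_value <- function(expr) {
  tryCatch(expr$p.value, error = function(e) NA_real_)
}

safe_p_value(stats::wilcox.test(numeric(0)))   # errors internally, returns NA
safe_p_value(stats::wilcox.test(1:10, 11:20))  # ordinary p-value
```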
On even "normal" sized runs (e.g. 1000 iterations of 235 patients allocated 50:50), fisher.test() can run out of allocated workspace because I am not simulating a p-value for this test. The default argument is workspace = 200000, and I have temporarily increased this to 2e7. However, this should be a user-accessible parameter, ideally in the Shiny app, in case people still hit workspace limits.
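Both knobs in one place, on a small example table (the table itself is made up). Note that simulate.p.value trades the exact answer for a Monte Carlo one, which is why it has been avoided so far:

```r
m <- matrix(c(10, 20, 30, 15, 25, 35), nrow = 2, byrow = TRUE)

# Exact test with an enlarged workspace for the network algorithm
ft_exact <- stats::fisher.test(m, workspace = 2e7)

# Alternative that sidesteps workspace limits entirely: simulated p-value
ft_sim <- stats::fisher.test(m, simulate.p.value = TRUE, B = 1e4)
```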
Tests such as Wilcox and Chi-Squared often produce warnings due to e.g. computing p-values with ties and p-value approximations. It would be useful to capture these warnings when they appear at least once, or create an index of which iterations encountered which warnings.
Would need to clarify the cases in which this feature is actually useful before implementing any solution.
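If implemented, one base-R approach is withCallingHandlers(), which records each warning without halting the computation (wrapper name is illustrative):

```r
# Sketch: run one iteration and collect any warnings it raises
run_with_warnings <- function(expr) {
  warnings_seen <- character(0)
  result <- withCallingHandlers(
    expr,
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(result = result, warnings = warnings_seen)
}

# Ties trigger the "cannot compute exact p-value" warning in wilcox.test()
out <- run_with_warnings(stats::wilcox.test(c(1, 2, 2, 3), c(2, 3, 3, 4)))
out$warnings
```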
Create an option within the data input table to jitter values such that all probabilities are non-zero positive decimals
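A sketch of the jitter option: nudge zeros to a small positive value and renormalize so the vector still sums to 1. The epsilon choice and function name are assumptions:

```r
# Sketch: make all probabilities strictly positive while preserving sum == 1
jitter_probs <- function(p, eps = 1e-4) {
  p[p == 0] <- eps
  p / sum(p)
}

jitter_probs(c(0, .33, .33, .34))
```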
Tests of the self-developed functions should be created to confirm that outputs stay consistent as the project develops, and/or that updates to functions come with commensurate updates in the tests.
See #15 for comments on implementation, and using external package for other RNG kinds
Should include a minimal README file describing how to download the package from GitHub, what this project accomplishes, and, in the long run, add badges for e.g. CRAN, code coverage, etc.
There may be need/desire for people to re-upload the parameters from the saved .RData file, and run new or derivative analyses from this data. If this seems to be necessary, it would make sense to save the .RData with a class or e.g. as an S3 object that would be identified by a data upload module.
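A sketch of the S3 tagging idea; the class name and list structure are illustrative, not the package's actual format:

```r
# Sketch: tag saved results with an S3 class so an upload module can recognize
# and validate them on re-import. Names here are assumptions.
results <- structure(
  list(p_values = runif(10), inputs = list(n = 235, iterations = 1000)),
  class = "ordinalsimr_results"
)

# An upload module could then dispatch on the class
is_ordinalsimr_results <- function(x) inherits(x, "ordinalsimr_results")
```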
I should probably use stats::qt() with degrees of freedom equal to the sample size (minus the number of groups K?) rather than assuming normality with stats::qnorm().
Use binom.test() for calculating CIs, as it implements the Newcombe adjustment.
Code that finally helped me figure out how to move reactive data from the probability input section to other parts of the app. Three key parts:
This now needs to be repeated for other data input modules, and passed through to the stats functions.
See NeuroShepherd/SyntheticParameters/pass-data
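A generic sketch of the pattern for moving reactive data between modules: the input module returns the reactive itself (not its value), and the consumer calls it. All module, input, and output names are illustrative, not the package's actual code:

```r
library(shiny)

# Sketch: an input module that returns its parsed probabilities as a reactive
probability_input_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    reactive({
      req(input$prob_text)
      as.numeric(strsplit(input$prob_text, ",")[[1]])
    })
  })
}

# A downstream module receives that reactive and calls it when needed, e.g.
#   probs <- probability_input_server("entry")
#   stats_server("stats", probs)
stats_server <- function(id, probs) {
  moduleServer(id, function(input, output, session) {
    output$prob_sum <- renderText(sum(probs()))
  })
}
```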
Create a home page so that some app instructions are readily available to users.
Can try to do this as an (R)Markdown file that is included according to https://shiny.posit.co/r/gallery/application-layout/including-html-text-and-markdown-files/. Need to figure out the proper location for such a file; see chapters 7 and 8 of R Packages (2e) for potential uses of system.file().

(It seems the devtools::load_all() shim should work with this, per r-lib/devtools#179, which means the ShinyApps.io deployment will still work fine too.)
Will require a number of updates
Stick to using no-space naming, and try to find a good name that is all lowercase (because R is case-sensitive)
A preliminary estimate of how long it will take to complete calculations could be a useful tool to include as a validation step; print to console and/or show the results in a modal on the Shiny app
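One simple way to produce such an estimate: time a small pilot batch and extrapolate linearly. The helper below is a sketch with illustrative names, and the stand-in computation is not the package's actual per-trial work:

```r
# Sketch: estimate total runtime by timing n_pilot iterations
estimate_runtime <- function(run_once, n_total, n_pilot = 10) {
  elapsed <- system.time(for (i in seq_len(n_pilot)) run_once())["elapsed"]
  unname(elapsed / n_pilot * n_total)  # seconds, assuming linear scaling
}

# e.g. estimated seconds for 1000 iterations of a cheap stand-in computation
est <- estimate_runtime(function() stats::wilcox.test(rnorm(50), rnorm(50)), 1000)
```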
Similar to #12, but rather than relying on a preemptive estimate/calculation, the progress bar provides real-time status updates on progress of the calculations in the Shiny app. However, this is generally not compatible with Shiny as each worker spawns a child-process that does not communicate with the parent process until completion
See below for possible workarounds on this issue:
See also the {ipc} package, which explicitly states it can be used for creating progress bars.
According to the {golem} README.Rmd template:

> You'll still need to render README.Rmd regularly, to keep README.md up-to-date. devtools::build_readme() is handy for this. You could also use GitHub Actions to re-render README.Rmd every time you push. An example workflow can be found here: https://github.com/r-lib/actions/tree/v1/examples.
There are some packages that are currently fully reexported via the NAMESPACE, but this is bad practice. Explicitly declare imported functions using importFrom tags.
The current default random seeding approach is to set the seed for each run equal to the run number, e.g. run number 5 includes set.seed(5). This should be clearly documented in at least two places IMO:
This page will allow someone to manually enter outcome probabilities for the control and trial groups.

Probabilities entry: use near(x, 1) for any machine-level funkiness on the sum-to-1 check.

Trial size and N simulations entry:
(Find info about how to properly do this for R releases, GitHub tags/labels, etc.)
The stats procedures are all independent from one another, so an option to parallelize them with e.g. {parallel}, {future}, or {furrr} should show a performance improvement for larger numbers of trials.
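A base-{parallel} sketch of the idea. run_one() is a stand-in for one simulated trial, not the package's code, and it reuses the run-number seeding convention noted elsewhere in these issues:

```r
library(parallel)

# Sketch: each iteration is independent, so the loop parallelizes directly
run_one <- function(i) {
  set.seed(i)  # mirrors the run-number seeding convention
  outcome <- sample(1:4, 100, replace = TRUE, prob = c(.1, .2, .3, .4))
  group   <- factor(rbinom(100, 1, 0.5))
  suppressWarnings(stats::wilcox.test(outcome ~ group)$p.value)
}

cl <- makeCluster(2)
p_values <- unlist(parLapply(cl, 1:100, run_one))
stopCluster(cl)
```

{future}/{furrr} would offer the same pattern with a Shiny-friendlier backend selection.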
At minimum, these should document how to use the Shiny app, and how to use the app's underlying functions outside of the app.
Request per ALB: round the DT p-value tables to 4 or 5 decimal places so the display looks nicer
The initial version of this app will only be set up to accept manually entered ordinal-outcome information because this is (A) simpler and (B) not a ton of work for a user (although it admittedly reduces the usability of the app). I see two potential use cases/scenarios for which user-uploaded data would be requested, however: