neuroshepherd / ordinalsimr

R package and Shiny app for simulating ordinal endpoint results

Home Page: https://neuroshepherd.github.io/ordinalsimr/
License: Other
Inputs are likely to be repeated many times over, either in a Shiny session or when using the functions directly. For a performance improvement, consider using {memoise}.
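A minimal sketch of the {memoise} idea, using a hypothetical stand-in function (the package's own functions, e.g. the simulation runner, could be wrapped the same way). The counter just demonstrates that repeated calls hit the cache:

```r
library(memoise)

# Hypothetical expensive helper; names are illustrative, not the package's code
calls <- 0
slow_square <- function(x) { calls <<- calls + 1; x^2 }
fast_square <- memoise::memoise(slow_square)

fast_square(4)  # computes and caches the result
fast_square(4)  # served from the cache; slow_square() is not called again
```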
CRAN (and general good practice) discourages code that automatically and directly saves output to users' folders, unless the user has specified this by entering an outdest with e.g. a file path or GUI. Saving to a scratch/temporary folder is a possibility, but users will presumably want to save to a more permanent location.
Related to #15
Allow either a function or set of seed values that users can input. In either case, this vector of custom seeds would need to be the same length as the number of trials.
Unclear if this is a good idea or even necessary.
Can be accomplished via Zenodo
https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content
(This is a doozy of a problem.)
In short, users may want to supply their own ordinal tests for use in the app. I think the best way to accommodate this is to have a text entry box that is just a container for a list (i.e. enter functions R list-style).
Current code employs a variety of statistical tests but relies entirely on their default arguments. This is acceptable for the MVP, but it will eventually be necessary to open these up for user access. This will consist of two parts: the ordinal_tests() function is comprised of wrapper functions for each of the tests in use (or possibly some other approach); the main idea is that function arguments need to be accessible in a way that does not pass arguments to the wrong function.

Creating an .rds file containing a set of sample parameters to run within the Shiny app (or outside of it in the stats functions), plus a button to activate a "demo mode", could be useful for instructing users on how to use the app.
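One way to sketch the argument-routing idea above: one named argument list per test, so user-supplied arguments can only reach the test they name. All names here are illustrative, not the package's actual interface:

```r
# Sketch: route user-supplied arguments to the correct test without leaking
# them into the others. Function and argument names are assumptions.
ordinal_tests <- function(outcome, group, test_args = list()) {
  defaults <- list(
    wilcox = list(correct = TRUE),
    chisq  = list(correct = TRUE)
  )
  # User-supplied arguments override the defaults for the matching test only
  args <- utils::modifyList(defaults, test_args)
  list(
    wilcox = do.call(stats::wilcox.test,
                     c(list(x = outcome[group == 0], y = outcome[group == 1]),
                       args$wilcox)),
    chisq  = do.call(stats::chisq.test,
                     c(list(x = table(outcome, group)), args$chisq))
  )
}

set.seed(42)
outcome <- sample(1:4, 200, replace = TRUE)
group   <- rbinom(200, 1, 0.5)
res <- suppressWarnings(
  ordinal_tests(outcome, group, test_args = list(chisq = list(correct = FALSE)))
)
```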
Currently, run_simulations() checks that the probabilities for each group sum to 1. However, the check tests for exact equality without accounting for floating point representation, which produces an error in even simple scenarios such as the one below. Use a more robust equivalence check such as dplyr::near(), or just allow for typical machine tolerance.
prob0 = c(.1,.2,.3,.4)
prob1 = c(.4,.3,.2,.1)
sum(prob0)
#> [1] 1
sum(prob0) == 1
#> [1] TRUE
sum(prob1)
#> [1] 1
sum(prob1) == 1
#> [1] FALSE
Created on 2023-06-24 with reprex v2.0.2
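For reference, the base-R alternatives below avoid a new dependency (dplyr::near() would work equivalently):

```r
prob1 <- c(.4, .3, .2, .1)

sum(prob1) == 1                                # FALSE: exact comparison fails

# Tolerance-based checks that pass
isTRUE(all.equal(sum(prob1), 1))               # TRUE (base R, no extra dependency)
abs(sum(prob1) - 1) < .Machine$double.eps^0.5  # TRUE (explicit machine tolerance)
```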
See Morris et al. 2019 for details on this issue. In short, successive seeds will yield correlated results across repetitions thus yielding biased results.
This can help users keep track of which probabilities align with which labels for an ordinal outcome. Current table only has two columns for entering probabilities. Example:
| Label | Null Prob | Int. Prob |
|---|---|---|
| Very Likely | .25 | 0 |
| Likely | .25 | .33 |
| Unlikely | .25 | .33 |
| Very Unlikely | .25 | .33 |
Writing to and reading from an .RData file is faster than a .csv, but the latter is more flexible and broadly portable. Potentially allow the user to choose between these formats, although consider limiting the ability to use .csv after a user reaches a certain number of rows, due to likely performance issues with e.g. Excel.
In practice, very few trials end up with perfectly balanced sample sizes by the end of a trial due to chance, adverse side effects, etc. The stats functions and app will need to be able to handle such unequal groupings; update the assign_groups() and ordinal_tests() functions to accept unequal N.

Create a module that allows a user to save an .RData file with results and input parameters to a location of their choice on their system. Superordinate to #10.
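A sketch of allocation that naturally produces unequal group sizes; the function name mirrors assign_groups() but is an illustrative stand-in, not the package's code:

```r
# Sketch: independent per-patient randomization, so group counts drift from
# an exact split by chance. Names are assumptions, not the package's API.
assign_groups_unequal <- function(n_total, p_treat = 0.5) {
  rbinom(n_total, size = 1, prob = p_treat)
}

set.seed(1)
grp <- assign_groups_unequal(235)
table(grp)  # the two groups will rarely be an exact 117/118 split
```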
Some (which?) of the currently used stats functions fail or produce warnings when any of the outcome categories has a zero probability. I will likely set up tryCatch() calls to prevent total failures, but there should also be a check or notification on data entry so that users are aware some results may be NA all the way through.
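A minimal sketch of the tryCatch() idea: wrap each test so a hard failure yields NA rather than aborting the whole run (wrapper name is illustrative):

```r
# Sketch: convert a failing test into NA instead of an error
safe_p_value <- function(expr) {
  tryCatch(expr$p.value, error = function(e) NA_real_)
}

safe_p_value(stats::wilcox.test(numeric(0)))   # errors internally, returns NA
safe_p_value(stats::wilcox.test(1:10, 11:20))  # ordinary p-value
```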
On even "normal" sized runs (e.g. 1000 iterations of 235 patients allocated 50:50), fisher.test() can run out of allocated workspace because I am not simulating a p-value for this test. The default argument is workspace = 200000, and I have temporarily increased this to 2e7. However, this should be a user-accessible parameter, ideally in the Shiny app, in case people still hit workspace limits.
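Both knobs in one place, on a small example table (the table itself is made up). Note that simulate.p.value trades the exact answer for a Monte Carlo one, which is why it has been avoided so far:

```r
m <- matrix(c(10, 20, 30, 15, 25, 35), nrow = 2, byrow = TRUE)

# Exact test with an enlarged workspace for the network algorithm
ft_exact <- stats::fisher.test(m, workspace = 2e7)

# Alternative that sidesteps workspace limits entirely: simulated p-value
ft_sim <- stats::fisher.test(m, simulate.p.value = TRUE, B = 1e4)
```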
Tests such as Wilcox and Chi-Squared often produce warnings due to e.g. computing p-values with ties and p-value approximations. It would be useful to capture these warnings when they appear at least once, or create an index of which iterations encountered which warnings.
Would need to clarify the cases in which this feature is actually useful before implementing any solution.
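If implemented, one base-R approach is withCallingHandlers(), which records each warning without halting the computation (wrapper name is illustrative):

```r
# Sketch: run one iteration and collect any warnings it raises
run_with_warnings <- function(expr) {
  warnings_seen <- character(0)
  result <- withCallingHandlers(
    expr,
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(result = result, warnings = warnings_seen)
}

# Ties trigger the "cannot compute exact p-value" warning in wilcox.test()
out <- run_with_warnings(stats::wilcox.test(c(1, 2, 2, 3), c(2, 3, 3, 4)))
out$warnings
```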
Create an option within the data input table to jitter values such that all probabilities are non-zero positive decimals
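A sketch of the jitter option: nudge zeros to a small positive value and renormalize so the vector still sums to 1. The epsilon choice and function name are assumptions:

```r
# Sketch: make all probabilities strictly positive while preserving sum == 1
jitter_probs <- function(p, eps = 1e-4) {
  p[p == 0] <- eps
  p / sum(p)
}

jitter_probs(c(0, .33, .33, .34))
```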
Tests of the self-developed functions should be created to confirm that outputs stay consistent as the project develops, and/or that updates to functions come with commensurate updates in the tests.
See #15 for comments on implementation, and using external package for other RNG kinds
Should include a minimal README file describing how to download the package from GitHub, what this project accomplishes, and, in the long run, add badges for e.g. CRAN, code coverage, etc.
There may be need/desire for people to re-upload the parameters from the saved .RData file, and run new or derivative analyses from this data. If this seems to be necessary, it would make sense to save the .RData with a class or e.g. as an S3 object that would be identified by a data upload module.
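A sketch of the S3 tagging idea; the class name and list structure are illustrative, not the package's actual format:

```r
# Sketch: tag saved results with an S3 class so an upload module can recognize
# and validate them on re-import. Names here are assumptions.
results <- structure(
  list(p_values = runif(10), inputs = list(n = 235, iterations = 1000)),
  class = "ordinalsimr_results"
)

# An upload module could then dispatch on the class
is_ordinalsimr_results <- function(x) inherits(x, "ordinalsimr_results")
```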
I should probably use stats::qt() with degrees of freedom equal to the sample size (minus the number of groups K?) rather than assuming normality with stats::qnorm().
Use binom.test() for calculating CIs, as it implements the Newcombe adjustment.
Code that finally helped me figure out how to move reactive data from the probability input section to other parts of the app. Three key parts:
This now needs to be repeated for other data input modules, and passed through to the stats functions.
See NeuroShepherd/SyntheticParameters/pass-data
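A generic sketch of the pattern for moving reactive data between modules: the input module returns the reactive itself (not its value), and the consumer calls it. All module, input, and output names are illustrative, not the package's actual code:

```r
library(shiny)

# Sketch: an input module that returns its parsed probabilities as a reactive
probability_input_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    reactive({
      req(input$prob_text)
      as.numeric(strsplit(input$prob_text, ",")[[1]])
    })
  })
}

# A downstream module receives that reactive and calls it when needed, e.g.
#   probs <- probability_input_server("entry")
#   stats_server("stats", probs)
stats_server <- function(id, probs) {
  moduleServer(id, function(input, output, session) {
    output$prob_sum <- renderText(sum(probs()))
  })
}
```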
Create a home page so that some app instructions are readily available to users.
Can try to do this as an (R)Markdown file that is included according to https://shiny.posit.co/r/gallery/application-layout/including-html-text-and-markdown-files/. Need to figure out the proper location for such a file; see chapters 7 and 8 of R Packages (2e) for potential uses of system.file().

(It seems the devtools::load_all() shim should work with this, per r-lib/devtools#179, which means the ShinyApps.io deployment will still work fine too.)
Will require a number of updates
Stick to using no-space naming, and try to find a good name that is all lowercase (because R is case-sensitive)
A preliminary estimate of how long it will take to complete calculations could be a useful tool to include as a validation step; print to console and/or show the results in a modal on the Shiny app
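One simple way to produce such an estimate: time a small pilot batch and extrapolate linearly. The helper below is a sketch with illustrative names, and the stand-in computation is not the package's actual per-trial work:

```r
# Sketch: estimate total runtime by timing n_pilot iterations
estimate_runtime <- function(run_once, n_total, n_pilot = 10) {
  elapsed <- system.time(for (i in seq_len(n_pilot)) run_once())["elapsed"]
  unname(elapsed / n_pilot * n_total)  # seconds, assuming linear scaling
}

# e.g. estimated seconds for 1000 iterations of a cheap stand-in computation
est <- estimate_runtime(function() stats::wilcox.test(rnorm(50), rnorm(50)), 1000)
```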
Similar to #12, but rather than relying on a preemptive estimate/calculation, the progress bar provides real-time status updates on progress of the calculations in the Shiny app. However, this is generally not compatible with Shiny as each worker spawns a child-process that does not communicate with the parent process until completion
See below for possible workarounds on this issue:
See also the {ipc} package, which explicitly states it can be used for creating progress bars.
According to the {golem} README.Rmd template:

> You'll still need to render README.Rmd regularly, to keep README.md up-to-date. devtools::build_readme() is handy for this. You could also use GitHub Actions to re-render README.Rmd every time you push. An example workflow can be found here: https://github.com/r-lib/actions/tree/v1/examples.
There are some packages that are currently fully reexported via the NAMESPACE, but this is bad practice. Explicitly declare imported functions using importFrom tags.
The current default random seeding approach is to set the seed for each run equal to the run number, e.g. run number 5 includes set.seed(5). This should be clearly documented in at least two places IMO:
This page will allow someone to manually enter outcome probabilities for the control and trial groups.

Probabilities entry: use near(x, 1) for any machine-level funkiness on the sum-to-1 check.

Trial size and N simulations entry:
(Find info about how to properly do this for R releases, GitHub tags/labels, etc.)
The stats procedures are all independent from one another, so an option to parallelize them with e.g. {parallel}, {future}, or {furrr} should show a performance improvement for larger numbers of trials.
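A base-{parallel} sketch of the idea. run_one() is a stand-in for one simulated trial, not the package's code, and it reuses the run-number seeding convention noted elsewhere in these issues:

```r
library(parallel)

# Sketch: each iteration is independent, so the loop parallelizes directly
run_one <- function(i) {
  set.seed(i)  # mirrors the run-number seeding convention
  outcome <- sample(1:4, 100, replace = TRUE, prob = c(.1, .2, .3, .4))
  group   <- factor(rbinom(100, 1, 0.5))
  suppressWarnings(stats::wilcox.test(outcome ~ group)$p.value)
}

cl <- makeCluster(2)
p_values <- unlist(parLapply(cl, 1:100, run_one))
stopCluster(cl)
```

{future}/{furrr} would offer the same pattern with a Shiny-friendlier backend selection.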
At minimum, these should document how to use the Shiny app, and how to use the app's underlying functions outside of the app.
Request per ALB: round the DT p-value tables to 4 or 5 decimal places so the display looks nicer
The initial version of this app will only be set up to accept manually entered ordinal-outcome information because this is (A) simpler and (B) not a ton of work for a user (although it admittedly reduces the usability of the app). I see two potential use cases/scenarios for which user-uploaded data would be requested, however: