MSToolkit v3
Home Page: http://mikeksmith.github.io/MSToolkit
License: Other
Other simulation engines (mrgsolve, simulx, RxODE, PKPDsim) are very good at calculating outcomes for arbitrarily complex models, e.g. those defined by ODEs. We should be able to hook into these from MSToolkit.
Much of the code in examples and one or two test cases writes to and reads from .csv files via generateData, analyzeData, etc. We should use functionality in the {withr} package to manage these files and clean up after use.
It might also be an option for setEctdDataMethod to use temp files (via {withr}) to store results, as opposed to keeping them in the working directory or in internal memory.
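A minimal sketch of how {withr} could scope the written files, assuming generateData()/analyzeData() keep writing into the current working directory (the argument values and the "ReplicateData" folder name below are illustrative, not a confirmed API):

```r
library(withr)
library(testthat)

# with_tempdir() runs the block inside a throwaway directory and deletes
# it (and everything written into it) when the block exits.
with_tempdir({
  generateData(replicateN = 2, subjects = 10, treatDoses = c(0, 25))
  analyzeData(analysisCode = myAnalysisCode)
})

# Inside a testthat test, local_dir(local_tempdir()) ties the working
# directory and its cleanup to the test's own scope.
test_that("generateData writes replicate files", {
  local_dir(local_tempdir())
  generateData(replicateN = 1, subjects = 10, treatDoses = c(0, 25))
  expect_gt(length(list.files("ReplicateData")), 0)
})
```

The same pattern would cover the test cases mentioned above, so no test leaves files behind in the working directory.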
Lines 224, 274, 305 of analyzeData use utils::setTxtProgressBar(pb, i) to provide feedback on progress to the user. Perhaps there are newer tools in the {cli} package that might be better, e.g. https://cli.r-lib.org/articles/progress.html
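For comparison, a minimal sketch of the {cli} equivalent — the loop body here is a placeholder standing in for the per-replicate analysis:

```r
library(cli)

n_replicates <- 100
cli_progress_bar("Analyzing replicates", total = n_replicates)
for (i in seq_len(n_replicates)) {
  Sys.sleep(0.01)        # placeholder for analyzing replicate i
  cli_progress_update()  # replaces utils::setTxtProgressBar(pb, i)
}
cli_progress_done()
```

{cli} progress bars also degrade gracefully in non-interactive sessions, which should help with logs from batch runs.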
Forwarded message from CRAN
Dear maintainer,
your package MSToolkit does not comply with the CRAN policies:
In your tests, you make use of as many (virtual) CPU cores as available in the current machine, i.e. with 32 parallel processes in the unit tests for the most recent Windows check machine.
This does not make any sense and wastes CRAN resources (and for some other reasons causes the package to fail the Windows checks rather frequently).
Please follow the CRAN policies and do not use more than 2 processes!
We expect an update rather quickly. It took me quite some time to investigate the problem since you are using some obfuscating unit check framework.
Best,
Uwe Ligges
It would be beneficial to be able to construct simulated trial replicates through a dplyr-style workflow using the component functions of MSToolkit, perhaps returning tibbles as output to remove the need to write out .csv files. This may also make it easier to batch analyses using tools like purrr.
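A sketch of what such a workflow could look like, assuming tibble-returning variants of the component functions existed — simulate_one_replicate() and analyze_one_replicate() below are hypothetical stand-ins, not current MSToolkit functions:

```r
library(dplyr)
library(purrr)
library(tibble)

# One row per replicate; the simulated data and the analysis results live
# in list-columns instead of .csv files on disk.
replicates <- tibble(replicate = 1:5) |>
  mutate(data = map(replicate, \(i) simulate_one_replicate(i)))

results <- replicates |>
  mutate(fit = map(data, analyze_one_replicate)) |>
  select(replicate, fit)
```

Batching the analysis step is then just another map() call, and swapping map() for purrr's parallel-capable alternatives would come for free.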
The legacy MSToolkit package uses the {RUnit} testing framework. Migrate these tests to {testthat} and look at test coverage. We want to increase users' confidence that the package is well tested.
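The migration is largely mechanical — an RUnit checkEquals() maps directly onto a {testthat} expectation. A sketch (the createTreatments() call and expected row count are illustrative):

```r
# RUnit style (legacy):
#   test.createTreatments <- function() {
#     checkEquals(nrow(createTreatments(doses = c(0, 25))), 2)
#   }

# testthat style:
library(testthat)

test_that("createTreatments returns one row per dose", {
  out <- createTreatments(doses = c(0, 25))
  expect_equal(nrow(out), 2)
})
```

Once the suite runs under {testthat}, covr::package_coverage() can report coverage, which is the number we'd want to publicise to build user confidence.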
Other simulation engines produce data for analysis. It would be good to be able to use these datasets as input to analyzeData.
Some test scripts are named test-*.R but some are test.data.*.R.
I would prefer the test-*.R format, which avoids possible later problems with test naming.
Code was introduced in MSToolkit v2 to parse NONMEM $PRED statements for use in defining the RESP function. I suggest removing this code: other tools are emerging that will enable users to translate PKPD and disease models into code for simulation (see simulx, PKPDsim, RxODE, etc.). This would clean up the implementation of MSToolkit.
Startup message (onLoad) for MSToolkit is currently:
# MSToolkit package version 3.3.0 developed for Pfizer by Mango Solutions
# E-Mail: [email protected]
I'd like to deprecate that message. At the very least it should pick up the version number from the DESCRIPTION file rather than being hand-edited.
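A sketch of a startup hook that reads the installed version from DESCRIPTION instead of a hard-coded string (the message wording is illustrative):

```r
# .onAttach runs when the package is attached; packageStartupMessage()
# is the CRAN-approved way to emit it (users can suppress it).
.onAttach <- function(libname, pkgname) {
  ver <- utils::packageVersion(pkgname)  # read from the installed DESCRIPTION
  packageStartupMessage("MSToolkit package version ", ver)
}
```

Because packageVersion() reads the installed DESCRIPTION, the message can never drift out of sync with the actual release.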
Documentation for MSToolkit is based on the wiki / HTML pages. While these provide example code, it's harder to access than an R vignette. Vignettes would give users a better idea of what to expect when using MSToolkit.
The DDMoRe project aims to provide tools to facilitate interoperability of models and tasks across different target tools.
The Model Description Language (MDL) will provide a Design Object specifying the design to be used in simulation, and also the definition of the model to facilitate generation of response outcomes.
Suggest providing an as.ReplicateData function to enable generation of ReplicateData files. This could be achieved by converting DDMoRe Standard Output (SO) objects (generated by simulating outcome data from an MDL Model Object, Parameter Object, and Design Object) into the ReplicateData standard. This would allow MSToolkit users to run analyzeData(...) on ReplicateData generated via simulation in an arbitrary tool.
The test-performanalysis.R test cannot find the script rAnalysisScript.R.
This script (and analysiscode.sas) is in https://github.com/MikeKSmith/MSToolkit/tree/master/inst/systemTest/data/Scripts, which should probably be relocated to an appropriate folder within the testthat framework, or testthat should point to the correct location to find that script.
I think testthat/data/createCovariates/testCovariates.csv should have different values for each of the covariates (X1, X2, X3), because we will likely want to test that we can sample each covariate independently: e.g. sample from X1; sample from the three covariates X1, X2, and X3 retaining correlation structures; and sample whole subject covariate sets (X1, X2, X3) using ID as a stratification variable. So while I agree that we don't need 8 decimal places like in the original, maybe we just round those numbers to two decimal places?
test.data.allocate.R includes test.data.allocate.repeatedTreatments, which in turn calls generateData(). Can we align behaviour here with other instances where we use generateData and ensure that files are written to a temporary area for testing? The test runs OK, but was previously commented out...
Can we rename the "master" branch to "main"?
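The local half of the rename is a one-liner; the sketch below demonstrates it in a throwaway repo (assumes git >= 2.28 for --initial-branch). On the real repo this would be followed by `git push -u origin main`, switching the default branch in the GitHub repository settings, and then `git push origin --delete master`.

```shell
# Demo in a throwaway repository so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp"
git init -q -b master .
git -c user.email=ci@example.com -c user.name=ci \
    commit --allow-empty -q -m "init"
git branch -m master main    # the actual rename
git branch --show-current    # prints: main
```

GitHub also sets up redirects for renamed branches, but open PRs and any CI configuration pinned to "master" would still need checking.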
The .ectdEnv environment is scratch memory for some functions and is not currently exported.
(This may already be fixed, but let's double check.)
R 2.14.0 (18 months ago) introduced the 'parallel' package, and using doParallel will generally give your users (including the CRAN check farm) a better experience than doSNOW. But your packages are still using doSNOW (for MSToolkit in preference to doParallel, and for diverSity oddly on Linux only!)
Can we please have versions with
Depends: R (>= 2.14)
making use of doParallel rather than doSNOW.
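Registering the doParallel backend is a near drop-in replacement for doSNOW; a minimal sketch, where the foreach loop body stands in for whatever MSToolkit currently parallelises:

```r
library(doParallel)  # attaches foreach and parallel as dependencies

cl <- parallel::makeCluster(2)  # CRAN policy: at most 2 worker processes
registerDoParallel(cl)          # replaces doSNOW::registerDoSNOW(cl)

results <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2  # placeholder for per-replicate work
}

parallel::stopCluster(cl)
results
# [1]  1  4  9 16
```

Existing foreach() %dopar% loops should not need to change at all — only the cluster creation and registration calls.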
Back in 2006 when MSToolkit was developed, personal computers didn't generally have multiple CPU cores, so we only implemented multi-CPU options when grid = TRUE.
Now we should change the option name to allow the user to specify the number of CPUs to use for analysis. For CRAN (see #18) we should set the default to 2, or set the default via an option called Ncpu (see Rdatatable/data.table#5658).
Should we assume that the number of clusters = ncpus?
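One possible shape for this, assuming one cluster worker per requested CPU — the nCores argument name, the option name, and the function name below are all illustrative, not a settled API:

```r
# Sketch: default honours the CRAN limit of 2 processes unless the user
# raises it explicitly or via an option (cf. data.table's Ncpu discussion).
analyzeDataParallel <- function(..., nCores = getOption("MSToolkit.nCores", 2L)) {
  nCores <- min(nCores, parallel::detectCores())  # never oversubscribe
  cl <- parallel::makeCluster(nCores)
  on.exit(parallel::stopCluster(cl), add = TRUE)  # clean up even on error
  # ... dispatch the per-replicate analyses to the cluster ...
}
```

An option-based default also gives the test suite a single switch to force 2 workers, which addresses the CRAN complaint in #18 directly.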
Check that analyzeData processing uses the best available methods for processing with R. We may want to consider a change to using methods like those in purrr.
To increase users' confidence in MSToolkit, we might want to consider providing validation coverage and documentation using tools like {valtools}. https://phuse-org.github.io/valtools/ https://www.youtube.com/watch?v=1Bxk2wReFzE&ab_channel=RStudio
At present most functions have one example code snippet. I would suggest extending this to include at least one example per function illustrating different possibilities of argument values etc.