MSToolkit v3
Home Page: http://mikeksmith.github.io/MSToolkit
License: Other
Other simulation engines (mrgsolve, simulx, RxODE, PKPDsim) are very good at calculating outcomes for arbitrarily complex models, e.g. those defined by ODEs. We should be able to hook into these from MSToolkit.
Much of the code in examples and one or two test cases writes to and reads from .csv files via generateData, analyzeData, etc. We should use functionality in the {withr} package to manage these files and clean up after use.
It might also be an option for setEctdDataMethod to use temp files (via {withr}) to store results, as opposed to keeping them in the working directory or in internal memory.
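A minimal sketch of how {withr} could scope the written files, assuming generateData()/analyzeData() keep writing into the current working directory (the argument values and the "ReplicateData" folder name below are illustrative, not a confirmed API):

```r
library(withr)
library(testthat)

# with_tempdir() runs the block inside a throwaway directory and deletes
# it (and everything written into it) when the block exits.
with_tempdir({
  generateData(replicateN = 2, subjects = 10, treatDoses = c(0, 25))
  analyzeData(analysisCode = myAnalysisCode)
})

# Inside a testthat test, local_dir(local_tempdir()) ties the working
# directory and its cleanup to the test's own scope.
test_that("generateData writes replicate files", {
  local_dir(local_tempdir())
  generateData(replicateN = 1, subjects = 10, treatDoses = c(0, 25))
  expect_gt(length(list.files("ReplicateData")), 0)
})
```

The same pattern would cover the test cases mentioned above, so no test leaves files behind in the working directory.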
Lines 224, 274, 305 of analyzeData use utils::setTxtProgressBar(pb, i) to provide feedback on progress to the user. Perhaps there are newer tools in the {cli} package that might be better, e.g. https://cli.r-lib.org/articles/progress.html
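For comparison, a minimal sketch of the {cli} equivalent — the loop body here is a placeholder standing in for the per-replicate analysis:

```r
library(cli)

n_replicates <- 100
cli_progress_bar("Analyzing replicates", total = n_replicates)
for (i in seq_len(n_replicates)) {
  Sys.sleep(0.01)        # placeholder for analyzing replicate i
  cli_progress_update()  # replaces utils::setTxtProgressBar(pb, i)
}
cli_progress_done()
```

{cli} progress bars also degrade gracefully in non-interactive sessions, which should help with logs from batch runs.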
Forwarded message from CRAN
Dear maintainer,
your package MSToolkit does not comply with the CRAN policies:
In your tests, you make use of as many (virtual) CPU cores as available in the current machine, i.e. with 32 parallel processes in the unit tests for the most recent Windows check machine.
This does not make any sense and wastes CRAN resources (and for some other reasons causes the package to fail the Windows checks rather frequently).
Please follow the CRAN policies and do not use more than 2 processes!
We expect an update rather quickly. It took me quite some time to investigate the problem since you are using some obfuscating unit check framework.
Best,
Uwe Ligges
It would be beneficial to be able to construct simulated trial replicates through a dplyr-style workflow using the component functions of MSToolkit, perhaps returning tibbles as output to remove the need to write out .csv files. This may also make it easier to batch analyses using tools like purrr.
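A sketch of what such a workflow could look like, assuming tibble-returning variants of the component functions existed — simulate_one_replicate() and analyze_one_replicate() below are hypothetical stand-ins, not current MSToolkit functions:

```r
library(dplyr)
library(purrr)
library(tibble)

# One row per replicate; the simulated data and the analysis results live
# in list-columns instead of .csv files on disk.
replicates <- tibble(replicate = 1:5) |>
  mutate(data = map(replicate, \(i) simulate_one_replicate(i)))

results <- replicates |>
  mutate(fit = map(data, analyze_one_replicate)) |>
  select(replicate, fit)
```

Batching the analysis step is then just another map() call, and swapping map() for purrr's parallel-capable alternatives would come for free.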
The legacy MSToolkit package uses the {RUnit} testing framework. Migrate these tests to {testthat} and look at test coverage. We want to increase users' confidence that the package is well tested.
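The migration is largely mechanical — an RUnit checkEquals() maps directly onto a {testthat} expectation. A sketch (the createTreatments() call and expected row count are illustrative):

```r
# RUnit style (legacy):
#   test.createTreatments <- function() {
#     checkEquals(nrow(createTreatments(doses = c(0, 25))), 2)
#   }

# testthat style:
library(testthat)

test_that("createTreatments returns one row per dose", {
  out <- createTreatments(doses = c(0, 25))
  expect_equal(nrow(out), 2)
})
```

Once the suite runs under {testthat}, covr::package_coverage() can report coverage, which is the number we'd want to publicise to build user confidence.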
Other simulation engines produce data for analysis. It would be good to be able to use these datasets as input to analyzeData.
Some test scripts are named test-*.R but some are test.data.*.R.
I would prefer the test-*.R format, which avoids possible later problems with test naming.
Code was introduced in MSToolkit v2 to parse NONMEM $PRED statements for use in defining the RESP function. I suggest removing this code: other tools are emerging that will enable users to translate PKPD and disease models into code for simulation (see simulx, PKPDsim, RxODE, etc.). This would clean up the implementation of MSToolkit.
Startup message (onLoad) for MSToolkit is currently:
# MSToolkit package version 3.3.0 developed for Pfizer by Mango Solutions
# E-Mail: [email protected]
I'd like to deprecate that message. At the very least it should pick up the version number from the DESCRIPTION file rather than being hand-edited.
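A sketch of a startup hook that reads the installed version from DESCRIPTION instead of a hard-coded string (the message wording is illustrative):

```r
# .onAttach runs when the package is attached; packageStartupMessage()
# is the CRAN-approved way to emit it (users can suppress it).
.onAttach <- function(libname, pkgname) {
  ver <- utils::packageVersion(pkgname)  # read from the installed DESCRIPTION
  packageStartupMessage("MSToolkit package version ", ver)
}
```

Because packageVersion() reads the installed DESCRIPTION, the message can never drift out of sync with the actual release.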
Documentation for MSToolkit is based on the wiki / HTML pages. While these provide example code, it's harder to access than an R vignette. Vignettes would give users a better idea of what to expect when using MSToolkit.
The DDMoRe project aims to provide tools to facilitate interoperability of models and tasks across different target tools.
The Model Description Language (MDL) will provide a Design Object specifying the design to be used in simulation, and also the definition of the model to facilitate generation of response outcomes.
Suggest providing an as.ReplicateData function to enable generation of ReplicateData files. This could be achieved by converting DDMoRe Standard Output (SO) objects (generated by simulating outcome data from an MDL Model Object, Parameter Object, and Design Object) into the ReplicateData standard. This would allow MSToolkit users to run analyzeData(...) on ReplicateData generated via simulation in an arbitrary tool.
The test-performanalysis.R test cannot find the script rAnalysisScript.R.
This script (and analysiscode.sas) is in https://github.com/MikeKSmith/MSToolkit/tree/master/inst/systemTest/data/Scripts, which should probably be relocated to an appropriate folder within the testthat framework, or testthat should point to the correct location to find that script.
I think testthat/data/createCovariates/testCovariates.csv should have different values for each of the covariates (X1, X2, X3), because we will likely want to test that we can sample each covariate independently: e.g. sample from X1; sample from the three covariates X1, X2, and X3 retaining correlation structures; and sample whole subject covariate sets (X1, X2, X3) using ID as a stratification variable. So while I agree that we don't need 8 decimal places like in the original, maybe we just round those numbers to two decimal places?
test.data.allocate.R includes test.data.allocate.repeatedTreatments, which in turn calls generateData(). Can we align behaviour here with other instances where we use generateData and ensure that files are written to a temporary area for testing? The test runs OK, but was previously commented out...
Can we rename the "master" branch to "main"?
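The local half of the rename is a one-liner; the sketch below demonstrates it in a throwaway repo (assumes git >= 2.28 for --initial-branch). On the real repo this would be followed by `git push -u origin main`, switching the default branch in the GitHub repository settings, and then `git push origin --delete master`.

```shell
# Demo in a throwaway repository so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp"
git init -q -b master .
git -c user.email=ci@example.com -c user.name=ci \
    commit --allow-empty -q -m "init"
git branch -m master main    # the actual rename
git branch --show-current    # prints: main
```

GitHub also sets up redirects for renamed branches, but open PRs and any CI configuration pinned to "master" would still need checking.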
The .ectdEnv environment is scratch memory for some functions and is not currently exported.
(This may already be fixed, but let's double check.)
R 2.14.0 (18 months ago) introduced the 'parallel' package, and using doParallel will generally give your users (including the CRAN check farm) a better experience than doSNOW. But your packages are still using doSNOW (for MSToolkit in preference to doParallel, and for diverSity oddly on Linux only!)
Can we please have versions with
Depends: R (>= 2.14)
making use of doParallel rather than doSNOW.
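Registering the doParallel backend is a near drop-in replacement for doSNOW; a minimal sketch, where the foreach loop body stands in for whatever MSToolkit currently parallelises:

```r
library(doParallel)  # attaches foreach and parallel as dependencies

cl <- parallel::makeCluster(2)  # CRAN policy: at most 2 worker processes
registerDoParallel(cl)          # replaces doSNOW::registerDoSNOW(cl)

results <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2  # placeholder for per-replicate work
}

parallel::stopCluster(cl)
results
# [1]  1  4  9 16
```

Existing foreach() %dopar% loops should not need to change at all — only the cluster creation and registration calls.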
Back in 2006 when MSToolkit was developed, personal computers didn't generally have multiple CPU cores, so we only implemented multi-CPU options when grid = TRUE.
Now we should change the option name to allow the user to specify the number of CPUs to use for analysis. For CRAN (see #18) we should set the default to 2, or set the default via an option called Ncpu (see Rdatatable/data.table#5658).
Should we assume that the number of clusters = ncpus?
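One possible shape for this, assuming one cluster worker per requested CPU — the nCores argument name, the option name, and the function name below are all illustrative, not a settled API:

```r
# Sketch: default honours the CRAN limit of 2 processes unless the user
# raises it explicitly or via an option (cf. data.table's Ncpu discussion).
analyzeDataParallel <- function(..., nCores = getOption("MSToolkit.nCores", 2L)) {
  nCores <- min(nCores, parallel::detectCores())  # never oversubscribe
  cl <- parallel::makeCluster(nCores)
  on.exit(parallel::stopCluster(cl), add = TRUE)  # clean up even on error
  # ... dispatch the per-replicate analyses to the cluster ...
}
```

An option-based default also gives the test suite a single switch to force 2 workers, which addresses the CRAN complaint in #18 directly.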
Check that analyzeData processing uses the best available methods for processing with R. We may want to consider a change to using methods like those in purrr.
To increase users' confidence in MSToolkit, we might want to consider providing validation coverage and documentation using tools like {valtools}. https://phuse-org.github.io/valtools/ https://www.youtube.com/watch?v=1Bxk2wReFzE&ab_channel=RStudio
At present most functions have one example code snippet. I would suggest extending this to include at least one example per function illustrating different possibilities of argument values etc.