Comments (2)
I'd like to push back on this a bit, for the following reasons:
1. In our current environment, the data fetch typically takes no more than half an hour, which seems a reasonable time for a feasibility study.
2. In the develop branch I've already added the option to sample the cohorts prior to fitting the PS model. Fitting the PS model can typically take up to 2 days in our environment, so sampling seems more helpful there.
3. The way to sample is not entirely obvious to me: we could sample uniformly across T and C, but if T and C are of very different sizes we might shrink one cohort too much. If we sample T and C separately (as done in the createPs function mentioned above), the ratio changes, making interpretation difficult (in the createPs function this is automatically corrected for).
Maybe we could just introduce a generic function that generates new cohorts by randomly sampling from other cohorts? It doesn't solve problem 3, but at least it puts the responsibility in the hands of the user.
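
To make the trade-off in problem 3 concrete, here is a small Python sketch of the two sampling strategies. It is illustrative only and is not CohortMethod code: the function names, the cohort sizes (1,000 target vs. 100,000 comparator subjects), and the sample sizes are all made up for the example.

```python
import random


def sample_pooled(target, comparator, n, rng):
    """Sample n subjects uniformly from the pooled T and C cohorts."""
    pooled = [("T", s) for s in target] + [("C", s) for s in comparator]
    picked = rng.sample(pooled, n)
    new_t = [s for arm, s in picked if arm == "T"]
    new_c = [s for arm, s in picked if arm == "C"]
    return new_t, new_c


def sample_separately(target, comparator, n_per_cohort, rng):
    """Sample each cohort on its own, capping each at n_per_cohort."""
    new_t = rng.sample(target, min(n_per_cohort, len(target)))
    new_c = rng.sample(comparator, min(n_per_cohort, len(comparator)))
    return new_t, new_c


rng = random.Random(42)
target = list(range(1_000))                # hypothetical small target cohort
comparator = list(range(1_000, 101_000))   # hypothetical large comparator cohort

pooled_t, pooled_c = sample_pooled(target, comparator, 2_000, rng)
sep_t, sep_c = sample_separately(target, comparator, 1_000, rng)

# Pooled sampling keeps the T:C ratio roughly intact but leaves few T subjects.
print("pooled:    ", len(pooled_t), "T,", len(pooled_c), "C")
# Separate sampling keeps all of T but turns a 1:100 ratio into 1:1.
print("separately:", len(sep_t), "T,", len(sep_c), "C")
```

With these made-up sizes, pooled sampling is expected to keep only about 20 of the 2,000 sampled subjects in T, while separate sampling keeps every T subject at the cost of changing the T:C ratio, which then has to be accounted for when interpreting results.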
Ok, added the option anyway (needed it for the method evaluation, where some cohorts had >7 mln subjects).
In the current development version, `getDbCohortMethodData` has a new argument, `maxCohortSize`. If set to a value > 0, both the target and comparator cohorts will be restricted to this size (through random sampling): a54d834
Of course, this argument has also percolated to the `createGetDbCohortMethodDataArgs` function.
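
The semantics described above (a cap that only applies when the value is greater than 0) can be sketched as follows. This is a Python illustration of the behavior as described in this comment, not the package's actual implementation; `restrict_cohort` and the example IDs are hypothetical.

```python
import random


def restrict_cohort(subject_ids, max_cohort_size, rng):
    """Return at most max_cohort_size subjects, chosen at random.

    A value of 0 is treated as 'no limit' (an assumption based on the
    'if set to a value > 0' wording above).
    """
    if max_cohort_size > 0 and len(subject_ids) > max_cohort_size:
        return rng.sample(subject_ids, max_cohort_size)
    return list(subject_ids)


rng = random.Random(1)
target = list(range(50))          # hypothetical target cohort, below the cap
comparator = list(range(5_000))   # hypothetical comparator cohort, above it

# The same cap is applied to each cohort independently.
capped_t = restrict_cohort(target, 1_000, rng)
capped_c = restrict_cohort(comparator, 1_000, rng)
uncapped_c = restrict_cohort(comparator, 0, rng)

print(len(capped_t), len(capped_c), len(uncapped_c))
```

A cohort that is already smaller than the cap is returned unchanged, so only the oversized cohort is actually sampled.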