rpact-com / rpact Goto Github PK
View Code? Open in Web Editor NEWrpact: Confirmatory Adaptive Clinical Trial Design and Analysis
Home Page: https://rpact-com.github.io/rpact/
rpact: Confirmatory Adaptive Clinical Trial Design and Analysis
Home Page: https://rpact-com.github.io/rpact/
getDesignGroupSequential(futilityBounds = c(0,0), bindingFutility = FALSE) |> plot()
Plot legend displays bindingFutility = TRUE
getSampleSizeRates(groups = 1, thetaH0 = 0.2, riskRatio = FALSE)
getPowerRates(groups = 1, thetaH0 = 0.2, riskRatio = FALSE, maxNumberOfSubjects = 100)
There should be a warning that riskRatio = FALSE will be ignored (correct for getSamplesizeMeans and getPowerMeans)
getDesignGroupSequential() |> getSampleSizeSurvival() |> summary()
Expected study duration duration should have the same format as Analysis time (change to 2 decimal places)
design <- getDesignGroupSequential(typeOfDesign = "asOF")
dataExample <- getDataset(cumEvents = c(67, 129), cumLogRanks = c(-1.1, -2.0))
getAnalysisResults(design = design, dataInput = dataExample) |> summary()
For group sequential design second line in summary should not "Fixed weight" but "Planned information rate",
"Fixed weight" only for combination test"!
Bug for pipe operator in analysis results:
S <- getDataSet(
events1 = c(11, 12),
events2 = c(6, 7),
n1 = c(36, 39),
n2 = c(38, 40))
R <- getDataSet(
events1 = c(12, 10),
events2 = c(8, 8),
n1 = c(32, 33),
n2 = c(31, 29))
getDesignInverseNormal(kMax = 2) |> getDataSet(S1 = S, R = R) |>
getAnalysisResults(stage = 1, nPlanned = 150, intersectionTest = "Simes" ) |> pull(conditionalPower)
We believe that there is something odd in the calculation of Final CIs at the first stage of a group sequential
design in RPACT for non-inferiority designs. We believe that this potential bug is only present for noninferiority
designs and at the first stage.
Our reasoning is based on the following:
We wish with this document to illustrate the potential bug in RPACT for the first stage adjusted confidence
intervals and median unbiased estimate for a non-inferiority design.
library(rpact)
# our design
design <- getDesignGroupSequential(
sided = 1, alpha = 0.025, beta = 0.1,
informationRates = c(2/3, 1), typeOfDesign = "asOF"
)
# our fake data - create a scenario where we stop right at the boundary
fakeData2 <- getDataSet(
overallEvents = c(1067), # number of events corresponding to 2/3s
overallLogRanks = c(-1.713),
overallAllocationRatio = c(1)
)
adjust_noninferiority <- getAnalysisResults(design = design,
dataInput = fakeData2,
directionUpper = F,
thetaH0 = 1.05)
# We look at the final stage (first stage in this example) confidence intervals
adjust_noninferiority
Here we anticipated that the Final CIs upper boundary would be closer to 1.05 (thetaH0) and that the
Median unbiased estimate would be equal to the observed cumulative effect size.
To check that the CI is uncorrect we can use that the stage-wise confidence interval at the first stage should
be equal to the naively calculated confidence interval which we can get from the fixed design model.
If we assume that it was not a sequential design, we get the following confidence intervals:
# our design
design <- getDesignGroupSequential(
sided = 1, alpha = 0.025, beta = 0.1,
informationRates = c(1)
)
# creating data that stops right at the boundary
fakeData <- getDataSet(
overallEvents = c(1067),
overallLogRanks = c(-1.713),
overallAllocationRatio = c(1)
)
# get the naive and adjusted inference
adjust_noninferirority <- getAnalysisResults(design = design,
dataInput = fakeData,
directionUpper = F,
thetaH0 = 1.05)
adjust_noninferirority
Here the upper confidence interval crosses one and is different from the GSD final CI.
We are going to find the value of μ = log(HR) for which the probability if 2.5% of being greater than
-1.713 (non-inf) or -2.5095 (superiority).
Starting with non-inf:
upper_ci_noninf <- function(x) pnorm(-1.713, mean = (log(x))*sqrt(1067)/2) - 0.025
uniroot(upper_ci_noninf, lower = 0.7, upper = 1.2)$root
We get an upper limit of 1.015 which is different than the RPACT result. But the same as the fixed design
CI.
If the trial was a superiority trial instead, we could re-use the non-inferiority group sequential design but
without specifying the thetaH0 argument. Here we observe that if we reject the null hypothesis at the first
stage, we get identical Final CIs and Median unbiased estimate as in the non-inferiority analysis. However,
all other measures of inference are different in this scenario:
# superiority fake data - creating data that stops right at the boundary
fakeData <- getDataSet(
overallEvents = c(1067),
overallLogRanks = c(-2.5095),
overallAllocationRatio = c(1)
)
# get the naive and adjusted inference
adjust_superiority <- getAnalysisResults(design = design,
dataInput = fakeData,
directionUpper = F)
adjust_superiority
Notice to do the analysis on the non-inferiority scale we needed to change the log-rank test to equal the
non-inferiority log-rank. Using the formula from Chen and Chen (Chen YH, Chen C. Testing superiority
at interim analyses in a non-inferiority trial. Stat Med. 2012 Jul 10;31(15):1531-42. doi: 10.1002/sim.5312.
Epub 2012 Mar 22. PMID: 22438208.)
We believe that the bug in RPACT is due to a missing adjustment of non-inferiority in the final CIs at the
first stage. Furthermore, we beleive that the non-inferiority analysis’ median unbiased estimate and final
CIs are incorrectly the same three measures of inference but based on a superiority trial.
This ends the arguments for the potential bug.
Hi!
First of all thank you for this very nice and needed package!
I am trying to simulate an adaptive design using the function getSimulationSurvival()
. I would also like to use thetaH0=0.9
instead of the default thetaH0=1
. I am wondering if this parameter is respected during the simulation and/or in the results.
library(rpact)
library(dplyr)
dIN <- getDesignInverseNormal(kMax=2,
alpha=0.025,
beta=0.1,
sided=1,
typeOfDesign = "noEarlyEfficacy",
informationRates = c(19/125, 1),
futilityBounds = c(-1))
dIN_sim <- getSimulationSurvival(design = dIN,
pi2 = 0.003,
eventTime = 12,
hazardRatio = c(0.5),
dropoutRate1 = 0.15, dropoutRate2 = 0.15, dropoutTime = 12,
accrualTime = c(0, 20),
accrualIntensity = c(1000),
thetaH0 = 0.9,
directionUpper = F,
plannedEvents = c(19, 125), #cumulated
maxNumberOfEventsPerStage = c(NA, 200), #max events
minNumberOfEventsPerStage = c(NA, 106), #min events (per stage, not cumulative!)
conditionalPower = 0.9,
seed = 1910)
Since I would like to adapt the simulation manually (specifically, restart subject accrual at the timing of the interim based on conditional power and/or pi2), I tried to replicate scenarios from the simulation data using the smaller functions getStageResults()
and getConditionalPower()
.
simulation_replicates <- getData(dIN_sim) %>% mutate(iterationNumber = factor(iterationNumber))
interesting_case <- simulation_replicates %>% filter(iterationNumber == 12)
interesting_case #increased sample size slightly from 106 to 131, i.e. overall 15o cumulated events in stage 2
which gives me
iterationNumber stageNumber pi1 pi2 hazardRatio analysisTime numberOfSubjects overallEvents1 overallEvents2 eventsPerStage rejectPerStage
1 12 1 0.001501127 0.003 0.5 15.34246 15342 7 12 19 0
2 12 2 0.001501127 0.003 0.5 84.62074 20000 49 101 131 1
eventsNotAchieved futilityPerStage testStatistic logRankStatistic conditionalPowerAchieved pValuesSeparate trialStop hazardRatioEstimateLR
1 0 0 0.9191049 0.9191049 NA 0.1790203579 FALSE 0.6559214
2 0 0 3.6488241 3.6648582 0.7509866 0.0001774007 TRUE 0.5496526
Plugging in the results of the getSimulationSurvival()
interim analysis result into getDataset()
followed by getStageResults()
and getConditionalPower()
to the best of my ability:
#with thetah0=0.9
cp_with_thetah0_09 <- getConditionalPower(stageResults = getStageResults(dIN, dataInput = getDataset(cumulativeEvents=c(19),
cumulativeLogRanks = c(-0.9191049), # minus!
cumulativeAllocationRatios = c(1)),
directionUpper = F,
stage = 1,
thetaH0 = 0.9),
thetaH1 = 0.6559214, #use HR of stage 1
nPlanned = 131) #adaptively determined ssr for stage 2
#with thetah0=1
cp_with_thetah0_1 <- getConditionalPower(stageResults = getStageResults(dIN, dataInput = getDataset(cumulativeEvents=c(19),
cumulativeLogRanks = c(-0.9191049), # minus!
cumulativeAllocationRatios = c(1)),
directionUpper = F,
stage = 1,
thetaH0 = 1),
thetaH1 = 0.6559214, #use HR of stage 1
nPlanned = 131) #adaptively determined ssr for stage 2
Comparing the results for reported conditional power:
> cp_with_thetah0_09$conditionalPower
[1] NA 0.4896989
> cp_with_thetah0_1$conditionalPower
[1] NA 0.7497147
> interesting_case$conditionalPowerAchieved
[1] NA 0.7509866
which made me suspect that the reported conditional power of getSimulationSurvival()
does not respect the thetaH0 parameter. Or is $conditionalPower
not to be confused with $conditionalPowerAchieved
?
Additionally, why do I need to plug in the negative LR statistic into the getStageResults(getDataset())
function? I suspect the reason is the directionUpper
argument, but I fail to see why the getSimulationSurvival()
does not report the negative LR's that I would expect for HR's < 1.
Thank you!
With PWEXP, I think you can do:
fit <- pwexp.fit(...)
fit$AIC
How can you get the AIC for getPiecewiseExponentialDistribution(...)?
Hello,
I am getting help from the example you put up at
www.rpact.org/vignettes/planning/rpact_binary_planning_example
I am trying to install rpact packge on my mac and I am getting an error message which lead to an installation failure. Specifically, I see this error:
make: gfortran: No such file or directory". I also tried to download
a developer version but I still see the same error.
Could you help me guide how to install this packge?
installing *source* package ‘rpact’ ...
** package ‘rpact’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang++ -mmacosx-version-min=10.13 -std=gnu++14
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Library/Frameworks/R.framework/Versions/4.1/Resources/library/Rcpp/include'
-I/usr/local/include -fPIC -Wall -g -O2 -c RcppExports.cpp -o
RcppExports.o
clang++ -mmacosx-version-min=10.13 -std=gnu++14
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Library/Frameworks/R.framework/Versions/4.1/Resources/library/Rcpp/include'
-I/usr/local/include -fPIC -Wall -g -O2 -c f_as251.cpp -o f_as251.o
gfortran -mmacosx-version-min=10.13 -fno-optimize-sibling-calls -fPIC -Wall
-g -O2 -c f_as251_fortran.f -o f_as251_fortran.o
make: gfortran: No such file or directory
make: *** [f_as251_fortran.o] Error 1
ERROR: compilation failed for package ‘rpact’
* removing
‘/Library/Frameworks/R.framework/Versions/4.1/Resources/library/rpact’
The downloaded source packages are in
‘/private/var/folders/tf/m_6xpy4j237b43dg4yfd53mw0000gn/T/Rtmpr9ciqn/downloaded_packages’
Warning messages:
1: In readRDS(dest) : lzma decoder corrupt data
2: In install.packages("rpact") :
installation of package ‘rpact’ had non-zero exit status
Hello,
I am trying to understand thetaH1 in getSimulationSurvival. I am interested in an adaptive design involving a thetaH0 bound of 0.9. Note that I am interested in testing H0: HR >= 0.9 vs. H1: HR < 0.9, that means the 0.9 bound is not a non-inferiority bound in the classical sense. In the first scenario I ran a simulation without specifying thetaH1, which should be using the test-statistics of stage 1 and 2 and the observed hazard ratio of stage 2 to reassess the required events for stage 3. I picked out an iteration which had a hazard ratio of 0.4503625 at stage 2. Then I re-ran the simulation, using the same seed, with thetaH1 = 0.4503625 and looked at the same iteration again. Given I used the same seed, it was expected that the results including stage 2 test-statistic and observed hazard ratio are the same. However, this time, the reassessement, which in my understanding only requires the test-statistics of the previous stages (which here are the same between the two scenarios) and an assumed hazard ratio, which I also made sure to be the same given my specification of thetaH1, reported an amount of events for stage 3 which differs from the first scenario.
My code is below. Unless my understanding of the subject is incorrect, I do suspect some kind of inconsistency in regards the thetah0 bound between the two scenarios.
Kind regards
Tim
library(dplyr)
library(rpact)
futilityBounds <- c(0, 0.688)
N <- 14000
informationRates <- c(0.4, 0.6, 1)
dIN <- getDesignInverseNormal(kMax = 3,
alpha = 0.025,
beta = 0.1,
sided = 1,
informationRates = informationRates,
futilityBounds = futilityBounds,
typeOfDesign = "asUser",
userAlphaSpending = c(0, 0.01, 0.025)
)
myCalcEventsFunction_with_division_by_thetaH0 <- function(...,
stage, thetaH0, conditionalPower, estimatedTheta,
plannedEvents, eventsOverStages,
minNumberOfEventsPerStage, maxNumberOfEventsPerStage, allocationRatioPlanned,
conditionalCriticalValue) {
theta <- max(1 + 1e-12, estimatedTheta)
cat(paste0("conditional critical value: ", conditionalCriticalValue, "\n"))
cat(paste0("estimated theta: ", estimatedTheta, "\n"))
requiredStageEvents <-
max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
requiredStageEvents <- min(
max(minNumberOfEventsPerStage[stage], requiredStageEvents),
maxNumberOfEventsPerStage[stage]
) + eventsOverStages[stage - 1]
return(requiredStageEvents)
}
#wrapper function for convenience, allows us to create the same simulation but it can only differ in thetaH1 and the calceventsfunction
sim_wrapper <- function(thetaH1 = NA_real_, calcEventsFunction = NULL) {
dIN_sim <- getSimulationSurvival(design = dIN,
thetaH0 = 0.9,
directionUpper = F,
pi2 = 0.003,
hazardRatio = c(0.4),
eventTime = 12,
dropoutRate1 = 0.08,
dropoutRate2 = 0.08,
dropoutTime = 12,
accrualTime = c(0,0.01),
maxNumberOfSubjects = N,
plannedEvents = c(20, 30, 50),
maxNumberOfIterations = 75,
longTimeSimulationAllowed = T,
seed = 1910,
showStatistics = T,
maxNumberOfRawDatasetsPerStage = 2,
minNumberOfEventsPerStage = c(20, 10, 20),
maxNumberOfEventsPerStage = c(20, 10, 45),
conditionalPower = 0.9,
calcEventsFunction = calcEventsFunction,
thetaH1 = thetaH1
)}
#standard, use observed HR as thetaH1, interesting iteration 75 chosen
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#at stage 2, some outputs:
#estimated HR: 0.4503625
#test statistic: 1.896057
#-> for stage 3: increase single events from 20 to 41
#using the same seed, the data looks the same up to IA2. Now using the HR that we have just seen from iteration 75 in the
#"use interim HR as theta case" (0.4503625) as a fixed thetaH1.
#-> if I understand the theory correctly, it should lead to the same SSR, as the test statistics of stage 1 and 2 are identical,
#and the theta for the SSR is the same
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#first of all consistency checks, estimated HR and test statistic of stage 2 are equal to the case above...
#estimated HR: 0.4503625
#test statistic: 1.896057
#BUT, the ssr now says we need 30 singular events for stage 3 instead of the 41 from above.
#now with custom function, translated from cpp code to the best of my ability, so it should be the same as rpact native if done correctly
#since we are using only 75 iterations, and we are interested in iteration 75, the last printed line in your console corresponds to the
#conditional critical value and estimated theta of that iteration 75 (didn't know how to do that more neatly)
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 (from console print...)
#estimated theta: 1.79855127135391 (from console print...)
#results look the same as the native rpact ssr function with 41 single events at stage 3
#but focus on estimatedTheta which gets passed internally
#estimated theta: 1.79855127135391
#while hazardRatioEstimateLR 0.4503625
#obviously there is a directionUpper mirroring
#but 1/1.79855127135391 != 0.4503625
#maybe estimatedTheta already the theta which would lead to the same test-statistic given thetaH0 = 0.9
#but 1/(0.4503625/0.9) != 1.79855127135391
#however 1/(0.4503625/0.9) == 1.79855127135391/0.9
#now with a fixed theta, again, we should not have differences here for iteration 75 as the
#chosen fixed theta is exactly the theta observed in stage 2 of iteration 75
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 -> same as before, as expected
#estimated theta: 1.99839018568375 -> not the same
#this time this equation holds
#1/(0.4503625/0.9) == 1.99839018568375
#now I try this by hand, and on the negative side of the normal distribution first so i dont get confused by the directionUpper mirroring...
#max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
min(0, -0.924377935201586 - qnorm(0.9))^2 * (((1+1)^2)/1) / log(0.4503625/0.9)^2
#40.6071 -> I guess this would be consistent with the 41 from scenario 1 with thetaH1=NA_real_ and rpact native ssr function
#however, I plugged in the true HR of 0.4503625
#if I want to do this the directionUpper way (applying thetaH0 first, then mirroring), trivially the same as log(1) is 0.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(1/(0.4503625/0.9))^2
#40.6071
#what happens in the fixed thetah1 case where it said we need 30 events for stage 3?
#estimatedTheta internally is 1.99839018568375, i.e. 1/(0.4503625/0.9)
#but then gets plugged in
##max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
#which divides it again by 0.9
#i.e.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(((1/(0.4503625/0.9))/0.9))^2
#30.58873
#((1/(0.4503625/0.9))/0.9) = 2.220434 = 1/0.4503625
Dear rpact team,
Recently I used the package to do some simple simulation, and I experienced some issue "Error: index error". Below is the minimal code needed to reproduce the error. I looked into the getSimulationSurvival function, and it seems the problem is related to a C function "_rpact_getSimulationSurvivalCpp". Hope you can resolve the issue.
n_subj1 <- 3264
accrual_time1 <- getAccrualTime(
accrualTime = c(0, (20/3), (20/3)*2, (20/3)*3),
accrualIntensity = n_subj1 * c(0.10, 0.35, 0.55)/(20/3)
)
design1 <- getDesignGroupSequential(
typeOfDesign = "asHSD",
gammaA = -8,
informationRates = c(0.6, 1),
beta = 0.1,
alpha = 0.025
)
design_plan1 <- getSampleSizeSurvival(
design1,
pi2 = 0.0781,
hazardRatio = 0.76,
accrualTime = accrual_time1,
eventTime = 12
)
sim1 <- getSimulationSurvival(
design = design1,
pi2 = 0.0781,
directionUpper = FALSE,
hazardRatio = 0.76,
accrualTime = accrual_time1,
eventTime = 12,
plannedEvents = as.numeric(ceiling(design_plan1$eventsPerStage)),
maxNumberOfIterations = 1e2,
longTimeSimulationAllowed = TRUE,
maxNumberOfRawDatasetsPerStage = 1e2
)
We are working on a three-stage design for binary outcomes with adaptation of populations at the first interim using rpact. The OBF alpha spending function is used for efficacy. We consider two population selection, and the futility only happens when none of the population could meet the selection criteria. We implement this futility criteria in a customized population selection function for getSimulationEnrichmentRates().
We were working on rpact 3.4.0 (R version 4.1.0) and the simulation results look reasonable. However, when we are using the latest version (3.5.0), the futility probabilities at the first interim becomes zero for all simulation scenarios. We attached the code for a simplified version of our design for illustration. We developed the selection criteria using predictive probabilities for our project, but for this simplified one we only use the effect estimates, which requires at least 25% reduction from the assumed placebo rate (10%). The results are
Futility stop per stage [1]: 0.0820, 0.1050, 0.4650
(version 3.4.0)
Futility stop per stage [1]: 0.0000, 0.0000, 0.0000
(version 3.5.0)
Could you please help us to figure out what is going wrong?
Example:
library(rpact)
# Define design
ind <- getDesignInverseNormal(
kMax = 3,
alpha = 0.025,
beta = 0.1,
sided = 1,
informationRates = c(530, 706, 1176)/1176,
#futilityBounds = c(0, 1),
typeOfDesign = "asOF",
typeBetaSpending = "none")
getPowerRates(
design = ind,
groups = 2,
riskRatio = FALSE,
thetaH0 = 0,
pi1 = 10/100*0.5,
pi2 = 10/100,
directionUpper = FALSE,
maxNumberOfSubjects = 1176,
allocationRatioPlanned = 1) |> summary()
# Define selection function
mySelection <- function(effectVector, stage) {
selectedPopulations <- rep(TRUE, 2)
if (stage == 1) {
selectedPopulations <- rep(FALSE, 2)
if (effectVector[2] / 0.1 >= 0.25)
selectedPopulations[2] = TRUE
else if (effectVector[1] / 0.1 >= 0.25)
selectedPopulations[1] = TRUE
}
return(selectedPopulations)
}
# Perform simulation
simResults <- getSimulationEnrichmentRates(
design = ind,
effectList = list(
subGroups = c("S", "R"),
prevalences = c(0.5, 0.5),
piControl = rep(10/100, 2),
piTreatments = matrix(
c( 5/100, 5/100,
5/100, 5/100,
10/100, 5/100), ncol = 2, byrow = TRUE)),
intersectionTest = "Simes",
stratifiedAnalysis = FALSE,
directionUpper = FALSE,
typeOfSelection = "userDefined",
effectMeasure = "effectEstimate",
successCriterion = "all",
plannedSubjects = c(530, 706, 1176),
maxNumberOfIterations = 1000,
seed = 1234,
selectPopulationsFunction = mySelection)
print(simResults)
It is great that count endpoints are now supported in rpact
.
Would it be possible to further enhance it including the following:
Thanks a lot for considering!
For this case, no error message should be displayed (the number of stages for the dummy design can be arbitrary)
dummyDesign <- getDesignGroupSequential(kMax = 4,
alpha = 0.025,
sided = 1,
typeOfDesign = "asOF")
dataSurvival <- getDataset(cumulativeEvents = c(41, 86),
cumulativeLogRanks = c(-1.865, -2.510))
getAnalysisResults(design = dummyDesign,
dataInput = dataSurvival,
directionUpper = FALSE,
maxInformation = 120) |> summary()
The help of getSampleSizeSurvival
says that setting argument allocationRatioPlanned
to 0
will yield the "optimal" randomization ratio (RR). However, it is unclear to me how exactly rpact defines "optimality" and what exactly is numerically implemented. Could this please be improved (or, alternatively, the functionality removed to avoid confusion)?
Of note, we have recently submitted a pre-print (led by my colleague @godwinyyung) on the choice of the RR for time-to-event endpoints: https://arxiv.org/abs/2407.03420v1. The pre-print shows that the choice of the most suitable randomization ratio for time-to-event endpoints requires a trade-off between different quantities (e.g. number of events, trial duration etc). It also highlights limitations of Schoenfeld's and Freedman's approximation to the log-rank test for unequal randomization ratios and demonstrates that Rubinstein's approximation (currently not supported in rpact) is more accurate.
Thanks a lot for considering this and for all your great work on rpact!!
I have noticed a change in the output of the summary
function for the sample size/power calculation in version 4.0.0 compared with 3.5.1, where the efficacy boundary (treatment effect scale) for two-sided tests is not shown any more. The output for one-sided tests is not affected. I was wondering if this change is intended or not, as it was helpful to show the efficacy boundary for the study design.
rpact 4.0.0:
summary(sampleSizeResult <- getSampleSizeMeans(
alternative = 10, stDev = 24, sided = 2,
alpha = 0.05, beta = 0.2
))
Sample size calculation for a continuous endpoint
Fixed sample analysis, significance level 5% (two-sided).
The results were calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0,
H1: effect = 10, standard deviation = 24, power 80%.
Stage Fixed
Efficacy boundary (z-value scale) 1.960
Number of subjects 182.8
Two-sided local significance level 0.0500
summary(sampleSizeResult <- getSampleSizeMeans(
alternative = 10, stDev = 24, sided = 1,
alpha = 0.025, beta = 0.2
))
Sample size calculation for a continuous endpoint
Fixed sample analysis, significance level 2.5% (one-sided).
The results were calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0,
H1: effect = 10, standard deviation = 24, power 80%.
Stage Fixed
Efficacy boundary (z-value scale) 1.960
Number of subjects 182.8
One-sided local significance level 0.0250
Efficacy boundary (t) 7.006
Legend:
(t): treatment effect scale
rpact 3.5.1:
summary(sampleSizeResult <- getSampleSizeMeans(
alternative = 10, stDev = 24, sided = 2,
alpha = 0.05, beta = 0.2
))
Sample size calculation for a continuous endpoint
Fixed sample analysis, significance level 5% (two-sided).
The results were calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0,
H1: effect = 10, standard deviation = 24, power 80%.
Stage Fixed
Efficacy boundary (z-value scale) 1.960
Number of subjects 182.8
Two-sided local significance level 0.0500
Lower efficacy boundary (t) -7.006
Upper efficacy boundary (t) 7.006
Legend:
(t): treatment effect scale
summary(sampleSizeResult <- getSampleSizeMeans(
alternative = 10, stDev = 24, sided = 1,
alpha = 0.025, beta = 0.2
))
Sample size calculation for a continuous endpoint
Fixed sample analysis, significance level 2.5% (one-sided).
The results were calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0,
H1: effect = 10, standard deviation = 24, power 80%.
Stage Fixed
Efficacy boundary (z-value scale) 1.960
Number of subjects 182.8
One-sided local significance level 0.0250
Efficacy boundary (t) 7.006
Legend:
(t): treatment effect scale
Thank you!
TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)
SimulationResultsSurvival (getSimulationSurvival)
SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)
TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)
SimulationResultsSurvival (getSimulationSurvival)
SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)
TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)
SimulationResultsSurvival (getSimulationSurvival)
SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)
In the aggregated simulation results, there are also
The names should be changed to "Observed events per stage"
To implement a consistent solution that is still runnable with existing scripts, it would be best to find a new name for "eventsPerStage", something like "singleEventsPerStage". To ensure a clear distinction from "singleNumberOfEventsPerStage", this variable should be renamed to "singleEventsPerStagePerArm".
My understanding is that currently rpact only provides limited support for single-arm trials with binomial endpoints getSampleSizeRates(..., groups = 1, normalApproximation = F)
.
The request is to enhance support for single arm trials and extend support to two-arm trials.
Requests for enhancements/clarifications for single arm trials:
rpact
appears to calculate sample size conservatively, i.e. it chooses the sample size n such that the targeted power is not only achieved for this sample size gsDesign::nBinomial1Sample
which has an argument conservative
).rpact
uses a multivariate normal approximation to the joint distribution of test statistics across interims. Again, this implies that the calculation of stopping probabilities does not match the exact binomial calculations.Requests for two arm trials:
Exact
package.Thanks a lot for considering!
Hello,
I have looked into the differences in event reporting between objects generated by getSampleSizeSurvival() and getSimulationSurvival(). My attention was specifically on determining whether the number of events is intended to be understood as cumulative or not. My findings are below, I suspect at least one bug in a summary call. Since those two functions are frequently used together, I would argue that subtle semantic differences between them can easily result in mistakes.
Kind regards,
Tim
library(rpact)
library(dplyr)
#define combination function
#inverse normal combination function
dIN <- getDesignInverseNormal(kMax = 2,
alpha = 0.025,
beta = 0.1,
sided = 1,
informationRates = c(0.25, 1),
typeOfDesign = "asOF")
#Design (theory - without SSR)
#analytical ss calculation to get an idea for required N / events
dIN_no_ssr <- getSampleSizeSurvival(design = dIN,
thetaH0 = 0.9,
pi2 = 0.003,
hazardRatio = 0.45,
eventTime = 12,
accrualTime = c(0,10),
followUpTime = 24)
#1. with summary
summary(dIN_no_ssr)
#From Output: Cumulative number of events 21.9 87.5
#...talks about CUMULATIVE 21.9 events at interim and 87.5 final analysis
#2. without summary
dIN_no_ssr
#From Output:
#Number of events per stage [1]: 21.9
#Number of events per stage [2]: 87.5
#...does not say cumulative, instead refers to events per stage, but can be deduced that its meant as cumulative
#3. calling the eventsPerStage field directly
dIN_no_ssr$eventsPerStage
#From Output:
# [,1]
# [1,] 21.87025
# [2,] 87.48101
#...same as before, implicitly cumulative
#Now with the simulation function
#Design (simulation - with SSR)
dIN_ssr <- getSimulationSurvival(design = dIN,
thetaH0 = 0.9,
directionUpper = FALSE,
pi2 = 0.003,
hazardRatio = 0.45,
eventTime = 12,
accrualTime = c(0, 10),
maxNumberOfSubjects = ceiling(dIN_no_ssr$numberOfSubjects[1]),
plannedEvents = c(22, 88), #here we enter cumulative numbers
minNumberOfEventsPerStage = c(NA, 88-22), #but not here
maxNumberOfEventsPerStage = c(NA, 88-22), #intentionally leaving min=max, so we always have the same number of events in stage 2
conditionalPower = 0.9,
showStatistics = T
)
#1. with summary
summary(dIN_ssr)
#From Output:
#Cumulative number of events: 22.0 66.0
#... strongly suspect this is a bug, it should not stay cumulative in the output here or the numbers are wrong
#2. without summary
dIN_ssr
#From Output:
# Number of events per stage [1] : 22
# Number of events per stage [2] : 66
# Cumulative number of events [1]: 22
# Cumulative number of events [2]: 88
#...much better, cannot be misunderstood and also numbers are as I would expect given my input
#3. calling the eventsPerStage field directly
dIN_ssr$eventsPerStage
#From Output:
# [,1]
# [1,] 22
# [2,] 66
#... now this worries me the most. The $eventsPerStage field in the getSampleSizeSurvival object returned cumulative events,
#but the field with the very same name in the getSimulationSurvival object returns non-cumulative events
library(rpact)
design1 <- getDesignGroupSequential(
kMax = 1,
sided = 1,
typeOfDesign = "asOF",
typeBetaSpending = "bsOF")
design2 <- getDesignGroupSequential(
kMax = 2,
sided = 1,
typeOfDesign = "asOF",
typeBetaSpending = "bsOF")
design4 <- getDesignGroupSequential(
kMax = 4,
sided = 1,
typeOfDesign = "asOF",
typeBetaSpending = "bsOF")
# This works:
plot(getDesignSet(designs = c(design2, design4),
variedParameters = "kMax"))
# This fails:
plot(getDesignSet(designs = c(design1, design4),
variedParameters = "kMax"))
# Error in rbind(deparse.level, ...) :
# numbers of columns of arguments do not match
Hello,
I noticed a one-off issue with the getSimulationSurvival(). While I fixed the singular events at 10 for stage 2 in a 3-stage design, there was one iteration out of 1000 where the analysis was triggered at 9 instead. See the code below. Impact on results is low if any, but still seems worthy investigating, especially because this variance where there should not be any shows up in the output of showStatistics = T.
Kind regards,
Tim
library(dplyr)
library(rpact)
#rpact developer version 3.5.1.9231 loaded
dIN <- getDesignInverseNormal(kMax = 3,
alpha = 0.025,
beta = 0.1,
sided = 1,
informationRates = c(0.4, 0.6, 1),
futilityBounds = c(-6, -1),
typeOfDesign = "asUser",
userAlphaSpending = c(0, 0.01, 0.025)
)
dIN_sim <- getSimulationSurvival(design = dIN,
thetaH0 = 0.9,
directionUpper = F,
pi2 = 0.003,
hazardRatio = c(0.9, 0.4, 0.35, 0.3, 0.25, 0.2),
eventTime = 12,
dropoutRate1 = 0.08,
dropoutRate2 = 0.08,
dropoutTime = 12,
accrualTime = c(0,0.01),
maxNumberOfSubjects = 14000,
plannedEvents = c(0.4*50, 0.6*50, 50),
maxNumberOfIterations = 1000,
longTimeSimulationAllowed = T,
seed = 1910,
showStatistics = T,
maxNumberOfRawDatasetsPerStage = 2,
minNumberOfEventsPerStage = c(20, 10, 20), #stage 2 triggered after 10 additional events after stage1
maxNumberOfEventsPerStage = c(20, 10, 40),
conditionalPower = 0.9,
thetaH1 = NA_real_
)
sims <- getData(dIN_sim) %>% filter(hazardRatio %>% near(0.4))
table((sims %>% filter(stageNumber == 2))$eventsPerStage) #-> 1 iteration has 9 singular events instead of 10 at stage 2
sims %>% filter(iterationNumber == 757)
Originally opened by xinzhn
I am wondering if it would be possible to add an additional option (e.g, userDefined
) for the argument effectMeasure
in getSimulationEnrichmentMeans/Rates/Survival
, to allow a user defined function based on effect measures other than effectEstimate
and testStatistic
for selectPopulationsFunction
. Examples for such alternatives are population-specific predictive probabilities and subgroup-specific posterior probabilities1. This might be implemented by providing additional arguments for the customized function selectPopulationsFunction
to facilitate the calculation for both type of probabilities.
Just a thought for consideration: it may be convenient to borrow some arguments from calcSubjectsFunction
(customized function for sample size re-calculation), but to provide for populations as well as subgroups. To be specific, we may need
Brannath, W., Zuber, E., Branson, M., Bretz, F., Gallo, P., Posch, M., and Racine-Poon, A. (2009), “Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology,” Statistics in Medicine, 28, 1445–1463. ↩
What I get
I’m trying to find the absolute accrual intensity (number recruited per time unit) when
recruitment is twice as fast in the second period as the first.
sampleSize <- getSampleSizeSurvival(alpha = 0.025,
sided = 1,
beta = 0.1,
hazardRatio = 0.7,
lambda2 = log(2) / 9,
accrualTime = c(0, 3, 12),
accrualIntensity = c(0.1, 0.2),
accrualIntensityType = "relative",
followUpTime = 6)
sampleSize$accrualIntensity
[1] 215.90555 53.97639
What I expect
...so doing this manually for now
m <- sampleSize$maxNumberOfSubjects
Accrual per month during first 3 months
m / (3 * 0.1 + 9 * 0.2) * 0.1
[1] 30.84365
Accrual per month during months 4 to 9
m / (3 * 0.1 + 9 * 0.2) * 0.2
[1] 61.6873
You probably want to update your examples in the manual/ vignette about the custom sample size re-calculation functions: making it clear that allocationRatioPlanned is treated as a vector in the back end, not a scalar. Please see below example.
simpowerCPZ shows error message:
Error in fun(alternative = alternative, kMax = design$kMax, maxNumberOfIterations = maxNumberOfIterations, :
Evaluation error: the condition has length > 1.
simpowerCPZ2 uses allocationRatioPlanned[stage] instead and it runs fine.
library(rpact)
n1<-8 #interim sample size/arm
Nmin<-15 #min sample size/arm
Nmax<-23 #max sample size/arm
designc<-getDesignInverseNormal(sided = 1, alpha = 0.05, beta = 0.2,kMax=2,informationRates=c(n1/Nmin,1),
typeOfDesign = "noEarlyEfficacy", typeBetaSpending = "bsHSD",gammaB = -3)
myCPZSampleSizeCalculationFunction <- function(..., stage,
minNumberOfSubjectsPerStage,
maxNumberOfSubjectsPerStage,
conditionalPower,
conditionalCriticalValue,
allocationRatioPlanned,
thetaH1,
stDevH1) {
calculateStageSubjects <- function(cp) {
(1 + allocationRatioPlanned)^2/allocationRatioPlanned *
(max(0, conditionalCriticalValue + stats::qnorm(cp)))^2 /
(max(1e-12, thetaH1/stDevH1))^2
}
stageSubjectsCPmax <- calculateStageSubjects(cp = conditionalPower)
stageSubjectsCPmin <- calculateStageSubjects(cp = 0.7)
stageSubjects <- ceiling(min(max(minNumberOfSubjectsPerStage[stage], stageSubjectsCPmax), maxNumberOfSubjectsPerStage[stage]))
if (stageSubjectsCPmin > maxNumberOfSubjectsPerStage[stage]) {
stageSubjects <- minNumberOfSubjectsPerStage[stage]
}
return(stageSubjects)
}
myCPZSampleSizeCalculationFunction2 <- function(..., stage,
minNumberOfSubjectsPerStage,
maxNumberOfSubjectsPerStage,
conditionalPower,
conditionalCriticalValue,
allocationRatioPlanned,
thetaH1,
stDevH1) {
calculateStageSubjects <- function(cp) {
(1 + allocationRatioPlanned[stage])^2/allocationRatioPlanned[stage] *
(max(0, conditionalCriticalValue + stats::qnorm(cp)))^2 /
(max(1e-12, thetaH1/stDevH1))^2
}
stageSubjectsCPmax <- calculateStageSubjects(cp = conditionalPower)
stageSubjectsCPmin <- calculateStageSubjects(cp = 0.7)
stageSubjects <- ceiling(min(max(minNumberOfSubjectsPerStage[stage], stageSubjectsCPmax), maxNumberOfSubjectsPerStage[stage]))
if (stageSubjectsCPmin > maxNumberOfSubjectsPerStage[stage]) {
stageSubjects <- minNumberOfSubjectsPerStage[stage]
}
return(stageSubjects)
}
simpowerCPZ <- getSimulationMeans(designc,
alternative =0.65, stDev =0.7,
# cumulative overall sample size
plannedSubjects = 2 * c(n1, Nmin),
# stage-wise minimal overall sample size
minNumberOfSubjectsPerStage = 2 * c(n1, (Nmin - n1)),
# stage-wise maximal overall sample size
maxNumberOfSubjectsPerStage = 2 * c(n1, (Nmax - n1)),
conditionalPower = 0.8,
thetaH1 = 0.5, stDevH1=0.7,
maxNumberOfIterations = 1000,
seed = 12345,
calcSubjectsFunction = myCPZSampleSizeCalculationFunction)
simpowerCPZ2 <- getSimulationMeans(designc,
alternative =0.65, stDev =0.7,
# cumulative overall sample size
plannedSubjects = 2 * c(n1, Nmin),
# stage-wise minimal overall sample size
minNumberOfSubjectsPerStage = 2 * c(n1, (Nmin - n1)),
# stage-wise maximal overall sample size
maxNumberOfSubjectsPerStage = 2 * c(n1, (Nmax - n1)),
conditionalPower = 0.8,
thetaH1 = 0.5, stDevH1=0.7,
maxNumberOfIterations = 1000,
seed = 12345,
calcSubjectsFunction = myCPZSampleSizeCalculationFunction2)
Hi,
First, thank you for this incredibly useful package.
I am trying to set up a GS design using rpact
. The code looks like this:
design <- getDesignGroupSequential(kMax=3, typeOfDesign="asOF",
alpha=0.025, beta=0.2, sided=1)
designPlan <- getSampleSizeSurvival(design, pi1=1-0.66, pi2=1-0.54, eventTime=3,
accrualTime=3, followUpTime=2)
I am trying to validate this design against the software we are using, East.
Please find attached the output of the same design: east.zip
I can find the same results, which is very satisfying 😁
However, the output is sometimes a bit confusing and I couldn't find any extensive documentation
Expected study duration = 4.40
, although Analysis time = 2.43, 3.61, 5.00
. No interim analysis is planned at t=4.4, how could the study end at that moment?Expected number of events under [H0, H0/H1, H1]
: what are these referring to? They are all different from Number of events per stage
.Expected number of subjects under H1 = 461.4
: How is that different from Number of subjects [1,2,3]
? Aren't the latter under H1 too? Maybe the considered hypothesis would be worth adding to the output here.Of note, it would be very nice if designPlan
could hold a reference to design
internally, as it contains useful information to interpret it later (like the type of design for instance).
I'd be glad to make a PR if needed.
I am wondering if, rather than specifying the direction of alternative downstream in these other functions, it would be preferable to specify the alternative in getDesignGroupSequential
- the other functions could simply extract the direction of the alternative from the resulting trialDesignGroupSequential object.
This way, getDesignGroupSequential
returns appropriate one-sided boundaries for 'less than' hypotheses without having to make manual corrections. The reported boundaries, plot()
output, and so on would be ready to use both within and outside of RPACT, avoiding potential mistakes in manually correcting output (or failing to make that correction).
Just a thought for consideration: perhaps allow directionUpper to be passed as an argument in both getDesignGroupSequential
and those downstream functions. For downstream functions with one-sided designs, if directionUpper is NULL, the direction is extracted from the trialDesign object. If directions were specified in both the directionUpper and the trial design object, check to see if they are consistent, and throw an error if they are not.
As a programmer, I know that backwards compatibility and validation are nontrivial things, and things that seem superficially simple almost never are, and would understand if this is not worth the potential headaches.
library(rpact)
library(help = rpact)
# Version: 3.5.0.9228
?plot.Dataset
# The first example is:
dataExample <- getDataset(
n1 = c(22, 11, 22, 11),
n2 = c(22, 13, 22, 13),
means1 = c(1, 1.1, 1, 1),
means2 = c(1.4, 1.5, 3, 2.5),
stDevs1 = c(1, 2, 2, 1.3),
stDevs2 = c(1, 2, 2, 1.3)
)
if (require(ggplot2)) plot(dataExample, main = "Comparison of Means")
# That gives warning messages:
# Warning messages:
# 1: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> = "none")` instead.
# 2: Removed 73 rows containing non-finite values (stat_boxplot).
# 3: Removed 73 rows containing non-finite values (stat_summary).
# 4: Removed 73 rows containing missing values (geom_point).
# The plot is missing the treatment group at stages 2 and 4
# The second example, plotting rates, also gives the first warning message.
Running the following
designIN <- getDesignInverseNormal(kMax = 3, alpha = 0.025, typeOfDesign = "asOF", typeBetaSpending = "none")
simBonfSSREMAMS <- getSimulationMultiArmMeans(
design = designIN,
activeArms = 4,
muMaxVector = seq(0,1,0.5),
stDev = 1.5,
plannedSubjects = c(30, 60, 90),
intersectionTest = "Bonferroni",
typeOfShape = "linear",
typeOfSelection = "all",
successCriterion = "all",
minNumberOfSubjectsPerStage = c(22, 22, 22),
maxNumberOfSubjectsPerStage = c(190, 190, 190),
conditionalPower = 0.8,
thetaH1 = 1,
stDecH1 = 1.5,
allocationRatioPlanned = 1.5,
maxNumberOfIterations = maxNumberOfIterations,
seed = 1234
)
I see two odd things in the output, firstly the exit probabilities for futility are non zero but there is explicitly no futility stopping bound:
Exit probability for futility, mu_max = 0 0.5090 0.2666
Exit probability for futility, mu_max = 0.5 0.1914 0.0817
Exit probability for futility, mu_max = 1 0.0246 0.0258
secondly the reported stage sizes are the same for every experimental arm (after sample size re-assessment) even though they are not simulated with the same treatment effect (using "linear"):
Stagewise number of subjects, mu_max = 0.5
Treatment arm 1 30.0 75.3 38.7
Treatment arm 2 30.0 75.3 38.7
Treatment arm 3 30.0 75.3 38.7
Treatment arm 4 30.0 75.3 38.7
Control arm 20.0 50.2 25.8
Obviously I may have an input incorrect or be misinterpreting the results ... but I've double checked and don't see anything wrong or that explains either of these things in the output.
彭猛业 | |
---|---|
[email protected] |
</a></div></div></div></div></div><!--EndFragment-->
# Compare nonbinding and binding futility bounds
design1 <- getDesignGroupSequential(
kMax = 4, alpha = 0.05,
sided = 1,
informationRates = 1:4/4,
typeOfDesign = "asOF",
typeBetaSpending = "bsOF")
design2 <- getDesignGroupSequential(
kMax = 4, alpha = 0.05,
sided = 1,
informationRates = 1:4/4,
typeOfDesign = "asOF",
typeBetaSpending = "bsOF",
bindingFutility = TRUE)
plot(getDesignSet(designs = c(design1, design2),
variedParameters = "bindingFutility"))
# The final legend line is:
# TRUE, Futility bound (non-binding)
# The TRUE indicates that futility is binding.
# The "(non-binding)" is incorrect.
# In contrast
plot(design2)
# has the second line in the legend "Futility bound (binding)"
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.