rpact-com / rpact

rpact: Confirmatory Adaptive Clinical Trial Design and Analysis

Home Page: https://rpact-com.github.io/rpact/

R 96.91% C++ 2.75% Fortran 0.34%
adaptive-design analysis clinical-trials group-sequential-designs r simulation count-data power-calculation sample-size-calculation validated

rpact's People

Contributors: fpahlke, tilljensen

rpact's Issues

Example from help(plot.Dataset) seems incorrect

library(rpact)
library(help = rpact)
# Version:                       3.5.0.9228
?plot.Dataset

# The first example is:
dataExample <- getDataset(
    n1 = c(22, 11, 22, 11),
    n2 = c(22, 13, 22, 13),
    means1 = c(1, 1.1, 1, 1),
    means2 = c(1.4, 1.5, 3, 2.5),
    stDevs1 = c(1, 2, 2, 1.3),
    stDevs2 = c(1, 2, 2, 1.3)
)
if (require(ggplot2)) plot(dataExample, main = "Comparison of Means")

# That gives warning messages:
# Warning messages:
# 1: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> = "none")` instead. 
# 2: Removed 73 rows containing non-finite values (stat_boxplot). 
# 3: Removed 73 rows containing non-finite values (stat_summary). 
# 4: Removed 73 rows containing missing values (geom_point). 

# The plot is missing the treatment group at stages 2 and 4

# The second example, plotting rates, also gives the first warning message.

Direction of One-Sided Alternatives

I am wondering if, rather than specifying the direction of the alternative downstream in these other functions, it would be preferable to specify the alternative in getDesignGroupSequential; the other functions could then simply extract the direction of the alternative from the resulting trialDesignGroupSequential object.

This way, getDesignGroupSequential returns appropriate one-sided boundaries for 'less than' hypotheses without manual corrections. The reported boundaries, plot() output, and so on would be ready to use both within and outside of RPACT, avoiding potential mistakes in manually correcting output (or failing to make that correction).

Just a thought for consideration: perhaps allow directionUpper to be passed as an argument in both getDesignGroupSequential and those downstream functions. For downstream functions with one-sided designs, if directionUpper is NULL, the direction is extracted from the trialDesign object. If directions were specified in both directionUpper and the trial design object, check whether they are consistent and throw an error if they are not.
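A minimal sketch of what that resolution rule could look like, assuming the design object stored a directionUpper field (it currently does not; all names here are illustrative, not the rpact API):

# hypothetical helper: resolve the test direction from the design object
# and an optional explicit argument, erroring on conflicts
resolveDirectionUpper <- function(design, directionUpper = NULL) {
  designDirection <- design$directionUpper  # assumed to be set at design time
  if (is.null(directionUpper)) {
    return(designDirection)  # inherit the direction from the design
  }
  if (!is.null(designDirection) && directionUpper != designDirection) {
    stop("'directionUpper' conflicts with the direction stored in the design object")
  }
  directionUpper
}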

As a programmer, I know that backward compatibility and validation are nontrivial, and that things which seem superficially simple almost never are, so I would understand if this is not worth the potential headaches.

Misleading outputs regarding (cumulative) number of events in survival settings and possible bug

Hello,
I have looked into the differences in event reporting between objects generated by getSampleSizeSurvival() and getSimulationSurvival(). My focus was on determining whether the number of events is intended to be understood as cumulative or not. My findings are below; I suspect at least one bug in a summary() call. Since those two functions are frequently used together, I would argue that subtle semantic differences between them can easily result in mistakes.
Kind regards,
Tim

library(rpact)
library(dplyr)


# define the design: inverse normal combination function
dIN <- getDesignInverseNormal(kMax = 2,
                              alpha = 0.025,
                              beta = 0.1,
                              sided = 1,
                              informationRates = c(0.25, 1),
                              typeOfDesign = "asOF")



#Design (theory - without SSR)
#analytical ss calculation to get an idea for required N / events
dIN_no_ssr <- getSampleSizeSurvival(design = dIN,
                                    thetaH0 = 0.9,
                                    pi2 = 0.003,
                                    hazardRatio = 0.45,
                                    eventTime = 12,
                                    accrualTime = c(0,10),
                                    followUpTime = 24)


#1. with summary
summary(dIN_no_ssr)
#From Output: Cumulative number of events 21.9   87.5
#...talks about CUMULATIVE 21.9 events at interim and 87.5 final analysis 


#2. without summary
dIN_no_ssr
#From Output:
#Number of events per stage [1]: 21.9 
#Number of events per stage [2]: 87.5 
#...does not say cumulative, instead referring to events per stage, but it can be deduced that cumulative is meant


#3. calling the eventsPerStage field directly
dIN_no_ssr$eventsPerStage
#From Output:
# [,1]
# [1,] 21.87025
# [2,] 87.48101
#...same as before, implicitly cumulative



#Now with the simulation function 
#Design (simulation - with SSR)
dIN_ssr <- getSimulationSurvival(design = dIN,
                                 thetaH0 = 0.9,
                                 directionUpper = FALSE,
                                 pi2 = 0.003,
                                 hazardRatio = 0.45,
                                 eventTime = 12,
                                 accrualTime = c(0, 10),
                                 maxNumberOfSubjects = ceiling(dIN_no_ssr$numberOfSubjects[1]),
                                 plannedEvents = c(22, 88),                   #here we enter cumulative numbers
                                 minNumberOfEventsPerStage = c(NA, 88-22),    #but not here
                                 maxNumberOfEventsPerStage = c(NA, 88-22),    #intentionally leaving min=max, so we always have the same number of events in stage 2
                                 conditionalPower = 0.9,
                                 showStatistics = TRUE
)

#1. with summary
summary(dIN_ssr)
#From Output:
#Cumulative number of events: 22.0   66.0 
#... I strongly suspect this is a bug: either the output should not say cumulative here, or the numbers are wrong

#2. without summary
dIN_ssr
#From Output:
# Number of events per stage [1] : 22 
# Number of events per stage [2] : 66 
# Cumulative number of events [1]: 22 
# Cumulative number of events [2]: 88 
#...much better: this cannot be misunderstood, and the numbers are as I would expect given my input


#3. calling the eventsPerStage field directly
dIN_ssr$eventsPerStage
#From Output:
# [,1]
# [1,]   22
# [2,]   66
#... now this worries me the most: the $eventsPerStage field of the getSampleSizeSurvival() object returns cumulative events,
#but the field with the very same name in the getSimulationSurvival() object returns non-cumulative events
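Until the naming is harmonized, a defensive workaround is to normalize both objects to the cumulative scale before comparing them; a minimal sketch based on the fields observed above:

# normalize both objects to cumulative event counts before comparing
cumulativeDesign <- dIN_no_ssr$eventsPerStage[, 1]    # already cumulative
cumulativeSim <- cumsum(dIN_ssr$eventsPerStage[, 1])  # per-stage -> cumulative
cbind(design = cumulativeDesign, simulation = cumulativeSim)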

Inconsistent result on probabilities of stopping for futility when all populations are not selected in getSimulationEnrichmentRates() for rpact version 3.4.0 and 3.5.0

We are working on a three-stage design for binary outcomes with adaptation of populations at the first interim using rpact. The OBF alpha spending function is used for efficacy. We consider selection between two populations, and futility stopping only happens when neither population meets the selection criteria. We implement this futility criterion in a customized population selection function for getSimulationEnrichmentRates().

We were working with rpact 3.4.0 (R version 4.1.0) and the simulation results looked reasonable. However, with the latest version (3.5.0), the futility probabilities at the first interim become zero for all simulation scenarios. We attach the code for a simplified version of our design for illustration. We developed the selection criteria using predictive probabilities for our project, but for this simplified version we only use the effect estimates, requiring at least a 25% reduction from the assumed placebo rate (10%). The results are

Futility stop per stage [1]: 0.0820, 0.1050, 0.4650 (version 3.4.0)
Futility stop per stage [1]: 0.0000, 0.0000, 0.0000 (version 3.5.0)

Could you please help us to figure out what is going wrong?

Example:

library(rpact)

# Define design
ind <- getDesignInverseNormal(  
  kMax = 3,
  alpha = 0.025, 
  beta = 0.1, 
  sided = 1, 
  informationRates = c(530, 706, 1176)/1176,
  #futilityBounds = c(0, 1),
  typeOfDesign = "asOF", 
  typeBetaSpending = "none")

getPowerRates(
  design = ind,
  groups = 2,
  riskRatio = FALSE,
  thetaH0 = 0,
  pi1 = 10/100*0.5,
  pi2 = 10/100,
  directionUpper = FALSE,
  maxNumberOfSubjects = 1176,
  allocationRatioPlanned = 1) |> summary()

# Define selection function
mySelection <- function(effectVector, stage) {
  selectedPopulations <- rep(TRUE, 2)
  if (stage == 1) {
    selectedPopulations <- rep(FALSE, 2)
    if (effectVector[2] / 0.1 >= 0.25) {
      selectedPopulations[2] <- TRUE
    } else if (effectVector[1] / 0.1 >= 0.25) {
      selectedPopulations[1] <- TRUE
    }
  }
  return(selectedPopulations)
}

# Perform simulation
simResults <- getSimulationEnrichmentRates(
  design = ind,
  effectList = list(
    subGroups = c("S", "R"),  
    prevalences = c(0.5, 0.5),
    piControl = rep(10/100, 2), 
    piTreatments = matrix(
      c( 5/100, 5/100,
         5/100, 5/100,
        10/100, 5/100), ncol = 2, byrow = TRUE)),
  intersectionTest = "Simes",
  stratifiedAnalysis = FALSE, 
  directionUpper = FALSE,
  typeOfSelection = "userDefined", 
  effectMeasure = "effectEstimate",
  successCriterion = "all",
  plannedSubjects = c(530, 706, 1176),
  maxNumberOfIterations = 1000,
  seed = 1234,
  selectPopulationsFunction = mySelection)

print(simResults)

peculiarity in getSimulationSurvival() in one iteration

Hello,
I noticed a one-off issue with getSimulationSurvival(). Although I fixed the stage-wise number of events at 10 for stage 2 of a 3-stage design, in one iteration out of 1000 the analysis was triggered at 9 events instead. See the code below. The impact on the results is low, if any, but it still seems worth investigating, especially because this variance, where there should be none, shows up in the output of showStatistics = TRUE.

Kind regards,
Tim

library(dplyr)
library(rpact)
#rpact developer version 3.5.1.9231 loaded

dIN <- getDesignInverseNormal(kMax = 3,
                              alpha = 0.025,
                              beta = 0.1,
                              sided = 1,
                              informationRates = c(0.4, 0.6, 1),
                              futilityBounds = c(-6, -1),
                              typeOfDesign = "asUser",
                              userAlphaSpending = c(0, 0.01, 0.025)
)


dIN_sim <- getSimulationSurvival(design = dIN,
                                 thetaH0 = 0.9,
                                 directionUpper = FALSE,
                                 pi2 = 0.003,
                                 hazardRatio = c(0.9, 0.4, 0.35, 0.3, 0.25, 0.2), 
                                 eventTime = 12,
                                 dropoutRate1 = 0.08,
                                 dropoutRate2 = 0.08,
                                 dropoutTime = 12,
                                 accrualTime = c(0,0.01),
                                 maxNumberOfSubjects = 14000,    
                                 plannedEvents = c(0.4*50, 0.6*50, 50),
                                 maxNumberOfIterations = 1000,
                                 longTimeSimulationAllowed = TRUE,
                                 seed = 1910,
                                 showStatistics = TRUE,
                                 maxNumberOfRawDatasetsPerStage = 2,
                                 minNumberOfEventsPerStage = c(20, 10, 20),  #stage 2 triggered after 10 additional events after stage1
                                 maxNumberOfEventsPerStage = c(20, 10, 40), 
                                 conditionalPower = 0.9,
                                 thetaH1 = NA_real_
)


sims <- getData(dIN_sim)  %>% filter(hazardRatio %>% near(0.4))

table((sims %>% filter(stageNumber == 2))$eventsPerStage) #-> 1 iteration has 9 singular events instead of 10 at stage 2

sims %>% filter(iterationNumber == 757)

A couple of issues with results reported by "getSimulationMultiArmMeans" when using sample size re-assessment

Running the following
designIN <- getDesignInverseNormal(kMax = 3, alpha = 0.025, typeOfDesign = "asOF", typeBetaSpending = "none")
simBonfSSREMAMS <- getSimulationMultiArmMeans(
  design = designIN,
  activeArms = 4,
  muMaxVector = seq(0, 1, 0.5),
  stDev = 1.5,
  plannedSubjects = c(30, 60, 90),
  intersectionTest = "Bonferroni",
  typeOfShape = "linear",
  typeOfSelection = "all",
  successCriterion = "all",
  minNumberOfSubjectsPerStage = c(22, 22, 22),
  maxNumberOfSubjectsPerStage = c(190, 190, 190),
  conditionalPower = 0.8,
  thetaH1 = 1,
  stDevH1 = 1.5,
  allocationRatioPlanned = 1.5,
  maxNumberOfIterations = maxNumberOfIterations, # defined earlier in the reporter's session
  seed = 1234
)

I see two odd things in the output. Firstly, the exit probabilities for futility are non-zero even though there is explicitly no futility stopping bound:
Exit probability for futility, mu_max = 0 0.5090 0.2666
Exit probability for futility, mu_max = 0.5 0.1914 0.0817
Exit probability for futility, mu_max = 1 0.0246 0.0258

Secondly, the reported stage-wise sample sizes after sample size re-assessment are the same for every experimental arm, even though the arms are not simulated with the same treatment effect (typeOfShape = "linear"):

Stagewise number of subjects, mu_max = 0.5
Treatment arm 1 30.0 75.3 38.7
Treatment arm 2 30.0 75.3 38.7
Treatment arm 3 30.0 75.3 38.7
Treatment arm 4 30.0 75.3 38.7
Control arm 20.0 50.2 25.8

Obviously I may have an incorrect input or be misinterpreting the results, but I've double-checked and don't see anything wrong, or anything that explains either of these behaviors in the output.

Improve documentation on `getSampleSizeSurvival()`

Hi,

First, thank you for this incredibly useful package.

I am trying to set up a GS design using rpact. The code looks like this:

design <- getDesignGroupSequential(kMax=3, typeOfDesign="asOF", 
                                   alpha=0.025, beta=0.2, sided=1)
designPlan <- getSampleSizeSurvival(design, pi1=1-0.66, pi2=1-0.54, eventTime=3,
                                    accrualTime=3, followUpTime=2)

I am trying to validate this design against the software we are using, East.
Please find attached the output of the same design: east.zip

I can find the same results, which is very satisfying 😁

However, the output is sometimes a bit confusing, and I couldn't find any extensive documentation:

  • Expected study duration = 4.40, although Analysis time = 2.43, 3.61, 5.00. No interim analysis is planned at t = 4.4; how could the study end at that moment? (See the sketch after this list.)
  • Expected number of events under [H0, H0/H1, H1]: what are these referring to? They are all different from Number of events per stage.
  • Moreover, the expected number of events (and therefore the analysis time) should differ under H0 and H1, as the survival hypotheses are different. This is the case in the East example.
  • Expected number of subjects under H1 = 461.4: How is that different from Number of subjects [1,2,3]? Aren't the latter under H1 too? Maybe the considered hypothesis would be worth adding to the output here.
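Presumably the expected study duration is the average of the analysis times weighted by the probability of stopping at each stage, which is why it need not coincide with any planned analysis time; a sketch of that assumed definition with placeholder stopping probabilities:

# assumed definition: expected duration = sum over stages k of
# P(stop at stage k) * analysis time at stage k
analysisTime <- c(2.43, 3.61, 5.00)
stopProb <- c(0.05, 0.35)                  # placeholder stopping probabilities, stages 1-2
weights <- c(stopProb, 1 - sum(stopProb))  # remaining mass reaches the final analysis
sum(weights * analysisTime)                # lands between the interim and final times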

Of note, it would be very nice if designPlan could hold a reference to design internally, as the design contains useful information for interpreting the plan later (like the type of design, for instance).
I'd be glad to make a PR if needed.

Minor fixes

  1. getDesignGroupSequential(futilityBounds = c(0,0), bindingFutility = FALSE) |> plot()
    Plot legend displays bindingFutility = TRUE

  2. getSampleSizeRates(groups = 1, thetaH0 = 0.2, riskRatio = FALSE)
    getPowerRates(groups = 1, thetaH0 = 0.2, riskRatio = FALSE, maxNumberOfSubjects = 100)
    There should be a warning that riskRatio = FALSE will be ignored (this is handled correctly for getSampleSizeMeans and getPowerMeans)

  3. getDesignGroupSequential() |> getSampleSizeSurvival() |> summary()
    Expected study duration should have the same format as Analysis time (change to 2 decimal places)

  4. design <- getDesignGroupSequential(typeOfDesign = "asOF")
    dataExample <- getDataset(cumEvents = c(67, 129), cumLogRanks = c(-1.1, -2.0))
    getAnalysisResults(design = design, dataInput = dataExample) |> summary()
    For a group sequential design, the second line in the summary should not read "Fixed weight" but "Planned information rate";
    "Fixed weight" is only for the combination test!

  5. Bug for pipe operator in analysis results:
    S <- getDataSet(
      events1 = c(11, 12),
      events2 = c(6, 7),
      n1 = c(36, 39),
      n2 = c(38, 40))
    R <- getDataSet(
      events1 = c(12, 10),
      events2 = c(8, 8),
      n1 = c(32, 33),
      n2 = c(31, 29))
    getDesignInverseNormal(kMax = 2) |> getDataSet(S1 = S, R = R) |>
      getAnalysisResults(stage = 1, nPlanned = 150, intersectionTest = "Simes") |> pull(conditionalPower)

Installation Error rpact

Hello,
I am following the example you put up at
www.rpact.org/vignettes/planning/rpact_binary_planning_example
I am trying to install the rpact package on my Mac and I am getting an error message that leads to an installation failure. Specifically, I see this error:

make: gfortran: No such file or directory

I also tried to download a developer version, but I still see the same error.
Could you guide me on how to install this package? The full log is:
installing *source* package ‘rpact’ ...
** package ‘rpact’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang++ -mmacosx-version-min=10.13 -std=gnu++14
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Library/Frameworks/R.framework/Versions/4.1/Resources/library/Rcpp/include'
-I/usr/local/include   -fPIC  -Wall -g -O2  -c RcppExports.cpp -o
RcppExports.o
clang++ -mmacosx-version-min=10.13 -std=gnu++14
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Library/Frameworks/R.framework/Versions/4.1/Resources/library/Rcpp/include'
-I/usr/local/include   -fPIC  -Wall -g -O2  -c f_as251.cpp -o f_as251.o
gfortran -mmacosx-version-min=10.13 -fno-optimize-sibling-calls  -fPIC  -Wall
-g -O2  -c f_as251_fortran.f -o f_as251_fortran.o
make: gfortran: No such file or directory
make: *** [f_as251_fortran.o] Error 1
ERROR: compilation failed for package ‘rpact’
* removing
‘/Library/Frameworks/R.framework/Versions/4.1/Resources/library/rpact’
The downloaded source packages are in
‘/private/var/folders/tf/m_6xpy4j237b43dg4yfd53mw0000gn/T/Rtmpr9ciqn/downloaded_packages’
Warning messages:
1: In readRDS(dest) : lzma decoder corrupt data
2: In install.packages("rpact") :
installation of package ‘rpact’ had non-zero exit status 

plot(getDesignSet(...)) fails when one design has kMax = 1

library(rpact)

design1 <- getDesignGroupSequential(
  kMax = 1,
  sided = 1,
  typeOfDesign = "asOF",
  typeBetaSpending = "bsOF")

design2 <- getDesignGroupSequential(
  kMax = 2,
  sided = 1,
  typeOfDesign = "asOF",
  typeBetaSpending = "bsOF")

design4 <- getDesignGroupSequential(
  kMax = 4,
  sided = 1,
  typeOfDesign = "asOF",
  typeBetaSpending = "bsOF")

# This works:
plot(getDesignSet(designs = c(design2, design4),
                  variedParameters = "kMax"))

# This fails:
plot(getDesignSet(designs = c(design1, design4),
                  variedParameters = "kMax"))
# Error in rbind(deparse.level, ...) : 
#   numbers of columns of arguments do not match

getSimulationSurvival, thetaH0 and conditional power

Hi!
First of all thank you for this very nice and needed package!

I am trying to simulate an adaptive design using getSimulationSurvival(). I would also like to use thetaH0 = 0.9 instead of the default thetaH0 = 1. I am wondering whether this parameter is respected during the simulation and/or in the results.

library(rpact)
library(dplyr)
dIN <- getDesignInverseNormal(kMax=2,
                              alpha=0.025,
                              beta=0.1,
                              sided=1,
                              typeOfDesign = "noEarlyEfficacy",
                              informationRates = c(19/125, 1),
                              futilityBounds = c(-1))



dIN_sim <-  getSimulationSurvival(design = dIN, 
                                  pi2 = 0.003,
                                  eventTime = 12,
                                  hazardRatio = c(0.5),
                                  dropoutRate1 = 0.15, dropoutRate2 = 0.15, dropoutTime = 12,
                                  accrualTime = c(0, 20),
                                  accrualIntensity = c(1000),
                                  thetaH0 = 0.9,
                                  directionUpper = FALSE,
                                  plannedEvents = c(19, 125),              #cumulated
                                  maxNumberOfEventsPerStage = c(NA, 200),  #max events
                                  minNumberOfEventsPerStage = c(NA, 106),  #min events (per stage, not cumulative!)
                                  conditionalPower = 0.9,
                                  seed = 1910)

Since I would like to adapt the simulation manually (specifically, to restart subject accrual at the time of the interim based on conditional power and/or pi2), I tried to replicate scenarios from the simulation data using the smaller functions getStageResults() and getConditionalPower().

simulation_replicates <- getData(dIN_sim) %>% mutate(iterationNumber = factor(iterationNumber))
interesting_case <- simulation_replicates %>% filter(iterationNumber == 12)
interesting_case  # sample size increased slightly from 106 to 131, i.e. overall 150 cumulative events in stage 2

which gives me

iterationNumber stageNumber         pi1   pi2 hazardRatio analysisTime numberOfSubjects overallEvents1 overallEvents2 eventsPerStage rejectPerStage
1              12           1 0.001501127 0.003         0.5     15.34246            15342              7             12             19              0
2              12           2 0.001501127 0.003         0.5     84.62074            20000             49            101            131              1
  eventsNotAchieved futilityPerStage testStatistic logRankStatistic conditionalPowerAchieved pValuesSeparate trialStop hazardRatioEstimateLR
1                 0                0     0.9191049        0.9191049                       NA    0.1790203579     FALSE             0.6559214
2                 0                0     3.6488241        3.6648582                0.7509866    0.0001774007      TRUE             0.5496526

Plugging the results of the getSimulationSurvival() interim analysis into getDataset(), followed by getStageResults() and getConditionalPower(), to the best of my ability:

# with thetaH0 = 0.9
cp_with_thetah0_09 <- getConditionalPower(
  stageResults = getStageResults(
    dIN,
    dataInput = getDataset(
      cumulativeEvents = c(19),
      cumulativeLogRanks = c(-0.9191049), # minus!
      cumulativeAllocationRatios = c(1)),
    directionUpper = FALSE,
    stage = 1,
    thetaH0 = 0.9),
  thetaH1 = 0.6559214, # use HR of stage 1
  nPlanned = 131)      # adaptively determined SSR for stage 2


# with thetaH0 = 1
cp_with_thetah0_1 <- getConditionalPower(
  stageResults = getStageResults(
    dIN,
    dataInput = getDataset(
      cumulativeEvents = c(19),
      cumulativeLogRanks = c(-0.9191049), # minus!
      cumulativeAllocationRatios = c(1)),
    directionUpper = FALSE,
    stage = 1,
    thetaH0 = 1),
  thetaH1 = 0.6559214, # use HR of stage 1
  nPlanned = 131)      # adaptively determined SSR for stage 2

Comparing the results for reported conditional power:

> cp_with_thetah0_09$conditionalPower
[1]        NA 0.4896989
> cp_with_thetah0_1$conditionalPower
[1]        NA 0.7497147
> interesting_case$conditionalPowerAchieved
[1]        NA 0.7509866

which made me suspect that the reported conditional power of getSimulationSurvival() does not respect the thetaH0 parameter. Or is $conditionalPower not to be confused with $conditionalPowerAchieved?
Additionally, why do I need to plug the negative LR statistic into the getStageResults(getDataset()) call? I suspect the reason is the directionUpper argument, but I fail to see why getSimulationSurvival() does not report the negative LRs that I would expect for HRs < 1.

Thank you!

Final CI for non-inferiority sequential design

Aim

We believe that there is something odd in the calculation of the final CIs at the first stage of a group sequential design in RPACT for non-inferiority designs. We believe that this potential bug is only present for non-inferiority designs and only at the first stage.

Our reasoning is based on the following:

  1. The results we get (median unbiased estimate, final CIs (lower + upper)) from a minimal reproducible
    example are different from what we would anticipate
  2. If we pretend that the design is not sequential, we should get CIs identical to the first-stage final CIs
    in a non-inferiority GSD context. But we do not.
  3. We get something different from the RPACT results when we calculate the confidence intervals by hand.
  4. The source code seems to adjust for the non-inferiority margin at all other interim stages and at the
    final analysis, but not at the very first.

With this document, we wish to illustrate the potential bug in RPACT in the first-stage adjusted confidence intervals and median unbiased estimate for a non-inferiority design.

Minimal reproducible example

library(rpact)

# our design
design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.1,
    informationRates = c(2/3, 1), typeOfDesign = "asOF"
)

# our fake data - create a scenario where we stop right at the boundary
fakeData2 <- getDataSet(
    overallEvents = c(1067), # number of events corresponding to 2/3s
    overallLogRanks = c(-1.713),
    overallAllocationRatio = c(1)
)
adjust_noninferiority <- getAnalysisResults(design = design,
    dataInput = fakeData2,
    directionUpper = FALSE,
    thetaH0 = 1.05)

# We look at the final stage (first stage in this example) confidence intervals
adjust_noninferiority

Here we anticipated that the final CI's upper boundary would be closer to 1.05 (thetaH0) and that the median unbiased estimate would equal the observed cumulative effect size.
To check that the CI is incorrect, we can use the fact that the stage-wise confidence interval at the first stage should equal the naively calculated confidence interval obtained from the fixed design model.

Fixed design - no interim

If we assume that it was not a sequential design, we get the following confidence intervals:

# our design
design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.1,
    informationRates = c(1)
)

# creating data that stops right at the boundary
fakeData <- getDataSet(
    overallEvents = c(1067),
    overallLogRanks = c(-1.713),
    overallAllocationRatio = c(1)
)

# get the naive and adjusted inference
adjust_noninferiority <- getAnalysisResults(design = design,
    dataInput = fakeData,
    directionUpper = FALSE,
    thetaH0 = 1.05)
adjust_noninferiority

Here the upper confidence interval crosses one and is different from the GSD final CI.

Calculating Final CIs (upper limit) by hand

We are going to find the value of μ = log(HR) for which there is a 2.5% probability of the test statistic being greater than -1.713 (non-inferiority) or -2.5095 (superiority).

Starting with non-inf:

upper_ci_noninf <- function(x) pnorm(-1.713, mean = (log(x))*sqrt(1067)/2) - 0.025
uniroot(upper_ci_noninf, lower = 0.7, upper = 1.2)$root

We get an upper limit of 1.015, which is different from the RPACT result but the same as the fixed design CI.
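The analogous computation for the superiority boundary value mentioned above can be sketched with the same construction (this snippet is our addition, not part of the original output):

# same construction for the superiority log-rank of -2.5095
upper_ci_sup <- function(x) pnorm(-2.5095, mean = (log(x))*sqrt(1067)/2) - 0.025
uniroot(upper_ci_sup, lower = 0.7, upper = 1.2)$root
# approx. 0.967: an upper limit below 1, consistent with rejecting superiority at the boundary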

Comparing non-inferiority to superiority

If the trial were a superiority trial instead, we could re-use the non-inferiority group sequential design but without specifying the thetaH0 argument. Here we observe that if we reject the null hypothesis at the first stage, we get final CIs and a median unbiased estimate identical to those in the non-inferiority analysis. However, all other measures of inference differ in this scenario:

# superiority fake data - creating data that stops right at the boundary
fakeData <- getDataSet(
    overallEvents = c(1067),
    overallLogRanks = c(-2.5095),
    overallAllocationRatio = c(1)
)

# get the naive and adjusted inference
adjust_superiority <- getAnalysisResults(design = design,
    dataInput = fakeData,
    directionUpper = FALSE)
adjust_superiority

Notice that to do the analysis on the non-inferiority scale, we needed to change the log-rank statistic to equal the non-inferiority log-rank, using the formula from Chen and Chen (Chen YH, Chen C. Testing superiority at interim analyses in a non-inferiority trial. Stat Med. 2012 Jul 10;31(15):1531-42. doi: 10.1002/sim.5312. Epub 2012 Mar 22. PMID: 22438208).

We believe that the bug in RPACT is due to a missing adjustment for non-inferiority in the final CIs at the first stage. Furthermore, we believe that the non-inferiority analysis' median unbiased estimate and final CIs incorrectly equal the same three measures of inference from a superiority trial.
This ends the argument for the potential bug.


Possible inconsistency in getSimulationSurvival() with specified thetaH0 and thetaH1

Hello,
I am trying to understand thetaH1 in getSimulationSurvival(). I am interested in an adaptive design involving a thetaH0 bound of 0.9. Note that I am interested in testing H0: HR >= 0.9 vs. H1: HR < 0.9, so the 0.9 bound is not a non-inferiority bound in the classical sense. In the first scenario, I ran a simulation without specifying thetaH1, which should use the test statistics of stages 1 and 2 and the observed hazard ratio at stage 2 to reassess the required events for stage 3. I picked out an iteration that had a hazard ratio of 0.4503625 at stage 2. Then I re-ran the simulation, using the same seed, with thetaH1 = 0.4503625 and looked at the same iteration again. Given the same seed, the results up to stage 2, including the test statistic and observed hazard ratio, were expected to be the same (and they are). However, this time the reassessment, which to my understanding only requires the test statistics of the previous stages (identical between the two scenarios) and an assumed hazard ratio (which I made sure was the same via my specification of thetaH1), reported a number of events for stage 3 that differs from the first scenario.

My code is below. Unless my understanding of the subject is incorrect, I suspect some kind of inconsistency regarding the thetaH0 bound between the two scenarios.

Kind regards
Tim

library(dplyr)
library(rpact)
futilityBounds <- c(0, 0.688)
N <- 14000
informationRates <- c(0.4, 0.6, 1)

dIN <- getDesignInverseNormal(kMax = 3,
                              alpha = 0.025,
                              beta = 0.1,
                              sided = 1,
                              informationRates = informationRates,
                              futilityBounds = futilityBounds,
                              typeOfDesign = "asUser",
                              userAlphaSpending = c(0, 0.01, 0.025)
)



myCalcEventsFunction_with_division_by_thetaH0 <- function(...,
                                 stage, thetaH0, conditionalPower, estimatedTheta, 
                                 plannedEvents, eventsOverStages,
                                 minNumberOfEventsPerStage, maxNumberOfEventsPerStage, allocationRatioPlanned,
                                 conditionalCriticalValue) {

  theta <- max(1 + 1e-12, estimatedTheta)
  cat(paste0("conditional critical value: ", conditionalCriticalValue, "\n"))
  cat(paste0("estimated theta: ", estimatedTheta, "\n"))
  requiredStageEvents <-
    max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
  requiredStageEvents <- min(
    max(minNumberOfEventsPerStage[stage], requiredStageEvents),
    maxNumberOfEventsPerStage[stage]
  ) + eventsOverStages[stage - 1]
  return(requiredStageEvents)
}


# wrapper function for convenience; creates the same simulation, differing only in thetaH1 and calcEventsFunction
sim_wrapper <- function(thetaH1 = NA_real_, calcEventsFunction = NULL) {
  getSimulationSurvival(design = dIN,
                        thetaH0 = 0.9,
                        directionUpper = FALSE,
                        pi2 = 0.003,
                        hazardRatio = c(0.4),
                        eventTime = 12,
                        dropoutRate1 = 0.08,
                        dropoutRate2 = 0.08,
                        dropoutTime = 12,
                        accrualTime = c(0, 0.01),
                        maxNumberOfSubjects = N,
                        plannedEvents = c(20, 30, 50),
                        maxNumberOfIterations = 75,
                        longTimeSimulationAllowed = TRUE,
                        seed = 1910,
                        showStatistics = TRUE,
                        maxNumberOfRawDatasetsPerStage = 2,
                        minNumberOfEventsPerStage = c(20, 10, 20),
                        maxNumberOfEventsPerStage = c(20, 10, 45),
                        conditionalPower = 0.9,
                        calcEventsFunction = calcEventsFunction,
                        thetaH1 = thetaH1)
}

#standard, use observed HR as thetaH1, interesting iteration 75 chosen
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#at stage 2, some outputs:
#estimated HR: 0.4503625
#test statistic: 1.896057
#-> for stage 3: increase single events from 20 to 41

#using the same seed, the data looks the same up to IA2. Now using the HR that we have just seen from iteration 75 in the 
#"use interim HR as theta case" (0.4503625) as a fixed thetaH1.
#-> if I understand the theory correctly, it should lead to the same SSR, as the test statistics of stage 1 and 2 are identical, 
#and the theta for the SSR is the same
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#first of all consistency checks, estimated HR and test statistic of stage 2 are equal to the case above...
#estimated HR: 0.4503625
#test statistic: 1.896057
#BUT, the ssr now says we need 30 singular events for stage 3 instead of the 41 from above.

#now with custom function, translated from cpp code to the best of my ability, so it should be the same as rpact native if done correctly
#since we are using only 75 iterations, and we are interested in iteration 75, the last printed line in your console corresponds to the 
#conditional critical value and estimated theta of that iteration 75 (didn't know how to do that more neatly)
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 (from console print...)
#estimated theta: 1.79855127135391 (from console print...)
#results look the same as the native rpact ssr function with 41 single events at stage 3
#but focus on estimatedTheta which gets passed internally
#estimated theta: 1.79855127135391
#while hazardRatioEstimateLR 0.4503625
#obviously there is a directionUpper mirroring
#but 1/1.79855127135391 != 0.4503625
#maybe estimatedTheta already the theta which would lead to the same test-statistic given thetaH0 = 0.9
#but 1/(0.4503625/0.9) !=  1.79855127135391
#however 1/(0.4503625/0.9) == 1.79855127135391/0.9

#now with a fixed theta, again, we should not have differences here for iteration 75 as the 
#chosen fixed theta is exactly the theta observed in stage 2 of iteration 75
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 -> same as before, as expected
#estimated theta: 1.99839018568375 -> not the same
#this time this equation holds
#1/(0.4503625/0.9) == 1.99839018568375



#now I try this by hand, on the negative side of the normal distribution first so I don't get confused by the directionUpper mirroring...
#max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
min(0, -0.924377935201586 - qnorm(0.9))^2 * (((1+1)^2)/1) / log(0.4503625/0.9)^2
#40.6071 -> I guess this would be consistent with the 41 from scenario 1 with thetaH1=NA_real_ and rpact native ssr function
#however, I plugged in the true HR of 0.4503625
#if I want to do this the directionUpper way (applying thetaH0 first, then mirroring), trivially the same as log(1) is 0.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(1/(0.4503625/0.9))^2
#40.6071


#what happens in the fixed thetah1 case where it said we need 30 events for stage 3?
#estimatedTheta internally is 1.99839018568375, i.e. 1/(0.4503625/0.9)
#but then gets plugged in 
##max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
#which divides it again by 0.9
#i.e.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(((1/(0.4503625/0.9))/0.9))^2
#30.58873
#((1/(0.4503625/0.9))/0.9) = 2.220434 = 1/0.4503625

calcSubjectsFunction: should clarify that allocationRatioPlanned is treated as a vector instead of a scalar

You probably want to update the examples in the manual/vignette about custom sample size re-calculation functions to make it clear that allocationRatioPlanned is treated as a vector in the back end, not a scalar. Please see the example below.

simpowerCPZ shows this error message:

Error in fun(alternative = alternative, kMax = design$kMax, maxNumberOfIterations = maxNumberOfIterations, :
Evaluation error: the condition has length > 1.

simpowerCPZ2 uses allocationRatioPlanned[stage] instead and it runs fine.
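For context, a minimal illustration of why a vector-valued allocationRatioPlanned triggers exactly this error (our repro, not rpact code): arithmetic silently recycles over the vector, and a downstream if () condition of length > 1 is an error in R >= 4.2:

allocationRatioPlanned <- c(1.5, 1.5)  # one entry per stage
x <- (1 + allocationRatioPlanned)^2 / allocationRatioPlanned  # silently length 2
if (x > 1) "unbalanced"  # Error in if (x > 1): the condition has length > 1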

library(rpact)
n1 <- 8    # interim sample size/arm
Nmin <- 15 # min sample size/arm
Nmax <- 23 # max sample size/arm

designc <- getDesignInverseNormal(sided = 1, alpha = 0.05, beta = 0.2, kMax = 2,
  informationRates = c(n1/Nmin, 1),
  typeOfDesign = "noEarlyEfficacy", typeBetaSpending = "bsHSD", gammaB = -3)

# custom SSR function
myCPZSampleSizeCalculationFunction <- function(..., stage,
    minNumberOfSubjectsPerStage,
    maxNumberOfSubjectsPerStage,
    conditionalPower,
    conditionalCriticalValue,
    allocationRatioPlanned,
    thetaH1,
    stDevH1) {
  # function adapted from the example in ?getSimulationMeans
  # note: conditionalCriticalValue gives the critical value on the mean/proportion
  # level of the current stage data alone, instead of the overall cumulative
  # critical values on the z-scale (uk) at the current stage; it already takes
  # care of 1. the weights-based inverse normal method and 2. the observed data
  # from previous stages, and gives critical values consistent with the design
  calculateStageSubjects <- function(cp) {
    (1 + allocationRatioPlanned)^2 / allocationRatioPlanned *
      (max(0, conditionalCriticalValue + stats::qnorm(cp)))^2 /
      (max(1e-12, thetaH1 / stDevH1))^2
  }
  # calculate sample size required to reach maximum desired conditional power
  # cp_max (provided as argument conditionalPower)
  stageSubjectsCPmax <- calculateStageSubjects(cp = conditionalPower)
  # calculate sample size required to reach minimum desired conditional power
  # cp_min (manually set for this example to 0.7)
  stageSubjectsCPmin <- calculateStageSubjects(cp = 0.7)
  # define stageSubjects
  stageSubjects <- ceiling(min(max(minNumberOfSubjectsPerStage[stage], stageSubjectsCPmax),
    maxNumberOfSubjectsPerStage[stage]))
  # set stageSubjects to the minimal sample size in case the minimum conditional
  # power cannot be reached with the available sample size
  if (stageSubjectsCPmin > maxNumberOfSubjectsPerStage[stage]) {
    stageSubjects <- minNumberOfSubjectsPerStage[stage]
  }
  # return sample size
  return(stageSubjects)
}

# custom SSR function (fixed version)
myCPZSampleSizeCalculationFunction2 <- function(..., stage,
    minNumberOfSubjectsPerStage,
    maxNumberOfSubjectsPerStage,
    conditionalPower,
    conditionalCriticalValue,
    allocationRatioPlanned,
    thetaH1,
    stDevH1) {
  # identical to the function above, except that allocationRatioPlanned is
  # indexed by stage, which avoids the "condition has length > 1" error
  calculateStageSubjects <- function(cp) {
    (1 + allocationRatioPlanned[stage])^2 / allocationRatioPlanned[stage] *
      (max(0, conditionalCriticalValue + stats::qnorm(cp)))^2 /
      (max(1e-12, thetaH1 / stDevH1))^2
  }
  # calculate sample size required to reach maximum desired conditional power
  # cp_max (provided as argument conditionalPower)
  stageSubjectsCPmax <- calculateStageSubjects(cp = conditionalPower)
  # calculate sample size required to reach minimum desired conditional power
  # cp_min (manually set for this example to 0.7)
  stageSubjectsCPmin <- calculateStageSubjects(cp = 0.7)
  # define stageSubjects
  stageSubjects <- ceiling(min(max(minNumberOfSubjectsPerStage[stage], stageSubjectsCPmax),
    maxNumberOfSubjectsPerStage[stage]))
  # set stageSubjects to the minimal sample size in case the minimum conditional
  # power cannot be reached with the available sample size
  if (stageSubjectsCPmin > maxNumberOfSubjectsPerStage[stage]) {
    stageSubjects <- minNumberOfSubjectsPerStage[stage]
  }
  # return sample size
  return(stageSubjects)
}
simpowerCPZ <- getSimulationMeans(designc,
  alternative = 0.65, stDev = 0.7,
  # cumulative overall sample size
  plannedSubjects = 2 * c(n1, Nmin),
  # stage-wise minimal overall sample size
  minNumberOfSubjectsPerStage = 2 * c(n1, (Nmin - n1)),
  # stage-wise maximal overall sample size
  maxNumberOfSubjectsPerStage = 2 * c(n1, (Nmax - n1)),
  conditionalPower = 0.8,
  thetaH1 = 0.5, stDevH1 = 0.7,
  maxNumberOfIterations = 1000,
  seed = 12345,
  calcSubjectsFunction = myCPZSampleSizeCalculationFunction)

simpowerCPZ2 <- getSimulationMeans(designc,
  alternative = 0.65, stDev = 0.7,
  # cumulative overall sample size
  plannedSubjects = 2 * c(n1, Nmin),
  # stage-wise minimal overall sample size
  minNumberOfSubjectsPerStage = 2 * c(n1, (Nmin - n1)),
  # stage-wise maximal overall sample size
  maxNumberOfSubjectsPerStage = 2 * c(n1, (Nmax - n1)),
  conditionalPower = 0.8,
  thetaH1 = 0.5, stDevH1 = 0.7,
  maxNumberOfIterations = 1000,
  seed = 12345,
  calcSubjectsFunction = myCPZSampleSizeCalculationFunction2)

Wrong calculation of accrual intensity for fixed sample survival design

Example where $accrualIntensity is wrong

What I get

I'm trying to find the absolute accrual intensity (number recruited per time unit) when recruitment is twice as fast in the second period as in the first.

sampleSize <- getSampleSizeSurvival(alpha = 0.025, 
    sided = 1, 
    beta = 0.1,
    hazardRatio = 0.7, 
    lambda2 = log(2) / 9,
    accrualTime = c(0, 3, 12),
    accrualIntensity = c(0.1, 0.2),
    accrualIntensityType = "relative",
    followUpTime = 6)
sampleSize$accrualIntensity

[1] 215.90555 53.97639

What I expect

...so doing this manually for now

m <- sampleSize$maxNumberOfSubjects

# accrual per month during the first 3 months
m / (3 * 0.1 + 9 * 0.2) * 0.1

[1] 30.84365

# accrual per month during months 4 to 12
m / (3 * 0.1 + 9 * 0.2) * 0.2

[1] 61.6873

Inconsistent naming of variables: Single and cumulative events per stage

Current Situation:

TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)

  • eventsPerStage: Cumulative events per stage

SimulationResultsSurvival (getSimulationSurvival)

  • eventsPerStage: Number of events per stage
  • overallEventsPerStage: Cumulative events per stage

SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)

  • eventsPerStage: Cumulative events per stage {"per treatment arm compared to control"}
  • singleNumberOfEventsPerStage: Single number of events {"per treatment arm"}

The ideal change would be as follows:

TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)

  • eventsPerStage: Number of events per stage
  • cumulativeEventsPerStage: Cumulative events per stage

SimulationResultsSurvival (getSimulationSurvival)

  • eventsPerStage: Number of events per stage
  • cumulativeEventsPerStage: Cumulative events per stage

SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)

  • eventsPerStage: Number of events per stage {"per treatment arm compared to control"}
  • cumulativeEventsPerStage: Cumulative events per stage {"per treatment arm compared to control"}
  • singleNumberOfEventsPerStage: Single number of events {"per treatment arm"}

Issue

  • Users already pointed out this problem with the inconsistent variable names in 2019/2020
  • The R scripts of the "rpact Power Users" are now based on these inconsistent names

Conflicts/Solution Proposals:

TrialDesignPlanSurvival (getSampleSizeSurvival / getPowerSurvival)

  • eventsPerStage: Cumulative events per stage (unfortunately, this name is taken and cannot stand for "Number of events per stage")
  • cumulativeEventsPerStage: Cumulative events per stage

SimulationResultsSurvival (getSimulationSurvival)

  • eventsPerStage: Number of events per stage
  • cumulativeEventsPerStage: Cumulative events per stage
  • overallEventsPerStage: Cumulative events per stage (must remain as a duplicate and be marked as "deprecated")

SimulationResults[MultiArm/Enrichment]Survival (getSimulationMultiArmSurvival / getSimulationEnrichmentSurvival)

  • eventsPerStage: Cumulative events per stage (unfortunately, this name is taken and cannot stand for "Number of events per stage")
  • cumulativeEventsPerStage: Cumulative events per stage
  • singleNumberOfEventsPerStage: Single number of events {"per treatment arm"}

In the aggregated simulation results, there are also

  • eventsPerStage1 = "Observed events by stage (1)"
  • eventsPerStage2 = "Observed events by stage (2)"

The names should be changed to "Observed events per stage"

To implement a consistent solution that still runs with existing scripts, it would be best to find a new name for "eventsPerStage", something like "singleEventsPerStage". To ensure a clear distinction from "singleNumberOfEventsPerStage", the latter should be renamed to "singleEventsPerStagePerArm".
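A minimal, package-agnostic sketch of the deprecation pattern this implies (illustrative only; rpact's actual class system may handle this differently):

# forward a deprecated field name to its replacement with a warning
getDeprecatedField <- function(object, oldName, newName) {
  warning(sprintf("'%s' is deprecated; please use '%s' instead.", oldName, newName),
          call. = FALSE)
  object[[newName]]
}

# hypothetical usage once the renaming is in place:
# getDeprecatedField(simResults, "overallEventsPerStage", "cumulativeEventsPerStage")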

Add support for population selection based on predictive/posterior probabilities in getSimulationEnrichmentMeans/Rates/Survival

Originally opened by xinzhn

I am wondering if it would be possible to add an additional option (e.g., "userDefined") for the argument effectMeasure in getSimulationEnrichmentMeans/Rates/Survival, to allow a user-defined function based on effect measures other than effectEstimate and testStatistic for selectPopulationsFunction. Examples of such alternatives are population-specific predictive probabilities and subgroup-specific posterior probabilities [1]. This might be implemented by providing additional arguments to the customized function selectPopulationsFunction to facilitate the calculation of both types of probabilities.

Just a thought for consideration: it may be convenient to borrow some arguments from calcSubjectsFunction (the customized function for sample size re-calculation), but provide them for populations as well as subgroups. Specifically, we may need (see the sketch after this list):

  • thetaH1, overallEffects, and stDevH1 for continuous endpoints.
  • piTreatmentH1, piControlH1, overallRatesTreatment, and overallRatesControl for binary endpoints.
  • thetaH1 and overallEffects for survival endpoints.
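A hypothetical sketch of such a selection rule for a binary endpoint, assuming the proposed arguments above were available (none of these argument names exist in the current rpact API):

# select populations whose posterior probability of benefit exceeds 0.8,
# using Beta(1, 1) priors on the event rates
mySelectionByPosterior <- function(..., stage, overallRatesTreatment,
                                   overallRatesControl, plannedSubjects) {
  nPerArm <- plannedSubjects[stage] / 2  # crude per-arm count, for illustration
  sapply(seq_along(overallRatesTreatment), function(i) {
    pt <- rbeta(1e4, 1 + overallRatesTreatment[i] * nPerArm,
                1 + (1 - overallRatesTreatment[i]) * nPerArm)
    pc <- rbeta(1e4, 1 + overallRatesControl[i] * nPerArm,
                1 + (1 - overallRatesControl[i]) * nPerArm)
    mean(pt < pc) > 0.8  # posterior P(treatment rate < control rate)
  })
}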

Footnotes

  1. Brannath, W., Zuber, E., Branson, M., Bretz, F., Gallo, P., Posch, M., and Racine-Poon, A. (2009), “Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology,” Statistics in Medicine, 28, 1445–1463.

Incorrect legend when plotting designs with/without binding futility

# Compare nonbinding and binding futility bounds
design1 <- getDesignGroupSequential(
  kMax = 4, alpha = 0.05,
  sided = 1,
  informationRates = 1:4/4,
  typeOfDesign = "asOF",
  typeBetaSpending = "bsOF")

design2 <- getDesignGroupSequential(
  kMax = 4, alpha = 0.05,
  sided = 1,
  informationRates = 1:4/4,
  typeOfDesign = "asOF",
  typeBetaSpending = "bsOF",
  bindingFutility = TRUE)

plot(getDesignSet(designs = c(design1, design2),
                  variedParameters = "bindingFutility"))
# The final legend line is:
# TRUE, Futility bound (non-binding)
# The TRUE indicates that futility is binding.
# The "(non-binding)" is incorrect.

# In contrast
plot(design2)
# has the second line in the legend "Futility bound (binding)"
