bozenne / buysetest

Generalized Pairwise Comparisons

License: GNU General Public License v3.0

R 69.21% C++ 14.44% TeX 16.35%

statistics r non-parametric generalized-pairwise-comparisons

buysetest's Issues

Issue with restriction in some cases

Dear Brice,

I'm sorry to bother you again. I have found a bug in the package, still present in version "3.0.0".

It seems to be triggered by two non-time-to-event (non-TTE) variables followed by a restricted TTE variable in the hierarchy. Note that adding any kind of variable before the two non-TTE variables crashes as well. The common pattern seems to be a formula that ends with three consecutive variables: any combination of two non-TTE variables and one restricted TTE variable.

Here is a reproducible example of the bug:

library(BuyseTest)
set.seed(1)
dt <- simBuyseTest(n.T = 50, n.C = 50,
                   names.strata = "strat_column", n.strata = 2,
                   argsTTE = list(name = c("tte1"), 
                                  name.censoring = c("cnsr1"),
                                  scale.T = c(200), scale.censoring.T = c(10^5),
                                  scale.C = c(200), scale.censoring.C = c(10^5)),
                   argsBin = list(p.T = list(c(0.4,0.6))),
                   argsCont = list(mu.T = c(0.3), sigma.T = 1)
                  )

formula1 <- treatment ~ strat_column +
  bin(toxicity, operator = "<0") +
  cont(score, threshold = 0.5) +
  tte(tte1, status = cnsr1, threshold = 10, restriction = 365)

GPC <- BuyseTest(formula1,
                    data = dt,
                    trace = FALSE)

Thanks in advance for your help.
Kind regards,
Samuel

Asymptotic inference error

The method.inference = "asymptotic" option is returning erroneous confidence intervals.

An example is:

library(BuyseTest)
data(veteran, package = "survival")

BT.keep <- BuyseTest(trt ~ tte(time, threshold = 20, censoring = "status") + cont(karno),
                     data = veteran, keep.pairScore = TRUE, method.tte = "Gehan", 
                     trace = 0, method.inference = "asymptotic")

summary(BT.keep, statistic = "winRatio")

which returns a 95% CI of [-1.1414, 2.7763]. However, the lower confidence limit should be non-negative, since the win ratio is a ratio of two non-negative quantities.

I am getting similar results on my own datasets with purely TTE outcomes.
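To illustrate why a symmetric interval can go negative here, the following is a minimal sketch in base R. It assumes the reported interval is a symmetric Wald interval (estimate plus or minus a normal quantile times a standard error) and back-calculates the estimate and standard error from it. It then builds an interval on the log scale via the delta method, which is one standard way to keep a ratio's interval positive. This is not necessarily what BuyseTest does or should do; the variable names are hypothetical.

```r
# Assumption: the reported 95% CI is a symmetric Wald interval,
# so the point estimate is its midpoint.
ci_reported <- c(-1.1414, 2.7763)
z <- qnorm(0.975)

estimate <- mean(ci_reported)          # back-calculated win ratio
se <- diff(ci_reported) / (2 * z)      # back-calculated standard error

# Wald interval on the original scale: reproduces the negative lower limit.
ci_wald <- estimate + c(-1, 1) * z * se

# Delta method: se(log(WR)) is approximately se(WR) / WR.
se_log <- se / estimate

# Interval built on the log scale and back-transformed: always positive.
ci_log <- exp(log(estimate) + c(-1, 1) * z * se_log)

ci_wald
ci_log
```

The back-transformed interval is asymmetric around the estimate, which is expected for a ratio.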

Equality with threshold

It appears that the pairwise comparison in the following lines of code does not match what was done in Buyse (2010).

https://github.com/bozenne/BuyseTest/blob/087f54d575ab5d10d0f8e06b199173f139b5057a/src/FCT_calcOnePair.h#L48:L55

Namely, if $X - Y > \tau$, Table II in Buyse (2010) declares the pair "favourable". The code here, however, uses $X - Y \ge \tau$.

The difference is subtle, but I thought I would highlight it in case you want to make this clear in the documentation.

If you do change the code to "correct" this, would it be possible to add an option that keeps the old rule, i.e. $\ge$?
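The difference between the two rules only shows up when the observed difference equals the threshold exactly. A minimal sketch, assuming a continuous endpoint where larger values are better; `score_pair` and its `strict` argument are hypothetical names for illustration and are not part of BuyseTest:

```r
# Hypothetical helper (not part of BuyseTest): score one pair (x, y)
# on an endpoint where larger values are better.
# strict = TRUE  uses  x - y >  tau  (Table II in Buyse, 2010)
# strict = FALSE uses  x - y >= tau  (the rule described in this issue)
score_pair <- function(x, y, tau, strict = TRUE) {
  d <- x - y
  exceeds <- function(v) if (strict) v > tau else v >= tau
  if (exceeds(d)) {
    "favourable"
  } else if (exceeds(-d)) {
    "unfavourable"
  } else {
    "neutral"
  }
}

# A difference exactly equal to the threshold is where the rules disagree.
score_pair(1.5, 1.0, tau = 0.5, strict = TRUE)   # "neutral"
score_pair(1.5, 1.0, tau = 0.5, strict = FALSE)  # "favourable"

# Away from the boundary both rules agree.
score_pair(2.0, 1.0, tau = 0.5, strict = TRUE)   # "favourable"
score_pair(2.0, 1.0, tau = 0.5, strict = FALSE)  # "favourable"
```

In this sketch the non-strict rule with `tau = 0` would score two equal values as favourable, so an implementation using $\ge$ has to treat a zero threshold separately.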

powerBuyseTest with Gehan statistic

The following code throws an error:

library(BuyseTest)

args <- list(rates.T = c((3:5) / 10), rates.Censoring.T = rep(1, 3))

simFCT <- function(n.C, n.T) {
  simBuyseTest(100, argsBin = NULL, argsCont = NULL, argsTTE = args)
}

powerBuyseTest(sim = simFCT, sample.size = c(100), n.rep = 2,
               formula = treatment ~ tte(eventtime1, status = status1),
               method.inference = "u-statistic",
               scoring.rule = "Gehan")

The error is: Error in FUN(X[[i]], ...) : argument "censoring" is missing, with no default

For some reason the code works if I change status = status1 to censoring = status1. In my view, this case should throw an error message.

simBuyseTest time to event data: rates vs. scales

When running

#### only TTE endpoints ####
args <- list(rates.T = c(3:5/10), rates.Censoring.T = rep(1,3))
dat <- simBuyseTest(100, argsBin = NULL, argsCont = NULL, argsTTE = args)

I think the rates are actually scales. This can be seen by plotting the survival curve of the simulated data:

library(survival)
plot(survfit(Surv(eventtime3, status3) ~ Treatment, data = dat))
x <- seq(0, 2.5, len = 1001)
y1 <- exp(-0.5 * x)
lines(x, y1, col = 2)

y2 <- exp(-1/0.5 * x)
lines(x, y2, col = 3)

The red curve (rate = 0.5) does not match the simulated data. However, the green curve (rate = 2, scale = 0.5) does.
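The two readings of the value 0.5 can also be compared without any simulation. The sketch below uses base R only and assumes, as the curves above do, that event times are exponentially distributed; `surv_rate` and `surv_scale` are hypothetical helper names.

```r
# Survival function of an exponential distribution under the two
# parameterisations discussed above.
surv_rate  <- function(t, rate)  exp(-rate * t)
surv_scale <- function(t, scale) exp(-t / scale)

x <- seq(0, 2.5, length.out = 1001)

# A scale of 0.5 gives the same curve as a rate of 2 (the green curve).
same_curve <- isTRUE(all.equal(surv_scale(x, 0.5), surv_rate(x, 2)))

# Both agree with base R's exponential survival function.
matches_pexp <- isTRUE(all.equal(surv_rate(x, 2),
                                 pexp(x, rate = 2, lower.tail = FALSE)))

# The median survival time distinguishes the two readings of 0.5.
median_if_rate  <- log(2) / 0.5   # 0.5 read as a rate
median_if_scale <- log(2) * 0.5   # 0.5 read as a scale

c(same_curve = same_curve, matches_pexp = matches_pexp)
c(median_if_rate = median_if_rate, median_if_scale = median_if_scale)
```

Comparing the observed median of eventtime3 against these two values gives a quick numerical check alongside the plot.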

Why numbers of wins, losses or ties are NOT integers?

I was comparing the results from the BuyseTest and WINS R packages on the veteran data (also used in the tutorial https://cran.r-project.org/web/packages/BuyseTest/vignettes/overview.pdf), and the results are drastically different. Either package may have its own flaws, but what struck me about BuyseTest is that, assuming no censoring, the numbers of wins, losses, and ties are NOT integers as expected. I suspect that understanding how these are calculated may help uncover the mystery, so it would be helpful to get clarification on:

The discrepancy between the two packages;
Why are there different numbers (or proportions) of wins, losses, and ties?
Why, in the context of BuyseTest, are the above numbers decimals instead of integers?
How is that reflected in the final results if I account for censoring in the analyses?

Hopefully, you can help me uncover the mystery.

keep.pairScore option is time consuming

Hi,
Thank you for this very complete package!
After playing with it, I noticed that the keep.pairScore option seems surprisingly time-consuming.
Is this something that could be fixed?

Best regards,
David

> df <- simBuyseTest(n.T = 500, n.C = 500)
> time1 <- system.time(BuyseTest(data = df,
                      endpoint = "eventtime",
                      type = "timeToEvent",
                      censoring = "status",
                      treatment = "Treatment",
                      method.inference = "none", 
                      method.tte = "Gehan", 
                      keep.pairScore = FALSE))[3]
> time2 <- system.time(BuyseTest(data = df,
                      endpoint = "eventtime",
                      type = "timeToEvent",
                      censoring = "status",
                      treatment = "Treatment",
                      method.inference = "none", 
                      method.tte = "Gehan", 
                      keep.pairScore = TRUE))[3]
> time2/time1
elapsed 
     12

p-value in permutations : display 0 instead of 2.2e-16

Hi,

I was wondering whether it would be possible to display 0 instead of 2.2e-16 for very small p-values in the output of summary(GPC). As I understand it, this happens when no permuted statistic was above the one from the real data.

Thanks in advance!
Best regards,
Samuel
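For context, the situation described in this issue can be reproduced with a generic permutation test. The sketch below is base R only and does not use BuyseTest. The toy data, the statistic, and all names are assumptions for illustration. The value 2.2e-16 is R's machine epsilon (.Machine$double.eps), and base R's format.pval() prints exact zeros as "less than epsilon"; whether summary() relies on it is an assumption.

```r
set.seed(1)
# Toy two-group data with a large difference (purely illustrative).
y <- c(rnorm(20, mean = 3), rnorm(20, mean = 0))
group <- rep(c(1, 0), each = 20)

# Test statistic: difference in group means.
diff_means <- function(y, g) mean(y[g == 1]) - mean(y[g == 0])
observed <- diff_means(y, group)

# Permutation distribution obtained by shuffling the group labels.
n_perm <- 1000
permuted <- replicate(n_perm, diff_means(y, sample(group)))
n_extreme <- sum(abs(permuted) >= abs(observed))

p_raw <- n_extreme / n_perm              # equals 0 when no permutation is as extreme
p_adj <- (n_extreme + 1) / (n_perm + 1)  # never below 1 / (n_perm + 1)

c(p_raw = p_raw, p_adj = p_adj)
format.pval(0)  # base R prints an exact zero as "< epsilon", not as "0"
```

The second estimator counts the observed data as one of the permutations, which is a common convention that avoids reporting a p-value of exactly zero from a finite number of permutations.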

Handling of ties in time-to-event outcomes

If a patient in the treatment arm has a failure at time $T$, but a patient in the control arm is lost-to-follow-up also at time $T$ (so is right-censored), Table III in Buyse (2010) declares the pair as uninformative. The BuyseTest package also uses this logic, as can be seen in the dummy example below.

dat <- data.frame(
  time = c(10, 10),
  event = c(0, 1),
  treat = c(0, 1)
)
 
test <- BuyseTest(
  treat ~ TTE(time, status = event),
  data = dat,
  method.inference = "u-statistic",
  keep.pairScore = TRUE,
  scoring.rule = "Gehan")
 
summary(test, statistic = "winRatio", percentage = FALSE)

I have contacted Prof. Buyse and he confirms this is an error. I have also checked Gehan (1965), and indeed it looks like the inequalities are not aligned. In the absence of a threshold (i.e., $\tau = 0$), the scoring should be:

References

Buyse M. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Statistics in Medicine 2010; 29(30):3245-3257. https://doi.org/10.1002/sim.3923.

Gehan EA. A generalized Wilcoxon test for comparing arbitrarily singly censored samples. Biometrika 1965; 52:203-223.

Empty vignettes on CRAN?

It appears that the vignettes with the CRAN version of the package are empty:

https://cran.r-project.org/web/packages/BuyseTest/vignettes/overview.pdf
https://cran.r-project.org/web/packages/BuyseTest/vignettes/wilcoxonTest.pdf

Installation & check of the source shows that this is not just a build error, as the content looks like this:

%\VignetteIndexEntry{BuyseTest: overview}
%\VignetteEngine{R.rsp::asis}
%\VignetteKeyword{PDF}
%\VignetteKeyword{HTML}
%\VignetteKeyword{vignette}
%\VignetteKeyword{package}

I'm not sure whether this is just outdated (as the current version has two fine vignettes), but it's more than a bit unusual for a CRAN package, so... for your attention.

Error in power function

Running the following code throws an error:

library(BuyseTest)
library(data.table)

## using user defined simulation function
simFCT <- function(n.C, n.T){
    out <- data.table(Y = rnorm(n.C + n.T),
                      T = c(rep(1, n.C), rep(0, n.T)))
    return(out)
}

powerBuyseTest(sim = simFCT, sample.sizeC = c(100), sample.sizeT = c(100), n.rep = 2,
              formula = T ~ cont(Y), method.inference = "u-statistic", trace = 4)

The error is:

Error in paste(sample.size, collapse = " ") : 
  argument "sample.size" is missing, with no default

No error occurs if sample.size is used instead.

error with large dataset

Hi,
Thank you very much for developing this helpful package!
When I try to apply BuyseTest to a large dataset, for example, 10000 subjects, I get the error

error: Mat::init(): requested size is too large
Error in GPC_cpp(endpoint = envir$outArgs$M.endpoint, status = envir$outArgs$M.status, :
Mat::init(): requested size is too large
Calls: BuyseTest -> .BuyseTest -> GPC_cpp -> .Call
Execution halted

I run the code on an HPC cluster, so memory should not be an issue. A quick search online suggests that defining ARMA_64BIT_WORD, which allows large matrices in Rcpp, might help. Could you please investigate this? Thank you very much!

issue when using restriction with 10 outcomes

Hi Brice,

For BuyseTest version 2.4.0.

I run into an issue when I apply a "restriction time" to more than one outcome.

The following code works:

BuyseTest(data = df,
                          arm ~ strat_column +
                            tte(tte1, status = cnsr1, threshold = 10, restriction = 365) +
                            tte(tte2, status = cnsr2, threshold = 10) +
                            tte(tte3, status = cnsr3, threshold = 10) +
                            tte(tte4, status = cnsr4, threshold = 10) +
                            bin(bin_var, operator = "<0") +
                            tte(tte5, status = cnsr5, threshold = 0) +
                            tte(tte6, status = cnsr6, threshold = 0) +
                            tte(tte7, status = cnsr7, threshold = 0) +
                            tte(tte8, status = cnsr8, threshold = 0) +
                            tte(tte9, status = cnsr9, threshold = 0),
                          trace = F, method.inference = "u-statistic", keep.pairScore = TRUE,
                          scoring.rule = "Peron", pool.strata = "CMH"
                          )

However, when I add the restriction to more outcomes, as here:

BuyseTest(data = df,
                          arm ~ strat_column +
                            tte(tte1, status = cnsr1, threshold = 180, restriction = 365) +
                            tte(tte2, status = cnsr2, threshold = 180,  restriction = 365) +
                            tte(tte3, status = cnsr3, threshold = 180,  restriction = 365) +
                            tte(tte4, status = cnsr4, threshold = 180,  restriction = 365) +
                            bin(bin_var, operator = "<0") +
                            tte(tte5, status = cnsr5, threshold = 0,  restriction = 365) +
                            tte(tte6, status = cnsr6, threshold = 0,  restriction = 365) +
                            tte(tte7, status = cnsr7, threshold = 0,  restriction = 365) +
                            tte(tte8, status = cnsr8, threshold = 0,  restriction = 365) +
                            tte(tte9, status = cnsr9, threshold = 0,  restriction = 365),
                          trace = F, method.inference = "u-statistic", keep.pairScore = TRUE,
                          scoring.rule = "Peron", pool.strata = "CMH"
                          )

I obtain the following error:

Error in (function (name.call, status, correction.uninf, cpus, data, endpoint, :
Strata variable(s) "strat_column" "tte(tte9,status=cnsr9,
threshold=0, restriction = 365)" not found in argument 'data'

I don't understand this, as these columns are in "df". Sorry to bother you with this...

Kind regards,
Samuel