Too long expressions in query (OHDSI/CohortMethod)

pbr6cornell commented on June 12, 2024

Yikes, 1000 codes! That seems like it shouldn't be.....but regardless, I
think the technical solution is that instead of sending the list directly
into the IN clause of the SQL, you'd need to insert those values into a
temp table, and then join the temp table to the CDM table on the CONCEPT_ID
field.

Cheers,

Patrick
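
The approach Patrick describes can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module, not Oracle and not the CohortMethod code; the table names `drug_exposure` and `concept_ids`, the helper `count_exposures`, and the ID values are hypothetical. The point is that the SQL text stays the same length no matter how many concept IDs are staged.

```python
import sqlite3

def count_exposures(conn, concept_ids):
    """Count rows whose drug_concept_id is in concept_ids, without an IN list."""
    cur = conn.cursor()
    # Stage the IDs in a temp table instead of inlining them in the SQL text.
    cur.execute("CREATE TEMP TABLE concept_ids (concept_id INTEGER PRIMARY KEY)")
    cur.executemany(
        "INSERT INTO concept_ids (concept_id) VALUES (?)",
        [(cid,) for cid in set(concept_ids)],
    )
    # Join on the concept ID; the statement does not grow with the list.
    cur.execute(
        "SELECT COUNT(*) FROM drug_exposure d "
        "JOIN concept_ids c ON d.drug_concept_id = c.concept_id"
    )
    (n,) = cur.fetchone()
    cur.execute("DROP TABLE concept_ids")
    return n

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug_exposure (person_id INTEGER, drug_concept_id INTEGER)")
conn.executemany(
    "INSERT INTO drug_exposure VALUES (?, ?)",
    [(i, i) for i in range(1, 3001)],
)
nsaids = list(range(1, 1501))  # hypothetical list with more than 1000 IDs
matched = count_exposures(conn, nsaids)
print(matched)
```

The same pattern should carry over to other databases, although the temp-table syntax differs per platform.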

On Thu, Apr 23, 2015 at 12:57 PM, zuoyizhang wrote:

Hi Martijn,

In Oracle, the maximum number of expressions in a list is 1000. If
length(nsaids) in the example of "Single studies using the CohortMethod
package" is greater than 1000, an error occurs. How do we deal with
this issue?

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Thanks, Patrick!

That is a good way to insert those values using a template sql.

Thanks,

Zuoyi


schuemie commented on June 12, 2024

This should be fixed in the latest release. We're now using temp tables for all lists of concept IDs.

@zuoyizhang , could you check if the problem is now solved for you?


zuoyizhang commented on June 12, 2024

Martijn,

I tried to rerun the program using the whole set of exclusion concept IDs (>1000), but the error below occurs.

Error:
execute JDBC update query failed in dbSendUpdate (ORA-00942: table or view does not exist)

SQL:
CREATE GLOBAL TEMPORARY TABLE zyizhang.liiwtcytnon_overlap_cohort
ON COMMIT PRESERVE ROWS
AS
SELECT
    treatment,
    new_user_cohort.person_id,
    cohort_start_date,
    cohort_end_date
FROM
    zyizhang.liiwtcytnew_user_cohort new_user_cohort
LEFT JOIN (
    SELECT person_id
    FROM (
        SELECT person_id,
            COUNT(treatment) AS num_cohorts
        FROM zyizhang.liiwtcytindicated_cohort
        GROUP BY person_id
    ) t1
    WHERE num_cohorts = 2
) both_cohorts
    ON new_user_cohort.person_id = both_cohorts.person_id
WHERE both_cohorts.person_id IS NULL

But zyizhang.liiwtcytindicated_cohort doesn’t exist. This table is created by the translated sql (the original sql for creating #indicated_cohort is from GetCohorts.sql):

CREATE GLOBAL TEMPORARY TABLE liiwtcytindicated_cohort
ON COMMIT PRESERVE ROWS
AS
SELECT
    DISTINCT treatment,
    new_user_cohort.person_id,
    cohort_start_date,
    cohort_end_date,
    observation_period_end_date
FROM
    liiwtcytnew_user_cohort new_user_cohort
INNER JOIN (
    SELECT person_id,
        condition_start_date AS indication_date
    FROM condition_occurrence
    WHERE condition_concept_id IN (
        SELECT descendant_concept_id
        FROM concept_ancestor
        INNER JOIN liiwtcytindications
            ON ancestor_concept_id = concept_id
    )
) indication
    ON new_user_cohort.person_id = indication.person_id
        AND new_user_cohort.cohort_start_date <= ( indication_date + @indication_lookback_window)
        AND new_user_cohort.cohort_start_date >= indication_date
;

But I haven’t found where to create the table #indications (liiwtcytindications) in GetCohorts.sql. This might be the reason why the error occurs.

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Sorry about that! There are many possible combinations of parameters, I really should write a test script that tries them all. I found the bug when not using indication concept IDs, and fixed it.

Can you try again?


zuoyizhang commented on June 12, 2024

Martijn,

Another error occurs.

Error:
execute JDBC update query failed in dbSendUpdate (ORA-00903: invalid table name)

SQL:
CREATE GLOBAL TEMPORARY TABLE zyizhang.liiwtcytcondition_group
ON COMMIT PRESERVE ROWS
AS
SELECT
descendant_concept_id,
ancestor_concept_id

FROM

(

) t1

It seems that the table was not created in GetCovariates.sql (the corresponding sql is included below).

select descendant_concept_id,
    ancestor_concept_id
INTO #condition_group
from
(
{@use_covariate_condition_group_meddra} ? {
    SELECT DISTINCT ca1.descendant_concept_id,
        ca1.ancestor_concept_id
    FROM (
        SELECT covariate_id,
            covariate_name,
            analysis_id,
            concept_id
        FROM #cov_ref
        WHERE analysis_id > 100
            AND analysis_id < 300
    ) ccr1
    INNER JOIN concept_ancestor ca1
        ON ccr1.concept_id = ca1.descendant_concept_id
    INNER JOIN concept c1
        ON ca1.ancestor_concept_id = c1.concept_id
    WHERE c1.vocabulary_id = 15
        AND c1.concept_class <> 'System Organ Class'
        AND c1.concept_id NOT IN (36302170, 36303153, 36313966)
    {@has_excluded_covariate_concept_ids} ? { AND c1.concept_id NOT IN (SELECT concept_id FROM #excluded_cov)}
    {@has_included_covariate_concept_ids} ? { AND c1.concept_id IN (SELECT concept_id FROM #included_cov)}
{@use_covariate_condition_group_snomed} ? { UNION }
}

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Sorry about that! Yet another combination of parameters I hadn't tested. It was this combination that was problematic:

useCovariateConditionGroup   = TRUE
useCovariateConditionGroupMeddra  = FALSE
useCovariateConditionGroupSnomed = FALSE

I've changed the code so it will automatically set the first parameter to FALSE if the last two are FALSE.

I've now written a script that tests a huge number of possible combinations of parameters. It identified one other issue (when using interaction terms on Oracle) that is now fixed. Hopefully you should not run into any problems from now on (fingers crossed).
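
The two ideas in this comment (forcing the first flag off when both vocabulary flags are off, and testing every combination of parameters) can be sketched in a few lines. This is an illustration in Python, not the package's R code; the function names `normalize_settings` and `is_consistent` are hypothetical, and only the three flags named above are modelled.

```python
from itertools import product

def normalize_settings(use_group, use_meddra, use_snomed):
    """Force the group flag off when neither vocabulary is selected."""
    if use_group and not (use_meddra or use_snomed):
        use_group = False
    return use_group, use_meddra, use_snomed

def is_consistent(use_group, use_meddra, use_snomed):
    """A group covariate needs at least one vocabulary to fill the subquery."""
    return (not use_group) or use_meddra or use_snomed

# Enumerate every combination of the three flags.
all_combos = list(product([True, False], repeat=3))
bad_before = [c for c in all_combos if not is_consistent(*c)]
bad_after = [c for c in all_combos if not is_consistent(*normalize_settings(*c))]
print(bad_before)
print(bad_after)
```

Under these assumptions the only inconsistent combination before normalization is (TRUE, FALSE, FALSE), the one that produced the empty subquery, and none remain afterwards.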


zuoyizhang commented on June 12, 2024

Martijn,

That is all right. Happy to improve the package.

To test the package, I am now running with just 10 exclusion concept IDs. There is another error: the covariate_name column is too short, since some of the generated names are longer than 255 characters.

Error:
execute JDBC update query failed in dbSendUpdate (ORA-12899: value too large for column "ZYIZHANG"."WIMYJJQVCOV_REF"."COVARIATE_NAME" (actual: 257, maximum: 255))

SQL:
INSERT INTO zyizhang.wimyjjqvcov_ref (
covariate_id,
covariate_name,
analysis_id,
concept_id
)
SELECT p1.covariate_id,
'Condition occurrence record observed during 365d on or prior to cohort index: ' || TO_CHAR((p1.covariate_id-101)/1000 ) || '-' || CASE
WHEN c1.concept_name IS NOT NULL
THEN c1.concept_name
ELSE 'Unknown invalid concept'
END AS covariate_name,
101 AS analysis_id,
(p1.covariate_id-101)/1000 AS concept_id
FROM (SELECT DISTINCT covariate_id FROM zyizhang.wimyjjqvcov_co_365d) p1
LEFT JOIN concept c1
ON (p1.covariate_id-101)/1000 = c1.concept_id

The original queries in GetCovariates.sql.

CREATE TABLE #cov_ref (
    covariate_id BIGINT,
    covariate_name VARCHAR(255),
    analysis_id INT,
    concept_id INT
);

INSERT INTO #cov_ref (
    covariate_id,
    covariate_name,
    analysis_id,
    concept_id
)
SELECT p1.covariate_id,
    'Condition occurrence record observed during 365d on or prior to cohort index: ' + CAST((p1.covariate_id-101)/1000 AS VARCHAR) + '-' + CASE
        WHEN c1.concept_name IS NOT NULL
            THEN c1.concept_name
        ELSE 'Unknown invalid concept'
    END AS covariate_name,
    101 AS analysis_id,
    (p1.covariate_id-101)/1000 AS concept_id
FROM (SELECT DISTINCT covariate_id FROM #cov_co_365d) p1
LEFT JOIN concept c1
    ON (p1.covariate_id-101)/1000 = c1.concept_id
;

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Crivens! Not sure why my test script didn't pick this up. I just increased the length to 512 chars, hope that fixes the problem.
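
The overflow comes from the text that is concatenated in front of the concept name. A small sketch of the arithmetic, in Python rather than SQL: the prefix string is copied from the INSERT statement quoted above, while the concept ID, the 200-character name, and the helper names are hypothetical.

```python
PREFIX = "Condition occurrence record observed during 365d on or prior to cohort index: "

def covariate_name(concept_id, concept_name):
    """Build the name the same way the INSERT statement concatenates it."""
    name = concept_name if concept_name is not None else "Unknown invalid concept"
    return PREFIX + str(concept_id) + "-" + name

def fits(concept_id, concept_name, column_width):
    """True if the generated name fits in a column of the given width."""
    return len(covariate_name(concept_id, concept_name)) <= column_width

long_name = "x" * 200  # hypothetical 200-character concept name
overhead = len(covariate_name(4004516, ""))  # prefix + ID + hyphen
print(overhead, fits(4004516, long_name, 255), fits(4004516, long_name, 512))
```

So a concept name well under 255 characters can still overflow a 255-character covariate_name column. Whether 512 is always enough depends on the longest concept name in the vocabulary, which is not stated in this thread.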


zuoyizhang commented on June 12, 2024

Martijn,

I got the cohort based on 10 exclusion concept IDs. After I run through the analyses on this cohort data, I will collect the cohort data again based on >1000 exclusion concept IDs. Hopefully there will be no problems.

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Martijn,

I ran through the examples with >1000 exclusion concept IDs and there are no problems now.

But for the function saveCohortData, it would be helpful to have a prompt to replace the existing file; otherwise an error shows up.

If “coxibVsNonselVsGiBleed” is already present in the working folder, the error below shows up in R when running saveCohortData(cohortData, "coxibVsNonselVsGiBleed"):

----------------------------------------------;
Error in ffbase::save.ffdf(out1, out2, out3, out4, out5, dir = file) :
Directory 'coxibVsNonselVsGiBleed' contains existing '.Rdata' file.
To force saving use 'overwrite=TRUE'
----------------------------------------------;

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Martijn,

Xiaochun found that there is a negative number in the attrition diagram. It seems that the abs function needs to be added to the addStep function of drawAttritionDiagram.

E.g. we have counts <- outcomeModel$counts; counts

  cohortId treatment exposedCount newUserCount nonOverlapCount notExcludedCount notPriorCount matchedTrimmedCount
1        0         0       135755       116597          108984            49553         49033               14378
2        0         1        54145        40512           32899            19215         49033               14378
3        1         0       135755       116597          108984            49553         18991               14378
4        1         1        54145        40512           32899            19215         18991               14378

The cells marked in red in the original message are those where currentCounts is less than newCounts. Then we have:

[attached image not preserved]

So it might be better to add abs function to addStep:

addStep <- function(label, newCounts, data) {
    data$leftBoxText[length(data$leftBoxText) + 1] <- label
    data$rightBoxText[length(data$rightBoxText) + 1] <- paste(treatmentLabel,
        ": n = ", abs(data$currentCounts[2] - newCounts[2]), "\n",
        comparatorLabel, ": n = ", abs(data$currentCounts[1] -
            newCounts[1]), sep = "")
    data$currentCounts <- newCounts
    return(data)
}

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Martijn,

Please disregard the suggestion for the Attrition Diagram.

The value for “Treated” in the box “Does not have the outcome prior to index date” in the attrition diagram is negative. Most likely the notPriorCount value for cohortId = 0 and treatment = 1 is not assigned correctly.

Below are the counts I had for the attrition diagram.

[attached image not preserved]

Thanks,

Zuoyi
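
The negative value can be reproduced from the counts table posted earlier in this thread. The sketch below is in Python and only redoes the subtraction; it assumes that cohortId 1 corresponds to the treated group and cohortId 0 to the comparator group, and the helper name `excluded_in_step` is hypothetical.

```python
# Counts copied from the table earlier in this thread.
not_excluded = {"comparator": 49553, "treated": 19215}
not_prior_by_cohort_id = {0: 49033, 1: 18991}

def excluded_in_step(current, new):
    """Number of persons removed by a step; should never be negative."""
    return current - new

# Mismatched pairing: the treated row combined with the cohortId = 0 value.
mismatched = excluded_in_step(not_excluded["treated"], not_prior_by_cohort_id[0])
# Pairing assumed here: cohortId 1 = treated, cohortId 0 = comparator.
treated = excluded_in_step(not_excluded["treated"], not_prior_by_cohort_id[1])
comparator = excluded_in_step(not_excluded["comparator"], not_prior_by_cohort_id[0])
print(mismatched, treated, comparator)
```

With the assumed pairing both differences are positive, which suggests that the issue lies in how the counts are combined rather than in the subtraction. Wrapping the difference in abs() would hide the mismatch instead of fixing it.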


schuemie commented on June 12, 2024

I think I have found the problem. Could you try again with the latest version? You'll have to rerun the fitOutcomeModel() and drawAttritionDiagram() functions.


zuoyizhang commented on June 12, 2024

Martijn,

Another error occurs when running fitOutcomeModel:

Fitting unstratified model
Using prior: None
Error in data.frame(cohortId = c(0, 1), notPriorCount = c(counts$notExcludedCount[counts$cohortId == :
  arguments imply differing number of rows: 2, 0

It seems that the rows are not matched in your updated code.

if (!is.null(outcomeConceptId) & !is.null(cohortData$exclude)) {
    t <- cohortData$exclude$outcomeId == outcomeConceptId
    t <- in.ff(cohortData$cohorts$rowId, cohortData$exclude$rowId[ffbase::ffwhich(t,
        t == TRUE)])
    cohortSubset <- cohortData$cohort[ffbase::ffwhich(t,
        t == TRUE), ]
    treatedWithPriorOutcome <- ffbase::sum.ff(cohortSubset$treatment)
    comparatorWithPriorOutcome <- nrow(cohortSubset) - treatedWithPriorOutcome
    notPriorCount <- data.frame(cohortId = c(0, 1), notPriorCount = c(counts$notExcludedCount[counts$cohortId ==
        0] - comparatorWithPriorOutcome, counts$notExcludedCount[counts$cohortId ==
        1] - treatedWithPriorOutcome))
    counts <- merge(counts, notPriorCount)
}

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Can you tell me what is in your cohortData$metaData$counts data.frame?


zuoyizhang commented on June 12, 2024

Martijn,

In the first run of fitOutcomeModel (without using the propensity scores), the data.frame is as shown below.

[attached image not preserved]

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Martijn,

I checked the code; it seems that cohortId is not assigned in the data.frame cohortData$metaData$counts.

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Ahh, I see now that that was changed a while ago. I was still testing with an old CohortData object.

Could you try again?


zuoyizhang commented on June 12, 2024

It seems to work now.

There is another error, which didn’t show up before, when I run the saved data.

The following call works:

outcomeModel = fitOutcomeModel(outcomeConceptId = 3,
                               cohortData = cohortData,
                               subPopulation = strata,
                               riskWindowStart = 0,
                               riskWindowEnd = 30,
                               addExposureDaysToEnd = TRUE,
                               useCovariates = FALSE,
                               modelType = "cox",
                               stratifiedCox = TRUE);

But the following call, with useCovariates = TRUE:

outcomeModel = fitOutcomeModel(outcomeConceptId = 3,
                               cohortData = cohortData,
                               subPopulation = strata,
                               riskWindowStart = 0,
                               riskWindowEnd = 30,
                               addExposureDaysToEnd = TRUE,
                               useCovariates = TRUE,
                               modelType = "cox",
                               stratifiedCox = TRUE);

fails with this error:

Fitting stratified model
Error in Cyclops::fitCyclopsModel(dataObject$cyclopsData, prior = prior, :
Insufficient data count for cross validation

Thanks,

Zuoyi


schuemie commented on June 12, 2024

So this error message means the following: when using covariates the default behavior is to use regularization on the betas of the covariates. Cross-validation is used to estimate the hyperparameter for the regularization, and there is a minimum amount of data that is required for that.

Could you check how many outcome events you have? You can use

summary(cohortData)

You can force the minimum down by using

control = createControl(cvType = "auto",startingVariance = 0.1, selectorType = "byPid", noiseLevel = "quiet", minCVData = 10)

as an extra parameter in the fitOutcomeModel() function.


zuoyizhang commented on June 12, 2024

Martijn,

Yes, it is because we have few rows and too many covariates in the data matrix.

Below is the summary of our cohortData.

CohortData object summary

Treatment concept ID: 1
Comparator concept ID: 2
Outcome concept ID(s): 3

Treated persons: 19215
Comparator persons: 49553

Outcome counts:
  Event count Person count
3        2818         1236

Covariates:
Number of covariates: 11010
Number of non-zero covariate values: 15052710

With the default control, running

outcomeModel = fitOutcomeModel(outcomeConceptId = 3,
                               cohortData = cohortData,
                               subPopulation = strata,
                               riskWindowStart = 0,
                               riskWindowEnd = 30,
                               addExposureDaysToEnd = TRUE,
                               useCovariates = TRUE,
                               modelType = "cox",
                               stratifiedCox = TRUE);

gives the following dataObject$cyclopsData:

Cyclops Data Object

Call: createSqlCyclopsData(modelType = modelType)

 Model: cox
  Rows: 74

Covariates: 5437
Strata: 37

Uninitialized interface.

So if we set the minimum CV data to 10 rows, we can fit the model even though p >> n. Below are the results when setting minCVData = 10 in createControl.

Model type: cox
Status: OK

Prior variance: 0.000231933782963255
          Estimate lower .95 upper .95    logRr seLogRr
treatment  0.85546   0.37137   1.94335 -0.15611  0.4186

I can’t find more problems in CohortMethod package. It should work well now.

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Awesome! Thanks so much for your help and patience.


zuoyizhang commented on June 12, 2024

Martijn,

Below is the attrition diagram based on RI CDM. In each category, the treated and comparator are exactly the same. I don’t think we have the perfect data for the example in our CDM.

[attached image not preserved]

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Luckily, this was easy to fix. Please download the latest version. You will have to rerun fitOutcomeModel to fix the count information.


zuoyizhang commented on June 12, 2024

Thanks, Martijn!

I reinstalled the package and tried to run through the example using my saved data. Before the package was updated, fitting the model with matching based on the propensity score produced no error. Now I get the error below:

Model type: cox
Status: NO OUTCOMES REMAINING AFTER RESTRICTING TO SUBPOPULATION, CANNOT FIT

Error in exp(d$logRr) : non-numeric argument to mathematical function

when fitting the model like this:

outcomeModel = fitOutcomeModel(outcomeConceptId = 3,
                               cohortData = cohortData,
                               subPopulation = strata,
                               riskWindowStart = 0,
                               riskWindowEnd = 30,
                               addExposureDaysToEnd = TRUE,
                               useCovariates = FALSE,
                               modelType = "cox",
                               stratifiedCox = TRUE);

The AUC of the propensity score diagnostics also changed from 0.8219979 to 0.9999999 after the package was updated, which is a large change.

Were there any other changes to the package?

Thanks,

Zuoyi


zuoyizhang commented on June 12, 2024

Martijn,

Below please find the updated Attrition diagram, where the treated and comparator are still the same in “Person appears in only one of the two cohorts” and “Study population”.

[attached image not preserved]

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Just to let you know I'm working on it. The main problem seems to be that something is wrong with the propensity score computation, but rerunning the vignette takes about 24 hours. I will keep you updated.

PS. I can't see the image you posted.


schuemie commented on June 12, 2024

Hi Zuoyi,

I'm trying to figure out what has changed with the propensity model. You mentioned the AUC changed to .99. Could you answer these two questions?

1. When fitting the propensity model, did you use checkSorting = FALSE in the createPs() function?
2. Could you check whether there are strongly predictive covariates in the model? Run psModel <- getPsModel(ps, cohortData) to get the model, then inspect it in the Environment tab of RStudio. The top variables will be the ones with the highest coefficient. Anything above 0.1 is suspicious.


zuoyizhang commented on June 12, 2024

Martijn,

In my current run of the propensity model, the AUC is 0.8224298.

1. No. I just used the default value (TRUE) for the checkSorting parameter in the createPs() function.

2. For psModel <- getPsModel(ps, cohortData), the matrix psModel is not sorted by coefficient. After I sorted it by coefficient, these are the top 10 variables.

    coefficient          id covariateName
33    1.2166048        2004 Index year: 2004
819   1.1183099 19079172402 Drug exposure record observed during 30d on or prior to cohort index: 19079172-Ondansetron 8 MG Disintegrating Tablet
15    0.9692442          27 Age group: 85-89
13    0.9155905          25 Age group: 75-79
14    0.8926998          26 Age group: 80-84
660   0.8644396  4004516102 Condition occurrence record observed during 30d on or prior to cohort index: 4004516-Pre-surgery evaluation
16    0.7972008          28 Age group: 90-94
12    0.7704026          24 Age group: 70-74
831   0.7384272 19123173402 Drug exposure record observed during 30d on or prior to cohort index: 19123173-3-isobutyl GABA 75 MG Oral Capsule [Lyrica]
11    0.6031661          23 Age group: 65-69

In the example, there are 1147 covariates, and 126 of them have a coefficient > 1.
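
The sorting and screening step can be sketched as follows. This is Python rather than R and does not call the package; the rows are a subset of the table above, while the 0.1 threshold comes from Martijn's question and the helper name `rank_covariates` is hypothetical.

```python
# (coefficient, covariate id, covariate name): a subset of the rows listed above.
rows = [
    (0.9692442, 27, "Age group: 85-89"),
    (0.6031661, 23, "Age group: 65-69"),
    (1.2166048, 2004, "Index year: 2004"),
    (0.7704026, 24, "Age group: 70-74"),
]

def rank_covariates(rows, threshold=0.1):
    """Sort by absolute coefficient and flag values above the threshold."""
    ranked = sorted(rows, key=lambda r: abs(r[0]), reverse=True)
    suspicious = [r for r in ranked if abs(r[0]) > threshold]
    return ranked, suspicious

ranked, suspicious = rank_covariates(rows)
for coefficient, covariate_id, name in ranked:
    print(f"{coefficient:10.7f} {covariate_id:>6} {name}")
```

Sorting on the absolute value matters because a strongly negative coefficient is just as predictive of treatment assignment as a strongly positive one.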

Thanks,

Zuoyi


schuemie commented on June 12, 2024

Hi Zuoyi,

With respect to the attrition diagram:

"Person appears in only one of the two cohorts" should be the same for both groups. Remember: these are people that are excluded that didn't meet this rule, so people that are in both groups, and the count should be the same in both groups by definition.
"Study population" will be the same if you're using one-on-one matching

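Why the first count is identical in both groups can be shown with two small sets. This is a sketch in Python with hypothetical person IDs, not the package's code: the persons removed by the rule are exactly those in the intersection, so the same number leaves each cohort.

```python
def exclude_overlap(treated_ids, comparator_ids):
    """Drop persons who appear in both cohorts; return what remains of each."""
    both = treated_ids & comparator_ids
    return treated_ids - both, comparator_ids - both, len(both)

treated = {1, 2, 3, 4, 5}          # hypothetical person IDs
comparator = {4, 5, 6, 7, 8, 9}    # persons 4 and 5 are in both cohorts
t_only, c_only, removed = exclude_overlap(treated, comparator)
removed_from_treated = len(treated) - len(t_only)
removed_from_comparator = len(comparator) - len(c_only)
print(removed_from_treated, removed_from_comparator)
```

The counts posted earlier in this thread agree with this: 116597 - 108984 and 40512 - 32899 both equal 7613.
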
With respect to the propensity score and the error message when fitting the outcome model:

(The model should be sorted by the absolute value of the coefficient after running getPsModel)

It seems your treatment group and comparator group are very different, with the treatment group having these almost unique characteristics:

Index date in the year 2004
Exposure to Ondansetron and/or Lyrica on or in the 30 days before the index date
Elderly (> 65 years old)

Because they are so different, the AUC becomes really high. If you look at the PS distribution plot, you'll see there's little overlap between the two distributions. When doing matching, very few matches are found because there's just no similar people in the two groups. This leads to the error message when fitting the outcome model, because there just isn't enough data left.
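
The link between separation and AUC can be illustrated with a few made-up propensity scores. This is a sketch in Python with hypothetical values, not output from the package: the AUC is computed as the fraction of treated/comparator pairs in which the treated person has the higher score, with ties counting as one half.

```python
def auc(treated_scores, comparator_scores):
    """Probability that a random treated score exceeds a random comparator score."""
    wins = 0.0
    for t in treated_scores:
        for c in comparator_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(treated_scores) * len(comparator_scores))

# Hypothetical scores: the first pair of groups overlaps, the second does not.
overlapping = auc([0.4, 0.5, 0.6, 0.7], [0.3, 0.45, 0.55, 0.65])
separated = auc([0.90, 0.95, 0.97, 0.99], [0.01, 0.03, 0.05, 0.10])
print(overlapping, separated)
```

When every treated score lies above every comparator score the AUC is 1, and no comparator has a score close to any treated person, which is why matching leaves almost nothing to fit the outcome model on.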

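When populations are truly incomparable, it is not possible to do a new-user cohort study (see Alec's paper: http://www.dovepress.com/a-tool-for-assessing-the-feasibility-of-comparative-effectiveness-rese-peer-reviewed-article-CER).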


zuoyizhang commented on June 12, 2024

Thanks for your explanation, Martijn! Then I don’t have more questions about the package at this moment.

Thanks,

Zuoyi

