ohdsi / capr

Cohort definition Application Programming in R

Home Page: https://ohdsi.github.io/Capr

License: Apache License 2.0

R 99.20% Perl 0.34% Shell 0.46%

hades

capr's Introduction

Capr

Capr is part of HADES

Introduction

The goal of Capr, pronounced 'kay-pr' like the edible flower, is to provide a language for expressing OHDSI cohort definitions in R code. OHDSI defines a cohort as "a set of persons who satisfy one or more inclusion criteria for a duration of time" and provides a standardized approach for defining them (Circe-be). Capr exposes this standardized approach to cohort building through a programmatic interface in R, which is particularly helpful when creating a large number of similar cohorts. Capr version 2 introduces a new user interface designed for readability, with the goal that Capr code be a human-readable description of a cohort while also being executable on an OMOP Common Data Model.

Learn more about the OHDSI approach to cohort building in the cohorts chapter of the Book of OHDSI.

Installation

Users can install the current development version of Capr from GitHub with:

# install.packages("devtools")
devtools::install_github("ohdsi/Capr")

User Documentation

Documentation can be found on the package website.

PDF versions of the documentation are also available:

Vignette: Using Capr
Vignette: Capr Examples
Vignette: Working with Concept Sets in Capr
Vignette: Capr for Templating Cohort Definitions
Vignette: Capr components
Design Document
Package manual

Support

Developer questions/comments/feedback: OHDSI Forum
We use the GitHub issue tracker for all bugs/issues/enhancements

Contributing

Read here how you can contribute to this package.

License

Capr is licensed under Apache License 2.0

Development

Capr is being developed in RStudio.

Acknowledgements

This package is maintained by Martin Lavallee and Adam Black
Guidance and support for the original development of Capr came from Lee Evans and LTS Computing LLC


capr's Issues

procedure() not working

The procedure() function cannot take a procedure concept set to generate the query. It looks like procedure is not a valid domain.

procedure(c4)
Error in validityMethod(object) :
object@domain %in% validDomains is not TRUE

Repo title to camelCase?

Although there is no official guideline for repo names, in general we tend to use camelCase. Would you consider renaming to 'Capr'?

If Capr is to be an R package then there definitely are guidelines for R code.

Replace deprecated `oracleTempSchema` with `tempEmulationSchema`

Several functions like 'getConceptCodeDetails()' have an oracleTempSchema argument. First of all, please make sure this is really necessary. Are these functions using temp tables? Second, oracleTempSchema has been deprecated across HADES. Please use `tempEmulationSchema` instead, and set the default value like this.
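A minimal sketch of such a migration, assuming the common HADES convention of defaulting the argument to the `sqlRenderTempEmulationSchema` option. `lookupConceptsDemo()` is a hypothetical stand-in, not a Capr function, and it only echoes its inputs instead of querying a database:

```r
# Hypothetical function showing how a deprecated argument can be kept
# for backwards compatibility while the new one becomes the default.
lookupConceptsDemo <- function(conceptIds,
                               tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
                               oracleTempSchema = NULL) {
  # Honour the old argument, but tell the caller it is deprecated.
  if (!is.null(oracleTempSchema) && oracleTempSchema != "") {
    warning("The 'oracleTempSchema' argument is deprecated. Use 'tempEmulationSchema' instead.")
    tempEmulationSchema <- oracleTempSchema
  }
  # A real implementation would pass tempEmulationSchema on to the SQL layer.
  list(conceptIds = conceptIds, tempEmulationSchema = tempEmulationSchema)
}

options(sqlRenderTempEmulationSchema = "scratch_schema")
str(lookupConceptsDemo(c(1L, 2L)))
```

Existing callers that still pass oracleTempSchema keep working and see a warning, which gives them time to switch.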

Example (or tips) for using age() in defining a cohort within attrition()?

I'm interested in applying inclusion criteria "is at least 18 at index"

I bet I want to leverage the age() function (

Capr/R/attributes-op.R

Lines 328 to 343 in fa03ba0

#' Function to create age attribute
#' @param op   an opAttribute object that is either numeric or integer that defines the logical
#'             operation used to determine eligible patient age
#' @export
age <- function(op) {
  check <- all(grepl("opAttribute", methods::is(op)))
  if (!check) {
    stop("Input must be an opAttributeNumeric or opAttributeInteger.")
  }
  methods::new("opAttributeInteger",
               name = "Age",
               op = op@op,
               value = as.integer(op@value),
               extent = as.integer(op@extent))
}

), but I'm struggling a bit to figure out how to use it in an example.

I looked at the examples in https://ohdsi.github.io/Capr/articles/Examples.html but didn't find one. Also couldn't seem to find one anywhere else in the repo.

Any tips? Thanks!!

createOccurrenceStartDateAttribute does not write Extent into JSON

Hi,
It seems that when using Op = "bt" in createOccurrenceStartDateAttribute, the Extent parameter is not actually written to the JSON, which causes errors.

dx_id_Query <- createConditionOccurrence(
  conceptSetExpression = dx_code_omop,
  attributeList = list(
    createOccurrenceStartDateAttribute(Op = "bt",
                                       Value = "2020-01-01",
                                       Extent = "2020-12-31")
  )
)

Occurrence start date in JSON:
"OccurrenceStartDate": {
  "Value": "2020-01-01",
  "Op": "bt"
}
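For illustration, the range object the reporter seems to expect could be assembled as in the sketch below. `dateRangeDemo()` is a hypothetical helper, not part of Capr. The field names come from the JSON excerpt and the Extent argument in this issue; treating "bt" and "!bt" as the operators that need an upper bound is an assumption.

```r
# Hypothetical helper: build the list that would be serialised to JSON.
# "Between"-style operators need both a lower (Value) and upper (Extent) bound.
dateRangeDemo <- function(op, value, extent = NULL) {
  rangeOps <- c("bt", "!bt")  # assumed set of range operators
  out <- list(Value = value)
  if (op %in% rangeOps) {
    if (is.null(extent)) {
      stop("Op '", op, "' needs an Extent (upper bound).")
    }
    out$Extent <- extent
  }
  out$Op <- op
  out
}

str(dateRangeDemo("bt", "2020-01-01", "2020-12-31"))
```

Failing early when Extent is missing makes the problem visible when the attribute is built, not when the JSON is consumed downstream.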

align with circe-be changes

Align with circe-be offset-date PR

Change that impacts Capr, addition of Date Adjustment field

add DateAdjustment field

Aim to release parallel to circe-be release

Submit to CRAN

Do the CRAN submission using devtools::submit_cran(). Before that, run devtools::check_win_release().

Released version failing R Check

I see:

CirceR: Can't find package called CirceR.

Which is a bit odd because CirceR is in the Enhances section. Normally this problem would be solved by adding a Remotes section to point to ohdsi/CirceR, but it seems you're going to CRAN (which is a good idea), but CRAN won't allow a Remotes section. Not sure how other people solve this.

Get code coverage over 80%

create hades release

S4 Component Validity Checks

Create validObject checks for S4 objects
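As a rough sketch of what such a check can look like, the example below attaches a validity function to a toy S4 class. The class name `DemoQuery` and the contents of `validDomains` are hypothetical and are not taken from Capr's source:

```r
library(methods)

# Hypothetical list of accepted domains; Capr's real list may differ.
validDomains <- c("ConditionOccurrence", "DrugExposure", "ProcedureOccurrence")

setClass("DemoQuery",
  representation(domain = "character"),
  validity = function(object) {
    # Return a message (not TRUE) to signal an invalid object.
    if (length(object@domain) != 1L) {
      return("domain must be a single string")
    }
    if (!object@domain %in% validDomains) {
      return(paste0("'", object@domain, "' is not a valid domain"))
    }
    TRUE
  }
)

ok <- new("DemoQuery", domain = "ProcedureOccurrence")

# new() runs the validity function, so a bad domain fails with a readable message.
bad <- tryCatch(new("DemoQuery", domain = "Procedure"),
                error = function(e) conditionMessage(e))
print(bad)
```

Returning a descriptive string from the validity function produces a clearer error than a bare stopifnot() condition.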

Documentation out of date

I'm starting to use Capr and it's great, but it seems the best documentation is the unit tests ;-)

For example, this code in the README is incorrect, as the exit criteria must be created with the exit() function. Similarly, this example in the vignette appears incorrect because attrition entries need to be created with withAll() (for example).

Get github actions R check passing on Capr v2

Preparing for inclusion in HADES

I think Capr is a valuable resource that I would like to see added to the OHDSI HADES library. If you agree to add Capr to HADES, I would request some modifications to the repo:

Add some unit tests (maybe not fully covering all code right now). See CirceR for examples on how to implement unit tests.
Add GitHub actions (see description here)
Manually create a first release + tag
Add a NEWS.md file (see for example SqlRender)
Add Remotes section to DESCRIPTION for CirceR, so it will automatically be installed when calling remotes::install_github("ohdsi/Capr"). (see example Remotes section here). No need to add DatabaseConnector, as it is available in CRAN.
Make sure the master branch contains latest released version (as explained here), so remotes::install_github("ohdsi/Capr") installs the latest released version.
Use cosmo template for documentation website, and add a HADES link (see here and here)
Make README.md more consistent with HADES packages (see for example CohortMethod)
- Add badges for Github action and codecov
- Add link to HADES
- Modify instructions to install (no need to mention dependencies that need to be installed, but should link to HADES site for general instructions on setting up R and Java)
- Add links to PDF versions of manual, vignettes, and a link to the Capr GitHub pages (and enable GitHub pages)

Minor documentation issues

For a next release, it might be good to fix these:

The documentation website still lists version 0.0.1.994 as the current version. Next time make sure to generate the website as one of the last steps (after updating the version number).
The documentation website is missing the GitHub link in the top right corner (see the CohortMethod site as an example). I must admit I'm not sure why that is. The _pkgdown.yml looks fine.
The vignette titles are a bit ugly, like 'Capr_Attributes_Extended', instead of perhaps 'Using extended cohort attributes in Capr', which would be both human-readable and more informative.
The Documentation section in the README doesn't list all vignettes.
The Installation section in the README mentions a checkCmInstallation() function that is not part of Capr.

Create cohort/concept set from Circe JSON?

Is it possible to create the Capr R objects for cohorts/concept sets directly from the JSON representation?

My desire is to:

Create a set of cohorts in ATLAS and export the design JSON representation
Create a set of concept sets in ATLAS and export the design JSON representation
Import the cohorts and concept sets into Capr to use via the JSON
Permute the cohorts using the concept sets to effectively create new cohorts
Save the cohort JSON for import back to ATLAS

From my preliminary review of the documentation and vignettes, I didn't find an easy way to use the JSON design directly in Capr. I was hoping for something like Capr::loadCohortDefinitionJson(json) that would return the Capr cohort definition object.

Create CAPR object Print

Command to console print and preview the CAPR object

Add ability to generate `CohortGenerator` input tables

Right now there are some ugly transformations needed to plug the output of Capr into CohortGenerator, which looks something like this:

library(tibble) # tibble()
library(Capr)   # as.json()
library(CirceR) # buildCohortQuery(), createGenerateOptions()
exposureCohorts <- tibble(cohortId = c(1,2),
                          cohortName = c("Celecoxib", "Diclofenac"),
                          json = c(as.json(celecoxibCohort), as.json(diclofenacCohort)))
exposureCohorts$sql <- sapply(exposureCohorts$json,
                              buildCohortQuery,
                              options = createGenerateOptions())

Where celecoxibCohort and diclofenacCohort are Capr cohorts, and exposureCohorts can be used directly in CohortGenerator::generateCohortSet().

Looping in @anthonysena : Any thoughts? Perhaps Capr can add a function that creates CohortGenerator input tables from a list of cohorts, or CohortGenerator can accept a list of Capr cohorts? Why does CohortGenerator need cohort names anyway?
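One possible shape for such a helper is sketched below. `buildCohortSetDemo()` is hypothetical and not part of Capr or CohortGenerator. The JSON and SQL converters are passed in as functions so the sketch runs without either package; in practice they would presumably wrap as.json() and buildCohortQuery() as in the snippet above.

```r
# Hypothetical helper: turn a named list of cohorts into a table with the
# cohortId, cohortName, json and sql columns used in the snippet above.
buildCohortSetDemo <- function(cohorts, toJson, toSql) {
  stopifnot(is.list(cohorts), !is.null(names(cohorts)))
  json <- vapply(cohorts, toJson, character(1), USE.NAMES = FALSE)
  sql <- vapply(json, toSql, character(1), USE.NAMES = FALSE)
  data.frame(
    cohortId = seq_along(cohorts),
    cohortName = names(cohorts),
    json = json,
    sql = sql,
    stringsAsFactors = FALSE
  )
}

# Stand-in converters so the example is self-contained.
fakeJson <- function(x) paste0("{\"name\": \"", x, "\"}")
fakeSql <- function(json) paste("-- SQL for", json)

exposureCohortsDemo <- buildCohortSetDemo(
  list(Celecoxib = "celecoxib", Diclofenac = "diclofenac"),
  toJson = fakeJson,
  toSql = fakeSql
)
print(exposureCohortsDemo[, c("cohortId", "cohortName")])
```

Taking the cohort names from the list names keeps ids and names aligned without a separate lookup table.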

Are examples for unit() out of date due to a breaking change?

I am on Capr 2.0.7 and I tried following the example in

Capr/vignettes/Examples.Rmd

Lines 136 to 179 in e44ec1c

    
**Persons with new type 2 diabetes mellitus at first dx rx or lab**

https://atlas-phenotype.ohdsi.org/#/cohortdefinition/90

```{r, eval=FALSE}
library(Capr)

cs0 <- cs(descendants(443238, 201820, 442793),
          descendants(exclude(195771, 201254, 435216, 761051, 4058243, 40484648)),
          name = "Type 2 diabetes mellitus (diabetes mellitus excluding T1DM and secondary)")

cs1 <- cs(descendants(201254, 435216, 40484648),
          name = "Type 1 diabetes mellitus")

cs2 <- cs(descendants(195771),
          name = "Secondary diabetes mellitus")

cs3 <- cs(descendants(4184637, 37059902),
          name = "Hemoglobin A1c (HbA1c) measurements")

cs4 <- cs(descendants(21600744),
          name = "Drugs for diabetes except insulin")

ch <- cohort(
  entry = entry(
    conditionOccurrence(cs0),
    drugExposure(cs4),
    measurement(cs3, valueAsNumber(bt(6.5, 30)), unit("%")),
    measurement(cs3, valueAsNumber(bt(48, 99)), unit("mmol/mol")),
    observationWindow = continuousObservation(priorDays = 365)
  ),
  attrition = attrition(
    'no T1D' = withAll(
      exactly(0, conditionOccurrence(cs1), duringInterval(eventStarts(-Inf, 0)))
      ),
    'no secondary diabettes' = withAll(
      exactly(0, conditionOccurrence(cs2), duringInterval(eventStarts(-Inf, 0)))
    )
  ),
  exit = exit(
    endStrategy = observationExit()
  )
)

However, it returned the error "Error in is.unit(x) : argument "x" is missing, with no default".

I was able to reduce it to the following calls:

library(Capr)

cs3 <- cs(descendants(4184637, 37059902),
          name = "Hemoglobin A1c (HbA1c) measurements")

measurement(cs3, valueAsNumber(bt(6.5, 30)), unit(units = "%"))

This gave me the error

Error in is.unit(x) : argument "x" is missing, with no default

I also tried running the code in

Capr/tests/testthat/test-attributes.R

Line 236 in e44ec1c

tt <- unit(8713L) #gram per deciliter

within the tests and also got an error

> tt <- unit(8713L)
Error in unit(8713L) : argument "units" is missing, with no default

My hypothesis is that there has been a breaking change to unit() and the examples need to be updated. Is that right?

Bug: opAttribute convert to integer. Allow for numeric

When using an opAttribute within a query, Capr coerces any value to an integer. Needs to allow for a numeric value with a decimal.

Example

att <- Capr::createValueAsNumberAttribute(Op = "gt", Value = 3.39)  

att@CriteriaExpression[[1]]@Contents$Value # gives a value of 3
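The truncation can be reproduced in base R without Capr. The sketch below also shows one possible guard, `keepNumericDemo()`, which is a hypothetical helper and not Capr's fix: it converts to integer only when no information would be lost.

```r
# as.integer() drops the fractional part, which is what the report describes.
truncated <- as.integer(3.39)
print(truncated)

# Hypothetical guard: keep decimals, store whole numbers as integers.
keepNumericDemo <- function(value) {
  stopifnot(is.numeric(value), length(value) == 1L)
  if (value == trunc(value)) as.integer(value) else value
}

print(keepNumericDemo(3.39))  # stays numeric
print(keepNumericDemo(40))    # becomes integer
```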

Tips for using calendar date range requirement in defining entry (or attrition)?

We're looking to use a calendar date range when defining cohort entry as is done in ATLAS, and are curious if that functionality is supported in Capr.

For example, how could we update the following Type 2 Diabetes definition (from https://ohdsi.github.io/Capr/articles/Examples.html#type-2-diabetes-mellitus) to also require that the observation of cs0 must occur between January 1, 2022 and December 31, 2023?

library(Capr)

cs0 <- cs(descendants(443238, 201820, 442793), 
          descendants(exclude(195771, 201254, 435216, 761051, 4058243, 40484648)),
          name = "Type 2 diabetes mellitus (diabetes mellitus excluding T1DM and secondary)")

ch <- cohort(
  entry = entry(
    conditionOccurrence(cs0),
    observationWindow = continuousObservation(priorDays = 365)
  ),
  exit = exit(
    endStrategy = observationExit()
  )
)

We looked through the package documentation in https://ohdsi.github.io/Capr/reference/index.html but couldn't find an example that seemed to match. Any tips?

Here's the ATLAS UI version (with a different concept set but you get the idea) we're trying to replicate in Capr

Add labelling to cohort elements

add a rule name and description to a list of attrition
attach concept id with name for concept attribute

Is there Capr support for Health Economics Data Tables

I am interested in creating cohorts with inclusion definitions that rely on health economics data tables such as PAYER_PLAN_PERIOD (https://ohdsi.github.io/CommonDataModel/cdm54.html#Health_Economics_Data_Tables)

When I look at the domains supported by Capr listed in https://ohdsi.github.io/Capr/articles/capr_objects.html#definition, I only see support for clinical data tables and none for Health Economics.

Is there Capr support for building cohorts leveraging Health Economics Data tables?

If yes, what functionality should be used?
If no, can you help me understand why and what would be needed to support them?

Add function for compile generic and cleanup process toCirce -> compile -> write to disk

Age attribute does not write Extent

When using

AgeAtt <- createAgeAttribute(Op = "bt", Value = 40, Extent = 65)
Age40And65Group <- createGroup(Name = "Between 40 and 65 years old",
                               type = "ALL",
                               criteriaList = NULL,
                               demographicCriteriaList = list(AgeAtt),
                               Groups = NULL)

the Extent number is not written in the JSON file.

How to select first exposure?

I'd like to pick the first exposure to a drug, and then require a washout period. I realized that this:

sitagliptinNewUsers <- cohort(
  entry = entry(
    drug(sitagliptin),
    primaryCriteriaLimit = "First",
    observationWindow = continuousObservation(priorDays = 365)
  ),
  attrition = attrition(
    "prior T2DM" = withAll(
      atLeast(1, condition(t2dm), duringInterval(eventStarts(-Inf, 0)))
    )
  ),
  exit = exit(endStrategy = drugExit(sitagliptin, persistenceWindow = 30, surveillanceWindow = 0))
)

actually first applies the 365-day washout, then picks the first occurrence. So if someone was exposed on days 1 and 366 relative to observation_period_start_date, the resulting cohort will have an entry for day 366. Instead, this person should not be included in the cohort (the first exposure is on day 1, and there are fewer than 365 days of prior observation on day 1).
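The difference between the two orderings can be shown with plain R on the example from this report (exposures on days 1 and 366). Both functions are hypothetical illustrations of the logic, not Capr or Circe code, and they assume a non-empty vector of exposure days counted from observation_period_start_date:

```r
exposureDays <- c(1, 366)  # days since observation_period_start_date
priorDays <- 365           # required washout

# Order described in the report: filter on washout, then take the earliest event.
washoutThenFirst <- function(days, priorDays) {
  eligible <- days[days >= priorDays]
  if (length(eligible) == 0) numeric(0) else min(eligible)
}

# Desired order: take the earliest event, then require the washout on it.
firstThenWashout <- function(days, priorDays) {
  first <- min(days)
  if (first >= priorDays) first else numeric(0)
}

print(washoutThenFirst(exposureDays, priorDays))  # day 366 enters the cohort
print(firstThenWashout(exposureDays, priorDays))  # empty: person is excluded
```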

I tried using firstOccurrence():

sitagliptinNewUsers <- cohort(
  entry = entry(
    drug(sitagliptin, firstOccurrence()),
    observationWindow = continuousObservation(priorDays = 365)
  ),
  attrition = attrition(
    "prior T2DM" = withAll(
      atLeast(1, condition(t2dm), duringInterval(eventStarts(-Inf, 0)))
    )
  ),
  exit = exit(endStrategy = drugExit(sitagliptin, persistenceWindow = 30, surveillanceWindow = 0))
)

but then as.json() throws an error:

as.json(sitagliptinNewUsers)
# Error in `purrr::map()`:
# i In index: 1.
# Caused by error in `purrr::map()`:
# i In index: 1.
# Caused by error in `as.list.default()`:
# ! no method for coercing this S4 class to a vector

Add Device exposure as an entry option

Is there a way to query the device domain in the same way we query observations or procedures?

add design doc as pkg site article

Add the design doc md as an article on the pkg site. The design doc includes the intent of the package, users, design specs, and roadmap.

Finalize createAttribute Commands

Finish suite of create Attribute Commands.
Attribute Types include:
-Op (Dates and Numeric) --> check fully compatible
-Logical (First and Exclude) --> compatible
-Correlated Criteria --> compatible
-Source Concept --> compatible
-Concept (using vocabularies) --> check fully compatible
-Text Filler --> not compatible

Inconsistent use of capitalization

For example, createCohortDefinition() has capitalized argument names, such as Name and PrimaryCriteria, while createDrugExposure() has non-capitalized argument names such as conceptSetExpression and attributeList. createGroup() even mixes the two.

For consistency with the HADES style guide I recommend using lowercase first letters for all arguments.

Error when multiple CSE are used in `createPrimaryCriteria`

For example when trying to include a createConditionSourceConceptAttribute in a PrimaryCriteria:

library(Eunomia)
library(Capr)

connectionDetails <- getEunomiaConnectionDetails()
connection <- connect(connectionDetails)

index_codeset <- getConceptIdDetails(conceptIds = 0, connection = connection, vocabularyDatabaseSchema = vocabularyDatabaseSchema, mapToStandard = F)  %>%
  createConceptSetExpression(Name = "Index Event", includeDescendants = F)

source_codeset <- getConceptIdDetails(conceptIds = 0, connection = connection, vocabularyDatabaseSchema = vocabularyDatabaseSchema, mapToStandard = F)  %>%
  createConceptSetExpression(Name = "Source Event", includeDescendants = F)

attribute_list <- list()

attribute_list[[1]] <- createSourceConceptAttribute("ConditionOccurrence", source_codeset)

index_event <-  createConditionOccurrence(conceptSetExpression = index_codeset, attributeList = attribute_list)

primary <- createPrimaryCriteria(Name = "Primary Criteria", ComponentList = list(index_event), ObservationWindow = createObservationWindow(0L,0L), Limit = "First")

cd <- createCohortDefinition(Name = "My cohort", Description = "My cohort", PrimaryCriteria = primary)

compileCohortDefinition(cd)

Error in names(att) <- x@Name : 'names' attribute [1] must be the same length as the vector [0]

I did some digging and this is because only the first concept set is recognised during createPrimaryCriteria, see here.

A quick fix can be implemented here but it might be worth considering the same issue will manifest elsewhere.

Happy to do this and make a PR if you believe it is suitable solution.

PS. Many thanks for the work on this package, it is really cool. I am starting to utilise it a fair bit and would be happy to contribute other things that come up.

R6 v S3/S4 in HADES

From #77 @chrisknoll has posed an interesting way of viewing Capr potentially through R6 via a pure OOP system. The purpose of this post is to

a) understand the benefits of switching to R6 for Capr and
b) consider the impact of R6 within HADES...when/where to use it, why is it beneficial and ultimately does it even matter

I am hoping to get some feedback or thoughts from others @chrisknoll, @azimov, @ablack3, @anthonysena, @schuemie. I know a post was made a while back referencing this same topic.

Thoughts on OOP in context of Capr

Currently Capr is written in S4; this was done for two (at this point, flimsy) reasons:

R6 was not available at the time. R still used Reference Classes as its pure OOP system. I explored this in early Capr development but opted to go the S4 route.
S4 maintains the "feel" of R, whereas R6 is more amenable to programmers coming from the Java and Python worlds. S4 is a stricter version of S3, which does a better job of working in a functional programming pipeline. When Capr was originally created it was intended to heavily leverage the pipe operator %>%; however, this proved to be rather awkward.

Resources to give context to S4 and R6 can be found in chapters 14, 15, and 16 of Advanced R, while the strengths of S4 can be found here.

Starting at Capr v2, there was an intentional effort to transition the feel of the code away from piping and towards nested functions. The construction of cohorts would hence feel like building the UI of a shinyDashboard. A dashboard requires a header, sidebar and body. Within each section the user provides context on the look by adding text, output, boxes etc. A cohort definition is similar: the user constructs sections of the definition, namely the entry, attrition, exit and era. An example of what Capr code should look like now is shown below:

library(Capr)


cd <- cohort(
  
  #entry event (i.e. primary criteria)
  entry = entry(
    drugExposure(metformin, male()), # index query of metformin users who are male
    observationWindow = continuousObservation(priorDays = 365L), #365 min prior obs
    primaryCriteriaLimit = "All" # use all index events
  ),
  
  # attrition to index event (i.e. inclusion rules)
  attrition = attrition(
    # no t1d any time prior
    'no t1d' = withAll( # start group
      exactly( # start criteria (i.e. count)
        x = 0,
        query = conditionOccurrence(t1d),
        aperture = duringInterval(
          startWindow = eventStarts(a = -Inf, b = 0, index = "startDate")
        )
      )
    ),
    expressionLimit = "All" # include all events for attrition
  ),
  
  # exit when the person leaves the cohort
  exit = exit(
    endStrategy = drugExit(
      conceptSet = metformin,
      surveillanceWindow = 30L
    ) # create metformin era to determine exit
  ),
  
  # era logic on how to collapse multiple events
  era = era(eraDays = 30L) # 30 days of metformin use builds an era (bit redundant)
  
)

If Capr were to switch to R6 the syntax would look more like this:

cd <- cohort$new(
  
  #entry event (i.e. primary criteria)
  entry = entry$new(
    #list of queries or single query
    list(
      drugExposure$new(
        conceptSet = metformin,
        attributes = 
          list(
            male$new()
          )
      ),
      observationWindow = continuousObservation$new(prior = 365L),
      primaryCriteriaLimit = limit$new(type = "All"),
      additionalCriteria = NULL, # placeholder
      qualifyingLimit = limit$new(type = "All")
    )
  ),
  
  # attrition to index event (i.e. inclusion rules)
  attrition = attrition$new(
    list(
      group$new(
        name = 'no t1d occurrence',
        type = "all",
        int = NULL, #placeholder
        criteria = criteria$new(
          type = "exactly",
          int = 0, # exactly 0 occurrences, matching the S4 example
          query = conditionOccurrence$new(
            conceptSet = t1d
          ),
          aperture = aperture$new(
            startWindow = eventStarts$new(a = -Inf, b = 0, index = "startDate"),
            ignoreObservationPeriod = FALSE # placeholder
          )
        )
      )
    )
  ),
  
  # exit when the person leaves the cohort
  exit = exit$new(
    endStrategy = drugExit$new(
      conceptSet = metformin,
      surveillanceWindow = 30L
    ),
    censoringCriteria = NULL #placeholder
  ),
  
  # era logic on how to collapse multiple events
  era = era$new(
    eraDays = 30L
  )
  
)

Each class has a new object method where we describe its details. Classes can have further methods such as json coercion, sql builder, print statement, and plot functions. This would be quite nice. I am conscious of not overlapping too much with CirceR.

Thoughts on R6

My main hesitation with R6 is that it removes the "feel" of R. R works best in S3 when you take advantage of its "pipe-ability" and functional programming attributes. Forcing R code into a pure OOP system may turn away "tidy-verse" programmers trying to enter the OHDSI software space. I think there is legitimate fear here given the design of the DARWIN software, which is quite "tidy-verse" heavy. Not that it's any sort of competition.

Having said this, I am beginning to realize the benefits of using R6, particularly if we begin to think about complex objects (circe definition) and pipelines (strategus modules). Having strictly encapsulated objects makes it easier to force a complex routine across a network.

This post has gone on way too long (and I have accidentally deleted it twice), but maybe it starts a conversation about the HADES codebase as it becomes more and more complex :)

Show concept details in concept set when available

To make sure I have the right concepts in my concept set expression, I'd like to see the concept names. Of course, at the start this information is not available:

sitagliptin <- cs(
  descendants(1580747),
  name = "Sitagliptin"
)
sitagliptin 
#   conceptId includeDescendants isExcluded includeMapped
#       <int> <lgl>              <lgl>      <lgl>
# 1   1580747 TRUE               FALSE      FALSE

However, I would have expected to see the information after calling getConceptSetDetails():

sitagliptin <- getConceptSetDetails(sitagliptin, connection, cdmDatabaseSchema)
sitagliptin 
#   conceptId includeDescendants isExcluded includeMapped
#       <int> <lgl>              <lgl>      <lgl>
# 1   1580747 TRUE               FALSE      FALSE

The information is there, in the internal structure of the object. Could the concept name (and other concept information) be shown when available?

Executing writeCohort() leads to Error: 'list_flatten' is not an exported object from 'namespace:purrr'

I'm trying out the README example of your package.
When executing the following command
writeCohort(ch, path)
an error occurs:
Error: 'list_flatten' is not an exported object from 'namespace:purrr'

My environment:

R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Capr_2.0.2 rlang_1.0.6 openxlsx_4.2.4 forcats_0.5.1
[5] stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.1.4
[9] tidyr_1.1.4 tibble_3.1.5 ggplot2_3.4.0 tidyverse_1.3.1
[13] SqlRender_1.13.1 DatabaseConnector_6.0.0
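A likely cause, offered here as an observation and not a confirmed diagnosis: list_flatten() was added in purrr 1.0.0, while the session above has purrr 0.3.4 attached. The sketch below shows a version comparison in base R; `needsUpgradeDemo()` is a hypothetical helper, and the version strings are copied from this report so that the example runs without purrr installed.

```r
# TRUE when the installed version is older than the minimum required one.
needsUpgradeDemo <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

# Versions taken from the session info in this report.
print(needsUpgradeDemo("0.3.4", "1.0.0"))

# In a live session the installed version could be read with
# as.character(utils::packageVersion("purrr")).
```

If the check returns TRUE, updating purrr (for example with install.packages("purrr")) would be the first thing to try.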

Passing Database Connections through functions

Functions accessing the OMOP Vocabulary need a database connection parameter (i.e. lookup functions). Test and update to OHDSI standard

Improve Concept Lookup and Concept Set Expression Mapping

Mature concept lookup functions to handle ad hoc concept searches (when the concept id is not known in advance) and improve functions for setting concept set expression mapping (i.e. include descendants, include mapped and exclude)

Setting observational start date for cohort

Hi all, I am trying to limit my cohort to conditions with a start date between 2013-01-01 and 2022-12-31, using the startDate attribute, but it didn't work. Below is my code using Eunomia as an example. As you can see, I still get conditions with a cohort_start_date before 2013. Is my code correct?


CDMConnector::downloadEunomiaData(
  pathToData = here::here(), 
  overwrite = TRUE
)
usethis::edit_r_environ()

db <- dbConnect(duckdb(), dbdir = eunomia_dir())

cdm <- cdm_from_con(db, cdm_schema = "main", write_schema = "main")

concept_sets <- list(gibleed = cs(descendants(192671), name = "gibleed"))

# generate Capr cohort

condition_cohort_template <- function(concept_set) {
  cohort(
    entry = entry(
      conditionOccurrence(
        concept_set,
        startDate(bt(as.Date("2013-01-01"), as.Date("2022-12-31"))),
        firstOccurrence()
      ),
      observationWindow = continuousObservation(priorDays = 365),
      primaryCriteriaLimit = "All"
    ),
    exit = exit(endStrategy = observationExit())
  )
}



# Write a loop

cohortsConditions <- list()
for (i in seq_along(concept_sets)) {
  cohortsConditions[[i]] <- condition_cohort_template(concept_sets[[i]])
}

# Lets add names to the list of cohorts
names(cohortsConditions) <- names(concept_sets)

cdm <- generateCohortSet(cdm, cohortsConditions, "cohort_conditions", overwrite = TRUE)


cdm$cohort_conditions

> cdm$cohort_conditions
# Source:   table<main.cohort_conditions> [?? x 4]
# Database: DuckDB 0.8.1 [miked@Windows 10 x64:R 4.3.1/C:\Users\miked\AppData\Local\Temp\RtmpGcDKbM/nfqdmjxy]
   cohort_definition_id subject_id cohort_start_date cohort_end_date
                  <int>      <dbl> <date>            <date>         
 1                    1         80 1974-10-27        2019-04-15     
 2                    1        733 1980-07-22        2018-08-30     
 3                    1       1259 1981-10-16        2019-01-03     
 4                    1       1265 2018-02-20        2018-02-21     
 5                    1       1756 2013-07-14        2018-08-26     
 6                    1       1967 1994-05-17        2019-04-26     
 7                    1       2786 1959-12-27        2015-12-19     
 8                    1       2834 1972-03-16        1999-12-11     
 9                    1       3216 1995-12-22        2018-10-05     
10                    1       3224 2002-01-26        2019-06-07     
# i more rows
# i Use `print(n = ...)` to see more rows

Bug: adding death query without CSE

Need to add an open-ended death query to Primary Criteria and Censoring Criteria. Query is used to search all deaths in death domain. Common censoring criteria component. Example below:

dd <- Capr::createDeath()
cen <- Capr::createCensoringCriteria(Name = "Death Censoring Criteria",
                                 ComponentList = list(dd))
#yields indexing error for concept set expressions

Add weekly github R check action for Hades (refer to Martijn Hades post from 2/15)

address following:

Cohort expression JSON generated by CAPR is not importable into Atlas

Expected behavior: When I copy the JSON expression of a cohort definition generated by Capr and paste it into the Atlas UI for cohort definitions (under the export tab, reload), the Atlas cohort definition editor should be able to parse the JSON and render the definition.

Actual behavior: Atlas is unable to parse the JSON from Capr.

Reproducible example:

Primary Criteria :

conceptsWithKeyWord <- Capr::lookupKeyword(
  keyword = "Diabetes",
  search_type = "any",
  cdmDatabaseSchema = cdmDatabaseSchema
)

conceptSetExpression <- Capr::createConceptSetExpression(conceptSet = conceptsWithKeyWord,
                                                         Name = "Diabetes Key Word", 
                                                         includeDescendants = TRUE)

attributeFirst <- Capr::createFirstAttribute(logic = TRUE)
attributeAge <- Capr::createAgeAttribute(Op = "gt", Value = "10")

conceptSetConditionOccurrence <-
  Capr::createConditionOccurrence(conceptSetExpression = conceptSetExpression,
                                  attributeList = list(attributeFirst, attributeAge))

primaryCriteria <-
  Capr::createPrimaryCriteria(
    Name = "Primary Criteria Using Condition Occurrence concept set",
    ComponentList = list(conceptSetConditionOccurrence),
    ObservationWindow = Capr::createObservationWindow(0L, 0L),
    Limit = "First"
  )

Inclusion Rule :

irs <- Capr::createInclusionRules(Name = "Inclusion Rules",
                                  Contents = list(),
                                  Limit = "First")

End Strategy :

es <- Capr::createDateOffsetEndStrategy(offset = 0, eventDateOffset = "EndDate")

Cohort Era :

cohortEra <- Capr::createCohortEra(EraPadDays = as.integer(1))

Creating Definition :

desc <- "Description for Testing Cohort"

Final Cohort Definition:

cd <- Capr::createCohortDefinition(
  Name = "Testing Cohort",
  Description = desc,
  PrimaryCriteria = primaryCriteria,
  InclusionRules = irs,
  EndStrategy = es,
  CohortEra = cohortEra
)

What's the right way to get json from Capr code object?

library(DatabaseConnector)
library(Capr)
connectionDetails <- Eunomia::getEunomiaConnectionDetails()
connection <- connect(connectionDetails)
#> Connecting using SQLite driver
gibleed <- cs(descendants(192671))
gibleedCohort <- cohort(entry = entry(condition(gibleed)))
as.json(gibleedCohort)
#> Error: 'list_flatten' is not an exported object from 'namespace:purrr'
disconnect(con)
#> Error in disconnect(con): object 'con' not found

Created on 2023-07-01 with reprex v2.0.2
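
The error message points at purrr rather than at Capr itself. As far as I know, list_flatten() first shipped in purrr 1.0.0, so an older installed purrr would produce exactly this message. The sketch below checks the installed version under that assumption; meets_minimum() is a hypothetical helper written for this example and is not part of Capr or purrr.

```r
# Assumption: list_flatten() is available from purrr 1.0.0 onwards.
required <- package_version("1.0.0")

# Hypothetical helper: TRUE when a version string meets the minimum version.
meets_minimum <- function(installed, minimum = required) {
  package_version(installed) >= minimum
}

# Only inspect purrr when it is actually installed.
if (requireNamespace("purrr", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("purrr"))
  if (!meets_minimum(installed)) {
    message("purrr ", installed, " is older than ", as.character(required),
            "; updating purrr may resolve the list_flatten error")
  }
}
```

If the check reports an older version, updating purrr and restarting R before calling as.json() again is the first thing to try.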

The function createConceptSetExpression assigns the same CodesetId to different concept sets.

Example:

DMDx_hist <- getConceptIdDetails(conceptIds = c(40769338, 43021173, 42539022, 46270562),
                                 connection = cdm_bbdd,
                                 vocabularyDatabaseSchema = cdm_schema) %>%
  createConceptSetExpression(Name = "History of Diabetes Diagnosis",
                             includeDescendants = TRUE)
T1Dx <- getConceptIdDetails(conceptIds = c(201254, 435216, 4058243, 40484648),
                            connection = cdm_bbdd,
                            vocabularyDatabaseSchema = cdm_schema) %>%
  createConceptSetExpression(Name = "Type 1 Diabetes Diagnosis",
                             includeDescendants = TRUE)

DMDx_hist@ConceptSetExpression[[1]]@id #"16101412-f225-4cdd-830a-344e094563e8"
T1Dx@ConceptSetExpression[[1]]@id #"16101412-f225-4cdd-830a-344e094563e8"

capr v2 release

Tasks for capr v2 release (aiming for March 17)

Possible bug in Example vignette code for applying inclusion criteria

We (https://github.com/nimirea) and I suspect that the examples in https://github.com/OHDSI/Capr/blob/main/vignettes/Examples.Rmd may be out of date with the latest version of the package (2.0.5).

Namely, when we executed the example code on a CDM, we found that without wrapping the arguments to attrition() with withAll() calls, the attrition information was not properly applied in the generated cohort.

For example, in the following lines, we suspect the lines need to be wrapped with withAll() for the inclusion criteria to be correctly applied.

Capr/vignettes/Examples.Rmd

Lines 56 to 59 in fa03ba0

    
    atLeast(1,
            observationPeriod(),
            duringInterval(eventStarts(-Inf, -365, "startDate"))
    )

I'm curious if the developers agree these examples are out of sync with the latest version or not. Maybe we're missing something else.

I can include the actual JSON comparison if it's helpful.
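
For reference, the wrapping the reporters describe could look like the sketch below. It only combines calls that already appear on this page (attrition(), withAll(), atLeast(), observationPeriod(), duringInterval(), eventStarts()). The rule name is made up for illustration, and the requireNamespace() guard is there only so the snippet is skipped when Capr is not installed. Whether the unwrapped form is meant to work is the open question of this issue, so treat this as the suspected fix rather than confirmed behaviour.

```r
# Skip the example entirely when Capr is not available.
if (requireNamespace("Capr", quietly = TRUE)) {
  library(Capr)

  # Hypothetical rule name; the criterion is the one quoted from Examples.Rmd,
  # wrapped in withAll() as the reporters suggest.
  priorObservation <- attrition(
    "365 days of prior observation" = withAll(
      atLeast(1,
              observationPeriod(),
              duringInterval(eventStarts(-Inf, -365, "startDate")))
    )
  )
}
```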

No Vignette

After installing the package, running vignette("CAPR_tutorial", package = "Capr") reports that there is no such vignette.

Fix R check notes and warnings locally

Run R check and get rid of all notes and warnings.
Most warnings are scoping issues.

Object inheritance

First of all I'd like to state that I'm not too familiar with S4 classes, but I am comfortable with R6.

Looking at the code, the show methods of the following objects are identical:

Could they inherit from an opAttributeSuper class, which implements this method, and any other identical methods?

If similar implementations exist elsewhere in the package, I think this would significantly reduce code duplication.
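
A minimal base R sketch of the idea, assuming nothing about Capr's actual class layout: the class names below (including the "Demo" subclasses) are hypothetical stand-ins, and only the methods package is used. A virtual superclass carries the shared slots and a single show method, which the subclasses inherit.

```r
library(methods)

# Hypothetical virtual superclass holding the slots every op attribute shares.
setClass("opAttributeSuper",
         representation("VIRTUAL", name = "character", op = "character"))

# Two hypothetical subclasses that differ only in the type of `value`.
setClass("opAttributeNumericDemo",
         representation(value = "numeric"),
         contains = "opAttributeSuper")
setClass("opAttributeIntegerDemo",
         representation(value = "integer"),
         contains = "opAttributeSuper")

# One show method defined on the superclass; both subclasses inherit it.
setMethod("show", "opAttributeSuper", function(object) {
  cat(object@name, " ", object@op, " ", object@value, "\n", sep = "")
})

ageAttr <- new("opAttributeIntegerDemo", name = "Age", op = "gte", value = 18L)
doseAttr <- new("opAttributeNumericDemo", name = "Dose", op = "lt", value = 2.5)
show(ageAttr)
show(doseAttr)
```

S4 dispatch falls back to the nearest superclass method, so a subclass only needs its own show method when its output really differs.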

Fix drug exit bug

DrugExit class fails coercion. Check coercion of all end strategies and add unit tests

Typos using "-" instead of "=" in some documentation

Spotted a couple of places in code examples where the = operator was mistakenly typed as -

See

Capr/vignettes/capr_objects.Rmd

Line 127 in fa03ba0

atenololConceptSet <- cs(descendants(1314002), name - "atenolol")

and

Capr/vignettes/capr_objects.Rmd

Line 141 in fa03ba0

atenololConceptSet <- cs(descendants(1314002), name - "atenolol")

looks like eval = FALSE on both of these chunks which is likely why it wasn't caught.

Support updates to Circe v1.11.0

Circe 1.11.0 adds a new feature to date-offset start/end dates of 'query' criteria. This allows you to override an end date of an index event, or re-assign a start date to some other time. Example: A pregnancy confirmation event may indicate 12-weeks, so you may want to offset the start date of this event to indicate a pregnancy start by 7*12=84d.

The main change to the API is that all Criteria now have a DateAdjustment field, used here.

The DateAdjustment class is described here.
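
The arithmetic in the pregnancy example can be sketched in base R. The confirmation date is invented for illustration, and moving the start date earlier (subtracting the offset) is my reading of the example, not something taken from the Circe documentation.

```r
# Hypothetical date of the pregnancy confirmation event.
confirmationDate <- as.Date("2020-06-01")

# 12 weeks expressed in days.
offsetDays <- 7 * 12

# Assumed direction: shift the event start back by the offset.
pregnancyStart <- confirmationDate - offsetDays
print(pregnancyStart)
```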

	#' Function to create age attribute
	#' @param op an opAttribute object that is either numeric or integer that defines the logical
	#' operation used to determine eligible patient age
	#' @export
	age <- function(op) {
	  check <- all(grepl("opAttribute", methods::is(op)))
	  if (!check) {
	    stop("Input must be an opAttributeNumeric or opAttributeInteger.")
	  }
	  methods::new("opAttributeInteger",
	               name = "Age",
	               op = op@op,
	               value = as.integer(op@value),
	               extent = as.integer(op@extent))
	}

	Persons with new type 2 diabetes mellitus at first dx rx or lab

	https://atlas-phenotype.ohdsi.org/#/cohortdefinition/90

	```{r, eval=FALSE}
	library(Capr)

	cs0 <- cs(descendants(443238, 201820, 442793),
	          descendants(exclude(195771, 201254, 435216, 761051, 4058243, 40484648)),
	          name = "Type 2 diabetes mellitus (diabetes mellitus excluding T1DM and secondary)")

	cs1 <- cs(descendants(201254, 435216, 40484648),
	          name = "Type 1 diabetes mellitus")

	cs2 <- cs(descendants(195771),
	          name = "Secondary diabetes mellitus")

	cs3 <- cs(descendants(4184637, 37059902),
	          name = "Hemoglobin A1c (HbA1c) measurements")

	cs4 <- cs(descendants(21600744),
	          name = "Drugs for diabetes except insulin")


	ch <- cohort(
	  entry = entry(
	    conditionOccurrence(cs0),
	    drugExposure(cs4),
	    measurement(cs3, valueAsNumber(bt(6.5, 30)), unit("%")),
	    measurement(cs3, valueAsNumber(bt(48, 99)), unit("mmol/mol")),
	    observationWindow = continuousObservation(priorDays = 365)
	  ),
	  attrition = attrition(
	    'no T1D' = withAll(
	      exactly(0, conditionOccurrence(cs1), duringInterval(eventStarts(-Inf, 0)))
	    ),
	    'no secondary diabetes' = withAll(
	      exactly(0, conditionOccurrence(cs2), duringInterval(eventStarts(-Inf, 0)))
	    )
	  ),
	  exit = exit(
	    endStrategy = observationExit()
	  )
	)

	atLeast(1,
	        observationPeriod(),
	        duringInterval(eventStarts(-Inf, -365, "startDate"))
	)