omerronen / clinical-rule-vetting

This project forked from yu-group/clinical-rule-vetting


Learning clinical-decision rules with interpretable models.

Home Page: https://rules.csinva.io

License: MIT License

Python 2.14% R 0.01% TeX 0.57% Jupyter Notebook 97.28% Shell 0.01%

clinical-rule-vetting's Introduction

⚕️ Interpretable Clinical Decision Rules ⚕️

Validating and deriving clinical-decision rules. Work-in-progress.

This is a collaborative repository intended to validate and derive clinical-decision rules. We use a unified pipeline across a variety of contributed datasets to vet previous modeling practices for clinical decision rules. Additionally, we hope to externally validate the rules under study here with data from UCSF.

Rule derivation datasets

| Dataset | Task | Size | References | Processed |
|---|---|---|---|---|
| iai_pecarn | Predict intra-abdominal injury requiring acute intervention before CT | 12,044 patients, 203 with IAI-I | 📄, 🔗 | |
| tbi_pecarn | Predict traumatic brain injuries before CT | 42,412 patients, 376 with ciTBI | 📄, 🔗 | |
| csi_pecarn | Predict cervical spine injury in children | 3,314 patients, 540 with CSI | 📄, 🔗 | |
| tig_pecarn | Predict bacterial/non-bacterial infections in febrile infants from RNA transcriptional biosignatures | 279 patients, ? with infection | 🔗 | |
| exxagerate | Predict 30-day mortality for acute exacerbations of chronic obstructive pulmonary disease (AECOPD) | 1,696 patients, 17 mortalities | 📄, 🔗 | |
| heart_disease_uci | Predict heart disease presence from basic attributes / screening | 920 patients, 509 with heart disease | 📄, 🔗 | |

Research paper 📄, Data download link 🔗

Datasets are all tabular (or at least have interpretable input features), reasonably large (e.g. have at least 100 positive and negative cases), and have a binary outcome. For PECARN datasets, please read and agree to the research data use agreement on the PECARN website.
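The size criterion above (at least 100 positive and 100 negative cases, with a binary outcome) can be screened for programmatically. This is an illustrative helper, not part of the repo; `meets_criteria` and the column name `y` are hypothetical:

```python
import pandas as pd

# Hypothetical screening check for a candidate dataset, per the criteria above:
# binary outcome, and at least 100 cases in each outcome class.
def meets_criteria(df: pd.DataFrame, outcome_col: str) -> bool:
    counts = df[outcome_col].value_counts()
    is_binary = set(df[outcome_col].unique()) <= {0, 1}
    large_enough = len(counts) == 2 and counts.min() >= 100
    return bool(is_binary and large_enough)

toy = pd.DataFrame({"y": [0] * 150 + [1] * 120})
print(meets_criteria(toy, "y"))  # True: 150 negatives, 120 positives
```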

Possible data sources: PECARN datasets | Kaggle datasets | MDCalc | UCI | OpenML | MIMIC | UCSF De-ID. We may later expand to other high-stakes datasets (e.g. COMPAS, loan risk).

Contributing checklist

To contribute a new project (e.g. a new dataset + modeling), create a pull request following the steps below. The easiest way to do this is to copy-paste an existing project (e.g. iai_pecarn) into a new folder and then edit that one.

Helpful docs: Collaboration details | Lab writeup | Slides
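The copy-then-edit workflow above can be sketched in Python. This toy version creates a throwaway directory tree so it runs anywhere; in practice you would run the `copytree` and `mkdir` lines from the real repo root, and `my_project` is a placeholder for your `project_name`:

```python
import shutil
import tempfile
from pathlib import Path

# Stand-in for the repo root, so this sketch is self-contained.
repo = Path(tempfile.mkdtemp())
template = repo / "rulevetting" / "projects" / "iai_pecarn"
template.mkdir(parents=True)
(template / "dataset.py").touch()  # template file to copy

project_name = "my_project"  # placeholder for your project_name
new_proj = repo / "rulevetting" / "projects" / project_name
shutil.copytree(template, new_proj)                          # copy the template project
(repo / "data" / project_name / "raw").mkdir(parents=True)   # raw data lives here

print((new_proj / "dataset.py").exists())  # True
```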

  • Repo set up
    • Create a fork of this repo (see tutorial on forking/merging here)
    • Install the repo as shown below
    • Select a dataset - once you've selected, open an issue in this repo with the name of the dataset + a brief description so others don't work on the same dataset
    • Assign a project_name to the new project (e.g. iai_pecarn)
  • Data preprocessing
    • Download the raw data into data/{project_name}/raw
      • Don't commit any very large files
    • Copy the template files from rulevetting/projects/iai_pecarn to a new folder rulevetting/projects/{project_name}
      • Rewrite the functions in dataset.py for processing the new dataset (e.g. see the dataset for iai_pecarn)
      • Document any judgement calls you aren't sure about using the dataset.get_judgement_calls_dictionary function
      • Notebooks / helper functions are optional, all files should be within rulevetting/projects/{project_name}
  • Data description
    • Describe each feature in the processed data in a file named data_dictionary.md
    • Summarize the data and the prediction task in a file named readme.md. This should include basic details of data collection (who, how, when, where), why the task is important, and how a clinical decision rule may be used in this context. Should also include your names/affiliations.
  • Modeling
    • Baseline model - implement baseline.py for predicting given a baseline rule (e.g. from the existing paper)
    • New model - implement model_best.py for making predictions using your newly derived best model
  • Lab writeup (see instructions)
    • Save writeup into writeup.pdf + include source files
    • Should contain details on exploratory analysis, modeling, validation, comparisons with baseline, etc.
  • Submitting
    • Ensure that all tests pass by running pytest --project {project_name} from the repo directory
    • Open a pull request and it will be reviewed / merged
  • Reviewing submissions
    • Each pull request will be reviewed by others before being merged
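The preprocessing steps in the checklist can be sketched as a minimal `Dataset`-like class. This is a hypothetical mirror of the `dataset.py` template, not the real API; the method bodies, the `drop_missing_age` judgement call, and the toy data are all illustrative:

```python
import pandas as pd

# Hypothetical, minimal mirror of the dataset.py template; the real
# template's method names and signatures may differ.
class Dataset:
    def get_judgement_calls_dictionary(self) -> dict:
        # Map each pipeline step to the judgement calls it involves;
        # the listed options are the alternatives considered (first = default).
        return {
            "clean_data": {
                "drop_missing_age": [True, False],  # hypothetical judgement call
            },
        }

    def clean_data(self, df: pd.DataFrame, drop_missing_age: bool = True) -> pd.DataFrame:
        # One cleaning step, controlled by a documented judgement call.
        if drop_missing_age:
            df = df.dropna(subset=["age"])
        return df

raw = pd.DataFrame({"age": [3.0, None, 7.0], "outcome": [0, 1, 0]})
cleaned = Dataset().clean_data(raw)
print(len(cleaned))  # 2: the row with missing age is dropped by default
```

`baseline.py` and `model_best.py` would follow the same pattern: a class exposing a predict method, evaluated by the shared test suite.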

Installation

Note: requires Python 3.7 and pytest (for running the automated tests). It is best practice to create a venv or pipenv for this project.

python -m venv rule-env
source rule-env/bin/activate

Then, clone the repo and install the package and its dependencies.

git clone https://github.com/Yu-Group/rule-vetting
cd rule-vetting
pip install -e .

Now run the automated tests to ensure everything works (warnings are fine as long as all tests pass).

pytest --project iai_pecarn

To use the environment with Jupyter, you may need to add it as a Jupyter kernel.

python -m ipykernel install --user --name=rule-env

Clinical Trial Datasets

| Dataset | Task | Size | References | Processed |
|---|---|---|---|---|
| bronch_pecarn | Evaluate effectiveness of oral dexamethasone for acute bronchiolitis | 600 patients, 50% control | 📄, 🔗 | |
| gastro_pecarn | Impact of emergency department probiotic treatment of pediatric gastroenteritis | 886 patients, 50% control | 📄, 🔗 | |

Research paper 📄, Data download link 🔗

Reference

Background reading
Related packages
Updates
Related open-source collaborations

clinical-rule-vetting's People

Contributors: albertqu, csinva, floricaconstantine, hysk79, keyan3, mindful-math, omerronen, ssaxena00
clinical-rule-vetting's Issues

Document all cleaning steps in a google doc

The data cleaning and imputation steps are currently not documented anywhere outside the rulevetting/projects/tbi_pecarn/dataset.py file; see the Dataset.get_data function.

We need to write up all the steps in a short Google Doc. I suggest stepping through them manually with a debugger; let me know if you need help using one.
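The suggested walk-through can be sketched with a stand-in pipeline. `get_data` here is a toy stand-in, not the real tbi_pecarn code; the idea is to pause at the top of the function and step (`n` in pdb) through each cleaning step, noting what each one changes:

```python
# Toy stand-in for Dataset.get_data; the real cleaning steps live in
# rulevetting/projects/tbi_pecarn/dataset.py.
def get_data():
    # Placing breakpoint() here would drop into pdb; `n` then steps
    # one line at a time so each transformation can be documented.
    rows = list(range(10))
    rows = [r for r in rows if r % 2 == 0]  # step 1: filter out odd rows
    rows = [r * 2 for r in rows]            # step 2: rescale a feature
    return rows

print(get_data())  # [0, 4, 8, 12, 16]
```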
