
stephanieshong / valueset-tools


This project forked from jhu-bids/termhub


Various converters to convert value sets from CSV to JSON, etc.

License: MIT License

Python 94.60% Makefile 3.25% Shell 2.15%

valueset-tools's Introduction

ValueSet Tools

Tools for converting value sets between formats, such as converting extensional value sets from CSV to JSON that can be uploaded to a FHIR server. Also includes tools to automate CRUD operations, such as reads and updates, against various data sources and web services.

Set up / installation

  1. You must have Python 3 installed.
  2. Clone the repo: git clone https://github.com/HOT-Ecosystem/ValueSet-Tools.git
  3. Change directory: cd ValueSet-Tools
  4. Make & use a virtual environment: virtualenv env; source env/bin/activate
  5. Install dependencies: pip install -r requirements.txt
  6. To use the "VSAC to OMOP/FHIR JSON" tool, which fetches from Google Sheets, you'll need the following:
    6.a. Access to this Google Sheet.
    6.b. Place credentials.json and token.json inside the env/ directory. For BIDS members, these can be downloaded from the BIDS OneDrive here.
  7. Create an env/.env file based on env/.env.example, replacing VSAC_API_KEY with your own VSAC API key as shown in your profile. More instructions on getting an API key can be found in "Step 1" on this page. Or, if you are a BIDS member, you can simply download and use the .env file from the BIDS OneDrive. It comes pre-populated with an API key from the shared UMLS BIDS account.
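
Before running any tool, it can help to confirm that step 7 took effect. The following is a minimal sketch, assuming env/.env uses plain KEY=VALUE lines; the helper names read_env_file and has_vsac_api_key are hypothetical and are not part of this repository.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Assumption: the file uses plain KEY=VALUE syntax, optionally quoted.
    """
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_vsac_api_key(path="env/.env"):
    """Return True if the file exists and defines a non-empty VSAC_API_KEY."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    return bool(read_env_file(env_path).get("VSAC_API_KEY"))
```

Calling has_vsac_api_key() from the repository root returns False if the file is missing or the key is empty, which is a quick way to catch a skipped setup step.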

Tools

First, cd into the directory where this repository was cloned.

1. VSAC Wrangler

This will fetch OIDs, either from the "OID" column of this Google Sheet or from input/oids.txt, make VSAC API calls, and produce output.

Syntax

python3 -m vsac_wrangler <options>

Options:

-i, --input-source-type
    Choices: ['google-sheet', 'oids-txt']
    Default: 'oids-txt'
    Description: If "google-sheet", this will fetch from a specific, hard-coded Google Sheet, and pull OIDs from a specific column in that sheet. If "oids-txt", it will pull a list of OIDs from input/oids.txt.

-g, --google-sheet-name
    Choices: ['CDC reference table list', 'VSAC Lisa1']
    Default: 'CDC reference table list'
    Description: The name of the tab within the Google Sheet that contains the target data in its OID column. Make sure to enclose the text in quotes, e.g. -g "VSAC Lisa1". This option can only be used if --input-source-type is google-sheet.

-o, --output-structure
    Choices: ['fhir', 'vsac', 'palantir-concept-set-tables', 'atlas']
    Default: 'vsac'
    Description: Destination structure. This determines the specific fields and, in some cases, the internal structure of the data in those fields.

-f, --output-format
    Choices: ['tabular/csv', 'json']
    Default: 'json'
    Description: The output format. If "tabular/csv", it will produce a tabular file; CSV by default. This can be changed to TSV by passing "\t" as the field delimiter.

-d, --tabular-field-delimiter
    Choices: [',', '\t']
    Default: ','
    Description: Field delimiter for tabular output. This applies when selecting "tabular/csv" for "output-format". By default, uses ",", which means that the output will be CSV (Comma-Separated Values). If "\t" is chosen, output will be TSV (Tab-Separated Values).

-d2, --tabular-intra-field-delimiter
    Choices: [',', '\t', ';', '|']
    Default: '|'
    Description: Intra-field delimiter for tabular output. This applies when selecting "tabular/csv" for "output-format". This delimiter is used when a single field contains multiple values. For example, in "tabular/csv" format, there will be 1 row per combination of OID (Object ID) + code system. A single OID represents a single value set, which can have codes from multiple code systems. For a given OID + code system combination, there will likely be multiple codes in the "code" field. These codes will be delimited using the intra-field delimiter.

-j, --json-indent
    Choices: 0 - 4
    Default: 4
    Description: The number of spaces to indent when outputting JSON. If 0, there will be no indent and no other whitespace either. 0 is useful for minimal file size. 2 and 4 tend to be standard indent values for readability.

-c, --use-cache
    Description: When running this tool, a cache of the results from the VSAC API will always be saved. If this flag is passed, the cached results will be used instead of calling the API. This is useful for (i) working offline, or (ii) speeding up processing. To skip the cache and get the most up-to-date results (both (i) the OIDs present in the Google Sheet, and (ii) the results from VSAC), simply run the tool without this flag.

-h, --help
    Description: Shows help information for using the tool.
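
To make the flags, choices, and defaults above concrete, here is a minimal sketch of how they could be declared with Python's argparse. It mirrors only what the options list states; it is not the actual vsac_wrangler source, build_parser is a hypothetical name, and the rule that -g requires --input-source-type google-sheet is not enforced here.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="vsac_wrangler")
    parser.add_argument(
        "-i", "--input-source-type",
        choices=["google-sheet", "oids-txt"], default="oids-txt")
    parser.add_argument(
        "-g", "--google-sheet-name",
        choices=["CDC reference table list", "VSAC Lisa1"],
        default="CDC reference table list")
    parser.add_argument(
        "-o", "--output-structure",
        choices=["fhir", "vsac", "palantir-concept-set-tables", "atlas"],
        default="vsac")
    parser.add_argument(
        "-f", "--output-format",
        choices=["tabular/csv", "json"], default="json")
    # "\t" here is a real tab character, as Python interprets the escape.
    parser.add_argument(
        "-d", "--tabular-field-delimiter",
        choices=[",", "\t"], default=",")
    parser.add_argument(
        "-d2", "--tabular-intra-field-delimiter",
        choices=[",", "\t", ";", "|"], default="|")
    parser.add_argument(
        "-j", "--json-indent",
        type=int, choices=range(0, 5), default=4)
    # A boolean flag: present means "use cached VSAC results".
    parser.add_argument("-c", "--use-cache", action="store_true")
    return parser
```

For instance, build_parser().parse_args(["-f", "tabular/csv", "-c"]) yields a namespace with output_format set to "tabular/csv", use_cache set to True, and every other option at its documented default. The -h/--help flag comes from argparse automatically.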

Examples

1. Create a TSV with comma-delimited VSAC codes, and use the last cached results from the VSAC API.

python3 -m vsac_wrangler -o vsac -f tabular/csv -d "\t" -d2 "," -c
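
The row layout this example asks for (one row per OID + code system, with codes joined by the intra-field delimiter) can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the function name to_tabular, the column names, and the sample OID and codes are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def to_tabular(concepts, field_delimiter="\t", intra_field_delimiter=","):
    """Render (oid, code_system, code) triples as delimited text.

    Produces one row per (OID, code system) pair; all codes for that pair
    are joined into a single field using the intra-field delimiter.
    """
    grouped = defaultdict(list)
    for oid, code_system, code in concepts:
        grouped[(oid, code_system)].append(code)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=field_delimiter, lineterminator="\n")
    writer.writerow(["oid", "codeSystem", "code"])  # hypothetical column names
    for (oid, code_system), codes in grouped.items():
        writer.writerow([oid, code_system, intra_field_delimiter.join(codes)])
    return buffer.getvalue()


# Made-up sample data: one value set (OID) with codes from two code systems.
sample = [
    ("1.2.3", "SNOMEDCT", "111"),
    ("1.2.3", "SNOMEDCT", "222"),
    ("1.2.3", "ICD10CM", "A00"),
]
print(to_tabular(sample))
```

The sample yields two data rows: the SNOMEDCT row carries "111,222" in its code field, and the ICD10CM row carries "A00". Choosing an intra-field delimiter that differs from the field delimiter keeps the file parseable without quoting.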

2. CSV to FHIR JSON

First, convert your CSV so that its column names match the example below. Then you can run these commands.

Syntax

python3 -m csv_to_fhir path/to/FILE.csv

Example

python3 -m csv_to_fhir examples/1/input/n3cLikeExtensionalValueSetExample.csv

Before:

valueSet.id,valueSet.name,valueSet.description,valueSet.status,valueSet.codeSystem,valueSet.codeSystemVersion,concept.code,concept.display
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1234,mama bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1235,papa bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1236,baby bear

After:

{
    "resourceType": "ValueSet",
    "id": 1,
    "meta": {
        "profile": [
            "http://hl7.org/fhir/StructureDefinition/shareablevalueset"
        ]
    },
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\">\n\t\t\t<p>A family of bears.</p>\n\t\t</div>"
    },
    "name": "bear family",
    "title": "bear family",
    "status": "draft",
    "description": "A family of bears.",
    "compose": {
        "include": [
            {
                "system": "http://loinc.org",
                "version": 2.36,
                "concept": [
                    {
                        "code": 1234,
                        "display": "mama bear"
                    },
                    {
                        "code": 1235,
                        "display": "papa bear"
                    },
                    {
                        "code": 1236,
                        "display": "baby bear"
                    }
                ]
            }
        ]
    }
}
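
The mapping from the CSV columns to the JSON fields above can be sketched roughly as below. This is not the csv_to_fhir source: csv_to_value_set is a hypothetical name, the sketch assumes the file describes a single value set, and it keeps every value as a string as read from the CSV, whereas the sample output above shows some values as numbers.

```python
import csv
import io
from html import escape

PROFILE = "http://hl7.org/fhir/StructureDefinition/shareablevalueset"


def csv_to_value_set(csv_text):
    """Build a FHIR-style ValueSet dict from CSV text (single value set)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV has no data rows")

    first = rows[0]  # value-set-level fields repeat on every row
    description = first["valueSet.description"]

    # Group concepts by (code system, version): one "include" entry per pair.
    includes = {}
    for row in rows:
        key = (row["valueSet.codeSystem"], row["valueSet.codeSystemVersion"])
        include = includes.setdefault(
            key, {"system": key[0], "version": key[1], "concept": []})
        include["concept"].append(
            {"code": row["concept.code"], "display": row["concept.display"]})

    div = ('<div xmlns="http://www.w3.org/1999/xhtml">'
           f"<p>{escape(description)}</p></div>")
    return {
        "resourceType": "ValueSet",
        "id": first["valueSet.id"],
        "meta": {"profile": [PROFILE]},
        "text": {"status": "generated", "div": div},
        "name": first["valueSet.name"],
        "title": first["valueSet.name"],
        "status": first["valueSet.status"],
        "description": description,
        "compose": {"include": list(includes.values())},
    }
```

Passing the "Before" CSV text to csv_to_value_set and serializing the result with json.dumps(value_set, indent=4) gives a document with the same shape as the "After" example, with one include entry for http://loinc.org containing the three bear concepts.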

valueset-tools's People

Contributors

joeflack4, sigfried
