
resource-utils's Introduction

Utilities

Ad hoc utilities for analyzing FHIR resources. Requires the msgpack and tqdm packages.

pack.py

The scripts below use msgpack for faster loading of resources (loading 1.5M resources takes only a few seconds). Resources are assumed to be grouped into directories, with one resource per JSON file. pack.py packs the resources in a directory into a single msgpack-formatted file.

To pack all resources in a directory named CARRIER:

$ python pack.py CARRIER
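
As a rough illustration of the packing step, the sketch below reads one directory of JSON files and writes them out as a single msgpack list. It is not the code in pack.py: the function names, the flat `*.json` layout, and the choice to store all resources as one list are assumptions. msgpack is imported inside the functions that need it so that the JSON-loading helper can be used on its own.

```python
import json
from pathlib import Path


def load_resources(directory):
    """Read every *.json file directly inside `directory` into a list of dicts."""
    resources = []
    # sorted() keeps the order deterministic across runs
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            resources.append(json.load(handle))
    return resources


def pack_directory(directory, out_path):
    """Write all resources in `directory` to one msgpack file (assumed layout: one list)."""
    import msgpack  # third-party package; imported lazily on purpose

    resources = load_resources(directory)
    with open(out_path, "wb") as handle:
        msgpack.pack(resources, handle)
    return len(resources)


def unpack_file(path):
    """Load the packed list of resources back into memory."""
    import msgpack

    with open(path, "rb") as handle:
        # raw=False decodes msgpack strings to Python str
        return msgpack.unpack(handle, raw=False)
```

Under these assumptions, `pack_directory("CARRIER", "CARRIER.msgpack")` would produce a file that `unpack_file` can read back in one call; the output file name is hypothetical.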

search.py

Searches all packed resources for every instance of a given key and reports each path where the key was found, along with the unique values of the key (and of its companion key).

To find all instances of system in CARRIER resources and list the system/code pairs found:

$ python search.py system code CARRIER
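
The kind of search described above can be sketched as a recursive walk over each resource. This is an illustration rather than the logic of search.py: the function names are hypothetical, and it assumes that list indices are dropped from paths so that all elements of a list share one path.

```python
def find_key(node, key, companion=None, path=""):
    """Yield (path, value, companion_value) for every dict that contains `key`."""
    if isinstance(node, dict):
        if key in node:
            other = node.get(companion) if companion else None
            yield path, node[key], other
        for name, child in node.items():
            child_path = f"{path}.{name}" if path else name
            yield from find_key(child, key, companion, child_path)
    elif isinstance(node, list):
        # List indices are dropped, so every element shares the parent's path
        for child in node:
            yield from find_key(child, key, companion, path)


def search(resources, key, companion=None):
    """Return the set of paths to `key` and the set of unique (value, companion) pairs."""
    paths = set()
    pairs = set()
    for resource in resources:
        for path, value, other in find_key(resource, key, companion):
            paths.add(f"{path}.{key}" if path else key)
            # Only scalar values can be stored in a set
            if not isinstance(value, (dict, list)) and not isinstance(other, (dict, list)):
                pairs.add((value, other))
    return paths, pairs
```

Calling `search(resources, "system", "code")` on a list of unpacked resources would then return every dotted path ending in `system` together with the distinct system/code pairs, with `None` standing in for a missing companion.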

analyze.py

Performs some simple analysis of the structure and values within a pack of resources.

To see all required and optional properties of extension elements within item elements on INPATIENT resources:

$ python analyze.py -d INPATIENT -p item.extension
Required:
 - url (71)
 - valueQuantity (71)
Optional:

To see all url values encountered in extension elements of the base resource of INPATIENT resources:

$ python analyze.py -d INPATIENT -p extension -a url
Required:
 - https://bluebutton.cms.gov/resources/variables/clm_pass_thru_per_diem_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_cptl_dsprprtnt_shr_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_cptl_excptn_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_cptl_fsp_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_cptl_ime_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_cptl_outlier_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_pps_old_cptl_hld_hrmls_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/clm_tot_pps_cptl_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/fi_num (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_bene_blood_ddctbl_lblty_am (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_bene_ip_ddctbl_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_bene_pta_coinsrnc_lblty_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_drg_outlier_aprvd_pmt_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_ip_ncvrd_chrg_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_ip_tot_ddctn_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/nch_profnl_cmpnt_chrg_amt (5071)
 - https://bluebutton.cms.gov/resources/variables/prpayamt (5071)
Optional:
 - https://bluebutton.cms.gov/resources/variables/clm_mdcr_non_pmt_rsn_cd (78)
 - https://bluebutton.cms.gov/resources/variables/dsh_op_clm_val_amt (2869)
 - https://bluebutton.cms.gov/resources/variables/ime_op_clm_val_amt (2104)
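
One plausible reading of the Required/Optional split in the first example is that a property is required when it appears on every element found at the given path and optional otherwise. The sketch below implements that reading only; it is an assumption about the rule, not the code in analyze.py, and the function names are hypothetical.

```python
from collections import Counter


def elements_at(node, path):
    """Yield every dict reached by following the dotted `path`, flattening lists."""
    if isinstance(node, list):
        for child in node:
            yield from elements_at(child, path)
    elif isinstance(node, dict):
        if not path:
            yield node
        else:
            head, _, rest = path.partition(".")
            if head in node:
                yield from elements_at(node[head], rest)


def classify_properties(resources, path):
    """Split property names at `path` into (required, optional) count dicts."""
    counts = Counter()
    total = 0
    for resource in resources:
        for element in elements_at(resource, path):
            total += 1
            counts.update(element.keys())
    # Assumed rule: present on every element means required
    required = {name: n for name, n in counts.items() if n == total}
    optional = {name: n for name, n in counts.items() if n < total}
    return required, optional
```

With this rule, `classify_properties(resources, "item.extension")` returns two dictionaries mapping property names to occurrence counts, which correspond to the two groups printed in the first example.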

tree.py

Performs larger-scale analysis of the structure within multiple packs of resources. The result is a JSON tree structure with statistics for each property stratified by profile, along with unique (scalar) values of system, code, and url. See summary.json for the output. The script uses multiple cores via multiprocessing.

$ python tree.py CARRIER INPATIENT PDE > summary.json

The --count option is useful when debugging to limit how many resources are analyzed per profile type.
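
To make the idea of a per-profile property tree concrete, here is a minimal sketch of one way such a tree could be built and combined. The node layout (`count` and `children`), the function names, and the merge step are all assumptions, and the sketch leaves out the value collection and the multiprocessing itself. A merge like this is what would let each worker build a partial tree that is combined afterwards.

```python
from collections import Counter


def new_node():
    """A tree node: per-profile occurrence counts plus child properties."""
    return {"count": Counter(), "children": {}}


def add_resource(tree, resource, profile):
    """Record how often each property occurs, keyed by profile name."""
    def walk(node, value):
        if isinstance(value, list):
            # List elements contribute to the same node as their parent property
            for item in value:
                walk(node, item)
        elif isinstance(value, dict):
            for name, child in value.items():
                child_node = node["children"].setdefault(name, new_node())
                child_node["count"][profile] += 1
                walk(child_node, child)

    walk(tree, resource)


def merge(left, right):
    """Fold the partial tree `right` into `left` and return `left`."""
    left["count"].update(right["count"])  # Counter.update adds counts
    for name, child in right["children"].items():
        merge(left["children"].setdefault(name, new_node()), child)
    return left
```

Because `Counter` is a `dict` subclass, a tree built this way can be passed straight to `json.dumps`.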

extract_profile.py

Processes the output of tree.py to produce, for a given profile, the tree of properties along with their observed cardinalities. Observed values of system, code, and url from tree.py are included when there are between 1 and 25 such values, which captures common coding systems and extensions. See CARRIER_profile.txt, PDE_profile.txt and INPATIENT_profile.txt for the output.

$ cat summary.json | python extract_profile.py CARRIER > CARRIER_profile.txt
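
The two rules described above, observed cardinalities and the 1-to-25 value cutoff, can be sketched as follows. The input shape (`occurrences`, `values`, `children`), the `min..max` notation, and the function names are hypothetical and are not taken from summary.json or extract_profile.py.

```python
def cardinality(occurrences):
    """Format observed per-parent occurrence counts as 'min..max'."""
    return f"{min(occurrences)}..{max(occurrences)}"


def reportable_values(values, limit=25):
    """Return sorted unique values only when there are between 1 and `limit` of them."""
    unique = set(values)
    if 1 <= len(unique) <= limit:
        # key=str keeps sorting safe if value types are mixed
        return sorted(unique, key=str)
    return []


def render(name, node, depth=0):
    """Return indented text lines for `node` and its children."""
    indent = "  " * depth
    lines = [f"{indent}{name} [{cardinality(node['occurrences'])}]"]
    for value in reportable_values(node.get("values", [])):
        lines.append(f"{indent}  = {value}")
    for child_name, child in sorted(node.get("children", {}).items()):
        lines.extend(render(child_name, child, depth + 1))
    return lines
```

Joining the result of `render` with newlines gives a plain indented outline, one property per line, with any reportable values listed beneath the property they belong to.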

The resulting files were manually edited to remove uninteresting collections of values, and then run through an online Unicode tree generator for beautification. The results are in CARRIER_tree.txt, PDE_tree.txt and INPATIENT_tree.txt.

resource-utils's People

Contributors

abjonnes
