seerapi-client-java

A SEER*API client for Java applications. This library supports most of the APIs and makes them easy to incorporate into Java applications.

SEER

The Surveillance, Epidemiology and End Results (SEER) Program is a premier source for cancer statistics in the United States. The SEER Program collects information on incidence, prevalence and survival from specific geographic areas representing 28 percent of the US population and reports on all these data plus cancer mortality data for the entire country.

From the SEER*API website:

The SEER*API is a RESTful Web service that supports various SEER Program data sets and algorithms. This service is available to developers who wish to incorporate SEER resources into their own systems. These resources include databases and tools developed to enhance registry operations and quality improvement. The SEER*API is designed for integration into registry computer systems and is not designed to be accessed by end-users.

Download

The library requires Java 8 or greater.

Download the latest JAR or grab via Maven:

<dependency>
    <groupId>com.imsweb</groupId>
    <artifactId>seerapi-client-java</artifactId>
    <version>5.4</version>
</dependency>

or via Gradle:

implementation 'com.imsweb:seerapi-client-java:5.4'

Usage

SEER*API usage is free, but requires an API key. To get started:

  1. Create a free account from the Account Creation page
  2. Locate your API key on your Account page

Your API key must be supplied with every call. It can be passed on each call, or it can be stored in a configuration file named .seerapi in your home directory. The file should look like this:

apikey=your_api_key
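As a rough illustration of the file format above, the sketch below reads an `apikey` entry from a properties-style file using only the JDK. It assumes the file follows standard Java properties syntax. The class and method names (`SeerApiKeyFile`, `readApiKey`, `readApiKeyFromHome`) are hypothetical and are not part of seerapi-client-java; the library may load the file differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of seerapi-client-java.
final class SeerApiKeyFile {

    // Parses properties-style text such as "apikey=your_api_key" and returns the key.
    static String readApiKey(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = props.getProperty("apikey");
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("No 'apikey' entry found");
        }
        return key.trim();
    }

    // Reads the key from the .seerapi file in the user's home directory.
    static String readApiKeyFromHome() {
        Path file = Paths.get(System.getProperty("user.home"), ".seerapi");
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return readApiKey(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Parses an in-memory example instead of touching the real home directory.
        System.out.println(readApiKey(new StringReader("apikey=your_api_key")));
    }
}
```

A key read this way could then be passed to the builder's `apiKey(...)` method when the default file location is not suitable.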

To make calls to the API, first get an instance of SeerApi. If your local configuration file is set up, get a connection like this:

SeerApi api = new SeerApi.Builder().connect();

Alternatively, the key can be supplied when constructing the instance:

SeerApi api = new SeerApi.Builder().apiKey("your_api_key").connect();

Each set of APIs is grouped into its own service. Here are the services:

Glossary (rest/glossary)

A glossary of cancer-related terms.

api.glossary().getById("latest", "4ffd7623a3dd635c99d38e2d").execute().body()

Disease (rest/disease)

A searchable database of hematopoietic and lymphoid neoplasms and solid tumor diseases.

api.disease().samePrimaries("9870/3", "9872/3", "2010").execute().body()

Multiple Primaries (rest/mph)

The SEER implementation of the Multiple Primary and Histology Coding Rules. The implementation combines the Hematopoietic rules with the SEER Multiple Primary and Histology Coding Rules. The rules used in the calculation are based on the diagnosis year and histology.

MphInput input1 = new MphInput();
input1.setPrimarySite("C509");
input1.setHistologyIcdO3("8000");
input1.setBehaviorIcdO3("3");
input1.setDateOfDiagnosisYear("2016");
input1.setLaterality("1");

MphInput input2 = new MphInput();
input2.setPrimarySite("C501");
input2.setHistologyIcdO3("8000");
input2.setBehaviorIcdO3("3");
input2.setDateOfDiagnosisYear("2015");
input2.setLaterality("1");

MphResult result = api.mph(new MphInputPair(input1, input2)).execute().body();

NAACCR (rest/naaccr)

The NAACCR API provides programmatic access to documentation for the NAACCR Standards for Cancer Registries Volume II. It includes field level documentation as well as information about file layout.

api.naaccr().field("14", 12).execute().body()

NDC (rest/ndc)

A searchable mirror of the National Drug Code Directory from the FDA website. This database is updated weekly.

api.ndc().getByCode("0002-3227").execute().body()

RX (rest/rx)

A searchable database for coding oncology drug and regimen treatment categories in cancer registries.

api.rx().getById("latest", "53c44b01102c1290262dc8b2").execute().body()

SEER Incidence Site Recode (rest/recode)

The values of SEER site recode variables are based on the primary site and histology data fields submitted to SEER by the registries. The site recode variables define the major cancer site/histology groups that are commonly used in the reporting of cancer incidence data. For example, there is a section of the SEER Cancer Statistics Review for each major site corresponding to groupings in a site recode variable. The site recode variables are added to SEER databases as a convenience for researchers.

api.siteRecode().siteGroup("C619", "8000").execute().body()

Site-specific Surgery Codes (rest/surgery)

The site-specific surgery tables available on the SEER website.

api.surgery().tables().execute().body()

HCPCS

Healthcare Common Procedure Coding Systems (HCPCS) nomenclatures. The information is maintained through CanMED.

api.hcpcs().getProcedure("S0087").execute().body();

Staging

Cancer staging algorithms. Currently supports Collaborative Stage, a unified data collection system designed to provide a common data set to meet the needs of all three staging systems (TNM, SEER EOD, and SEER SS). It provides a comprehensive system to improve data quality by standardizing rules for timing, clinical and pathologic assessments, and compatibility across all of the systems for all cancer sites.

api.staging().schemaById("cs", "02.05.50", "brain").execute().body()

For a complete description of all available API inputs and outputs, see the SEER*API Documentation page.

seerapi-client-java's People

Contributors

ctmay4, depryf

seerapi-client-java's Issues

Investigate using lighter dependencies

The dependency list is very small. It needs only a JAX-RS 2.0 client and the Jackson libraries:

dependencies {
    compile 'org.glassfish.jersey.core:jersey-client:2.7'
    compile 'org.glassfish.jersey.media:jersey-media-json-jackson:2.7'

    testCompile 'junit:junit:4.11'
}

However, Jersey brings in a large number of transitive dependencies. I'm not sure if there is a more lightweight JAX-RS option.

I would prefer to keep this library lighter. I could just use HttpClient or URLConnection and roll my own. There are also a few other libraries I found that might work:

https://github.com/beders/Resty
http://unirest.io/java.html

Using HttpClient might be the safest long-term.
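To make the "roll my own" option concrete, here is a minimal sketch that uses only the JDK's HttpURLConnection. It is not how this library is implemented. The base URL comes from the example calls quoted in the issues on this page; the `X-SEERAPI-Key` header name is an assumption about how SEER*API expects the key and should be checked against the SEER*API documentation. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a dependency-free client; not part of seerapi-client-java.
final class PlainHttpSketch {

    static final String BASE_URL = "https://api.seer.cancer.gov/rest";

    // Prepares a GET request for the given path. Nothing is sent until fetch() is called.
    static HttpURLConnection prepareGet(String path, String apiKey) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(BASE_URL + path).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/json");
            // Assumed header name; verify against the SEER*API documentation.
            conn.setRequestProperty("X-SEERAPI-Key", apiKey);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sends the request and returns the response body as a UTF-8 string.
    // getInputStream() throws an IOException for 4xx/5xx responses.
    static String fetch(HttpURLConnection conn) {
        try (InputStream in = conn.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Only prepares the request and prints its URL; no network call is made here.
        HttpURLConnection conn = prepareGet("/naaccr/latest/item/1290", "your_api_key");
        System.out.println(conn.getRequestMethod() + " " + conn.getURL());
    }
}
```

The trade-off is that JSON mapping, error handling, and retries would all have to be written by hand, which is part of what the heavier client libraries provide.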

Why does the NAACCR field resource not return the "Section" of a NAACCR field?

The SEER API does not appear to return the "Section" that a NAACCR field belongs to. See this call:

https://api.seer.cancer.gov/rest/naaccr/latest/item/1290

This returns the following:

{
  "item": 1290,
  "name": "RX Summ--Surg Prim Site",
  "start_col": 2225,
  "end_col": 2226,
  "alignment": "RIGHT",
  "padding_char": "0",
  "documentation": "<table class=\"naaccr-summary-table naaccr-borders\">\n    <tr>\n        <th class=\"naaccr-summary-header naaccr-borders\">Item #</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Length</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Source of Standard</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Year Implemented</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Version Implemented</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Year Retired</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Version Retired</th>\n        <th class=\"naaccr-summary-header naaccr-borders\">Column #</th>\n    </tr>\n    <tr>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">1290</td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">2</td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">SEER/CoC</td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">1997</td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">5.1</td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\"></td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\"></td>\n        <td class=\"naaccr-summary-cell naaccr-borders naaccr-summary-centered\">2225 - 2226</td>\n    </tr>\n</table>\n\n<br/><strong>NAACCR XML</strong>: Tumor.rxSummSurgPrimSite<br/><br/>\n<strong>Alternate Names</strong>\n<br/>&nbsp;&nbsp;&nbsp;Surgery of Primary Site (SEER/CoC)\n<br/>&nbsp;&nbsp;&nbsp;Cancer-Directed Surgery (pre-96 CoC)\n\n\n<div class='content chap10-para-head'><strong><strong>Description</strong></strong></div>\n<div class='content chap10-para mark-changed'>Site-specific codes for the type of surgery to the primary site performed as part of the first course of treatment. 
This includes treatment given at all facilities as part of the first course of treatment.</div>\n<div class='content chap10-para mark-changed'><strong>Codes (in addition to the site-specific codes; Refer to the most recent version of\n    <em>STORE</em> and SEER Program Code manual for additional instructions.) </strong></div>\n<div class='content chap10-para'>\n    <table>\n        <tr class='code-row'>\n            <td class='code-nbr'>00</td>\n            <td class='code-desc'>None</td>\n        </tr>\n        <tr class='code-row'>\n            <td class='code-nbr'>10-19</td>\n            <td class='code-desc'>Site-specific code; tumor destruction</td>\n        </tr>\n        <tr class='code-row'>\n            <td class='code-nbr'>20-80</td>\n            <td class='code-desc'>Site-specific codes; resection</td>\n        </tr>\n        <tr class='code-row'>\n            <td class='code-nbr'>90</td>\n            <td class='code-desc'>Surgery, NOS</td>\n        </tr>\n        <tr class='code-row'>\n            <td class='code-nbr'>98</td>\n            <td class='code-desc'>Site specific codes; special</td>\n        </tr>\n        <tr class='code-row'>\n            <td class='code-nbr'>99</td>\n            <td class='code-desc'>Unknown</td>\n        </tr>\n    </table>\n</div>"
}

It seems that the NAACCR item resource omits "Section" information. However, the "Section" of NAACCR items is available for download from the NAACCR 'Data Standards and Data Dictionary – Query Builder' (NAACCR Query Builder) application at the following location:

http://applications.naaccr.org/querybuilder/default.aspx?Version=18

For example, NAACCR #1290 is listed in the 'Treatment-1st Course' section of the NAACCR Query Builder application.

Why does the SEER API not return the "Section" information? Sections are very useful for the organization of NAACCR variables.

Cancer Identification
Demographic
Edit Overrides/Conversion History/System Admin
Follow-up/Recurrence/Death
Hospital-Confidential
Hospital-Specific
Other-Confidential
Pathology
Patient-Confidential
Record ID
Special Use
Stage/Prognostic Factors
Text-Diagnosis
Text-Miscellaneous
Text-Treatment
Treatment-1st Course
Treatment-Subsequent & Other

I see that "Sections" are also not present within the NAACCR XML format. See here: https://github.com/imsweb/naaccr-xml.

Are NAACCR "Sections" obsolete and deprecated?

Support NAACCR version addition to staging version

In the near term, a new naaccr_version field will be added to the staging version. It will represent the NAACCR version associated with the specific staging algorithm version.

This is on hold until SEER*API is updated.

How to get subjects data?

I am interested in SEER*API and have studied its documentation carefully. I have tried to use SEER*API to extract data, but I can only get data about the data structure; I cannot get patient data as in SEER*Stat. Could you kindly send me some tips on using SEER*API to get data like SEER*Stat provides? I promise the data will be used only for research. Thank you for your kind consideration. I look forward to your reply.

Update to retrofit2

The new Retrofit version was a large overhaul. The code has been updated for its new package structure and classes. It is in use now, but the Retrofit artifact is still a beta. Once its final release is out, a new release of this library will follow.

Why does the schema resource only return sites and not histologies?

The SEER API appears to only return sites for a schema resource. See this call:

https://api.seer.cancer.gov/rest/staging/tnm/latest/schema/prostate

This returns the following:

{
  "id": "prostate",
  "algorithm": "tnm",
  "version": "1.9",
  "name": "Prostate",
  "title": "Prostate",
"notes": "C619 Prostate gland\n\nNote 1: This schema is based on the UICC chapter Prostate pages 243-248.\n\nNote 2: Transitional cell (urothelial) carcinoma of the prostatic urethra is to be coded to primary site C680, Urethra, and TNM assigned according to the Urethra schema.\n\nNote 3: The 7th Edition TNM stage groups are derived from T, N, M, PSA and Gleason Score. The following SSF's are used for PSA and Gleason Score. \n* SSF 1: Prostatic Specific Antigen (PSA) Lab Value\n* SSF 8: Gleason's Score on Needle Core Biopsy/Transurethral Resection of Prostate (TURP) (Clinical Stage Group)\n* SSF 10: Gleason's Score on Prostatectomy/Autopsy (Pathologic Stage Group)\n\nNote 4: In addition to directly assigning TNM, Summary Stage 2000 (SS2000) must be captured. If directly assigning SS2000, use the Prostate chapter on page 230 of the SS2000 on-line manual.",...

It seems that this schema resource defines the schema only in terms of sites, not histologies. However, the staging.seer.cancer.gov website seems to define schemas in terms of both sites and histologies. See here:
https://staging.seer.cancer.gov/tnm/schema/1.9/prostate/?breadcrumbs=(~schema_list~)

Why does the SEER API not return a list of histologies for a schema?
