qiicr / dcmqi

dcmqi (DICOM for Quantitative Imaging) is a free, open source C++ library for conversion between imaging research formats and the standard DICOM representation for image analysis results

Home Page: https://qiicr.gitbook.io/dcmqi-guide/

License: BSD 3-Clause "New" or "Revised" License

quantitative-imaging dicom imaging-informatics converters cancer-imaging-research medical-image-computing nci-qin tcia-dac nci-itcr 3d-slicer-extension

dcmqi's People

Contributors

che85, dclunie, fedorov, gitter-badger, ilyafinkelshteyn, jcfr, kislinsk, lassoan, michaelonken, michaelschwier, msmolens, nolden, pieper, pwighton, thewtex, vkt1414


dcmqi's Issues

Default parameters for itkimage2segimage

In order to be able to use as few parameters as possible, we need to define some defaults for the following parameters:

  • skipEmptySlices
  • compress
  • cropSegmentsBBox (which is WIP)

Communication of files + label IDs should be fixed

When ITK labels are converted to DICOM, there may be multiple labels per single ITK label file. Therefore, we need a combination of file (or itkImageData) and label ID to do the conversion properly.

Currently, this is not reflected in the SEG schema, and probably not in the API.

An open question is how the file list should be communicated in the JSON. Options:

  1. Pass the file list outside of the JSON and have groups of segment groups, where each group maps to the position of the input label file, and within each group labelID maps to the individual label value within that file. This maps more cleanly onto the API use case, since segment groups can map in the same way to the position of the itkImageData in the vector passed to the API.
  2. Include the file name directly in the segment group. To be used with the API, without file IO, the JSON will need to be modified before it is passed downstream.
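
A minimal sketch of how option 1 could look, written in Python for brevity. The `segmentAttributes` key, the `SegmentDescription` field, and the file names are assumptions made for illustration, not the actual schema; only `labelID` comes from the description above. The position of each group in the outer list selects the input, and `labelID` selects the label value within it.

```python
import json

# Hypothetical metadata layout for option 1: the position of each group in the
# outer list matches the position of the input label file (or itkImageData).
metadata = json.loads("""
{
  "segmentAttributes": [
    [{"labelID": 1, "SegmentDescription": "liver"},
     {"labelID": 2, "SegmentDescription": "lesion"}],
    [{"labelID": 1, "SegmentDescription": "spleen"}]
  ]
}
""")


def iter_segments(metadata, inputs):
    """Yield (input, labelID, attributes) for every segment in every group."""
    groups = metadata["segmentAttributes"]
    if len(groups) != len(inputs):
        raise ValueError("expected one segment group per input label image")
    for source, group in zip(inputs, groups):
        for attributes in group:
            yield source, attributes["labelID"], attributes


# File names are used here; with the API these would be in-memory images.
files = ["liver_and_lesion.nrrd", "spleen.nrrd"]
pairs = [(source, label) for source, label, _ in iter_segments(metadata, files)]
```

Because the JSON never names a file, the same metadata works unchanged whether the inputs are paths or in-memory images, which is the advantage claimed for option 1.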

Assignment of the source series UIDs to the segmentation frames

Currently, communication of the source series DICOM files serves multiple purposes:

  • initialize patient/study/FoR for the segmentation to be created
  • list all of the instances passed as input in the ReferencedInstanceSequence
  • UIDs of the instances that match the spatial positions of the slices corresponding to the segmentation slices are listed in the PerFrameFunctionalGroupsSequence > DerivationImageSequence

It will be more flexible (but also more complicated) if these different purposes are separated.

A related use case is segmentation of the time series data, where multiple source series slices will map to the same segmentation series frame.

Reorganize dcmqi/gh-pages

We should have a top-level page for dcmqi with seg application being a sub-page.

@che85 - no action for now, let's prioritize when we meet

Use JSON for communicating all of the parameters

It should be possible to use the converters when the source series is not available, by passing patient/study information using JSON. The schema and converters should support this.

Communication of source series files should also be supported by the schema. There should be no need to pass any parameters, other than the JSON file name, to the converter, outside the schema.

SuperBuild implementation laundry list

Things to keep in mind:

  • make sure external ITK/DCMTK/zlib etc are tested, so that it is possible to compile against Slicer
  • should use the same versions of dependencies as in Slicer while doing superbuild to test the same configuration

dcmtk error checking macros

At some point, we need to review the code and make sure all dcmtk calls that return OFCondition are wrapped in condition-checking macros.

Add proper testing

Specific sub-tasks:

Segmentations

  • add test DICOM data and baselines (@fedorov)
  • seg2itk conversion test: confirm the images are equivalent
  • itk2seg conversion test:
    • run dciodvfy on the output
    • confirm round-trip test is working

Parametric Maps

  • pmap2itk conversion test: confirm the images are equivalent
  • itk2pmap conversion test:
    • run dciodvfy on the output
    • confirm round-trip test is working

Structured Report

  • tid1500reader test
  • tid1500writer test
    • run dciodvfy on the output
    • confirm round-trip test is working

Build issue when git is wrongly configured due to line endings

I made a mistake when building dcmqi that may be worth adding to the documentation.

If you experience the following issues when compiling dcmqi:
dcmqi/dcmqi-build/apps/seg/itkimage2segimageCLP.h:214:1: error: stray ‘\’ in program
{
^
dcmqi/dcmqi-build/apps/seg/itkimage2segimageCLP.h:214:3: warning: missing terminating " character
{
^
dcmqi/dcmqi-build/apps/seg/itkimage2segimageCLP.h:214:1: error: missing terminating " character
{
^

This is caused by the Slicer Execution Model generating a header file from an XML file that has Windows line endings on Linux.
(In my case, this was due to autocrlf = true instead of autocrlf = input in the git configuration on Linux.)
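
One way to spot the problem before building is to scan the checkout for XML files with Windows line endings. The following is only a sketch; the function name and the `*.xml` pattern are assumptions, not part of dcmqi.

```python
from pathlib import Path


def files_with_crlf(root, pattern="*.xml"):
    """Return paths under root that match pattern and contain CRLF line endings."""
    return sorted(
        path for path in Path(root).rglob(pattern)
        if b"\r\n" in path.read_bytes()
    )
```

Any path returned by `files_with_crlf("dcmqi")` points at a file that was checked out with converted line endings and is a candidate cause of the errors above.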

Library API revision

API calls should use in-memory data structures for in/out communication: itkImage and DcmDataset in place of file names

  • Change parameters from taking filenames to itkImage and DcmDataset

experiment with alternative bulk file resources for testing

As discussed at the 2016 Summer Project Week [1], we are looking for ways to test on large collections of DICOM test data. Previously we have used Midas, and we could also consider Girder or S3. As an experiment, we are looking at IPFS, as mentioned in this comment [2].

Ideal properties for testing data resource:

  • reliable and efficient access to the exact bitstream of the test data (confirmed by checksum hash)
  • no large external dependencies (ideally just a http download or at most a readily available helper executable)
  • scalable to multi-gigabyte or larger test sets covering the range of DICOM data seen in the real world (to allow exhaustive tests, random tests, etc)
  • robust in the case of server outages
  • peer-to-peer for faster testing when data is already downloaded to other sharing hosts
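
The first property (exact bitstream confirmed by checksum hash) is independent of where the data is hosted. A minimal sketch of such a check, assuming SHA-256 as the hash and hypothetical function names:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte test data never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file content matches the expected SHA-256 digest."""
    return sha256_of(path) == expected_hex.lower()
```

A test harness could keep the expected digests under version control next to the code and call `verify` after each download, regardless of which hosting platform served the file.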

[1] http://www.na-mic.org/Wiki/index.php/2016_Summer_Project_Week

[2] #11 (comment)

Compare two json files for round-trip testing

To improve testing, it makes sense to add round-trip tests for the metadata, similar to the round-trip tests we already have for image data.

To help with this task, we could add a tool that takes two JSON files and tests whether they are identical.
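
A rough sketch of such a tool in Python. It compares parsed content rather than text, so key order and whitespace do not matter, and it reports where the two documents differ. The function names and the message format are assumptions.

```python
import json


def json_diff(a, b, path="$"):
    """Return a list of differences between two parsed JSON values."""
    if type(a) is not type(b):
        return [f"{path}: type {type(a).__name__} != {type(b).__name__}"]
    if isinstance(a, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            if key not in a:
                diffs.append(f"{path}.{key}: missing in first")
            elif key not in b:
                diffs.append(f"{path}.{key}: missing in second")
            else:
                diffs.extend(json_diff(a[key], b[key], f"{path}.{key}"))
        return diffs
    if isinstance(a, list):
        if len(a) != len(b):
            return [f"{path}: length {len(a)} != {len(b)}"]
        diffs = []
        for index, (x, y) in enumerate(zip(a, b)):
            diffs.extend(json_diff(x, y, f"{path}[{index}]"))
        return diffs
    return [] if a == b else [f"{path}: {a!r} != {b!r}"]


def json_files_identical(first, second):
    """Return True if two JSON files have the same parsed content."""
    with open(first) as f1, open(second) as f2:
        return not json_diff(json.load(f1), json.load(f2))
```

Note that this sketch is strict about types: `1` and `1.0` are reported as different. A round-trip test may want to relax that, or compare floating-point values with a tolerance.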

Strategies for populating ProcedureCode in TID1500

from a discussion with @dclunie

On Fri, Jul 1, 2016 at 7:19 AM, David Clunie wrote:

Hi Andrey

I agree that there is not a lot of value in putting a lot
of effort into populating a pre-coordinated code when one
is not available in the source images, and certainly no
point in bothering the user if the application does not
already know the modality and body part.

In the degenerate case, if the source was not DICOM and
not even the modality is known, one can make the argument
that it is sufficient to send a very generic code like
(P0-0099A, SRT, "Imaging procedure").

If I were building this myself, I would probably use
logic something like this:

  • if Procedure Code Sequence present and consistent and
    valid in source images, use it,
  • else if Requested Procedure Code Sequence present inside
    single item of Request Attributes Sequence and valid, use it,
  • else if DICOM images and modality present and consistent
    in source images, and Body Part Examined or Anatomic Region
    Sequence present and consistent in source images, use them
    to look up code in pre-configured list of procedure codes
    by modality and body part (which can be built from various
    sources automatically), else
  • if DICOM modality but not body part known try using the
    Finding Site being used for TID 1500 as the body part, else
  • if only DICOM modality known look up a modality-specific
    procedure code, else
  • use (P0-0099A, SRT, "Imaging procedure")

I think that something like this would be a useful part of
dcmtk, for example, but I don't know if QIICR really needs
it right now.

Obviously a robust body part lookup would probably require
an "ontology" (e.g., to know that the hippocampus was part
of the brain or head, etc.), if the procedure codes were
coarse relative to the supplied body parts, but that too
is probably beyond the scope.

So, to get back to reality, do you want to use a short
list of modality-specific procedure codes ignoring the
body part, or do you just want to always use (P0-0099A, SRT,
"Imaging procedure") or equivalent if the caller does not
supply something more specific, or a code is not available
from the images?

Note that there is a 1:1 correspondence between DICOM
Modality (0008,0060) code string values, and a corresponding
DCM code, e.g., Modality = "MR" maps to (MR, DCM, "Magnetic
Resonance"):

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_D.html#DCM_MR

so apart from having to look to the code meaning, you could
use DCM codes for the modality-specific procedures (rather
than say (P5-09000, SRT, "Magnetic resonance imaging") or
(LP6406-5, LN, "MRI") or (C0024485, UMLS, "Magnetic resonance
imaging") or whatever).

This use of meaningful codes for modality in the DCM scheme
is very bad coding practice according to Cimino's desiderata,
but since it has been done there is no reason not to take
advantage of it (although one day we should probably retire
this from DICOM and use SNOMED codes instead).

David

PS. For my work on radiation dose report construction from
dose screens, as well as earlier work for other projects,
I also put some effort into hunting down body part in other
attributes like series description, but that too is probably
beyond the scope of our needs, but see com/pixelmed/anatproc
if you are interested.

PPS. (P0-0099A, SRT, "Imaging procedure") is in the DICOM0
SNOMED subset, so it is free to use in the context of a
DICOM implementation, but if you want a non-SNOMED equivalent
you can look up C0011923 in UMLS:

https://uts.nlm.nih.gov/metathesaurus.html?cui=C0011923

On 6/30/16 5:47 PM, Andrey Fedorov wrote:
...

Ok. Let me rephrase the practical question I have - let's say
Procedure Code Sequence is not populated in the corresponding images
(which I am sure will be the most common situation). How do you
envision the workflow for a generic TID1500 reporting process? Would
you like the application to automatically come up with a code, given
the knowledge of the image modality, and prompting the user to choose
from the constrained list of anatomical structures? Or let the user
choose from the list of procedure options?

My personal opinion is that it is somewhat wasteful - we probably
cannot come up with an exhaustive list of codes for various
permutations of modalities/body parts/etc, so it almost seems like a
unique code would need to be generated on the fly by the application,
or the application would need to maintain a growing list of codes
based on the situations encountered in the field.

Again, perhaps this is another question to discuss as we talk about
QIICR progress at the next meeting.
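
The fallback order David describes could be sketched roughly as follows. This is an illustration only: the `source` dictionary, its keys, the lookup tables, and the `99EXAMPLE` entry are all hypothetical, and the sketch assumes the caller has already checked that values taken from the source images are consistent and valid. The generic code and the MR modality code are the ones quoted in the email.

```python
# Generic fallback code quoted in the email above.
GENERIC_PROCEDURE = ("P0-0099A", "SRT", "Imaging procedure")


def choose_procedure_code(source, by_modality_and_body_part, by_modality,
                          finding_site=None):
    """Pick a (value, scheme, meaning) procedure code using the fallback order.

    `source` is a hypothetical dict of values already extracted from the
    source images and already checked for consistency and validity.
    """
    # 1. Procedure Code Sequence from the source images.
    if source.get("procedure_code"):
        return source["procedure_code"]
    # 2. Requested Procedure Code Sequence from the Request Attributes Sequence.
    if source.get("requested_procedure_code"):
        return source["requested_procedure_code"]
    modality = source.get("modality")
    if modality:
        # 3 and 4. Body part from the images, else the TID 1500 Finding Site.
        body_part = source.get("body_part") or finding_site
        code = by_modality_and_body_part.get((modality, body_part))
        if code:
            return code
        # 5. Modality-specific code.
        if modality in by_modality:
            return by_modality[modality]
    # 6. Nothing usable is known: fall back to the generic code.
    return GENERIC_PROCEDURE


# Illustrative tables; a real list would be pre-configured from other sources.
by_modality = {"MR": ("MR", "DCM", "Magnetic Resonance")}
by_modality_and_body_part = {
    ("MR", "BRAIN"): ("EX-0001", "99EXAMPLE", "Hypothetical MR brain procedure"),
}
```

Keeping the two lookup tables as plain inputs leaves open the practical question raised in the thread, namely whether a short modality-only list is enough or whether a list keyed by body part is worth maintaining.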

Add an option to pass patient/study/equipment etc via JSON

The idea is the following:

  • give user an option to either specify source DICOM files to carry forward composite context, or specify individual modules
  • when possible, assign default values to allow for conversion even if source data is not available
  • to manage complexity, include only attributes listed as Type 1 or 2, skip Type 3 or conditionals

A known deficiency (one of many, I am sure): this "parameterization" is lossy with respect to the standard, as it does not communicate modules completely and does not preserve the completeness of their definition (Type). The motivation is to reduce complexity and handle only the attributes that are of interest for QI applications.

Related to #44

Add a linear measurement converter

Example from David:

% dcsrdump FromEmel_20150506/jjvector4276015677.84977662.6.74.6217627.81381.667482524.159327.4255.136_extract.dcm 
: CONTAINER: (126000,DCM,"Imaging Measurement Report")  [SEPARATE] (DCMR,1500)
        >HAS CONCEPT MOD: CODE: (121049,DCM,"Language of Content Item and Descendants")  = (eng,RFC3066,"English")
                >>HAS CONCEPT MOD: CODE: (121046,DCM,"Country of Language")  = (US,ISO3166_1,"United States")
        >HAS OBS CONTEXT: PNAME: (121008,DCM,"Person Observer Name")  = "admin"
        >HAS OBS CONTEXT: TEXT: (RP-100006,99RPH,"Person Observer's Login Name")  = "admin"
        >HAS CONCEPT MOD: CODE: (121058,DCM,"Procedure reported")  = (24587-8,LN,"brain mri wo+w contr iv")
        >CONTAINS: CONTAINER: (111028,DCM,"Image Library")  [SEPARATE]
                >>CONTAINS: CONTAINER: (126200,DCM,"Image Library Group")  [SEPARATE]
                        >>>CONTAINS: IMAGE:  = (1.2.840.10008.5.1.4.1.1.4,1.3.12.2.1107.5.2.30.25226.3.2007080608524411252220157)
        >CONTAINS: CONTAINER: (126010,DCM,"Imaging Measurements")  [SEPARATE]
                >>CONTAINS: CONTAINER: (125007,DCM,"Measurement Group")  [SEPARATE]
                        >>>HAS OBS CONTEXT: TEXT: (112039,DCM,"Tracking Identifier")  = "Lesion7~sp1~-~sp1~-1~sp1~#FFFFFF"
                        >>>HAS OBS CONTEXT: UIDREF: (112040,DCM,"Tracking Unique Identifier")  = "1266843.1.76.1940.865.49429.781249.3097277773.935694841.27403146"
                        >>>CONTAINS: CODE: (121071,DCM,"Finding")  = (jjv-5,epad-plugin,"epad-plugin")
                        >>>CONTAINS: NUM: (G-D7FE,SRT,"Length")  = 0.0 (mm,UCUM,"millimeter")
                                >>>>INFERRED FROM: SCOORD:  = POLYLINE {72.1983489990234,186.446273803711,88.0661163330078,176.92561340332,76.1652908325195,206.280990600586,62.6776847839355,193.586776733398}
                                        >>>>>SELECTED FROM: IMAGE:  = (1.2.840.10008.5.1.4.1.1.4,1.3.12.2.1107.5.2.30.25226.3.2007080608524411252220157) 

Support encoding of segment IDs

Currently, we don't support this recent modification of the standard, I believe: ftp://medical.nema.org/medical/dicom/final/cp1496_ft_segmenttrackingidanduid.pdf

SEG converter ignores JSON attributes

  • SegmentedPropertyType appears to be ignored - the output item in DICOM is saved as "Tissue" for the liver test
  • AnatomicRegionCode appears to be ignored by writer as well, and also ignored by the reader

JSON editor and validator for metadata generator

It might be helpful to have the component showing the output of the metadata JSON generator editable, and also add a button to do very basic JSON validation (not schema-based, for now, but at least check correctness of JSON).

@che85 what do you think? Would it be useful?
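
For reference, the "very basic" validation described above amounts to a parse attempt. A sketch in Python follows; the actual component may well be written in another language, and the function name is an assumption.

```python
import json


def check_json(text):
    """Return (True, None) for well-formed JSON, else (False, error message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return False, f"line {error.lineno}, column {error.colno}: {error.msg}"
    return True, None
```

Reporting the line and column lets the editor highlight the offending position instead of only showing a pass/fail result.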

Selection of platform for hosting test data

I keep returning to this issue over and over, and can't make up my mind! I don't think there is a perfect solution, but for now I thought I would at least document my thinking before choosing something.

Requirements

  • ideally, data and code are versioned together
  • should be able to download individual files directly
  • bandwidth should be reasonable
  • ideally, versioning should be supported
  • ideally, user interface (not command line) should be supported for interacting with the data (e.g., data contributors should be able to manage data access)

Platforms considered

git-lfs

Pros:

  • code and data stay together on github
  • versioning

Cons:

  • bandwidth is free only up to some limit
  • the only way to download git-lfs managed files is by installing git-lfs, which may be too burdensome for some users

ipfs, dat

Do not solve the data hosting problem

Dropbox

Pros:

  • bandwidth is great
  • user interface for non-developers is great
  • direct download link is supported

Cons:

  • Dropbox is restricted by some corporate firewalls
  • data permanence cannot be guaranteed (subscription, user account inactivity closures)

Google Drive

Pros:

  • bandwidth is great
  • user interface for non-developers is ok (not great, since not possible to download folder as zip file)

Cons:

  • direct download link is not supported
  • may be restricted by some corporate firewalls
  • data permanence cannot be guaranteed (subscription, user account inactivity closures)

Midas

http://slicer.kitware.com/midas3 instance

Pros:

  • Free (although, no idea about longevity of the platform)
  • direct download links supported
  • used by other projects in the 3D Slicer ecosystem

Cons:

Girder

https://data.kitware.com/ instance

Pros:

  • API exists and should be supported (I have not tried myself)

Cons:

  • the instance referenced above does not allow me to create collections (stopper)

Package dependencies for all platforms

This would help with reducing time to test pull requests; currently, only appveyor dependencies are packaged (because appveyor's time constraints are most limiting).

ctest fails on appveyor

There is some ctest issue that prevents tests from running on AppVeyor. The same command seems to work fine on CircleCI and Travis ...
