m2g's Issues

Integrate labeling algorithms

This means the ability to generate our own atlases...our Desikan atlases currently come from JIST via long-deprecated code. Not sure this is high priority, but it's a small capability gap. Basically involves registering labeled brain(s) with T1 to MNI space. -Will

FACT (Susumu) integration

Assess - would be good to get an engineering estimate. If it's command-line runnable, much more attractive.

build graphs in a memory efficient way

decoupling from the other task because it's not strictly required for non-big graphs.

Currently we build from adjacency matrices, which is memory-inefficient, but the logic is more straightforward to debug.
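
A hedged sketch of the alternative (not the current ndmg code): accumulate edge weights in a dict keyed by ROI pairs and only build the graph object at the end, so memory scales with the number of edges rather than with a dense N x N matrix.

    # Sketch only: assumes fiber_label_pairs yields (roi_a, roi_b) hits per streamline.
    from collections import defaultdict
    import networkx as nx

    def build_graph_from_fibers(fiber_label_pairs, n_rois):
        weights = defaultdict(float)           # sparse accumulator, O(edges) memory
        for a, b in fiber_label_pairs:
            if a == b:
                continue                       # skip self-loops
            edge = (min(a, b), max(a, b))      # undirected: canonical ordering
            weights[edge] += 1.0

        g = nx.Graph()
        g.add_nodes_from(range(1, n_rois + 1))
        g.add_weighted_edges_from((a, b, w) for (a, b), w in weights.items())
        return g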

quality control

this issue documents the QC checks that cep is doing that we could implement after running ndmg on a dataset (see the sketch below):

  • nnz per graph
  • average edge weight (given non-zero weight) per graph

@gkiar @WillGray
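
A minimal sketch of the two checks above, assuming the graphs are weighted GraphML files readable by networkx (the path is illustrative):

    import glob
    import networkx as nx

    for path in glob.glob("outputs/graphs/*.graphml"):
        g = nx.read_graphml(path)
        weights = [float(d.get("weight", 1.0)) for _, _, d in g.edges(data=True)]
        nnz = len(weights)                                  # number of non-zero edges
        avg_w = sum(weights) / nnz if nnz else 0.0          # mean weight over non-zero edges
        print(path, "nnz:", nnz, "avg edge weight:", round(avg_w, 3))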

confirm space of data and atlases

We now read data with nibabel - confirm that headers and data are in a consistent space (or at least one that is not misleading to our algorithms).

We discovered an issue when looking at them in MIPAV.
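
A quick, hedged check along these lines with nibabel (filenames are illustrative):

    import numpy as np
    import nibabel as nib

    img = nib.load("sub-0001_dwi.nii.gz")
    atlas = nib.load("desikan.nii.gz")

    print("image axes:", nib.aff2axcodes(img.affine))     # e.g. ('R', 'A', 'S')
    print("atlas axes:", nib.aff2axcodes(atlas.affine))
    print("same shape:", img.shape[:3] == atlas.shape[:3])
    print("same affine:", np.allclose(img.affine, atlas.affine, atol=1e-3))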

baseline ndmg docs

m2g.io is pretty much obsolete - for at least v0.01, let's move stuff over to Markdown on the main docs site so that we have a nice landing point for people.

I need this for a meeting this week.

Keep a running list of TRT for KKI and SWU

Compare to historical runs. Nothing fancy - maybe just a table with inter/intra/total TRT for KKI, SWU (and NKI-TRT). Ideally we'd run on every pip push and see improvement!

Correctly align image in both image and real space

We encountered this with atlases, and now that everything happens in image space we need to ensure that the headers cooperate and make this so. I have a script to do this in the m2g repo, and will transfer/build it into the pipeline.
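
A minimal sketch of what that step could look like with nibabel (the real version is the script in the m2g repo; filenames are illustrative):

    import nibabel as nib

    img = nib.load("atlas.nii.gz")
    canonical = nib.as_closest_canonical(img)       # reorders axes and fixes the affine
    nib.save(canonical, "atlas_ras.nii.gz")
    print(nib.aff2axcodes(canonical.affine))        # expect ('R', 'A', 'S')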

fmri vs. dti

please point @shangsiwang to an experiment where we have run both DTI & fMRI on the same subjects, so he can compare which is more discriminable

@WillGray

graph format

finalize the spec for an attributed-edge + JSON graph format. Once completed, this will split into subsequent tasks: writing a reader/writer for this format in R and in Python.
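
Until the spec is finalized, a hedged sketch of one possible attributed-edge JSON serialization, using networkx's node-link format as a stand-in:

    import json
    import networkx as nx
    from networkx.readwrite import json_graph

    g = nx.Graph(name="sub-0001_desikan")           # illustrative graph
    g.add_edge(3, 17, weight=42.0, fibers=42)       # edges carry attributes

    payload = json_graph.node_link_data(g)          # {"nodes": [...], "links": [...], ...}
    with open("sub-0001_desikan.json", "w") as f:
        json.dump(payload, f, indent=2)

    # round trip; an R reader would parse the same structure
    g2 = json_graph.node_link_graph(payload)
    assert g2[3][17]["weight"] == 42.0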

np.where best practices

Confirm we are using np.where the PEP 8 way.

Upstream we set a mask equal to True/False, so np.where(mask == True) should be OK.

PEP 8 complains, but that may be the parser, or it may still be the right way to do it.
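
For reference, flake8/pycodestyle flags the == True comparison as E712; for a boolean mask the two forms below are equivalent:

    import numpy as np

    mask = np.array([True, False, True, False])

    idx_flagged = np.where(mask == True)    # works, but pycodestyle reports E712
    idx_clean = np.where(mask)              # equivalent for a boolean array, no warning
    assert (idx_flagged[0] == idx_clean[0]).all()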

make quick integration test

proposal is a small dataset that lets us check interfaces quickly when changing things locally and prior to pushing to pip.
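
A hedged sketch of what such a smoke test could look like under pytest; run_small_pipeline() is a placeholder, not the real ndmg entry point:

    import numpy as np

    def run_small_pipeline(dwi, labels):
        """Placeholder for whatever interface we settle on."""
        n = int(labels.max())
        return np.zeros((n, n))

    def test_pipeline_interfaces():
        dwi = np.zeros((8, 8, 8, 6))                           # tiny 4D volume
        labels = np.tile(np.arange(4), 128).reshape(8, 8, 8)   # labels 0-3, i.e. 3 ROIs
        adj = run_small_pipeline(dwi, labels)
        assert adj.shape == (3, 3)                             # one row/column per ROI
        assert np.isfinite(adj).all()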

dependencies

set lower bounds for required packages in setup script
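
A hedged sketch of the change in setup.py; the package names and version floors are illustrative, not the vetted minimums:

    from setuptools import setup

    setup(
        name="ndmg",
        version="0.0.1",
        install_requires=[
            "numpy>=1.8",      # lower bounds instead of unpinned names
            "nibabel>=2.0",
            "networkx>=1.9",
            "dipy>=0.10",
        ],
    )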

Documentation

In the process of rewriting the documentation, consider the following (previously compiled) list of edits:


General

  • Add a link to project/tool site for each tool mentioned on http://m2g.io/tutorials/available_data.html
  • For running reliability, http://m2g.io/tutorials/validation.html#computing-reliability-on-your-data is great - we want to be a little clearer about what each of the things in the {} is. Can we clarify, maybe using one of the nightly-build data sets?
  • On http://m2g.io/tutorials/sample_data.html, put the two-line code snippets showing how to compute the Frobenius norm and edge counts (see the sketch after this list)
  • Clarify that the runtime quoted for graphs on the validation page is for small graphs. Perhaps give at least limited (approximate) timing for big graphs (run time, not wall time)
  • We ran a third dataset, right? The MRN data? We should say this on the validation page, even though there was no TRT (just explain)
  • Add data derivatives to nightly run for each data set, not just KKI
  • Rename data on data public to m2g_v1_1_2 consistent with pull request
  • Tag release of 1_1_2.
  • Add an adjacency matrix (PNG) of at least one of the sample results, and give the single line of code used to generate it (for running reliability, http://m2g.io/tutorials/validation.html#computing-reliability-on-your-data)
  • Explain why we ran multiple atlases on validation page and what that means (2-3 sentences).
  • Add lessons learned from running pipeline under FAQ page - places where people might get stuck (e.g., running big graphs, setting memory allocations)
  • How much RAM/cores are needed for a single subject? Either put in validation or setup. Maybe one number for small directions and one for large.
  • Capture all covariate information in one place for now until we have a lims option
  • Add additional reliability info somewhere (links or in page) for datasets in addition to MNR score. We produce other really interesting/helpful debug information
  • Document what information is produced by reliability function - sphinx-ing it might be the way to go (couldn't find it if it exists)
  • Put LONI 6.0.1 versions for all platforms somewhere private
  • Make it slightly easier with instructions to mount a drive on EC2. FAQs?
  • Explain how to open screenshare on a mac. FAQs?
  • Add recommended EC2 instance size (name) - let's provide a simple recommendation and also a note on cost ($0.XX/hour).
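
The sketch referenced above: a hedged version of the "two line" Frobenius-norm and edge-count snippets (filename illustrative):

    import numpy as np
    import networkx as nx

    g = nx.read_graphml("sub-0001_desikan.graphml")
    A = nx.to_numpy_array(g, weight="weight")

    frob = np.linalg.norm(A, "fro")          # Frobenius norm of the adjacency matrix
    n_edges = int(np.count_nonzero(A) / 2)   # binary edge count (undirected)
    print(frob, n_edges)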

High Level

  • Sample data should be used consistently for all examples
  • Parallel explanations and formats across headings and pages
  • Code block and ipython notebook to reproduce reliability numbers
  • Line up all m2g architecture and figures with new python classes: data wrangling, registration, diffusion (processing), graph generation, (utilities)

Cleanup

  • ITK-SNAP should be styled that way (vs. itk-snap or similar)
  • On overview page, add explicitly what the m2g pipeline produces in terms of output (high level graphs (adjacency matrices), showing ...
  • Move data formats and access to Data formats page, and link to that in public data
  • Provide link to each of the software tools we recommend. Does MIPAV do the same things as mricron?
  • Link to spec for fiber dat file
  • Link to something explaining graphml
  • Explain the details (each file/folder) of the sample data
  • Preprocessing -> wrangling throughout
  • webservice -> webservices
  • Need parallel structure in data formats page (explain outputs in more detail, for example)
  • Commit demo workflow to repo rather than making a separate download, and tell people where this is

NeuroData

  • Rename to developer guide
  • Add information indicating: where master setup script is (i.e., embedded in rst file), quarterly release schedule, how people can contribute

Validation

  • Add picture for reliability
  • Add and explain outputs from reliability for each validation run
  • Link validation to v1.1.2 release
  • SWU4 - analysis of removed subjects
  • validation -> point explicitly to small graphs that are produced

Sample Data

  • Provide ipython notebook to reproduce table.
  • This page should be called system test - break out sample data
  • Add expected same-scan/same-subject difference between example outputs and a test run (for deterministic runs, = 0)
  • Add binary edge counts
  • Add picture of sample adjacency matrix - point people to graph explorer

Install Instructions

  • rename page to install instructions for m2g
  • add AMI link explicitly and prominently ("how do I use it" should be more prominent than setup)
  • revise and rerelease AMI
  • grab camino release - we should be pulling a particular release, not master. Make sure we grab the one that was used for the validation
  • igraph - get specific release and get rid of gist stuff (pipe single command to file or similar)
  • m2g - get tagged version
  • separate loni stuff into separate block
  • we should grab setup data vs. all of kki
  • echo "export m2g=/mrimages" <-- need trailing quote
  • take out words like install, configure, update, setup, except the h1 heading at top

Basic Usage

  • Rename to Pipeline Overview
  • m2g Diffusion processing pipeline figure (crop so that spacing is good)
  • fix parallelism in each paragraph - choose either pseudocode or preferably description of what's going on
  • Add a single line explaining docs.scripts.m2g() namespace
  • Explain exported outputs so that pin count lines up (one line explanation is all we need)
  • Note that this is a LONI implementation of our pipeline

Advanced Usage

  • m2g_home is already set in data path block and doesn't need to be reset
  • rename as bash implementation/walkthrough
  • remove timing and ram estimates
  • explain that setup is required, except for LONI block
  • When you reference the CLI, briefly explain what that is (same as the GUI, but runs the workflow from the command line)
  • Running test workflow on the sample data (link to system test)

Deliverables (VERY HIGH LEVEL)

This issue captures where we are going, not specific actions for any person or any week

  • 2000+ Brains with at least one graph
  • 7000 x 24 atlas graphs
  • Graphs available online
  • Graphs in grutedb
  • ndmg as a webservice
  • input data (aligned dti) in neurodata
  • derivatives in neurodata
  • mega-analysis

clarify differences between migraine & m2g

for sephira data, we found signal with migraine, but not m2g.
insofar as we want to figure out why,
i'd like to understand the differences.
can you document it somewhere public, and send me a link?
of note, this does not take priority over the weekly priority :)

@WillGray

Fiber bounds checking

We round the fibers to index into labels. This has the potential to cause errors when fibers lie on label-volume boundaries. Patched for now, but really this should never occur.

Consider masking fibers or otherwise attenuating tracks that are clearly spurious (see the sketch below).
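
A hedged sketch of the defensive fix: clip the rounded fiber coordinates to the label volume's bounds before indexing, so edge-of-volume points cannot index out of range.

    import numpy as np

    def labels_along_fiber(fiber_xyz, label_vol):
        """fiber_xyz: (N, 3) float voxel coordinates; label_vol: 3D label array."""
        idx = np.rint(fiber_xyz).astype(int)
        idx = np.clip(idx, 0, np.array(label_vol.shape) - 1)   # bounds check
        return label_vol[idx[:, 0], idx[:, 1], idx[:, 2]]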
