This project forked from thunder-project/thunder-extraction

License: MIT License

thunder-extraction

algorithms for feature extraction from spatio-temporal data

Source or feature extraction is the process of identifying spatial features of interest from data that varies over space and time. It can be either unsupervised or supervised, and is common in biological data analysis problems, like identifying neurons in calcium imaging data.

This package contains a collection of approaches for solving this problem. It defines a set of algorithms in the scikit-learn style, each of which can be fit to data and returns a model that can be used to transform new data. Compatible with Python 2.7+ and 3.4+. Works well alongside thunder and supports parallelization via Spark, but can be used as a standalone package on local numpy arrays.

installation

pip install thunder-extraction

example

# generate data
from extraction.utils import make_gaussian
data = make_gaussian()

# fit a model
from extraction import NMF
model = NMF().fit(data)

# extract sources by transforming data
sources = model.transform(data)

usage

Analysis starts by importing and constructing an algorithm

from extraction import NMF
algorithm = NMF(k=10)

Algorithms can be fit to data in the form of a thunder images object or a t,x,y(,z) numpy array

model = algorithm.fit(data)

The model is a collection of identified features that can be used to extract temporal signals from new data

signals = model.transform(data)

api

algorithms

All algorithms have the following methods

algorithm.fit(data, opts)

Fits the algorithm to the data, which should be a collection of time-varying images. It can either be a thunder images object, or a numpy array with shape t,x,y(,z).

For many algorithms, fit will take the optional arguments chunk_size and padding, which allow the algorithm to be run on smaller chunks of the data, either in serial (if running locally) or in parallel (if running on a cluster).

A chunk is defined as a subset of the image in space, including all time points. The chunk_size is the size of each chunk in pixels, and padding is the amount by which to pad the chunks in each dimension. For example, given a (100,100,500) data set, we could set chunk_size=(50,50), resulting in four chunks, each of which is (50,50,500).
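The way a spatial chunk_size and padding partition an image can be sketched in a few lines. This is only an illustration of the idea: chunk_slices is a hypothetical helper, not a function from this package, and clipping the padding at the image border is an assumption.

```python
from itertools import product


def chunk_slices(shape, chunk_size, padding=0):
    """Return one tuple of slices per chunk covering a spatial `shape`.

    Hypothetical helper for illustration only. `padding` widens each chunk
    on both sides and is clipped at the image border (an assumption).
    """
    if isinstance(padding, int):
        padding = (padding,) * len(shape)
    per_axis = []
    for dim, size, pad in zip(shape, chunk_size, padding):
        axis = []
        for start in range(0, dim, size):
            stop = min(start + size, dim)
            axis.append(slice(max(start - pad, 0), min(stop + pad, dim)))
        per_axis.append(axis)
    # One chunk for every combination of per-axis slices
    return list(product(*per_axis))


chunks = chunk_slices((100, 100), chunk_size=(50, 50))
print(len(chunks))  # 4
print(chunks[0])    # (slice(0, 50, None), slice(0, 50, None))
```

Each tuple of slices selects the spatial extent of one chunk; all time points would be kept for every chunk.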

model

The result of fitting an algorithm is a model. Every model has the following properties and methods.

model.regions

The spatial regions identified during fitting.

model.transform(data)

Transform a new data set using the model, by averaging pixels within each of the regions. As with fitting, data can either be a thunder images object, or a numpy array with shape t,x,y(,z). It will return a thunder series object, which can be converted to a numpy array by calling toarray().
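To make "averaging pixels within each of the regions" concrete, here is a minimal numpy sketch. It is not the package's implementation: average_within_regions is a hypothetical name, and representing a region as a list of (x, y) coordinates is an assumption made for the example.

```python
import numpy as np


def average_within_regions(data, regions):
    """Average the pixels of each region at every time point.

    data    : array with shape (t, x, y)
    regions : list of regions, each a list of (x, y) pixel coordinates
              (an assumed representation, for illustration only)
    returns : array with shape (n_regions, t)
    """
    data = np.asarray(data, dtype=float)
    signals = []
    for coords in regions:
        coords = np.asarray(coords)
        # Select the region's pixels in every frame: shape (t, n_pixels)
        pixels = data[:, coords[:, 0], coords[:, 1]]
        signals.append(pixels.mean(axis=1))
    return np.array(signals)


# Toy data: 3 frames of a 4x4 image whose pixel values equal the frame index
data = np.stack([np.full((4, 4), t, dtype=float) for t in range(3)])
regions = [[(0, 0), (0, 1)], [(2, 2), (2, 3), (3, 3)]]
signals = average_within_regions(data, regions)
print(signals.shape)  # (2, 3)
```

The result has one row per region and one column per time point.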

model.merge(overlap=0.5, max_iter=2, k_nearest=10)

Merge overlapping regions in the model by greedily comparing nearby regions and merging those whose similarity exceeds the specified overlap. The greedy merging process is repeated max_iter times. Only the k_nearest neighbors of each region are considered, to speed up computation.
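A rough sketch of greedy merging is given below. It is not the package's code: the function names are hypothetical, regions are assumed to be sets of (x, y) pixels, and the similarity measure (shared pixels divided by the size of the smaller region) is an assumption, since the measure used by the package is not described here.

```python
def overlap_fraction(a, b):
    """Shared pixels divided by the size of the smaller region (assumed measure)."""
    return len(a & b) / min(len(a), len(b))


def centroid(region):
    n = len(region)
    return tuple(sum(p[i] for p in region) / n for i in range(2))


def greedy_merge(regions, overlap=0.5, max_iter=2, k_nearest=10):
    """Hypothetical greedy merge over regions given as sets of (x, y) pixels."""
    regions = [set(r) for r in regions]
    for _ in range(max_iter):
        centers = [centroid(r) for r in regions]
        merged, used = [], set()
        for i, region in enumerate(regions):
            if i in used:
                continue
            used.add(i)
            # Rank the remaining regions by centroid distance, keep the nearest k
            others = [j for j in range(len(regions)) if j not in used]
            others.sort(key=lambda j: sum((a - b) ** 2
                                          for a, b in zip(centers[i], centers[j])))
            current = set(region)
            for j in others[:k_nearest]:
                if overlap_fraction(current, regions[j]) > overlap:
                    current |= regions[j]
                    used.add(j)
            merged.append(current)
        n_before = len(regions)
        regions = merged
        if len(regions) == n_before:
            break  # nothing was merged in this pass
    return regions


a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 0), (1, 1), (1, 2)}   # shares 3 of 4 pixels with a
c = {(5, 5), (5, 6)}                   # far away, no shared pixels
result = greedy_merge([a, b, c], overlap=0.5)
print(len(result))  # 2
```

Here a and b are merged into one five-pixel region, while c is left untouched.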

list of algorithms

Here are all the algorithms currently available.

NMF(k=5, max_iter=20, max_size='full', min_size=20, percentile=95, overlap=0.1)

Local non-negative matrix factorization followed by thresholding to yield binary spatial regions. Applies factorization either to image blocks or to the entire image.

The algorithm takes the following parameters.

  • k number of components to estimate per block
  • max_size maximum size of each region
  • min_size minimum size for each region
  • max_iter maximum number of algorithm iterations
  • percentile value for thresholding (higher means more thresholding)
  • overlap value for determining whether to merge (higher means fewer merges)

The fit method takes the following options.

  • block_size a size in megabytes like 150 or a size in pixels like (10,10); if None, the full image is used
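The idea of factorization followed by percentile thresholding can be sketched with plain numpy. This is a simplified illustration under stated assumptions, not the package's NMF implementation: the multiplicative-update solver, the function names, and the way min_size is applied are all choices made for the example, and the package may apply additional steps.

```python
import numpy as np


def nmf(V, k, max_iter=20, seed=0):
    """Factor a non-negative matrix V (pixels x time) as W @ H.

    Uses standard multiplicative updates; a stand-in solver for illustration.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k))
    H = rng.random((k, V.shape[1]))
    eps = 1e-9  # avoids division by zero
    for _ in range(max_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def threshold_components(W, dims, percentile=95, min_size=20):
    """Turn each spatial component into a binary mask, dropping small masks."""
    masks = []
    for component in W.T:
        image = component.reshape(dims)
        mask = image > np.percentile(image, percentile)
        if mask.sum() >= min_size:
            masks.append(mask)
    return masks


# Toy movie with shape (t, x, y): one 6x6 patch that flickers over time
rng = np.random.default_rng(1)
t, dims = 30, (20, 20)
movie = 0.01 * rng.random((t,) + dims)
movie[:, 5:11, 5:11] += rng.random(t)[:, None, None]

V = movie.reshape(t, -1).T  # pixels x time
W, H = nmf(V, k=1)
masks = threshold_components(W, dims, percentile=90, min_size=20)
print(len(masks))  # 1
```

Raising percentile keeps fewer pixels in each mask, which matches the note above that a higher value means more thresholding.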

contributors

freeman-lab
