swc-advanced-microscopy / stitchit

Stitching of large tiled datasets

License: GNU Lesser General Public License v3.0

MATLAB 98.49% JavaScript 0.47% CSS 0.12% HTML 0.42% Shell 0.50%
anatomy microscopy matlab tile stitching stitched-images neuroscience imaging

stitchit's Introduction

StitchIt

StitchIt is a MATLAB package for stitching data acquired via our ScanImage-based acquisition system (BakingTray). To get started, please read the Documentation. There is a changelog.

Features

With a single command (syncAndCrunch) the user can:

  • Pre-process image tiles as they are being acquired.
  • Display the last acquired section on the web.
  • Automatically stitch data when acquisition completes.
  • Send Slack notifications when acquisition completes or pre-processing fails.
  • Automatically conduct arbitrary analyses after acquisition completes.

StitchIt has commands for basic tasks such as:

  • Stitching subsets of a data set.
  • Calculating the average tile for illumination correction.
  • Calculating coefficients for correcting for scanning artifacts (experimental).
  • Randomly accessing any tile in the dataset.
  • Techniques for exploring stitching accuracy.
  • Cropping stitched datasets or partitioning a single stitched dataset into multiple ROIs.

Post-stitching functionality:

  • Correction of intensity differences across different optical sections.
  • Removal of tile seams in stitched images.
  • Down-sampling the dataset to a single multi-page TIFF stack or MHD file.

Installation

Clone the repository. Add the code directory and its sub-directories to your MATLAB path. In addition, you will need to acquire the following and add to your MATLAB path:

If you need more information on the installation procedure, please see the Installation page on the Wiki. StitchIt automatically checks whether it is up to date whenever the user runs syncAndCrunch.

Questions and bug reports

Please use the project's issue tracker for questions, bug reports, feature requests, etc. Please do get in touch if you use the software, especially if you are publishing with it! You may also join the StitchIt Gitter room for discussions.

Licensing

This software is distributed under the GPL v3 licence. This repository may be freely forked and shared so long as this licence is attached.

More tools

See btpytools for Python-based helper tools, e.g. to compress raw data or send data to a remote server. You may install those via:

$ sudo pip install btpytools

stitchit's People

Contributors

ablot, lguerard, raacampbell

stitchit's Issues

Handle the case where the syncAndCrunch machine is also the buffer server

Currently syncAndCrunch assumes one of the following scenarios:

  • For TissueVision experiments, data are streamed to a Linux server with no compute capacity and copied to a Linux analysis machine via rsync.
  • For other Windows-based systems, syncAndCrunch can access the data directly on a local drive on the acquisition machine so long as all channels are saved on a single computer.

syncAndCrunch does not cope with the scenario where the acquisition is streamed directly to the analysis machine on which the pre-processing is run.

NOTE: syncAndCrunch does not handle pulling data directly from the TissueVision acquisition machines. This will not be implemented because in-house we don't have remote access to these machines, as they run Windows XP and are on an isolated sub-net. It's also annoying to handle the fact that data from the three channels are streamed to two separate computers.

Rare random failure to read Mosaic file

Over the course of a ~12 hour acquisition I got one instance of:

17-01-19 02:06:24 -- ERROR: MATLAB:nonExistentField -- Error using tissuecyte/readMetaData2Stitchit>mosaic2StitchIt (line 41)
Reference to non-existent field 'SampleID'.
  mosaic2StitchIt - /home/rob/work/Anatomy/StitchIt/code/SystemClasses/@tissuecyte/readMetaData2Stitchit.m - line 41
   readMetaData2Stitchit - /home/rob/work/Anatomy/StitchIt/code/SystemClasses/@tissuecyte/readMetaData2Stitchit.m - line 24
    readMetaData2Stitchit - /home/rob/work/Anatomy/StitchIt/code/SystemFunctionStubs/readMetaData2Stitchit.m - line 35
     generateIndexFileInDirectory - /home/rob/work/Anatomy/StitchIt/code/SystemClasses/@tissuecyte/generateTileIndex.m - line 130
      generateTileIndex - /home/rob/work/Anatomy/StitchIt/code/SystemClasses/@tissuecyte/generateTileIndex.m - line 88
       generateTileIndex - /home/rob/work/Anatomy/StitchIt/code/SystemFunctionStubs/generateTileIndex.m - line 58
        syncAndCrunch - /home/rob/work/Anatomy/StitchIt/code/processDuringAcquistion/syncAndCrunch.m - line 318

I am guessing that, by chance, the Mosaic file was only partially written when it was read, because this error does not repeat.
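
If the cause really is a partially written file, one cheap mitigation would be to retry the read after a short pause. The following is only a sketch under that assumption: maxAttempts and pauseSeconds are hypothetical names, and readFcn is a placeholder for whichever call reads the metadata (e.g. @() readMetaData2Stitchit).

```matlab
% Sketch: retry a read that may have hit a partially written file.
% readFcn is a placeholder so the sketch runs on its own; in StitchIt
% it might be @() readMetaData2Stitchit
readFcn      = @() struct('SampleID', 'demo');
maxAttempts  = 3;    % hypothetical setting
pauseSeconds = 0.1;  % hypothetical setting

result = [];
for attempt = 1:maxAttempts
    try
        result = readFcn();
        break  % success, stop retrying
    catch ME
        fprintf('Attempt %d/%d failed: %s\n', attempt, maxAttempts, ME.message);
        if attempt == maxAttempts
            rethrow(ME)  % give up and report the original error
        end
        pause(pauseSeconds)
    end
end
```

Logging each failed attempt would also show how often this actually happens over a long acquisition.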

Remove redundant code in BT and TV tileLoad functions

There is too much redundant code in these functions. The tileLoaders should just load the tiles, then perform the other tasks with specific functions: stitchit.tileLoad.cropper, stitchit.tileLoad.illuminationCorrector, stitchit.tileLoad.combCorrector

How easy is it to make the average images using sub-sets of the data?

If the alignment drifts slightly with time, it will be helpful to have a rolling value for the average tile. Have tried this with the BakingTray sub-system and it works fairly well. The problems are that there aren't enough tiles near the bulb and that the average tile is liable to get very biased by transient bright areas, such as the injection site.

Does warnLowDiskSpace even work?

I think I have seen syncAndCrunch carry on until the volume is 100% full. Then MATLAB generates those incomprehensible error messages that always confuse people. Check that warnLowDiskSpace works. I think we should probably have a cap that will cause stitching or syncAndCrunch to stop when there is less than, say, 500 GB left on the volume and to show a sensible error message.
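
A free-space cap of this sort could look roughly like the sketch below. It is not how warnLowDiskSpace works; it assumes MATLAB is running with its JVM so that java.io.File is available, and volumePath and minFreeGB are hypothetical names.

```matlab
% Sketch: detect when free space drops below a threshold.
volumePath = pwd;   % in practice, the directory the data are written to
minFreeGB  = 500;   % threshold suggested in the issue text

% getUsableSpace is a method of java.io.File; this needs MATLAB's JVM
f         = java.io.File(volumePath);
freeBytes = f.getUsableSpace();
freeGB    = double(freeBytes) / 1024^3;

isLow = freeGB < minFreeGB;
if isLow
    msg = sprintf('Only %0.1f GB free on %s (minimum is %d GB).', ...
                  freeGB, volumePath, minFreeGB);
    disp(msg)
    % a real check would stop here, e.g. by calling error(msg)
end
```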

The flipud and fliplr options in stitcher.m no longer need isfield

The INI file reader has for some time automatically added settings that the user's file does not have but the default file does have. This makes it possible to easily add new options without needing isfield commands each time we read a setting. In stitcher.m there are a couple of isfield lines near the bottom.

if isfield(st,'flipud') & st.flipud

We can replace these with:

if st.flipud

And all should be fine.

The behavior of readStitchItINI that makes this possible should be documented in that function's help text.
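
For the help text, the default-filling behaviour could be illustrated with something like the sketch below. This is not the code in readStitchItINI: the two structs and the section name "stitching" are made up for illustration, and they only mimic a parsed INI file with one field per section.

```matlab
% Sketch: fill settings missing from the user's INI with default values.
defaults.stitching.flipud = 0;
defaults.stitching.fliplr = 0;
user.stitching.fliplr     = 1;   % the user's file lacks flipud

st = user;
sections = fieldnames(defaults);
for ii = 1:length(sections)
    sec = sections{ii};
    if ~isfield(st, sec)
        st.(sec) = defaults.(sec);   % whole section missing
        continue
    end
    keys = fieldnames(defaults.(sec));
    for jj = 1:length(keys)
        if ~isfield(st.(sec), keys{jj})
            st.(sec).(keys{jj}) = defaults.(sec).(keys{jj});
        end
    end
end

if st.stitching.flipud   % safe without isfield
    disp('flipping up/down')
end
```

Values the user did set are left untouched. Only missing keys are copied from the defaults.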

Write permissions check is sometimes wrong

>> syncAndCrunch(localDir, serverDir, 0, 1:3, 0, 2)  
WARNING: you appear not to have permissions to write to /mnt/data/IMCF/LisaT/wt1
Getting first batch of data from server

I have seen this only rarely (it doesn't stop anything from working) and I've not been able to reproduce it.

Stitch BakingTray data based on stage positions

I think this isn't working right now. BakingTray goes more accurately to the desired stage locations than Orchestrator does (which produces incorrect values for the stage locations for some reason), so this isn't a big deal: the BT data look better anyway.

syncAndCrunch initially displays a warning that it can't find a log file

Likely there is a command that requires a log file and this displays a warning until rsync has copied one over.

>> syncAndCrunch(L,S,0,1:3,0,1)
Can not find acquisition system log file in /mnt/data/TissueCyte/kanamori/Retro1
Getting first batch of data from server and copying to /mnt/data/TissueCyte/kanamori/Retro1/rawData
Running:
rsync -a /mnt/tvbuffer/Data/Mrsic-Flogel/kanamori/Retro1/ /mnt/data/TissueCyte/kanamori/Retro1/rawData

Document new system-specific INI files

We can now name the INI files so that they are associated with a particular acquisition system (or even different users of the same system). e.g. "stitchitConf_cajal.ini" and "stitchitConf_brainsaw.ini" and readStitchItINI automatically looks for these based on the system ID as returned by:

M=readMetaData2Stitchit
M.System.ID

This is not documented right now: document it!
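
The documentation could include a short example of the naming convention, along the lines of the sketch below. The system ID is hard-coded here so the sketch runs on its own, and the fallback message is an assumption about what happens when no system-specific file exists.

```matlab
% Sketch: build the system-specific INI file name from the system ID.
systemID = 'cajal';   % in StitchIt this would come from M.System.ID
iniName  = sprintf('stitchitConf_%s.ini', systemID);

if exist(iniName, 'file')
    fprintf('Using system-specific INI file %s\n', iniName)
else
    % Assumption: some generic INI file is used when this one is absent
    fprintf('No %s found; a generic INI file would be used\n', iniName)
end
```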

Rare failure to stitch following syncAndCrunch from a TissueVision experiment

The Slack message is: "Stitching failed. Undefined function 'ne' for input arguments of type 'cell'."
Firstly, this isn't informative enough. User reports that missing tiles were found and fixed but maybe some tiles are still missing somewhere (don't know where). Manually running the stitching command fails again (no error message provided).

Re-running syncAndCrunch fixes things so one possibility is that a final rsync is needed. However, that is just a guess.

The first thing to do is to create a log file so we have a better idea of which steps worked and which did not.

tileIndex format is very restrictive

The tileIndex file links the TIFF tile names to a position in the sample. However, the format of this file only makes sense for data generated by orchestrator. Other acquisition systems won't produce data this way. The format only handles three channels.

We likely need a new tile index system that is a .mat file and is more flexible. This was already done for the tileStats files, but that was easier because we weren't using these files for anything until recently.

Maybe we could retain the old tileIndex system somehow as a legacy format and keep functions to hand that will handle this. Mark these as legacy and slowly phase them out.

INI settings should all have param/value pair overrides

It would make sense to be able to override all (or at least the vast majority) of INI settings with parameter/value pairs. That way we can choose sensible defaults but always over-ride without needing to edit the INI file each time.
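
A minimal sketch of how such overrides might be applied is shown below. The settings struct and the overrides cell array are hypothetical; inside a function, overrides would be varargin.

```matlab
% Sketch: override INI-derived settings with parameter/value pairs.
settings.flipud = 0;          % values as read from the INI file
settings.fliplr = 0;
overrides = {'flipud', 1};    % stands in for varargin

if mod(length(overrides), 2) ~= 0
    error('Overrides must be parameter/value pairs')
end
for ii = 1:2:length(overrides)
    name = overrides{ii};
    if ~isfield(settings, name)
        error('Unknown setting "%s"', name)
    end
    settings.(name) = overrides{ii+1};
end
```

Rejecting unknown names catches typos in parameter names early, rather than silently ignoring them.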

Remove objectives info from the INI file?

The INI file contains objective information to set the number of microns per pixel. This is very specific to the system used for the acquisition and also very specific to the TissueVision.

Other systems might well report accurate microns per pixel values and not need the objective information in the INI file. So it's best to try removing objectives from the INI file.

Average image generation will fail with large images

Some time ago I took the decision to move from a rolling average (where we only need to keep in RAM the rolling average image and the last loaded section) to a system of loading up all images in a section before calculating the average. This choice was made because there were problems with non-brain average images and I hoped that having all images present in RAM at once would allow for more flexibility.

However, as things currently stand the system can fail to work if the tiles are very large since it makes running out of RAM likely. There are a few solutions:

  1. Switch back to rolling average.
  2. Down-sample average images. They don't need to be even 1 micron per pixel.
  3. Do less stuff in parallel.

I rather like option 2. Should look into it.
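
Options 1 and 2 can also be combined. The sketch below keeps a running mean over down-sampled tiles, so only the current tile and the running mean are in memory. The tiles array and the step factor are stand-ins, and the down-sampling is plain decimation rather than proper resizing with filtering.

```matlab
% Sketch: running mean over tiles with crude down-sampling.
nTiles = 5;
tiles  = rand(8, 8, nTiles);  % stand-in for tiles loaded one at a time
step   = 2;                   % hypothetical down-sampling factor

avgTile = [];
for n = 1:nTiles
    t = tiles(1:step:end, 1:step:end, n);   % decimate, no filtering
    if isempty(avgTile)
        avgTile = zeros(size(t));
    end
    avgTile = avgTile + (t - avgTile) / n;  % incremental mean update
end
```

The incremental update gives the same result as averaging all tiles at once, up to floating-point rounding.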

Add an update checker

For people who have cloned with Git, we should check on runs of, say, syncAndCrunch for updates and report if the system is up to date. People fail to update and are stuck with bugs.

Revisit the way we subtract the background tiles at some point

The BakingTray tile loader currently uses the background tiles found by ./preProcessTiles/private/writeTileStats.m to subtract the background from the average tile. The process is somewhat ad hoc and was motivated by the offset added to the amp. We have stopped adding this offset, so empty tiles should be near zero. I think, therefore, that we should modify this code in the light of this. See also: #46

tidy stitchAllSubDirectories

  • Check whether the unix find command is really needed
  • Improve comments
  • Remove the horrible way we check how many channels are present

Set number of workers in INI file

We should add to the INI file keys that determine the maximum number of workers for each process. e.g. how many to use for tile loading, how many for section stitching, etc.

We should remember that the parallel pool might already have been started with a smaller number of workers, so something will need to be done about that.

It's important to set this because I've noticed that hardware RAID 10 works best if the number of workers is equal to the number of drives. Conversely, I think btrfs RAID will keep getting faster with more threads (up to some limit, of course).

Modular pre-processing steps

Should we make a modular stitching system that allows, say, pre-processing steps to happen in any order? It would be plugin-based, e.g. the phase delay shifts would be a plugin. If I do this, it means a change to the way the input arguments are defined. Maybe it should be param/value pairs of some sort?
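
One simple form this could take is a cell array of function handles applied in order, as in the sketch below. The two steps are hypothetical placeholders, not existing StitchIt functions.

```matlab
% Sketch: run pre-processing steps as an ordered list of plugins.
% Each step takes an image and returns the processed image.
steps = {@(im) im - min(im(:)), ...   % hypothetical offset removal
         @(im) flipud(im)};            % hypothetical flip step

im = magic(4);                         % stand-in for a loaded tile
for ii = 1:length(steps)
    im = steps{ii}(im);
end
```

Re-ordering or removing a step then only means editing the list, not the loader itself.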

stitchAllSubDirectories changes dir so won't honour INI file in calling dir

stitchAllSubDirectories will descend into directories to stitch what's in them. If there was an ini file in the calling directory, this will be ignored. Either the system one will be used, or whatever is in the child directories will be used. Perhaps this isn't a bug, but it might cause confusion.

Send Slack message when no average tiles are made

When no average tiles are made that is an indirect indicator that the laser has lost modelock or that other bad stuff has happened. Add option to send Slack message when average tiles failed to be made.
