dchaley / qupath-project-initializer

Kotlin 94.36% Dockerfile 5.64%

qupath-project-initializer's Introduction

QuPath Project Initializer

Tools to initialize a QuPath project from input files

qupath-project-initializer's Issues

Import new version of script

We got a new (simpler) version of the qupath script from Brenna here:
dchaley/deepcell-imaging#293 (comment)

Let's incorporate it into our repo.

It handles only whole cell, not nuclear as well. It also seems to drop things like intensity measurements.

Future Question: how do we configure when we do/don't handle these various modes?

Faster builds with tool: jib?

I looked into this tool but couldn't get it to work quickly:
https://github.com/GoogleContainerTools/jib

It promises to build Java containers intelligently, e.g. by separating dependencies from app code into different Docker layers.

This greatly speeds up builds and transfers, as fewer layers need to change during regular development.

For now though … just Ship The Jar

Create container for qupath-project-initializer

Now that we have a working script to generate the qupath project files, we need to bundle it up into a container so we can run it live.

How to generate intensity measurements

For the ~140M pixel image we detect ~160k cells. Adding QuPath intensity measurement takes about 2 minutes to process ~1600 cells (1%). That means it would take ~3hr for all cells. This is on my M3 laptop, a decently powerful machine.
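The ~3hr figure is a linear extrapolation from the timed 1% sample. A minimal sketch of that arithmetic, assuming measurement time scales linearly with cell count (the function name is hypothetical, not part of this repo):

```kotlin
// Extrapolate total measurement time from a timed sample.
// Assumption: time per cell is constant, so total time scales linearly.
fun estimateTotalMinutes(sampleCells: Int, sampleMinutes: Double, totalCells: Int): Double =
    sampleMinutes * totalCells / sampleCells
```

Calling `estimateTotalMinutes(sampleCells = 1_600, sampleMinutes = 2.0, totalCells = 160_000)` gives 200 minutes, a little over 3 hours.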

Getting this data is a key part of this workflow:

From Brenna:

The intensity measurements are the main output from QuPath for downstream analysis, so unfortunately we have to run that super long step 😫

So the question is: how do we run this thing more efficiently? And how is it run today? Do users start the process then walk away overnight...? And where is this running in the first place today?

What to do with QuPath image paths?

The QuPath data files refer to file paths – local to wherever the QuPath script was run.

This means you probably can't just download the project and open it without first fixing the paths.

Can we do relative paths?
Should we ask the user to specify a path rewrite, for when they download it?
QuPath says it has an "image server"; can that be … cloud storage ...??? (It can do NAS but that's a big moving part)
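To make the "path rewrite" option concrete, here is a minimal sketch of swapping one root directory for another. It only shows the path manipulation; the function name and example directories are hypothetical, and it does not touch QuPath's project files or use any QuPath API.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Re-root a path recorded on the machine that ran the script (oldRoot)
// onto the machine where the project was downloaded (newRoot).
// Paths outside oldRoot are returned unchanged.
fun rewriteImagePath(original: Path, oldRoot: Path, newRoot: Path): Path =
    if (original.startsWith(oldRoot)) newRoot.resolve(oldRoot.relativize(original))
    else original
```

For example, with `oldRoot = /work` and `newRoot = /home/user/project`, the path `/work/images/a.tiff` becomes `/home/user/project/images/a.tiff`. The intermediate `oldRoot.relativize(original)` value is also what a relative-path scheme would store.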

One big question is: where does the end QuPath app even run? On somebody's computer? On a cloud VM?

Create github action to build qupath container on commit

Depends on #9.

Once that's done, make a GitHub Action that starts a Cloud Build on each commit to the main branch.

Create ramdisk to store working files

For large enough projects, the boot disk may be insufficient to store all the downloaded images & generated QuPath data.

In that case we can create a ramdisk (in-memory file-system) to leverage generally large amounts of RAM.

We currently limit to 1 task per node (b/c the GPU has to be dedicated). If we ever had more tasks per node, each would need this ramdisk space available…

QUESTION: do we actually need this? If we increase the boot disk size, we get this "for free". The main cost, of course, is the increased boot disk size that we currently use across tasks. And is RAM actually enough? I think the qupath code reloads the pieces from disk each time, which probably reduces RAM usage a lot.

So the RAM needs would be the (un?)compressed size of the images, plus the segmasks, plus the uncompressed pixel overlay (1 float per pixel I think?). Then detection data…
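A rough sketch of that sum, assuming a 4-byte float per overlay pixel and leaving detection data out because its size isn't known here (the function and constant names are hypothetical):

```kotlin
// Assumption: the pixel overlay stores one 4-byte float per pixel.
const val BYTES_PER_FLOAT = 4L

// Lower bound on ramdisk space: images + segmasks + pixel overlay.
// Detection data is not included.
fun estimateWorkingSetBytes(imageBytes: Long, segmaskBytes: Long, pixelCount: Long): Long =
    imageBytes + segmaskBytes + pixelCount * BYTES_PER_FLOAT
```

Under that assumption, the overlay alone for a ~140M pixel image would be about 560 MB.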

Maybe for now, let's see how long we can get away with just using the default boot disk size.

Upload qupath project directory to remote storage

If the project directory argument is remote storage, then process the project in a temporary local directory.

When processing is complete:

If the remote URI is not a gz, upload the files in the temporary project directory to the remote storage.
If the remote URI is a gz, fail and defer to the follow-up: #16
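The two cases above can be sketched as a small decision function. This is only an illustration: the type names are hypothetical, the `gs://` scheme in the usage note is an assumed example of remote storage, and the actual upload is left out.

```kotlin
import java.net.URI

// Outcome of inspecting the remote project URI once processing is complete.
sealed interface UploadPlan
data class UploadFiles(val destination: URI) : UploadPlan
data class Unsupported(val reason: String) : UploadPlan

// A destination ending in .gz would need archiving first, which is deferred to #16.
fun planUpload(destination: URI): UploadPlan =
    if (destination.path.orEmpty().endsWith(".gz")) {
        Unsupported("archive upload is not supported yet")
    } else {
        UploadFiles(destination)
    }
```

For instance, `planUpload(URI("gs://example-bucket/out/project.gz"))` yields `Unsupported`, while a destination without the `.gz` suffix yields `UploadFiles`.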

Support project archive upload

Follow-up to #15

Support remote URIs that end in .gz. Archive the temporary directory then upload the archive to remote storage.

Parameterize input directory from command line

Right now we have to recompile to change the input directory. 😿

Let's switch those to command-line arguments so that we can test more easily. Anyhow we need that to actually run this live!
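A minimal sketch of what that could look like, assuming two positional arguments (input directory, then project directory). The argument order and the `Options` type are assumptions for illustration, not the repo's actual interface.

```kotlin
// Hypothetical container for the values that are currently compiled in.
data class Options(val inputDir: String, val projectDir: String)

// Parse positional arguments; throws IllegalArgumentException with a usage hint on a bad count.
fun parseArgs(args: Array<String>): Options {
    require(args.size == 2) { "usage: <input-dir> <project-dir>" }
    return Options(inputDir = args[0], projectDir = args[1])
}
```

A `main(args: Array<String>)` entry point would then call `parseArgs(args)` and pass the result on, so changing the input directory no longer needs a recompile.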

Add measurements report output to QuPath

We have 2 upstream scripts:

https://github.com/VillasboasLab/MyCodexPipeline/blob/main/Modules/createNewProject_Codex.groovy
https://github.com/VillasboasLab/MyCodexPipeline/blob/main/Modules/export_individual_qupath_rois.groovy

We've been implementing off the 1st script.
The 2nd script is our next step: actually export the measurements.

Download remote cloud files to local storage

If the paths aren't already local, we need to download all remote files to local storage.

Since the current system works with directories we can just store them in their own temporary directories.

Once we've downloaded the files we'd update the path to the downloaded path.
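The flow described above could be sketched like this. The `download` parameter is a placeholder for whatever cloud client ends up being used, and treating any path containing `://` as remote is an assumption made for the example.

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Return a local directory for the given input path.
// Local paths are returned as-is; remote ones are downloaded into their own
// temporary directory, and that directory becomes the updated path.
fun localize(path: String, download: (remoteUri: String, targetDir: Path) -> Unit): Path {
    if (!path.contains("://")) return Paths.get(path) // already local
    val tempDir = Files.createTempDirectory("qupath-input-")
    download(path, tempDir)
    return tempDir
}
```

Passing the downloader in as a function keeps this logic independent of any particular storage library.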