Tools to initialize a QuPath project from input files
dchaley / qupath-project-initializer
We got a new (simpler) version of the qupath script from Brenna here:
dchaley/deepcell-imaging#293 (comment)
Let's incorporate it into our repo.
It handles only the whole-cell case, not nuclear as well. It also seems to remove things like intensity measurements.
Future question: how do we configure which of these modes we do or don't handle?
I looked into this tool but couldn't get it to work quickly:
https://github.com/GoogleContainerTools/jib
It promises to build java containers intelligently e.g. separating dependencies from app code in docker layers.
This greatly speeds up builds & transfers as fewer layers need to change during regular development.
For now though … just Ship The Jar
Now that we have a working script to generate the qupath project files, we need to bundle it up into a container so we can run it live.
For the ~140M pixel image we detect ~160k cells. Adding QuPath intensity measurement takes about 2 minutes to process ~1600 cells (1%). At that rate, all cells would take ~200 minutes, i.e. a bit over 3 hr. This is on my M3 laptop, a decently powerful machine.
Getting this data is a key part of this workflow:
The intensity measurements are the main output from QuPath for downstream analysis, so unfortunately we have to run that super long step 😫
So the question is: how do we run this thing more efficiently? And how is it run today? Do users start the process then walk away overnight...? And where is this running in the first place today?
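For reference, the runtime estimate above is a plain linear extrapolation from the 1% sample. A minimal sketch, assuming the per-cell cost stays constant across the image (the function name is just for illustration):

```python
def estimate_total_minutes(total_cells: int, sample_cells: int, sample_minutes: float) -> float:
    """Linearly extrapolate a sampled runtime to the full cell count."""
    return sample_minutes * total_cells / sample_cells


# Figures from the measurement above: ~1600 of ~160k cells took ~2 minutes.
total = estimate_total_minutes(total_cells=160_000, sample_cells=1_600, sample_minutes=2.0)
print(f"{total:.0f} min (~{total / 60:.1f} hr)")
```

If the per-cell cost is not constant (e.g. it depends on cell size or channel count), the real number could differ quite a bit.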
The QuPath data files refer to file paths – local to wherever the QuPath script was run.
This means you can't just download the project & open it (at least I think not) without first fixing the paths.
One big question is: where does the end QuPath app even run? On somebody's computer? On a cloud VM?
Depends on #9 .
Once that's done, make a GitHub action that starts a Cloud Build on commit to main branch.
See also: dchaley/deepcell-imaging#254
For large enough projects, the boot disk may be insufficient to store all the downloaded images & generated QuPath data.
In that case we can create a ramdisk (in-memory file-system) to leverage generally large amounts of RAM.
We currently limit to 1 task per node (b/c the GPU has to be dedicated). If we ever had more tasks per node, each would need this ramdisk space available…
QUESTION: do we actually need this? If we increase the boot disk size, we get this "for free". The main cost is of course the larger boot disk, which we currently use across tasks. And is RAM actually enough? I think the QuPath code reloads the pieces from disk each time, which probably reduces RAM usage a lot.
So the RAM needs would be the (un?)compressed size of the images, plus the segmasks, plus the uncompressed pixel overlay (1 float per pixel, I think?). Then detection data…
Maybe for now, let's see how long we can get away with just using the default boot disk size.
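To get a feel for the overlay term alone, here is a back-of-the-envelope sketch. It assumes the "1 float per pixel" guess above means a 4-byte float, and uses the ~140M pixel image from the timing issue as the example size. Both are assumptions, not measured values.

```python
def overlay_bytes(num_pixels: int, bytes_per_value: int = 4) -> int:
    """Size of an uncompressed overlay holding one value per pixel."""
    return num_pixels * bytes_per_value


# Assumption: ~140M pixels, one 4-byte float each.
size = overlay_bytes(140_000_000)
print(f"{size / 1e6:.0f} MB")
```

The images, segmasks and detection data come on top of this, so this is only a lower bound on what a ramdisk would need to hold.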
If the project directory argument is remote storage, process the project in a temporary local directory.
When processing is complete:
If the remote URI is not a .gz, upload the files in the temporary project directory to the remote storage.
If the remote URI is a .gz, fail; that case is deferred to a follow-up: #16
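The flow above could be sketched roughly as follows. This is an illustration in Python, not the tool's actual code: `build_project` and `upload_dir` are hypothetical callbacks standing in for the real project generation and upload steps, and detecting remote storage by URI prefix is an assumption. The sketch also rejects `.gz` targets before processing rather than after, to avoid wasted work.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Assumption: remote storage is recognised by its URI scheme.
REMOTE_PREFIXES = ("gs://", "s3://")


def is_remote(uri: str) -> bool:
    return uri.startswith(REMOTE_PREFIXES)


def run_project(
    project_uri: str,
    build_project: Callable[[Path], None],    # hypothetical: writes the project into a directory
    upload_dir: Callable[[Path, str], None],  # hypothetical: uploads a directory to a remote URI
) -> None:
    if not is_remote(project_uri):
        # Local target: build the project in place.
        build_project(Path(project_uri))
        return
    if project_uri.endswith(".gz"):
        # Archive targets are deferred to the follow-up issue.
        raise NotImplementedError("remote .gz targets are not supported yet")
    # Remote target: build in a temporary local directory, then upload its files.
    with tempfile.TemporaryDirectory() as tmp:
        local_dir = Path(tmp)
        build_project(local_dir)
        upload_dir(local_dir, project_uri)
```

Using a context manager means the temporary directory is cleaned up even if the build or upload fails.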
Follow-up to #15
Support remote URIs that end in .gz. Archive the temporary directory then upload the archive to remote storage.
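A minimal sketch of the archiving step, assuming the `.gz` target means a gzip-compressed tarball (the issue only says ".gz", so the tar part is an assumption). The upload itself is left out, and `archive_directory` is an illustrative name.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir: Path, archive_path: Path) -> Path:
    """Pack src_dir into a gzip-compressed tarball at archive_path.

    archive_path should live outside src_dir, otherwise the archive
    would try to include itself.
    """
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname="." stores the directory contents at the archive root
        # instead of under the (random) temporary directory name.
        tar.add(str(src_dir), arcname=".")
    return archive_path
```

The resulting file can then be uploaded as a single object to the remote URI.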
Right now we have to recompile to change the input directory. 😿
Let's switch those to command-line arguments so that we can test more easily. Anyhow we need that to actually run this live!
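As an illustration of the idea (in Python, not the tool's own language), the hard-coded directories could become arguments along these lines. The flag names `--input-dir` and `--project-dir` are hypothetical.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Initialize a QuPath project from input files."
    )
    # Hypothetical flag names; the real tool may use different ones.
    parser.add_argument("--input-dir", required=True,
                        help="Directory containing the input files.")
    parser.add_argument("--project-dir", required=True,
                        help="Directory where the project is written.")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"input: {args.input_dir}, project: {args.project_dir}")
```

Passing `argv` explicitly keeps the parser easy to exercise from tests without touching the real command line.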
We have 2 upstream scripts:
https://github.com/VillasboasLab/MyCodexPipeline/blob/main/Modules/createNewProject_Codex.groovy
https://github.com/VillasboasLab/MyCodexPipeline/blob/main/Modules/export_individual_qupath_rois.groovy
We've been basing our implementation on the 1st script.
The 2nd script is our next step: actually export the measurements.
If the paths aren't already local, we need to download all remote files to local storage.
Since the current system works with directories we can just store them in their own temporary directories.
Once we've downloaded the files we'd update each path to point at its downloaded location.
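A rough sketch of that download-then-repoint step, in Python for illustration. `download_dir` is a hypothetical callback for whatever client actually fetches the files, and recognising remote paths by URI prefix is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Assumption: remote paths are recognised by their URI scheme.
REMOTE_PREFIXES = ("gs://", "s3://")


def localize_paths(
    paths: Dict[str, str],
    download_dir: Callable[[str, Path], None],  # hypothetical: fetches a remote dir into a local one
) -> Dict[str, str]:
    """Return a copy of paths where every remote entry points at a local download."""
    localized = {}
    for name, path in paths.items():
        if path.startswith(REMOTE_PREFIXES):
            # Each remote directory gets its own temporary directory.
            local_dir = Path(tempfile.mkdtemp(prefix=f"{name}-"))
            download_dir(path, local_dir)
            localized[name] = str(local_dir)
        else:
            # Already local: keep the path unchanged.
            localized[name] = path
    return localized
```

Note that `mkdtemp` does not clean up after itself, so the caller would need to remove the temporary directories once processing is done.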