allenneuraldynamics / aind-codeocean-utils
Repository to facilitate interfacing with Code Ocean's index
License: MIT License
As a user, I want to run a method to easily update custom metadata.
modality, subject id, platform, collection date, institution, and data level filled out correctly.
Is your feature request related to a problem? Please describe.
When trying to use CodeOceanJob to handle metadata tags correctly, I spent a couple of hours looking at the control flow, and I'm still not sure I get it. Some other inconsistencies were also confusing, such as wrappers around configuration objects in aind-codeocean-api that really only renamed things. We should also support a reprocessing workflow, so registration should be optional.
Describe the solution you'd like
I propose CodeOceanJob be organized as follows:
- Use aind-codeocean-api's request/configuration objects directly for the Register, Run, and Capture steps.
- pass_metadata_to_result flag, default True
- add_data_level_tags flag, default True
Is your feature request related to a problem? Please describe.
Currently, the aind-trigger-codeocean repo has some classes to run specific capsules, register assets, and capture results (see: https://github.com/AllenNeuralDynamics/aind-trigger-codeocean/blob/main/code/aind_trigger_codeocean/pipelines.py#L104).
However, such a class should live here, and the aind-trigger-codeocean repo should be a CO capsule that triggers jobs using this class.
Describe the solution you'd like
Ideally, a CodeOceanJob class should:
Describe alternatives you've considered
One could use the aind-codeocean-api directly, but registration and waiting for results to capture require some additional, non-trivial coding.
As a user, I want the code published to PyPI, so I can easily install it in other packages.
Move bot alert from aind-trigger-codeocean here.
Is your feature request related to a problem? Please describe.
We have a large number of assets that users have archived. These take up space and incur cost unnecessarily.
Describe the solution you'd like
A method that lets me see a list of all archived data assets older than a particular age and then separately decide to delete them. I should optionally be able to exclude assets that have attachments.
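The list-first, delete-later flow requested here could be sketched as a pure filter over asset records. This is only a minimal sketch under assumptions: `AssetRecord`, its fields, and `find_stale_archived_assets` are hypothetical names for illustration and do not reflect the Code Ocean API's actual schema. The function only reports candidates and never deletes anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AssetRecord:
    """Hypothetical, simplified view of a data asset (not the API's schema)."""

    asset_id: str
    created: datetime
    archived: bool
    has_attachments: bool


def find_stale_archived_assets(
    assets: Iterable[AssetRecord],
    min_age: timedelta,
    exclude_attached: bool = False,
    now: Optional[datetime] = None,
) -> List[AssetRecord]:
    """Return archived assets at least min_age old; nothing is deleted."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - min_age
    stale = []
    for asset in assets:
        # Skip assets that are not archived or are newer than the cutoff.
        if not asset.archived or asset.created > cutoff:
            continue
        # Optionally skip assets that still have attachments.
        if exclude_attached and asset.has_attachments:
            continue
        stale.append(asset)
    return stale
```

The caller can review the returned list and pass a chosen subset to a separate deletion step, which keeps the destructive action explicit.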
As a user, I'd like to use a map to replace tags, to make it easier to replace tags instead of running remove and add separately.
If a user calls update_tags, then they can supply an arg tags_to_replace: Optional[Dict[str, str]] = None that will change tags in the data_assets list.
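One way the map-based replacement might behave for a single asset's tag list is sketched below. The helper name `replace_tags` is hypothetical; only the `tags_to_replace` argument and its type come from the request above, and the de-duplication behaviour is an assumption.

```python
from typing import Dict, List, Optional


def replace_tags(
    tags: List[str],
    tags_to_replace: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return a new tag list with each mapped tag swapped for its replacement."""
    if not tags_to_replace:
        # No mapping supplied: return an unchanged copy.
        return list(tags)
    updated: List[str] = []
    for tag in tags:
        new_tag = tags_to_replace.get(tag, tag)
        # Assumption: a replacement should not introduce duplicate tags.
        if new_tag not in updated:
            updated.append(new_tag)
    return updated
```

Returning a new list rather than mutating the input makes it easy to compare old and new tags before sending any update request.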
As an admin, I would like to know which Runs we could potentially delete so as to save space/cost.
Code Ocean Capsules and Pipelines have many Runs that store output files. In many cases these runs are part of the normal testing cycle and can be removed.
This script should not actually delete data - that will be a separate task.
Original issue:
Code Ocean Capsules and Pipelines have many Runs that store output files. In many cases these runs are part of the normal testing cycle and can be removed.
Write a script that uses the Code Ocean API to identify Runs that we could potentially delete. The script should output a CSV with the following columns:
1. Capsule name
2. Run date / time
3. Whether the Run was captured as a Data Asset
4. Total size of files
We should be able to generate this report whenever we like. Actually deleting data will be a separate task.
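The report format could be sketched with the standard `csv` module as follows. This is an illustrative sketch only: `RunRecord`, its field names, and the column header spellings are assumptions, and fetching the Run information from the Code Ocean API is deliberately left out.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, TextIO


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical summary of one Run; field names are illustrative."""

    capsule_name: str
    run_time: datetime
    captured_as_data_asset: bool
    total_size_bytes: int


def write_run_report(runs: Iterable[RunRecord], stream: TextIO) -> None:
    """Write one CSV row per Run with the four requested columns."""
    writer = csv.writer(stream)
    writer.writerow(
        ["capsule_name", "run_datetime", "captured_as_data_asset", "total_size_bytes"]
    )
    for run in runs:
        writer.writerow(
            [
                run.capsule_name,
                run.run_time.isoformat(),
                run.captured_as_data_asset,
                run.total_size_bytes,
            ]
        )
```

Writing to any text stream (a file opened with `newline=""`, or an `io.StringIO` buffer) keeps the report generation separate from where the output ends up.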
As a user, I want a method to update tags on assets, so I can easily update tags.
- Given old_tag and new_tag and an iterator of data asset ids, then all assets in Code Ocean that have those data asset ids and have old_tag will have old_tag changed to new_tag.
- If old_tag is None, then all assets satisfying the filter will be tagged with new_tag.
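The two acceptance criteria can be illustrated on an in-memory mapping of asset id to tags. This is a sketch under assumptions: `update_tag` and the `asset_tags` dictionary are hypothetical stand-ins, and no Code Ocean request is made. The function returns only the assets whose tags would change.

```python
from typing import Dict, Iterable, List, Optional


def update_tag(
    asset_tags: Dict[str, List[str]],
    asset_ids: Iterable[str],
    new_tag: str,
    old_tag: Optional[str] = None,
) -> Dict[str, List[str]]:
    """Return the new tag lists for selected assets whose tags change."""
    changes: Dict[str, List[str]] = {}
    for asset_id in asset_ids:
        tags = asset_tags.get(asset_id)
        if tags is None:
            continue  # unknown asset id: nothing to update
        if old_tag is None:
            # No old_tag given: every selected asset gets new_tag.
            new_tags = list(tags)
        elif old_tag in tags:
            # Replace old_tag by dropping it and adding new_tag below.
            new_tags = [t for t in tags if t != old_tag]
        else:
            continue  # asset lacks old_tag: leave it untouched
        if new_tag not in new_tags:
            new_tags.append(new_tag)
        if new_tags != tags:
            changes[asset_id] = new_tags
    return changes
```

Computing the changes first, without applying them, allows a dry run before any asset is modified.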
Is your feature request related to a problem? Please describe.
Assets are not being tagged automatically, which is making them difficult to find.
Describe the solution you'd like
Raw data should be tagged with the DataLevel.RAW tag and derived data should be tagged with the DataLevel.DERIVED tag.
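The tagging rule could look roughly like the sketch below. The `DataLevel` enum here is a local stand-in defined only for illustration, with assumed string values; the real enum referenced in the issue lives elsewhere and may differ. `add_data_level_tag` is likewise a hypothetical helper.

```python
from enum import Enum
from typing import List


class DataLevel(str, Enum):
    """Local stand-in for the DataLevel enum named above; values are assumed."""

    RAW = "raw"
    DERIVED = "derived"


def add_data_level_tag(tags: List[str], is_derived: bool) -> List[str]:
    """Return a copy of tags that includes the matching data-level tag."""
    level = DataLevel.DERIVED if is_derived else DataLevel.RAW
    result = list(tags)
    # Only append when the tag is missing, so repeated calls are harmless.
    if level.value not in result:
        result.append(level.value)
    return result
```

Because the helper is idempotent, it can be applied both at registration time and when capturing results without producing duplicate tags.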