wipp-backend's Introduction

WIPP REST API

A Java Spring Boot application for managing WIPP data and workflows. The API follows the HATEOAS architecture using the HAL format.

Requirements

Requirements for development environment setup.

Java environment

  • Java JDK 21
  • Maven version compatible with Java 21

Database

  • MongoDB (Supported versions: 3.6 to 7.0)

Identity and Access Management

  • Keycloak 11.0.2
  • The default dev configuration expects Keycloak at http://localhost:8081/auth (see wipp-backend-application/src/main/resources/application.properties). Sample Docker run command: docker run -p 8081:8080 -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=admin quay.io/keycloak/keycloak:11.0.2 (see https://www.keycloak.org/getting-started/getting-started-docker)
  • Import WIPP realm available in folder docs/auth-acl
  • RBAC-ACLs descriptions available in file docs/auth-acl/acl.md

Kubernetes cluster

  • For development purposes, a single-node cluster can be easily installed using Minikube or Docker for Mac on macOS
  • We are using Argo Workflows to manage workflows on the Kubernetes cluster; a minimal installation of the Argo Server and Controller can be set up using the following command:
kubectl apply -f https://github.com/argoproj/argo-workflows/releases/download/v3.5.5/quick-start-minimal.yaml

Please follow the Argo Workflows instructions for version 3.5.5 to:

  • install the Argo binary

Data storage

  • Create a WIPP-plugins folder in your home directory for data storage (the dev Maven profile expects the data folder location to be $HOME/WIPP-plugins)
  • Create the WIPP data storage Persistent Volume (PV) and Persistent Volume Claim (PVC) in your Kubernetes cluster following the templates for hostPath PV and PVC available in the WIPP repository:
    • the hostPath path in hostPath-wippdata-volume.yaml should be modified to match the path of the WIPP-plugins folder created above
    • the storage capacity is set to 100Gi by default; this value can be modified in hostPath-wippdata-volume.yaml and hostPath-wippdata-pvc.yaml
    • run hostPath-deploy.sh to set up the WIPP data PV and PVC

Local backend import option
Default root folder configuration for the local backend import of images collections is:

  • ${user.home}/WIPP-plugins/local-import in dev profile and
  • /data/WIPP-plugins/local-import in prod profile

This default configuration can be changed in wipp-backend-application/pom.xml (property storage.local.import) when running locally or in the deployment manifest when running in a Kubernetes cluster (an additional volume needs to be mounted if the new location is not in the WIPP root data folder).

Compiling

mvn clean install

Running

cd wipp-backend-application
mvn spring-boot:run

Docker packaging

docker build --no-cache . -t wipp_backend

For a Docker deployment of WIPP on a Kubernetes cluster, scripts and configuration files are available in the WIPP repository.

Deployment

  1. Create a .env file in the root of the repository, using sample-env as an example.
  2. Configure kubectl with a kubeconfig pointing to the correct Kubernetes cluster. Optionally, pass the location of the kubeconfig file in the .env. This value defaults to the standard kubeconfig location.
  3. Run the script using: ./deploy.sh.

Application Performance Monitoring (APM)

The Elastic APM Java agent is integrated into the Docker image as an optional setting. The Elastic APM agent will push metrics to an APM server, which feeds into Elasticsearch and Kibana. Configuration of the Elastic APM Java Agent to connect to the APM server is controlled via environment variables. These variables are optional if Elastic APM is not needed.

Environment variables:

  • ELASTIC_APM_SERVICE_NAME: Service name tag attached to all metrics sent.
  • ELASTIC_APM_SERVER_URLS: URL of the Elastic APM server.
  • ELASTIC_APM_APPLICATION_PACKAGES: (Optional) Determines stack trace frame. Multiple packages can be set.
  • ELASTIC_APM_SECRET_TOKEN: (Optional) Secret token for the Elastic APM server.

WIPP Development flow

We are following the Gitflow branching model for the WIPP development.

Contributing

Please follow the Contributing guidelines

Disclaimer/License

NIST Disclaimer/License

wipp-backend's People

Contributors

dependabot[bot], ktaletsk, mapleknight, mohamedouladi, mylenesimon, nicholas-schaub, pdessauw, samiasa, sunnielyu, tejavegesna, wangk8

wipp-backend's Issues

Leverage Argo's new workflow persistence storage

Description
As of version 2.4.0-rc1, Argo supports workflow persistence storage in a centralized external database.
We could try to leverage that and have the workflows stored in the WIPP database, so that the storage is persistent and we have workflow status and more directly available in the backend, without having to do extra requests to the Argo API.

Proposal
This new configuration uses the upper.io/db.v3 database adapter, which supports MongoDB, but for now Argo only provides MySQL and PostgreSQL implementation examples, so we will have to see whether using an SQL database is a requirement.

HTTP GET requests triggering the sorting return the same results no matter the sorting direction

Summary

HTTP GET requests triggering the sorting return the same results no matter the sorting direction.

What is the current bug behavior?

In the images collections list page (WIPP-frontend), when the sort is triggered on the columns "Number of images" or "Images total size", the HTTP GET request returns the same result for both sort directions (asc and desc).

Similar behavior in the images detail page, when sorting images or metadata files by name or by size.

What is the expected correct behavior?

The sorting must return different results depending on the sorting direction (asc or desc).

Steps to reproduce

From Postman, send the following requests (for images):

http://localhost:8080/api/imagesCollections/{imagesCollectionId}/images?page=null&size=10&sort=name,asc

http://localhost:8080/api/imagesCollections/{imagesCollectionId}/images?page=null&size=10&sort=name,desc

Possible fixes

For the columns "Number of images" and "Images total size", removing the "@JsonIgnore" annotation in the ImagesCollection.java class fixes the issue from a client-side perspective (this might not be the optimal solution). Perhaps the upgrade to Spring Boot 2.0 changed the sorting behavior.

Files larger than 2GB fail to upload

Summary

Uploading files larger than 2GB to a collection always fails.

What is the current bug behavior?

When uploading files larger than 2GB, WIPP always reports an error with the file upload:
Unknown upload error.

What is the expected correct behavior?

The file uploads.

Steps to reproduce

Upload a file larger than 2,147,484,672 bytes.
I took a large file and saved it in subsets so that the file size would change. Below is a list of file sizes I tried and whether they succeeded or failed:

File size: Upload status

2,013,266,944 bytes: Succeeded
2,080,375,808 bytes: Succeeded
2,147,484,672 bytes: Succeeded
2,214,593,536 bytes: Failed
2,281,702,400 bytes: Failed
2,348,811,264 bytes: Failed

Relevant screenshots and/or logs

2019-07-24 21:00:55.435 ERROR 1 --- [nio-8080-exec-6] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception

java.io.IOException: Negative seek offset
	at java.io.RandomAccessFile.seek(RandomAccessFile.java:555) ~[na:1.8.0_212]
	at gov.nist.itl.ssd.wipp.backend.data.utils.flowjs.FlowjsController.uploadChunck(FlowjsController.java:92) ~[wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at gov.nist.itl.ssd.wipp.backend.data.imagescollection.files.FileUploadController.uploadChunck(FileUploadController.java:76) ~[wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at sun.reflect.GeneratedMethodAccessor145.invoke(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_212]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_212]
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209) ~[spring-web-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136) ~[spring-web-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:877) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:783) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:991) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:925) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:974) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:877) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851) ~[spring-webmvc-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) ~[tomcat-embed-websocket-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at gov.nist.itl.ssd.wipp.backend.core.rest.CorsFilter.doFilter(CorsFilter.java:50) ~[wipp-backend-core-3.0.0-SNAPSHOT.jar!/:3.0.0-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200) ~[spring-web-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) ~[spring-web-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198) ~[tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:800) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1471) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_212]
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.32.jar!/:8.5.32]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]

Environment info

wipp-backend-data-3.0.0

Possible fixes

It appears as though the error is caused by an int indexing variable used to store file chunk position in FlowjsController.java.
https://github.com/usnistgov/WIPP-backend/blob/master/wipp-backend-data/src/main/java/gov/nist/itl/ssd/wipp/backend/data/utils/flowjs/FlowjsController.java

On line 86: int flowChunkNumber = getFlowChunckNumber(request);
The actual error occurs on line 92 according to the logs, with a negative index issue.

I suspect that using a signed int is likely the cause of the problem, given the errors start occurring around file sizes 2^31 bytes.
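
A minimal sketch illustrating the suspected overflow (not the actual FlowjsController code; the chunk size and chunk number below are made-up values): with an int offset the computation wraps negative past 2^31 bytes, which matches the "Negative seek offset" error, while a long offset does not.

// Illustrative only: int-based offset arithmetic overflows for files > 2 GB.
public class ChunkOffsetSketch {
    public static void main(String[] args) {
        int chunkSize = 1_048_576;   // hypothetical 1 MiB chunk size
        int chunkNumber = 2_100;     // a chunk index beyond the 2 GB boundary

        int intOffset = chunkSize * (chunkNumber - 1);          // overflows, becomes negative
        long longOffset = (long) chunkSize * (chunkNumber - 1); // correct offset

        System.out.println("int offset:  " + intOffset);   // negative value
        System.out.println("long offset: " + longOffset);  // 2200961024
        // RandomAccessFile.seek(intOffset) would throw "Negative seek offset".
    }
}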

Add a queue for workflow using GPU

Description
Currently there is no queue for workflows using the GPU: if the GPU is already in use, a new workflow that tries to access it will fail.

Proposal
Add a queue for workflows using the GPU so that they wait for the GPU to become available before running.

Handle inputs/outputs for chained jobs

Description
When creating a workflow, the user can select the output of an existing task (A) as the input of another task (B). This results in the input configuration for task B being sent to the backend as {{ jobID.outputName }} (jobID being the id of task A, outputName the name of task A's output to be used as task B's input).
This has to be translated to the actual path to the data during the workflow conversion to the Argo YAML file.

Proposal
{{ jobID.outputName }} can be parsed and then used to get the path of the temporary folder created for this outputName of the job.
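
A possible sketch of that parsing step, assuming the placeholder syntax described above and the temp jobs folder layout shown elsewhere in this tracker; the class name and the exact per-output folder layout are illustrative assumptions, not the actual WIPP code.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: replace "{{ jobId.outputName }}" with the temp output
// folder of that job; the folder layout per outputName is assumed.
public class ChainedInputResolver {

    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*(\\S+)\\.(\\S+)\\s*\\}\\}");

    public static String resolve(String inputValue, String jobTempRoot) {
        Matcher m = PLACEHOLDER.matcher(inputValue);
        if (!m.matches()) {
            return inputValue; // plain value, e.g. an existing collection id
        }
        String jobId = m.group(1);
        String outputName = m.group(2);
        return jobTempRoot + "/" + jobId + "/" + outputName;
    }

    public static void main(String[] args) {
        System.out.println(resolve("{{ 5d08443a8870c7246a98b8fe.output }}",
                "/data/WIPP-plugins/temp/jobs"));
        // -> /data/WIPP-plugins/temp/jobs/5d08443a8870c7246a98b8fe/output
    }
}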

Support export of collections from Catalog to WIPP

Description
The NCATS team is developing the Catalog app, where users can upload collections of images, visualize them, reorganize them and more. Users should be able to export a set of images from Catalog to WIPP, so that they can then run computations on them.

Proposal
The support of this feature is divided in several steps:

  1. Decouple submission to the OME TIFF conversion from the image upload
    Currently, the methods for submitting uploaded images to the OME TIFF conversion are inside of the ImageUploadController class, but in order to support multiple ways of creating image collections, we should put them in their own class.
    To do:

    • Create new class ImageConversionService (annotated with @Service) in the same package, and move methods submitImageToExtractor, doSubmit and convertToTiledOmeTiff in this new class.
  2. Add new importMethod and sourceCatalog attribute to the ImagesCollection class
    Since we are extending the ways images collections can be created, we need to record the method that was used and keep track of the source.
    To do:

    • Create new ImagesCollectionImportMethod enum with the following values: UPLOADED, JOB and CATALOG, add importMethod attribute to ImagesCollection
    • Add sourceCatalog (string) attribute to ImagesCollection
    • Modify existing ImagesCollection constructors to accommodate the new importMethod field, as sketched after this list
      (public ImagesCollection(String name, boolean locked) should now become public ImagesCollection(String name, boolean locked, ImagesCollectionImportMethod importMethod))
    • Create new constructor for collections created from Catalog
      public ImagesCollection(String name, String sourceCatalog) that will set name and creation date, as well as locked to true (users should not upload images to a collection coming from catalog), sourceCatalog to sourceCatalog and importMethod to ImagesCollectionImportMethod.CATALOG
  3. Create ImagesCollectionCatalogImportController class
    To do:

    • Create a new ImagesCollectionCatalogImportController that will import files from the images collection temp folder, with OME TIFF conversion of the images (somewhat similar to the importData method of ImagesCollectionDataHandler, but with an added conversion step). Before running the import, the first step should be to check that the collection is empty, the importMethod is CATALOG, and sourceCatalog is not empty.
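
A minimal sketch of the step 2 additions described above (enum and constructors); only the enum values, field names and constructor signatures listed in the proposal are taken from it, the remaining fields are assumed for illustration.

import java.util.Date;

// Sketch only; does not reflect the full ImagesCollection entity.
enum ImagesCollectionImportMethod {
    UPLOADED, JOB, CATALOG
}

class ImagesCollection {
    private String name;
    private Date creationDate;
    private boolean locked;
    private String sourceCatalog;
    private ImagesCollectionImportMethod importMethod;

    // Existing constructor, extended with the new importMethod parameter
    public ImagesCollection(String name, boolean locked,
                            ImagesCollectionImportMethod importMethod) {
        this.name = name;
        this.creationDate = new Date();
        this.locked = locked;
        this.importMethod = importMethod;
    }

    // New constructor for collections imported from Catalog
    public ImagesCollection(String name, String sourceCatalog) {
        this(name, true, ImagesCollectionImportMethod.CATALOG);
        this.sourceCatalog = sourceCatalog;
    }
}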

Additional context
Export images from Catalog to WIPP flow:

  1. Create a new WIPP images collection from Catalog (using the REST API, specifying the desired name for the collection as well as the ID of the Catalog resource); this will return the newly created collection, including its ID
  2. Copy the files from Catalog to the temp folder of the collection
  3. Once the copy is done, send a request to the ImagesCollectionCatalogImportController, which will trigger the import of the files into the collection (physical import to the proper folders, as well as conversion to OME TIFF and update of the metadata in the database)

Add developer doc

Add/update doc for devs:

  • requirements (mongo, maven, java version)
  • how to set up dev environments (Kubernetes cluster, argo, and such)
  • how to compile and run the backend API

Add attribution fields in plugin manifest

Description
Currently, we only have support for one field in the plugin manifest to define attribution metadata (creator). We want to replace this field by a set of new ones defined below.

Proposal
Add the following attributes to the Plugin.java class:

  • author (will replace creator, to be updated in existing plugin manifests, database patch script should be provided as well)
  • institution
  • repository
  • website
  • citation

Additional context
Existing plugin manifests will have to be updated to account for the new fields.

Refactor getJobOutputTempFolder method

The getJobOutputTempFolder method will be needed in most DataHandlers and will be essentially identical in each, so it should not be re-implemented in every DataHandler implementation, to avoid code duplication.

Since there are several places in the code where the temp folders for job outputs need to be created and/or their paths retrieved, we could create a service handling that, autowired where needed.
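
A possible sketch of such a service; CoreConfig is the existing WIPP configuration bean, but the accessor name and the temp folder layout used below are assumptions, not the actual WIPP code.

import java.io.File;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// Illustrative sketch of the shared job temp folder service suggested above.
@Service
public class JobTempFolderService {

    @Autowired
    private CoreConfig coreConfig; // assumed to expose the temp jobs folder path

    /** Returns (and creates if needed) the temp folder for a given job output. */
    public File getJobOutputTempFolder(String jobId, String outputName) {
        // Assumed layout: {temp jobs folder}/{jobId}/{outputName}
        File folder = new File(
                new File(coreConfig.getJobsTempFolder(), jobId), outputName);
        folder.mkdirs();
        return folder;
    }
}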

Add UNet Model Data Type

UNet CNN Training job outputs the following folder hierarchy:

saved_model/
    saved_model.pb
    assets/
    variables/
        variables.data-00000-of-00002
        variables.data-00001-of-00002
        variables.index
tensorboard/
    train/
        events.out.tfevents*
    test/
        events.out.tfevents*
model.txt
test_loss.csv

The users should be able to view the test_loss.csv and the model.txt, which contains a layer-by-layer description of the model, including the number of parameters per layer.

Ideally, the user should be able to launch the TensorBoard instance and view the training logs, either during training or afterwards. TensorBoard is where the train/test loss and accuracy curves live. TensorBoard launches a web server to display the log data, so in theory we should be able to launch that in the background and provide the user with a link to it.

Permanent redirect error in Argo Workflow with HTTPS

Summary

When submitting a Workflow, the entire Workflow succeeds, but no output collections of any sort are added to the WIPP collections.

What is the current bug behavior?

When submitting a Workflow, no output collections are created. In the final container of the Argo Workflow, the logs indicate that the curl request to the /api/workflows/<id>/exit API returns a 308 Permanent Redirect status. The Workflow shows a "SUCCEEDED" status and the Argo Dashboard shows no sign of errors.

What is the expected correct behavior?

The /exit API should return a 200 status and output collections from the Workflow should be created in WIPP.

Steps to reproduce

We have a deployment of WIPP and Argo running in AWS EKS and using Nginx Ingress. HTTPS is configured with SSL termination at the AWS Load Balancer level and SSL redirect is configured in Ingress so that any HTTP request is automatically redirected to use HTTPS.

Environment info

WIPP version: 3.0.0
Kubernetes version: 1.14
Deployment: AWS EKS

Possible fixes

  1. The full URL for the WIPP deployment is not passed to the final Argo curl container; if the full URL (with https) were passed, no redirect issue would occur.
  2. Add the -L option to the curl command here, so that the redirect is followed.

Option for keeping original image data

Description
When uploading images to a collection, there should be an option to keep the original data. Currently, all images are converted to tiled TIFF. Some existing tools for processing certain types of image data either 1) only work on a particular image format or 2) do not read tiled TIFFs. Further, some of these file formats include metadata at specific locations in the file and may not be recognized by an existing tool after conversion to tiled TIFF.

Proposal
Include an option on the frontend to keep the original data prior to tiled TIFF conversion. This should be a property that is set prior to uploading files, maybe when naming the collection.

Add pyramid annotations data type

Description

Users can create annotations in the WDZT view of pyramids. To support a complete AI pipeline, the annotations-to-mask plugin requires the annotations JSON file as an input file. Adding the pyramid annotation data type will therefore facilitate passing the JSON file as an argument to the annotations-to-mask plugin.

Proposal

A PyramidAnnotation entity needs to be implemented, linking the annotation JSON file to a certain folder referenced by its id. This entity must be linked to the time slice for which the annotations were created. This will also require a save button in the WDZT view, which will save the JSON file to the file system via an HTTP POST request.

WIPP data volume should be mounted to /data/inputs in plugin containers

Description
Currently, the WIPP data root folder is mounted as a volume in the plugin container using the same file path (i.e., if the root data folder is /home/myusername/dataWIPP, it will be mounted at the same location inside the container).
This is fragile and could create issues; the mount point should be standardized.

Proposal
During the workflow conversion to the YAML file, file paths should be curated so that {wipp.data.root} becomes /data/inputs in the YAML file (a sketch follows the list):

  • the data volume mount path inside the container should be /data/inputs, in the "generateTemplatePluginContainer" method
  • the host path of the data volume will stay the same, {wipp.data.root}
  • input and output folder paths in the workflow task configurations should be curated in the DataInputHandlers to replace {wipp.data.root} by /data/inputs
    • for example, /data/WIPP-plugins/imageCollections/{id} will become /data/inputs/imageCollections/{id}
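
A small sketch of the path curation step, using the example paths from the proposal; the class and method names below are illustrative only, not the actual WIPP converter code.

// Illustrative only: rewrite a host-side data path to the standard /data/inputs mount.
public class DataPathCurator {

    private static final String CONTAINER_INPUTS_ROOT = "/data/inputs";

    public static String curate(String hostPath, String wippDataRoot) {
        if (hostPath.startsWith(wippDataRoot)) {
            return CONTAINER_INPUTS_ROOT + hostPath.substring(wippDataRoot.length());
        }
        return hostPath;
    }

    public static void main(String[] args) {
        System.out.println(curate("/data/WIPP-plugins/imageCollections/{id}",
                "/data/WIPP-plugins"));
        // -> /data/inputs/imageCollections/{id}
    }
}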

Add guidelines for plugins

Add guidelines for developers and scientists about how to create and package a new plugin, and how to write the plugin descriptor, including currently supported data types.

Automatically lock input images collections during workflow submission

Description
Right now, a user can use an images collection that is unlocked as an input to a task in a workflow. Since it is unlocked, the collection could be modified or even deleted before, during or after the workflow execution, creating an incoherent data state.

Proposal
During workflow submission, if an input images collection is not locked, it should be automatically locked. This could be handled in the DataHandler (only for regular collections, not for virtual ones that do not exist yet).
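
A minimal sketch of the locking step, assuming the ImagesCollection entity exposes isLocked()/setLocked() and a standard Spring Data repository; the class and method names are illustrative, not the actual WIPP DataHandler.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

// Sketch only: lock an input collection of a workflow task if not locked yet.
@Component
public class ImagesCollectionLockingHandler {

    @Autowired
    private ImagesCollectionRepository imagesCollectionRepository; // assumed repository name

    public void lockIfNeeded(String imagesCollectionId) {
        imagesCollectionRepository.findById(imagesCollectionId).ifPresent(collection -> {
            if (!collection.isLocked()) {
                collection.setLocked(true);
                imagesCollectionRepository.save(collection);
            }
        });
    }
}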

Additional context
Same thing applies to CSV collections.

Pyramid visualizations (composition of layers and groups)

Description
Pyramid visualization is a data type in WIPP that allows the composition of groups and layers to overlay several pyramids in the same visualization.
For example: overlaying raw images with segmented masks to check segmentation accuracy.
Behind the scenes, it creates a WDZT manifest containing the groups and layers configured by the user.

Proposal
Add pyramid visualizations management code (model, repository, controllers, handlers,...) from WIPP 2.3

Mount input data as read-only volume

Description
Currently the whole WIPP data folder is mounted as a volume in the plugin containers.
This is not ideal, as the plugin code could modify or delete existing data, which is not a plugin's responsibility.
Only the temp job output folder should be mounted with write permissions, to avoid accidental loss of data.

Proposal

  • #10 should be implemented first
  • Volume mounts in container templates configured here will now have a readOnly property set to true or false (see Kubernetes doc)
  • Modify the WorkflowConverter class to manage input read-only volumes and output volumes for each task, instead of mounting a single volume

Jupyter Notebook datatype

Description

Use case: We would like our users to be able to quickly prototype new WIPP plugins, test new libraries and algorithms.

Problem: Our new Notebook WIPP plugin (https://github.com/LabShare/polus-plugins/tree/master/polus-notebook-plugin) uses a workaround to enable it to read notebooks. We manually create a notebooks folder in the root of the WIPP folder on the backend, copy-paste the notebook, and then provide the filename in the UI. A more robust solution is needed.

Proposal
Create a new datatype notebookCollection which will contain .ipynb files. There should be a way to upload a notebook from the UI and, more importantly, a way to create a new notebook collection through the API, so we can integrate that in JupyterLab.

Additional context

Image assembling plugin

Description
Create WIPP Image assembling plugin

Proposal
Image assembling plugin taking a stitching vector folder and image collection folder as input, outputting a collection of stitched images.

Additional context
Based on Image Assembling algorithm from WIPP 2.xx

Delete task from workflow

Description
User may want to delete a task while configuring a workflow (before submission of the workflow).

Proposal
Authorize the delete method for jobs.

Add filtering by name for workflows on the server side

Description
Add filtering by name for workflows on the server side in the WorkflowRepository interface.

Proposal
Implement the method findByNameContainingIgnoreCase in the WorkflowRepository interface, which returns the resulting pageable workflows filtered by name whenever the user triggers the filtering from the UI.
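
A sketch of the derived query, where Spring Data MongoDB generates the implementation from the method name; Workflow is the existing WIPP entity and the repository's other methods and annotations are omitted.

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.data.repository.query.Param;

// Sketch only: Spring Data derives the case-insensitive "name contains" query
// from the method name, so no implementation is needed.
public interface WorkflowRepository extends MongoRepository<Workflow, String> {

    Page<Workflow> findByNameContainingIgnoreCase(
            @Param("name") String name, Pageable pageable);
}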

Add output field to a job

Description
The output of a job is currently not stored in the database.
It is necessary to get the available outputs when defining a workflow in frontend, for example.

Proposal
Add an outputs field which is a list of the job outputs

Additional context

Add conventions for importing images and metadata files in ImagesCollectionDataHandler

Description
Right now, if a plugin outputs an image collection, all files from the output folder will be imported as images in the image collection. However, an image collection in WIPP consists of a set of images and a set of metadata files, so it makes sense to have a plugin generate these two types of files, and to have WIPP properly handle the metadata files in this case.

Proposal
We could add a convention for plugins generating image collections, specifying that the output folder will contain an images sub-folder and a metadata_files sub-folder, as follows:

outputCollectionFolder
├───images
│    image1.ome.tif
│    image2.ome.tif
│    ...
├───metadata_files
│    metadata-file1.xml

The importData method in ImagesCollectionDataHandler will have to be modified to look for these sub-folders and import the files into the proper sub-collection.
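
A possible sketch of the modified import logic; the import helper methods are assumptions, not the actual ImagesCollectionDataHandler API, and the backward-compatible fallback is only one option (see Additional context below).

import java.io.File;
import java.io.IOException;

// Illustrative sketch of the sub-folder convention described above.
public class ImagesCollectionImportSketch {

    public void importData(File jobOutputFolder, String imagesCollectionId) throws IOException {
        File imagesFolder = new File(jobOutputFolder, "images");
        File metadataFolder = new File(jobOutputFolder, "metadata_files");

        if (imagesFolder.isDirectory() || metadataFolder.isDirectory()) {
            // New convention: import each sub-folder into its sub-collection
            if (imagesFolder.isDirectory()) {
                importImages(imagesFolder, imagesCollectionId);
            }
            if (metadataFolder.isDirectory()) {
                importMetadataFiles(metadataFolder, imagesCollectionId);
            }
        } else {
            // Possible backward-compatible fallback: treat every file as an image
            importImages(jobOutputFolder, imagesCollectionId);
        }
    }

    private void importImages(File folder, String collectionId) { /* assumed helper */ }

    private void importMetadataFiles(File folder, String collectionId) { /* assumed helper */ }
}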

Additional context
Unless we decide to add backward compatibility, existing plugins generating image collections will have to be updated to support this convention.

cc @Nicholas-Schaub

Workflow copy

Description
Oftentimes, a user will run an analysis and want to duplicate it with another set of parameters or inputs. Right now, the only option is to recreate a similar workflow from scratch using the WIPP frontend, which can be time-consuming and error-prone. Ideally, users would like to be able to click a "copy" button on a workflow detail page (or call a "copy" endpoint from the REST API if scripting), and then get a new workflow in edit mode.

Proposal
Create a WorkflowCopyController that, from an existing workflow and a new provided name, will (see the sketch after this list):

  • create a new Workflow with the new name, new creation date and status CREATED
  • loop through the jobs of the source workflow and recreate the jobs with new ids, empty outputs, updated names (workflow name prefix update), and updated dependencies/chained inputs.
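
A sketch of the copy logic described above; the entity constructors, setters and repository query used below are assumptions for illustration and do not reflect the actual WIPP classes.

import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only.
public class WorkflowCopySketch {

    public Workflow copy(Workflow source, String newName,
                         WorkflowRepository workflowRepository,
                         JobRepository jobRepository) {
        // New workflow with new name, new creation date and status CREATED
        Workflow copy = new Workflow(newName);
        copy.setCreationDate(new Date());
        copy.setStatus(WorkflowStatus.CREATED);
        copy = workflowRepository.save(copy);

        // Map old job ids to new ones so that chained inputs
        // ({{ oldJobId.outputName }}) and dependencies can be rewritten
        Map<String, String> idMapping = new HashMap<>();
        List<Job> sourceJobs = jobRepository.findByWippWorkflow(source.getId()); // assumed query
        for (Job sourceJob : sourceJobs) {
            Job newJob = new Job();
            newJob.setWippWorkflow(copy.getId());
            newJob.setName(sourceJob.getName().replace(source.getName(), newName));
            newJob.setWippExecutable(sourceJob.getWippExecutable());
            newJob.setParameters(sourceJob.getParameters());
            newJob.setOutputs(null); // outputs must be empty in the copy
            newJob = jobRepository.save(newJob);
            idMapping.put(sourceJob.getId(), newJob.getId());
        }
        // A second pass over the new jobs (not shown) would use idMapping to
        // rewrite dependencies and chained inputs to point at the new job ids.
        return copy;
    }
}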

Additional context
The list of jobs is actually a DAG.

Add tags to image collections

Description
Users would like to be able to add tags to their image collections, so that they can then filter and search by tags in the image collection list view.

Proposal
Users should be able to add tags to collections, either by selecting existing ones or adding new ones. The image collection model will have a list of tags.
We will also need new search methods in the ImageCollectionRepository to search by tags.

Support import from AWS S3 bucket for image collections

Description
Sometimes, scientists store their data in Amazon S3 buckets. Currently, the only option to create an image collection (besides it being the result of a job) is to select files or folders from the user's machine and upload them through the frontend. For data stored in S3 buckets, that would mean downloading the data from S3 and then re-uploading it to WIPP, which is not ideal.
Instead, we could provide an option to import images directly from S3.

Proposal
From the image collection view, the user should have a new option "Import from Amazon S3 bucket", with a popup/modal displayed to configure the bucket to use, the folder to import, etc.
In the backend, WIPP will use the AWS SDK for Java to connect to S3 and download the files into the temp folder for that collection; the usual processing of ome.tif conversion and metadata extraction can then be applied, just like for uploaded files.
TBD: usage of AWS credentials (with required encryption) vs pre-signed URLs
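
A possible sketch of the S3 download step using the AWS SDK for Java (v1), assuming credentials are resolved by the default provider chain; the method name and temp folder handling are assumptions, not the actual WIPP code.

import java.io.File;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

// Illustrative sketch: list all objects under a prefix and download them into
// the collection temp folder.
public class S3ImportSketch {

    public void downloadPrefix(String bucket, String prefix, File collectionTempFolder) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        ListObjectsV2Request request = new ListObjectsV2Request()
                .withBucketName(bucket)
                .withPrefix(prefix);
        ListObjectsV2Result result;
        do {
            result = s3.listObjectsV2(request);
            for (S3ObjectSummary summary : result.getObjectSummaries()) {
                String key = summary.getKey();
                if (key.endsWith("/")) {
                    continue; // skip folder placeholders
                }
                File target = new File(collectionTempFolder, new File(key).getName());
                s3.getObject(new GetObjectRequest(bucket, key), target);
            }
            request.setContinuationToken(result.getNextContinuationToken());
        } while (result.isTruncated());
        // The usual OME-TIFF conversion and metadata extraction can then be
        // applied to the downloaded files, as for uploaded files.
    }
}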

Additional context
This feature has been requested by @Nicholas-Schaub and others from the Polus team. A workaround for S3 support was made in the form of a WIPP plugin.

Fix jackson-databind vulnerability

Description
A vulnerability has been found in the wipp-backend-argo-workflows repository.
Proposal
Upgrade jackson-databind library version from 2.9.5 to 2.9.10

Integration of AI based model training and inference as plugins

Description
Problem: Users cannot run AI-based model training and inference unless the modeling computations become easy to access, run, monitor, and interpret
Use cases: Users want to segment images of RPE cells (binary labels), concrete (multiple labels), and organelles (binary labels) with AI-based segmentation models (i.e., U-net, SegNet).
Benefits: By making the AI-based model training and inference easily accessible, users can become independent of computational experts
Goals: (1) take existing AI-based segmentation models in WIPP 2.3 (designed for RPE cell segmentation) and figure out a general mechanism for integrating this model and any other segmentation model (i.e., the code from Mike's collection of AI models)
(2) take NVIDIA DIGITS (Docker container deployment) and figure out the input and output data types that could facilitate the data exchange between WIPP and DIGITS

Proposal
  • Create a plugin for running training and inference of AI-based models developed at NIST (parameters: model type = U-net, SegNet, tile size, gradient method, gradient method parameters, ...)
  • Create a plugin for integrating NVIDIA DIGITS to WIPP

Additional context
The initial focus could be on semantic segmentation of 2D grayscale images

GPU support for plugins

Description

To get access to a GPU in Kubernetes, pods need to have a special tag added to their spec (https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#deploying-nvidia-gpu-device-plugin). That means it is something plugin authors can't enable from within the container; WIPP needs to use a slightly different Argo Workflow .yaml (argoproj/argo-workflows#577)

Proposal

  • Add a new field to the plugin manifest with an option to request GPU resources, e.g. gpu
  • If the option is set in the plugin manifest, WIPP-backend will add the following lines to the Argo Workflow .yaml container spec:
resources:
  limits:
    nvidia.com/gpu: 1

(the number of GPUs should be a configurable workflow parameter; a converter-side sketch follows)
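
A converter-side sketch of how the limit could be added when the manifest requests a GPU, with the container spec represented as nested maps before YAML serialization; the flag and method names are assumptions, not the actual WorkflowConverter code.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only.
public class GpuResourceSketch {

    @SuppressWarnings("unchecked")
    public static void addGpuLimitIfRequested(Map<String, Object> containerSpec,
                                              boolean gpuRequested, int gpuCount) {
        if (!gpuRequested) {
            return; // the flag would come from the new gpu option in the plugin manifest
        }
        Map<String, Object> resources = (Map<String, Object>)
                containerSpec.computeIfAbsent("resources", k -> new HashMap<String, Object>());
        Map<String, Object> limits = (Map<String, Object>)
                resources.computeIfAbsent("limits", k -> new HashMap<String, Object>());
        limits.put("nvidia.com/gpu", gpuCount); // gpuCount as a configurable workflow parameter
    }
}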

Convert images to tiled tiff at upload

Description
Images will be converted to tiled TIFF to ensure compatibility with high-performance libraries used to develop plugins, such as NIST FastImage. The agreed-on tile size is 1024.

Proposal
Conversion will be done once the upload of the image is completed, during the conversion to OME-TIFF, using the following options available from Bio-Formats:

// preserve the interleaving setting from the reader
writer.setInterleaved(reader.isInterleaved());
// set the tile width and height for writing (this is what makes the output tiled)
this.tileSizeX = writer.setTileSizeX(tileSizeX);
this.tileSizeY = writer.setTileSizeY(tileSizeY);

This means that we also have to modify the condition that sends images to OME-TIFF conversion: currently, images ending in ome.tif are not converted (see ImageUploadController); however, uploaded images could already be in OME-TIFF format but not tiled.

Additional context

Support upload of CSV collection

Description
Add the capability to create a new csvCollection by uploading a set of CSV files, similar to the way users can create new Stitching vectors.

Proposal
Create a CsvCollectionUploadController with an upload method that accepts a string (the desired name for the collection) and an array of files (MultipartFile[] files), and returns the newly created CsvCollection or an error if the name is empty or already taken.
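
A rough sketch of such a controller; the request mapping, the CsvCollection constructor and the findByName repository method are assumptions for illustration, not the actual WIPP API.

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

// Illustrative sketch only.
@RestController
public class CsvCollectionUploadController {

    private final CsvCollectionRepository csvCollectionRepository;

    public CsvCollectionUploadController(CsvCollectionRepository csvCollectionRepository) {
        this.csvCollectionRepository = csvCollectionRepository;
    }

    @PostMapping("/api/csvCollections/upload") // assumed mapping
    @ResponseStatus(HttpStatus.CREATED)
    public CsvCollection upload(@RequestParam("name") String name,
                                @RequestParam("files") MultipartFile[] files) {
        if (name == null || name.trim().isEmpty()
                || csvCollectionRepository.findByName(name) != null) { // assumed query method
            throw new ClientException("Collection name is empty or already taken.");
        }
        CsvCollection collection = new CsvCollection(name);
        // Store each uploaded CSV file into the collection folder (omitted here)
        return csvCollectionRepository.save(collection);
    }
}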

Additional context
The current restriction regarding file size is 5MB per file and 30MB per request, so we should raise the request limit to 50 or 100MB. For this first implementation, upload of large collections of large files will not be supported.

Add API documentation

Description
Although the WIPP REST API can be explored at /api, we are missing a proper API documentation that could ideally be fed to REST client generators.

Proposal
Create REST API documentation.

Additional context
Two solutions seem to be the most used for Spring REST API documentation:

  1. Springfox implementation of Swagger
  • Pros: OpenAPI specification, automatic generation of documentation, minimal configuration
  • Cons: the version with Spring 5 and Spring Boot 2 support is still a snapshot; springfox-swagger-ui-rfc6570 (for RFC 6570 URL templating support) is not updated yet
  2. Spring REST Docs
  • Pros: official Spring project, integrated with JUnit tests
  • Cons: forces to write tests/documentation for all endpoints, default format is AsciiDoc instead of OpenAPI

Conversion to ome tiff fails when converting an image with single plane larger than 2GB

Summary

Conversion to ome tiff at upload fails when converting an image with single plane larger than 2GB

What is the current bug behavior?

An attempt to convert an image to ome format with single plane dimensions of 46912 x 48320 (width x height) resulted in the following error:

java.io.IOException: Cannot convert image to OME TIFF.
	at gov.nist.itl.ssd.wipp.backend.data.imagescollection.images.ImageUploadController.convertToTiledOmeTiff(ImageUploadController.java:188) [wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at gov.nist.itl.ssd.wipp.backend.data.imagescollection.images.ImageUploadController.doSubmit(ImageUploadController.java:148) [wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at gov.nist.itl.ssd.wipp.backend.data.imagescollection.images.ImageUploadController.lambda$submitImageToExtractor$0(ImageUploadController.java:138) [wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_212]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_212]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_212]
Caused by: loci.formats.FormatException: Image plane too large. Only 2GB of data can be extracted at one time. You can work around the problem by opening the plane in tiles; for further details, see: https://docs.openmicroscopy.org/bio-formats/6.0.1/about/bug-reporting.html#common-issues-to-check
	at loci.formats.FormatReader.openBytes(FormatReader.java:878) ~[formats-api-6.0.1.jar!/:6.0.1]
	at loci.formats.FormatReader.openBytes(FormatReader.java:855) ~[formats-api-6.0.1.jar!/:6.0.1]
	at loci.formats.ImageReader.openBytes(ImageReader.java:436) ~[formats-api-6.0.1.jar!/:6.0.1]
	at gov.nist.itl.ssd.wipp.backend.data.utils.tiledtiffs.TiledOmeTiffConverter.readWriteTiles(TiledOmeTiffConverter.java:92) ~[wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	at gov.nist.itl.ssd.wipp.backend.data.imagescollection.images.ImageUploadController.convertToTiledOmeTiff(ImageUploadController.java:185) [wipp-backend-data-3.0.0-SNAPSHOT.jar!/:na]
	... 7 common frames omitted
Caused by: java.lang.IllegalArgumentException: Array size too large: 46912 x 48320 x 1 x 2
	at loci.common.DataTools.safeMultiply32(DataTools.java:1286) ~[ome-common-6.0.0.jar!/:6.0.0]
	at loci.common.DataTools.allocate(DataTools.java:1259) ~[ome-common-6.0.0.jar!/:6.0.0]
	at loci.formats.FormatReader.openBytes(FormatReader.java:875) ~[formats-api-6.0.1.jar!/:6.0.1]
	... 11 common frames omitted

Reported by @Nicholas-Schaub

Single Image download from output Image collections (Image assembling plugin)

Summary

Single Image download from output Image collections (created with the image assembling plugin) is not working properly.

What is the current bug behavior?

When the user downloads a single image from an image collection (created with the image assembling plugin), the downloaded image is not the expected output when opened with ImageJ's Bio-Formats plugin. However, when the user downloads the whole image collection as a zip archive, unzips it and then opens the images within with ImageJ's Bio-Formats plugin, the output images are correct (the expected output).

What is the expected correct behavior?

Downloading a single image should give the same image coming from the downloaded image collection archive.

Steps to reproduce

Download a single image from an image collection created with the image assembling plugin and open with ImageJ's Bioformats plugin.

Get Argo workflow name after submission

Description
When we submit a workflow with the argo submit command line, Argo generates a unique name by adding 5 alphanumeric characters at the end of the generateName specified in the workflow YAML metadata section.
Example: currently, if we create a workflow myworkflow, Argo will add 5 alphanumeric characters at the end of the name to make it unique, i.e. myworkflow123ab.
We would like to get this unique name and store it so we can link the Argo UI monitoring page for the workflow from the WIPP frontend.

Proposal

  • in WorkflowConverter add "-" at the end of the generateName (this is standard for this field in Argo, so that there is a separation between name and generated ID)
  • add generatedName attribute to Workflow class
  • in WorkflowConverter, get the name generated by Argo (using the --output name option when submitting the workflow; a sketch follows, and the help of argo submit is shown below)
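
A sketch of capturing the generated name; the --output name flag is documented in the argo submit help under Additional context, while the Java wiring around it is an assumption, not the actual WorkflowConverter code.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Illustrative sketch: submit the workflow and read back the generated name.
public class ArgoSubmitSketch {

    public static String submitAndGetGeneratedName(String workflowYamlPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "argo", "submit", workflowYamlPath, "--output", "name")
                .redirectErrorStream(true)
                .start();
        String generatedName;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            generatedName = reader.readLine(); // e.g. "myworkflow123ab"
        }
        process.waitFor();
        return generatedName;
    }
}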

Additional context

$ argo submit --help
submit a workflow

Usage:
  argo submit FILE1 FILE2... [flags]

Flags:
      --entrypoint string       override entrypoint
      --generate-name string    override metadata.generateName
  -h, --help                    help for submit
      --instanceid string       submit with a specific controller's instance id label
      --name string             override metadata.name
  -o, --output string           Output format. One of: name|json|yaml|wide
  -p, --parameter stringArray   pass an input parameter
  -f, --parameter-file string   pass a file containing all input parameters
      --serviceaccount string   run all pods in the workflow using specified serviceaccount
      --strict                  perform strict workflow validation (default true)
  -w, --wait                    wait for the workflow to complete
      --watch                   watch the workflow until it completes

Global Flags:
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --kubeconfig string              Path to a kube config. Only required if out-of-cluster
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
      --server string                  The address and port of the Kubernetes API server
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server

Support for tabular inputs

Description
Problem: we cannot add WIPP 3.0 plugins that would be operating on tabular data created, for example, by image-based feature extractors.
Use case: we have a Docker container for NIST DataPlot software that could be turned into a WIPP 3.0 plugin if we had the support for tabular data
Benefits: We will be able to support processing outputs from all feature extraction libraries in WIPP 2.3 and add all functionalities in DataPlot
Goals: create a data type for tabular data

Proposal
Add the data type support for tabular data

Additional context
  • Decide whether to support tabular data in CSV format and/or the DataPlot .DAT format.
  • Consider upload of tabular data in addition to image-based generation of tabular data.

Add Pyramid data type

Description
Pyramid building is a basic feature of WIPP. Therefore, there should be a pyramid data type that could be handled by the core of WIPP and then sent as a parameter to the Pyramid Building plugin.

Proposal
A pyramid data type must be created so it can be passed as an input or output parameter to the appropriate plugin (e.g. pyramid building, image assembly, feature extraction). It should appear in the workflow YAML file when needed, just like the Image Collection or Stitching Vector types.

Convert WIPP 3.0 plugin for converting WDZT annotations to masks

Description
Problem: Users can create image annotations using the WDZT visualization and download a JSON file per image. However, users cannot use the annotated images for AI model training unless a plugin for converting annotations to image masks is available.
Use cases: AI-based segmentation for RPE cells, concrete samples, and organelle videos cannot be executed unless the manual annotations provided by experts are converted to training image masks.
Benefits: users will be able to annotate and train AI-based models in WIPP 3.0
Goals: create a WIPP 3.0 plugin that takes a collection of JSON files (annotations from a collection of images) and generates a collection of image masks. The plugin will have user-specified options whether the masks should follow color, text, or shape attributes of annotations when assigning semantic labels.

Proposal
Create a WIPP 3.0 plugin with the conversion code.

Additional context
Starting point is in https://gitlab.nist.gov/gitlab/WIPP/WIPP-annotation2mask-plugin

Add notes/metadata to image collection

Description
Users would like to have a field where they can save some notes about the image collection, and then be able to search collections based on that text.

Proposal
A text field that the user can fill in and modify if needed with their notes.
Search capabilities for collections based on that text.

Additional context
TBD: search capabilities on the list of metadata files

Unsupported/Default Data type

Description
Right now, WIPP supports a fixed set of data types (images collections, stitching vectors, pyramids, csv files, etc.). As the system evolves, we add more data types, but cannot always keep up with requests from users, or may not want to add some "niche" data types that would not be used by a variety of users.

Proposal
As a first step towards a more flexible way of supporting more data types in WIPP, we want to introduce a new generic data type that would consist of:

  • a folder of files
  • an optional metadata file present in this folder

Proposition for the metadata file:

  • JSON format
  • standard name: data-info.json
  • structure:
{
 "type": "string",
 "description": "string",
 "metadata": "object"
}

If the data-info.json file is present in the folder, the WIPP backend will parse it, extract type and description as strings, and store the metadata part as raw JSON in the database during the data import. From the WIPP frontend, the user will be able to download the folder of data and see the type, description, and metadata (in highlighted JSON format) if present.
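
A possible sketch of that parsing step using Jackson (already a dependency of the backend); the GenericData entity and its setters are assumptions for illustration, not an existing WIPP class.

import java.io.File;
import java.io.IOException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Illustrative sketch: read the optional data-info.json and populate the entity.
public class DataInfoParserSketch {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void applyDataInfo(File dataFolder, GenericData genericData) throws IOException {
        File dataInfo = new File(dataFolder, "data-info.json");
        if (!dataInfo.exists()) {
            return; // the metadata file is optional
        }
        JsonNode root = MAPPER.readTree(dataInfo);
        genericData.setType(root.path("type").asText(null));
        genericData.setDescription(root.path("description").asText(null));
        // Store the metadata object as raw JSON in the database
        genericData.setMetadata(root.path("metadata").toString());
    }
}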

The idea behind having a JSON metadata file parsed by the WIPP backend is:

  • third-party apps (such as Polus ones) that can handle these new types can use the metadata as needed,
  • if some of these types become "standard" WIPP data types later on, we can retroactively recognize them in the WIPP database
  • the user gets some extra information, and we can add a "search by type name" method to filter the list table in the frontend, for example

For now we are not thinking about adding upload capabilities for these unsupported data types; they will be generated as outputs of plugins and could be reused as inputs as well. The folders will be downloadable.

Tentative name in WIPP: Generic Data

Additional context
@Nicholas-Schaub and @sunnielyu, feedback is welcome

Improve job creation handling

Description
The following things should happen when a new job is created:

  • check that name is not null and is unique
  • check that id is null
  • check that wippExecutable is not null
  • set startTime, endTime and error to null
  • set wippVersion to coreConfig.getWippVersion()

Proposal
All these checks should happen in the JobEventHandler class (wipp-backend-core), in the handleBeforeCreate(Job job) method.
The checks should happen first (before setting the other properties), and a ClientException with an appropriate message should be thrown if a check fails.
See example in WIPP 2.3:

@HandleBeforeCreate
    public void handleBeforeCreate(ImageProcessingJob job) {
        if (job.getId() != null) {
            throw new ClientException(
                    "Do not specify an id when creating a new job.");
        }
        if (job.getName() == null) {
            throw new ClientException("You must specify a job name.");
        }
        job.setCreationDate(new Date());
        job.setStatus(JobStatus.SUBMITTED);
        job.setStartTime(null);
        job.setEndTime(null);
        job.setError(null);
        job.setWippVersion(coreConfig.getWippVersion());
    }

Additional context
Check in frontend handling of error thrown by backend

Support for TensorFlow models

Description
In order to properly support training and inference plugins using TensorFlow, we need to have a "trained model" data type (similar to the one we had in WIPP 2.3)

Proposal
Create a new data type Trained model with handlers for importing trained models from training jobs and display relevant information to user.

Additional context
@mmajurski waiting for your input about what the structure of the training job output looks like exactly and what the user needs to see before we define the model for this new data type.
Tensorboard linking TBD.
Exact name for this data type can be discussed as well.

cc @mohamedOuladi

MaxUploadSizeExceededException while uploading 1.9MB stitching vector

Summary

MaxUploadSizeExceededException while uploading 1.9MB stitching vector

What is the current bug behavior?

Uploading a new stitching vector with a file size > 1MB throws an exception:
org.springframework.web.multipart.MaxUploadSizeExceededException: Maximum upload size exceeded

What is the expected correct behavior?

No exception is thrown and file is properly handled if its size is less than 3MB.

Steps to reproduce

Data -> Stitching vectors -> New stitching vector (choose file > 1MB)

Relevant screenshots and/or logs

2019-07-15 09:57:26.911 ERROR 77759 --- [nio-8080-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.web.multipart.MaxUploadSizeExceededException: Maximum upload size exceeded; nested exception is java.lang.IllegalStateException: org.apache.tomcat.util.http.fileupload.FileUploadBase$FileSizeLimitExceededException: The field file exceeds its maximum permitted size of 1048576 bytes.] with root cause

org.apache.tomcat.util.http.fileupload.FileUploadBase$FileSizeLimitExceededException: The field file exceeds its maximum permitted size of 1048576 bytes.

Possible fixes

This is coming from these two properties in the Spring application.properties:

# Image upload - Flow.js configuration
multipart.maxFileSize=3MB
multipart.maxRequestSize=30MB

In the Spring Boot 2 convention, they should be changed to:

spring.servlet.multipart.maxFileSize=3MB
spring.servlet.multipart.maxRequestSize=30MB

See this thread on Stackoverflow: https://stackoverflow.com/questions/28572700/i-am-trying-to-set-maxfilesize-but-it-is-not-honored/28572901#28572901

These properties are used by the FlowJS upload controller, so testing should be carefully done to ensure no issues are introduced in the image upload.

Support for flexVolumes (Rook/CephFS setup)

Description
For dev and testing environments we are using hostPath volumes in our single-node Kubernetes clusters. When a workflow is submitted, the workflow YAML file generated by WIPP-backend contains definitions of new hostPath volumes for input data (read-only) and output data (read/write). Here, for example, is a workflow with two jobs:

volumes:
  - name: "data-volume-input"
    hostPath:
      path: "/data/WIPP-plugins"
  - name: "data-volume-output-5d08443a8870c7246a98b8fe"
    hostPath:
      path: "/data/WIPP-plugins/temp/jobs/5d08443a8870c7246a98b8fe"
  - name: "data-volume-output-5d0844278870c7246a98b8fd"
    hostPath:
      path: "/data/WIPP-plugins/temp/jobs/5d0844278870c7246a98b8fd"

Then volume mounts are defined for each plugin template like so:

volumeMounts:
  - mountPath: "/data/inputs"
    name: "data-volume-input"
    readOnly: true
  - mountPath: "/data/outputs/5d08443a8870c7246a98b8fe"
    name: "data-volume-output-5d08443a8870c7246a98b8fe"
    readOnly: false

However, this is not a viable setup for production deployments, which will often run on a multi-node cluster, potentially hosted in the cloud (first target is AWS), with more sophisticated volume specifications.

Proposal
Ideally, we want to be volume-type agnostic, i.e. the type of volume used for hosting WIPP data does not need to be known by the WIPP-backend in order to generate the workflow YAML file.
For that we could use Kubernetes Persistent Volumes and Persistent Volume Claims (PVC), with the following proposal:

  • the WIPP data volume is defined as a Persistent Volume, type and specifications of volume are defined here
  • a matching WIPP data PVC is created,
  • the WIPP-backend pod will use this PVC during deployment to access the WIPP data volume,
  • Argo workflow YAML files will reference this PVC as well:
volumes:
  - name: "data-volume-input"
    persistentVolumeClaim:
      claimName: "wippdata-pvc"
  • plugin templates:
volumeMounts:
  - mountPath: "/data/inputs"
    name: "data-volume-input"
    readOnly: true
  - mountPath: "/data/outputs/5d084e188870c72e14a24499"
    name: "data-volume-input"
    readOnly: false
    subPath: "temp/jobs/5d084e188870c72e14a24499"

This also simplifies the YAML generation.

This may not support all volume types, as some types may not support the ReadWriteMany access mode, which will then become an issue when the workflow tasks are not executed on the same node where the volume is.

Additional context
cc @ktaletsk please feel free to add any additional information, since you are testing with flexVolumes
