funcx-faas / funcx-container-service
License: Apache License 2.0
#Problem
The current `main` branch is no longer building.
It looks like a library is failing the safety check. Fix this and ensure that the Docker image is published.
We need to research exactly how tasks are scheduled and run in the FastAPI service.
Does it depend on uvicorn settings?
As a DLHub user I want to submit a specification so my container can be built for me
We will add a new endpoint where a user can provide a Repo2Docker style specification for a new container to be built.
Payload to the new POST endpoint `/containers/build/`:
{
  "type": "docker",
  "specification": {
    "apt": ["apt package 1", "apt-package 2"],
    "pip": ["pypi library", "pypi library==specific version"],
    "conda": ["conda dependency", "conda dependency==specific version"]
  }
}
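A minimal sketch of how the service might check this payload before creating a container record. The function name `validate_build_payload` and the choice to treat missing package lists as empty are assumptions for illustration, not part of the actual service.

```python
from typing import Any, Dict, List

# The three package sources named in the payload above.
SPEC_KEYS = ("apt", "pip", "conda")


def validate_build_payload(payload: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return a normalized specification, or raise ValueError (hypothetical helper)."""
    if payload.get("type") != "docker":
        raise ValueError('"type" must be "docker"')

    spec = payload.get("specification")
    if not isinstance(spec, dict):
        raise ValueError('"specification" must be a JSON object')

    normalized: Dict[str, List[str]] = {}
    for key in SPEC_KEYS:
        # Assumption: an absent key means "no packages of this kind".
        packages = spec.get(key, [])
        if not isinstance(packages, list) or not all(
            isinstance(p, str) for p in packages
        ):
            raise ValueError(f'"{key}" must be a list of strings')
        normalized[key] = list(packages)
    return normalized
```

A caller would persist the returned specification with the new container record and reject the request when `ValueError` is raised.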
Given I have a valid docker container specification JSON When I submit to the WebService Then I receive a container UUID and a container record is created and the specification is persisted with a status of Submitted.
- `type` must be "docker"
- `specification` property to `Container` model: JSON container build specification
- `status` property to the `Container` model. Valid states are "Submitted", "Building", "Build Failed", "Ready"
As a container Service User I want to re-use identical containers so I can conserve resources
- `ready` then just return the container UUID of the previous container request
Given I have successfully built a container for a spec When I submit a container build request for a functionally identical spec Then the container service is not invoked And I receive the UUID of the previously built container
Given I have successfully built a container for a spec When I submit a container build request for a different spec Then the container service is invoked And I receive a new UUID for the new container
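One way to decide whether two specs are "functionally identical" is to hash a canonical form of the specification. The sketch below assumes that package order, duplicate entries, and empty lists do not change the resulting image; that assumption, along with the names `spec_fingerprint` and `ContainerRegistry`, is illustrative and not taken from the service. It also marks a container as ready as soon as the build callback returns, which a real asynchronous build would not do.

```python
import hashlib
import json
from typing import Callable, Dict, List
from uuid import uuid4

Spec = Dict[str, List[str]]


def spec_fingerprint(spec: Spec) -> str:
    """Hash a canonical form of the spec (sorted, de-duplicated, empties dropped)."""
    canonical = {key: sorted(set(values)) for key, values in spec.items() if values}
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class ContainerRegistry:
    """In-memory stand-in for the container table (hypothetical)."""

    def __init__(self, build: Callable[[str, Spec], None]) -> None:
        self._build = build
        self._ready_by_fingerprint: Dict[str, str] = {}

    def request_build(self, spec: Spec) -> str:
        fingerprint = spec_fingerprint(spec)
        existing = self._ready_by_fingerprint.get(fingerprint)
        if existing is not None:
            # Identical spec already built: the container service is not invoked.
            return existing
        container_id = str(uuid4())
        self._build(container_id, spec)
        # Simplification: treat the container as ready once build() returns.
        self._ready_by_fingerprint[fingerprint] = container_id
        return container_id
```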
It's hard to specify the python version of the built container.
You add this to the container spec:
conda=[
"python=3.10"
]
It would be better if the SDK took care of this and you can externalize the specification
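As an illustration of what "the SDK takes care of this" could mean, the sketch below pins the Python version in the `conda` list on the caller's behalf. The helper name `with_python_version` is hypothetical, and it only recognizes existing pins written as `python=...` or `python==...`.

```python
from typing import Dict, List


def with_python_version(spec: Dict[str, List[str]], version: str) -> Dict[str, List[str]]:
    """Return a copy of spec whose conda list pins python to the given version."""
    conda = [
        dep
        for dep in spec.get("conda", [])
        # Drop any existing python pin; packages such as python-dateutil are kept.
        if dep.split("=", 1)[0].strip().lower() != "python"
    ]
    conda.insert(0, f"python={version}")

    updated = {key: list(values) for key, values in spec.items()}
    updated["conda"] = conda
    return updated
```

The input spec is left unmodified, so the same externalized specification can be reused with different Python versions.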
As a DLHub user I want the container service to build a docker image based on my specification so I can find an image to run my model
Given I have a valid docker container specification JSON When I submit to the WebService Then the container service starts the build and the status of the container record will be Building
- `/build` operation. The payload will include the UUID of the container along with the container type (always docker) and the specification.
- `/containers/<uuid>/status` PUT operation to allow the Container service to update the status
Container Service builds should be run in Kubernetes jobs so we can scale
As a funcX function author I want the container service to build a Dockerfile based on simple specifications
- `docker build` this Dockerfile I get a valid image
For security purposes we don't want to allow the container service to be able to access any other components in the funcX stack. There is a potential vulnerability due to docker build running as root.
We already have a node tag for the specially prepared nodes that can run container service. Update the helm chart so the other services have a nodeSelector that excludes that tag.
There is a dangling dependency on SQLAlchemy in requirements.txt - this library now has a vulnerability and is failing the safety check. It's not a good idea to have libraries floating around that we don't need.
Remove this and see if there are other unused libraries.
Also look at the Dockerfile to see if we need the apt packages that are installed:
RUN apt-get update && \
apt-get install -y gcc musl-dev && \
apt-get install -y postgresql libffi-dev g++ make git
As a funcX function author I want the container service to operate asynchronously so I can get on with my work while the container builds
Add a /version endpoint to the service so the WebService can interrogate the version. It will also be used as a healthz check.
- `/version` endpoint
- `version.py`
{
version: 1.0
}
The container build status REST endpoint only makes sense for containers that were built by the container service. Right now, if the endpoint is invoked on a manually created container it throws a `ContainerNotFound` exception, which is not exactly correct or easy to make sense of.
Create a new exception class `ContainerStatusNotValid` to report this correctly.
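A minimal sketch of how the two exceptions could be told apart. It assumes that a manually created container is stored without a build status; that assumption, the dictionary-based store, and the function `get_build_status` are illustrative only.

```python
from typing import Dict, Optional


class ContainerNotFound(Exception):
    """No container record exists for the given UUID."""


class ContainerStatusNotValid(Exception):
    """The container exists but was not built by the container service."""


def get_build_status(
    containers: Dict[str, Dict[str, Optional[str]]], container_id: str
) -> str:
    try:
        record = containers[container_id]
    except KeyError:
        raise ContainerNotFound(container_id) from None

    status = record.get("status")
    if status is None:
        # Assumption: manually created containers carry no build status.
        raise ContainerStatusNotValid(
            f"container {container_id} has no build status"
        )
    return status
```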
As a container service user I want to be able to use conda packages that are not available in the default channel so I can get the libraries I need
- `conda_channels` property to the container_spec. It will be an optional list of conda channels
- `channels` property in the generated `environment.yaml` file
We need to deploy an instance of funcX that supports the container service into AWS for widespread testing.
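For the `conda_channels` story above, the sketch below shows one way the optional channel list could be carried into a generated `environment.yaml`. The function `render_environment_yaml` is hypothetical and writes the file as plain text rather than through a YAML library; only the `channels` and `dependencies` keys of the conda environment file format are used.

```python
from typing import Dict, List


def render_environment_yaml(spec: Dict[str, List[str]]) -> str:
    """Render conda channels and dependencies as environment.yaml text."""
    lines: List[str] = []

    # conda_channels is optional: omit the channels section when it is absent or empty.
    channels = spec.get("conda_channels") or []
    if channels:
        lines.append("channels:")
        lines.extend(f"  - {channel}" for channel in channels)

    lines.append("dependencies:")
    lines.extend(f"  - {dep}" for dep in spec.get("conda", []))
    return "\n".join(lines) + "\n"
```

With `{"conda_channels": ["conda-forge"], "conda": ["python=3.10"]}` the output lists `conda-forge` under `channels` before the dependencies; without `conda_channels` the section is left out and conda falls back to its configured defaults.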
Add code coverage reporting to the build.
Increase coverage to at least 80%.
Kevin ran across container_size (in funcx-services/web-service/funcx_web_service/schemas/container.py, specified in ContainerBuildStatusUpdate):
class ContainerBuildStatusUpdate(BaseModel):
...
container_size: float = Field(default=0.0, description="Container size in bytes")
...
That is, the description lists "bytes" as the unit, but the data type is listed as a float? Does that seem correct to you? I would rather expect that to be a non-negative integer, like:
container_size: NonNegativeInt = Field(default=0, description="...")
But that matches the spec in the funcX Container Service code, which is, currently:
class CompletionSpec(BaseModel):
...
container_size: float = 0
...
As a funcX developer I want to deploy the container service via helm chart so I can easily deploy the full stack
- `values.yaml` that enables the container service
- `kubectl port-forward`
Notes:
Start by perfecting the deployment of a dev environment following the instructions in the helm chart repo
This should just be new deployment and service templates. Hopefully the service can be further configured with a config file mounted in the pod in the same way
As an author of funcX functions I want there to be a service to build custom containers so I can offer functions that run across compute environments
Microservices in funcX may need to communicate with the funcX app, however they do not have any credentials to access the endpoints.
Make these REST endpoints free from any auth checks and only allow access from inside the Kubernetes cluster.
Update the Ingress to add a `/v2/internal` route.
- path: /v2/internal
  pathType: Prefix
  backend:
    service:
      name: {{ .Values.app.ingress.defaultBackend }}
      port:
        number: 80
This causes any attempts to reach these endpoints to route users to an error page.
Update the app and the container service to use this new route
When the container service build fails, no status update is sent to the WebService. The status of the build remains "Building".
As a funcX administrator I want the container service to persist container build requests so I can understand the types of containers being used by deployed functions
Set it initially to 30 minutes. Should be a config option that can be set in the helm chart
We want to explicitly allow certain globus users to build containers and not allow anyone else.
Protect the buildContainer endpoint with a globus group
Docker images are built by the container service and are kept in the local docker image registry forever.
Add a cron job to the FastAPI server to occasionally issue a `docker system prune --all` command to clear old layers out. This will make the next build slow since we lose the expensive repo2docker base image.
Make the timing of the purge job configurable through the FastAPI config file.
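A rough sketch of a periodic prune task with a configurable interval, written against `asyncio` only. The names `prune_periodically` and `run_prune` and the `max_runs` parameter are assumptions for illustration. The `--force` flag is added here so the command does not wait for interactive confirmation; the issue text itself only mentions `--all`.

```python
import asyncio
import subprocess
from typing import Callable, Optional

PRUNE_COMMAND = ["docker", "system", "prune", "--all", "--force"]


def run_prune() -> int:
    """Run the prune command and return its exit code (blocking call)."""
    return subprocess.run(PRUNE_COMMAND, check=False).returncode


async def prune_periodically(
    interval_seconds: float,
    runner: Callable[[], object] = run_prune,
    max_runs: Optional[int] = None,
) -> int:
    """Call runner every interval_seconds; max_runs=None means run forever."""
    loop = asyncio.get_running_loop()
    runs = 0
    while max_runs is None or runs < max_runs:
        await asyncio.sleep(interval_seconds)
        # Run the blocking command in a worker thread so the event loop stays responsive.
        await loop.run_in_executor(None, runner)
        runs += 1
    return runs
```

In a FastAPI application this coroutine could be started from a startup hook with `asyncio.create_task(prune_periodically(interval))`, with `interval` read from the service config file.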
There are a few error conditions that end with a sys.exit(-1).
These will cause the entire server to stop. Remove
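One possible shape for the replacement, sketched under the assumption that each `sys.exit(-1)` sits inside a per-request build path: raise a dedicated exception instead and let the request handler turn it into a failed build status. `BuildError`, `build_container`, and `handle_build_request` are hypothetical names, not the service's own.

```python
from typing import Any, Callable, Dict


class BuildError(Exception):
    """A single container build cannot proceed; the server keeps running."""


def build_container(spec: Dict[str, Any], run_build: Callable[[Dict[str, Any]], str]) -> str:
    if not spec:
        # Where the code previously called sys.exit(-1), raise instead.
        raise BuildError("empty container specification")
    try:
        return run_build(spec)
    except OSError as exc:
        raise BuildError(f"build tooling failed: {exc}") from exc


def handle_build_request(
    spec: Dict[str, Any], run_build: Callable[[Dict[str, Any]], str]
) -> Dict[str, str]:
    """Convert a build failure into a status payload rather than stopping the process."""
    try:
        image = build_container(spec, run_build)
    except BuildError as exc:
        return {"status": "Build Failed", "error": str(exc)}
    return {"status": "Ready", "image": image}
```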
Tested
As a container Service user I want a provided zip file bundled into the image so I can use assets that are otherwise not installable
Want to see how it performs when several requests are simultaneously submitted to the service