replicate / cog

Containers for machine learning

Home Page: https://cog.run

License: Apache License 2.0

Topics: containers cuda deep-learning docker machine-learning pytorch tensorflow ai

cog's Introduction

Cog: Containers for machine learning

Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container.

You can deploy your packaged model to your own infrastructure, or to Replicate.

Highlights

  • 📦 Docker containers without the pain. Writing your own Dockerfile can be a bewildering process. With Cog, you define your environment with a simple configuration file and it generates a Docker image with all the best practices: Nvidia base images, efficient caching of dependencies, installing specific Python versions, sensible environment variable defaults, and so on.

  • 🤬 No more CUDA hell. Cog knows which CUDA/cuDNN/PyTorch/TensorFlow/Python combos are compatible and will set it all up correctly for you.

  • ✅ Define the inputs and outputs for your model with standard Python. Then, Cog generates an OpenAPI schema and validates the inputs and outputs with Pydantic.

  • 🎁 Automatic HTTP prediction server: your model's types are used to dynamically generate a RESTful HTTP API using FastAPI.

  • 🥞 Automatic queue worker. Long-running deep learning models or batch processing is best architected with a queue. Cog models do this out of the box. Redis is currently supported, with more in the pipeline.

  • ☁️ Cloud storage. Files can be read and written directly to Amazon S3 and Google Cloud Storage. (Coming soon.)

  • 🚀 Ready for production. Deploy your model anywhere that Docker images run. Your own infrastructure, or Replicate.

How it works

Define the Docker environment your model runs in with cog.yaml:

build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.12"
  python_packages:
    - "torch==2.3"
predict: "predict.py:Predictor"

Define how predictions are run on your model with predict.py:

from cog import BasePredictor, Input, Path
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(self,
          image: Path = Input(description="Grayscale input image")
    ) -> Path:
        """Run a single prediction on the model"""
        # preprocess and postprocess stand in for your own helper functions
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)

Now, you can run predictions on this model:

$ cog predict -i image=@input.jpg
--> Building Docker image...
--> Running Prediction...
--> Output written to output.jpg

Or, build a Docker image for deployment:

$ cog build -t my-colorization-model
--> Building Docker image...
--> Built my-colorization-model:latest

$ docker run -d -p 5000:5000 --gpus all my-colorization-model

$ curl http://localhost:5000/predictions -X POST \
    -H 'Content-Type: application/json' \
    -d '{"input": {"image": "https://.../input.jpg"}}'

Why are we building this?

It's really hard for researchers to ship machine learning models to production.

Part of the solution is Docker, but it is so complex to get it to work: Dockerfiles, pre-/post-processing, Flask servers, CUDA versions. More often than not the researcher has to sit down with an engineer to get the damn thing deployed.

Andreas and Ben created Cog. Andreas used to work at Spotify, where he built tools for building and deploying ML models with Docker. Ben worked at Docker, where he created Docker Compose.

We realized that, in addition to Spotify, other companies were also using Docker to build and deploy machine learning models. Uber and others have built similar systems. So, we're making an open source version so other people can do this too.

Hit us up if you're interested in using it or want to collaborate with us. We're on Discord, or you can email us at [email protected].

Prerequisites

  • macOS, Linux, or Windows 11. Cog works on macOS, Linux, and Windows 11 with WSL 2.
  • Docker. Cog uses Docker to create a container for your model. You'll need to install Docker before you can run Cog. If you install Docker Engine instead of Docker Desktop, you will need to install Buildx as well.

Install

If you're using macOS, you can install Cog using Homebrew:

brew install cog

You can also download and install the latest release using our install script:

# fish shell
sh (curl -fsSL https://cog.run/install.sh | psub)

# bash, zsh, and other shells
sh <(curl -fsSL https://cog.run/install.sh)

# download with wget and run in a separate command
wget -qO- https://cog.run/install.sh
sh ./install.sh

You can manually install the latest release of Cog directly from GitHub by running the following commands in a terminal:

sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

Alternatively, you can build Cog from source and install it with these commands:

make
sudo make install

Or, if you're installing inside a Dockerfile:

RUN sh -c "INSTALL_DIR=\"/usr/local/bin\" SUDO=\"\" $(curl -fsSL https://cog.run/install.sh)"

Upgrade

If you're using macOS and you previously installed Cog with Homebrew, run the following:

brew upgrade cog

Otherwise, you can upgrade to the latest version by running the same commands you used to install it.

Need help?

Join us in #cog on Discord.

Contributors ✨

Thanks goes to these wonderful people (emoji key):

Ben Firshman 💻 📖
Andreas Jansson 💻 📖 🚧
Zeke Sikelianos 💻 📖 🔧
Rory Byrne 💻 📖 ⚠️
Michael Floering 💻 📖 🤔
Ben Evans 📖
shashank agarwal 💻 📖
VictorXLR 💻 📖 ⚠️
hung anna 🐛
Brian Whitman 🐛
JimothyJohn 🐛
ericguizzo 🐛
Dominic Baggott 💻 ⚠️
Dashiell Stander 🐛 💻 ⚠️
Shuwei Liang 🐛 💬
Eric Allam 🤔
Iván Perdomo 🐛
Charles Frye 📖
Luan Pham 🐛 📖
TommyDew 💻
Jesse Andrews 💻 📖 ⚠️
Nick Stenning 💻 📖 🎨 🚇 ⚠️
Justin Merrell 📖
Rurik Ylä-Onnenvuori 🐛
Youka 🐛
Clay Mullis 📖
Mattt 💻 📖 🚇
Eng Zer Jun ⚠️
BB 💻
williamluer 📖
Simon Eskildsen 💻
F 🐛 💻
Philip Potter 🐛 💻
Joanne Chen 📖
technillogue 💻
Aron Carroll 📖 💻 🤔
Bohdan Mykhailenko 📖 🐛
Daniel Radu 📖 🐛
Itay Etelis 💻
Gennaro Schiano 📖

This project follows the all-contributors specification. Contributions of any kind welcome!


cog's Issues

Predict shouldn't have network access

It's common for models to download weights in the setup() function. This isn't reproducible (models might run without a network connection and weights files can disappear from the internet) so we should discourage it. In cases where you actually need network access we can provide a config option to allow the model to hit the network.

The tricky bit is allowing incoming access for the HTTP server while disallowing outgoing connections. On a cursory search, there seems to be no simple way to do this without iptables rules on the host, or attaching the container to a private network and using another container as a proxy. Some creativity might be needed.

As a start, perhaps we could bodge it inside the container. That way we can guarantee it isn't downloading any files for reproducibility reasons, but doesn't have any security guarantees.
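
For illustration, here's a minimal sketch of what that in-container bodge could look like (an assumption, not anything Cog implements): monkeypatch socket in the predictor process so outgoing connections are refused, while the HTTP server keeps working because incoming requests go through accept() rather than connect().

# Hypothetical reproducibility bodge -- NOT a security boundary, since
# model code running in the same process can simply undo the patch.
import socket

_original_connect = socket.socket.connect

def _guarded_connect(self, address):
    # AF_INET addresses are (host, port) tuples; AF_UNIX uses a path string.
    host = address[0] if isinstance(address, tuple) else address
    if host not in ("127.0.0.1", "localhost", "::1"):
        raise OSError(f"outgoing network access is disabled (tried {host!r})")
    return _original_connect(self, address)

socket.socket.connect = _guarded_connect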

There are some cases where network access might be needed. A few ideas:

  1. We don't add any way of allowing network access now, and see how far we get. Maybe we don't need to add an option at all.
  2. If you want network access, you need to run it via Docker directly.
  3. We add an option in cog.yaml to allow network access. However, untrusted models not having network access is a neat security feature, so it would be a shame to let model creators break it.
  4. We add a runtime option to allow network access, for users who are in control of their environment: "turn this off at your own risk."

[This issue has been authored by @andreasjansson and @bfirsh.]

Make Dockerfile generation more robust

There is a lot of string concatenation without escaping and things like that. This feels like it needs a DSL.

Perhaps the simplest way would be to put all generated data inside environment variables, which can easily be escaped, then carefully input that data in fixed commands through the rest of the Dockerfile.

We can also do stuff like use the RUN ["foo", ...] form.
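
For instance, a minimal sketch of that idea (a hypothetical generator function, not Cog's actual code): build RUN instructions in exec form and let json.dumps handle the quoting, so package names can't inject shell syntax.

import json

def run_instruction(args):
    """Emit a Dockerfile RUN instruction in exec form, safely quoted."""
    return "RUN " + json.dumps(args)

print(run_instruction(["pip", "install", "torch==1.5.0", "opencv-python==4.3.0.38"]))
# RUN ["pip", "install", "torch==1.5.0", "opencv-python==4.3.0.38"]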

Ideally we wouldn't use Dockerfiles at all, which is a larger piece of work in #21.

Output of build should not display as console log messages

โ•โ•โ•โ•ก Building gpu image
โ•โ•โ•โ•ก   * Installing Python prerequisites
โ•โ•โ•โ•ก   * Installing Python 3.8
โ•โ•โ•โ•ก   * Installing system packages
โ•โ•โ•โ•ก   * Installing Python packages
โ•โ•โ•โ•ก #11 sha256:6ba92f3047b5dec04235ade8528c87bc142e66bb38015765dc4f9cbb7d185cd8
โ•โ•โ•โ•ก #11 DONE 0.5s
โ•โ•โ•โ•ก
โ•โ•โ•โ•ก #12 [ 9/15] RUN pip install -f
   โ”‚ https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/index.html -f
   โ”‚ https://download.pytorch.org/whl/cu101/torch_stable.html
   โ”‚ --extra-index-url=git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
   โ”‚ cachetools==4.1.0 chardet==3.0.4 future==0.18.2 fvcore==0.1.dev200506
   โ”‚ idna==2.9 importlib-metadata==1.6.0 jsonpatch==1.25 jsonpointer==2.0
   โ”‚ markdown==3.2.2 mock==4.0.2 opencv-python==4.3.0.38 portalocker==1.7.0
   โ”‚ pyasn1==0.4.8 pyasn1-modules==0.2.8 pydot==1.4.1 requests==2.23.0
   โ”‚ requests-oauthlib==1.3.0 rsa==4.0 tabulate==0.8.7 termcolor==1.1.0
   โ”‚ urllib3==1.25.8 visdom==0.1.8.9 websocket-client==0.57.0 werkzeug==1.0.1
   โ”‚ yacs==0.1.7 zipp==3.1.0 cython==0.29.22 pyyaml==5.1 dominate==2.4.0
   โ”‚ detectron2==0.1.2 torch==1.5.0 torchvision==0.6.0 pycocotools==2.0.2
   โ”‚ ipython==7.21.0 scikit-image==0.18.1
โ•โ•โ•โ•ก #12 sha256:84e81bafd53e4595c28450182c770765e490c57fe351dd48fbad3418b7ad1697
โ•โ•โ•โ•ก #12 1.283 Looking in indexes: https://pypi.org/simple,
   โ”‚ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
โ•โ•โ•โ•ก #12 1.283 Looking in links:
   โ”‚ https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/index.html,
   โ”‚ https://download.pytorch.org/whl/cu101/torch_stable.html
โ•โ•โ•โ•ก #12 1.370 WARNING: Cannot look at git URL
   โ”‚ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI/cachetools/
   โ”‚ because it does not support lookup as web pages.
โ•โ•โ•โ•ก #12 2.202 Collecting cachetools==4.1.0
โ•โ•โ•โ•ก #12 2.235   Downloading cachetools-4.1.0-py3-none-any.whl (10 kB)
โ•โ•โ•โ•ก #12 2.265 WARNING: Cannot look at git URL
   โ”‚ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI/chardet/
   โ”‚ because it does not support lookup as web pages.
โ•โ•โ•โ•ก #12 2.942 Collecting chardet==3.0.4
โ•โ•โ•โ•ก #12 2.950   Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
......

Console ═══╡ messages are for displaying messages to the user, not for large amounts of debugging output. The build output should be displayed as plain text, clearly delineated from the informational messages.

Build Docker images with buildkit directly

Instead of using Dockerfiles; concatenating strings is fragile. https://www.docker.com/blog/compiling-containers-dockerfiles-llvm-and-buildkit/

Note that we already use BuildKit to build the Dockerfiles for CPU images. This issue is about calling the BuildKit API directly instead of going via a Dockerfile.

We could also call the Docker API to create and commit containers, emulating the build process.
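
As a rough sketch of that alternative (assuming the docker-py SDK; none of this is Cog's implementation), each build step could run in a container that is then committed as an image:

import docker

client = docker.from_env()

# Run one build step in a container, then snapshot it as an image.
container = client.containers.run(
    "python:3.8-slim",
    command=["pip", "install", "torch==1.5.0"],
    detach=True,
)
container.wait()
image = container.commit(repository="my-model", tag="step-1")
container.remove()
print(image.id)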

A nice side-effect of using Dockerfiles is that we can generate a Dockerfile for users if they want to "eject" from Cog.

Related to #165

Local image management

As part of implementing local mode and on server (#18) we need a sensible way of managing local images.

Requirements

  • Tags should not grow uncontrollably, so that images can be cleaned up by docker system prune
  • The image should be given a name so you can identify it in docker images
  • The previous image should be removed

For clarity, this is additional work on top of #18. This also includes picking a sensible name when running locally without being pointed at a registry.
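
A sketch of one naming and cleanup scheme that would satisfy these requirements (assumptions: docker-py, and a name derived from the project directory):

import os
import docker

def build_and_replace(client, project_dir):
    """Retag on every build and delete the image the tag previously pointed to."""
    name = "cog-" + os.path.basename(os.path.abspath(project_dir))
    try:
        old = client.images.get(name)
    except docker.errors.ImageNotFound:
        old = None
    image, _ = client.images.build(path=project_dir, tag=name)
    if old is not None and old.id != image.id:
        client.images.remove(old.id, force=True)  # the previous image is removed
    return name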

Design prediction API versioning

I want to rename /infer to /predict, but we can't change it because all the old models have it, and there is no way for Cog to detect which version a model is and which endpoint it should call.

Alternatively, perhaps it is the client which should detect what version of Cog the model has been made with, and adjust its API calls as appropriate.
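
A sketch of that client-side approach (hypothetical endpoint probing, not an agreed design):

import requests

def predict(base_url, inputs):
    """Try the new endpoint first, falling back for models built with old Cog."""
    resp = requests.post(base_url + "/predict", json={"input": inputs})
    if resp.status_code == 404:
        resp = requests.post(base_url + "/infer", json={"input": inputs})
    resp.raise_for_status()
    return resp.json()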

End to end tests don't work on macOS

From #118, the end to end tests can't connect to the bridge IP on macOS:

(Screenshot of the failing end-to-end test output omitted.)

This should either be run in a consistent dev environment, or if we actually want to run the end to end tests on macOS (which probably makes sense in CI?), then we might need some kind of OS-based switch in there.

Better error message when `Model.setup()` is not set

If you don't define a setup() method, you get an incomprehensible error:

โ•โ•โ•โ•ก Traceback (most recent call last):
   โ”‚   File "/usr/bin/cog-http-server", line 8, in <module>
   โ”‚     cog.HTTPServer(Model()).start_server()
   โ”‚ TypeError: Can't instantiate abstract class Model with abstract methods setup
   โ”‚
โ•โ•โ•โ•ก Container exited unexpectedly

My feeling is we should require setup() to encourage users to do the right thing, but there should be a clearer error message.
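
A sketch of what a clearer message could look like (an assumption about the fix, not implemented behavior): catch the abstract-class TypeError where the server instantiates the model.

def instantiate(model_class):
    try:
        return model_class()
    except TypeError as e:
        if "abstract" in str(e):
            raise SystemExit(
                model_class.__name__ + " must define a setup() method. "
                "See the Cog docs for an example model."
            ) from e
        raise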

Throw error if invalid keys exist in cog.yaml

Currently invalid keys are silently ignored.

For example, a cog.yaml that contains this:

buildd:
  python_version: "3.8"
  python_packages:
    - "torch==1.8.0"

Will silently not install torch, instead of complaining that the key buildd doesn't exist.
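
A minimal sketch of the proposed check (assuming a known top-level schema; the real key set may differ):

import yaml

KNOWN_KEYS = {"build", "image", "predict"}

def validate(path="cog.yaml"):
    with open(path) as f:
        config = yaml.safe_load(f)
    unknown = set(config) - KNOWN_KEYS
    if unknown:
        raise ValueError("Unknown keys in cog.yaml: " + ", ".join(sorted(unknown)))
    return config

With this, the buildd example above would fail with "Unknown keys in cog.yaml: buildd" instead of silently skipping torch.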

Future

We may also want to validate the values at some point. For example, ensuring python_version is a string and matches a given pattern. (If you omit the quotes, 3.1 and 3.10 are indistinguishable! Thanks YAML!)

(Edited by @andreasjansson and @bfirsh.)

Jupyter Notebook Integration

Would love to see some kind of notebook integration. Can we expose the environment built in Docker as a Jupyter Notebook kernel possibly?

Cog should work in subdirectories

Cog should search up the file tree for cog.yaml, like Keepsake does.

For example, if /home/ben/hotdog-detector/cog.yaml exists, then I should be able to run cog predict in /home/ben/hotdog-detector/subdir/ and it should do what I expect.

There is some nuance here with cog run. Should the working directory be the relative current directory inside the container?
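
The upward search itself is straightforward; a sketch (hypothetical helper, mirroring how Keepsake behaves):

from pathlib import Path

def find_config(start=None):
    """Walk up from the current directory until a cog.yaml is found."""
    start = Path(start or Path.cwd()).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / "cog.yaml"
        if candidate.exists():
            return candidate
    raise FileNotFoundError("no cog.yaml found in this directory or any parent")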

Use separate `required` option for `@cog.input()`

The double behavior of default is not obvious.

This might want to be an explicit optional=True instead, so that inputs are required by default. If inputs defaulted to optional, users would have to remember to mark the required ones, and a required input that isn't marked as required causes breakage.

Design how to run arbitrary scripts

There is preinstall, whose name implies that it runs before installing other things, but it doesn't -- it runs before copying code. We should:

  1. Figure out the "default" place to run arbitrary commands
  2. Figure out the right name for it

Support streaming real-time prediction

Currently Cog requires you to upload an input file, then it's processed and results are returned. But there are cases when you might want to stream a continuous input to the model. For example, if you have a model that does audio event detection, you might want to display the current event as it happens.

gnutls_handshake() failed: The TLS connection was non-properly terminated.

cog build
═══╡ Uploading /Users/tekumara/code3/cog-examples/inst-colorization to localhost:8080/examples/inst-colorization
⠋ uploading (925 MB, 269.985 MB/s) ═══╡ Building model...
═══╡ Received model
═══╡ Building cpu image
═══╡   * Installing Python prerequisites
═══╡   * Installing Python 3.8
═══╡   * Installing system packages
═══╡   * Installing Python packages
═══╡   * Installing Cog
═══╡   * Copying code
═══╡ Successfully built 507cf5936fd9
═══╡ Pushing localhost:5000/inst-colorization:507cf5936fd9 to registry
═══╡ Building gpu image
═══╡   * Installing Python prerequisites
═══╡   * Installing Python 3.8
═══╡  ---> Using cache
═══╡  ---> 68aac6e4699f
═══╡ Step 8/20 : RUN curl https://pyenv.run | bash && 	git clone https://github.com/momo-lab/pyenv-install-latest.git "$(pyenv root)"/plugins/pyenv-install-latest && 	pyenv
   │ install-latest "3.8" && 	pyenv global $(pyenv install-latest --print "3.8")
═══╡  ---> Running in ae5b74d815ca
═══╡   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
═══╡                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   285  100   285    0     0    198      0  0:00:01  0:00:01 --:--:--   198  0
═══╡ Cloning into '/root/.pyenv'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-doctor'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-installer'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-update'...
═══╡ fatal: unable to access 'https://github.com/pyenv/pyenv-update.git/': gnutls_handshake() failed: The TLS connection was non-properly terminated.
═══╡ Failed to git clone https://github.com/pyenv/pyenv-update.git
═══╡ Error: Failed to build Docker image: exit status 255

High CPU usage during the build.

Cog stalls when pushing large, cached files

Steps to reproduce:

  1. Push some large files
  2. Push again

It now stalls, saying ⠙ uploading (3.1 kB, 0.494 kB/s). Perhaps it's calculating hashes, or something. Whatever it's doing, it should show progress instead of looking broken.

Rename "run arguments"

We don't use the word "arguments" anywhere else. Should this be "input types" or "inputs" or something along those lines?

Sensible defaults for `.cogignore`

As suggested by @zeke.

.npmignore defaults to .gitignore, but there is a dangerous silent failure in that: suppose .gitignore ignores secrets.json. If you then add a .npmignore with something new you want to ignore, it stops inheriting from .gitignore, thereby unignoring secrets.json.

There is also an additional consideration for machine learning models: .gitignore will normally ignore your model weights, but you want to include them for Cog, so an inherited default would often be exactly what you don't want. In which case, it probably shouldn't be the default.

Maybe we need some sensible defaults that are clear to the user? Maybe there's something clever we can do based on .gitignore?
