Comments (13)
@andreyvelich how do you publish releases to PyPI? I took a look at the code and I didn't see any actions doing a release automatically. I reached out to @tenzen-y on this as well.
Currently, for Training Operator we don't have a script to automate the release process, so @johnugeorge publishes the SDK manually after we cut the release.
However, for the Katib SDK we have this script that we run to publish the images and the SDK after the release: https://github.com/kubeflow/katib/blob/master/scripts/v1beta1/release.sh#L85-L97.
Happy to help out and replicate the same here if that would be desirable.
It would be awesome if you could help us automate releases for Training Operator and Katib.
We have this issue that we created a while ago: kubeflow/katib#2049.
However, for Katib SDK we have this script that we run to publish Images + SDK after the release: https://github.com/kubeflow/katib/blob/master/scripts/v1beta1/release.sh#L85-L97.
So is publishing the image also manual?
We usually publish the operator image via the workflow training-operator/.github/workflows/publish-core-images.yaml (lines 24 to 26 in 86e0df1).
Thank you for creating this @JamesKunstle.
We publish SDK on each Training Operator release: https://pypi.org/project/kubeflow-training/.
E.g. the latest version is 1.7, so to see the changes for that SDK, you need to check the v1.7-branch branch:
https://github.com/kubeflow/training-operator/blob/v1.7-branch/sdk/python/kubeflow/training/api/training_client.py
What would be the supported way to get the most up-to-date SDK code? The main-branch code does what I want, but the code that gets pulled when I install the kubeflow-training library does not.
FWIW @andreyvelich, for Feast we have the release process fully automated and deployed to PyPI with this action: https://github.com/feast-dev/feast/actions/workflows/release.yml
Could you try something like this?
pip install git+https://github.com/kubeflow/training-operator.git@master#subdirectory=sdk/python
I've never installed from a subdirectory before, but I think this should work.
@JamesKunstle If you want to get the latest changes for SDK, I added the scripts in this PR: kubeflow/website#3719.
Similar to @anishasthana's comment, you can do this:
pip install git+https://github.com/kubeflow/training-operator.git@7345e33b333ba5084127efe027774dd7bed8f6e6#subdirectory=sdk/python
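Both commands above follow the same pattern: a Git URL, an `@` followed by a branch name or commit hash, and a `#subdirectory=` fragment pointing at the SDK folder. As a minimal sketch of that pattern (the helper name `pip_git_requirement` is hypothetical and not part of the SDK or of pip), the install target can be assembled like this:

```python
from typing import Optional


def pip_git_requirement(repo_url: str, ref: str, subdirectory: Optional[str] = None) -> str:
    """Build a pip VCS requirement string for a Git repository.

    Hypothetical helper for illustration only: `ref` may be a branch name,
    a tag, or a full commit hash.
    """
    requirement = f"git+{repo_url}@{ref}"
    if subdirectory:
        # pip uses this fragment to locate the package inside the repository.
        requirement += f"#subdirectory={subdirectory}"
    return requirement


REPO = "https://github.com/kubeflow/training-operator.git"

# Track the tip of the branch (changes every time the branch moves).
latest = pip_git_requirement(REPO, "master", "sdk/python")

# Pin to the exact commit from the comment above (reproducible).
pinned = pip_git_requirement(
    REPO, "7345e33b333ba5084127efe027774dd7bed8f6e6", "sdk/python"
)

print("pip install " + latest)
print("pip install " + pinned)
```

Pinning to a commit hash gives a reproducible install, while installing from the branch name picks up whatever is at its tip at install time.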
On a similar note: we have a ton of GitHub Actions we built to automate releases for CodeFlare. Some links:
- https://github.com/project-codeflare/codeflare-sdk/blob/main/.github/workflows/release.yaml
- https://github.com/project-codeflare/codeflare-operator/blob/main/.github/workflows/project-codeflare-release.yml
@andreyvelich @anishasthana Okay yeah, that works now, I can see the most recent changes. I would really appreciate a more "pypi"-y way of installing the latest release. I think I was getting a fairly old package when I was installing by name from PyPI.
Basically, we release the SDK whenever we make a new release of Training Operator, so that all component versions (Controller + SDK) stay consistent. That helps us keep versions stable.
Any thoughts @JamesKunstle ?
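Since the PyPI package follows Training Operator releases while a Git install follows a branch or commit, it can help to check which SDK version an environment actually has. A minimal sketch using only the standard library, assuming the distribution name `kubeflow-training` from the PyPI link above (the helper name `installed_version` is hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "kubeflow-training" is the distribution name published on PyPI.
sdk_version = installed_version("kubeflow-training")
if sdk_version is None:
    print("kubeflow-training is not installed in this environment")
else:
    print(f"kubeflow-training {sdk_version} is installed")
```

The reported version comes from the installed package metadata, so a Git install reports whatever version string the SDK's packaging declared at that commit.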