
Comments (6)

ioga commented on August 16, 2024

> I do not know what you mean by this, perhaps you could clarify. To be clear, I am not looking for monetary compensation to add this update -

ah, I apologize for misunderstanding. I've noticed that terraform-aws-eks is sponsored by a consulting/professional services business, and so I assumed you're here on behalf of that business.

> it was more of raising it as a discussion first before going down the path of submitting a PR, and just overall align on the approach. If you are open to the idea of updating the EKS portion of the docs, I am offering my services to do so, free of charge 😬

As background, today we have a det deploy tool which uses cloudformation on AWS and terraform on GCP to spin up determined clusters on raw EC2/GCP nodes. We also have a very raw solution for GKE which sets up a GKE cluster and deploys our helm chart to it.

What I'd like to have in the long term is det deploy eks, which creates an appropriate EKS cluster and deploys our helm chart to it. If I were to break it up into milestones:

  1. Terraform code to create/update/maintain an EKS cluster with autoscaling for two types of instances: GPU instances of configurable type and max count for ML loads, and cheap CPU instances (e.g. m5.xlarge) for lightweight jobs. On GKE it's literally a checkbox, but I've really struggled to set this up on EKS before opening that ticket.
  2. Support for an RDS Postgres instance our helm chart will use for database needs.
  3. Support for an S3 bucket our helm chart will use for (ml model training) checkpoint storage.
  4. Support for a shared AWS EFS filesystem for users' home directories and so on.
  5. Put a helm chart on it.
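
The inputs implied by these milestones could be collected in one place and handed to terraform as a `*.tfvars.json` file. The following is only a minimal sketch under stated assumptions: the class name, every field name, and the GPU default (`g4dn.xlarge`) are hypothetical and are not taken from the det deploy code or from terraform-aws-eks.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EksDeployConfig:
    """Hypothetical inputs for the milestones above; all names are illustrative."""

    cluster_name: str
    gpu_instance_type: str = "g4dn.xlarge"  # assumed default, meant to be configurable
    gpu_max_count: int = 4                  # autoscaling ceiling for GPU nodes
    cpu_instance_type: str = "m5.xlarge"    # cheap CPU nodes for lightweight jobs
    cpu_max_count: int = 2                  # autoscaling ceiling for CPU nodes
    create_rds: bool = True                 # milestone 2: Postgres for the helm chart
    create_checkpoint_bucket: bool = True   # milestone 3: S3 checkpoint storage
    create_efs: bool = True                 # milestone 4: shared home directories

    def validate(self) -> None:
        """Reject values that could never describe a usable node group."""
        if not self.cluster_name:
            raise ValueError("cluster_name must not be empty")
        if self.gpu_max_count < 0 or self.cpu_max_count < 0:
            raise ValueError("max counts must be non-negative")

    def to_tfvars_json(self) -> str:
        """Serialize the config as JSON suitable for a *.tfvars.json file."""
        self.validate()
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

The terraform side would need matching `variable` declarations for each key; those are not shown here.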

from determined.

ioga commented on August 16, 2024

yep, I understand that's a typical approach for the terraform ecosystem. However, in our product we've historically been targeting ML engineers who do not have any experience with terraform, but want to push a button and get a cluster in a box deployed. At the end of the day, the CLI is just a thin wrapper on top of terraform code. Some users elect to bypass the wrapper and take the raw terraform code if they want to consume it that way.
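
To make the "thin wrapper" idea concrete, here is a rough sketch of how such a CLI might turn user options into a terraform invocation. It is not the actual det deploy implementation: the function names, the `tf/eks` directory, and the variable names are hypothetical. Only the terraform flags themselves (`-chdir`, `apply`, `-auto-approve`, `-var`) are real.

```python
import subprocess
from typing import List, Mapping


def build_terraform_apply(workdir: str, variables: Mapping[str, str]) -> List[str]:
    """Build the argv for `terraform apply` against the code in `workdir`."""
    cmd = ["terraform", f"-chdir={workdir}", "apply", "-auto-approve"]
    # Sort the variables so the generated command is deterministic.
    for name, value in sorted(variables.items()):
        cmd += ["-var", f"{name}={value}"]
    return cmd


def deploy(workdir: str, variables: Mapping[str, str], dry_run: bool = True) -> List[str]:
    """Run terraform, or only return the command when dry_run is set."""
    cmd = build_terraform_apply(workdir, variables)
    if not dry_run:
        # check=True raises CalledProcessError if terraform exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd
```

Because the wrapper only assembles arguments, users who prefer raw terraform can run the same code in `workdir` directly and skip the CLI.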


ioga commented on August 16, 2024

hello @bryantbiggs ,

thanks a lot for addressing terraform-aws-modules/terraform-aws-eks#3027 . you are right to guess that I've been investigating how we can modernize our EKS support and move from a manual setup to terraform. as an open-source product we'd be happy to take a PR for that.

reading between the lines, I assume you're looking to offer us your professional services. unfortunately we're not able to do that at this time.


bryantbiggs commented on August 16, 2024

> I assume you're looking to offer us your professional services. unfortunately we're not able to do that at this time.

I do not know what you mean by this, perhaps you could clarify. To be clear, I am not looking for monetary compensation to add this update - it was more of raising it as a discussion first before going down the path of submitting a PR, and just overall align on the approach. If you are open to the idea of updating the EKS portion of the docs, I am offering my services to do so, free of charge 😬


bryantbiggs commented on August 16, 2024

thank you for sharing that information! I'll put it on my list to try putting together a pattern of running the Determined AI helm chart on EKS and then we can discuss how that fits into the documentation that is currently provided

One thing to keep in mind - most Terraform users are used to interacting with Terraform directly, and not through a wrapper CLI. So this is more along the lines of what we provide for folks to help them understand how to achieve a certain outcome. This gives them options for consumption - they can copy+paste it into their environment and deploy it, they can compare the code against their setup if trying to figure out what they may be missing, or they can simply use it as a frame of reference to guide their implementation


ioga commented on August 16, 2024

@bryantbiggs can you please share what your plans and timelines are? I'd also like to work in that direction, but I don't want to repeat the same work you are doing.

