cdfoundation / sig-mlops
CDF SIG MLOps
Home Page: https://cd.foundation
License: Apache License 2.0
https://lists.cd.foundation/g/sig-mlops
Don't think I have editing rights there, @danlopez00?
I am curating a list of references for MLOps. This would probably be useful to add to the roadmap:
https://github.com/visenger/mlops-references
Best regards,
Larysa
Following the link, the wiki appears to be empty, with no discussions or posts to explore. If Slack fulfils the need for discussions, should the wiki be removed?
I would like to formally propose that we focus the activities of the SIG upon collaboratively producing a Roadmap document for MLOps, on an annual cadence.
I suggest that we devote a proportion of each year to gather the key challenges facing the community in the MLOps space, spell out the technology requirements associated with each challenge and consider potential solutions that might be adopted.
Once collated and agreed, this document can then be published and shared as the basis for efficient pre-competitive collaboration on core issues common across the community that will offer maximal benefits to customers if resolved.
It would probably make sense to look ahead five years at this stage and begin to anticipate capabilities that will be required by customers during that period.
I propose the following as a potential template for the structure of the roadmap and I have included some example challenges to give a feel for how we might want to shape the document over the coming year:
https://docs.google.com/document/d/15fhsarEYbXk1yeqTo6KbLArxJuRh4JbgMU3GBpSd-8Y/edit?usp=sharing
We can convert to a more appropriate format for collaborative editing if the proposal is adopted.
Thanks for the comprehensive roadmap, @tdcox! I have one question in the technical requirements section:
Educating data science teams regarding the risks of trying to use Jupyter Notebooks in production
Can you expand on this a bit more? I have found it difficult to argue that data scientists should not use their most beloved tool when crafting their deliverables, and there are tractable ways for data scientists to use notebooks reliably in a production workflow.
I am not sure if this is what you meant, but I wanted to pause and have a discussion on this point. I am still reviewing the rest of the document, but I figured I should bring this up since it caught my attention.
cc: @jlewi, @aronchick
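On the "tractable ways" point above, one commonly cited pattern (a minimal sketch with hypothetical names, not something proposed in this thread) is to keep the notebook a thin driver over an importable, unit-testable module, so the actual logic goes through normal code review and CI:

```python
# Hypothetical sketch (module and function names are illustrative): keep the
# logic in an importable, unit-testable module; the notebook becomes a thin
# driver that calls it.

# --- pipeline.py: plain Python, reviewable and covered by normal unit tests ---
def train_and_score(rows):
    """Toy stand-in for real pipeline logic: score each row by its mean."""
    return [sum(r) / len(r) for r in rows]

# --- the notebook cell then reduces to a single call ---
scores = train_and_score([[1.0, 3.0], [2.0, 4.0]])
print(scores)  # [2.0, 3.0]
```

The notebook keeps its exploratory role, but the production path depends only on the module.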
As a DevOps engineer/consultant, I'd love to learn more about MLOps. I think that having a "forum" (so 90s) would be great for tracking interesting discussions around the subject.
Enabling GitHub Discussions is a good start. You can start with the basic categories - General, Ideas, Q&A, Show and tell.
A unit test tests the behaviour of a piece of code. When that code includes a dependency, it is not the job of the unit test to ensure that the implementation of the external library is correct, only that its API behaves correctly within the current program's context.
Does this still hold true if the dependency is a trained model? What if that model continues to train and evolve in production? Is it enough to know the model still passes the assumptions we've explicitly defined in our unit tests, even if the deeper assumptions the model's behaviour is based on have shifted dramatically?
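One way to frame this: rather than asserting exact outputs, tests can pin behavioural invariants that should survive retraining. A minimal sketch with a stand-in model (`TinyModel` and the invariants are hypothetical, not from this thread):

```python
# Hypothetical sketch: pin behavioural invariants of a model dependency
# (output range, monotonicity) instead of exact predictions, so the test
# still expresses a contract after the model retrains.
import math

class TinyModel:
    """Stand-in for a trained model that emits a risk score in (0, 1)."""
    def predict(self, x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))  # toy logistic scoring

def check_contract(model) -> None:
    # invariant 1: scores stay in the valid range
    for x in (-5.0, 0.0, 5.0):
        assert 0.0 <= model.predict(x) <= 1.0
    # invariant 2: a higher input never lowers the score
    assert model.predict(1.0) <= model.predict(2.0)

check_contract(TinyModel())
print("model contract holds")
```

Such contract tests answer "is the API behaving correctly in this program's context" without claiming to verify the model's inner workings.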
If we do not know the core assumptions of the model, but only its behaviour in the explicit tests we've constructed, is this a potential regulatory risk? Say, for example, you've deployed self-driving car software that results in a fatality because your model hit a particular environmental case that was not previously tested.
How do we comply with regulations? Is the expectation that you'd test for every possible circumstance? If so, does that not undermine the flexible problem solving that ML promises, essentially creating a procedural program by proxy through the unit tests?
Versioning is yet another issue. Should self-learning models be treated as third-party APIs rather than imported libraries?
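If models are treated like third-party dependencies, one concrete analogue of version pinning is content-addressed loading. A hedged sketch (the helper name and hash placeholder are hypothetical):

```python
import hashlib

# Hypothetical sketch: pin the exact model artifact by content hash, the way
# a lockfile pins a third-party dependency, so a silently retrained model
# cannot slip into production unnoticed.

def load_model_bytes(blob: bytes, expected_sha256: str) -> bytes:
    """Return the model bytes only if they match the pinned hash."""
    digest = hashlib.sha256(blob).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"model artifact changed: {digest} != {expected_sha256}")
    return blob

# The pinned hash would be recorded at release time, like a lockfile entry.
blob = b"model-weights-v1"  # stand-in for a serialized model
pinned = hashlib.sha256(blob).hexdigest()
assert load_model_bytes(blob, pinned) == blob
```

A continuously learning model breaks this pattern by design, which is exactly the tension raised here: pinning trades adaptability for auditability.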
In short, there is a tension between leveraging the flexibility of models while embedding their desired behaviour in a program, and being ultimately responsible for the unknown: the inner workings of the model.