cdfoundation / sig-mlops
CDF SIG MLOps
Home Page: https://cd.foundation
License: Apache License 2.0
https://lists.cd.foundation/g/sig-mlops
Don't think I have editing rights there, @danlopez00?
I am curating a list of references for MLOps. This would probably be useful to add to the roadmap:
https://github.com/visenger/mlops-references
Best regards,
Larysa
Following the link, the wiki appears to be empty, with no discussions or posts to explore. If Slack fulfils the need for discussions, should the wiki be removed?
I would like to formally propose that we focus the activities of the SIG upon collaboratively producing a Roadmap document for MLOps, on an annual cadence.
I suggest that we devote a proportion of each year to gather the key challenges facing the community in the MLOps space, spell out the technology requirements associated with each challenge and consider potential solutions that might be adopted.
Once collated and agreed, this document can then be published and shared as the basis for efficient pre-competitive collaboration on core issues common across the community that will offer maximal benefits to customers if resolved.
It would probably make sense to look ahead five years at this stage and begin to anticipate capabilities that will be required by customers during that period.
I propose the following as a potential template for the structure of the roadmap and I have included some example challenges to give a feel for how we might want to shape the document over the coming year:
https://docs.google.com/document/d/15fhsarEYbXk1yeqTo6KbLArxJuRh4JbgMU3GBpSd-8Y/edit?usp=sharing
We can convert to a more appropriate format for collaborative editing if the proposal is adopted.
Thanks for the comprehensive roadmap, @tdcox! I have one question in the technical requirements section:
Educating data science teams regarding the risks of trying to use Jupyter Notebooks in production
Can you expand on this a bit more? I have found it difficult to argue that data scientists should not use their most beloved tool when crafting their deliverables, and there are tractable ways for data scientists to use notebooks reliably in a production workflow.
I am not sure if this is what you meant, but I wanted to pause and have a discussion on this point. I am still reviewing the rest of the document, but I figured I should bring this up since it caught my attention.
cc: @jlewi, @aronchick
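On the "tractable ways" point above, one commonly cited pattern (a minimal sketch with hypothetical names, not something proposed in this thread) is to keep the notebook a thin driver over an importable, unit-testable module, so the actual logic goes through normal code review and CI:

```python
# Hypothetical sketch (module and function names are illustrative): keep the
# logic in an importable, unit-testable module; the notebook becomes a thin
# driver that calls it.

# --- pipeline.py: plain Python, reviewable and covered by normal unit tests ---
def train_and_score(rows):
    """Toy stand-in for real pipeline logic: score each row by its mean."""
    return [sum(r) / len(r) for r in rows]

# --- the notebook cell then reduces to a single call ---
scores = train_and_score([[1.0, 3.0], [2.0, 4.0]])
print(scores)  # [2.0, 3.0]
```

The notebook keeps its exploratory role, but the production path depends only on the module.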
As a DevOps engineer/consultant, I'd love to learn more about MLOps. I think that having a "forum" (so 90s) would be great for tracking interesting discussions around the subject.
Enabling GitHub Discussions is a good start. You can start with the basic categories - General, Ideas, Q&A, Show and tell.
A unit test tests the behaviour of a piece of code. When that code includes a dependency, it is not the job of the unit test to ensure that the implementation of the external library is correct, only that its API behaves correctly within the current program's context.
Does this still hold true if the dependency is a trained model? What if that model continues to train and evolve in production? Is it enough to know the model still passes the assumptions we've explicitly defined in our unit tests, even if the deeper assumptions the model's behaviour is based on have shifted dramatically?
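One way to frame this: rather than asserting exact outputs, tests can pin behavioural invariants that should survive retraining. A minimal sketch with a stand-in model (`TinyModel` and the invariants are hypothetical, not from this thread):

```python
# Hypothetical sketch: pin behavioural invariants of a model dependency
# (output range, monotonicity) instead of exact predictions, so the test
# still expresses a contract after the model retrains.
import math

class TinyModel:
    """Stand-in for a trained model that emits a risk score in (0, 1)."""
    def predict(self, x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))  # toy logistic scoring

def check_contract(model) -> None:
    # invariant 1: scores stay in the valid range
    for x in (-5.0, 0.0, 5.0):
        assert 0.0 <= model.predict(x) <= 1.0
    # invariant 2: a higher input never lowers the score
    assert model.predict(1.0) <= model.predict(2.0)

check_contract(TinyModel())
print("model contract holds")
```

Such contract tests answer "is the API behaving correctly in this program's context" without claiming to verify the model's inner workings.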
If we do not know the core assumptions of the model, but only its behaviour in the explicit tests we've constructed, is this a potential regulatory risk? Say, for example, you've deployed self-driving car software that results in a fatality because your model hit a particular environmental case that was not previously tested.
How do we comply with regulations? Is the expectation that you'd test for every possible circumstance? If so, does that not undermine the flexible problem solving that ML promises, essentially creating a procedural program by proxy through the unit tests?
Versioning is yet another issue. Should self-learning models be treated as third-party APIs rather than imported libraries?
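If models are treated like third-party dependencies, one concrete analogue of version pinning is content-addressed loading. A hedged sketch (the helper name and hash placeholder are hypothetical):

```python
import hashlib

# Hypothetical sketch: pin the exact model artifact by content hash, the way
# a lockfile pins a third-party dependency, so a silently retrained model
# cannot slip into production unnoticed.

def load_model_bytes(blob: bytes, expected_sha256: str) -> bytes:
    """Return the model bytes only if they match the pinned hash."""
    digest = hashlib.sha256(blob).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"model artifact changed: {digest} != {expected_sha256}")
    return blob

# The pinned hash would be recorded at release time, like a lockfile entry.
blob = b"model-weights-v1"  # stand-in for a serialized model
pinned = hashlib.sha256(blob).hexdigest()
assert load_model_bytes(blob, pinned) == blob
```

A continuously learning model breaks this pattern by design, which is exactly the tension raised here: pinning trades adaptability for auditability.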
In short, there is a tension between leveraging the flexibility of models while embedding their desired behaviour in a program, and being ultimately responsible for the unknown: the inner workings of the model.