hydrospheredata / hydro-serving-sdk

Python SDK for the Hydrosphere.io project.

Home Page: https://hydrospheredata.github.io/hydro-serving-sdk/

License: Apache License 2.0

Python 15.36% Groovy 0.60% CSS 1.54% JavaScript 5.68% Makefile 0.06% HTML 76.69% Batchfile 0.06%
hydrosphere python python-sdk sdk

hydro-serving-sdk's People

Contributors

akastav, bulbawarrior, github-actions[bot], hydrorobot, kineticcookie, mkf-simpson, techkuz, tidylobster, valenzione, vixtir

hydro-serving-sdk's Issues

De-/Serialization of entities

We have YAML resource definitions for every serving entity. The code that handles them needs to be migrated from the CLI to the SDK and refactored along the way.

Model.list_models(hs_cluster) raises error

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/elizavetabatanina/hydro/private_hydrosphere_examples/venv/lib/python3.6/site-packages/hydrosdk/model.py", line 438, in list_models
    model_contract = contract_from_dict(model_version_json["modelContract"])
  File "/Users/elizavetabatanina/hydro/private_hydrosphere_examples/venv/lib/python3.6/site-packages/hydrosdk/contract.py", line 287, in contract_from_dict
    output_item = field_from_dict_new(item.get("name"), item)
  File "/Users/elizavetabatanina/hydro/private_hydrosphere_examples/venv/lib/python3.6/site-packages/hydrosdk/contract.py", line 120, in field_from_dict_new
    raise ValueError("Invalid contract: {} field has invalid datatype {}".format(name, dtype))
ValueError: Invalid contract: metric_value field has invalid datatype DT_INVALID

Set up CI/CD

Need Jenkins support to run tests and release artefacts.

All contract field names are equal to "model"

This code

from hydrosdk.cluster import Cluster
from hydrosdk.model import Model

cluster = Cluster("https://cluster/")
model = Model.find_by_id(cluster, 4)
[tensor.name for tensor in model.contract.predict.inputs]

will return

['model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model',
 'model']

instead of proper field names like ['age', 'sex', ... ]

Rename with_signature to with_contract

The concept of a signature is no longer preserved in the ecosystem, so it's better to stick with common terminology. Contracts are what serving.yaml uses.
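
One way to do the rename without breaking existing callers is to keep the old method as a deprecated alias for a while. The sketch below is only an illustration: ModelBuilder is a hypothetical stand-in, not the SDK's actual class, and the contract is treated as an opaque value.

```python
import warnings


class ModelBuilder:
    """Hypothetical stand-in for the SDK's model builder, for illustration only."""

    def __init__(self):
        self.contract = None

    def with_contract(self, contract):
        # New, preferred name: store the contract and return self for chaining.
        self.contract = contract
        return self

    def with_signature(self, signature):
        # Old name kept as a thin alias so existing callers keep working.
        warnings.warn(
            "with_signature is deprecated; use with_contract instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.with_contract(signature)
```

Callers that still use with_signature would get a DeprecationWarning pointing at their own call site, and the alias could be dropped in a later release.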

Remove with_name building block

Why do we need a separate with_name building block for Model?

model = (
    sdk.Model()
    .with_name('RandomForest')
)

A Monitoring is created differently, with the name passed straight to the constructor:

monitoring = sdk.Monitoring('RandomForestMetric')

Error during MetricSpec.create - Invalid ThresholdCmpOperator kind Gt

I want to attach a metric to a model, but some configurations of MetricSpecConfig raise an exception.

Steps to reproduce:

The following code

import hydrosdk
c = hydrosdk.cluster.Cluster("http://localhost")
m = hydrosdk.model.Model.find(c, "adult_classification", 1)

metric_config = hydrosdk.monitoring.MetricSpecConfig(m.id, 4.2, hydrosdk.monitoring.TresholdCmpOp.GREATER)
hydrosdk.monitoring.MetricSpec.create(c, "test_metric", m.id, metric_config)

raises an exception:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-21-5e9ddd6c8eab> in <module>
----> 1 hydrosdk.monitoring.MetricSpec.create(c, "test_metric", m.id, metric_config)

~/anaconda3/lib/python3.7/site-packages/hydrosdk-2.2.0.dev0-py3.7.egg/hydrosdk/monitoring.py in create(cluster, name, model_version_id, config)
     66         else:
     67             raise Exception(
---> 68                 f"Failed to create a MetricSpec. Name={name}, model_version_id={model_version_id}. {resp.status_code} {resp.text}")
     69 
     70     @staticmethod

Exception: Failed to create a MetricSpec. Name=test_metric, model_version_id=2. 400 The request content was malformed:
Invalid ThresholdCmpOperator kind Gt

This error also occurs with hydrosdk.monitoring.TresholdCmpOp.GREATER_EQ as an argument.

All other variants of hydrosdk.monitoring.TresholdCmpOp work fine.

Implement local serving

Implement the deployment mechanism for the LocalModel and Model classes.

In the case of LocalModel, you have all the required metadata and payload, so you can just build a Docker image and run it. The Dockerfile for the model image can be found here: https://github.com/Hydrospheredata/hydro-serving-manager/blob/master/src/main/scala/io/hydrosphere/serving/manager/domain/model_build/BuildScript.scala

With Model it's quite easy: the manager service has already done the heavy lifting for you and pushed the resulting image to the registry. Pull and run.
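
The two paths above (build and run for LocalModel, pull and run for Model) could be sketched as plain Docker command lines. This is a rough illustration only: the function names, the image names and the port numbers are assumptions, not values taken from the SDK or the manager service.

```python
def build_command(context_dir, tag):
    """Argument list for building an image from a local build context."""
    return ["docker", "build", "--tag", tag, context_dir]


def pull_command(image):
    """Argument list for pulling an already published image."""
    return ["docker", "pull", image]


def run_command(image, host_port=9090, container_port=9090, name=None):
    """Argument list for starting a container in the background.

    The default ports are placeholders; the port a model image actually
    listens on is not stated in this issue.
    """
    cmd = ["docker", "run", "--detach", "--publish", f"{host_port}:{container_port}"]
    if name is not None:
        cmd += ["--name", name]
    cmd.append(image)
    return cmd


# Model: the image already sits in the registry, so pull it and run it.
image = "registry.example.com/adult_classification:1"  # hypothetical image name
for cmd in (pull_command(image), run_command(image, name="adult")):
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run(cmd, check=True) to actually execute it. For a LocalModel, build_command would replace pull_command, with the build context holding the generated Dockerfile and the model payload.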

Implement training-data upload

LocalModel.training-data contains either a path to a local CSV file or a path to an object in an S3 bucket.

Implement the data upload as in the CLI.
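
Before uploading, the SDK would have to tell the two kinds of location apart. A minimal sketch, assuming S3 objects are written as s3://bucket/key URIs (the issue does not say which format is used) and using a hypothetical helper name:

```python
from urllib.parse import urlparse


def describe_training_data(location):
    """Classify a training-data location as an S3 object or a local file."""
    parsed = urlparse(location)
    if parsed.scheme == "s3":
        # s3://bucket/key/to/file.csv -> bucket is the netloc, key is the path
        return {"kind": "s3", "bucket": parsed.netloc, "key": parsed.path.lstrip("/")}
    # Anything without an s3 scheme is treated as a path on the local disk.
    return {"kind": "local", "path": location}
```

The upload code could then branch on the "kind" value: read and send the file for a local path, or pass the bucket and key along for an S3 object.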

High level interface to send requests to models

Currently we have gRPC service definitions and an HTTP endpoint for the gateway.
gRPC operates on proto messages, which are not convenient to use in Python, and JSON requires de-/serializers to work with numpy types.

We need to design an interface that seamlessly integrates numpy/pandas data types and sends the data to a model for ingestion.
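
For the JSON side, the serialization gap can be illustrated with a small encoder. This is a sketch, not the SDK's design: the class and function names are made up here, and the code relies only on the fact that numpy arrays and numpy scalars expose a tolist() method that returns plain Python values.

```python
import json


class ArrayFriendlyEncoder(json.JSONEncoder):
    """JSON encoder that falls back to an object's tolist() method."""

    def default(self, obj):
        # numpy arrays and numpy scalars both expose tolist(), which returns
        # plain Python lists and numbers that the json module can handle.
        tolist = getattr(obj, "tolist", None)
        if callable(tolist):
            return tolist()
        return super().default(obj)


def to_json_payload(inputs):
    """Serialize a mapping of field name -> array-like into a JSON string."""
    return json.dumps(inputs, cls=ArrayFriendlyEncoder, sort_keys=True)
```

With numpy installed, to_json_payload({"age": numpy.array([31, 42])}) would produce a JSON body with a plain list under "age". Decoding a response back into arrays, and the gRPC path, would need their own handling.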

Mock HTTP requests in tests

Tests use the requests library to send messages to a locally deployed serving cluster.
It's essential that these requests are mocked, for instance with requests-mock.

Marked as a bug because Jenkins is unable to run the tests.
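
The idea can be sketched without any extra dependency. Note that the example below swaps in the standard library's unittest.mock instead of requests-mock and passes the session in as an argument. The function names and the endpoint path are assumptions for illustration, not the SDK's actual code.

```python
from unittest.mock import Mock


def fetch_model_names(session, base_url):
    """Return model names from a hypothetical JSON listing endpoint."""
    response = session.get(f"{base_url}/api/v2/model")  # path is an assumption
    response.raise_for_status()
    return [item["name"] for item in response.json()]


def make_fake_session(payload):
    """Build a stand-in for requests.Session that never touches the network."""
    response = Mock()
    response.json.return_value = payload
    response.raise_for_status.return_value = None
    session = Mock()
    session.get.return_value = response
    return session


session = make_fake_session([{"name": "adult_classification"}])
names = fetch_model_names(session, "http://localhost")
# Verify the code under test asked for the URL we expected.
session.get.assert_called_once_with("http://localhost/api/v2/model")
```

Because nothing here opens a socket, a test written this way can run on Jenkins without a deployed cluster. requests-mock reaches the same goal by intercepting calls inside requests itself, so the code under test would not need a session parameter.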

Mock data for a contract

We need to generate appropriate mocks, using numpy data types, for a ModelContract and a ModelSignature.

There are some tests, but they might be broken.
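
As a starting point, mock data can be generated from nothing more than a shape and a dtype name per field. The sketch below is plain Python and produces nested lists rather than numpy arrays. The (shape, dtype) field description and the dtype names are assumptions, not the actual ModelContract representation.

```python
import random


def mock_value(dtype, rng):
    """Return one random Python value for a (hypothetical) dtype name."""
    if dtype.startswith("int"):
        return rng.randint(0, 100)
    if dtype.startswith("float"):
        return rng.random()
    if dtype == "bool":
        return rng.random() < 0.5
    if dtype == "str":
        return "mock"
    raise ValueError(f"unsupported dtype: {dtype}")


def mock_tensor(shape, dtype, rng):
    """Build a nested list with the given shape; an empty shape yields a scalar."""
    if not shape:
        return mock_value(dtype, rng)
    return [mock_tensor(shape[1:], dtype, rng) for _ in range(shape[0])]


def mock_inputs(fields, seed=0):
    """fields maps a field name to a (shape, dtype) pair."""
    rng = random.Random(seed)  # seeded so that tests are reproducible
    return {name: mock_tensor(shape, dtype, rng) for name, (shape, dtype) in fields.items()}
```

For numeric fields, each generated value could then be wrapped with numpy.asarray(value, dtype=dtype) to get the numpy types the issue asks for.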
