
zenml-io / zenml

3.7K 40.0 400.0 342.19 MB

ZenML ๐Ÿ™: Build portable, production-ready MLOps pipelines. https://zenml.io.

Home Page: https://zenml.io

License: Apache License 2.0

Python 98.81% Shell 0.38% Dockerfile 0.07% HTML 0.03% Smarty 0.20% HCL 0.50% Batchfile 0.01% Mako 0.01%
mlops machine-learning data-science production-ready devops-tools zenml pipelines metadata-tracking deep-learning pytorch

zenml's People

Contributors

alex-zenml, alexejpenner, avishniakov, barismaiot, bcdurak, benkoller, bhatt-priyadutt, christianversloot, dependabot[bot], dnth, fa9r, francoisserra, gabrielmbmb, hamzamaiot, htahir1, jlopezpena, jwwwb, kamalesh0406, lopezco, michael-zenml, nicholasmaiot, ramitsurana, safoinme, schustmi, skrohit, stefannica, strickvl, val3nt-ml, wjayesh, znegrin


zenml's Issues

[FEATURE] Add integration to deploy models via BudgetML

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[FEATURE] Microsoft Azure Orchestrator Backend

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[FEATURE] Deploying ML on FPGA using zenml

Discussed in https://github.com/zenml-io/zenml/discussions/96

Originally posted by zaman75 August 17, 2021
Hi,
I am doing a project on implementing a neural network on an FPGA and deploying the model using MLOps.
ZenML has excellent pipelines and I want to use ZenML in my project. Is it possible to deploy a model on an FPGA using ZenML? What would the procedure be? I would be extremely grateful if anyone could point me in the right direction.

[FEATURE] Add ability to write BatchInference to any data sink

Is your feature request related to a problem? Please describe.
Currently, the BatchInferenceStep/Pipeline can only write to one data sink: TFRecords. In many use cases, another sort of sink is required, e.g., a SQL datasource.

Describe the solution you'd like
Abstract the writing into the BatchInferenceStep interface so users can decide.
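One way to abstract the writing could look like the following sketch. All names here (`DataSink`, `InMemorySink`, `run_batch_inference`) are hypothetical illustrations of the idea, not existing ZenML API:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List


class DataSink(ABC):
    """Hypothetical sink interface a BatchInferenceStep could delegate to."""

    @abstractmethod
    def write(self, rows: Iterable[Dict[str, Any]]) -> int:
        """Write prediction rows; return the number of rows written."""


class InMemorySink(DataSink):
    """Trivial sink for illustration; a SQL or TFRecord sink would subclass
    DataSink the same way."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def write(self, rows: Iterable[Dict[str, Any]]) -> int:
        self.rows.extend(rows)
        return len(self.rows)


def run_batch_inference(predictions: Iterable[Dict[str, Any]],
                        sink: DataSink) -> int:
    # The step depends only on the abstract interface, so users pick the sink.
    return sink.write(predictions)
```

Because the step only sees the abstract `write`, swapping TFRecords for a SQL table becomes a constructor argument rather than a code change inside the step.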

Describe alternatives you've considered

Additional context

[BUG] Examples directory contains absolute imports which do not work, and relative imports also don't work with the current zenml git logic

Describe the bug
I tried to run the scikit demo from the examples folder. First it gave "No module named 'examples'".
Then I changed the line
from examples.scikit.step.trainer import MyScikitTrainer in run.py to from step.trainer import MyScikitTrainer to fix that, and now it gives this error:

Screenshots
(screenshot attached)

Context (please complete the following information):

OS: Mac
Python Version: 3.7.4
ZenML Version: cloned from the git
Screen Shot 2021-02-25 at 12 21 42 PM

[BUG]: initialisation breaks if api.segment.io is blocked on network

Contact Details [Optional]

No response

What happened?

When running zenml init, I saw the following output:

$ zenml init
Initializing at /home/matt/projects/zenml-test
INFO:backoff:Backing off send_request(...) for 0.5s (requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.segment.io', port=443): Max retries exceeded with url: /v1/batch (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e14b28898>: Failed to establish a new connection: [Errno 111] Connection refused')))
ZenML repo initialized at /home/matt/projects/zenml-test
INFO:backoff:Backing off send_request(...) for 1.5s (requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.segment.io', port=443): Max retries exceeded with url: /v1/batch (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e142ace10>: Failed to establish a new connection: [Errno 111] Connection refused')))
INFO:backoff:Backing off send_request(...) for 1.7s (requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.segment.io', port=443): Max retries exceeded with url: /v1/batch (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e142aceb8>: Failed to establish a new connection: [Errno 111] Connection refused')))
INFO:backoff:Backing off send_request(...) for 4.3s (requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.segment.io', port=443): Max retries exceeded with url: /v1/batch (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e142ac978>: Failed to establish a new connection: [Errno 111] Connection refused')))
INFO:backoff:Backing off send_request(...) for 5.5s (requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.segment.io', port=443): Max retries exceeded with url: /v1/batch (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e142cf0b8>: Failed to establish a new connection: [Errno 111] Connection refused')))
INFO:backoff:Backing off send_request(...) for 24.4s (requests.exceptions.ConnectionError: 
... etc ...

Cause

My /etc/hosts file includes a blacklist of all sorts of advertising and tracking domains, which happens to include api.segment.io. After removing this line from /etc/hosts, ZenML works correctly.

Expected behaviour

While there's certainly a good argument to be made that this is a local problem due to my slightly obscure setup, it would still be good if zenml init could fail gracefully when this domain is unreachable.
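One way to fail gracefully would be to treat telemetry as strictly fire-and-forget. This is a minimal sketch of the idea, not ZenML's actual code; `transport` is a hypothetical stand-in for the Segment client's send call:

```python
import logging

logger = logging.getLogger(__name__)


def send_analytics(event: dict, transport, timeout: float = 1.0) -> bool:
    """Send a telemetry event without ever blocking or breaking `zenml init`.

    `transport` is any callable that posts the event (hypothetical stand-in
    for the Segment client). A short timeout bounds the wait, and any network
    failure (DNS blackhole, refused connection) is logged once at DEBUG and
    swallowed instead of retried with backoff.
    """
    try:
        transport(event, timeout=timeout)
        return True
    except Exception as exc:  # ConnectionError, timeout, etc.
        logger.debug("analytics disabled: %s", exc)
        return False
```

With this shape, a blocked api.segment.io turns into a single debug line rather than a stream of backoff warnings during initialisation.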

Reproduction steps

  1. Add the line api.segment.io to /etc/hosts
  2. Run zenml init in a new project.
  3. Initialisation hangs while it tries to reach the Segment API

ZenML Version

0.5.0rc2

Python Version

3.9

OS Type

Ubuntu (or other Linux Flavor)

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

[BUG] zenml init throws «KeyError: 'HOME'» on Windows

Trying to create a zenml repo on Windows throws the error shown in the Stack Trace section.

Expected behavior
zenml should run on Windows as well.

Stack Trace
from zenml.backends.orchestrator.kubernetes.orchestrator_kubernetes_backend
File "c:\users\frank\anaconda3\envs\zenml\lib\site-packages\zenml\backends\orchestrator\kubernetes\orchestrator_kubernetes_backend.py", line 40, in <module>
DEFAULT_K8S_CONFIG = os.path.join(os.environ["HOME"], '.kube/config')
File "c:\users\frank\anaconda3\envs\zenml\lib\os.py", line 681, in __getitem__
raise KeyError(key) from None
KeyError: 'HOME'

** Context (please complete the following information):**

  • OS: Windows 10
  • Python Version: 3.7

Additional information
Windows uses USERPROFILE instead of HOME. Windows doesn't have HOME and other OSes don't have USERPROFILE, so relying on either one breaks platform independence.

To keep platform independence, you can use expanduser from os.path, like so:

import os.path

# expanduser('~') resolves the home directory on both Windows and POSIX
home_folder = os.path.expanduser('~')

see https://stackoverflow.com/questions/14742064/python-os-environhome-works-on-idle-but-not-in-a-script/33935163
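The suggested fix can be checked directly: `os.path.expanduser('~')` and the equivalent `pathlib.Path.home()` both resolve the home directory without depending on a single environment variable.

```python
import os.path
from pathlib import Path

# Both calls work on Windows (via USERPROFILE) and on POSIX (via HOME),
# so either is a platform-independent replacement for os.environ["HOME"].
home_a = os.path.expanduser('~')
home_b = str(Path.home())
```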

[BUG] zenml for regression

This is my code for predicting housing prices on the Boston Housing dataset:

from zenml.datasources import CSVDatasource
from zenml.exceptions import AlreadyExistsException
from zenml.pipelines import TrainingPipeline
from zenml.repo import Repository
from zenml.steps.evaluator import AgnosticEvaluator
from zenml.steps.preprocesser import StandardPreprocesser
from zenml.steps.split import RandomSplit
from zenml.steps.trainer import TorchFeedForwardTrainer
from zenml.utils import naming_utils

training_pipeline = TrainingPipeline()

try:
    ds = CSVDatasource(name='boston_data',
                       path='/datasets/boston_data.csv')
except AlreadyExistsException:
    ds = Repository.get_instance().get_datasource_by_name(
        'boston_data')
training_pipeline.add_datasource(ds)

# Add a split
training_pipeline.add_split(RandomSplit(
    split_map={'train': 0.7, 'eval': 0.2, 'test': 0.1}))

# Add a preprocessing unit
training_pipeline.add_preprocesser(
    StandardPreprocesser(
        features=['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age',
                  'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat'],
        labels=['medv'],
        overwrite={'medv': {
            'transform': [{'method': 'no_transform', 'parameters': {}}]}}
    ))

# Add a trainer

training_pipeline.add_trainer(TorchFeedForwardTrainer(
    loss='mse',
    optimizer="adam",
    last_activation='relu',
    metrics=['accuracy'],
    epochs=100))

# Add an evaluator
label_name = naming_utils.transformed_label_name('medv')
training_pipeline.add_evaluator(
    AgnosticEvaluator(
        prediction_key='output',
        label_key=label_name,
        slices=[['medv']],
        metrics=['mean_squared_error']))

# Run the pipeline locally
training_pipeline.run()
training_pipeline.evaluate()

Is there something I am missing here? The error when I run this code snippet is
mat1 and mat2 shapes cannot be multiplied (32x13 and 8x64)
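This shape mismatch usually means the network's first layer was built for a different feature count than the dataset provides: the Boston data has 13 features, while the error shows a weight matrix expecting 8 inputs (the width of the 8-feature quickstart data), likely a default in the trainer rather than something derived from your datasource. A minimal illustration of the matrix-multiplication rule behind the message (plain Python, no torch needed):

```python
def can_matmul(a_shape, b_shape):
    """Matrix product A @ B is defined only when A's column count equals
    B's row count."""
    return a_shape[1] == b_shape[0]


# The error "(32x13 and 8x64)": a batch of 32 rows with 13 features hits a
# layer whose weight matrix expects 8 inputs, so the product is undefined.
assert not can_matmul((32, 13), (8, 64))

# A first layer sized for the 13 Boston features would multiply cleanly:
# (32, 13) @ (13, 64) -> (32, 64).
assert can_matmul((32, 13), (13, 64))
```

So the thing to check is whether the trainer lets you set the input width (or infers it from the preprocessed features) rather than using a fixed default.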

[FEATURE] Bootstrapping for AWS

Is your feature request related to a problem? Please describe.
Users need to quickly get set up with using AWS with ZenML.

Describe the solution you'd like
To take bootstrapping of a cloud provider to the next level, a few functionalities can/should be built-in:

  • integration of bootstrapping to the CLI (vs. independent terraform scripts)
  • Inputs to bootstrap existing resources (e.g. an existing K8S cluster, reusing an existing MySQL)
  • Automatic provisioning of the ZenML config
    • would require some additional information to be stored in the config, e.g. k8s details
  • an easy way to share the setup config with other team members (e.g. hot-loading of a zenml config)
  • "saved environments" to choose as backends.

Describe alternatives you've considered
Using native APIs rather than Terraform, but these seem very hard to maintain.

Additional context
Should follow the same logic as the GCP bootstrapping here.

Not able to install zenml on Mojave or Ubuntu

I am trying to install zenml on Mojave but I get the error below. I also tried the same on an EC2 instance and still got an error.

pip install zenml is the command I use to install.

EC2 Python details

  • Python - 3.5.2
  • PIP - 20.3.4
  • Error: ERROR: Could not find a version that satisfies the requirement tfx==0.25.0 (from zenml); ERROR: No matching distribution found for tfx==0.25.0

Mojave Python details

  • Python - 3.9.1
  • PIP - 21.0.1
  • Error: ERROR: Cannot install zenml==0.1.0 and zenml==0.1.1 because these package versions have conflicting dependencies.

The conflict is caused by:
zenml 0.1.1 depends on tfx==0.25.0
zenml 0.1.0 depends on tfx==0.25.0

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

[FEATURE] Implement non-tensorflow specific custom Preprocessing step

Is your feature request related to a problem? Please describe.
Not everyone wants to use Tensorflow Transform, for any number of reasons, whether complex logic or a legacy codebase.

Describe the solution you'd like
A simple Pythonic interface that receives data in a well-understood format and returns the preprocessed result.
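Such an interface could be as small as "rows of plain dicts in, rows of plain dicts out". The names below (`PythonPreprocesser`, `MinMaxScaler`) are a hypothetical sketch of the idea, not existing ZenML API:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class PythonPreprocesser(ABC):
    """Hypothetical non-TF preprocessing step: plain dicts in, plain dicts out."""

    @abstractmethod
    def transform(self, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        ...


class MinMaxScaler(PythonPreprocesser):
    """Scales one numeric column to [0, 1] using pure Python."""

    def __init__(self, column: str):
        self.column = column

    def transform(self, rows):
        values = [r[self.column] for r in rows]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero on constant columns
        return [{**r, self.column: (r[self.column] - lo) / span} for r in rows]
```

Anything that implements `transform` (pandas, sklearn, hand-rolled logic) would plug in, with no TF Transform graph involved.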

Describe alternatives you've considered

Additional context

[FEATURE] Extend BatchInference to Work with any TrainerStep output (model agnostic)

Is your feature request related to a problem? Please describe.
Currently, there is only one evaluator, the TFMAEvaluator. We need to make it work for any upstream Trainer.

Describe the solution you'd like
Exposing an appropriate Step interface should do the trick: users will be able to define their own Predict logic and their own Write logic, to evaluate any model and write to any sort of database.
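Concretely, the step could accept user-supplied predict and write callables, so it never assumes a TF model or a TFRecord sink. This is a hypothetical sketch of that interface, not existing ZenML API:

```python
from typing import Any, Callable, Dict, Iterable, List

# User-supplied hooks: how to score one row, and how to persist the results.
PredictFn = Callable[[Dict[str, Any]], Any]
WriteFn = Callable[[List[Dict[str, Any]]], None]


def batch_inference_step(rows: Iterable[Dict[str, Any]],
                         predict: PredictFn,
                         write: WriteFn) -> List[Dict[str, Any]]:
    """Model-agnostic batch inference: score every row, hand the annotated
    rows to the user's writer, and return them."""
    results = [{**row, "prediction": predict(row)} for row in rows]
    write(results)
    return results
```

A torch model, a sklearn estimator, or a plain function all fit the `predict` slot, and `write` can target SQL, TFRecords, or anything else.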

Describe alternatives you've considered

Additional context

[FEATURE] AWS Orchestrator Backend

Is your feature request related to a problem? Please describe.
No. It is about implementing a quick and easy way of getting ZenML pipelines running on AWS.

Describe the solution you'd like
An EC2 instance boots up, executes the desired ZenML pipeline, and uses S3 as the artifact store and RDS as the metadata store.

[BUG]

Describe the bug
When I run infer_pipeline, the example code fails. Here is the code and the error:

To Reproduce
Steps to reproduce the behavior:

  1. Open JupyterLab on the zenml-0.3.8 Docker image
  2. Run the code in examples/batch_inference/run.py
  3. See error

Expected behavior
It can run successfully.

Screenshots
(screenshot attached)

Stack Trace
If applicable, add the error stack trace to help explain your problem.

Context (please complete the following information):

  • OS: Kubernetes / Docker image (Ubuntu 18.04)
  • Python Version: 3.7.5
  • ZenML Version: 0.3.8

Additional information

[FEATURE] Add integration to deploy models via Seldon Core

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[FEATURE] Is it possible to have a generic, flexible solution for MLOps pipelines independent of the use case, type of model, etc.?

Is your feature request related to a problem? Please describe.
IMHO, having a generic, flexible framework for ML pipelines independent of the use case, type of model, etc. would be the best case, so that it is up to the creativity of the data scientist to implement the pipeline. Also, the input and output of each step in the pipeline could be generic and definable. Is this possible in ZenML?

Describe the solution you'd like

  1. How flexible is ZenML? In the example in the blog, CSVDataSource() is used, but in many production scenarios training data is pulled from a database, a feature store, MongoDB, a folder, or whatever. How can training data be pulled from generic sources?
  2. Can we have custom pipeline components instead of fixed ones, depending on the use case: image processing, NLP, regression on structured data, ARIMA time-series forecasting?
  3. Can the pipeline components be visualized in ZenML?

[BUG] Error displaying widget: model not found in "quickstart" example

Describe the bug
The error Error displaying widget: model not found is raised when running training_pipeline.evaluate(magic=True).
Most of the time, if I change the pipeline several times, this error appears.

To Reproduce

Steps to reproduce the behavior:

  1. I run the pipeline using training_pipeline.add_split(RandomSplit(split_map={'train': 0.7, 'eval': 0.3})) in the split step
  2. Then I change it to training_pipeline.add_split(RandomSplit(split_map={'train': 0.7, 'eval': 0.3, 'test': 0.1})) and run the pipeline again
  3. Then I change it to training_pipeline.add_split(RandomSplit(split_map={'train': 0.7, 'eval': 0.2, 'test': 0.1})) and run the pipeline again; this time it works
  4. Then I run training_pipeline.evaluate(magic=True) and run the script it generates, which raises Error displaying widget: model not found. I try again and the error is raised again the first time.

Expected behavior
The evaluation should display correctly.

Screenshots
(screenshot attached)

Stack Trace
If applicable, add the error stack trace to help explain your problem.

Context (please complete the following information):

  • OS: [e.g. Ubuntu 18.04]
  • Python Version: 3.7
  • ZenML Version: 0.3.6

Additional information
Add any other context about the problem here.

[FEATURE] Bootstrapping for Microsoft Azure

Is your feature request related to a problem? Please describe.
Users need to quickly get set up with using Azure with ZenML.

Describe the solution you'd like
To take bootstrapping of a cloud provider to the next level, a few functionalities can/should be built-in:

  • integration of bootstrapping to the CLI (vs. independent terraform scripts)
  • Inputs to bootstrap existing resources (e.g. an existing K8S cluster, reusing an existing MySQL)
  • Automatic provisioning of the ZenML config
    • would require some additional information to be stored in the config, e.g. k8s details
  • an easy way to share the setup config with other team members (e.g. hot-loading of a zenml config)
  • "saved environments" to choose as backends.

Describe alternatives you've considered
Using native APIs rather than Terraform, but these seem very hard to maintain.

Additional context
Should follow the same logic as the GCP bootstrapping here.

[FEATURE] Decouple configuration from first-class component executions

Is your feature request related to a problem? Please describe.
Currently Pipelines, Steps, Datasources, and Backends, i.e., first-class ZenML components, have the configuration and the post-execution state built into them. For example, to run a Pipeline:

# Configuration
training_pipeline.add_split()
training_pipeline.add_preprocesser()
training_pipeline.add_trainer()

# After this, the state of `training_pipeline` changes from a config type to an execution type implicitly
training_pipeline.run()

# This gets configuration + execution -> state is preserved
pipeline_execution = repo.get_pipeline_by_name('name')

This causes unintended consequences after the pipeline is run: the execution object becomes immutable (in a hidden way) at that point, which gets in the way of fast iteration when working in a Jupyter notebook setting.

Describe the solution you'd like
Due to a variety of reasons, including testability, reduced complexity, and ease of understanding, the community has arrived at the conclusion that the configuration and the execution need to be separate Python objects. That is,

pipeline_execution = training_pipeline.run(name='unique name')

The pipeline_execution and training_pipeline will be different objects, the former being the execution object and the latter the configuration object. The name variable then binds the execution to the configuration for experiment tracking.
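The separation described above can be sketched with two small classes: a mutable config whose `run()` snapshots the current state into a frozen execution record. Class and field names here are hypothetical illustrations, not the proposed ZenML API:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineConfig:
    """Mutable configuration: steps can be added or swapped at any time,
    even after previous runs."""
    steps: List[str] = field(default_factory=list)

    def add_step(self, step: str) -> None:
        self.steps.append(step)

    def run(self, name: str) -> "PipelineExecution":
        # Running snapshots the config into a separate, frozen execution
        # object; the config itself stays freely editable afterwards.
        return PipelineExecution(name=name, steps=tuple(self.steps))


@dataclass(frozen=True)
class PipelineExecution:
    """Immutable record of one run, bound to its configuration by `name`."""
    name: str
    steps: tuple
```

Because each `run()` returns a fresh immutable object, iterating in a notebook never mutates or invalidates earlier executions.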

Describe alternatives you've considered
Maintaining immutable state after the run() and register() calls, but that led to the problems stated above.

[FEATURE] Add Google's Model card toolkit support to collect the model metrics and build the model documentation automatically

Is your feature request related to a problem? Please describe.

Using this toolkit will help zenml users align with industry best practices for model reporting. It will also increase transparency in the workflow and accountability towards the project's stakeholders and users.

Describe the solution you'd like
A suggested pseudo code from @hamzamaiot could look like this:

# pseudo-code
model_card = mct.scaffold_assets()
# fill in your model card details
pipeline.add_model_card(model_card)

Describe alternatives you've considered
Alternatively, one could use Dalex as a model-agnostic tool to build the model card yourself. Here is an example in a Jupyter notebook; alternatively, you can view it here.

[BUG] CSV Datasource error

Describe the bug
I'm not able to get started with the quickstart example pipeline.
Trying to run:
ds = CSVDatasource(name='Pima Indians Diabetes Dataset', path='gs://zenml_quickstart/diabetes.csv')

To Reproduce
I have followed QuickStart steps:

  1. pip install zenml
  2. zenml init
  3. Run the QuickStart example

Screenshots
Screenshot 2021-01-15 at 14 56 56

Stack Trace

KeyError Traceback (most recent call last)
in
1 # Add a datasource. This will automatically track and version it.
----> 2 ds = CSVDatasource(name='Pima Indians Diabetes Dataset', path='gs://zenml_quickstart/diabetes.csv')
3 training_pipeline.add_datasource(ds)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/zenml/core/datasources/csv_datasource.py in __init__(self, name, path, schema, **unused_kwargs)
45 schema (str): optional schema for data to conform to.
46 """
---> 47 super().__init__(name, schema, **unused_kwargs)
48 self.path = path
49

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/zenml/core/datasources/base_datasource.py in __init__(self, name, schema, _id, _source, *args, **kwargs)
61 else:
62 # If none, then this is assumed to be 'new'. Check dupes.
---> 63 all_names = Repository.get_instance().get_datasource_names()
64 if any(d == name for d in all_names):
65 raise Exception(

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/zenml/core/repo/repo.py in get_datasource_names(self)
236 c = yaml_utils.read_yaml(file_path)
237 n.append(c[keys.GlobalKeys.DATASOURCE][keys.DatasourceKeys.NAME])
--> 238 return list(set(n))
239
240 @track(event=GET_DATASOURCES)

KeyError: 'datasource'

Context (please complete the following information):

  • OS: MacOS Big Sur 11.1
  • Python Version: 3.8.2
  • ZenML Version: 0.1.3
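The KeyError suggests one of the config YAML files on disk lacks the 'datasource' key. A defensive rewrite of the failing loop could skip malformed config dicts instead of raising; this is a sketch of the idea, with key names assumed from the trace ('datasource', 'name'), not ZenML's actual code:

```python
from typing import Any, Dict, Iterable, List


def datasource_names(configs: Iterable[Dict[str, Any]]) -> List[str]:
    """Collect unique datasource names from parsed config dicts, silently
    skipping entries that lack the expected keys rather than raising."""
    names = set()
    for config in configs:
        datasource = config.get("datasource")
        if datasource and "name" in datasource:
            names.add(datasource["name"])
    return sorted(names)
```

Using `.get()` instead of indexing means one stale or foreign YAML file in the repo no longer breaks every CSVDatasource constructor.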

[BUG] ValueError: An error occured while trying to apply the transformation

Expected behavior
I try to run the basic example:

from zenml.datasources import CSVDatasource
from zenml.pipelines import TrainingPipeline
from zenml.steps.evaluator import TFMAEvaluator
from zenml.steps.split import RandomSplit
from zenml.steps.preprocesser import StandardPreprocesser
from zenml.steps.trainer import TFFeedForwardTrainer

training_pipeline = TrainingPipeline(name='Quickstart')

# Add a datasource. This will automatically track and version it.
ds = CSVDatasource(name='Pima Indians Diabetes Dataset',
                   path='gs://zenml_quickstart/diabetes.csv')
training_pipeline.add_datasource(ds)

# Add a random 70/30 train-eval split
training_pipeline.add_split(RandomSplit(split_map={'train': 0.7, 
                                                   'eval': 0.2,
                                                   'test': 0.1}))

# StandardPreprocesser() has sane defaults for normal preprocessing methods
training_pipeline.add_preprocesser(
    StandardPreprocesser(
        features=['times_pregnant', 'pgc', 'dbp', 'tst', 
                  'insulin', 'bmi', 'pedigree', 'age'],
        labels=['has_diabetes'],
        overwrite={'has_diabetes': {
            'transform': [{'method': 'no_transform', 'parameters': {}}]}}
    ))

# Add a trainer
training_pipeline.add_trainer(TFFeedForwardTrainer(
    loss='binary_crossentropy',
    last_activation='sigmoid',
    output_units=1,
    metrics=['accuracy'],
    epochs=20))

# Add an evaluator
training_pipeline.add_evaluator(
    TFMAEvaluator(slices=[['has_diabetes']],
                  metrics={'has_diabetes': ['binary_crossentropy',
                                            'binary_accuracy']}))

# Run the pipeline locally
training_pipeline.run()

But I got the following error.

Stack Trace

2021-07-12 12:19:13,646 — zenml.pipelines.training_pipeline — INFO — Datasource Pima Indians Diabetes Dataset has no commits. Creating the first one..
2021-07-12 12:19:13,648 — zenml.pipelines.base_pipeline — INFO — Pipeline 1626085153648 created.

2021-07-12 12:19:13,724 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']

2021-07-12 12:19:15,886 — zenml.datasources.csv_datasource — INFO — Matched 1: ['gs://zenml_quickstart/diabetes.csv']
2021-07-12 12:19:15,892 — zenml.datasources.csv_datasource — INFO — Using header from file: gs://zenml_quickstart/diabetes.csv.
2021-07-12 12:19:16,070 — zenml.datasources.csv_datasource — INFO — Header: ['times_pregnant', 'pgc', 'dbp', 'tst', 'insulin', 'bmi', 'pedigree', 'age', 'has_diabetes'].

2021-07-12 12:19:16,430 — apache_beam.runners.interactive.interactive_environment — WARNING — Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.

2021-07-12 12:19:16,600 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']
2021-07-12 12:19:17,496 — apache_beam.io.tfrecordio — WARNING — Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
2021-07-12 12:19:18,762 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']
2021-07-12 12:19:19,850 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']
2021-07-12 12:19:20,900 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']
2021-07-12 12:19:24,609 — apache_beam.options.pipeline_options — WARNING — Discarding unparseable args: ['-f', '/home/gs/.local/share/jupyter/runtime/kernel-20bd2f0f-6d09-43d9-9eec-d3553c030468.json']
2021-07-12 12:19:26,134 — tensorflow — WARNING — From /home/gs/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/tf_utils.py:266: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
2021-07-12 12:19:26,600 — root — WARNING — This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType]] instead.
2021-07-12 12:19:27,134 — root — WARNING — This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType]] instead.
2021-07-12 12:19:27,168 — tensorflow — WARNING — Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
2021-07-12 12:19:29,328 — tensorflow — WARNING — Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
2021-07-12 12:19:29,393 — tensorflow — WARNING — Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
2021-07-12 12:19:29,458 — tensorflow — WARNING — Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
2021-07-12 12:19:29,495 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,495 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,496 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,496 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,497 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,497 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,539 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,540 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,540 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,541 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,541 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,542 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,584 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,585 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,585 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,586 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>
2021-07-12 12:19:29,587 — apache_beam.typehints.typehints — WARNING — Ignoring send_type hint: <class 'NoneType'>
2021-07-12 12:19:29,588 — apache_beam.typehints.typehints — WARNING — Ignoring return_type hint: <class 'NoneType'>

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/beam/impl.py in _handle_batch(self, batch)
    380       else:
--> 381         result = self._graph_state.callable_get_outputs(feed_dict)
    382         assert len(self._graph_state.outputs_tensor_keys) == len(result)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/saved/saved_transform_io_v2.py in apply_transform_model(self, logical_input_map)
    363     elif self._is_finalized:
--> 364       return self._apply_v2_transform_model_finalized(logical_input_map)
    365     else:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/saved/saved_transform_io_v2.py in _apply_v2_transform_model_finalized(self, logical_input_map)
    288     modified_inputs = self._format_input_map_as_tensors(logical_input_map)
--> 289     return self._wrapped_function_finalized(modified_inputs)
    290 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
   1668     """
-> 1669     return self._call_impl(args, kwargs)
   1670 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _call_impl(self, args, kwargs, cancellation_manager)
   1678           return self._call_with_structured_signature(args, kwargs,
-> 1679                                                       cancellation_manager)
   1680         except TypeError as structured_err:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _call_with_structured_signature(self, args, kwargs, cancellation_manager)
   1761         captured_inputs=self.captured_inputs,
-> 1762         cancellation_manager=cancellation_manager)
   1763 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
   1918       return self._build_call_outputs(self._inference_function.call(
-> 1919           ctx, args, cancellation_manager=cancellation_manager))
   1920     forward_backward = self._select_forward_and_backward_functions(

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
    559               attrs=attrs,
--> 560               ctx=ctx)
    561         else:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:

NotFoundError:  No registered 'Min' OpKernel for 'GPU' devices compatible with node {{node StatefulPartitionedCall/max_6/min_and_max/Max_1}}
	 (OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_INT64, Tidx=DT_INT32, _XlaHasReferenceVars=false, keep_dims=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"
	.  Registered:  device='XLA_CPU_JIT'; Tidx in [DT_INT32, DT_INT64]; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_INT64, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_HALF, DT_UINT32, DT_UINT64]
  device='XLA_GPU_JIT'; Tidx in [DT_INT32, DT_INT64]; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_INT64, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_HALF, DT_UINT32, DT_UINT64]
  device='GPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]

	 [[StatefulPartitionedCall/max_6/min_and_max/Max_1]] [Op:__inference_wrapped_finalized_5475]

Function call stack:
wrapped_finalized


During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/beam/impl.py in process(self, batch, saved_model_dir)
    440 
--> 441     yield self._handle_batch(batch)
    442 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/beam/impl.py in _handle_batch(self, batch)
    387           Fetching the values for the following Tensor keys: {}.""".format(
--> 388               str(e), batch, self._graph_state.outputs_tensor_keys))
    389 

ValueError: An error occured while trying to apply the transformation: " No registered 'Min' OpKernel for 'GPU' devices compatible with node {{node StatefulPartitionedCall/max_6/min_and_max/Max_1}}
	 (OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_INT64, Tidx=DT_INT32, _XlaHasReferenceVars=false, keep_dims=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"
	.  Registered: [... kernel registration list elided; identical to the listing in the NotFoundError above ...]

	 [[StatefulPartitionedCall/max_6/min_and_max/Max_1]] [Op:__inference_wrapped_finalized_5475]

Function call stack:
wrapped_finalized
".
          Batch instances: pyarrow.RecordBatch
age: large_list<item: int64>
  child 0, item: int64
bmi: large_list<item: float>
  child 0, item: float
dbp: large_list<item: int64>
  child 0, item: int64
has_diabetes: large_list<item: int64>
  child 0, item: int64
insulin: large_list<item: int64>
  child 0, item: int64
pedigree: large_list<item: float>
  child 0, item: float
pgc: large_list<item: int64>
  child 0, item: int64
times_pregnant: large_list<item: int64>
  child 0, item: int64
tst: large_list<item: int64>
  child 0, item: int64,
          Fetching the values for the following Tensor keys: {'max_2/min_and_max/Identity_1', 'max_6/min_and_max/Identity_1', 'max_4/min_and_max/Identity', 'max/min_and_max/Identity_1', 'max_6/min_and_max/Identity', 'max_4/min_and_max/Identity_1', 'max_1/min_and_max/Identity', 'max_7/min_and_max/Identity_1', 'max_8/min_and_max/Identity_1', 'max_3/min_and_max/Identity', 'max_1/min_and_max/Identity_1', 'max_3/min_and_max/Identity_1', 'max/min_and_max/Identity', 'max_5/min_and_max/Identity', 'max_2/min_and_max/Identity', 'max_7/min_and_max/Identity', 'max_5/min_and_max/Identity_1', 'max_8/min_and_max/Identity'}.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-16-43278ec7954a> in <module>
      1 # Run the pipeline locally
----> 2 training_pipeline.run()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/zenml/utils/analytics_utils.py in inner_func(*args, **kwargs)
    175     def inner_func(*args, **kwargs):
    176         track_event(event, metadata=metadata)
--> 177         result = func(*args, **kwargs)
    178         return result
    179 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/zenml/pipelines/base_pipeline.py in run(self, backend, metadata_store, artifact_store)
    455             self.register_pipeline(config)
    456 
--> 457         self.run_config(config)
    458 
    459         # After running, pipeline is immutable

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/zenml/pipelines/base_pipeline.py in run_config(self, config)
    376         """
    377         assert issubclass(self.backend.__class__, OrchestratorBaseBackend)
--> 378         self.backend.run(config)
    379 
    380     @track(event=RUN_PIPELINE)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/zenml/backends/orchestrator/base/orchestrator_base_backend.py in run(self, config)
    107         """
    108         tfx_pipeline = self.get_tfx_pipeline(config)
--> 109         ZenMLLocalDagRunner().run(tfx_pipeline)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/zenml/backends/orchestrator/base/zenml_local_orchestrator.py in run(self, pipeline)
     95                     custom_driver_spec=custom_driver_spec)
     96                 logging.info('Component %s is running.', node_id)
---> 97                 component_launcher.launch()
     98                 logging.info('Component %s is finished.', node_id)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/orchestration/portable/launcher.py in launch(self)
    429     if is_execution_needed:
    430       try:
--> 431         executor_output = self._run_executor(execution_info)
    432       except Exception as e:  # pylint: disable=broad-except
    433         execution_output = (

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/orchestration/portable/launcher.py in _run_executor(self, execution_info)
    323     outputs_utils.make_output_dirs(execution_info.output_dict)
    324     try:
--> 325       executor_output = self._executor_operator.run_executor(execution_info)
    326       code = executor_output.execution_result.code
    327       if code != 0:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/orchestration/portable/beam_executor_operator.py in run_executor(self, execution_info)
     84         stateful_working_dir=execution_info.stateful_working_dir)
     85     executor = self._executor_cls(context=context)
---> 86     return python_executor_operator.run_with_executor(execution_info, executor)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/orchestration/portable/python_executor_operator.py in run_with_executor(execution_info, executor)
     64   output_dict = copy.deepcopy(execution_info.output_dict)
     65   result = executor.Do(execution_info.input_dict, output_dict,
---> 66                        execution_info.exec_properties)
     67   if not result:
     68     # If result is not returned from the Do function, then try to

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/components/transform/executor.py in Do(self, input_dict, output_dict, exec_properties)
    490       label_outputs[labels.CACHE_OUTPUT_PATH_LABEL] = cache_output
    491     status_file = 'status_file'  # Unused
--> 492     self.Transform(label_inputs, label_outputs, status_file)
    493     absl.logging.debug('Cleaning up temp path %s on executor success',
    494                        temp_path)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/components/transform/executor.py in Transform(***failed resolving arguments***)
   1025                       output_cache_dir, compute_statistics,
   1026                       per_set_stats_output_paths, materialization_format,
-> 1027                       len(analyze_data_paths))
   1028   # TODO(b/122478841): Writes status to status file.
   1029 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tfx/components/transform/executor.py in _RunBeamImpl(self, analyze_data_list, transform_data_list, preprocessing_fn, stats_options_updater_fn, force_tf_compat_v1, input_dataset_metadata, transform_output_path, raw_examples_data_format, temp_path, input_cache_dir, output_cache_dir, compute_statistics, per_set_stats_output_paths, materialization_format, analyze_paths_count)
   1338                      Executor._RecordBatchToExamples)
   1339                  | 'Materialize[{}]'.format(infix) >> self._WriteExamples(
-> 1340                      materialization_format, dataset.materialize_output_path))
   1341 
   1342     return _Status.OK()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/pipeline.py in __exit__(self, exc_type, exc_val, exc_tb)
    578     try:
    579       if not exc_type:
--> 580         self.result = self.run()
    581         self.result.wait_until_finish()
    582     finally:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/pipeline.py in run(self, test_runner_api)
    557         finally:
    558           shutil.rmtree(tmpdir)
--> 559       return self.runner.run_pipeline(self, self._options)
    560     finally:
    561       shutil.rmtree(self.local_tempdir, ignore_errors=True)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/direct/direct_runner.py in run_pipeline(self, pipeline, options)
    131       runner = BundleBasedDirectRunner()
    132 
--> 133     return runner.run_pipeline(pipeline, options)
    134 
    135 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_pipeline(self, pipeline, options)
    181 
    182     self._latest_run_result = self.run_via_runner_api(
--> 183         pipeline.to_runner_api(default_environment=self._default_environment))
    184     return self._latest_run_result
    185 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_via_runner_api(self, pipeline_proto)
    191     # TODO(pabloem, BEAM-7514): Create a watermark manager (that has access to
    192     #   the teststream (if any), and all the stages).
--> 193     return self.run_stages(stage_context, stages)
    194 
    195   @contextlib.contextmanager

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_stages(self, stage_context, stages)
    357           stage_results = self._run_stage(
    358               runner_execution_context,
--> 359               bundle_context_manager,
    360           )
    361           monitoring_infos_by_stage[stage.name] = (

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_stage(self, runner_execution_context, bundle_context_manager)
    553               input_timers,
    554               expected_timer_output,
--> 555               bundle_manager)
    556 
    557       final_result = merge_results(last_result)

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_bundle(self, runner_execution_context, bundle_context_manager, data_input, data_output, input_timers, expected_timer_output, bundle_manager)
    593 
    594     result, splits = bundle_manager.process_bundle(
--> 595         data_input, data_output, input_timers, expected_timer_output)
    596     # Now we collect all the deferred inputs remaining from bundle execution.
    597     # Deferred inputs can be:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self, inputs, expected_outputs, fired_timers, expected_output_timers, dry_run)
    894             process_bundle_descriptor.id,
    895             cache_tokens=[next(self._cache_token_generator)]))
--> 896     result_future = self._worker_handler.control_conn.push(process_bundle_req)
    897 
    898     split_results = []  # type: List[beam_fn_api_pb2.ProcessBundleSplitResponse]

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/worker_handlers.py in push(self, request)
    378       self._uid_counter += 1
    379       request.instruction_id = 'control_%s' % self._uid_counter
--> 380     response = self.worker.do_instruction(request)
    381     return ControlFuture(request.instruction_id, response)
    382 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/sdk_worker.py in do_instruction(self, request)
    605       # E.g. if register is set, this will call self.register(request.register))
    606       return getattr(self, request_type)(
--> 607           getattr(request, request_type), request.instruction_id)
    608     else:
    609       raise NotImplementedError

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/sdk_worker.py in process_bundle(self, request, instruction_id)
    642         with self.maybe_profile(instruction_id):
    643           delayed_applications, requests_finalization = (
--> 644               bundle_processor.process_bundle(instruction_id))
    645           monitoring_infos = bundle_processor.monitoring_infos()
    646           monitoring_infos.extend(self.state_cache_metrics_fn())

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/bundle_processor.py in process_bundle(self, instruction_id)
    998           elif isinstance(element, beam_fn_api_pb2.Elements.Data):
    999             input_op_by_transform_id[element.transform_id].process_encoded(
-> 1000                 element.data)
   1001 
   1002       # Finish all operations.

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/bundle_processor.py in process_encoded(self, encoded_windowed_values)
    226       decoded_value = self.windowed_coder_impl.decode_from_stream(
    227           input_stream, True)
--> 228       self.output(decoded_value)
    229 
    230   def monitoring_infos(self, transform_id, tag_to_pcollection_id):

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SdfProcessSizedElements.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SdfProcessSizedElements.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process_with_sized_restriction()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.ConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/future/utils/__init__.py in raise_with_traceback(exc, traceback)
    444         if traceback == Ellipsis:
    445             _, _, traceback = sys.exc_info()
--> 446         raise exc.with_traceback(traceback)
    447 
    448 else:

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/beam/impl.py in process(self, batch, saved_model_dir)
    439     assert self._graph_state.saved_model_dir == saved_model_dir
    440 
--> 441     yield self._handle_batch(batch)
    442 
    443 

~/miniconda3/envs/zenml-py36/lib/python3.6/site-packages/tensorflow_transform/beam/impl.py in _handle_batch(self, batch)
    386           Batch instances: {},
    387           Fetching the values for the following Tensor keys: {}.""".format(
--> 388               str(e), batch, self._graph_state.outputs_tensor_keys))
    389 
    390     result.update(self._get_passthrough_data_from_recordbatch(batch))

ValueError: An error occured while trying to apply the transformation: " No registered 'Min' OpKernel for 'GPU' devices compatible with node {{node StatefulPartitionedCall/max_6/min_and_max/Max_1}}
	 (OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_INT64, Tidx=DT_INT32, _XlaHasReferenceVars=false, keep_dims=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"
	.  Registered:  device='XLA_CPU_JIT'; Tidx in [DT_INT32, DT_INT64]; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_INT64, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_HALF, DT_UINT32, DT_UINT64]
  device='XLA_GPU_JIT'; Tidx in [DT_INT32, DT_INT64]; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_INT64, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_HALF, DT_UINT32, DT_UINT64]
  device='GPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='GPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]

	 [[StatefulPartitionedCall/max_6/min_and_max/Max_1]] [Op:__inference_wrapped_finalized_5475]

Function call stack:
wrapped_finalized
".
          Batch instances: pyarrow.RecordBatch
age: large_list<item: int64>
  child 0, item: int64
bmi: large_list<item: float>
  child 0, item: float
dbp: large_list<item: int64>
  child 0, item: int64
has_diabetes: large_list<item: int64>
  child 0, item: int64
insulin: large_list<item: int64>
  child 0, item: int64
pedigree: large_list<item: float>
  child 0, item: float
pgc: large_list<item: int64>
  child 0, item: int64
times_pregnant: large_list<item: int64>
  child 0, item: int64
tst: large_list<item: int64>
  child 0, item: int64,
          Fetching the values for the following Tensor keys: {'max_2/min_and_max/Identity_1', 'max_6/min_and_max/Identity_1', 'max_4/min_and_max/Identity', 'max/min_and_max/Identity_1', 'max_6/min_and_max/Identity', 'max_4/min_and_max/Identity_1', 'max_1/min_and_max/Identity', 'max_7/min_and_max/Identity_1', 'max_8/min_and_max/Identity_1', 'max_3/min_and_max/Identity', 'max_1/min_and_max/Identity_1', 'max_3/min_and_max/Identity_1', 'max/min_and_max/Identity', 'max_5/min_and_max/Identity', 'max_2/min_and_max/Identity', 'max_7/min_and_max/Identity', 'max_5/min_and_max/Identity_1', 'max_8/min_and_max/Identity'}. [while running 'Analyze/ApplySavedModel[Phase0][AnalysisIndex0]/ApplySavedModel']
  • OS:

    • LSB Version: core-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
    • Distributor ID: Pop
    • Description: Pop!_OS 21.04
    • Release: 21.04
    • Codename: hirsute
  • Python Version: Python 3.6.13 :: Anaconda, Inc.

  • ZenML Version: 0.3.8
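The error above says TensorFlow has no GPU `Min`/`Max` kernel registered for `DT_INT64`, which the `tf.transform` `min_and_max` analyzer requests for the integer features in this dataset. One common workaround (an assumption on my part, not a confirmed ZenML fix) is to hide the GPU from TensorFlow for the preprocessing run, so the analyzers are placed on the CPU:

```python
import os

# Sketch of a workaround for the missing GPU 'Min'/'Max' kernel for int64
# tensors: hide all GPUs from TensorFlow so the tf.transform analyzers run
# on the CPU. This must be set before TensorFlow is imported anywhere in
# the process (e.g. at the very top of the pipeline script).
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```

Casting the integer features to `tf.float32` before applying `tft.scale_to_0_1`-style analyzers would be an alternative that keeps the GPU available for training.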

[FEATURE] Add text as a datasource

Is your feature request related to a problem? Please describe.
Currently, there is no easy way to deal with text corpora in ZenML. We want to unlock the NLP angle for people using it.

Describe the solution you'd like
A simple TextDatasource that can read a bunch of text files would be a good start. A solid example of NLP end-to-end would also help.
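The requested datasource could look something like the sketch below. This is purely illustrative (the class name and record schema are assumptions, not part of the ZenML API): it yields one record per text file under a directory.

```python
from pathlib import Path
from typing import Dict, Iterator


class TextDatasource:
    """Hypothetical datasource that yields one record per text file.

    A sketch of what the requested feature could look like; the name,
    constructor, and record schema are illustrative only.
    """

    def __init__(self, base_path: str, extension: str = ".txt") -> None:
        self.base_path = Path(base_path)
        self.extension = extension

    def read(self) -> Iterator[Dict[str, str]]:
        # Emit {"filename": ..., "text": ...} for every matching file,
        # in a deterministic (sorted) order.
        for path in sorted(self.base_path.rglob(f"*{self.extension}")):
            yield {"filename": path.name, "text": path.read_text(encoding="utf-8")}
```

From there, an end-to-end NLP example would plug the records into a tokenizing preprocessing step and a trainer.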

[BUG] training_pipeline.view_anomalies() raises error in the "quickstart" example

Describe the bug
training_pipeline.view_anomalies() raises error in the "quickstart" example

To Reproduce
I tried the "quickstart" example and the anomalies function as well.

I simply ran training_pipeline.view_anomalies() after training_pipeline.run().

Expected behavior
The anomalies view should be displayed, as in the screenshots.

Screenshots
When I change the source code, I get the correct return value (maybe):
image

Stack Trace

TypeError Traceback (most recent call last)
in
----> 1 training_pipeline.view_anomalies()

/opt/conda/lib/python3.8/site-packages/zenml/pipelines/training_pipeline.py in view_anomalies(self, split_name)
338 schema_uri = self.get_artifacts_uri_by_component(
339 GDPComponent.SplitSchema.name)[0]
--> 340 detect_anomalies(stats_uri, schema_uri, split_name)
341
342 def steps_completed(self) -> bool:

/opt/conda/lib/python3.8/site-packages/zenml/utils/post_training/post_training_utils.py in detect_anomalies(stats_uri, schema_uri, split_name)
108 def detect_anomalies(stats_uri: Text, schema_uri: Text, split_name: Text):
109 schema = get_schema_proto(schema_uri)
--> 110 stats = get_statistics_dataset_dict(stats_uri)
111 if split_name not in stats:
112 raise Exception(f'{split_name} split not present!')

/opt/conda/lib/python3.8/site-packages/zenml/utils/post_training/post_training_utils.py in get_statistics_dataset_dict(stats_uri)
69 """Get DatasetFeatureStatisticsList from stats URI"""
70 result = {}
---> 71 for split in os.listdir(stats_uri):
72 stats_path = os.path.join(stats_uri, split, 'stats_tfrecord')
73 serialized_stats = next(

TypeError: listdir: path should be string, bytes, os.PathLike, integer or None, not list

Context (please complete the following information):

  • OS: Ubuntu 18.04
  • Python Version: 3.8.2
  • ZenML Version: 0.3.5

Additional information
Add any other context about the problem here.
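The traceback suggests that `stats_uri` is the raw list returned by `get_artifacts_uri_by_component` rather than a single path (note that `schema_uri` is indexed with `[0]` in `view_anomalies`, but `stats_uri` apparently is not). A minimal defensive sketch of the fix (the helper name is illustrative, not ZenML code):

```python
from typing import List, Union


def normalize_artifact_uri(uri: Union[str, List[str]]) -> str:
    """Unwrap a single-element URI list into a plain path string.

    Sketch of a defensive fix for the os.listdir TypeError above; the
    real fix would be to index the list returned by
    get_artifacts_uri_by_component before calling detect_anomalies.
    """
    if isinstance(uri, list):
        if len(uri) != 1:
            raise ValueError(f"Expected exactly one artifact URI, got {uri!r}")
        return uri[0]
    return uri
```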

[ENHANCEMENT] There should be a way to list all integrations with dependencies

Is your enhancement request related to a problem? Please describe.
It is not easy to see the list of supported integrations. For example: is it zenml[torch] or zenml[pytorch]?

Describe the enhancement you'd like
A way to list all integrations with dependencies they would install.

How do you solve your current problem with the current status-quo of ZenML?
I have to look at the source code.

Additional context
Thank you @JoyZhou for pointing it out
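The enhancement could be backed by a registry mapping each integration extra to the packages it installs, rendered as a plain-text table. The extras and package names below are illustrative, not ZenML's actual definitions:

```python
# Hypothetical registry mapping integration extras (pip install zenml[<name>])
# to the requirements they pull in. Names are examples only.
INTEGRATIONS = {
    "pytorch": ["torch"],
    "gcp": ["google-cloud-storage", "google-cloud-bigquery"],
    "huggingface": ["datasets", "transformers"],
}


def list_integrations() -> str:
    """Render an `integration list`-style table as plain text."""
    lines = ["INTEGRATION    REQUIREMENTS"]
    for name, deps in sorted(INTEGRATIONS.items()):
        lines.append(f"{name:<14} {', '.join(deps)}")
    return "\n".join(lines)
```

A CLI command could simply print this table, answering the zenml[torch]-vs-zenml[pytorch] question at a glance.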

RuntimeError: Tensor for argument #2 'mat1' is on CPU, but expected it to be on GPU (while checking arguments for addmm)

Describe the bug
I am new to ZenML and planning to use it in one of our projects. I tried to run the PyTorch example mentioned here. Please let me know what the issue is. I am confused because it is able to train the model without a CPU/GPU tensor mismatch, but after training I get this error, and I cannot find an option in the APIs to specify whether or not to use the GPU. Please let me know if you need any other details.

To Reproduce
Steps to reproduce the behavior:

  1. pip install zenml[pytorch]
  2. zenml example pull pytorch
  3. cd zenml_examples/pytorch
  4. git init
  5. zenml init

Stack Trace

RuntimeError Traceback (most recent call last)
in
1 # Run the pipeline locally
----> 2 training_pipeline.run()

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/utils/analytics_utils.py in inner_func(*args, **kwargs)
175 def inner_func(*args, **kwargs):
176 track_event(event, metadata=metadata)
--> 177 result = func(*args, **kwargs)
178 return result
179

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/pipelines/base_pipeline.py in run(self, backend, metadata_store, artifact_store)
455 self.register_pipeline(config)
456
--> 457 self.run_config(config)
458
459 # After running, pipeline is immutable

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/pipelines/base_pipeline.py in run_config(self, config)
376 """
377 assert issubclass(self.backend.class, OrchestratorBaseBackend)
--> 378 self.backend.run(config)
379
380 @track(event=RUN_PIPELINE)

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/backends/orchestrator/base/orchestrator_base_backend.py in run(self, config)
107 """
108 tfx_pipeline = self.get_tfx_pipeline(config)
--> 109 ZenMLLocalDagRunner().run(tfx_pipeline)

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/backends/orchestrator/base/zenml_local_orchestrator.py in run(self, pipeline)
95 custom_driver_spec=custom_driver_spec)
96 logging.info('Component %s is running.', node_id)
---> 97 component_launcher.launch()
98 logging.info('Component %s is finished.', node_id)

~/miniconda3/envs/zenml/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py in launch(self)
429 if is_execution_needed:
430 try:
--> 431 executor_output = self._run_executor(execution_info)
432 except Exception as e: # pylint: disable=broad-except
433 execution_output = (

~/miniconda3/envs/zenml/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py in _run_executor(self, execution_info)
323 outputs_utils.make_output_dirs(execution_info.output_dict)
324 try:
--> 325 executor_output = self._executor_operator.run_executor(execution_info)
326 code = executor_output.execution_result.code
327 if code != 0:

~/miniconda3/envs/zenml/lib/python3.7/site-packages/tfx/orchestration/portable/python_executor_operator.py in run_executor(self, execution_info)
139 stateful_working_dir=execution_info.stateful_working_dir)
140 executor = self._executor_cls(context=context)
--> 141 return run_with_executor(execution_info, executor)

~/miniconda3/envs/zenml/lib/python3.7/site-packages/tfx/orchestration/portable/python_executor_operator.py in run_with_executor(execution_info, executor)
64 output_dict = copy.deepcopy(execution_info.output_dict)
65 result = executor.Do(execution_info.input_dict, output_dict,
---> 66 execution_info.exec_properties)
67 if not result:
68 # If result is not returned from the Do function, then try to

~/miniconda3/envs/zenml/lib/python3.7/site-packages/tfx/components/trainer/executor.py in Do(self, input_dict, output_dict, exec_properties)
192 # Train the model
193 absl.logging.info('Training model.')
--> 194 run_fn(fn_args)
195
196 # Note: If trained with multi-node distribution workers, it is the user

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/components/trainer/trainer_module.py in run_fn(fn_args)
30 # Load the step, parameterize it and run it
31 c = load_source_path_class(custom_config.pop(StepKeys.SOURCE))
---> 32 return c(**args).run_fn()

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/steps/trainer/pytorch_trainers/torch_ff_trainer.py in run_fn(self)
211 pattern = self.input_patterns[split]
212 test_dataset = self.input_fn([pattern])
--> 213 test_results = self.test_fn(model, test_dataset)
214 utils.save_test_results(test_results, self.output_patterns[split])
215

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/steps/trainer/pytorch_trainers/torch_ff_trainer.py in test_fn(self, model, dataset)
130 # finally, add the output of the model
131 x_batch = torch.cat([v for v in x.values()], dim=-1)
--> 132 p = model(x_batch)
133
134 if isinstance(p, torch.Tensor):

~/miniconda3/envs/zenml/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),

~/miniconda3/envs/zenml/lib/python3.7/site-packages/zenml/steps/trainer/pytorch_trainers/torch_ff_trainer.py in forward(self, inputs)
42
43 def forward(self, inputs):
---> 44 x = self.relu(self.layer_1(inputs))
45 x = self.batchnorm1(x)
46 x = self.relu(self.layer_2(x))

~/miniconda3/envs/zenml/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
887 result = self._slow_forward(*input, **kwargs)
888 else:
--> 889 result = self.forward(*input, **kwargs)
890 for hook in itertools.chain(
891 _global_forward_hooks.values(),

~/miniconda3/envs/zenml/lib/python3.7/site-packages/torch/nn/modules/linear.py in forward(self, input)
92
93 def forward(self, input: Tensor) -> Tensor:
---> 94 return F.linear(input, self.weight, self.bias)
95
96 def extra_repr(self) -> str:

~/miniconda3/envs/zenml/lib/python3.7/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1751 if has_torch_function_variadic(input, weight):
1752 return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1753 return torch._C._nn.linear(input, weight, bias)
1754
1755

RuntimeError: Tensor for argument #2 'mat1' is on CPU, but expected it to be on GPU (while checking arguments for addmm)

Context (please complete the following information):

  • OS: Ubuntu 20.04
  • Python Version: 3.7.10
  • ZenML Version: 0.3.8
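The traceback shows that `test_fn` builds `x_batch` on the CPU while the trained model lives on the GPU. A hedged sketch of the fix (not the actual ZenML patch): move the batch to the model's device before the forward pass.

```python
import torch


def forward_on_model_device(model: torch.nn.Module, x_batch: torch.Tensor) -> torch.Tensor:
    """Run a forward pass with the batch moved to the model's device.

    Sketch of a fix for the CPU/GPU mismatch above. Assumes the model has
    at least one parameter, from which the device is inferred.
    """
    device = next(model.parameters()).device
    return model(x_batch.to(device))
```

A user-facing device option (e.g. a trainer argument to force CPU) would address the second half of the report: there is currently no API switch to disable GPU usage.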

[BUG] If a class is passed as a step rather than the object, then there is a panic exception

Describe the bug
If a class is passed as a step rather than an instance of the class, there is a panic.

To Reproduce
Try to pass a step class in the Pipeline instead of an object:

training_pipeline.add_trainer(MyTrainer)

instead of:

training_pipeline.add_trainer(MyTrainer())

Expected behavior
A more elegant exception to be thrown.

Screenshots
Provided by @jondoering on the Slack channel.
image

Stack Trace
See above

Context (please complete the following information):

  • OS: 16.04
  • Python Version: [e.g. 3.6.6]
  • ZenML Version: [e.g. 0.1.2]

Additional information
Thank you to @jondoering that made us aware of this problem!
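The "more elegant exception" could come from a small guard when a step is registered. This is only a sketch (the helper name and message are illustrative, not ZenML's API):

```python
import inspect


def ensure_step_instance(step):
    """Raise a clear error when a step class is passed instead of an instance.

    Sketch of the requested friendly exception for
    add_trainer(MyTrainer) vs. add_trainer(MyTrainer()).
    """
    if inspect.isclass(step):
        raise TypeError(
            f"Expected a step instance but got the class {step.__name__!r}. "
            f"Did you mean `add_trainer({step.__name__}())`?"
        )
    return step
```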

[BUG]: Weird _data_type for artifact view

Contact Details [Optional]

No response

What happened?

When trying to load my pandas DataFrame from an ArtifactView, it tries to load env.lib.python3, because the _data_type attribute is set to env.lib.python3.8.site-packages.pandas.core.frame.DataFrame.

When I try to recreate this in a new project, I am not able to reproduce the same problem.

pd.DataFrame
# pandas.core.frame.DataFrame

/env/lib/python3.8/site-packages is the site directory in my conda env, where the root directory seems to be my project directory and env my conda environment location directory.

This didn't occur in the 0.5.1 version.

Reproduction steps

Not able to reproduce it yet

ZenML Version

0.5.2

Python Version

3.8

OS Type

Linux

Relevant log output

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_22166/2978891941.py in <module>
----> 1 x.read()

~/work/hda-solostar/env/lib/python3.8/site-packages/zenml/post_execution/artifact.py in read(self, output_data_type, materializer_class)
     90 
     91         if not output_data_type:
---> 92             output_data_type = source_utils.load_source_path_class(
     93                 self._data_type
     94             )

~/work/hda-solostar/env/lib/python3.8/site-packages/zenml/utils/source_utils.py in load_source_path_class(source)
    248         "load class from current repository state."
    249     )
--> 250     class_ = import_class_by_path(source)
    251     return class_
    252 

~/work/hda-solostar/env/lib/python3.8/site-packages/zenml/utils/source_utils.py in import_class_by_path(class_path)
    232     classname = class_path.split(".")[-1]
    233     modulename = ".".join(class_path.split(".")[0:-1])
--> 234     mod = importlib.import_module(modulename)
    235     return getattr(mod, classname)  # type: ignore[no-any-return]
    236 

~/work/hda-solostar/env/lib/python3.8/importlib/__init__.py in import_module(name, package)
    125                 break
    126             level += 1
--> 127     return _bootstrap._gcd_import(name[level:], package, level)
    128 
    129 

[...]

ModuleNotFoundError: No module named 'env.lib.python3'

Code of Conduct

  • I agree to follow this project's Code of Conduct

[BUG] when running zenml init after installing

Describe the bug
Running zenml init right after pip install zenml fails with a pkg_resources.ContextualVersionConflict (numpy 1.21.2 is installed, but tensorflow requires numpy~=1.19.2).

To Reproduce
Steps to reproduce the behavior:

  1. pyenv global 3.8.2
  2. python -m venv venv
  3. source venv/bin/activate.fish
  4. pip install zenml
  5. zenml init

Expected behavior

Project to be initiated.

Stack Trace

Traceback (most recent call last):
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 584, in _build_master
    ws.require(__requires__)
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 901, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 792, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 1.21.2 (/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages), Requirement.parse('numpy~=1.19.2'), {'tensorflow'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ewanvalentine/work/zenml/venv/bin/zenml", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3253, in <module>
    def _initialize_master_working_set():
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3236, in _call_aside
    f(*args, **kwargs)
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3265, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 586, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 599, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages/pkg_resources/__init__.py", line 792, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 1.21.2 (/Users/ewanvalentine/work/zenml/venv/lib/python3.8/site-packages), Requirement.parse('numpy~=1.19.2'), {'tensorflow'})

Context (please complete the following information):

  • OS: [e.g. Ubuntu 18.04] OSX 11.5.2
  • Python Version: 3.8.2
  • ZenML Version: latest

Additional information

I've tried Python 3.7.3, 3.9.3, and currently 3.8.2. I've also tried to manually resolve some of the version conflicts by installing specific versions, but there seemed to be several, and some I couldn't work out how to resolve.

Advice is much appreciated!

[BUG]: Windows path issues

Contact Details [Optional]

[email protected]

What happened?

Hi, I've run into a couple path-related issues running on Windows.
(1) Creating a local pipeline tries to create files with a colon (:) in the path, because datetime.now().isoformat() contains colons (line 103 in local_dag_runner.py). I worked around it by adding .replace(':', '_'), but there may be more elegant solutions.
(2) Having a materializer in a module within a subpackage structure causes an ImportError, because the module path is built with '\' instead of '.' separating the package names. I worked around it by inserting module_source = module_source.replace("\\", ".") at line 140 in source_utils.py.
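Both workarounds can be captured as small helpers. These are sketches of the reporter's fixes, not ZenML's actual code; the function names are illustrative:

```python
from datetime import datetime


def windows_safe_run_id(now: datetime) -> str:
    """Build a run directory name without characters Windows forbids.

    Sketch of workaround (1): isoformat() timestamps contain colons,
    which are illegal in Windows file names.
    """
    return now.isoformat().replace(":", "_")


def module_path_from_file_path(path: str) -> str:
    """Convert an OS file path into a dotted module path.

    Sketch of workaround (2), normalizing both separator styles so the
    same code works on Windows and POSIX systems.
    """
    return path.replace("\\", "/").rstrip("/").replace("/", ".")
```

Using `os.sep` (or `pathlib.PurePath.parts`) instead of hard-coded separators would be the more portable long-term fix.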

Reproduction steps

  1. Run a pipeline in Windows
  2. Make a custom materializer in a package within a package, run a step that uses it.

ZenML Version

0.5.0

Python Version

3.8

OS Type

Windows

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEATURE] Set ports of servers when showing "statistics" and "Evaluation"

Is your feature request related to a problem? Please describe.
Can users set the ports of the servers that show "statistics" and "Evaluation"? As I use ZenML in Docker, the ports I can use are limited and predefined.

Describe the solution you'd like
Being able to set them myself when initializing the ZenML workspace (as a global setting).

Describe alternatives you've considered

Additional context

[BUG] Doing `git init`, `zenml init` and running causes a `no HEAD` exception -> A commit must be made first.

Describe the bug
ZenML fails with custom step on no commit git repo.

To Reproduce
Steps to reproduce the behavior:

git init
zenml init
python run.py  # run any pipeline with a custom step

Expected behavior
If there is no commit yet, run the pipeline un-pinned instead of failing.
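The check could be as simple as asking git whether HEAD resolves to a commit; on a freshly `git init`-ed repository it does not. A sketch under that assumption (the helper name is illustrative):

```python
import subprocess


def repo_has_commit(repo_path: str = ".") -> bool:
    """Return True when the git repo already has at least one commit.

    Sketch of the guard the pipeline could run before pinning to a
    commit: `git rev-parse --verify HEAD` fails on an empty repository,
    in which case the run could proceed un-pinned.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "--verify", "HEAD"],
        capture_output=True,
    )
    return result.returncode == 0
```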

Screenshots

Stack Trace

Context (please complete the following information):

  • OS: Ubuntu 16.04
  • Python Version: 3.6.6
  • ZenML Version: 0.3.1

Additional information

[ENHANCEMENT] Extend Trainer interface to compute eval results

Is your enhancement request related to a problem? Please describe.
In order to make the Model Agnostic Evaluator work (see #44 ), the Trainer API needs to write out the evaluation results, so that metrics can be calculated downstream.

Describe the enhancement you'd like
An interface that exposes an eval_fn that when implemented writes evaluated results in a Channel to be consumed by a downstream Model Agnostic Evaluator.

How do you solve your current problem with the current status-quo of ZenML?
Currently, evaluation is only supported for TensorFlow models.

[ENHANCEMENT] Refine design for creation of custom datasource

Is your enhancement request related to a problem? Please describe.
Initial feedback shows that creating a custom datasource is hard due to the Step relationship and the need to learn Beam.

Describe the enhancement you'd like
A simple Pythonic interface that exposes a Beam ParDo.
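The interface could let the user write a plain per-element reader function while the framework wires it into Beam internally. This is a shape sketch only (no Beam import needed here; the wrapper would be handed to `beam.FlatMap` or a `beam.DoFn` by the framework):

```python
from typing import Any, Callable, Iterable, Iterator


def make_datasource_fn(read_record: Callable[[Any], Iterable[dict]]):
    """Wrap a plain Python per-element reader into a ParDo-style callable.

    Sketch of the "Pythonic interface that exposes a Beam ParDo": the
    user writes read_record without touching Beam, and the framework
    uses the returned generator function inside the pipeline.
    """
    def process(element: Any) -> Iterator[dict]:
        # One input element may fan out into zero or more records,
        # matching ParDo/FlatMap semantics.
        for record in read_record(element):
            yield record

    return process
```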

How do you solve your current problem with the current status-quo of ZenML?
It's hard to do right now.

Additional context
Add any other context or screenshots about the request here.

[FEATURE] JAX integration

Salam / merhabalar friends,

Is your feature request related to a problem? Please describe.
Let's bring Google's JAX to ZenML!

Describe the solution you'd like
I would like to build an example (what exactly the example is about is TBD at this point) ZenML+JAX project - if it's cloud-ready, all the better (although JAX on GCP has some sharp edges as far as I understood).

Ideally, this could be accomplished by only a NumPy datasource plus a JAX trainer class, but that is a first hunch - let's hope karma does not strike me for this one.

Additional context
Admittedly, my mental model of a lot of ZenML's designs is still from March, so I will need to spend some time going through the newer concepts. When I'm ready and have made progress, I'll submit a PR with the aforementioned JAX example.

Also, I might have to unpin some requirements to get stuff to build from source (Apple M1), due to the present lack of wheel support (think JAX itself, scipy, or pandas) - I'll check in and document what works here (or just go to Linux instead).

What do you think?

[ENHANCEMENT] How do i reconnect the pipelines?

I used zenml init to build the pipelines as suggested in the quickstart; however, the Jupyter notebook connection broke. How can I reconnect to the pipelines that I have built, given that zenml init won't work in this circumstance? Thanks.

[ENHANCEMENT] Distribute requirements.txt into folder wise structure

Is your enhancement request related to a problem? Please describe.
No

Describe the enhancement you'd like
I was going through this file and wanted to check your thoughts on whether we could create a folder-wise, tree-like structure for the different requirements (extensions, providers, etc.) in text files, depending on the use case, and refer to the paths of the requirements files from constants.py. Let me know if you see any challenges. For example:

requirements
-> base.txt
-> providers
    -> gcp.txt
    -> azure.txt
    -> aws.txt
-> datasources
    -> huggingface.txt

How do you solve your current problem with the current status-quo of ZenML?
It would make it easier to update and maintain the versions of dependencies (extensions, base dependencies, etc.), rather than editing the Python file for every version update.

Additional context

[FEATURE] Add support for PostgreSQL data sources

Is your feature request related to a problem? Please describe.
Some ZenML customers consume directly from Postgres.

Describe the solution you'd like
An integration with beam-nuggets would open up a plethora of options.

Describe alternatives you've considered

Additional context

[BUG] zenml installation error on Python 3.9.5

Describe the bug
ZenML fails to install on Python 3.9.5.

To Reproduce
Steps to reproduce the behavior:
pip install zenml

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Stack Trace

ERROR: Cannot install zenml==0.1.0 and zenml==0.1.1 because these package versions have conflicting dependencies.

The conflict is caused by:
    zenml 0.1.1 depends on tfx==0.25.0
    zenml 0.1.0 depends on tfx==0.25.0

Context (please complete the following information):

  • OS: [e.g. windows 10]
  • Python Version: [3.9.5]
  • ZenML Version: latest

Additional information
Add any other context about the problem here.
