numaproj / numalogic

Collection of operational time series ML models and tools

Home Page: https://numalogic.numaproj.io/

License: Apache License 2.0

Makefile 0.22% Python 99.36% Shell 0.10% Dockerfile 0.32%

machine-learning time-series python deep-learning outlier-detection autoencoders unsupervised-learning variational-autoencoder hacktoberfest

numalogic's Introduction

numaproj - Operational data analytics for Kubernetes

What is numaproj?

numaproj is a collection of Kubernetes-native tools for real-time operational data analytics.

numaflow - Massively parallel, real-time data and stream processing engine
numalogic - ML models and tools for real-time operational data analytics

Project Resources

numaproj GitHub: https://github.com/numaproj
numaproj Slack: Join
numaproj Blog: https://blog.numaproj.io

numalogic's Issues

Support for Loss Functions (Non-Symmetric Loss function)

Summary

Support for Non-Symmetric Loss Functions.

We are ultimately looking to support not only the loss functions available in PyTorch, but also to give users the flexibility to plug in their own custom loss function. This issue is not only about adding more loss functions; it also asks for a better interface that lets users bring their own custom loss function and integrate it with the models we have today.
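
As a rough illustration of what a user-supplied non-symmetric loss could look like, here is a minimal sketch of a quantile (pinball) loss in plain Python. The function name `pinball_loss` and the `quantile` parameter are hypothetical and not part of numalogic; a real integration would presumably be written as a PyTorch callable.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], quantile: float = 0.9) -> float:
    """Mean quantile (pinball) loss; under- and over-prediction are penalized unequally."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction (diff >= 0) is weighted by quantile,
        # over-prediction by (1 - quantile).
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


under = pinball_loss([1.0], [0.0])  # prediction too low
over = pinball_loss([0.0], [1.0])   # prediction too high
```

With `quantile=0.9`, predicting too low costs more than predicting too high by the same amount, which is the asymmetry this issue asks for.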

Add forecast based models

Summary

Add forecast-based anomaly detection models, both uni-channel and multi-channel.

Multi stage docker builds for examples

The current Docker build in the quick start guide takes too long and produces a very large image. This requires more local resources and reduces developer velocity. Introducing multi-stage Docker builds can solve this problem.

Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

Storing optimizer dict only when resume training parameter is True.

Summary

What change needs making?
pipeline.py saves the optimizer dict to the model registry every time, which is redundant. We can introduce a new parameter, "resume_training", and store the optimizer dict only when it is set to true; otherwise the optimizer dict is skipped.
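
The proposal could be sketched roughly as follows. This is a framework-free illustration: `build_artifact` and the dictionary keys are hypothetical names, not the actual pipeline.py code.

```python
from typing import Any, Dict, Optional


def build_artifact(
    model_state: Dict[str, Any],
    optimizer_state: Optional[Dict[str, Any]] = None,
    resume_training: bool = False,
) -> Dict[str, Any]:
    """Assemble the payload to save; optimizer state is kept only for resumable runs."""
    artifact: Dict[str, Any] = {"model_state_dict": model_state}
    if resume_training:
        if optimizer_state is None:
            raise ValueError("optimizer_state is required when resume_training=True")
        artifact["optimizer_state_dict"] = optimizer_state
    return artifact


inference_only = build_artifact({"w": [0.1]}, {"lr": 0.001})
resumable = build_artifact({"w": [0.1]}, {"lr": 0.001}, resume_training=True)
```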

Use Cases

Resume training or retraining

Code errors in Load function when there is no metadata and/or secondary_artifacts in the registry

Describe the bug
While loading the model, the mlflow code errors out when secondary_artifacts and/or metadata are missing from the mlflow registry.

To Reproduce
Steps to reproduce the behavior:

1. Save a model to mlflow registry without metadata and/or secondary_artifacts.
2. Load the same model with the key. The code should error out.

Expected behavior
The model loading should happen seamlessly.
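
One possible shape for a tolerant load is sketched below. It uses a plain dictionary as a stand-in for the registry entry; `LoadedArtifact` and `load_artifact` are hypothetical names, and no mlflow calls are shown.

```python
from typing import Any, Dict, NamedTuple, Optional


class LoadedArtifact(NamedTuple):
    model: Any
    metadata: Dict[str, Any]
    secondary_artifacts: Dict[str, Any]


def load_artifact(stored: Dict[str, Any]) -> Optional[LoadedArtifact]:
    """Return the stored model, defaulting the optional fields to empty dicts."""
    if "model" not in stored:
        return None  # nothing registered under this key
    return LoadedArtifact(
        model=stored["model"],
        # .get() avoids a KeyError when the optional fields were never saved.
        metadata=stored.get("metadata") or {},
        secondary_artifacts=stored.get("secondary_artifacts") or {},
    )


loaded = load_artifact({"model": "primary-model"})
```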

Environment (please complete the following information):

- Numalogic: v0.2.4

Message from the maintainers:

Impacted by this bug? Give it a 👍. We often sort issues this way to know what to prioritize.

unbatch_sequences throws error for certain cases

unbatch_sequences fails and gives an error when:

batchsize = 1
the datasize is smaller than batchsize
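
For reference, here is a minimal sketch of an unbatching routine that tolerates a single window. It assumes the batches are stride-1 sliding windows over the original series, which may not match the actual implementation; `unbatch_windows` is a hypothetical name.

```python
from typing import List, Sequence

Row = Sequence[float]


def unbatch_windows(windows: Sequence[Sequence[Row]]) -> List[Row]:
    """Rebuild a series from stride-1 sliding windows of shape (n_windows, seq_len)."""
    if not windows:
        return []  # no data at all
    # Take the first row of every window, then append the rest of the last window.
    series = [window[0] for window in windows]
    series.extend(windows[-1][1:])
    return series


single = unbatch_windows([[[1.0], [2.0], [3.0]]])
several = unbatch_windows([[[1.0], [2.0]], [[2.0], [3.0]], [[3.0], [4.0]]])
```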

Boto3 dependency for mlflow

Summary

Add boto3 as a dependency alongside mlflow in extras. Mlflow depends on boto3, but boto3 is not installed along with the mlflow package.

Allow mxnet models to be used as a custom model

Summary

Allow users to use their custom mxnet models and use them seamlessly as they would use numalogic's in-built models.

Use Cases

ML engineers and data scientists might want to use their own models written in mxnet for inference.

Docs for SequenceDataset

Docstrings are needed to explain SequenceDataset.

Add benchmark datasets to compare algorithms

Summary

Adding benchmark datasets to compare algorithms, and their performance under different use cases.

What change needs making?
Synthetic or real benchmark datasets

When would you use this?
To compare algorithms

Add MLFlow as an optional dependency

Summary

Add mlflow (not mlflow-skinny) as an optional dependency through poetry.

Use Cases

The user might not choose mlflow as their model registry. In that case, it makes sense to make mlflow an optional package dependency.

Introduce dynamic thresholding technique

Support exponential moving average

Summary

Support exponential moving average as a post process step
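
A minimal sketch of such a post-processing step, in plain Python. The function name and the `alpha` smoothing factor are illustrative choices, not an existing numalogic API.

```python
from typing import Iterable, List


def exponential_moving_average(scores: Iterable[float], alpha: float = 0.5) -> List[float]:
    """Smooth scores with s_t = alpha * x_t + (1 - alpha) * s_{t-1}, seeded with x_0."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in scores:
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1.0 - alpha) * prev)
    return smoothed


# A single spike of 10.0 is damped and then decays over the following steps.
smoothed = exponential_moving_average([0.0, 10.0, 0.0, 0.0], alpha=0.5)
```

A smaller `alpha` smooths more aggressively, at the cost of reacting more slowly to genuine anomalies.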

Support forecasting based models

Summary

Add a blend of forecasting based models for anomaly detection, that can potentially take in additional regressors also.

Improve SparseAEPipeline docs

Add docstrings to the SparseAEPipeline. It follows the paper: https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf

github page for numalogic

We need to have GitHub pages as we have in numaflow.

Ability to train only during given time range

Summary

Give users the ability to train only during a specified time range.

Use Cases

Sometimes, users would like to perform training during non peak hours. They could have the ability to specify the time range when the retraining can happen.
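
The check itself could be as small as the sketch below. `in_training_window` is a hypothetical helper, and the 22:00 to 06:00 window is only an example of non-peak hours.

```python
from datetime import time


def in_training_window(now: time, start: time, end: time) -> bool:
    """True if `now` is inside [start, end); handles windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


allowed = in_training_window(time(23, 30), time(22, 0), time(6, 0))
blocked = in_training_window(time(12, 0), time(22, 0), time(6, 0))
```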

Support automated early stopping

Summary

Early stopping can help reduce overfitting of the models, and an automated way of doing that will be really helpful.
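
One common way to automate this is a patience counter on the monitored loss. The sketch below is framework-free; the class name and the `patience` and `min_delta` parameters are assumptions for illustration only.

```python
from typing import List, Optional


class EarlyStopping:
    """Signal a stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss: float) -> bool:
        """Record one epoch's loss and return True when training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
losses: List[float] = [1.0, 0.8, 0.81, 0.82, 0.5]
stopped_at: Optional[int] = next(
    (i for i, loss in enumerate(losses) if stopper.step(loss)), None
)
```

Here training would stop at epoch index 3, after two consecutive epochs without improvement over the best loss of 0.8.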

Support regressors as anomaly score generators

Summary

Idea from this paper: https://arxiv.org/pdf/1711.00614.pdf
Taking the latent representation in the AE, and doing a regression on the reconstruction can be an alternative to just computing the reconstruction error.

Fix code coverage variability in anomalies.py

Due to the random numbers generated in the AnomalyGenerator class, the coverage varies with every run. Ideally we should set a random seed in the test file for reproducibility.
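
The effect of seeding can be shown with the standard library alone. This sketch does not use AnomalyGenerator; `noisy_series` is a hypothetical stand-in for any code that draws random numbers.

```python
import random
from typing import List


def noisy_series(n: int, seed: int) -> List[float]:
    """Generate n Gaussian samples from a generator seeded for reproducibility."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first = noisy_series(5, seed=42)
second = noisy_series(5, seed=42)
```

Because both calls use the same seed, they follow the same code paths on every run, which keeps coverage stable.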

Improve AnomalyGenerator docs

Problems with the current docs in AnomalyGenerator:

Lacks class usage description
Currently does not follow the docstring format used elsewhere in the project

Do online inference using the models saved in MLFlow registry.

Increase float precision in train logging

Currently, the training log shows only three decimal places. Make this configurable so that more precision can be shown.
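
A configurable precision could look like the following sketch. `format_epoch_log` and its `precision` argument are hypothetical names, not the existing logging code.

```python
def format_epoch_log(epoch: int, loss: float, precision: int = 6) -> str:
    """Format a training log line with a configurable number of decimal places."""
    return f"epoch={epoch} loss={loss:.{precision}f}"


detailed = format_epoch_log(3, 0.000123456)
truncated = format_epoch_log(3, 0.000123456, precision=3)
```

With three decimal places a small loss collapses to 0.000, while six places still show that it is non-zero.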

Add Variational AE based models

Summary

Adding different architectures of variational AE based models to output a unified anomaly score for multivariate time series.

Store train and val loss as an optional metric in registry

Summary

Storing the training loss and validation loss would provide more insight during model lookup.

Validation set error is not shown

Describe the bug
The validation error is not shown when a validation dataset is provided to the autoencoder trainer.

To Reproduce
Provide a validation dataloader to the autoencoder trainer

Expected behavior
Validation scores should show up in addition to the training scores.

Support streaming generation of synthetic time series

Summary

Support streaming generation of time series in synthetic module

Use Cases

This can be useful in generating a streaming example, mostly coupled in a numaflow pipeline

Support memory caching for artifacts

Add documentation for example pipeline: Thresholding documentation.

Summary

Add documentation for thresholding and why it is needed.

Allow tensorflow models to be used as a custom model

Summary

Allow users to use their custom tensorflow models and use them seamlessly as they would use numalogic's in-built models.

Use Cases

ML engineers and data scientists might want to use their own models written in tensorflow for inference.

Test numalogic examples

Changes to code and numalogic-python can cause numalogic examples to potentially break. Need to make sure that the examples run as expected.
Can be done either as a part of unit tests or as github workflows

staticpowertransformer throws error intermittently

Intermittently this test case throws this error, mostly because of the random number that is generated.

=================================== FAILURES ===================================
_________________ TestTransformers.test_staticpowertransformer _________________

self = <numalogic.tests.preprocess.test_transformer.TestTransformers testMethod=test_staticpowertransformer>

    def test_staticpowertransformer(self):
        x = 1 + np.random.randn(5, 3)
        transformer = StaticPowerTransformer(3, add_factor=2)
        x_prime = transformer.transform(x)
    
        assert_almost_equal(np.power(2 + x, 3), x_prime)
        assert_almost_equal(transformer.fit_transform(x), x_prime)
>       assert_almost_equal(transformer.inverse_transform(x_prime), x, decimal=4)
E       AssertionError: 
E       Arrays are not almost equal to 4 decimals
E       
E       x and y nan location mismatch:
E        x: array([[1.9392, 3.3033, 0.2871],
E              [1.5581, 1.6368, 2.0375],
E              [0.3713, 3.0054, 1.0976],...
E        y: array([[ 1.9392,  3.3033,  0.2871],
E              [ 1.5581,  1.6368,  2.0375],
E              [ 0.3713,  3.0054,  1.0976],...

numalogic/tests/preprocess/test_transformer.py:26: AssertionError
=============================== warnings summary ===============================
numalogic/tests/preprocess/test_transformer.py::TestTransformers::test_staticpowertransformer
  /home/runner/work/numalogic/numalogic/numalogic/preprocess/transformer.py:41: RuntimeWarning: invalid value encountered in power
    return np.power(X, 1.0 / self.n) - self.add_factor

Need to offset the random number to make sure that test case always passes.
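
The failure mode and one possible offset can be sketched without numpy. The `forward` and `inverse` helpers below only mirror the formulas visible in the traceback and are not the real StaticPowerTransformer; using `abs()` to bound the samples is one option among several.

```python
import random


def forward(x: float, n: int = 3, add_factor: float = 2.0) -> float:
    """Mirror of the transform: (x + add_factor) ** n."""
    return (x + add_factor) ** n


def inverse(y: float, n: int = 3, add_factor: float = 2.0) -> float:
    """Mirror of the inverse: y ** (1 / n) - add_factor."""
    # Only valid for y >= 0: a negative base with a fractional exponent
    # has no real result, which is what surfaces as NaN in the test.
    return y ** (1.0 / n) - add_factor


rng = random.Random(0)
# abs() keeps every sample >= 1, so x + add_factor is always positive.
samples = [1.0 + abs(rng.gauss(0.0, 1.0)) for _ in range(15)]
round_trip = [inverse(forward(x)) for x in samples]
```

With unbounded Gaussian samples, a draw below -3 makes `x + add_factor` negative, so the cube is negative and the root fails; bounding the samples removes that case.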

Add GRU based autoencoder

GRU based autoencoder will be a good addition, since it is simpler than LSTM, hence is quicker to train.

Support for Loss Functions (Symmetric Loss function)

Summary

Introduce Symmetric Loss functions for the ML model.

We are ultimately looking to support not only the loss functions available in PyTorch, but also to give users the flexibility to plug in their own custom loss function. This issue is not only about adding more loss functions; it also asks for a better interface that lets users bring their own custom loss function and integrate it with the models we have today.

Thresholding techniques as a separate estimator

Thresholding techniques can vary from basic ones like mean + std thresholding, median based methods to more complex ones.
Decoupling the threshold calculation from Autoencoder models can provide more flexibility.
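
A decoupled threshold estimator might follow the familiar fit/predict shape, as in this sketch of the basic mean + std variant. `StdDevThreshold` and its `k` parameter are hypothetical, and the population standard deviation is an arbitrary choice here.

```python
from statistics import fmean, pstdev
from typing import List, Optional, Sequence


class StdDevThreshold:
    """Flag scores above mean + k * std, learned independently of any model."""

    def __init__(self, k: float = 3.0) -> None:
        self.k = k
        self.threshold_: Optional[float] = None

    def fit(self, scores: Sequence[float]) -> "StdDevThreshold":
        """Learn the threshold from reconstruction errors of normal data."""
        self.threshold_ = fmean(scores) + self.k * pstdev(scores)
        return self

    def predict(self, scores: Sequence[float]) -> List[int]:
        """Return 1 for scores above the learned threshold, else 0."""
        if self.threshold_ is None:
            raise RuntimeError("call fit() before predict()")
        return [int(s > self.threshold_) for s in scores]


clf = StdDevThreshold(k=2.0).fit([1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0])
labels = clf.predict([1.5, 2.4, 10.0])
```

Because the estimator only sees scores, it can be swapped for a median-based or more complex method without touching the autoencoder.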

Documentation for overall feature usage

Add docs for most of the modules as well as examples on how to use them.

Detect data drift automatically

Summary

Support detecting data drift in training vs real-time data, automatically using statistical methods to start with.

Use Cases

Data drift is natural, and can help determine when to trigger retraining of the model.
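
As one example of a statistical starting point, the sketch below computes the two-sample Kolmogorov-Smirnov statistic between training data and recent data. The choice of this test, the name `ks_statistic`, and any threshold applied to the result are assumptions; the issue does not prescribe a method.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for value in ref + cur:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


train = [0.1 * i for i in range(100)]
same = ks_statistic(train, train)
shifted = ks_statistic(train, [x + 20.0 for x in train])
```

A value near 0 means the two distributions look alike, while a value near 1 means they barely overlap, which could be used as a retraining trigger.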

Support GPU acceleration for AE

Currently the Autoencoder code does not make use of the GPU, even when a supported device is available. Need to make some changes to AutoencoderPipeline and SparseAEPipeline to move the model, optimizer, etc. to the correct device.

Validation loss getting computed during training

Summary

Compute validation loss, if optional validation data is provided in the fit function. This would be in line with ML best practices, and would help calculate model robustness.

Support same sequence shape for all architectures

Summary

Currently LSTM and Vanilla/Conv1d based models require a different shape of input. This creates a bit of confusion and can be simplified.

What change needs making?
Support same input shape for all autoencoder based architectures. Will need to change the view in one of the types of variants.

Add precommit config

Summary

Add pre-commit config for the project.

Use Cases

Would help make commits cleaner and better.

Docs for SyntheticTSGenerator

SyntheticTSGenerator is responsible for generating synthetic time series. Need documentation for the class and its methods.

Pin protobuf to 3.20

Pin protobuf version to 3.20, due to incompatibility with tensorboard (pytorch-lightning depends on that) until they make tensorboard optional: Lightning-AI/pytorch-lightning#9900

Related issue: Lightning-AI/pytorch-lightning#13159

end to end example of numalogic

Summary

Give an end to end example of how to use numalogic. The example could have the following steps

HTTP endpoint
write some data (time series) to the HTTP endpoint
we print out anomalies in the log
tuning
- tune the config for the model and we see different scores
- bring your own model

Support Dynamodb as a registry

Summary

Dynamodb as an optional registry mechanism.
Support versioning as well.

Use Cases

Can serve as an alternative to mlflow for a user.

Support multi worker Dataset

Currently the StreamDataset class does not support multi worker data loading. Doing this should improve training time.

Fix loss logging to return average of all batches in epoch

Describe the bug
The loss logged while training the autoencoder models does not represent the average loss for the whole epoch. Instead, it currently just returns the last batch's loss.

Expected behavior
The loss returned should be an average of the loss of all the batches in the epoch.
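
The expected behavior amounts to a running average over the batches of an epoch, sketched here in plain Python. `epoch_loss` is a hypothetical helper and assumes equally sized batches.

```python
from typing import Iterable


def epoch_loss(batch_losses: Iterable[float]) -> float:
    """Average the per-batch losses of one epoch instead of reporting the last one."""
    total, count = 0.0, 0
    for loss in batch_losses:
        total += loss
        count += 1
    if count == 0:
        raise ValueError("no batches were processed")
    return total / count


batch_losses = [4.0, 2.0, 1.0, 1.0]
avg = epoch_loss(batch_losses)  # the last batch alone would report 1.0
```

If the final batch is smaller than the others, weighting each batch loss by its batch size would give a more exact epoch average.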

Support model caching using Redis

Summary

Support redis for artifact caching.

Use Cases

This will speed up model loading and inference.
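
The caching pattern itself can be sketched with an in-process dictionary standing in for Redis. `TTLCache`, `get_or_load`, and the 60-second expiry are hypothetical; no Redis client calls are shown.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class TTLCache:
    """Cache-aside helper: return a fresh cached value or call the loader."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        entry = self._store.get(key)
        if entry is not None and self.clock() - entry[0] < self.ttl:
            return entry[1]  # cache hit, the registry is not touched
        value = loader()  # cache miss or expired entry
        self._store[key] = (self.clock(), value)
        return value


calls: List[int] = []


def slow_load() -> str:
    """Stand-in for an expensive registry fetch."""
    calls.append(1)
    return "model-v1"


cache = TTLCache(ttl_seconds=60.0)
first = cache.get_or_load("ae", slow_load)
second = cache.get_or_load("ae", slow_load)
```

The second lookup is served from the cache, so the slow loader runs only once.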

Support Python 3.11

Add workflow to make sure 3.11 build succeeds
Update pyproject.toml file to support python 3.11.x versions

Local file based registry

Summary

Local file based registry by overloading the base ArtifactManager class.

Use Cases

Will be useful in testing out the registry saving/loading pattern for quick start guides.
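
A file-backed registry for quick start guides could be as simple as the sketch below. It is a standalone illustration that does not subclass ArtifactManager; the class name, the pickle format, and the `v<N>.pkl` layout are all assumptions.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


class LocalFileRegistry:
    """Store versioned artifacts as pickle files under <root>/<key>/v<N>.pkl."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def save(self, key: str, artifact: Any) -> int:
        """Write the artifact as the next version and return that version number."""
        key_dir = self.root / key
        key_dir.mkdir(parents=True, exist_ok=True)
        version = len(list(key_dir.glob("v*.pkl"))) + 1
        with open(key_dir / f"v{version}.pkl", "wb") as fh:
            pickle.dump(artifact, fh)
        return version

    def load(self, key: str, version: Optional[int] = None) -> Any:
        """Load the given version (latest by default); None if it does not exist."""
        key_dir = self.root / key
        if not key_dir.is_dir():
            return None
        if version is None:
            version = len(list(key_dir.glob("v*.pkl")))
        path = key_dir / f"v{version}.pkl"
        if not path.exists():
            return None
        with open(path, "rb") as fh:
            return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    registry = LocalFileRegistry(tmp)
    v1 = registry.save("ae", {"weights": [0.1]})
    v2 = registry.save("ae", {"weights": [0.2]})
    latest = registry.load("ae")
    missing = registry.load("unknown")
```

Pickle should only be used with files you trust, which is acceptable for a local quick start but not for shared storage.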
