
Sparsify [Alpha]

ML model optimization product to accelerate inference


🚨 February 2024: Important Sparsify Update

The Neural Magic team is pausing Sparsify Alpha at this time. We are refocusing efforts around an exciting new project to be announced in the coming months. Thank you for your continued support, and stay tuned!

🚨 October 2023: Important Sparsify Announcement

Given our new focus on enabling sparse large language models (LLMs) to run competitively on CPUs, Sparsify Alpha is undergoing upgrades to focus on fine-tuning and optimizing LLMs. This means we will no longer be providing bug fixes, prioritizing support, or building new features and integrations for non-LLM flows, including the CV and NLP Sparsify pathways.

Neural Magic is excited about these new efforts to build Sparsify into the best LLM fine-tuning and optimization tool on the market over the coming months, and we cannot wait to share more soon. Thanks for your continued support!

🚨 July 2023: Sparsify's next generation is now in alpha as of version 1.6.0!

Sparsify enables you to accelerate inference without sacrificing accuracy by applying state-of-the-art pruning, quantization, and distillation algorithms to neural networks with a simple web application and one-command API calls.

Sparsify empowers you to compress models through two components:

  • Sparsify Cloud - a web application that allows you to create and manage Sparsify Experiments, explore hyperparameters, predict performance, and compare results across both Experiments and deployment scenarios.
  • Sparsify CLI/API - a Python package and GitHub repository that allows you to run Sparsify Experiments locally, sync with the Sparsify Cloud, and integrate them into your workflows.


Quickstart Guide

Interested in test-driving our alpha? Get a sneak peek and influence the product's development process. Thank you in advance for your feedback and interest!

This quickstart details several pathways you can work through. We encourage you to explore one to get Sparsify's full benefits. When you finish the quickstart, sparsifying your models is as easy as:

sparsify.run sparse-transfer --use-case image-classification --data imagenette --optim-level 0.5

1. Install and Setup

1.1 Verify Prerequisites

First, verify that you have the correct software and hardware to run the Sparsify Alpha.

Software

Sparsify is tested on Python 3.8 and 3.10, ONNX 1.5.0-1.12.0 with opset version 11+, and manylinux-compliant systems. Sparsify is not natively supported on Windows or macOS.

Additionally, for installation from PyPI, pip 20.3+ is required.

Hardware

Sparsify requires a CUDA-enabled GPU with CUDA and cuDNN installed in order to sparsify neural networks. We recommend a Linux system with a GPU that has at least 16GB of GPU memory, along with 128GB of RAM and 4 CPU cores. If you are sparsifying a very large model, you may need more than the recommended 128GB of RAM. If you encounter issues setting up your training environment, file a GitHub issue.
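Before proceeding, you can sanity-check the GPU setup from Python. Below is a minimal sketch; it assumes PyTorch is already installed in your training environment:

import torch

# Confirm a CUDA-enabled GPU is visible to PyTorch.
assert torch.cuda.is_available(), "No CUDA device found"

props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}")
print(f"GPU memory: {props.total_memory / 1024**3:.1f} GB")  # recommended: at least 16 GB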

1.2 Create an Account

Creating a new account is a simple, free, one-time step.
An account is required to manage your Experiments and API keys.
Visit Neural Magic's Web App Platform and create an account by entering your email, name, and a unique password. If you already have a Neural Magic account, sign in with your email.


1.3 Install Sparsify

pip is the preferred method for installing Sparsify. We advise creating a fresh virtual environment to avoid dependency issues.

Install with pip using:

pip install sparsify-nightly

1.4 Log in via CLI

Next, with Sparsify installed on your training hardware:

  1. Locate your API key on the homepage of the Sparsify Cloud under the 'Get set up' modal, and copy the command or the API key itself.
  2. Authorize the local CLI to access your account by running the sparsify.login command with your API key:
sparsify.login API_KEY

2. Run an Experiment

Experiments are the core of sparsifying a model. They allow you to apply sparsification algorithms to a model and dataset through the three Experiment types detailed below.

All Experiments are run locally on your training hardware and can be synced with the cloud for further analysis and comparison, using Sparsify's two components:

  • Sparsify Cloud - explore hyperparameters, predict performance, and generate the desired CLI/API command.
  • Sparsify CLI/API - run an experiment.

2.1 One-Shot

Sparsity: ++ | Sparsification Speed: +++++ | Accuracy: +++

One-Shot Experiments quickly sparsify your model post-training, providing a 3-5x speedup with minimal accuracy loss; they are ideal for quick model optimization without retraining your model.

To run a One-Shot Experiment for your model, dataset, and use case, use the following command:

sparsify.run one-shot --use-case USE_CASE --model MODEL --data DATASET --optim-level OPTIM_LEVEL

For example, to sparsify a ResNet-50 model on the ImageNet dataset for image classification, run the following commands:

wget https://public.neuralmagic.com/datasets/cv/classification/imagenet_calibration.tar.gz
mkdir -p ./imagenet_calibration && tar -xzf imagenet_calibration.tar.gz -C ./imagenet_calibration
sparsify.run one-shot --use-case image_classification --model "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none" --data ./imagenet_calibration --optim-level 0.5

Or, to sparsify a BERT model on the SST2 dataset for sentiment analysis, run the following commands:

wget https://public.neuralmagic.com/datasets/nlp/text_classification/sst2_bert_calibration.tar.gz
tar -xzf sst2_bert_calibration.tar.gz
sparsify.run one-shot --use-case text_classification --model "zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/base-none" --data ./sst2_bert_calibration --optim-level 0.5

To dive deeper into One-Shot Experiments, read through the One-Shot Experiment Guide.

Note: One-Shot Experiments currently require the model to be in ONNX format and the dataset to be in NumPy format. More details are provided in the One-Shot Experiment Guide.
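The One-Shot Experiment Guide documents the exact layout Sparsify expects; as a rough, hypothetical sketch, NumPy-format calibration samples can be written as .npz files keyed by the ONNX model's input name (the directory and file names below are illustrative, not the required format):

import os

import numpy as np
import onnx

# Read the model's input name so the sample keys match the ONNX graph.
model = onnx.load("model.onnx")
input_name = model.graph.input[0].name

os.makedirs("calibration_samples", exist_ok=True)
for i in range(100):
    # Placeholder random data; in practice, save real preprocessed inputs.
    sample = np.random.rand(3, 224, 224).astype(np.float32)
    np.savez(f"calibration_samples/sample_{i:04d}.npz", **{input_name: sample})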

2.2 Sparse-Transfer

Sparsity: ++++ | Sparsification Speed: ++++ | Accuracy: +++++

Sparse-Transfer Experiments quickly create a smaller and faster model for your dataset by transferring from a pre-sparsified SparseZoo foundational model, providing a 5-10x speedup with minimal accuracy loss; they are ideal for quickly producing an optimized model without training from scratch.

To run a Sparse-Transfer Experiment for your model (optional), dataset, and use case, run the following command:

sparsify.run sparse-transfer --use-case USE_CASE --model OPTIONAL_MODEL --data DATASET --optim-level OPTIM_LEVEL

For example, to sparse transfer a SparseZoo model to the Imagenette dataset for image classification, run the following command:

sparsify.run sparse-transfer --use-case image_classification --data imagenette --optim-level 0.5

Or, to sparse transfer a SparseZoo model to the SST2 dataset for sentiment analysis, run the following command:

sparsify.run sparse-transfer --use-case text_classification --data sst2 --optim-level 0.5

To dive deeper into Sparse-Transfer Experiments, read through the Sparse-Transfer Experiment Guide.

Note: Sparse-Transfer Experiments require the model to be saved in a PyTorch format corresponding to the underlying integration, such as Ultralytics YOLOv5 or Hugging Face Transformers. Datasets must additionally match the expected format of the underlying integration. More details and exact formats are provided in the Sparse-Transfer Experiment Guide.
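For example, with the Hugging Face Transformers integration, a model directory produced by save_pretrained is the typical starting point. A short sketch, where the model name and output directory are illustrative:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Save a model and tokenizer in the standard Hugging Face directory format.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

model.save_pretrained("my_text_classifier")
tokenizer.save_pretrained("my_text_classifier")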

2.3 Training-Aware

Sparsity: +++++ | Sparsification Speed: ++ | Accuracy: +++++

Training-Aware Experiments sparsify your model during training, providing a 6-12x speedup with minimal accuracy loss; they are ideal for thorough model optimization when the best performance and accuracy are required.

To run a Training-Aware Experiment for your model, dataset, and use case, run the following command:

sparsify.run training-aware --use-case USE_CASE --model OPTIONAL_MODEL --data DATASET --optim-level OPTIM_LEVEL

For example, to sparsify a ResNet-50 model on the Imagenette dataset for image classification, run the following command:

sparsify.run training-aware --use-case image_classification --model "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenette/base-none" --data imagenette --optim-level 0.5

Or, to sparsify a BERT model on the SST2 dataset for sentiment analysis, run the following command:

sparsify.run training-aware --use-case text_classification --model "zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/base-none" --data sst2 --optim-level 0.5

To dive deeper into Training-Aware Experiments, read through the Training-Aware Experiment Guide.

Note that Training-Aware Experiments require the model to be saved in a PyTorch format corresponding to the underlying integration such as Ultralytics YOLOv5 or Hugging Face Transformers. Datasets must additionally match the expected format of the underlying integration. More details and exact formats are provided in the Training-Aware Experiment Guide.

3. Compare Results

Once you have run your Experiment, the results, logs, and deployment files will be saved under the current working directory in the following format:

[EXPERIMENT_TYPE]_[USE_CASE]_{DATE_TIME}
├── deployment
│   ├── model.onnx
│   └── *supporting files*
├── logs
│   └── *logs*
└── training_artifacts
    ├── *training artifacts*
    └── *metrics and results*
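Before benchmarking or deploying, it can be worth confirming that the exported deployment model is a well-formed ONNX file. A quick sketch using the onnx package (the experiment directory name here is illustrative):

import onnx

model = onnx.load("one_shot_image_classification_2023-01-01/deployment/model.onnx")
onnx.checker.check_model(model)  # raises if the model is malformed
print(f"Opset: {model.opset_import[0].version}")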

You can compare accuracy by looking through the metrics printed to the console and the metrics saved in the experiment directory. Additionally, you can use DeepSparse to compare inference performance on your CPU deployment hardware.

Note: In the near future, you will be able to visualize the results in Sparsify Cloud, simulate other scenarios and hyperparameters, compare the results to other Experiments, and package for your deployment scenario.

To run a benchmark on your deployment hardware, use the deepsparse.benchmark command with your original model and the new optimized model. This will run a number of inferences to simulate a real-world scenario and print out the results.

It's as simple as running the following command:

deepsparse.benchmark --model_path MODEL --scenario SCENARIO

For example, to benchmark a dense ResNet-50 model, run the following command:

deepsparse.benchmark --model_path "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenette/base-none" --scenario sync

This can then be compared to the sparsified ResNet-50 model with the following command:

deepsparse.benchmark --model_path "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none" --scenario sync

The output will look similar to the following:

DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.6.0.20230629 COMMUNITY | (fc8b788a) (release) (optimized) (system=avx512, binary=avx512)
deepsparse.benchmark.benchmark_model INFO     deepsparse.engine.Engine:
	onnx_file_path: ./model.onnx
	batch_size: 1
	num_cores: 1
	num_streams: 1
	scheduler: Scheduler.default
	fraction_of_supported_ops: 0.9981
	cpu_avx_type: avx512
	cpu_vnni: False
Original Model Path: ./model.onnx
Batch Size: 1
Scenario: sync
Throughput (items/sec): 134.5611
Latency Mean (ms/batch): 7.4217
Latency Median (ms/batch): 7.4245
Latency Std (ms/batch): 0.0264
Iterations: 1346

See the DeepSparse Benchmarking User Guide for more information on benchmarking.
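You can also script a quick comparison from Python through DeepSparse's engine API. A minimal sketch, assuming deepsparse is installed, the model path points at your deployment folder's model.onnx, and an arbitrary iteration count:

import time

from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

onnx_path = "./deployment/model.onnx"
batch_size = 1

# Compile the model for the local CPU and build dummy inputs matching its graph.
engine = compile_model(onnx_path, batch_size=batch_size)
inputs = generate_random_inputs(onnx_path, batch_size)

iterations = 100
start = time.perf_counter()
for _ in range(iterations):
    engine.run(inputs)
elapsed = time.perf_counter() - start
print(f"Throughput: {iterations * batch_size / elapsed:.2f} items/sec")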

4. Deploy a Model

As an optional step in this quickstart, now that you have an optimized model, you are ready for inference. To get the most inference performance out of your optimized model, we recommend deploying it on Neural Magic's DeepSparse, which is built to get the best performance out of optimized models on CPUs.

DeepSparse Server takes in a task and a model path and enables you to serve models and Pipelines over HTTP.

You can deploy any ONNX model using DeepSparse Server with the following command:

deepsparse.server --task USE_CASE --model_path MODEL_PATH

Where USE_CASE is the use case of your Experiment and MODEL_PATH is the path to the deployment folder from the Experiment.

For example, to deploy a sparsified ResNet-50 model, run the following command:

deepsparse.server --task image_classification --model_path "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
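Once the server is running, clients can send requests over HTTP. The exact route depends on the task and server version; the sketch below assumes the server's default port of 5543 and a /predict endpoint that accepts an image file (the port, route, and form field name are assumptions to verify against the server's startup logs or its /docs page):

import requests

url = "http://localhost:5543/predict"  # assumed default port and route

# Post an example image to the image-classification endpoint.
with open("example.jpg", "rb") as f:
    response = requests.post(url, files=[("request", f)])
print(response.json())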

If you're not ready to deploy, that's okay; congratulations on completing the quickstart!


Resources

Now that you have explored Sparsify [Alpha], here are other related resources.

Feedback and Support

Report UI issues and CLI errors, submit bug reports, and provide general feedback about the product to the Sparsify team via the Neural Magic Slack Channel, or via GitHub Issues. Alpha support is provided through those channels.

Terms and Conditions

Sparsify Alpha is a pre-release version of Sparsify that is still in active development. The product is not yet ready for production use; APIs and UIs are subject to change. There may be bugs in the Alpha version, which we hope to have fixed before Beta and then a general Q3 2023 release. The feedback you provide on quality and usability helps us identify issues, fix them, and make Sparsify even better. This information is used internally by Neural Magic solely for that purpose. It is not shared or used in any other way.

That being said, we are excited to share this release and hear what you think. Thank you in advance for your feedback and interest!


Release History

Official builds are hosted on PyPI.

Additionally, more information can be found via GitHub Releases.

License

The project is licensed under the Apache License Version 2.0.

Community

Contribute

We appreciate contributions to the code, examples, integrations, and documentation as well as bug reports and feature requests! Learn how here.

Join

For user help or questions about Sparsify, sign up or log in to our Neural Magic Community Slack. We are growing the community member by member and happy to see you there. Bugs, feature requests, or additional questions can also be posted to our GitHub Issue Queue.

You can get the latest news, webinar and event invites, research papers, and other ML Performance tidbits by subscribing to the Neural Magic community.

For more general questions about Neural Magic, please fill out this form.

Cite

Find this project useful in your research or other communications? Please consider citing:

@InProceedings{
    pmlr-v119-kurtz20a, 
    title = {Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks}, 
    author = {Kurtz, Mark and Kopinsky, Justin and Gelashvili, Rati and Matveev, Alexander and Carr, John and Goin, Michael and Leiserson, William and Moore, Sage and Nell, Bill and Shavit, Nir and Alistarh, Dan}, 
    booktitle = {Proceedings of the 37th International Conference on Machine Learning}, 
    pages = {5533--5543}, 
    year = {2020}, 
    editor = {Hal Daumé III and Aarti Singh}, 
    volume = {119}, 
    series = {Proceedings of Machine Learning Research}, 
    address = {Virtual}, 
    month = {13--18 Jul}, 
    publisher = {PMLR}, 
    pdf = {http://proceedings.mlr.press/v119/kurtz20a/kurtz20a.pdf},
    url = {http://proceedings.mlr.press/v119/kurtz20a.html}, 
    abstract = {Optimizing convolutional neural networks for fast inference has recently become an extremely active area of research. One of the go-to solutions in this context is weight pruning, which aims to reduce computational and memory footprint by removing large subsets of the connections in a neural network. Surprisingly, much less attention has been given to exploiting sparsity in the activation maps, which tend to be naturally sparse in many settings thanks to the structure of rectified linear (ReLU) activation functions. In this paper, we present an in-depth analysis of methods for maximizing the sparsity of the activations in a trained neural network, and show that, when coupled with an efficient sparse-input convolution algorithm, we can leverage this sparsity for significant performance gains. To induce highly sparse activation maps without accuracy loss, we introduce a new regularization technique, coupled with a new threshold-based sparsification method based on a parameterized activation function called Forced-Activation-Threshold Rectified Linear Unit (FATReLU). We examine the impact of our methods on popular image classification models, showing that most architectures can adapt to significantly sparser activation maps without any accuracy loss. Our second contribution is showing that these compression gains can be translated into inference speedups: we provide a new algorithm to enable fast convolution operations over networks with sparse activations, and show that it can enable significant speedups for end-to-end inference on a range of popular models on the large-scale ImageNet image classification task on modern Intel CPUs, with little or no retraining cost.} 
}
@misc{
    singh2020woodfisher,
    title={WoodFisher: Efficient Second-Order Approximation for Neural Network Compression}, 
    author={Sidak Pal Singh and Dan Alistarh},
    year={2020},
    eprint={2004.14340},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}


sparsify's Issues

Transformers Text Classification Profiling fails and crashes sparsify server (data indices for a Gather operation)

Describe the bug
When trying to perform an initial profile of a Transformers-style text classification model, an error is thrown relating to indices out of bounds for a Gather operation. The profiling stops and the server crashes.

Expected behavior
Completion of profiling for a valid ONNX export of a Hugging Face text classification model.

Environment
Include all relevant environment information:

  1. OS: Debian 11 (bullseye), in Docker on an AWS C6i.8xlarge instance
  2. Python 3.9.12
  3. Sparsify 0.12.1
  4. torch 1.9.1+cpu

To Reproduce

  1. Make a fresh 3-class text classifier based on DistilBERT:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased", num_labels=3)

model.save_pretrained("new_text_classifier")
tokenizer.save_pretrained("new_text_classifier")
  2. Export from Transformers to ONNX:
sparseml.transformers.export_onnx --model_path new_text_classifier/ --sequence_length 128 --task text-classification
  3. Start Sparsify:
sparsify --working-dir=.
  4. Enter the path to the model to upload: new_text_classifier/model.onnx
  5. Hit Run to start profiling

Errors

2022-05-26 19:08:07.628652552 [E:onnxruntime:, sequential_executor.cc:352 Execute] Non-zero status code returned while running Gather node. Name:'Gather_7' Status Message: indices element out of data bounds, idx=5335349968635603386 must be within the inclusive range [-119547,119546]
NM: Fatal error encountered: Non-zero status code returned while running Gather node. Name:'Gather_7' Status Message: indices element out of data bounds, idx=5335349968635603386 must be within the inclusive range [-119547,119546], exiting.



Improve documentation when exporting a recipe

What is the URL, file, or UI containing proposed doc change
Where does one find the original content or where would this change go?

This change would go in the main README of sparsify repository.

What is the current content or situation in question
https://github.com/neuralmagic/sparsify#exporting-a-recipe

What is the proposed change
In the main README, it should be mentioned that TensorFlow or PyTorch must be installed alongside Sparsify to be able to export recipes for those frameworks.

Moreover, the supported range of versions for torch and tensorflow should be mentioned.

Additional context
Related to the new tutorial with NeuralMagic: AICoE/elyra-aidevsecops-tutorial#297

Import error on optimization export page

Describe the bug
The optimization config file displays `No module named 'sparseml.pytorch.recal'` on export.

Expected behavior
Should display the exported recipe

Environment
Include all relevant environment information:

  1. OS: Ubuntu 18.04
  2. Python version: 3.7
  3. Sparsify version or commit hash: 5432d66
  4. ML framework version(s): torch 1.7.1
  5. Other Python package versions: SparseML 0.1.0
  6. Other relevant environment information: n/a

To Reproduce
Exact steps to reproduce the behavior:

  1. create an optimization
  2. Hit Export
  3. Error appears under "optimization config file"


Can't import EpochRangeModifier

Describe the bug
When trying to create a recipe, the recipe cannot be generated because sparseml.pytorch.optim.EpochRangeModifier cannot be imported. From what I understood, it's because modifier.py moved from optim to sparsification in sparseml.

Expected behavior
Generate a recipe :)

Environment
Include all relevant environment information:

  1. OS: Arch
  2. Python version: 3.8
  3. Sparsify version or commit hash: 0.12
  4. ML framework version(s): PyTorch 1.9.1
  5. Other Python package versions: sparseml 0.12
  6. Other relevant environment information: N/A

To Reproduce
Exact steps to reproduce the behavior:

  • Import an ONNX model (it sounds like the initial benchmark doesn't work either; I get "job cancelled").
  • Create a recipe using sparsify.

Errors
If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
It simply says it cannot import EpochRangeModifier from sparseml


Quantization in UI

Hi,
Quantization is not available in the UI; could you provide an approximate ETA? Is there a recommended course of action for performing pruning with the UI and quantization by other means?
Thanks!

PyTorch to ONNX weight names may change on export, causing name mismatch in subsequent training

Hi,
I'm using my own PyTorch model and torch.onnx.export() to obtain an ONNX model for sparsification.

However, the PyTorch-to-ONNX export does not guarantee that weight names are retained, which raises an error when I fine-tune the model with the produced recipe.

The error I get is, for example:
RuntimeError: All supplied parameter names or regex patterns not found.No match for 2425 in found parameters []. Supplied ['2425']
This means that one of the existing layers had its name changed to 2425. I tried the option mentioned in the thread above, but it didn't work.

cannot install in virtual environment python=3.6

I kept running into a dependency error during installation:

ERROR: Cannot install sparsify==0.1.0, sparsify==0.1.1, sparsify==0.2.0, sparsify==0.3.0, sparsify==0.3.1, sparsify==0.4.0, sparsify==0.5.0, sparsify==0.5.1, sparsify==0.6.0, sparsify==0.7.0, sparsify==0.8.0 and sparsify==0.9.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    sparsify 0.9.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.8.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.7.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.6.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.5.1 depends on pysqlite3-binary>=0.4.0
    sparsify 0.5.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.4.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.3.1 depends on pysqlite3-binary>=0.4.0
    sparsify 0.3.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.2.0 depends on pysqlite3-binary>=0.4.0
    sparsify 0.1.1 depends on pysqlite3-binary>=0.4.0
    sparsify 0.1.0 depends on pysqlite3-binary>=0.4.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

Steps to reproduce:

  1. conda create -n py3.6 python=3.6
  2. pip install sparsify

Issue when logging in

Describe the bug
When I try to run sparsify.login api_token, I get the error below.

Expected behavior
Could you help me solve this? Thanks

To Reproduce
Exact steps to reproduce the behavior:

pip install sparsify-nightly
pip install numpy==1.21.6
pip install sparsezoo==1.5.0
sparsify.login api_token
sparsify.run -h

Errors

  File "/anaconda/envs/sparsify-env/bin/sparsify.run", line 5, in <module>
    from sparsify.cli.run import main
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsify/__init__.py", line 18, in <module>
    from .login import *
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsify/login.py", line 38, in <module>
    from sparsezoo.analyze.cli import CONTEXT_SETTINGS
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/__init__.py", line 20, in <module>
    from .model import *
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/model/__init__.py", line 17, in <module>
    from .model import *
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/model/model.py", line 22, in <module>
    from sparsezoo.analytics import sparsezoo_analytics
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/analytics.py", line 154, in <module>
    sparsezoo_analytics = GoogleAnalytics("sparsezoo", sparsezoo_version)
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/analytics.py", line 68, in __init__
    self._disabled = analytics_disabled()
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/analytics.py", line 43, in analytics_disabled
    return env_disabled or is_gdpr_country()
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/utils/gdpr.py", line 93, in is_gdpr_country
    country_code = get_country_code()
  File "/anaconda/envs/sparsify-env/lib/python3.8/site-packages/sparsezoo/utils/gdpr.py", line 79, in get_country_code
    geo = geocoder.ip(ip)
  File "/home/azureuser/.local/lib/python3.8/site-packages/geocoder/api.py", line 498, in ip
    return get(location, provider='ipinfo', **kwargs)
  File "/home/azureuser/.local/lib/python3.8/site-packages/geocoder/api.py", line 198, in get
    return options[provider][method](location, **kwargs)
  File "/home/azureuser/.local/lib/python3.8/site-packages/geocoder/base.py", line 407, in __init__
    self._before_initialize(location, **kwargs)
  File "/home/azureuser/.local/lib/python3.8/site-packages/geocoder/ipinfo.py", line 80, in _before_initialize
    if location.lower() == 'me' or location == '':
AttributeError: 'NoneType' object has no attribute 'lower'

Trying to apply sparsify on 1-layer transformer model

I have tried the Sparsify interface. When hitting Run, the server crashes with the trace below.
Any help is welcome!

Michael

2021-11-13 00:19:27 sparsify.blueprints.jobs INFO retrieved job {'job': {'error': None, 'job_id': 'e1eb1c43eb614143b2c0a31285eca111', 'created': '2021-11-13T00:19:27.071845', 'modified': '2021-11-13T00:19:27.071870', 'type_': 'CreatePerfProfileJobWorker', 'status': 'pending', 'project_id': '8c8f6df6be0d4dd18d15716bdf7ff327', 'progress': None, 'worker_args': {'model_id': 'de66e42d06af4e4786b210c0ee59b0b2', 'profile_id': 'a8005237aef943c3bf4917ce0210bd5f', 'batch_size': 1, 'core_count': 4, 'pruning_estimations': True, 'quantized_estimations': False, 'iterations_per_check': 10, 'warmup_iterations_per_check': 5}}}
10.0.0.4 - - [13/Nov/2021 00:19:27] "GET /api/jobs/e1eb1c43eb614143b2c0a31285eca111 HTTP/1.1" 200 -
2021-11-13 00:19:27 sparsify.workers.projects_profiles INFO running perf profile for project_id 8c8f6df6be0d4dd18d15716bdf7ff327 and model_id de66e42d06af4e4786b210c0ee59b0b2 and profile_id a8005237aef943c3bf4917ce0210bd5f with batch_size:1, core_count:4, pruning_estimations:True, quantized_estimations:False, iterations_per_check:10, warmup_iterations_per_check:5
DeepSparse Engine, Copyright 2021-present / Neuralmagic, Inc. version: 0.8.0 (68df72e1) (release) (optimized) (system=avx512, binary=avx512)
2021-11-13 00:19:27 sparsify.blueprints.jobs INFO getting job e1eb1c43eb614143b2c0a31285eca111
2021-11-13 00:19:27 sparsify.blueprints.jobs INFO retrieved job {'job': {'error': None, 'job_id': 'e1eb1c43eb614143b2c0a31285eca111', 'created': '2021-11-13T00:19:27.071845', 'modified': '2021-11-13T00:19:27.155796', 'type_': 'CreatePerfProfileJobWorker', 'status': 'started', 'project_id': '8c8f6df6be0d4dd18d15716bdf7ff327', 'progress': {'iter_indefinite': False, 'iter_class': 'analysis', 'num_steps': 2, 'step_class': 'baseline_estimation', 'step_index': 0, 'iter_val': 0.0}, 'worker_args': {'model_id': 'de66e42d06af4e4786b210c0ee59b0b2', 'profile_id': 'a8005237aef943c3bf4917ce0210bd5f', 'batch_size': 1, 'core_count': 4, 'pruning_estimations': True, 'quantized_estimations': False, 'iterations_per_check': 10, 'warmup_iterations_per_check': 5}}}
10.0.0.4 - - [13/Nov/2021 00:19:27] "GET /api/jobs/e1eb1c43eb614143b2c0a31285eca111 HTTP/1.1" 200 -
[nm_ort 7fbe5589e700 >ERROR< supported_subgraphs /home/ubuntu/build/nyann/src/onnxruntime_neuralmagic/supported/subgraphs.cc:782] ==== FAILED TO COMPILE ====
Unexpected exception message: bad optional access
DeepSparse Engine, Copyright 2021-present / Neuralmagic, Inc. version: 0.8.0 (68df72e1) (release) (optimized)
Date: 11-13-2021 @ 00:19:27 UTC
OS: Linux linuxvm1 4.15.0-1061-azure #66-Ubuntu SMP Thu Oct 3 02:00:50 UTC 2019
Arch: x86_64
CPU: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Vendor: GenuineIntel
Cores/sockets/threads: [4, 1, 8]
Available cores/sockets/threads: [4, 1, 8]
L1 cache size data/instruction: 32k/32k
L2 cache size: 1Mb
L3 cache size: 35.75Mb
Total memory: 15.6651G
Free memory: 1.88776G

Assertion at /home/ubuntu/build/nyann/src/onnxruntime_neuralmagic/nm_execution_provider.cc:76

Backtrace:
0# wand::detail::abort_prefix(std::ostream&, char const*, char const*, int, bool, bool, unsigned long) in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
1# 0x00007FBE2913F285 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
2# 0x00007FBE291410AE in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
3# 0x00007FBE2940D1C1 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
4# 0x00007FBE29A5A668 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
5# 0x00007FBE29A5D0A2 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
6# 0x00007FBE29A603B9 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
7# 0x00007FBE293EC76C in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
8# 0x00007FBE293F24C3 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
9# 0x00007FBE293AC982 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
10# 0x00007FBE293ACC05 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libonnxruntime.so.1.8.0
11# deepsparse::ort_engine::init(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, int, int, int, wand::safe_type<wand::parallel::use_current_affinity_tag, bool>, std::shared_ptrwand::parallel::scheduler_factory_t) in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/libdeepsparse.so
12# 0x00007FBE5FDDD7EB in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/deepsparse_engine.so
13# 0x00007FBE5FDDDA09 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/deepsparse_engine.so
14# 0x00007FBE5FDFD986 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/deepsparse_engine.so
15# 0x00007FBE5FDEAA09 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/deepsparse/avx512/deepsparse_engine.so
16# 0x00005592DB0A07AE in /home/mbetser/anaconda3/envs/optimize/bin/python
17# _PyObject_MakeTpCall in /home/mbetser/anaconda3/envs/optimize/bin/python
18# 0x00005592DB0CAD6A in /home/mbetser/anaconda3/envs/optimize/bin/python
19# PyObject_Call in /home/mbetser/anaconda3/envs/optimize/bin/python
20# 0x00005592DB040689 in /home/mbetser/anaconda3/envs/optimize/bin/python
21# 0x00005592DB0A06C7 in /home/mbetser/anaconda3/envs/optimize/bin/python
22# 0x00007FBE42973029 in /home/mbetser/anaconda3/envs/optimize/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-38-x86_64-linux-gnu.so
23# _PyObject_MakeTpCall in /home/mbetser/anaconda3/envs/optimize/bin/python

Please email a copy of this stack trace and any additional information to: [email protected]
Aborted

Error while exporting the model

Hello, I followed all the steps in the document for my custom YOLOv5 model and got stuck near the export step. Could you please help me with it?

How should I correctly train the model?

I am trying to sparsify distilbert-base-multilingual-cased.

  • First I converted it to ONNX format with the following command: sparseml.transformers.export_onnx --task mlm --model_path ./distilbert-base-multilingual-cased
  • Then I generated the config file, which I took from the export along with the example training file.
  • I changed the training file to fit my training process (datasets etc.) and tried to load the model again.
  • Unfortunately, I don't know the correct way to do so.
  • If I load the model from Hugging Face again, the names of the layers exported to ONNX and those in the model don't match:
    raise RuntimeError( RuntimeError: All supplied parameter names or regex patterns not found.No match for 958 in found parameters []. Supplied ['958']
  • I don't know how to apply changes directly to the .onnx model; is that possible?

What am I doing wrong? What did I miss?

🚨 Next Gen Sparsify Early Access Waitlist🚨

We are excited to announce that the next generation of Sparsify is underway. You can expect more features and greater simplicity for building sparse models that target optimal performance at scale.

The next generation of Sparsify can help you optimize models from scratch or sparse transfer learn onto your data to target best-in-class inference performance on your deployment hardware.

We will share more in the coming weeks. In the meantime, sign up for our Early Access Waitlist and be the first to try the Sparsify Alpha.

  • Neural Magic Product Team

Error while installing sparsify via pip

While installing Sparsify on Windows 10 via cmd using the command 'pip install sparsify', I encountered errors during installation.

After searching on Google, I found out that pysqlite3 was supported in Python 2 and is now part of the standard library in Python 3, needing no explicit installation via pip. I don't know what to do next, and I am not able to use Sparsify.

Python Version: 3.9.0
Pip Version: 21.1.3

Pretraining-style training-aware sparsification

Is your feature request related to a problem? Please describe.
I would like to try sparsifying several pretrained LLMs (e.g. Mistral 7B, StableLM 3B, etc.). I have created a pretraining corpus (for causal LLMs) on topics I care about. The corpus is relatively small in terms of LLM pretraining, around 10B tokens, but is gigantic in terms of fine-tuning. It seems such a corpus would be ideal for trying this out: https://github.com/neuralmagic/sparsify/blob/main/docs/training-aware-experiment-guide.md.

Describe the solution you'd like
Reading through the experiment guide, I cannot identify an appropriate dataset for causal pretraining data. Would appreciate some pointers on what I can try (let me be your guinea pig!).

How to Implement config file in training pipeline

I have optimized my model via Sparsify and have the config file, but I don't know how to use the config file with the optimization code provided.

I made a Python file where I loaded my PyTorch model and passed it to the 'ScheduledOptimizer' function as the 'MODEL' parameter shown in the 'Code for optimization', but I don't know what the optimizer variable is or what to pass to it. If I leave it as-is, it says
"optimizer" is not defined.

Having a sparsify-cli component to interact with the deployed sparsify server

Is your feature request related to a problem? Please describe.
As a NeuralMagic user,

I would like to have a sparsify-cli component that I can use to interact with the deployed Sparsify server. This would allow me to automate the sparsify step to obtain a recipe (if one does not exist). In this way, the user does not need to use the UI but can integrate that step into a pipeline that retrieves an existing recipe if available and, if not, requests one from the server by providing the required inputs (e.g. the URL to an ONNX model).

Describe the solution you'd like

  • Have a sparsify-cli component that can interact with the server.

Additional context
See: AICoE/elyra-aidevsecops-tutorial#297

Browser freezes when performing optimization (Mac M1 Max)

Describe the bug
After the performance of the model is analyzed, I click on Optimize.
At this moment the page changes to "Optimization" and the browser freezes after 5 seconds.

Expected behavior
Optimization should start but does not.

Environment
Include all relevant environment information:

  1. OS: macOS Monterey 12.6
  2. Python version: 3.10.4
  3. Sparsify version or commit hash: 1.0.0
  4. ML framework version(s): pytorch 1.12.1
  5. Other Python package versions: onnx 1.10.1, onnxruntime 1.12.1, sparsezoo 1.0.0, numpy 1.21.6 (note: DeepSparse not installed because of the Mac M1 chip)
  6. Other relevant environment information: Apple M1 Max, 64GB


📣 Try Sparsify Alpha now! 📣

🚨 July 2023 🚨: Sparsify's next generation is now in alpha as of version 1.6.0-alpha!

Sparsify enables you to accelerate inference without sacrificing accuracy by applying state-of-the-art pruning, quantization, and distillation algorithms to neural networks with a simple web application and one-command API calls.
Want to jump right in? Get started.

Want to kick the tires? Read the Sparsify Quickstart Guide.

If you encounter any issues while using Sparsify Alpha, please file an issue here.

Sparsification recipe for YOLOv7

Hello there,

Thanks for working on Sparsify. These speed-ups are very impressive.

YOLOv7 was released recently, and it is faster and more accurate than other object detectors at the moment. Is there a plan to write a sparsification recipe for it?

I am open to working individually or with others to create a pull request.
