Comments (13)
Thanks for the report! I'll have a look.
from tract.
So a first thing to notice: the graph uses a "ConstantFill" operator (nodes 20 and 54). This operator is not part of ONNX, so you may want to tweak your PyTorch network somehow.
The error message is bad, though. I'll try to see how it gets reported like this.
This is not documented anywhere (except with --help), but tract has an auditing command line that can help investigate these things. cargo install tract
will install the utility, then tract GRU128KeywordSpotter.onnx --pass analyse dump
will dump the network. The two ConstantFill nodes show.
OK, I found out what the other problem is. I have issues dealing with optional inputs and the way they are encoded in ONNX. ONNX uses an empty string as an input specifier to denote a missing input when it needs to skip it, and I have some problems modelling this in tract. Here is a dump of the problematic node:
input: "input.1"
input: "39"
input: "40"
input: "41"
input: ""
input: "20"
output: "42"
output: "43"
op_type: "GRU"
attribute {
name: "hidden_size"
type: INT
i: 128
}
attribute {
name: "linear_before_reset"
type: INT
i: 1
}
doc_string: "/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/rnn.py(179): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(477): _slow_forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(487): __call__\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(316): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(477): _slow_forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(487): __call__\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/jit/__init__.py(252): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(489): __call__\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/jit/__init__.py(197): get_trace_graph\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(192): _trace_and_get_graph_from_model\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(224): _model_to_graph\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(281): _export\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(104): export\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/__init__.py(27): export\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(96): export\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(561): train\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(652): <module>\n"
So this GRU is missing its sequence_lens input. As an immediate workaround, you may actually be able to provide this input to the node, depending on your PyTorch code, while I figure out a real fix for this recurring issue.
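The empty-string convention is straightforward to model in a loader by mapping input names to optional values. A minimal sketch (the function name and structure are illustrative, not tract's actual code):

```python
def resolve_optional_inputs(input_names):
    """Map ONNX node input names to optional values.

    ONNX encodes a skipped optional input as the empty string "",
    so "" becomes None and every other name is kept as-is.
    """
    return [None if name == "" else name for name in input_names]

# The GRU node above: slot 4 (sequence_lens) was skipped
inputs = ["input.1", "39", "40", "41", "", "20"]
print(resolve_optional_inputs(inputs))
# → ['input.1', '39', '40', '41', None, '20']
```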
Note: I'm very interested in helping you get tract to work, as you're obviously doing voice and you're not a colleague :) But tract may not work greatly out of the box. Support for recurrent operators is relatively recent, with a lot of ongoing work through the Kaldi support. I have to warn you that as it stands, the ONNX GRU operator implementation is just "passing the ONNX tests" (and as you've just discovered, they do not cover everything). I have never used it in real situations. The good news is that the current strategy is to re-express the complicated recurrent operators (LSTM, GRU, ...) for the three frameworks (TensorFlow, ONNX, and Kaldi) in terms of simpler ops (Scan, MatMatMul, Sigmoid, etc.) implemented in tract core. So as soon as the GRU translation is implemented, you will benefit from all the work we are currently doing to implement and optimize Kaldi inference.
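For reference, the recurrence the node above asks for (ONNX GRU semantics with linear_before_reset=1, as shown in the dump) can be sketched for a scalar hidden state. The weight names loosely follow the ONNX spec (W* for input weights, R* for recurrent weights); the function itself is purely illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, Wz, Rz, Wr, Rr, Wh, Rh, Wbh=0.0, Rbh=0.0):
    """One scalar GRU step, ONNX semantics with linear_before_reset=1:
    the recurrent projection Rh*h + Rbh is computed before the reset
    gate is applied to it."""
    z = sigmoid(Wz * x + Rz * h_prev)   # update gate
    r = sigmoid(Wr * x + Rr * h_prev)   # reset gate
    h_cand = math.tanh(Wh * x + r * (Rh * h_prev + Rbh) + Wbh)
    # ONNX blends toward the previous state with z
    return (1.0 - z) * h_cand + z * h_prev
```

With all weights at zero, z = r = 0.5 and the candidate is 0, so a step simply halves the previous hidden state.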
So to summarize:
- In any case, you need to get rid of ConstantFill. As far as I can tell it is not part of the ONNX spec.
- I will have a look at the optional input support; it may take a while (days) if I don't find something simple that I can do in a few hours.
- You may be able to work around that issue by providing the sequence_lens input to the GRU operator that does not get it.
This should get us to a network that loads, and we'll take it from there :)
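The sequence_lens workaround amounts to filling slot 4 of the GRU node's input list, since ONNX marks a skipped input with an empty string. A hypothetical helper (names are illustrative; real graph surgery would go through the onnx Python API on the node's repeated input field):

```python
def provide_optional_input(input_names, position, tensor_name):
    """Fill a skipped ONNX optional input (encoded as "") at `position`."""
    names = list(input_names)
    # trailing optionals may be omitted entirely: pad with "" markers
    while len(names) <= position:
        names.append("")
    if names[position] == "":
        names[position] = tensor_name
    return names

# sequence_lens is the GRU's input number 4 (0-based)
gru_inputs = ["input.1", "39", "40", "41", "", "20"]
print(provide_optional_input(gru_inputs, 4, "seq_lens"))
# → ['input.1', '39', '40', '41', 'seq_lens', '20']
```

The graph would then also need a `seq_lens` initializer tensor holding the actual sequence lengths.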
This may help with ConstantFill: onnx/onnx#1434
Wow, thanks for the investigation and support. Yes, I had realised that the ConstantFill and GRU ops may not be supported. It appears ConstantFill was only ever an experimental feature in ONNX. I'll have to re-export the ONNX from PyTorch.
My other option was to use ELL, but I prefer a Rust implementation. My ultimate goal is to get one of these running on an ARM processor of some sort.
I'll investigate your comments more tomorrow.
Ha, real time voice on ARM should be the sweet spot indeed. Happy to see somebody out of the office give it a shot :)
For your information, I have a POC for ONNX GRU translation here: #143
Thanks. With the latest tract code changes it's getting further. I've also updated my ONNX model and replaced the ConstantFill with ConstantFillOfShape. I've also replaced the GRU with an LSTM. I'll keep chipping away at the other issues I have.
Aha, LSTM :) I have not implemented LSTM translation to core ops yet, just the GRU one, but I will try to do that soon. It should work anyway, just be relatively inefficient. I'll keep you posted.
OK, that makes sense. The LSTM took longer than 10 minutes in debug mode, so I cancelled it. In release mode it crashed my computer, seriously: I had to hold the power button for six seconds. But that's not an issue.
That sounds a bit extreme... I'll be happy to have a try at running your LSTM model if you can share it.
I found out that the optimisation step was taking a long time. I'll raise a bug soon, hopefully. I'm currently busy with other things. Thank you.