"Could not resolve inputs at top-level" issue loading ONNX file about tract HOT 13 CLOSED

sonos commented on May 19, 2024

"Could not resolve inputs at top-level" issue loading ONNX file

Comments (13)

kali commented on May 19, 2024

Thanks for the report ! I'll have a look.

kali commented on May 19, 2024

One thing to note first: the graph uses a "ConstantFill" operator (nodes 20 and 54). This operator is not part of ONNX, so you may want to tweak your PyTorch network somehow.

The error message is bad, though. I'll try to see how it gets reported like this.

This is not documented anywhere (except with --help), but tract has an auditing command line that can help investigate these things. cargo install tract will install the utility, then tract GRU128KeywordSpotter.onnx --pass analyse dump will dump the network. The two ConstantFill nodes show up in the dump.
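
The check described above (spotting nodes whose operator is not a known one) can be sketched in a few lines of Rust. This is only an illustration: NodeInfo, unknown_op_nodes and the known_ops list are hypothetical names made up for this sketch, not part of tract or of any ONNX library, and the caller is assumed to have already extracted (index, op_type) pairs from the model.

```rust
/// Minimal stand-in for one graph node: its position and its ONNX `op_type`.
/// Hypothetical type for this sketch; not a tract or ONNX API.
pub struct NodeInfo<'a> {
    pub index: usize,
    pub op_type: &'a str,
}

/// Return the indices of the nodes whose operator is not listed in `known_ops`.
/// `known_ops` is supplied by the caller; this sketch does not know which
/// operators a given runtime actually supports.
pub fn unknown_op_nodes(nodes: &[NodeInfo], known_ops: &[&str]) -> Vec<usize> {
    nodes
        .iter()
        .filter(|n| !known_ops.contains(&n.op_type))
        .map(|n| n.index)
        .collect()
}
```

Fed with the node list of the model discussed here and a list of standard ONNX operators, a function like this would be expected to report the two ConstantFill nodes.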

kali commented on May 19, 2024

OK, I found out what the other problem is. I have issues dealing with optional inputs and the way they are encoded in ONNX. ONNX uses an empty string as an input specifier to denote a missing input when it needs to skip it, and I have some problems modelling this in tract. Here is a dump of the problematic node:

input: "input.1"
input: "39"
input: "40"
input: "41"
input: ""
input: "20"
output: "42"
output: "43"
op_type: "GRU"
attribute {
  name: "hidden_size"
  type: INT
  i: 128
}
attribute {
  name: "linear_before_reset"
  type: INT
  i: 1
}
doc_string: "/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/rnn.py(179): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(477): _slow_forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(487): __call__\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(316): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(477): _slow_forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(487): __call__\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/jit/__init__.py(252): forward\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py(489): __call__\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/jit/__init__.py(197): get_trace_graph\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(192): _trace_and_get_graph_from_model\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(224): _model_to_graph\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(281): _export\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/utils.py(104): export\n/home/ubuntu/miniconda3/envs/py36/lib/python3.6/site-packages/torch/onnx/__init__.py(27): export\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(96): export\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(561): train\n/home/ubuntu/ELL/ELL/tools/utilities/pythonlibs/audio/training/train_classifier.py(652): <module>\n"

So this GRU is missing its sequence_lens input. As an immediate workaround, you may actually be able to provide this input to the node, depending on your pytorch code, while I figure out a real fix for this recurring issue.
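
To make the encoding concrete, here is a minimal Rust sketch that maps the positional input list of a GRU node onto the operator's input slots. The slot order (X, W, R, B, sequence_lens, initial_h) comes from the ONNX GRU definition; GRU_INPUT_SLOTS and gru_inputs are names invented for this sketch and say nothing about how tract models optional inputs internally.

```rust
/// Input slot names of the ONNX GRU operator, in positional order.
pub const GRU_INPUT_SLOTS: [&str; 6] = ["X", "W", "R", "B", "sequence_lens", "initial_h"];

/// Pair each GRU slot with the tensor name wired to it.
/// An empty string, or a position past the end of `inputs`, means the
/// optional input was not provided, so the slot maps to `None`.
pub fn gru_inputs<'a>(inputs: &[&'a str]) -> Vec<(&'static str, Option<&'a str>)> {
    GRU_INPUT_SLOTS
        .iter()
        .enumerate()
        .map(|(i, slot)| {
            // Treat "" exactly like a missing trailing input.
            let name = inputs.get(i).copied().filter(|s| !s.is_empty());
            (*slot, name)
        })
        .collect()
}
```

Called with the six input strings from the dump above, the fifth slot (sequence_lens) comes back as None while initial_h is still bound to "20". The empty string is only a positional placeholder that lets a later optional input be supplied.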

Note: I'm very interested in helping you get tract to work, as you're obviously doing voice and you're not a colleague :) But tract may not work well out of the box. Support for recurrent operators is relatively recent, with a lot of ongoing work through the kaldi support. I have to warn you that as it stands, the ONNX GRU operator implementation is just "passing the ONNX tests" (and as you've just discovered, they do not cover everything). I have never used it in a real situation. The good news is that the current strategy is to re-express the complicated recurrent operators (LSTM, GRU, ...) for the three frameworks (tf, onnx, and kaldi) in terms of simpler ops (Scan, MatMatMul, Sigmoid, etc.) implemented in tract core. So as soon as the GRU translation is implemented, you will benefit from all the work we are currently doing to implement and optimize kaldi inference.
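
As an illustration of what "re-expressing a GRU in terms of simpler ops" means, here is a sketch of a single GRU time step written with nothing but matrix-vector products, sigmoid and tanh. It follows the ONNX GRU equations for linear_before_reset = 1 (the setting in the node dumped above) with gates ordered z, r, h. It is not tract's translation, which targets Scan and the matrix ops in tract core; GruWeights, affine and gru_step are hypothetical names for this sketch.

```rust
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Compute m * v + b for a row-major matrix `m` (one Vec per output row).
fn affine(m: &[Vec<f32>], v: &[f32], b: &[f32]) -> Vec<f32> {
    m.iter()
        .zip(b)
        .map(|(row, bias)| row.iter().zip(v).map(|(a, x)| a * x).sum::<f32>() + bias)
        .collect()
}

/// Per-gate parameters, indexed 0 = z (update), 1 = r (reset), 2 = h (hidden).
pub struct GruWeights {
    pub w: [Vec<Vec<f32>>; 3],  // input weights, hidden_size x input_size each
    pub r: [Vec<Vec<f32>>; 3],  // recurrence weights, hidden_size x hidden_size each
    pub wb: [Vec<f32>; 3],      // input biases
    pub rb: [Vec<f32>; 3],      // recurrence biases
}

/// One GRU step with linear_before_reset = 1:
///   z  = sigmoid(Wz x + Wbz + Rz h + Rbz)
///   r  = sigmoid(Wr x + Wbr + Rr h + Rbr)
///   h~ = tanh(Wh x + Wbh + r * (Rh h + Rbh))
///   h' = (1 - z) * h~ + z * h
pub fn gru_step(p: &GruWeights, x: &[f32], h_prev: &[f32]) -> Vec<f32> {
    let wx: Vec<Vec<f32>> = (0..3).map(|g| affine(&p.w[g], x, &p.wb[g])).collect();
    let rh: Vec<Vec<f32>> = (0..3).map(|g| affine(&p.r[g], h_prev, &p.rb[g])).collect();
    (0..h_prev.len())
        .map(|i| {
            let z = sigmoid(wx[0][i] + rh[0][i]);
            let r = sigmoid(wx[1][i] + rh[1][i]);
            // The reset gate scales the already-projected hidden state.
            let h_tilde = (wx[2][i] + r * rh[2][i]).tanh();
            (1.0 - z) * h_tilde + z * h_prev[i]
        })
        .collect()
}
```

Running a whole sequence is then a loop that feeds each output back in as h_prev, which is the role a Scan-style op plays in the strategy described above.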

So to summarize:

- In any case, you need to get rid of ConstantFill. As far as I can tell, it is not part of the ONNX spec.
- I will have a look at the optional input support. It may take a while (days) if I don't find something simple that I can do in a few hours.
- You may be able to work around that issue by providing sequence_lens to the GRU operator that does not get it.

This should get us to a network that loads, and we'll take it from there :)

kali commented on May 19, 2024

This may help about ConstantFill : onnx/onnx#1434

kali commented on May 19, 2024

#142

psiphi75 commented on May 19, 2024

Wow, thanks for the investigation and support. Yes, I had realised that the ConstantFill and GRU ops may not be supported. It appears ConstantFill was only ever an experimental feature on ONNX. I'll have to re-export the ONNX from PyTorch.

My other option was to use ELL, but I prefer a Rust implementation. My ultimate goal is to get one of these running on an ARM processor of some sort.

I'll investigate your comments more tomorrow.

kali commented on May 19, 2024

Ha, real-time voice on ARM should be the sweet spot indeed. Happy to see somebody outside the office give it a shot :)

kali commented on May 19, 2024

For your information, I have a POC for ONNX GRU translation here #143

psiphi75 commented on May 19, 2024

Thanks. With the latest tract code changes it's getting further. I've also updated my ONNX model and replaced the ConstantFill with ConstantOfShape. I've also replaced the GRU with an LSTM. I'll keep chipping away at the other issues I have.

kali commented on May 19, 2024

Aha, LSTM :) I have not implemented LSTM translation to core ops yet, just the GRU one, but I will try to do that soon. It should work anyway, just relatively inefficiently. I'll keep you posted.

psiphi75 commented on May 19, 2024

OK, that makes sense. The LSTM took longer than 10 minutes in debug mode, so I cancelled it. In release mode it crashed my computer; seriously, I had to hold the power button for six seconds. But that's not an issue.

kali commented on May 19, 2024

That sounds a bit extreme... I'll be happy to try running your LSTM model if you can share it.

psiphi75 commented on May 19, 2024

I found out that the optimisation step was taking a long time. I'll raise a bug soon, hopefully. I'm currently busy with other things. Thank you.
