
sonos / tract


Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference

License: Other

Rust 93.00% Shell 0.97% Dockerfile 0.04% Assembly 3.12% Python 1.38% C 1.48% Makefile 0.01% BitBake 0.01%
tensorflow onnx neural-networks artificial-intelligence rust-library rust

tract's People

Contributors

acrrd, alaa-saade, aldanor, alexander-camuto, bminixhofer, cchudant, emricksinisonos, feldspath, fgh1999, fredszaq, hrydgard, hubertdelajonquieresonos, jake-shadle, jdureau, jul-sh, julienbaliansonos, kali, liautaud, mathieupoumeyrolsonos, mattalhonte, nehliin, ro99, russellwmy, sinitame, skewballfox, tbluche, teddykoker, tgolsson, timotheeleberresonos, yevgnen


tract's Issues

WebAssembly support

Hi!

At the moment, there is no easy way to reliably run ONNX models in the browser. ONNX.js exists but is apparently unmaintained and lacks support for important operations like Conv1d and LSTM.

The alternative is TensorFlow.js, which does not directly support ONNX, so a model would have to be converted from ONNX to TF and then to a TFJS model, which also does not work at the moment (see onnx/onnx-tensorflow#490).

So there is a bit of a gap in the ecosystem there.

That gap could be filled by compiling tract to WASM and exposing a higher-level API (i.e. load and predict functions) to JavaScript. WebGL support would of course be missing, but that is out of scope.

I did some prototyping today, and got the latest release (tract-onnx = "0.6.3") to work in the browser without any changes. So I think a JS wrapper would not be too hard to make.

I'll start working on this in the next couple of days. Depending on how it goes and if there is interest on your side, this could be merged back into tract at a later point.

It would be great if you could officially support compiling to WASM, and add WASM to the CI (e.g. the current master branch does not compile to WASM because of the memory maps from 99c622a).

Thanks, and please let me know what you think!
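For reference, here is a rough sketch of what such a load-and-predict wrapper could look like, assuming a fixed 1x3x224x224 f32 input; the function names and the error handling are illustrative, not an existing tract or tractjs API:

use wasm_bindgen::prelude::*;
use tract_onnx::prelude::*;

// Load an ONNX model from raw bytes, pin the input shape, run once and return
// the flattened f32 output.
fn run_model(model_bytes: &[u8], input: Vec<f32>) -> TractResult<Vec<f32>> {
    let model = tract_onnx::onnx()
        .model_for_read(&mut std::io::Cursor::new(model_bytes))?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)))?
        .into_optimized()?
        .into_runnable()?;
    let input: Tensor = tract_ndarray::Array::from_shape_vec((1, 3, 224, 224), input)?.into();
    let outputs = model.run(tvec!(input))?;
    Ok(outputs[0].to_array_view::<f32>()?.iter().cloned().collect())
}

// JS-facing entry point; tract errors are surfaced as JS exceptions.
#[wasm_bindgen]
pub fn predict(model_bytes: &[u8], input: Vec<f32>) -> Result<Vec<f32>, JsValue> {
    run_model(model_bytes, input).map_err(|e| JsValue::from_str(&format!("{:?}", e)))
}

A real wrapper would of course keep the loaded plan around between calls instead of re-reading the model bytes on every predict.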

Slow inference relative to Microsoft ONNX Runtime

Running the Squeezenet ONNX model 1000 times using tract-onnx takes ~20 seconds on my machine. Running the same model using the Microsoft ONNX runtime in C++ takes about 9 seconds.

The output results are the same.

At first I thought it was because I have to clone the image on every iteration since the model takes ownership of the input. Why is that btw? Why not allow passing in a reference?

I measured that separately though, and it does not contribute to the overall time.

Here is the Rust code:

use std::time::Instant;

use tract_ndarray::Array;
use tract_onnx::onnx;
use tract_onnx::prelude::*;

fn run_image_classification_model() -> TractResult<()> {
    let now = Instant::now();

    let model_path = "d:/temp/model.onnx";

    let shape = tvec!(1, 3, 224, 224);
    let model = onnx()
        .model_for_path(model_path)?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), shape))?
        .into_optimized()?
        .into_runnable()?;

    eprintln!("Loaded ONNX model");

    let input_tensor_size = 3 * 224 * 224;
    let dummy_data: Vec<f32> = (0..input_tensor_size)
        .map(|i| i as f32 / (input_tensor_size + 1) as f32)
        .collect();

    let image: Tensor = Array::from_shape_vec((1, 3, 224, 224), dummy_data)?.into();

    // Run the model 1000 times; the input is cloned on every call because run() takes ownership.
    let mut result = model.run(tvec!(image.clone()))?;
    for _ in 0..999 {
        result = model.run(tvec!(image.clone()))?;
    }

    println!("Elapsed time: {}", now.elapsed().as_secs());

    for (i, score) in result[0].to_array_view::<f32>()?.iter().take(5).enumerate() {
        println!("Score for class [{}] = {}", i, score);
    }

    Ok(())
}

Here is the C++ code:

// HeppOnnxCpp.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include <onnxruntime_c_api.h>
#include <cstdio>
#include <vector>
#include <cassert>
#include <time.h>
#include <iostream>
#include <iomanip>
using namespace std;

const OrtApi* g_ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);

//*****************************************************************************
// helper function to check for status
void CheckStatus(OrtStatus* status)
{
    if (status != NULL) {
        const char* msg = g_ort->GetErrorMessage(status);
        fprintf(stderr, "%s\n", msg);
        g_ort->ReleaseStatus(status);
        exit(1);
    }
}
int main(int argc, char* argv[]) {
    time_t start, end;
    time(&start);
    //*************************************************************************
    // initialize environment... one environment per process
    // environment maintains thread pools and other state info
    OrtEnv* env;
    CheckStatus(g_ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "test", &env));

    // initialize session options if needed
    OrtSessionOptions* session_options;
    CheckStatus(g_ort->CreateSessionOptions(&session_options));
    g_ort->SetIntraOpNumThreads(session_options, 1);

    // Sets graph optimization level
    g_ort->SetSessionGraphOptimizationLevel(session_options, ORT_ENABLE_BASIC);

    // Optionally add more execution providers via session_options
    // E.g. for CUDA include cuda_provider_factory.h and uncomment the following line:
    // OrtSessionOptionsAppendExecutionProvider_CUDA(sessionOptions, 0);

    //*************************************************************************
    // create session and load model into memory
    // using squeezenet version 1.3
    // URL = https://github.com/onnx/models/tree/master/squeezenet
    OrtSession* session;
#ifdef _WIN32
    const wchar_t* model_path = L"d:/temp/model.onnx";
#else
    const char* model_path = "d:/temp/model.onnx";
#endif

    printf("Using Onnxruntime C API\n");
    CheckStatus(g_ort->CreateSession(env, model_path, session_options, &session));

    //*************************************************************************
    // print model input layer (node names, types, shape etc.)
    size_t num_input_nodes;
    OrtStatus* status;
    OrtAllocator* allocator;
    CheckStatus(g_ort->GetAllocatorWithDefaultOptions(&allocator));

    // print number of model input nodes
    status = g_ort->SessionGetInputCount(session, &num_input_nodes);
    std::vector<const char*> input_node_names(num_input_nodes);
    std::vector<int64_t> input_node_dims;  // simplify... this model has only 1 input node {1, 3, 224, 224}.
                                           // Otherwise need vector<vector<>>

    printf("Number of inputs = %zu\n", num_input_nodes);

    // iterate over all input nodes
    for (size_t i = 0; i < num_input_nodes; i++) {
        // print input node names
        char* input_name;
        status = g_ort->SessionGetInputName(session, i, allocator, &input_name);
        printf("Input %zu : name=%s\n", i, input_name);
        input_node_names[i] = input_name;

        // print input node types
        OrtTypeInfo* typeinfo;
        status = g_ort->SessionGetInputTypeInfo(session, i, &typeinfo);
        const OrtTensorTypeAndShapeInfo* tensor_info;
        CheckStatus(g_ort->CastTypeInfoToTensorInfo(typeinfo, &tensor_info));
        ONNXTensorElementDataType type;
        CheckStatus(g_ort->GetTensorElementType(tensor_info, &type));
        printf("Input %zu : type=%d\n", i, type);

        // print input shapes/dims
        size_t num_dims;
        CheckStatus(g_ort->GetDimensionsCount(tensor_info, &num_dims));
        printf("Input %zu : num_dims=%zu\n", i, num_dims);
        input_node_dims.resize(num_dims);
        g_ort->GetDimensions(tensor_info, (int64_t*)input_node_dims.data(), num_dims);
        for (size_t j = 0; j < num_dims; j++)
            printf("Input %zu : dim %zu=%jd\n", i, j, input_node_dims[j]);

        g_ort->ReleaseTypeInfo(typeinfo);
    }

    // Results should be...
    // Number of inputs = 1
    // Input 0 : name = data_0
    // Input 0 : type = 1
    // Input 0 : num_dims = 4
    // Input 0 : dim 0 = 1
    // Input 0 : dim 1 = 3
    // Input 0 : dim 2 = 224
    // Input 0 : dim 3 = 224

    //*************************************************************************
    // Similar operations to get output node information.
    // Use OrtSessionGetOutputCount(), OrtSessionGetOutputName()
    // OrtSessionGetOutputTypeInfo() as shown above.

    //*************************************************************************
    // Score the model using sample data, and inspect values

    size_t input_tensor_size = 224 * 224 * 3;  // simplify ... using known dim values to calculate size
                                               // use OrtGetTensorShapeElementCount() to get official size!

    std::vector<float> input_tensor_values(input_tensor_size);
    std::vector<const char*> output_node_names = { "softmaxout_1" };

    // initialize input data with values in [0.0, 1.0]
    for (size_t i = 0; i < input_tensor_size; i++)
        input_tensor_values[i] = (float)i / (input_tensor_size + 1);

    // create input tensor object from data values
    OrtMemoryInfo* memory_info;
    CheckStatus(g_ort->CreateCpuMemoryInfo(OrtArenaAllocator, OrtMemTypeDefault, &memory_info));
    OrtValue* input_tensor = NULL;
    CheckStatus(g_ort->CreateTensorWithDataAsOrtValue(memory_info, input_tensor_values.data(), input_tensor_size * sizeof(float), input_node_dims.data(), 4, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT, &input_tensor));
    int is_tensor;
    CheckStatus(g_ort->IsTensor(input_tensor, &is_tensor));
    assert(is_tensor);
    g_ort->ReleaseMemoryInfo(memory_info);

    // score model & input tensor, get back output tensor
    OrtValue* output_tensor = NULL;
    for (int i = 0; i < 1000; i++) {
        CheckStatus(g_ort->Run(session, NULL, input_node_names.data(), (const OrtValue* const*)&input_tensor, 1, output_node_names.data(), 1, &output_tensor));
    }
    CheckStatus(g_ort->IsTensor(output_tensor, &is_tensor));
    assert(is_tensor);

    // Get pointer to output tensor float values
    float* floatarr;
    CheckStatus(g_ort->GetTensorMutableData(output_tensor, (void**)&floatarr));
    assert(std::abs(floatarr[0] - 0.000045) < 1e-6);

    time(&end);
    // score the model, and print scores for first 5 classes
    for (int i = 0; i < 5; i++)
        printf("Score for class [%d] =  %f\n", i, floatarr[i]);

    // Results should be as below...
    // Score for class[0] = 0.000045
    // Score for class[1] = 0.003846
    // Score for class[2] = 0.000125
    // Score for class[3] = 0.001180
    // Score for class[4] = 0.001317

    g_ort->ReleaseValue(output_tensor);
    g_ort->ReleaseValue(input_tensor);
    g_ort->ReleaseSession(session);
    g_ort->ReleaseSessionOptions(session_options);
    g_ort->ReleaseEnv(env);
    printf("Done!\n");
    // Calculating total time taken by the program. 
    double time_taken = double(end - start);
    cout << "Time taken by program is : " << fixed
        << time_taken << setprecision(5);
    cout << " sec " << endl;
    return 0;
}

Build instructions

I tried to build the latest dev version of the project with rustc 1.31.0, using:

cargo build

but I'm getting the following error:

[...]
   Compiling jpeg-decoder v0.1.15                                                                                                              
   Compiling tract-core v0.2.6-pre (/home/rth/src/tract/core)                                                                                  
   Compiling failure v0.1.5                                                                                                                    
   Compiling image v0.19.0                                                                                                                     
   Compiling mio_httpc v0.6.24
   Compiling tract-onnx v0.2.6-pre (/home/rth/src/tract/onnx)                                                                                  
error: failed to run custom build command for `tract-onnx v0.2.6-pre (/home/rth/src/tract/onnx)`                                               
process didn't exit successfully: `/home/rth/src/tract/target/debug/build/tract-onnx-f9bfe0fb1264b616/build-script-build` (exit code: 101)
--- stdout
DEBUG ensure_onnx_git_checkout 1
DEBUG ensure_onnx_git_checkout 2
DEBUG ensure_onnx_git_checkout 3
DEBUG ensure_onnx_git_checkout 4
DEBUG ensure_onnx_git_checkout 5

--- stderr
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 18, kind: Other, message: "Invalid cross-device link" }', libcore/result.rs:1009:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Am I missing something?

Type arguments for "tract_core::model::ModelImpl"

I want to define a struct 'Model', but I get an error about type arguments. What is the correct way to use the 'ModelImpl' type?

struct Model {
    model: tract_core::model::ModelImpl,
    model_name: String,
}

let mut model_1 = Model {
    model: tract_tensorflow::tensorflow().model_for_path("mobilenet_v2_1.4_224_frozen.pb")?,
    model_name: String::from("model_1"),
};

let mut model = model_1.model;

When running the 'tract-mobilenet-v2-example' with the modification above I get the following error:

error[E0107]: wrong number of type arguments: expected 2, found 0
 --> examples/tensorflow-mobilenet-v2/src/main.rs:8:20
  |
8 |             model: tract_core::model::ModelImpl,
  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected 2 type arguments
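For reference, a minimal sketch of one way to write this, assuming the InferenceModel alias (i.e. the un-optimized ModelImpl<InferenceFact, Box<dyn InferenceOp>>) is re-exported by the prelude in the version used:

use tract_tensorflow::prelude::*;

// model_for_path returns an un-typed inference model; the prelude's
// InferenceModel alias names that concrete ModelImpl instantiation.
struct Model {
    model: InferenceModel,
    model_name: String,
}

fn load() -> TractResult<Model> {
    Ok(Model {
        model: tract_tensorflow::tensorflow().model_for_path("mobilenet_v2_1.4_224_frozen.pb")?,
        model_name: String::from("model_1"),
    })
}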

Thanks!

Retro-engineer TypedReshape as AxisOp

Reshape allows entirely rewriting the shape of a tensor; the only real constraint is that the tensor volume is preserved. As such it does not play nicely with the simple AxisOp variants (Add, Rm and Permute) and needs its own implementation of pulsify.

  • extend AxisOp to support axis merging and axis splitting (these ops are actually generalisations of Rm and Add)
  • translate the actual TypedReshape into a succession of these extended AxisOps (Levenshtein the two shapes to find invariants?)

Integer-sizing a decluttered streaming TypedModel without Pulse (for non causal models)

Hey, I came across another problem trying the bidirectional LSTM model in a browser.
It is the same LSTM that is now in CI (download link). Now normally I'd use code similar to this:

use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        .into_optimized()?
        .into_runnable()?;

    let input: Tensor = tract_ndarray::Array2::<u8>::zeros((1, 100)).into();
    model.run(tvec!(input))?;

    Ok(())
}

but I get an error:

➜  ~/Documents/Experiments/sblstmtest git:(master) ✗ cargo run
   Compiling sblstmtest v0.1.0 (/Users/bminixhofer/Documents/Experiments/sblstmtest)
    Finished dev [unoptimized + debuginfo] target(s) in 4.51s
     Running `target/debug/sblstmtest`
Error: TractError(Msg("Translating node #1 \"input\" Source ToTypedTranslator"), State { next_error: Some(TractError(Msg("Output type not determined"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })

Running it without into_optimized, or with an input fact, works.
So I understand that the model cannot be optimized because the shape of the input (batch size and seq len) is not known at build time. Is that correct?
In practice I don't want to fix the input shape at build time because it has to work with different batch sizes.

So far that wouldn't be a problem: I'd just add an optimize option to the JS API to turn optimization on or off, depending on whether dynamic shapes are needed during inference.

The problem comes when I try to store the model that I got by calling into_runnable without calling into_optimized before.

I get a model of type SimplePlan<InferenceFact, Box<dyn InferenceOp>, ModelImpl<InferenceFact, Box<dyn InferenceOp>>>. When I want to store such a model in a struct like:

use tract_onnx::prelude::*;

struct Model {
    inner: SimplePlan<InferenceFact, Box<dyn tract_hir::infer::ops::InferenceOp>, InferenceModel>,
}

I get an error which says that the module ops is private:

➜  ~/Documents/Experiments/sblstmtest git:(master) ✗ cargo run
   Compiling sblstmtest v0.1.0 (/Users/bminixhofer/Documents/Experiments/sblstmtest)
error[E0603]: module `ops` is private
  --> src/main.rs:4:64
   |
4  |     inner: SimplePlan<InferenceFact, Box<dyn tract_hir::infer::ops::InferenceOp>, InferenceModel>,
   |                                                                ^^^ private module
   |
note: the module `ops` is defined here
  --> /Users/bminixhofer/.cargo/registry/src/github.com-1ecc6299db9ec823/tract-hir-0.7.0/src/infer/mod.rs:12:1
   |
12 | mod ops;
   | ^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0603`.
error: could not compile `sblstmtest`.

To learn more, run the command again with --verbose.

So I can't store the result. Am I missing something? And if not, is there some way to work around this?
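One possible direction, as a sketch: name the plan through the prelude re-exports instead of reaching into the private tract_hir::infer::ops module. This assumes the prelude exposes InferenceOp and the InferenceModel alias in this version; if not, tract_hir::infer::InferenceOp should be the public path to the same trait.

use tract_onnx::prelude::*;

struct Model {
    // SimplePlan over the un-optimized (inference) model, named via prelude re-exports.
    inner: SimplePlan<InferenceFact, Box<dyn InferenceOp>, InferenceModel>,
}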

Thanks for all your help :)

Support "Resize" operator

Hi there! Thank you for your really needed and highly appreciated project!

Coming from the computer vision corner of DL, I use the Resize operator a lot. It's used in many image segmentation networks such as the popular U-Net to upsample layers (think "deconvolution").

Tract has no support for this yet, so I wanted to humbly ask if it would be possible to implement it in the foreseeable future. Rust is not really my strength (~2 hours of experience so far...) but I would be willing to help if I can.

Edit:
I've just seen that my networks are using v11 of the operator set. This is the exact call:

onnx::Resize[coordinate_transformation_mode="align_corners", mode="linear", nearest_mode="floor"]

Handle the "spatial" attribute in onnx BatchNormalization

I tried the MobileNet v2 example from the ONNX model zoo to check if it works, given that the corresponding TensorFlow model is part of the examples.

https://s3.amazonaws.com/onnx-model-zoo/mobilenet/mobilenetv2-1.0/mobilenetv2-1.0.onnx

It turned out that it crashes on startup. I traced it back to an assert ensuring spatial = 0:

tract_onnx::ops::nn::batch_normalization::h9eb0af9dc679fb50 mod.rs:117
tract_onnx::model::ParsingContext::parse_graph::haa33f85c9f8fbf5e model.rs:108
tract_onnx::model::Onnx::parse::h5ff6da06eaeb2b66 model.rs:208
_$LT$tract_onnx..model..Onnx$u20$as$u20$tract_hir..framework..Framework$LT$tract_onnx..pb..ModelProto$GT$$GT$::model_for_proto_model::hf9a44d34d9c02e91 model.rs:221
tract_hir::framework::Framework::model_for_read::h95a5740e42f84c36 framework.rs:32
tract_hir::framework::Framework::model_for_path::h68896d9cea1dd51a framework.rs:39
[...]

For pytorch, supporting spatial was trivial (just ignore the attribute):

https://github.com/pytorch/pytorch/pull/9492/files

Would it be similarly easy for tract?

To check that, I removed the assert. But that leads to node analysis errors. For reference:

Error: TractError(Msg("Failed analyse for node #268 \"mobilenetv20_features_conv0_fwd\" Conv"), State { next_error: Some(TractError(Msg("Infering facts"), State { next_error: Some(TractError(Msg("Applying rule inputs[0].shape[1] == 1*{inputs[1].shape[1]}: Impossible to unify Val(224) with Val(3)."), State { next_error: None, backtrace: InternalBacktrace { backtrace: Some(stack backtrace: [..]

Cannot run object detection model from tf model zoo

Dear all,

thanks a lot for this very interesting crate! I could run MobileNet just fine. Faster RCNN Inception+ResnetV2 seems not to work, though.

It would seem bool support is missing. I got a crash due to unimplemented type 10; however, adding this alone seems not to be enough:

diff --git a/tensorflow/src/tensor.rs b/tensorflow/src/tensor.rs
index 17cba22c..9dc3ae91 100644
--- a/tensorflow/src/tensor.rs
+++ b/tensorflow/src/tensor.rs
@@ -102,6 +102,7 @@ impl<'a> TryFrom<&'a TensorProto> for Tensor {
                         t.string_val.iter().map(|s| Blob(s.to_owned())).collect::<Vec<Blob>>();
                     tensor_from_repeated_field(&*dims, strings)?
                 }
+                DataType::DtBool => tensor_from_repeated_field(&*dims, t.bool_val.to_vec())?,
                 _ => unimplemented!("missing type (for _val()) {:?}", t.dtype),
             }
         };

Next crash isn't obvious at all to me:

thread 'main' panicked at 'explicit panic', tensorflow/src/model.rs:171:21
stack backtrace:
 [...]
  11: std::panicking::begin_panic
             at /home/oleid/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:438
  12: <tract_tensorflow::model::Tensorflow as tract_hir::framework::Framework<tract_tensorflow::tfpb::tensorflow::GraphDef>>::model_for_proto_model
             at tensorflow/src/model.rs:171
  13: tract_hir::framework::Framework::model_for_read
             at /home/oleid/src/tract/hir/src/framework.rs:32
  14: tract_hir::framework::Framework::model_for_path
             at /home/oleid/src/tract/hir/src/framework.rs:39
  15: example_tensorflow_mobilenet_v2::main

EDIT:

It crashes because the input name i starts with "^":

[tensorflow/src/model.rs:171] i = "^Preprocessor/map/while/Identity"

Maybe you have an idea?

Investigate streaming performance.

When running cargo run --release -- <model> -v --size Sx2xf32 profile on models like identity.pb, concat-0.pb or concat-1.pb, there seems to be a big performance difference between the first step and the other steps -- but not in the way we would expect.

Streaming profiling for ../../tensorflow-examples/exports/concat-0.pb:
======================================================================

Starting step 0 with input [...].
Completed step 0 with output  [...],  [...] in:
    - Real: 0.002 ms/i.
    - User: 0.002 ms/i.
    - Sys: 0.001 ms/i.

Starting step 1 with input [...].
Completed step 1 with output  [...],  [...] in:
    - Real: 0.019 ms/i.
    - User: 0.016 ms/i.
    - Sys: 0.005 ms/i.

Starting step 2 with input [...].
Completed step 2 with output  [...],  [...] in:
    - Real: 0.014 ms/i.
    - User: 0.000 ms/i.
    - Sys: 0.015 ms/i.

I can't make sense of this, especially considering all the buffers are empty at the end of each step.

support for tensorflow signal ops?

So I was looking at potentially moving from the TensorFlow bindings to tract for some audio-based neural networks I have. The preprocessing in the graph extracts mel features from the audio, so it needs the TensorFlow short-time Fourier transform block.

Just wondering if you have any support for these, and if not, whether you'd consider adding them (or accepting a PR 😄)?

Also, how is the support for common recurrent networks like LSTM and BLSTM cells?

How to run tract with multiple inputs

My model takes two tensors: sequences (size max_seq_length x num_sequences x 25) and batch sizes (size num_sequences).

How do I specify this when running tract analyse from the command line?
I tried:

./tract /path/to/model.onnx -i 3x2x25xf32,2xi32

But I get:

[2020-06-05T14:37:31.184879513Z ERROR tract] invalid digit found in string
Error: invalid digit found in string

So I guess comma separating the inputs is not the right approach...
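For what it's worth, the programmatic equivalent is one input fact per input and both tensors passed to run() in graph input order; a sketch assuming the shapes above, with dummy values:

use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("/path/to/model.onnx")?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(3, 2, 25)))?
        .with_input_fact(1, InferenceFact::dt_shape(i32::datum_type(), tvec!(2)))?
        .into_optimized()?
        .into_runnable()?;

    // Dummy data: 2 sequences of length 3 with 25 features each, plus their lengths.
    let sequences: Tensor = tract_ndarray::Array3::<f32>::zeros((3, 2, 25)).into();
    let batch_sizes: Tensor = tract_ndarray::Array1::from(vec![3i32, 3]).into();

    let _outputs = model.run(tvec!(sequences, batch_sizes))?;
    Ok(())
}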

Support TF2 saved model

I tried importing a simple model from the TF2 tutorial:

from __future__ import absolute_import, division, print_function, unicode_literals

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

train_images = train_images / 255.0
test_images = test_images / 255.0

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=10)
# model.save('my_model.h5')

tf.saved_model.save(model, 'model_as_tf')

Then I copied the pb file and ran:

use tract_core::prelude::*;

fn main() {
    let tf = tract_tensorflow::tensorflow();
    let _model = tf.model_for_path("saved_model.pb").unwrap();
}

But it results in:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TractError(Msg("WireError(UnexpectedWireType(WireTypeVarint))"), State { next_error: None, backtrace: InternalBacktrace { backtrace: Some(stack backtrace:

Does this lib work with TF2? If not, is there a way to convert a TF2 model to the TF1 format?

BTW thank you for this awesome lib!

Optimization fails due to Rank mismatch, InferenceModel works

Hi!

After updating some of our ONNX generators, we're seeing failures during the optimization step for tract. The error we get is as follows:

Failed analyse for node #23 "Min__6" MinNary
Infering facts: "Applying rule inputs[0].rank == inputs[1].rank: Impossible to unify 2 with 0."

This is based on tensorflow code that looks like this:

x = tf.clip_by_value(x, MIN, MAX)

Where the result is a [BATCH, x] tensor, and both MIN and MAX are constant scalars [-20, 2]. The file renders fine with Netron and, as noted, works without optimizations, so I'm guessing this is simply an unhandled case somewhere for the scalar min case during optimization? I'm unfortunately unable to find out where this should be handled, but if you can point me in the right direction I can take a stab at fixing it in a PR.

MobileNet ops not supported

Hi

I wanted to run the pretrained frozen .pb models from MobileNetV1 and MobileNetV2 with:

let tfd = ::tract_tensorflow::tensorflow().model_for_path(mobilenetv1_frozen).unwrap();
let plan = ::tract::SimplePlan::new(&tfd).unwrap();
let input = load_image(img);
let outputs = plan.run(tvec![input]).unwrap();

But for MobilenetV1 I get

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TractError(Msg("Evaluating #13 \"MobilenetV1/MobilenetV1/Conv2d_0/Relu6\" Unimplemented(Relu6): unimplemented operation: Relu6"), State { next_error: None, backtrace: InternalBacktrace })', src/libcore/result.rs:997:5

and for MobilenetV2

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TractError(Msg("Node named MobilenetV2/Conv/BatchNorm/FusedBatchNorm not found"), State { next_error: None, backtrace: InternalBacktrace })', src/libcore/result.rs:997:5

Any plans to support Relu6 or FusedBatchNorm? Would you be willing to point me to where I can add those?

no_panic attribute applied to function that can panic (SafePatchIterator::next(), I think)

Linking with tract like this:

tract-core = "0.5.0"
tract-onnx = "0.5.0"

I'm getting the following linker error on Windows, using Rust 1.37:

         lld-link: error: undefined symbol:

          ERROR[no-panic]: detected panic in function `next`

          >>> referenced by libtract_core-36f780e6812f081d.rlib(tract_core-36f780e6812f081d.tract_core.blvpe490-cgu.9.rcgu.o):(_ZN4core3ptr18real_drop_in_place17h27638d8e99f97d5cE)
          >>> referenced by libtract_core-36f780e6812f081d.rlib(tract_core-36f780e6812f081d.tract_core.blvpe490-cgu.9.rcgu.o):(_ZN165_$LT$$LT$tract_core..ops..cnn..patches..SafePatchIterator$u20$as$u20$core..iter..traits..iterator..Iterator$GT$..next..__NoPanic$u20$as$u20$core..ops..drop..Drop$GT$4drop17hdff2e4cd80e88a13E)

Seems like no_panic is used to ensure that there are no panics in SafePatchIterator::next(), but one is in fact there. I don't see the possible panic myself, but something may have changed in a new version of Rust?

I can easily clone and patch this out myself, of course, but this seems fishy.

EDIT: Confirmed this is an issue on current master, and that taking out no_panic makes it link correctly.

Can not make a TypedTensorInfo out of 1x200x1x?

Hi there,

I was trying to run tract on an ONNX file when I saw the below error (while trying to convert from an InferenceModel to a typed model):

While translating #39 "217" Reduce<Mean>: TractError(Msg("Can not make a TypedTensorInfo out of 1x200x1x?"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })

Upon inspection of the ONNX file, I can see that node 217 is ReduceMean with axis = -1 and keepdims = 1.

Any idea why this would happen? Unfortunately I can't share the ONNX file, but I hope the above is enough debug information.

Thanks!
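For what it's worth, in other issues here a fully determined input fact lets the analyser propagate concrete shapes and avoid trailing "?" dimensions before the typed conversion. A hypothetical sketch (the 1x200x80 input shape is purely illustrative, and depending on the tract version the fact type is InferenceFact or TensorFact):

use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let _model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        // Pin the input shape so intermediate facts, including node #39 "217",
        // can be resolved to concrete shapes.
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 200, 80)))?
        .into_optimized()?
        .into_runnable()?;
    Ok(())
}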

"Could not resolve inputs at top-level" issue loading ONNX file

The following code:

  let path = Path::new("GRU128KeywordSpotter.onnx");
  let mut model = tract_onnx::onnx().model_for_path(path)?;

fails with the following error message:

Error: TractError(Msg("Could not resolve inputs at top-level: [\"\"]"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })

The ONNX file appears to be valid since it can be used otherwise. The offending ONNX file can be found here: https://psiphi75.github.io/workingfiles/GRU128KeywordSpotter.onnx

NonZero ONNX operator support

It looks like this operator is not supported yet. It seems to be the only operator used by my model that is missing. Would this be hard to add?

Make inference rules more readable.

In order to make it easier to write and read inference rules for operators, I suggest writing a simple rules! macro which would convert this code:

let input = &inputs[0];
let dims = &inputs[1];
let output = &outputs[0];

rules! {
    inputs.len = 2;
    outputs.len = 1;

    input.datatype = output.datatype;
    input.rank + 2 * dims.rank = outputs.rank;
    
    input.rank as ir {
        // Some code depending on ir.
    };
};

Into the following:

let input = &inputs[0];
let dims = &inputs[1];
let output = &outputs[0];

solver
    .equals(&inputs.len, 2)
    .equals(&outputs.len, 1)

    .equals(&input.datatype, &output.datatype)
    .equals_zero(wrap![&input.rank, (2, &dims.rank), (-1, &outputs.rank)])
    
    .given(&inputs.rank, |solver, ir| {
        // Some code depending on ir.  
    });

set_input_names confusion

I've got CI working now for tractjs. I am using a SqueezeNet model for which I have working .pb and .onnx files. I've added tests for the default (no options) and for input facts (and already fixed a bug ;)).

But I can't quite get custom inputs to work. I'd like to set the inputs of the ONNX model (link) to the nodes squeezenet0_conv8_fwd and squeezenet0_conv9_fwd. But I get a strange error:

Cargo.toml

[package]
name = "tractinputtest"
version = "0.1.0"
authors = ["Benjamin Minixhofer <[email protected]>"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tract-onnx = { git = "https://github.com/snipsco/tract/" }

src/main.rs

use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let _model = tract_onnx::onnx()
        .model_for_path("squeezenet1_1.onnx")?
        .with_input_names(&["squeezenet0_conv8_fwd", "squeezenet0_conv9_fwd"])?
        .into_optimized();

    Ok(())
}
➜  ~/Documents/Experiments/tractinputtest git:(master) ✗ cargo run
   Compiling tractinputtest v0.1.0 (/Users/bminixhofer/Documents/Experiments/tractinputtest)
    Finished dev [unoptimized + debuginfo] target(s) in 4.37s
     Running `target/debug/tractinputtest`
thread 'main' panicked at 'no entry found for key', /Users/bminixhofer/.cargo/git/checkouts/tract-9a24dc151f0802d9/cfd5b19/core/src/model/compact.rs:68:49
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But I see the nodes when running:
tract squeezenet1_1.onnx -i 1x3x224x224xF32 --pass analyse

I am also not sure about the difference between set_input_names and set_input_outlets, and which one I should use. Could you elaborate?

`model.into_optimized()` consumes too much RAM - crashes

When I call model.into_optimized(), the RAM consumption of the process grows very quickly until it uses up all available RAM. The OS (Linux in this case) kills the process after a while. The 800 kB model below consumed at least 12 GB of RAM before the process was killed.

Below is the code I use (compiled using --release):

fn main() -> TractResult<()> {
  let path = Path::new("GRU128KeywordSpotter-v2-10epochs.onnx");
  let mut model = tract_onnx::onnx().model_for_path(path)?;
  model.set_input_fact(0, TensorFact::dt_shape(f32::datum_type(), tvec!(1, 1, 80)))?;

  let model = model.into_optimized()?;  // This never completes

  let _plan = SimplePlan::new(&model)?;

  Ok(())
}

The sample ONNX file can be found here: https://psiphi75.github.io/workingfiles/GRU128KeywordSpotter-v2-10epochs.onnx

Could not resolve inputs at top-level

When I try loading a model with tract_onnx::onnx().model_for_read() I get this error message: Could not resolve inputs at top-level: [""]

Any idea why this happens?

[ONNX] Impossible to unify Val(3) with Val(1)

Here is another one - this time a simple model I trained myself last week at work - my first binary classifier ever made with tensorflow. I used onnxruntime to run it. Works fine. I thought it would be a nice test for tract. The network is quite simple:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 100, 100, 8)       80        
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 50, 50, 8)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 50, 50, 16)        1168      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 25, 25, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 25, 25, 32)        4640      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 12, 12, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 2304)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               295040    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
=================================================================

model = Sequential([
        Conv2D(8, 3, activation='relu', input_shape=input_shape, padding="same"),
        MaxPooling2D(padding="valid"),

        Conv2D(16, 3, activation='relu', padding="same"),
        MaxPooling2D(padding="valid"),
        Conv2D(32, 3, activation='relu', padding="same"),
        MaxPooling2D(padding="valid"),
        Conv2D(64, 3, activation='relu', padding="same"),
        MaxPooling2D(padding="valid"),

        Flatten(),
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
                loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
                metrics=['accuracy'])

It is converted to ONNX via keras2onnx after training. Inputs are 100x100 grayscale images. Output is a float in [0, 1].

RUST_LOG=debug ../../target/debug/example-tensorflow-mobilenet-v2 
[2020-04-16T21:34:35Z DEBUG tract_onnx::model] ONNX operator set version: 10
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #0 "conv2d_input" Source
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #1 "dense_1/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #2 "conv2d_1/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #3 "conv2d/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #4 "conv2d_3/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #5 "conv2d/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #6 "conv2d_1/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #7 "conv2d_2/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #8 "dense/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #9 "dense_1/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #10 "dense/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #11 "conv2d_3/kernel:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #12 "conv2d_2/bias:0" Const
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #13 "Transpose10" PermuteAxes
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser]   Refined 13/0>: ..x? -> 1x3x224x224xF32
[2020-04-16T21:34:35Z DEBUG tract_hir::infer::analyser] Starting step for #14 "conv2d" Conv
Error: TractError(Msg("Failed analyse for node #14 \"conv2d\" Conv"), State { next_error: Some(TractError(Msg("Infering facts"), State { next_error: Some(TractError(Msg("Applying rule inputs[0].shape[1] == 1*{inputs[1].shape[1]}: Impossible to unify Val(3) with Val(1)."), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })

Doesn't look to me like missing operations - especially since this model is nothing fancy.

Please find it here: https://www.jottacloud.com/s/1466964b3a782324620a23585140feb3456

UnimplementedOp: Loop

Hi there,
I'm attempting to run an ONNX model containing a Loop operation. However, upon running it, I'm seeing this error:

[2020-05-27T06:22:25.559559680Z ERROR tract] TractError(Msg("Translating node #7 \"Loop_6\" Unimplemented(Loop) ToTypedTranslator"), State { next_error: Some(TractError(Msg("Operator can not be made a TypedOp."), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })

Which suggests that Loop is not yet implemented. What would be the required steps to implement this? Happy to help out by providing test ONNX files / contributing to a PR if you can point me in the right direction :)

Missing node type: AddV2

TensorFlow 2 has started outputting a new kind of Add node, AddV2, here and there, which seems to behave the same. It's not yet supported by tract. As a first step, it might be possible to simply accept AddV2 as an alternate name for Add and otherwise treat it the same, but I'm not 100% sure about this.

Optimise nets with symbolic dimension(s)

This is a follow-up to #300. This issue is an epic; I'm not 100% sure how everything will go. There may be valuable incremental improvements here and there too.

  • To what extent can we optimize a network while retaining the symbolic stream dimension, so that each Plan::run can run with a different dimension? (required for a full fix to #300)
  • Can we generalize to get optimized support for
    • other dimensions (batch, aka N dim ?) (somewhat hinted at by #300)
    • dynamically shaped tensors (required by some onnx ops)

List of ONNX supported ops

Just a page/.md with the list of supported ONNX ops.
It is also unclear from the main .md page which ONNX version/opset tract supports.
Thanks.

Unsupported cast from I64 to Bool

Hi, I hit this error today:

[2020-05-26T13:09:42.326524742Z DEBUG tract_hir::infer::analyser] Starting step for #6 "Cast_5" Cast
[2020-05-26T13:09:42.326527855Z TRACE tract_hir::infer::analyser]   Input  #0: xI64 1
[2020-05-26T13:09:42.326529379Z TRACE tract_hir::infer::analyser]   Output #0: ..x?
[2020-05-26T13:09:42.326535128Z TRACE tract_hir::infer::rules] Building rules for ElementWiseOp(Cast { to: Bool })
[2020-05-26T13:09:42.326538361Z TRACE tract_hir::infer::rules] Applying rules for ElementWiseOp(Cast { to: Bool })
[2020-05-26T13:09:42.326540004Z TRACE tract_hir::infer::rules::solver]   Applying rule GivenRule { inputs[0].datum_type }
[2020-05-26T13:09:42.326542598Z TRACE tract_hir::infer::rules::solver]     Given rule: inputs[0].datum_type is I64
[2020-05-26T13:09:42.326544481Z TRACE tract_hir::infer::rules::solver]   Applying rule inputs[0].shape == outputs[0].shape
[2020-05-26T13:09:42.326548508Z TRACE tract_hir::infer::rules::solver]   Applying all rules
[2020-05-26T13:09:42.326550492Z TRACE tract_hir::infer::rules::solver]   Applying rule outputs[0].datum_type == Bool
[2020-05-26T13:09:42.326552332Z TRACE tract_hir::infer::rules::solver]   Applying all rules
[2020-05-26T13:09:42.326553368Z TRACE tract_hir::infer::rules::solver]   Applying all rules
[2020-05-26T13:09:42.326554314Z TRACE tract_hir::infer::rules::solver]   Solver exiting Context { inputs: [xI64 1], outputs: [Bool] }
[2020-05-26T13:09:42.326558449Z TRACE tract_hir::infer::rules] Solver done
[2020-05-26T13:09:42.326570174Z DEBUG tract_hir::infer::analyser] TractError(Msg("Failed analyse for node #6 \"Cast_5\" Cast"), State { next_error: Some(TractError(Msg("Eager eval"), State { next_error: Some(TractError(Msg("Unsupported cast from I64 to Bool"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })

Would it be possible to support casting from I64 to bool? PyTorch will happily export the operation (calling .bool() on a LongTensor), so I assume it's valid ONNX.

Streaming Audio Example

Thanks for your great work!
I would love to see an audio streaming inference example here.

TIA
Andy

Some tensorflow extensions for keras layers support

I'm trying to load a model into Rust and I'm getting an error when I run the model.

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: 
TractError(Msg("Translating #30 \"global_average_pooling1d/Mean\" Unimplemented(Mean)"), 
State { next_error: Some(TractError(Msg("Operator can not be made a TypedOp."), 
State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), 
backtrace: InternalBacktrace { backtrace: None } })

It seems that the mean operation of global_average_pooling1d is not supported. Does anyone know any more about this?

"cargo update" brings in a protobuf version (2.9.0) that doesn't compile.

Try to just run cargo update in the root directory, then cargo build.

Some cargo.toml dependency needs to be tightened down, or things need to be fixed to work with the new version of protobuf.

I first thought the culprit was harness/onnx-test-suite/debug-utils/Cargo.toml where it's set to "*", but no dice changing that.

armv7 has TEXTREL

Building the library for armv7 creates a library with TEXTRELs.

TEXTRELs are not compatible with Android API Level 23 and above.

When built for armv7-linux-androideabi

AR=arm-linux-androideabi-ar CC=armv7a-linux-androideabi29-clang cargo build --target armv7-linux-androideabi

scanelf then gives

scanelf -qT target/armv7-linux-androideabi/release/libandroid.so
  libandroid.so: (memory/data?) [0x57E9F8] in (optimized out: previous $d.1) [0x57E9C0]
  libandroid.so: (memory/data?) [0x57EC70] in (optimized out: previous $d.1) [0x57EC3C]
  target/armv7-linux-androideabi/release/libandroid.so

TensorListReserve Operation

Hi there,

When I try to run a frozen TensorFlow graph with the TensorListReserve operation, tract throws an error (it seems that the implementation is missing). How feasible would it be to add this operation?

Thanks!

Windows support?

I noticed in the CI scripts that Windows builds aren't tested.

Are there any plans to support Windows in the future?

Inconsistent matmul

Hi there,

When running tract (latest master) on my ONNX file, I hit this problem during translation:

[2020-05-22T15:27:11.562648738Z DEBUG tract_core::model::translator] Translating #44 "437" Const ToTypedTranslator
[2020-05-22T15:27:11.562651342Z DEBUG tract_core::model::translator] Translating #45 "MatMul_36" MatMul ToTypedTranslator
[2020-05-22T15:27:11.562736937Z ERROR tract] TractError(Msg("Translating node #45 \"MatMul_36\" MatMul ToTypedTranslator"), State { next_error: Some(TractError(Msg("Inconsistent matmul: a: 1x1x192 b: 1x128x5, a_trans: false b_trans: false c_trans: false"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })

The analyze step passes fine with no problems:

[2020-05-22T15:27:11.550766173Z TRACE tract_hir::infer::analyser] Remaining nodes 219
[2020-05-22T15:27:11.550768014Z DEBUG tract_hir::infer::analyser] Starting step for #45 "MatMul_36" MatMul
[2020-05-22T15:27:11.550770738Z TRACE tract_hir::infer::analyser]   Input  #0: 3x1x128xF32
[2020-05-22T15:27:11.550772811Z TRACE tract_hir::infer::analyser]   Input  #1: 128x5xF32 0.0128821805, 0.05316847, 0.004321085, -0.03827938, -0.056081012, 0.032178704, -0.012988224, 0.07952597...
[2020-05-22T15:27:11.550776261Z TRACE tract_hir::infer::analyser]   Output #0: 3x1x5xF32
[2020-05-22T15:27:11.550779731Z TRACE tract_hir::infer::rules] Building rules for MatMul { a_trans: false, b_trans: false, c_trans: false, q_params: None }
[2020-05-22T15:27:11.550782876Z TRACE tract_hir::infer::rules] Applying rules for MatMul { a_trans: false, b_trans: false, c_trans: false, q_params: None }
[2020-05-22T15:27:11.550785301Z TRACE tract_hir::infer::rules::solver]   Applying rule inputs[0].datum_type == inputs[1].datum_type
[2020-05-22T15:27:11.550787928Z TRACE tract_hir::infer::rules::solver]   Applying rule inputs[0].datum_type == outputs[0].datum_type
[2020-05-22T15:27:11.550790006Z TRACE tract_hir::infer::rules::solver]   Applying rule Given2Rule { (inputs[0].shape, inputs[1].shape) }
[2020-05-22T15:27:11.550794605Z TRACE tract_hir::infer::rules::solver]   Applying all rules
[2020-05-22T15:27:11.550796348Z TRACE tract_hir::infer::rules::solver]   Applying rule inputs[0].datum_type == inputs[1].datum_type
[2020-05-22T15:27:11.550798764Z TRACE tract_hir::infer::rules::solver]   Applying rule inputs[0].datum_type == outputs[0].datum_type
[2020-05-22T15:27:11.550801504Z TRACE tract_hir::infer::rules::solver]   Applying rule outputs[0].shape == 3x1x5
[2020-05-22T15:27:11.550806511Z TRACE tract_hir::infer::rules::solver]   Applying all rules
[2020-05-22T15:27:11.550808224Z TRACE tract_hir::infer::rules::solver]   Solver exiting Context { inputs: [3x1x128xF32, 128x5xF32 0.0128821805, 0.05316847, 0.004321085, -0.03827938, -0.056081012, 0.032178704, -0.012988224, 0.07952597...], outputs: [3x1x5xF32] }
[2020-05-22T15:27:11.550813870Z TRACE tract_hir::infer::rules] Solver done
[2020-05-22T15:27:11.550816799Z TRACE tract_hir::infer::fact] Unifying 3x1x128xF32 with 3x1x128xF32 into 3x1x128xF32.
[2020-05-22T15:27:11.550820370Z TRACE tract_hir::infer::fact] Unifying 128x5xF32 0.0128821805, 0.05316847, 0.004321085, -0.03827938, -0.056081012, 0.032178704, -0.012988224, 0.07952597... with 128x5xF32 0.0128821805, 0.05316847, 0.004321085, -0.03827938, -0.056081012, 0.032178704, -0.012988224, 0.07952597... into 128x5xF32 0.0128821805, 0.05316847, 0.004321085, -0.03827938, -0.056081012, 0.032178704, -0.012988224, 0.07952597....
[2020-05-22T15:27:11.550827557Z TRACE tract_hir::infer::fact] Unifying 3x1x5xF32 with 3x1x5xF32 into 3x1x5xF32.
[2020-05-22T15:27:11.550830426Z TRACE tract_hir::infer::analyser] Remaining nodes 218

It seems that during translation the first argument to matmul is suddenly 1x1x192 (when it should be 3x1x128 as shown during analysis).

Any idea how this could happen? Is there any way I can print more debugging information during translation? :)

Internal multithreading

Hi!

From #326:

tract does not make any effort to run a computation using multiple cores, but is safe to use in multiple threads. So you may get better results by calling run::() on several inputs (or several copies) from different thread (using a parallel iterator may do the trick).
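As a concrete reading of that advice, here is a minimal sketch (not an official API; it assumes a rayon dependency, a fixed 1x3x224x224 f32 input, and that the plan type is Sync in the version used) that runs one shared plan from several worker threads:

use rayon::prelude::*;
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let plan = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)))?
        .into_optimized()?
        .into_runnable()?;

    // Ten dummy inputs; each one is run on the shared plan from a rayon worker thread.
    let inputs: Vec<Tensor> = (0..10)
        .map(|_| tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into())
        .collect();

    let outputs = inputs
        .into_par_iter()
        .map(|input| plan.run(tvec!(input)))
        .collect::<TractResult<Vec<_>>>()?;

    println!("ran {} inferences", outputs.len());
    Ok(())
}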

Are there any plans to support internal multithreading? Tract is already very fast. With internal multithreading it could possibly be faster than onnxruntime and ONNX.js*.

*That is, if we can exploit multithreading in the browser, but there is already a working wasm-bindgen example with rayon so I'm confident we would get there.

I'm not very familiar with parallelized implementations of neural nets but I think there are three major points where parallelization is possible:

  1. Slicing the input into chunks that get computed on different cores, e.g. with batch sizes > 1.
  2. Computing different operators on different cores; each operator could start its computation once all its inputs are computed.
  3. Internal parallelization of an operation, e.g. different convolution filters on different cores.

Feel free to close this issue if this does not align with your Roadmap for tract.

API issue: Can't figure out how to keep around a model+plan to be reused

To demonstrate my issue, I tried to modify tract-mobilenet-v2-example as follows:

use tract_core::ndarray;
use tract_core::prelude::*;

struct ImageProcessor {
    model: ???,  // What types?
    plan: ???,
}

impl ImageProcessor {
    fn create_from_file() -> TractResult<ImageProcessor> {
        // load the model
        let mut model =
            tract_tensorflow::tensorflow().model_for_path("mobilenet_v2_1.4_224_frozen.pb")?;

        // specify input type and shape
        model.set_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 224, 224, 3)))?;

        // optimize the model and get an execution plan
        let model = model.into_optimized()?;
        let plan = SimplePlan::new(&model)?;
        Ok(ImageProcessor {
            model,
            plan
        })
    }
}

fn main() -> TractResult<()> {
    let processor = ImageProcessor::create_from_file()?;

    // open image, resize it and make a Tensor out of it
    let image = image::open("grace_hopper.jpg").unwrap().to_rgb();
    let resized = image::imageops::resize(&image, 224, 224, ::image::FilterType::Triangle);
    let image: Tensor = ndarray::Array4::from_shape_fn((1, 224, 224, 3), |(_, y, x, c)| {
        resized[(x as _, y as _)][c] as f32 / 255.0
    })
    .into();

    // run the plan on the input
    let result = processor.plan.run(tvec!(image))?;

    // find and display the max value with its index
    let best = result[0]
        .to_array_view::<f32>()?
        .iter()
        .cloned()
        .zip(1..)
        .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
    println!("result: {:?}", best);
    Ok(())
}

That way, I could pass the ImageProcessor object into something that loops over an array of images, or whatever, and just have it call run with the desired input data.

But I just can't get this to compile, no matter how much I mess around with the types - the dependency between plan and model creates a self-referential struct, which Rust does not support.

Is there a supported way to bundle up a plan and model using the crate's API?

Thanks!
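One pattern that avoids the self-reference is to let the plan own the optimized model: into_runnable() (or SimplePlan::new(model) by value) moves the model into the plan, so the struct only needs to hold the plan. A sketch, assuming the TypedSimplePlan<TypedModel> alias is available in this version (otherwise the type has to be spelled out, e.g. SimplePlan<TypedFact, Box<dyn TypedOp>, TypedModel>):

use tract_tensorflow::prelude::*;

struct ImageProcessor {
    // The plan owns the optimized model, so there is no self-referential borrow.
    plan: TypedSimplePlan<TypedModel>,
}

impl ImageProcessor {
    fn create_from_file() -> TractResult<ImageProcessor> {
        let plan = tract_tensorflow::tensorflow()
            .model_for_path("mobilenet_v2_1.4_224_frozen.pb")?
            .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 224, 224, 3)))?
            .into_optimized()?
            .into_runnable()?; // moves the model into the plan
        Ok(ImageProcessor { plan })
    }
}

With that, processor.plan.run(tvec!(image))? works exactly as in the original example.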

Get 'attempt to multiply with overflow' error with ONNX file

Hi, I'm using tract on another ONNX file and get the following error when I try to optimise the file:

thread 'main' panicked at 'attempt to multiply with overflow', /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/ops/arith.rs:801:51

The offending ONNX file can be found here.

The code to run this is:

use std::path::Path;
use tract_core::prelude::*;

fn main() -> TractResult<()> {
  // load the model
  println!("Load Onnx");
  let path = Path::new("onnx.onnx");
  let mut model = tract_onnx::onnx().model_for_path(path)?;

  // specify input type and shape
  println!("Specify input shape");
  model.set_input_fact(0, TensorFact::dt_shape(f32::datum_type(), tvec!(1, 3, 64, 64)))?;

  // optimize the model and get an execution plan
  println!("Optimising model");
  model.into_optimized()?;  // panics here
  Ok(())
}

The full stacktrace is:

thread 'main' panicked at 'attempt to multiply with overflow', /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/ops/arith.rs:801:51
stack backtrace:
   0:     0x5603dbe82afb - backtrace::backtrace::libunwind::trace::hfe5db90796807973
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.29/src/backtrace/libunwind.rs:88
   1:     0x5603dbe82afb - backtrace::backtrace::trace_unsynchronized::h34b865a835594335
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.29/src/backtrace/mod.rs:66
   2:     0x5603dbe82afb - std::sys_common::backtrace::_print::h527254ae44989167
                               at src/libstd/sys_common/backtrace.rs:47
   3:     0x5603dbe82afb - std::sys_common::backtrace::print::he85dd5ddddf46503
                               at src/libstd/sys_common/backtrace.rs:36
   4:     0x5603dbe82afb - std::panicking::default_hook::{{closure}}::h847a2eb38b396f14
                               at src/libstd/panicking.rs:200
   5:     0x5603dbe827d7 - std::panicking::default_hook::h2ca0f9a30a0e206b
                               at src/libstd/panicking.rs:214
   6:     0x5603dbe83210 - std::panicking::rust_panic_with_hook::hffcefc09751839d1
                               at src/libstd/panicking.rs:477
   7:     0x5603dbe82d92 - std::panicking::continue_panic_fmt::hc0f142c930c846fc
                               at src/libstd/panicking.rs:384
   8:     0x5603dbe82c76 - rust_begin_unwind
                               at src/libstd/panicking.rs:311
   9:     0x5603dbe9c0dd - core::panicking::panic_fmt::h2daf88b2616ca2b2
                               at src/libcore/panicking.rs:85
  10:     0x5603dbe9c01c - core::panicking::panic::h2d0bc53a963fb996
                               at src/libcore/panicking.rs:49
  11:     0x5603db383f9b - <usize as core::ops::arith::MulAssign>::mul_assign::h0ddb15e88dbf93db
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/ops/arith.rs:801
  12:     0x5603db384920 - <usize as core::ops::arith::MulAssign<&usize>>::mul_assign::h0b18b310cf23d13e
                               at core/src/lib.rs:1
  13:     0x5603db309b67 - tract_core::ops::math::mat_mul::Geo<T>::new::{{closure}}::hc7f57c849900bf73
                               at core/src/ops/math/mat_mul.rs:164
  14:     0x5603db573e86 - <core::iter::adapters::Scan<I,St,F> as core::iter::traits::iterator::Iterator>::next::{{closure}}::hd1c234b5a37658fb
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/adapters/mod.rs:1679
  15:     0x5603dbc81c6a - core::option::Option<T>::and_then::h6268d3a9d77a59ce
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/option.rs:624
  16:     0x5603db573cb3 - <core::iter::adapters::Scan<I,St,F> as core::iter::traits::iterator::Iterator>::next::h90da42eaf54874b1
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/adapters/mod.rs:1679
  17:     0x5603db5b6c27 - <&mut I as core::iter::traits::iterator::Iterator>::next::h08211bfabbef9568
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/traits/iterator.rs:2608
  18:     0x5603db580c82 - core::iter::traits::iterator::Iterator::nth::hd4d628382565a50e
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/traits/iterator.rs:315
  19:     0x5603db5b7fd3 - <core::iter::adapters::Skip<I> as core::iter::traits::iterator::Iterator>::next::hd50a102c4cb55bba
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/adapters/mod.rs:1420
  20:     0x5603daef0cb0 - <smallvec::SmallVec<A> as core::iter::traits::collect::Extend<<A as smallvec::Array>::Item>>::extend::h76ba7fea8679899f
                               at /home/simon/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-0.6.10/lib.rs:1357
  21:     0x5603daf62420 - <smallvec::SmallVec<A> as core::iter::traits::collect::FromIterator<<A as smallvec::Array>::Item>>::from_iter::hd0f92089e463168a
                               at /home/simon/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-0.6.10/lib.rs:1342
  22:     0x5603db58ebc9 - core::iter::traits::iterator::Iterator::collect::hbf1b5f20172114bc
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/iter/traits/iterator.rs:1466
  23:     0x5603db3081c4 - tract_core::ops::math::mat_mul::Geo<T>::new::h275a6d2cbacf74f9
                               at core/src/ops/math/mat_mul.rs:159
  24:     0x5603db30b0ab - tract_core::ops::math::mat_mul::new_mat_mul_unary_finite::hea080a512d4010f0
                               at core/src/ops/math/mat_mul.rs:463
  25:     0x5603db8e5c83 - <tract_core::ops::math::mat_mul::MatMulUnary as tract_core::ops::Op>::codegen::h0b79d28b91e22295
                               at core/src/ops/math/mat_mul.rs:364
  26:     0x5603dbb7cb3f - <tract_core::optim::CodegenOps as tract_core::optim::TypedPass>::pass::hc6771f429b5f96d3
                               at core/src/optim/mod.rs:117
  27:     0x5603db8f6c03 - tract_core::model::<impl tract_core::model::model::ModelImpl<tract_core::model::tensor_info::TypedTensorInfo,alloc::boxed::Box<dyn tract_core::ops::TypedOp>>>::codegen::ha4b582fc098ece07
                               at core/src/model/mod.rs:294
  28:     0x5603db8f4f83 - tract_core::model::<impl tract_core::model::model::ModelImpl<tract_core::analyser::types::TensorFact,alloc::boxed::Box<dyn tract_core::ops::InferenceOp>>>::into_optimized::hbe0fc2f2165b33ca
                               at core/src/model/mod.rs:260
  29:     0x5603da7d2016 - onnx_gru_example::main::h9bc5e9b63b97e7a6
                               at examples/onnx-gru/src/main.rs:17
  30:     0x5603da7d1852 - std::rt::lang_start::{{closure}}::hc92a5ddd759a31b1
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libstd/rt.rs:64
  31:     0x5603dbe82c13 - std::rt::lang_start_internal::{{closure}}::h447d8812e3ee306d
                               at src/libstd/rt.rs:49
  32:     0x5603dbe82c13 - std::panicking::try::do_call::h4a61cb372364c745
                               at src/libstd/panicking.rs:296
  33:     0x5603dbe84d5a - __rust_maybe_catch_panic
                               at src/libpanic_unwind/lib.rs:82
  34:     0x5603dbe8371d - std::panicking::try::hdf71f938885bca42
                               at src/libstd/panicking.rs:275
  35:     0x5603dbe8371d - std::panic::catch_unwind::h7e85dbf162b1611a
                               at src/libstd/panic.rs:394
  36:     0x5603dbe8371d - std::rt::lang_start_internal::h1e06cc26b9fc25ea
                               at src/libstd/rt.rs:48
  37:     0x5603da7d1819 - std::rt::lang_start::ha4cdda85009acf6c
                               at /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libstd/rt.rs:64
  38:     0x5603da7d255a - main
  39:     0x7f42ed82db6b - __libc_start_main
  40:     0x5603da7cf18a - _start
  41:                0x0 - <unknown>
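
Reading the backtrace, the overflow is raised inside Geo::new (frames 11–23) while a Scan iterator multiplies usize values, i.e. while folding the shape into a running product. Below is a hypothetical, simplified illustration of that pattern, not tract's actual code: a stride-style scan over the dimensions overflows in a debug build as soon as one dimension is absurdly large, which is what an unresolved or mis-inferred dimension can look like by the time codegen runs.

// Hypothetical illustration (not tract's actual code): a running product over
// the shape, the same scan-and-multiply pattern the backtrace shows in Geo::new.
fn strides(shape: &[usize]) -> Vec<usize> {
    let mut acc = 1usize;
    let mut out: Vec<usize> = shape
        .iter()
        .rev()
        .map(|d| {
            let stride = acc;
            acc *= d; // "attempt to multiply with overflow" in debug builds
            stride
        })
        .collect();
    out.reverse();
    out
}

fn main() {
    // A sane, fully determined shape is fine:
    assert_eq!(strides(&[1, 3, 64, 64]), vec![12288, 4096, 64, 1]);
    // A bogus dimension (e.g. an unresolved symbolic dim turned into a huge usize)
    // makes the running product overflow (panic in debug, silent wrap in release):
    let _ = strides(&[usize::MAX, 3, 64, 64]);
}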

Build failing for stable-x86_64-pc-windows-gnu

On Windows, the only way to use the debugger in CLion is to use the GNU toolchain.
Unfortunately, tract does not compile with that toolchain. I get the following error:

Compiling tract-linalg v0.9.2
error: failed to run custom build command for `tract-linalg v0.9.2`

Caused by:
  process didn't exit successfully: `D:\temp\hello_onnx\target\debug\build\tract-linalg-3c598281dbd21f17\build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'Could not find lib.exe', C:\Users\johan\.cargo\registry\src\github.com-1ecc6299db9ec823\tract-linalg-0.9.2\build.rs:16:17

To reproduce:

rustup default stable-x86_64-pc-windows-gnu
cargo build
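
For what it's worth, lib.exe is the MSVC librarian; the GNU toolchain ships ar instead, so any build step that probes for lib.exe will fail under stable-x86_64-pc-windows-gnu. A hypothetical sketch of that kind of probe (not tract-linalg's actual build.rs):

// Hypothetical sketch, not tract-linalg's build.rs: look for the MSVC
// librarian `lib.exe` on PATH and panic when it is absent, which is always
// the case with the GNU toolchain.
use std::env;
use std::path::PathBuf;

fn find_on_path(tool: &str) -> Option<PathBuf> {
    env::var_os("PATH").and_then(|paths| {
        env::split_paths(&paths)
            .map(|dir| dir.join(tool))
            .find(|candidate| candidate.exists())
    })
}

fn main() {
    if find_on_path("lib.exe").is_none() {
        panic!("Could not find lib.exe");
    }
}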

Heisenbug around path

@liautaud One that the unit tests did not catch :) hurray for proptests... I spent a few hours characterizing it. Care to have a look?

  • checkout the branch bug/heisenbug-around-path
  • run cargo test --release -p conform space_to_batch_1

After a while, one of the asserts I added in space_to_batch throws...

14:21:00 [INFO] Checking inference on op
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch: get input[0]
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch:    input[0] -> inputs[0]
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch: get input[1]
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch:    input[1] -> inputs[1]
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch: get input[2]
14:21:00 [DEBUG] tfdeploy::ops::nn::space_to_batch:    input[2] -> inputs[2]
test space_to_batch_1 ... FAILED

failures:

---- space_to_batch_1 stdout ----
	thread 'space_to_batch_1' panicked at 'assertion failed: `(left == right)`
  left: `[0, 2]`,
 right: `[0, 1]`', src/ops/nn/space_to_batch.rs:91:9

Support TreeEnsembleClassifier op

(I'm aware that it overlaps somewhat with #56, but it's a bit more specific, hence opening it as a separate issue)

Given that it's now officially possible to convert LightGBM (and xgboost) tree ensemble classifiers into ONNX, how realistic would it be to expect tract to support the TreeEnsembleClassifier op in the foreseeable future? This would potentially be a huge feature, instantly unlocking a whole universe of tree ensemble classifiers (and potentially regressors as well).

// I'd be glad to help if there was some guidance on what to do and where; I'm not quite sure how much work it would be to implement this since I'm not very familiar with the internals of tract.
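
For scoping purposes, and independent of tract's internals, evaluating a TreeEnsembleClassifier boils down to walking each tree from root to leaf and accumulating per-class scores, usually followed by a post-transform such as softmax. A rough sketch with illustrative types (the real ONNX op encodes the trees as flat attribute arrays instead):

// Illustrative types, not the ONNX attribute layout and not tract API.
struct Node {
    feature: usize,               // input feature tested at this node
    threshold: f32,               // split threshold
    left: usize,                  // child index when value <= threshold
    right: usize,                 // child index otherwise
    leaf_value: Option<Vec<f32>>, // per-class scores when this is a leaf
}

struct Tree {
    nodes: Vec<Node>,
}

fn eval_tree(tree: &Tree, features: &[f32], scores: &mut [f32]) {
    let mut i = 0;
    loop {
        let node = &tree.nodes[i];
        if let Some(leaf) = &node.leaf_value {
            for (s, v) in scores.iter_mut().zip(leaf.iter()) {
                *s += *v;
            }
            return;
        }
        i = if features[node.feature] <= node.threshold { node.left } else { node.right };
    }
}

fn eval_ensemble(trees: &[Tree], features: &[f32], n_classes: usize) -> Vec<f32> {
    let mut scores = vec![0.0f32; n_classes];
    for tree in trees {
        eval_tree(tree, features, &mut scores);
    }
    scores // a post-transform (e.g. softmax) would normally follow
}

fn main() {
    // A single stump splitting on feature 0 at 0.5, two classes.
    let stump = Tree {
        nodes: vec![
            Node { feature: 0, threshold: 0.5, left: 1, right: 2, leaf_value: None },
            Node { feature: 0, threshold: 0.0, left: 0, right: 0, leaf_value: Some(vec![1.0, 0.0]) },
            Node { feature: 0, threshold: 0.0, left: 0, right: 0, leaf_value: Some(vec![0.0, 1.0]) },
        ],
    };
    assert_eq!(eval_ensemble(&[stump], &[0.7], 2), vec![0.0, 1.0]);
}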

Thanks!

Support for Conv2DTranspose / ConvTranspose ONNX 10

Hello!
I was trying tract and so far it looks amazing.

While playing with a convolutional autoencoder model, I noticed that deconvolution layers are not supported, either for ONNX or for TensorFlow models.

Would it be hard to implement? I looked at the code, but my skills are limited here as I don't understand deconvolution very well.

I could try to implement it, because I feel that this op is not very different from the Conv layer.
The thing is that dynamic output shapes might be problematic.
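
For reference, when the output_shape attribute is absent, the ONNX spec derives the output size of each spatial axis from the input size, kernel size, stride, dilation, padding and output_padding. A small sketch of that rule (illustrative names, not tract API):

// ONNX ConvTranspose output size per spatial axis when `output_shape` is not
// given explicitly. Names are illustrative only.
fn conv_transpose_out_dim(
    input: usize,
    kernel: usize,
    stride: usize,
    dilation: usize,
    pad_begin: usize,
    pad_end: usize,
    output_padding: usize,
) -> usize {
    stride * (input - 1) + output_padding + ((kernel - 1) * dilation + 1) - pad_begin - pad_end
}

fn main() {
    // A typical 2x upsampling deconvolution: 4x4 kernel, stride 2, padding 1 maps 32 -> 64.
    assert_eq!(conv_transpose_out_dim(32, 4, 2, 1, 1, 1, 0), 64);
}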

Very good job so far! It's a pleasure to see something so advanced in pure Rust.

Cheers
