relay-aot's People

Contributors

ad1024, jroesch, marisakirisame, slyubomirsky

relay-aot's Issues

Known Issue: Reusing the result of aot.compile leads to segfault upon Python exiting

Here is a test case from the unit tests, which passes:

def test_compose():
    mod = Module()
    p = Prelude(mod)
    add_nat_definitions(p)
    x = relay.Var('x')
    inc = GlobalVar('inc')
    mod[inc] = Function([x], p.s(x))
    x = relay.Var('x')
    func = GlobalVar('func')
    f = Function([x], relay.Call(p.compose(inc, p.double), [x]))
    mod[func] = f
    cfunc = compile(func, mod)
    assert nat_to_int(cfunc(p.s(p.s(p.z())))) == 5

However, this variant, which calls the compiled function three times instead of once, results in a segfault when the Python interpreter exits (all tests still pass):

def test_compose():
    mod = Module()
    p = Prelude(mod)
    add_nat_definitions(p)
    x = relay.Var('x')
    inc = GlobalVar('inc')
    mod[inc] = Function([x], p.s(x))
    x = relay.Var('x')
    func = GlobalVar('func')
    f = Function([x], relay.Call(p.compose(inc, p.double), [x]))
    mod[func] = f
    cfunc = compile(func, mod)
    assert nat_to_int(cfunc(p.s(p.s(p.z())))) == 5
    assert nat_to_int(cfunc(p.s(p.s(p.z())))) == 5
    assert nat_to_int(cfunc(p.s(p.s(p.z())))) == 5

The GDB backtrace reveals the following:

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
malloc_consolidate (av=av@entry=0x7ffff7dcfc40 <main_arena>) at malloc.c:4439
4439    malloc.c: No such file or directory.
(gdb) bt
#0  malloc_consolidate (av=av@entry=0x7ffff7dcfc40 <main_arena>) at malloc.c:4439
#1  0x00007ffff7a7c0ab in _int_free (have_lock=0, p=<optimized out>, av=0x7ffff7dcfc40 <main_arena>) at malloc.c:4362
#2  __GI___libc_free (mem=0x1b72750) at malloc.c:3124
#3  0x00007fff89044e69 in dmlc::parameter::FieldEntry<int>::~FieldEntry() ()
   from /home/sslyu/.local/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so
#4  0x00007fff89044557 in dmlc::parameter::ParamManager::~ParamManager() ()
   from /home/sslyu/.local/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so
#5  0x00007ffff7a270f1 in __run_exit_handlers (status=0, listp=0x7ffff7dcf718 <__exit_funcs>,
    run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#6  0x00007ffff7a271ea in __GI_exit (status=<optimized out>) at exit.c:139
#7  0x00007ffff7a05b9e in __libc_start_main (main=0x4b0c20 <main>, argc=2, argv=0x7fffffffdcb8,
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdca8)
    at ../csu/libc-start.c:344
#8  0x00000000005b250a in _start ()

There seems to be some kind of nasty interaction happening somewhere inside TVM's memory (I have also had this happen upon exiting Python). This was done on TVM commit 5046ff25116d66032f5d1b69d240f0a655a1ed92; I do not know exactly which TVM commit this bug begins with.

Note also that the bug can be inconsistent: Sometimes duplicating one call to a compiled function will succeed; other times, I will have to duplicate a different compiled function to get a segfault.
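
Because the crash only shows up at interpreter exit and is inconsistent, one way to detect it reliably is to run each candidate test in a fresh child interpreter and inspect the exit status. The following is a minimal sketch under that assumption; the helper names `run_in_child` and `died_from_segfault` are hypothetical and not part of relay-aot or TVM. It relies on the documented POSIX behaviour of `subprocess`, where a child killed by signal N is reported with return code -N.

```python
import signal
import subprocess
import sys


def run_in_child(code):
    """Run `code` in a fresh Python interpreter and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


def died_from_segfault(returncode):
    """On POSIX, subprocess reports death by signal N as return code -N."""
    return returncode == -signal.SIGSEGV
```

For example, passing a snippet that imports and calls test_compose to `run_in_child` and then checking `died_from_segfault` on the result would flag the exit-time crash even though the assertions themselves pass.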

Test for an inception module

Hi, I want to use AOT to test the concatenate op, but I get an error.

import numpy as np
import tvm
import tvm.relay as relay

from aot import compile
from tvm.relay.transform import gradient
from tvm.relay.testing import ctx_list, run_infer_type


def compute(data, axis):
    relay_op = relay.concatenate
    relay_x = []
    for x in data:
        relay_x.append(relay.var("input", relay.TensorType(x.shape, "float32")))
    y = relay_op(relay_x, axis)

    fwd_func = relay.Function(relay_x, y)
    fwd_func = run_infer_type(fwd_func)

    bwd_func = run_infer_type(gradient(fwd_func))

    print("data: ", data)
    tgt = tvm.target.create('llvm')
    ctx = tvm.context('llvm', 0)
    mod = relay.module.Module()
    intrp_wrapper = compile(bwd_func, mod, ctx=ctx, tgt=tgt)
    output = intrp_wrapper(*data)
    print("out: ", output)


def test_concatenate():
    """ Test for concatenate operator """
    print('\n----------------------Test start------------------------')

    def verify_concatenate(dshapes, dtype, axis):
        data = []
        for shape in dshapes:
            x = np.random.rand(*shape).astype(dtype)
            data.append(x)

        compute(data, axis)

    verify_concatenate([(2, 3), (3, 3)], 'float32', 0) # Success
    verify_concatenate([(2, 3), (3, 3), (4, 3), (5, 3)], 'float32', 0) # Success
    verify_concatenate([(2, 3), (3, 3), (4, 3)], 'float32', 0) # Failed
    verify_concatenate([(2, 3), (3, 3), (4, 3), (5, 3), (6, 3)], 'float32', 0) # Failed


if __name__ == '__main__':
    test_concatenate()
 

When I concatenate 2 or 4 arrays, it compiles. But when the number of arrays is 3 or 5, it cannot be compiled. This is the error log:

Traceback (most recent call last):

  File "/home/rui.huang/tvm-0813/tests/python/relay/train/models/test_aot.py", line 50, in <module>
    test_concatenate_grad()

  File "/home/rui.huang/tvm-0813/tests/python/relay/train/models/test_aot.py", line 45, in test_concatenate_grad
    verify_concatenate([(2, 3), (3, 3), (4, 3)], 'float32', 0)

  File "/home/rui.huang/tvm-0813/tests/python/relay/train/models/test_aot.py", line 41, in verify_concatenate
    compute(data, axis)

  File "/home/rui.huang/tvm-0813/tests/python/relay/train/models/test_aot.py", line 26, in compute
    intrp_wrapper = compile(bwd_func, mod, ctx=ctx, tgt=tgt)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 264, in compile
    func = compiler.visit(func)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 43, in visit
    res = self.visit_function(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 162, in visit_function
    return CPPFunction(func.params, self.visit(func.body), func.checked_type.ret_type)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 43, in visit
    res = self.visit_function(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 162, in visit_function
    return CPPFunction(func.params, self.visit(func.body), func.checked_type.ret_type)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 43, in visit
    res = self.visit_function(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 162, in visit_function
    return CPPFunction(func.params, self.visit(func.body), func.checked_type.ret_type)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 47, in visit
    res = self.visit_let(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 139, in visit_let
    cpp_value = self.visit(let.value)

  File "/home/rui.huang/tvm-0813/python/tvm/relay/expr_functor.py", line 45, in visit
    res = self.visit_call(expr)

  File "/home/rui.huang/tvm-0813/relay-aot/python/aot/aot.py", line 130, in visit_call
    assert (call.attrs == None)

AssertionError

I set a breakpoint in aot.py at def visit_call(self, call: Expr) -> Expr:, and the debugger shows:

call = {Call} v0.0.3\nfree_var %p0: Tensor[(9, 3), float32]\nsplit(%p0, indices_or_sections=[2, 5]) /* ty=(Tensor[(2, 3), float32], Tensor[(3, 3), float32], Tensor[(4, 3), float32]) */
 _checked_type_ = {TupleType} v0.0.3\n(Tensor[(2, 3), float32], Tensor[(3, 3), float32], Tensor[(4, 3), float32])
 args = {Array} [Var(p0, ty=TensorType([9, 3], float32))]
 attrs = {SplitAttrs} relay.attrs.SplitAttrs(0x1b9f7d0)
 op = {Op} v0.0.3\nsplit
 span = {NoneType} None
 type_args = {Array} [TensorType([9, 3], float32)]
self = {AoTCompiler} <aot.aot.AoTCompiler object at 0x7f1fa7eb6fd0>

As far as I know, when it can be compiled, call.attrs is None, but when the number of inputs is odd, call.attrs is of type SplitAttrs.
By the way, the backward op of concatenate is split. I implemented it this way:

@register_gradient("concatenate")
def concatenate_grad(orig, grad):
    """
    Return concatenate gradient
    :param orig: attrs(axis)
    :param grad: initial gradient computed for execution result of concatenate
    :return:
    """

    axis = orig.attrs.axis

    # Compute the data shape after split
    data_type = orig.type_args[0]
    indices = []
    split_indices = 0

    for field in data_type.fields[:-1]:
        split_indices += field.shape[axis]
        indices.append(split_indices)

    return [split(grad, indices, axis)]

It can be compiled when using the JIT.
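
To make the index computation in the gradient above concrete, here is a small sketch that reproduces it on plain shape tuples for the failing three-input case. `np.split` is used only as a stand-in to check the arithmetic; it is not the Relay `split` op, and the helper name `split_indices` is hypothetical.

```python
import numpy as np


def split_indices(shapes, axis):
    """Cumulative sizes along `axis` for every input except the last."""
    indices = []
    offset = 0
    for shape in shapes[:-1]:
        offset += shape[axis]
        indices.append(offset)
    return indices


shapes = [(2, 3), (3, 3), (4, 3)]
indices = split_indices(shapes, axis=0)  # [2, 5]

# A gradient with the shape of the concatenated output (9 rows).
grad = np.ones((9, 3), dtype="float32")
pieces = np.split(grad, indices, axis=0)
piece_shapes = [p.shape for p in pieces]  # [(2, 3), (3, 3), (4, 3)]
```

The computed indices `[2, 5]` match the `indices_or_sections=[2, 5]` shown in the debugger output, so the index arithmetic itself looks consistent with the input shapes.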

Can anyone give me some suggestions? Thanks very much.

How to reuse compiled librelay_aot_{_LIB_COUNTER}.so file?

When I use AOT to train an MLP model, it generates a source.cc file. Compiling this source.cc file produces a librelay_aot_{_LIB_COUNTER}.so file in the tmp/relay_aot_compiler... directory. Because the compilation is very slow, I would like to ask whether there is any way to reuse the compiled librelay_aot_{_LIB_COUNTER}.so file.
I tried setting library_path directly to the path of the existing file, but it doesn't work:
# library_path = compile_cpp(source_code, lib_name, flags=["-O3"])
library_path = '/tmp/relay_aot_compilerz3wrq31z/librelay_aot_2.so'
Could anyone give me some suggestions? Thanks!
@MarisaKirisame @SWu
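
One possible direction, sketched here under stated assumptions, is to key the compiled library on a hash of the generated source so that identical source is only compiled once. The function `cached_library` and its `compile_fn` parameter are hypothetical; `compile_fn` stands in for whatever produces the .so file (for instance a wrapper around the `compile_cpp` call quoted above). This only covers caching the file on disk. Whether a cached library can actually be loaded and run in a fresh process is a separate question, and the "Compiled code isn't standalone" issue below suggests it may not work as-is.

```python
import hashlib
import os
import shutil


def cached_library(source_code, cache_dir, compile_fn):
    """Return a path to a shared library for `source_code`, compiling at most once.

    `compile_fn(source_code)` is a hypothetical callable that compiles the
    source and returns the path of the resulting .so file.
    """
    os.makedirs(cache_dir, exist_ok=True)
    digest = hashlib.sha256(source_code.encode("utf-8")).hexdigest()
    cached_path = os.path.join(cache_dir, "librelay_aot_" + digest + ".so")
    if not os.path.exists(cached_path):
        # Cache miss: compile, then copy the result under a content-based name.
        built_path = compile_fn(source_code)
        shutil.copyfile(built_path, cached_path)
    return cached_path
```

Using a content hash rather than a counter in the file name means a changed source.cc never picks up a stale library.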

Compiled code isn't standalone

I don't think the compiled function is standalone right now, as ops inside the function are registered at compile-time as hashed references to JIT functions (i.e. here), which show up in the generated source.cc as e.g. runtime::Registry::Get("op_-3048088960110736787");

I believe this means that outside the Python interpreter process in which the function was first compiled, these references won't be valid. In particular, I don't think it's possible to integrate the generated source.cc directly into a C++ app.

Is there a way to generate truly standalone native code?
