
Catching traps about wasmtime HOT 7 CLOSED

bytecodealliance commented on July 24, 2024

Catching traps

from wasmtime.

Comments (7)

sunfishcode commented on July 24, 2024

Yeah. The TrapSink interface allows for the creation of a map from PC addresses to wasm trap ID and bytecode offset, but it does require that signals be caught and have access to the saved state. SpiderMonkey does this, on all its supported platforms, so it is doable.
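To make the "map from PC addresses to wasm trap ID and bytecode offset" concrete, here is a minimal sketch in plain Rust. It is not the TrapSink interface itself: `TrapCode`, `TrapInfo`, `TrapTable`, `record` and `lookup` are all hypothetical names made up for illustration, and the sketch assumes the saved PC points exactly at the trapping instruction.

```rust
use std::collections::BTreeMap;

/// Hypothetical trap codes; the real set is defined by the code generator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrapCode {
    HeapOutOfBounds,
    IntegerDivisionByZero,
    Unreachable,
}

/// What we want to recover for a faulting instruction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TrapInfo {
    pub code: TrapCode,
    /// Offset of the originating instruction in the wasm bytecode.
    pub bytecode_offset: u32,
}

/// Map from the absolute PC of a trapping instruction to its trap info.
#[derive(Default)]
pub struct TrapTable {
    by_pc: BTreeMap<usize, TrapInfo>,
}

impl TrapTable {
    /// Record one trap site. `code_base` is where the function was placed in
    /// memory; `offset_in_code` is the instruction's offset within it.
    pub fn record(&mut self, code_base: usize, offset_in_code: usize, info: TrapInfo) {
        self.by_pc.insert(code_base + offset_in_code, info);
    }

    /// Look up the PC that the signal handler read from the saved state.
    /// Returns `None` if the PC is not a known trap site.
    pub fn lookup(&self, pc: usize) -> Option<TrapInfo> {
        self.by_pc.get(&pc).copied()
    }
}
```

The table would be filled while code is emitted and only read once a fault has been caught, so a lookup that returns `None` can be treated as "not a wasm trap".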

So we have a few options here. One is to start building up a signal handling library and handling the signals. I imagine we'd start by just exiting the process cleanly, which shouldn't be too complex. We could then incrementally work on printing out the trap code and/or bytecode offset, or further, unwinding the stack and allowing the embedder to recover, which are doable, but more work.

Another would be to add a feature to cranelift for calling a designated callback when a trap would otherwise occur. This would make the generated code bigger, and preclude the heap guard optimizations and require explicit bounds checks on all heap accesses, but it would make it easier to embed wasmtime in environments where signals aren't available.
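As a rough illustration of the callback approach, the sketch below shows in ordinary Rust what each heap access would have to do if it could not rely on guard pages: check the bounds explicitly and call a callback instead of faulting. This is not cranelift output or a cranelift API; `HeapTrap`, `checked_load_u32` and the `on_trap` parameter are hypothetical names chosen for the example.

```rust
/// Hypothetical trap description handed to the embedder's callback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeapTrap {
    /// The access starting at `addr` (heap-relative) was out of bounds.
    OutOfBounds { addr: u64 },
}

/// Load a little-endian u32 from `heap` at `base + offset`.
/// Instead of touching memory and relying on a signal, check the bounds
/// first and call `on_trap` when the access would be out of range.
pub fn checked_load_u32(
    heap: &[u8],
    base: u32,
    offset: u32,
    on_trap: &mut dyn FnMut(HeapTrap),
) -> Option<u32> {
    // Compute in u64 so that base + offset + 4 cannot wrap around.
    let start = u64::from(base) + u64::from(offset);
    let end = start + 4;
    if end > heap.len() as u64 {
        on_trap(HeapTrap::OutOfBounds { addr: start });
        return None;
    }
    let s = start as usize;
    let bytes = [heap[s], heap[s + 1], heap[s + 2], heap[s + 3]];
    Some(u32::from_le_bytes(bytes))
}
```

The extra compare and branch on every access is the code-size and speed cost mentioned above; in exchange, nothing here depends on signals being available.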


pepyakin commented on July 24, 2024

Hm, correct me if my thinking is too naive, but is it that difficult to start with signals with unwinding right away?

I thought that it is as simple as:

1. Before jumping to the generated code in execute, call setjmp.
2. Set the signal handler for SIGILL (for ud2), SIGBUS and SIGSEGV.
3. In the handler, check whether the signal comes from the generated code, translate the signal info to a trap code, fetch the location, etc. Save that info somewhere; possibly fetch vmctx somehow, or store it through a thread_local (if that's signal-safe enough ¯\_(ツ)_/¯).
4. longjmp back to execute and return the trap info as Err.

would that work or am I missing something?


sunfishcode commented on July 24, 2024

Interesting idea. I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe it does work on at least some though, so I wouldn't be opposed to having that as an option.

If we ever allow wasm to call into arbitrary native code and vice versa, we'd probably want to do a new setjmp each time we call back into the wasm code, so that a longjmp never unwinds through native Rust frames, but that's doable.

In the future, another option would be to do an unwind. Cranelift doesn't yet support .eh_frame or other metadata needed by native unwinders, but if we added that, then we could do a plain unwind all the way through both wasm and native code at once.


pepyakin commented on July 24, 2024

> I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe it does work on at least some though, so I wouldn't be opposed to having that as an option.

Ha! I just implemented (or rather hacked :) ) setting a signal handler for SIGILL and catching unreachable/ud2 on a macOS machine, and it works! I think there is a good chance that it will work on Linux as well. In theory I can also test on a Windows machine.

Regarding the second option: is it actually safe? For example, what if that native code wasn't compiled with unwind metadata?


sunfishcode commented on July 24, 2024

Fun!

For unwinding, yeah, that may require all native code to be unwindable. On x86-64 System-V ABIs, LLVM and GCC both emit .eh_frame sections for all code, including C code. Other unwinders also have the ability to at least follow frame pointers. Ultimately we'd have to check each platform to see what's supported, but it'd be an option, and it's one we might need to explore eventually anyway when wasm gets support for EH.


pepyakin commented on July 24, 2024

I've verified that my code runs correctly on a Linux machine!

However, I've run into a problem: it's unclear how to read and write data from the signal handler. We need to read the data that longjmp uses, and we need to write data to pass the illegal instruction address or faulting address back to execute, which translates it into a useful form.

It turned out that thread_local is actually not safe enough (rust-lang/rust#43146), and we can't use global variables for that since they're not thread-safe.
Also, it is not clear to me how to get vmctx without resorting to black magic.

How does SpiderMonkey solve this issue?


sunfishcode commented on July 24, 2024

The short answer is that SpiderMonkey doesn't use Rust's std::thread_local to do this.

I don't know of a way to do this in safe Rust. With unsafe code, we could have a global (not thread-local) variable hold a raw vmctx pointer. It can be statically initialized to null, and assigned the vmctx value before we call into any JIT code. Then, the signal handler can read it, and if it's null, it means we're not in JIT code. If it's non-null, then it points to a data structure where we can keep the known ranges of JIT code and use them to determine if that's where the fault happened.

And we can have variations on that if we want to support multiple wasmtime instances in the same process.
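A minimal sketch of the global-pointer idea, assuming a single wasmtime instance and a single thread running JIT code at a time. `VmCtx`, `CURRENT_VMCTX`, `with_vmctx` and `fault_is_in_jit_code` are hypothetical names; the real vmctx layout is different, and the signal handler itself (plus the setjmp/longjmp part) is left out. An `AtomicPtr` is used rather than a `static mut` so that the handler's read is a plain atomic load.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical stand-in for the per-instance context.
pub struct VmCtx {
    /// Half-open [start, end) address ranges occupied by JIT code.
    pub jit_ranges: Vec<(usize, usize)>,
}

/// Statically initialized to null; non-null only while JIT code may be running.
static CURRENT_VMCTX: AtomicPtr<VmCtx> = AtomicPtr::new(ptr::null_mut());

/// Publish `ctx`, run `f` (standing in for the call into JIT code), then
/// reset the pointer to null. If `f` were left via longjmp or a panic, the
/// caller would have to clear the pointer itself.
pub fn with_vmctx<R>(ctx: &mut VmCtx, f: impl FnOnce() -> R) -> R {
    CURRENT_VMCTX.store(ctx as *mut VmCtx, Ordering::SeqCst);
    let result = f();
    CURRENT_VMCTX.store(ptr::null_mut(), Ordering::SeqCst);
    result
}

/// What a signal handler could call with the faulting PC. It only performs
/// an atomic load and a scan over an existing Vec, so it does not allocate.
pub fn fault_is_in_jit_code(pc: usize) -> bool {
    let ctx = CURRENT_VMCTX.load(Ordering::SeqCst);
    if ctx.is_null() {
        return false; // Not in JIT code at all.
    }
    // SAFETY: the pointer is only non-null inside `with_vmctx`, where `ctx`
    // is still borrowed and therefore still alive.
    let ctx = unsafe { &*ctx };
    ctx.jit_ranges
        .iter()
        .any(|&(start, end)| start <= pc && pc < end)
}
```

Supporting several instances or threads would need one of the variations mentioned above, for example a per-thread slot or a list of contexts, which this sketch does not attempt.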

