Catching traps (wasmtime, closed)

bytecodealliance commented on May 10, 2024
Catching traps

from wasmtime.

Comments (7)

sunfishcode commented on May 10, 2024

Yeah. The TrapSink interface allows for the creation of a map from PC addresses to wasm trap ID and bytecode offset, but it does require that signals be caught and have access to the saved state. SpiderMonkey does this, on all its supported platforms, so it is doable.
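To make the map concrete, here is a minimal sketch of a table from code offsets to trap information, assuming the compiler reports one entry per trapping instruction. The names `TrapMap`, `TrapInfo`, and the two `TrapCode` variants are hypothetical and are not Cranelift's actual API.

```rust
/// Hypothetical trap codes; a real compiler defines its own set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrapCode {
    HeapOutOfBounds,
    Unreachable,
}

/// What we want to recover once we know which instruction faulted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TrapInfo {
    code: TrapCode,
    bytecode_offset: u32,
}

/// Entries are kept sorted by code offset (PC relative to the start of
/// the compiled function) so that lookup is a binary search.
#[derive(Default)]
struct TrapMap {
    entries: Vec<(u32, TrapInfo)>,
}

impl TrapMap {
    /// Called once per trapping instruction while code is emitted.
    fn record(&mut self, code_offset: u32, info: TrapInfo) {
        let idx = self.entries.partition_point(|(off, _)| *off < code_offset);
        self.entries.insert(idx, (code_offset, info));
    }

    /// Called after a fault, with the faulting PC converted to an offset.
    fn lookup(&self, code_offset: u32) -> Option<TrapInfo> {
        self.entries
            .binary_search_by_key(&code_offset, |(off, _)| *off)
            .ok()
            .map(|i| self.entries[i].1)
    }
}
```

A lookup that returns `None` would mean the faulting PC is not a known trap site, so the fault should not be treated as a wasm trap.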

So we have a few options here. One is to start building up a signal handling library and handling the signals. I imagine we'd start by just exiting the process cleanly, which shouldn't be too complex. We could then incrementally work on printing out the trap code and/or bytecode offset, or further, unwinding the stack and allowing the embedder to recover, which are doable, but more work.

Another would be to add a feature to cranelift for calling a designated callback when a trap would otherwise occur. This would make the generated code bigger, and preclude the heap guard optimizations and require explicit bounds checks on all heap accesses, but it would make it easier to embed wasmtime in environments where signals aren't available.
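As a rough picture of what the callback approach implies, the sketch below writes out by hand the check that generated code would have to perform on every heap access. It is only an illustration in ordinary Rust; `checked_load_u32`, `TrapCode`, and the callback signature are hypothetical, not an existing cranelift feature.

```rust
/// Hypothetical trap code passed to the embedder's callback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrapCode {
    HeapOutOfBounds,
}

/// Load a little-endian u32 from the wasm heap with an explicit bounds
/// check. Instead of faulting on a guard page, an out-of-bounds access
/// calls `on_trap` and returns `None`.
fn checked_load_u32(
    heap: &[u8],
    addr: u32,
    on_trap: &mut dyn FnMut(TrapCode),
) -> Option<u32> {
    let start = addr as usize;
    // `checked_add` guards against address arithmetic wrapping around.
    match start.checked_add(4).and_then(|end| heap.get(start..end)) {
        Some(b) => Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]])),
        None => {
            on_trap(TrapCode::HeapOutOfBounds);
            None
        }
    }
}
```

The extra compare and branch on every access is the code-size and speed cost mentioned above; the benefit is that no signal handler is needed.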

pepyakin commented on May 10, 2024

Hm, correct me if my thinking is too naive, but is it that difficult to start with signals with unwinding right away?

I thought that it is as simple as:

  1. Before jumping to the generated code in execute, call setjmp.
  2. Set the signal handler for SIGILL (for ud2), SIGBUS and SIGSEGV.
  3. In the handler, check whether the signal comes from the generated code, then translate the signal info to a trap code, fetch the location, etc. Save that info somewhere, possibly by somehow fetching vmctx, or store it through a thread_local (if that's signal-safe enough ¯_(ツ)_/¯).
  4. longjmp back to execute and return the trap info as Err.

would that work or am I missing something?
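The decision logic of steps 3 and 4 can be sketched separately from the signal machinery. The code below leaves out setjmp, longjmp, and handler installation entirely and only shows how a handler might classify a fault and how execute might surface it as an Err. All names are hypothetical, and the mapping from signal to trap code is an assumption made for illustration (in practice one signal can correspond to several trap kinds).

```rust
use std::ops::Range;

/// Platform-independent stand-ins for SIGILL, SIGBUS and SIGSEGV.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Signal {
    Ill,
    Bus,
    Segv,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrapCode {
    Unreachable,
    HeapOutOfBounds,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Trap {
    code: TrapCode,
    pc: usize,
}

/// Step 3: decide whether the fault belongs to generated code and, if
/// so, translate it. `None` means the signal is not ours and should be
/// forwarded to whatever handler was installed before.
fn classify_fault(signal: Signal, pc: usize, jit_code: &Range<usize>) -> Option<Trap> {
    if !jit_code.contains(&pc) {
        return None;
    }
    // Assumed mapping: ud2 raises SIGILL, bad heap accesses raise SIGSEGV/SIGBUS.
    let code = match signal {
        Signal::Ill => TrapCode::Unreachable,
        Signal::Bus | Signal::Segv => TrapCode::HeapOutOfBounds,
    };
    Some(Trap { code, pc })
}

/// Step 4: after control returns to execute, turn the saved trap (if
/// any) into the result handed to the caller.
fn execute_result(pending: Option<Trap>) -> Result<(), Trap> {
    match pending {
        Some(trap) => Err(trap),
        None => Ok(()),
    }
}
```

Keeping the classification a pure function of the signal and the faulting PC means the handler itself only has to read the saved context and store one small value.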

sunfishcode commented on May 10, 2024

Interesting idea. I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe it does work on at least some, though, so I wouldn't be opposed to having that as an option.

If we ever allow wasm to call into arbitrary native code and vice versa, we'd probably want to do a new setjmp each time we call back into the wasm code, so that we don't unwind through native Rust code with longjmp, but that's doable.

In the future, another option would be to do an unwind. Cranelift doesn't yet support .eh_frame or other metadata needed by native unwinders, but if we added that, then we could do a plain unwind all the way through both wasm and native code at once.

pepyakin commented on May 10, 2024

> I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe does work on at least some though, so I wouldn't be opposed to having that as an option.

Ha! I just implemented (or rather hacked :) ) setting a signal handler for SIGILL and catching unreachable/ud2 on a macOS machine, and it works! I think there is a good chance that it will work on Linux as well. Theoretically, I can also test on a Windows machine.

Regarding the second option: is it actually safe? For example, what if that native code wasn't compiled with unwind metadata?

sunfishcode commented on May 10, 2024

Fun!

For unwinding, yeah, that may require all native code to be unwindable. On x86-64 System-V ABIs, LLVM and GCC both emit .eh_frame sections for all code, including C code. Other unwinders also have the ability to at least follow frame pointers. Ultimately we'd have to check each platform to see what's supported, but it'd be an option, and it's one we might need to explore eventually anyway when wasm gets support for EH.

pepyakin commented on May 10, 2024

I've verified that my code runs correctly on a Linux machine!

However, I've run into a problem: it's unclear how to read and write data from the signal handler. We need to read the data that longjmp uses, and we need to write data to pass the illegal instruction address or faulting address back to execute, so that it can be translated into a useful form.

It turned out that thread_local is actually not safe enough (rust-lang/rust#43146), and we can't use global variables for that since they're not thread-safe.
Also, it is not clear to me how to get vmctx without resorting to black magic.

How does SpiderMonkey solve this issue?

sunfishcode commented on May 10, 2024

The short answer is that SpiderMonkey doesn't use Rust's std::thread_local to do this.

I don't know of a way to do this in safe Rust. With unsafe code, we could have a global (not thread-local) variable hold a raw vmctx pointer. It can be statically initialized to null, and assigned the vmctx value before we call into any JIT code. Then, the signal handler can read it, and if it's null, it means we're not in JIT code. If it's non-null, then it points to a data structure where we can keep the known ranges of JIT code and use them to determine if that's where the fault happened.
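A minimal sketch of that idea, assuming only one thread runs JIT code at a time, could look like the following. The `VmCtx` layout, `CURRENT_VMCTX`, `with_vmctx`, and `fault_is_in_jit_code` are hypothetical names; an `AtomicPtr` is used so that the handler's read is a single atomic load rather than a data race.

```rust
use std::ops::Range;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical context; here it only records where JIT code lives.
struct VmCtx {
    jit_code_ranges: Vec<Range<usize>>,
}

/// Null whenever no JIT code is running.
static CURRENT_VMCTX: AtomicPtr<VmCtx> = AtomicPtr::new(ptr::null_mut());

/// What the signal handler would call with the faulting PC.
fn fault_is_in_jit_code(fault_pc: usize) -> bool {
    let vmctx = CURRENT_VMCTX.load(Ordering::Acquire);
    if vmctx.is_null() {
        return false; // not in JIT code at all
    }
    // SAFETY (assumed invariant): `with_vmctx` keeps the pointee alive
    // and unmodified for as long as the pointer is published.
    let vmctx = unsafe { &*vmctx };
    vmctx.jit_code_ranges.iter().any(|r| r.contains(&fault_pc))
}

/// Publish `vmctx` for the duration of `run_jit`, then clear it again.
fn with_vmctx<R>(vmctx: &VmCtx, run_jit: impl FnOnce() -> R) -> R {
    CURRENT_VMCTX.store(vmctx as *const VmCtx as *mut VmCtx, Ordering::Release);
    let result = run_jit();
    CURRENT_VMCTX.store(ptr::null_mut(), Ordering::Release);
    result
}
```

Because the sketch uses a single global slot, two threads entering JIT code concurrently would overwrite each other's pointer, and if `run_jit` exits by longjmp or a panic the pointer is never cleared, so real code would need to reset it on that path as well.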

And we can have variations on that if we want to support multiple wasmtime instances in the same process.
