Comments (4)

PSeitz commented on September 2, 2024

Thanks for the bug report. The first stacktrace from atos is garbled; it doesn't make sense.

The second stacktrace does make sense, but looking at the code, I don't see how that could happen.
It uses the tantivy_bitpacker, which does not use any SIMD code in that path.

Can you run it with a modified version of tantivy? I could push some changes to a branch to get more context when the panic occurs. Is the GH repo public?

from tantivy.

fulmicoton commented on September 2, 2024

The code you point at is NOT using SIMD. SIMD bitpacking uses a different layout that prevents efficient random access, and we need random access for columns.
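To illustrate why a scalar layout allows random access, here is a minimal sketch of fixed-width bitpacking. It is not tantivy's code: `pack` and `get` are hypothetical names, and the layout (values stored back to back, least significant bit first, with 8 bytes of padding) is an assumption made for the example. Any value can be fetched by computing its bit address, with no need to decode a whole block.

```rust
/// Packs `values` back to back, `num_bits` bits each, least significant bit first.
/// Hypothetical helper for illustration only; widths are limited to 1..=56 here.
fn pack(values: &[u64], num_bits: u8) -> Vec<u8> {
    assert!((1..=56).contains(&num_bits));
    let total_bits = values.len() * num_bits as usize;
    // 8 bytes of padding so that `get` can always read a full u64.
    let mut out = vec![0u8; (total_bits + 7) / 8 + 8];
    for (i, &v) in values.iter().enumerate() {
        assert!(v < (1u64 << num_bits), "value does not fit in num_bits");
        let bit_addr = i * num_bits as usize;
        let (byte, shift) = (bit_addr / 8, bit_addr % 8);
        // shift <= 7 and num_bits <= 56, so `v << shift` never overflows a u64.
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&out[byte..byte + 8]);
        let word = u64::from_le_bytes(buf) | (v << shift);
        out[byte..byte + 8].copy_from_slice(&word.to_le_bytes());
    }
    out
}

/// Random access: reads the value at position `idx` with a single unaligned u64 load.
fn get(data: &[u8], num_bits: u8, idx: usize) -> u64 {
    let bit_addr = idx * num_bits as usize;
    let (byte, shift) = (bit_addr / 8, bit_addr % 8);
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&data[byte..byte + 8]);
    (u64::from_le_bytes(buf) >> shift) & ((1u64 << num_bits) - 1)
}
```

In this sketch a value starts at most 7 bits into a byte, so widths up to 56 bits always fit in one 8-byte read. That is one plausible reason for a 56-bit limit in a scheme like this, though the sketch does not claim to match the real implementation.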

Like @PSeitz, I have no clue where it could come from.

There are two calls to BitUnpacker::new in this file.
The first one happens during deserialization of the existing column. It is probably called as we consume the dyn Iterator, but I think the stacktrace would have looked different if that were the one being hit.

The second one happens on code that is rather straightforward.

bit_width is obtained via:
let bit_width = buffer.iter().copied().map(compute_num_bits).max().unwrap();

pub fn compute_num_bits(n: u64) -> u8 {
    let amplitude = (64u32 - n.leading_zeros()) as u8;
    if amplitude <= 64 - 8 {
        amplitude
    } else {
        64
    }
}

As far as I can tell, every number returned by this function, and therefore their max, should satisfy the assert.
If it is easy to reproduce, we can add a couple of asserts here and there, and more info on the value triggering the assert.
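A quick way to sanity-check that claim is to run `compute_num_bits` over values around every power of two. The sketch below copies the function from above. The condition in `is_valid_bit_width` is an assumption about what the assert checks (at most 56 bits, or exactly 64), and `boundary_values` is a hypothetical helper.

```rust
/// Copied from the snippet above.
pub fn compute_num_bits(n: u64) -> u8 {
    let amplitude = (64u32 - n.leading_zeros()) as u8;
    if amplitude <= 64 - 8 {
        amplitude
    } else {
        64
    }
}

/// Assumed shape of the assert: widths up to 56 bits, or exactly 64.
fn is_valid_bit_width(num_bits: u8) -> bool {
    num_bits <= 56 || num_bits == 64
}

/// Hypothetical helper: 0, u64::MAX, and the values around each power of two.
fn boundary_values() -> Vec<u64> {
    let mut vals = vec![0u64, u64::MAX];
    for k in 0..64u32 {
        let p = 1u64 << k;
        vals.push(p - 1);
        vals.push(p);
        vals.push(p + 1); // 2^63 + 1 still fits in a u64
    }
    vals
}

/// Returns the max bit width over the boundary values, panicking with the
/// offending value if any width violates the assumed condition.
fn check_boundaries() -> u8 {
    let vals = boundary_values();
    for &v in &vals {
        let bits = compute_num_bits(v);
        assert!(is_valid_bit_width(bits), "value {v:#x} gave bit width {bits}");
    }
    vals.iter().copied().map(compute_num_bits).max().unwrap()
}
```

If the assumed condition is right, no input can make this panic, which is consistent with the suspicion that the failure comes from outside this code.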

If you can share the segments and the schema triggering the assert, this would be even more helpful.
Rust 1.78 had an LLVM upgrade so it could even be an actual compiler bug.

Also, do you have some kind of compiler cache in your CI? If so, can you try cleaning or disabling it and see if your problem goes away?

fe9lix commented on September 2, 2024

Thanks for getting back. Sorry that the first stack trace is not more helpful. That's just the atos output for a couple of memory addresses from the crash log to see whether bitpacking showed up somewhere.

None of it makes sense to me either, so I tried comparing the build environments, and the difference in CPU features was the only one I could spot so far. (But, as far as I understand, SIMD is not involved, and even if there were compile-time differences, they wouldn't explain the different runtime behavior because of feature detection.)
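For reference, the distinction between compile-time CPU features and runtime detection can be sketched as follows. The function names are hypothetical and the handful of features checked is an arbitrary choice for illustration; only the `cfg!` macro and the standard library detection macros are real.

```rust
/// Features the compiler was allowed to assume when this binary was built.
fn compile_time_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    if cfg!(target_feature = "neon") {
        found.push("neon");
    }
    if cfg!(target_feature = "sse2") {
        found.push("sse2");
    }
    if cfg!(target_feature = "avx2") {
        found.push("avx2");
    }
    found
}

/// Features reported by the standard library's detection macros on the
/// machine that is actually running the binary.
fn runtime_features() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut found = Vec::new();
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("neon");
        }
    }
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
    }
    found
}
```

Printing both lists on the build machine and on the machine that crashes is one cheap way to compare environments.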

Now I tried running a CI build again with the old config and got an interesting new permutation of the crash this time (right after running the binary on the command line):
53287 illegal hardware instruction
so it tries running a machine instruction that is not supported by the CPU.

Here's the build config:

rust-toolchain.toml

[toolchain]
channel = "1.76.0"
targets = ["aarch64-apple-darwin", "x86_64-apple-darwin"]

Cargo.toml

[profile.release]
strip = "debuginfo"
opt-level = 3
lto = true
codegen-units = 1
panic = "unwind"

A build with toolchain 1.78.0 and panic = "abort" as the only release profile option has been running stably so far.

Re CI and caching: There should be a fresh GitHub runner for each run, and we don't do any custom caching in the action. For the toolchain setup we've been using this action, then set up cargo make via this action, run cargo clean, and then build via:

command = "cargo"
args = [
    "build",
    "--release",
    "--target",
    "aarch64-apple-darwin",
    "--locked",
    "-p",
    "server",
]

I'll check the GitHub Actions again to see whether they could have any effect.

fe9lix commented on September 2, 2024

I haven't been able to reproduce it since we removed the third-party actions from our GitHub CI workflow. (Btw, we've found that the macOS runner supports the Rust toolchain by default, so there's no need for additional setup actions.)
