Coder Social home page Coder Social logo

Comments (14)

Natooz avatar Natooz commented on August 25, 2024 1

Are the logs of the failed actions inaccessible ? https://github.com/Natooz/MidiTok/actions/runs/7700019008/job/20983922442
EDIT: my bad, I thought they stayed accessible.
Here are the the raw logs

Otherwise you can reproduce it easily on an intel Mac:

# Make sure to have loaded a virtual python environment

git clone https://github.com/Natooz/MidiTok && cd MidiTok
pip install ".[tests]"
pytest -n logical -v tests/test_tokenize.py::test_multitrack_midi_to_tokens_to_midi

from symusic.

Natooz avatar Natooz commented on August 25, 2024 1

Minimal failing test method:

def _test_tokenize(
    midi_path: str | Path,
    tok_params_set: tuple[str, dict[str, Any]],
) -> None:
    r"""
    Tokenize a MIDI file, decode it back and make sure it is identical to the ogi.

    The decoded MIDI should be identical to the original one after downsampling, and
    potentially notes deduplication.

    :param midi_path: path to the MIDI file to test.
    :param tok_params_set: tokenizer and its parameters to run.
    """
    # Reads the MIDI and add pedal messages to make sure there are some
    try:
        midi = Score(Path(midi_path))
    except MIDI_LOADING_EXCEPTION as e:
        pytest.skip(f"Error when loading {midi_path.name}: {e}")

    # Creates the tokenizer
    tokenization, params = tok_params_set
    tokenizer: miditok.MIDITokenizer = getattr(miditok, tokenization)(
        tokenizer_config=miditok.TokenizerConfig(**params)
    )

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

Sorry, but I don't find the log there. It seems all the tests have been passed.

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

Well, I did find something wrong with multiprocessing. The __setstate__ function was wrong. I have fixed it, but it is not related to this issue (Nothing changed after fixing it).

I didn't change the parsing midi code from 0.3.3 to 0.3.4. But checkout to v0.3.3 just works. And I have tried installing symusic from source, which would give the same error.

And this bug only appears on Mac. Quite a strange bug.

from symusic.

Natooz avatar Natooz commented on August 25, 2024

I first though of an error with pytest itself or pytest-xdist as mentioned in other issues pytest-dev/pytest#3216 but even with pytest v7.4.4 the issue occurs

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

I will test the changes from 0.3.3 to 0.3.4 one by one. Fortunately, there are only a few changes

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

@Natooz Could you offer me a minimal reproduce example without pytest? The error infomation in pytest is kind of ambiguous.

from symusic.

Natooz avatar Natooz commented on August 25, 2024

Do you have an idea of how this could be achieved? The issue is that it only occurs with xdist, even with one worker.

Here are a few tests I just ran (no code modification):

pytest -n 8 -v tests/test_tokenize.py::test_multitrack_midi_to_tokens_to_midi

-n 1 --> 3 fails
-n 2 --> 6 fails
-n 8 --> 3 fail
-n 20 --> 2 fails

I tested while the test method only loads the MIDI, no error occur, so this doesn't comes from here.

It fails (1 worker out of 8) when the test method is simply:

def _test_tokenize(
    midi_path: str | Path,
    tok_params_set: tuple[str, dict[str, Any]],
) -> None:
    r"""
    Tokenize a MIDI file, decode it back and make sure it is identical to the ogi.

    The decoded MIDI should be identical to the original one after downsampling, and
    potentially notes deduplication.

    :param midi_path: path to the MIDI file to test.
    :param tok_params_set: tokenizer and its parameters to run.
    """
    # Reads the MIDI and add pedal messages to make sure there are some
    try:
        midi = Score(Path(midi_path))
    except MIDI_LOADING_EXCEPTION as e:
        pytest.skip(f"Error when loading {midi_path.name}: {e}")

    # Creates the tokenizer
    tokenization, params = tok_params_set
    tokenizer: miditok.MIDITokenizer = getattr(miditok, tokenization)(
        tokenizer_config=miditok.TokenizerConfig(**params)
    )
    str(tokenizer)  # shouldn't fail
    tokenizer.preprocess_midi(midi)

So the issue occurs when preprocessing the MIDI, that is converting its attributes (notes, pitch bends...) to SOA and SOA back to attributes which are then assigned to the Scoreobject.

**EDIT: the issue occurs without preprocessing the MIDI, but the tokenizer has to be loaded for it to happen.

I'm investigating further

from symusic.

Natooz avatar Natooz commented on August 25, 2024

With the test method above, the tests pass on my machine with certain numbers of workers: 2 workers the tests pass, 4 workers sometimes one fails sometimes all pass, 3 workers (3 fails), 5 workers sometimes one or two fail...

No issue with only one worker.
Could this come from a concurrent operation performed by two workers?

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

What does the worker means here?

If they just load and process midi files separately, I think they won't operate the same part of memory. Symusic doesn't use any global variables and static variables. Further, if they do operate the same part of memory, the bug would also appears on windows and linux.

And if the worker means a subprocess, which is used more common in python, there won't be any problem about "thread safe"πŸ€”

Also, I don't change the code of SoA from 0.3.3 to 0.3.4.

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

Do you have an idea of how this could be achieved? The issue is that it only occurs with xdist, even with one worker

Do you mean that this bug can't be reproduced in python's built-in multiprocessing and thread library?

I don't know the implementation of xdist.

from symusic.

Natooz avatar Natooz commented on August 25, 2024

I'm not sure of how xdist handles multiprocessing
Indeed this is a strange bug as it occurs only on macOS.

from symusic.

Yikai-Liao avatar Yikai-Liao commented on August 25, 2024

After several tests, I did find what caused the problem. In minimidi, I tried to copy 4 bytes at one time instead of one by one, for a slight performance improvement. Well, maybe this caused some problems about accessing invalid memory.

Now this problem have been fixed. You could also try it locally to make sure there are no other bugs left. @Natooz

from symusic.

Natooz avatar Natooz commented on August 25, 2024

No more errors! πŸ™Œ
Thank you for help on this!

from symusic.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.