
audio's People

Contributors: be-ing, henquist, udoprog

audio's Issues

Abstracting over sample format?

I frequently get questions about how to use my resampling library to resample audio data that is in integer format, often i16, but it varies. My resampler (like most other non-trivial signal processing) must work on float samples. So I need an abstraction layer that lets me read and write float samples to and from input and output buffers, no matter what layout and sample format the actual data buffer is using. Using audio buffers for input and output solves this for the layout, but it doesn't help with the sample format. Are there plans to support this in audio? Is it even possible with the current design?

I have experimented with implementing a solution for this, which ended up as this: https://github.com/HEnquist/audioadapter-rs
This solves the problem, but with the obvious downside that it's a completely different solution than the audio buffers. It does however implement simple wrappers for audio buffers, so a project using audioadapter-rs would be able to also use audio buffers.

Initially I was planning on using audio buffers, but since it only solves half my problem I'm not sure about that any more. So what are the plans? I also see that there hasn't been much activity here lately which is a little worrying.
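For illustration, the kind of abstraction audioadapter-rs provides can be sketched as a small trait (all names here are hypothetical, taken from neither crate):

```rust
/// Hypothetical adapter trait: expose any sample format as f32 so a
/// float-only resampler can read it regardless of the underlying layout.
trait ReadSamples {
    fn channels(&self) -> usize;
    fn frames(&self) -> usize;
    /// Read the sample at (channel, frame), converted to float.
    fn read_f32(&self, channel: usize, frame: usize) -> f32;
}

/// Wrapper for an interleaved i16 slice.
struct InterleavedI16<'a> {
    data: &'a [i16],
    channels: usize,
}

impl ReadSamples for InterleavedI16<'_> {
    fn channels(&self) -> usize {
        self.channels
    }

    fn frames(&self) -> usize {
        self.data.len() / self.channels
    }

    fn read_f32(&self, channel: usize, frame: usize) -> f32 {
        // Divide by -(i16::MIN) so i16::MIN maps exactly to -1.0.
        self.data[frame * self.channels + channel] as f32 / 32768.0
    }
}

fn main() {
    let raw: &[i16] = &[0, 16384, -32768, 32767];
    let buf = InterleavedI16 { data: raw, channels: 2 };
    assert_eq!(buf.frames(), 2);
    assert_eq!(buf.read_f32(0, 1), -1.0); // frame 1, left channel
    println!("ok");
}
```

DSP code written against such a trait never sees the storage format; adding a new format only means adding a new wrapper.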

Add copy_within method on ChannelMut?

Slices have a copy_within method similar to C's memmove. It could be nice to have this on ChannelMut. For LinearChannel, it's already easy enough to call .as_mut() to get a slice and use slice::copy_within on that. For InterleavedChannel, it would be trickier to implement.
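For illustration, a memmove-style copy within one channel of interleaved data could look roughly like this (a sketch, not the crate's API; indices are in frames, and the stride is the channel count):

```rust
/// Sketch: copy the frames in `src` to `dest` within one channel of an
/// interleaved buffer, handling overlap like memmove / slice::copy_within.
fn channel_copy_within(
    data: &mut [f32],
    channels: usize,
    channel: usize,
    src: std::ops::Range<usize>,
    dest: usize,
) {
    let len = src.end - src.start;
    let idx = |frame: usize| frame * channels + channel;
    if dest <= src.start {
        // Copy forward when the destination precedes the source.
        for i in 0..len {
            data[idx(dest + i)] = data[idx(src.start + i)];
        }
    } else {
        // Copy backward to avoid clobbering unread source frames.
        for i in (0..len).rev() {
            data[idx(dest + i)] = data[idx(src.start + i)];
        }
    }
}

fn main() {
    // Stereo, 3 frames: [L0, R0, L1, R1, L2, R2]
    let mut data = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0];
    channel_copy_within(&mut data, 2, 0, 0..2, 1); // left channel: [1, 1, 2]
    assert_eq!(data, [1.0, 10.0, 1.0, 20.0, 2.0, 30.0]);
    println!("ok");
}
```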

0.2.0 release?

Rust 1.65 was released 3 November 2022 with GATs stabilized, but this crate is still at 0.2.0-alpha.3. Is there anything left to do before releasing 0.2.0?

example does not build

audio on main via 🦀 v1.58.0
❯ cargo build
    Updating crates.io index
    Updating git repository `https://github.com/udoprog/minimp3-rs`
    Updating git submodule `https://github.com/lieff/minimp3.git`
    Updating git repository `https://github.com/udoprog/rubato`
error: no matching package named `audio` found
location searched: https://github.com/udoprog/rubato?branch=next#4ca6e6f9
required by package `rubato v0.9.0 (https://github.com/udoprog/rubato?branch=next#4ca6e6f9)`
    ... which satisfies git dependency `rubato` of package `examples v0.0.0 (/home/be/sw/audio/examples)`

The examples use your fork of rubato, which uses relative filesystem paths for audio. While that may work on your machine, it does not work on mine. I recommend using [patch.crates-io] in ~/.cargo/config.toml to work on libraries locally, so you don't accidentally commit a Cargo.toml with relative paths specific to your machine.
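For reference, such a local override can live in ~/.cargo/config.toml instead of the repository's Cargo.toml (the path below is a placeholder):

```toml
# ~/.cargo/config.toml — machine-local, never committed to the repo.
[patch.crates-io]
rubato = { path = "/home/be/src/rubato" }  # placeholder path
```

Cargo then resolves `rubato` from the local checkout for every workspace on that machine, while the committed Cargo.toml keeps its normal crates.io dependency.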

I see you have made some contributions to rubato, but the branch used by the example contains a pretty big change that is not upstream, nor do I see a PR for it. Were you intending to submit that change upstream?

consider renaming Sequential and Dynamic

I haven't seen these terms used to refer to audio (or image/video) buffers in other contexts. What "Sequential" does is more commonly called "planar", which I think would be a good name for the struct.

"Dynamic" kinda implies that the others are "static", but they all allow resizing both the number of frames and channels. "Disjoint" would describe how "Dynamic" organizes memory, but that feels like an awkward name...

What to do about ExactSizeBuf and wrap::Dynamic?

Currently, audio::wrap::Dynamic does not implement the ExactSizeBuf trait. I think it is the only buffer struct in audio that does not implement this trait.

Rubato needs the number of frames in the buffer to verify that the user has passed in buffers of a sufficient size. Combined, these create a problem for integrating audio into Rubato due to HEnquist/rubato#52 (comment):

It must be reasonably easy to upgrade an existing project using non-audio Rubato. This and the previous point mean that it should be possible to somehow still use vecs of vecs, perhaps by wrapping them up as audio buffers. I don't want to force people to copy data back and forth between buffers, or to migrate the entire project to use audio buffers.

Currently, Rubato uses Vec<Vec<T>>s for its input and output buffers, so having an easy way to wrap those and pass them into a new version of Rubato using the Buf and BufMut traits is required.

I think it could make sense to implement ExactSizeBuf for wrap::Dynamic by returning the number of frames in the shortest channel. To start with, it would be odd to have any kind of multichannel buffer where the channels are different lengths. I'm having trouble thinking of a case where leaving off some frames of a bigger channel would be a problem. I think most likely, some data at the end of bigger channels would be skipped, but that's probably less bad than going out of bounds if the returned frame count was too large.

However, if ExactSizeBuf ends up implemented for every Buf, that raises the question: why have a separate trait for this instead of including a frames(&self) -> usize function in the Buf trait? Is the frames_hint method really needed?

In addition to this issue with Rubato, implementing a Range wrapper (#28) for Bufs gets awkward without a frames(&self) -> usize method on Buf.
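The shortest-channel idea could be sketched like this (illustrative, not the crate's actual implementation):

```rust
/// Frame count for a Vec<Vec<T>>-style wrapper: the length of the shortest
/// channel, so indexing any channel up to frames() stays in bounds.
fn frames<T>(channels: &[Vec<T>]) -> usize {
    channels.iter().map(|c| c.len()).min().unwrap_or(0)
}

fn main() {
    let buf = vec![vec![0.0f32; 512], vec![0.0f32; 480]];
    assert_eq!(frames(&buf), 480);
    assert_eq!(frames::<f32>(&[]), 0);
    println!("ok");
}
```

Reporting the minimum trades dropping trailing samples of longer channels for a guarantee against out-of-bounds access, as argued above.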

SmallVec-style optimization

It would be neat if the audio buffers implemented an optimization like SmallVec, where buffers with frame and channel counts no larger than some const generic limits are stored on the stack, and larger buffers are stored on the heap. Often the likely maximum number of frames and channels is known at compile time, but hardcoding such limits would make the program inflexible. For example, a buffer in a realtime application taking data from arbitrary audio files will likely have at most 2 channels and at most 512 or 1024 frames.
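A rough sketch of the idea (hypothetical types, a single const capacity for brevity):

```rust
/// Sketch, in the spirit of SmallVec: store up to N samples inline on the
/// stack, and spill to the heap only when that capacity is exceeded.
enum Samples<const N: usize> {
    Inline { data: [f32; N], len: usize },
    Heap(Vec<f32>),
}

impl<const N: usize> Samples<N> {
    fn new() -> Self {
        Samples::Inline { data: [0.0; N], len: 0 }
    }

    fn push(&mut self, v: f32) {
        match self {
            Samples::Inline { data, len } if *len < N => {
                data[*len] = v;
                *len += 1;
            }
            Samples::Inline { data, len } => {
                // Inline capacity exceeded: move the samples to the heap.
                let mut vec = data[..*len].to_vec();
                vec.push(v);
                *self = Samples::Heap(vec);
            }
            Samples::Heap(vec) => vec.push(v),
        }
    }

    fn len(&self) -> usize {
        match self {
            Samples::Inline { len, .. } => *len,
            Samples::Heap(vec) => vec.len(),
        }
    }
}

fn main() {
    let mut s: Samples<2> = Samples::new();
    s.push(0.1);
    s.push(0.2);
    s.push(0.3); // third push spills to the heap
    assert_eq!(s.len(), 3);
    assert!(matches!(s, Samples::Heap(_)));
    println!("ok");
}
```

A real buffer would presumably need separate limits for channels and frames, which makes the const generic bookkeeping more involved than this single-capacity sketch.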

lib.rs documentation links to docs.rs

... instead of using intra-doc links generated by rustdoc. This makes it confusing to work on the docs locally because the locally built docs jump to docs.rs instead of the local build.

Publish a new version?

The latest release on crates.io is quite old. Could we have new releases of audio and audio-core?

Unify documentation under a few common concepts

This library uses the following concepts:

  • Buffer refers to an object holding audio data.
  • A sample is a single value within a buffer, from a single channel.
  • Channel refers to a single channel inside of an audio buffer.
  • A frame refers to the group of samples, one per channel, at a given offset inside of an audio buffer.

The primary abstraction is the buffer, which contains 0 or more channels. Each buffer can also be viewed as a sequence of frames, where each frame contains the corresponding sample from every channel in that buffer.
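A concrete illustration of these terms, using a stereo buffer of three frames stored interleaved:

```rust
fn main() {
    // Two channels, three frames. Each frame holds one sample per channel.
    //
    //             frame 0  frame 1  frame 2
    // channel 0:    0.1      0.3      0.5
    // channel 1:    0.2      0.4      0.6
    //
    // Interleaved in memory, frame by frame:
    let interleaved = [0.1f32, 0.2, 0.3, 0.4, 0.5, 0.6];
    let channels = 2;
    let frames = interleaved.len() / channels;
    assert_eq!(frames, 3);
    // The sample for channel 1 of frame 2:
    assert_eq!(interleaved[2 * channels + 1], 0.6);
    println!("ok");
}
```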

Conversions cause distortion

The conversions currently scale with a different constant for positive and negative sample values.
https://github.com/udoprog/audio/blob/main/audio-core/src/translate.rs#L37
This creates distortion. Not very much, but it could and should be distortion-free.

Cpal makes the same mistake; see my old comment here:
RustAudio/cpal#512 (comment)
In short, it's better to just divide by -MIN. For i8 it then works like this:

  • i8 to float: value as f32 / 128. Resulting range: -1.0 to +0.992.
  • float to i8: value*128, if below -128 or above +127 clamp to those values and cast to i8.
    This way, a 24-bit int can be losslessly converted to 32-bit float and back again. The int->float conversion will not reach +1.0, just very close. This may seem wrong but it's actually more correct.
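The symmetric scheme described above can be sketched for i8 like this (a sketch of the proposal, not the crate's current code):

```rust
/// Convert with a single scale factor of -(i8::MIN) = 128 in both
/// directions, so the positive and negative halves use the same constant.
fn i8_to_f32(v: i8) -> f32 {
    v as f32 / 128.0
}

fn f32_to_i8(v: f32) -> i8 {
    // Scale, clamp to the representable range, then cast.
    (v * 128.0).clamp(-128.0, 127.0) as i8
}

fn main() {
    assert_eq!(i8_to_f32(-128), -1.0);
    assert_eq!(i8_to_f32(127), 0.9921875); // never quite reaches +1.0
    assert_eq!(f32_to_i8(1.0), 127);       // out-of-range input clamps
    // The round trip is lossless for every i8 value.
    for v in i8::MIN..=i8::MAX {
        assert_eq!(f32_to_i8(i8_to_f32(v)), v);
    }
    println!("ok");
}
```

Because 128 is a power of two, both conversions are exact in f32, which is what makes the round trip lossless.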

Topology struct?

I'm wondering if it could be helpful to add a struct like

struct Topology {
    pub channels: usize,
    pub frames: usize,
}

to pass to Buf::with_topology and ResizableBuf::resize_topology. When writing these functions, it is easy to mix up the channels and frames arguments because both are usizes. I have to refer to the documentation to ensure that I'm calling them correctly. Currently, calls to these look like:

let buffer = Sequential::with_topology(2, 256);

There would be less room for confusion like this:

let buffer = Sequential::with_topology(Topology { channels: 2, frames: 256 });

or

let topology = Topology { channels: 2, frames: 256 };
let buffer = Sequential::with_topology(topology);

An alternative solution could be adding newtypes for frames and/or channels.
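The newtype alternative could be sketched like this (types and function are illustrative): distinct types make swapped arguments a compile error rather than a silent bug.

```rust
/// Hypothetical newtypes distinguishing the two usize parameters.
struct Channels(usize);
struct Frames(usize);

/// Stand-in for a constructor like with_topology.
fn with_topology(channels: Channels, frames: Frames) -> (usize, usize) {
    (channels.0, frames.0)
}

fn main() {
    let (ch, fr) = with_topology(Channels(2), Frames(256));
    assert_eq!((ch, fr), (2, 256));
    // with_topology(Frames(256), Channels(2)); // would not compile
    println!("ok");
}
```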

Frame access and manipulation

This crate looks nice, but it seems to be missing a way to iterate over the frames of a buffer, only the samples within one channel at a time. Sometimes it is necessary to iterate over frames to do operations relating the values of different channels, for example panning. Another use case is zipping an iterator over frames with the RampingValueIterator that I wrote for smoothly interpolating parameter changes over the frames of a buffer.

It would be possible to create an iterator in this crate to do this, but I think it would be better to use the dasp::signal::Signal iterator trait which already exists. Curiously, dasp doesn't have structs for holding audio data. It would be nice to be able to use the dasp Signal trait on the buffers in this crate which would also make it seamless to use the variety of other tools in dasp on the buffers. Perhaps it would make sense to merge this crate into a module of dasp as well. As a first step, I think it would be good to start by replacing this crate's Sample trait with dasp::Sample.

What are your thoughts on these proposals? My overarching goal is to have a common set of audio buffer structs and iterators used throughout the Rust audio ecosystem. Currently everyone is writing their own solution, which makes it cumbersome to pass audio data between crates and can require unnecessary copying.
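For illustration, per-frame processing over interleaved stereo data can be sketched with chunks_exact_mut (a simple per-channel gain here; not any crate's API):

```rust
/// Apply per-channel gains frame by frame: a minimal example of an
/// operation that relates the channels within each frame, such as panning.
fn apply_gains(interleaved: &mut [f32], gains: &[f32]) {
    for frame in interleaved.chunks_exact_mut(gains.len()) {
        for (sample, gain) in frame.iter_mut().zip(gains) {
            *sample *= gain;
        }
    }
}

fn main() {
    let mut data = [1.0, 1.0, 1.0, 1.0]; // stereo, 2 frames
    apply_gains(&mut data, &[0.5, 1.0]); // attenuate the left channel
    assert_eq!(data, [0.5, 1.0, 0.5, 1.0]);
    println!("ok");
}
```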

Are Buf::skip and Buf::tail redundant?

From reading the examples in the documentation, they seem to do the same thing. The documentation text for both functions is also identical: "Construct a new buffer where n frames are skipped."

no_std support

It would be nice to make this library compatible with no_std environments. I think it would just require guarding some modules behind a feature flag and replacing references to std with core and alloc. I don't think there's much of a need for dedicated buffer structs for this use case. no_std users could use audio::wrap::sequential/interleaved on plain arrays to make use of the Buf/BufMut traits.
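The usual shape of that change, sketched (the feature name and gating are assumptions, not the crate's current code):

```rust
// In lib.rs: opt out of std when the (hypothetical) "std" feature is off.
#![cfg_attr(not(feature = "std"), no_std)]

// Heap-backed buffers would then come from alloc instead of std.
#[cfg(not(feature = "std"))]
extern crate alloc;
```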

Implement `Buf` and `BufMut` directly on wrapped dynamic types instead of providing wrappers?

Two libraries I'm using quite frequently are bstr and bytes (whose naming convention is borrowed here), and one of their decisions is to implement an extension trait directly on types instead of solely requiring a wrapper like the wrap module.

This is an option worth exploring for the audio crate, and maybe one to seriously consider. It would preempt questions such as the need to export the wrap module from crates, and making it easier to use primitive Rust types directly means users might not even need to depend on the audio crate to use a library, unless they have a specific reason like using audio crate-specific buffers.

CC: @Be-ing
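The bstr/bytes-style approach can be sketched like this (trait name and method are illustrative, not the actual audio API):

```rust
/// Hypothetical extension trait implemented directly on plain Rust types,
/// so users need no wrapper struct to treat them as audio buffers.
trait BufExt {
    fn channels(&self) -> usize;
}

/// One Vec per channel: the outer length is the channel count.
impl<T> BufExt for Vec<Vec<T>> {
    fn channels(&self) -> usize {
        self.len()
    }
}

fn main() {
    let buf: Vec<Vec<f32>> = vec![vec![0.0; 4]; 2];
    assert_eq!(buf.channels(), 2);
    println!("ok");
}
```

With this shape, a library can accept `impl BufExt` while its users keep working with ordinary vectors.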
