gwbasic / soft_matrix
Upmixes stereo to surround sound
License: MIT License
Blocked on GWBasic/wave_stream#28
Use channelMask to write a proper 5.1 channel. (So that on playback the 6 channels are mapped correctly.)
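A minimal sketch of the mask value for 5.1, using the speaker-position bits defined for the dwChannelMask field of WAVE_FORMAT_EXTENSIBLE. The constant and function names here are illustrative only; how wave_stream will expose the field is not known yet.

```rust
// Speaker position bits from the WAVE_FORMAT_EXTENSIBLE dwChannelMask field.
const SPEAKER_FRONT_LEFT: u32 = 0x1;
const SPEAKER_FRONT_RIGHT: u32 = 0x2;
const SPEAKER_FRONT_CENTER: u32 = 0x4;
const SPEAKER_LOW_FREQUENCY: u32 = 0x8;
const SPEAKER_BACK_LEFT: u32 = 0x10;
const SPEAKER_BACK_RIGHT: u32 = 0x20;

/// Channel mask for 5.1 with back (rear) speakers.
/// Samples are interleaved in ascending bit order: FL, FR, FC, LFE, BL, BR.
/// (Illustrative helper, not part of wave_stream.)
pub fn channel_mask_5_1() -> u32 {
    SPEAKER_FRONT_LEFT
        | SPEAKER_FRONT_RIGHT
        | SPEAKER_FRONT_CENTER
        | SPEAKER_LOW_FREQUENCY
        | SPEAKER_BACK_LEFT
        | SPEAKER_BACK_RIGHT
}
```

The number of bits set in the mask should equal the channel count in the header, which is six here.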
One thing to try is using 24-bit waves instead of floating point.
A mix of usize and u32 is used to indicate location in the sound file and window size.
This is confusing.
Instead, use usize everywhere.
Currently blocked by: GWBasic/wave_stream#26
Make the whole thing multithreaded
Support SQ, RM, and other "old" matrix formats
Also include a matrix named "carlos" for Wendy Carlos's stuff that she never released in discrete surround.
In order to avoid unneeded copying of transform vectors:
Instead, use an Option and None when no transform is needed
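One way this could look, as a sketch only: the struct and function below are hypothetical stand-ins (plain f32 vectors instead of the real transform type), and the point is just that the vector is moved into the front slot and only cloned when a rear transform is really needed.

```rust
/// Hypothetical stand-in for one window's steered output.
pub struct SteeredWindow {
    pub front: Vec<f32>,
    /// `None` when no rear transform is needed, so nothing is allocated or copied.
    pub rear: Option<Vec<f32>>,
}

/// Moves `transform` into the front slot and clones it for the rear
/// only when `needs_rear` is true.
pub fn steer(transform: Vec<f32>, needs_rear: bool) -> SteeredWindow {
    let rear = if needs_rear {
        Some(transform.clone())
    } else {
        None
    };
    SteeredWindow {
        front: transform,
        rear,
    }
}
```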
One rear channel needs to be +(0.5 pi), the other needs to be -(0.5 pi).
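As a sketch of what a quarter-turn phase shift means for a single frequency bin: rotating a complex value by +(0.5 pi) is multiplication by i, and rotating by -(0.5 pi) is multiplication by -i. The Complex struct below is a hypothetical stand-in with .re and .im fields, not the type the upmixer actually uses, and which rear channel gets which sign is left open.

```rust
/// Hypothetical stand-in for a complex frequency bin.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex {
    pub re: f32,
    pub im: f32,
}

/// Rotates the bin by +(0.5 pi): (re + i*im) * i = -im + i*re.
pub fn shift_plus_quarter_turn(c: Complex) -> Complex {
    Complex { re: -c.im, im: c.re }
}

/// Rotates the bin by -(0.5 pi): (re + i*im) * (-i) = im - i*re.
pub fn shift_minus_quarter_turn(c: Complex) -> Complex {
    Complex { re: c.im, im: -c.re }
}
```

Applying one shift and then the other returns the original bin, which is a quick sanity check that the two rear channels end up pi apart.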
According to https://en.wikipedia.org/wiki/QS_Regular_Matrix:
Note that Dolby Surround tends to have the same phase shift for the back:
Users should be able to install via "brew install soft_matrix" (or similar)
Currently, there is one Fourier transform per sample. See if there is a way to have a single Fourier transform shared among a few samples.
Even if the quality is poor, this might be a good way to preview results quickly.
For SQ: https://en.wikipedia.org/wiki/Stereo_Quadraphonic
https://en.wikipedia.org/wiki/Matrix_decoder#SQ_matrix,_%22Stereo_Quadraphonic%22,_CBS_SQ_(4:2:4)
Rear left: Right is (3/4)pi ahead
Rear center: 135 degrees difference between channels, right is 135 degrees forward relative to left
Rear right: Right is (1/4)pi behind
https://www.desmos.com/calculator/zimzev6yla
l: left back in left total
e: right back in right total
k: left back in right total
r: right back in right total
Bottom functions:
Rear center in left total
Rear center in right total
Some .unwrap() calls were introduced in #3. Remove them.
https://doc.rust-lang.org/stable/std/thread/fn.available_parallelism.html states that available_parallelism changes depending on load.
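A minimal sketch of querying the value, assuming a fallback of one worker when the query fails. The function name is hypothetical; calling it again periodically would be one way to pick up changes.

```rust
use std::thread;

/// Returns the number of worker threads to use right now.
/// Falls back to 1 if the platform cannot report its parallelism.
/// (Hypothetical helper; the fallback value is an assumption.)
pub fn worker_count() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```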
Automatically adjust:
Optimize window size following instructions at: https://docs.rs/rustfft/latest/rustfft/#avx-performance-tips
Currently:
There should be a readme.md that has build and usage instructions
Branch IgnorePhaseInLFE has an attempt to ignore phase in the LFE. It's very staticky when normalized.
See: https://en.wikipedia.org/wiki/Dolby_Stereo#The_Dolby_Stereo_Matrix
In Dolby stereo, the center and rear channels are reduced by 0.707106781186548 when encoding.
In general, when mixing in stereo, items in the center channel need to be at amplitude 0.707 in order to be as loud as items isolated in a speaker at amplitude 1.0.
During upmixing, this creates a complication: If a tone plays at amplitude 1.0 in both speakers, it will be directed to the center speaker at amplitude 1.414213562373095. This will create clipping.
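The arithmetic, as a small sketch: undoing the 0.707 reduction means dividing by 0.707106781186548, so 1.0 in both speakers becomes about 1.414 in the center. The function names are illustrative and this is not the upmixer's actual steering code.

```rust
use std::f32::consts::FRAC_1_SQRT_2;

/// Amplitude sent to the center speaker for a tone that plays at
/// `stereo_amplitude` in both the left and right channels.
/// (Illustrative only: simply undoes the 0.707 encode gain.)
pub fn center_amplitude(stereo_amplitude: f32) -> f32 {
    stereo_amplitude / FRAC_1_SQRT_2
}

/// True when a sample would exceed full scale and clip.
pub fn clips(amplitude: f32) -> bool {
    amplitude.abs() > 1.0
}
```

A tone mixed at 0.707 in both speakers lands at about 1.0 in the center; only material louder than that in both speakers clips.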
There are two options to handle this:
In this ticket:
https://en.wikipedia.org/wiki/Ambisonic_UHJ_format#Encoding
I must admit that I don't fully understand how Ambisonics works. It will require more detailed study to understand how to decode it.
This summary makes a lot more sense: https://en.wikipedia.org/wiki/Matrix_decoder#Ambisonic_UHJ_kernel_(3:2:4_or_more)
Currently, the lowest supported frequency is 10 Hz. This should be configurable. A higher lowest supported frequency will allow for more quickly previewing an upmix.
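A sketch of why a higher floor is faster, assuming the window has to hold one full period of the lowest supported frequency (an assumption about how the window is sized; the function name is hypothetical):

```rust
/// Number of samples needed to hold one full period of the lowest
/// supported frequency. (Assumes window size = sample rate / frequency.)
pub fn window_size(sample_rate: u32, lowest_frequency_hz: f64) -> usize {
    (sample_rate as f64 / lowest_frequency_hz).ceil() as usize
}
```

Under that assumption, 44100 Hz audio needs a 4410-sample window at 10 Hz and a 2205-sample window at 20 Hz, so doubling the floor halves the size of every transform.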
Every second, write progress:
It appears that distributing via "cargo install" is the easiest way to distribute soft_matrix. (It will require that users install Rust, though.)
Instructions at:
When I document how to install soft_matrix, I'll need to link to GitHub issues with "Volunteers needed"
Horseshoe surround:
Super-stereo:
https://en.wikipedia.org/wiki/Dynaquad
In general, this requires:
Logging currently happens after writing a sample. This creates a large delay between starting and logging.
Instead, logging's percentage should be based on the number of forward and backward transforms performed:
The check for logging should happen whenever the counts of transforms are incremented.
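A single-threaded sketch of that counting, assuming every window gets exactly one forward and one backward transform. The struct and its methods are hypothetical; a multithreaded version would need atomics or a lock around the counters.

```rust
/// Tracks progress by counting transforms instead of written samples.
/// (Hypothetical helper; assumes one forward and one backward transform per window.)
pub struct TransformProgress {
    total_windows: usize,
    forward_done: usize,
    backward_done: usize,
}

impl TransformProgress {
    pub fn new(total_windows: usize) -> Self {
        TransformProgress {
            total_windows,
            forward_done: 0,
            backward_done: 0,
        }
    }

    /// Call after each forward transform; returns the updated fraction complete.
    pub fn record_forward(&mut self) -> f64 {
        self.forward_done += 1;
        self.fraction()
    }

    /// Call after each backward transform; returns the updated fraction complete.
    pub fn record_backward(&mut self) -> f64 {
        self.backward_done += 1;
        self.fraction()
    }

    /// Fraction complete in the range 0.0 to 1.0.
    pub fn fraction(&self) -> f64 {
        if self.total_windows == 0 {
            return 1.0;
        }
        (self.forward_done + self.backward_done) as f64 / (2 * self.total_windows) as f64
    }
}
```

Because both record methods return the new fraction, the caller can decide whether to log at the moment a count is incremented.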
Make reading stream based:
This will be useful to allow reading the wav from stdin; which will allow using sox or similar to read formats other than wav
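A sketch of what stream-based reading could start with: a function that accepts any std::io::Read and never seeks, so a file, stdin, or a pipe from sox all work the same way. It only checks the 12-byte RIFF/WAVE header; the function name is hypothetical and this is not wave_stream's API.

```rust
use std::io::{self, Read};

/// Reads the 12-byte RIFF header from any stream and returns the
/// declared RIFF chunk size. Never seeks, so it works on stdin.
/// (Hypothetical helper, not part of wave_stream.)
pub fn read_riff_header<R: Read>(mut reader: R) -> io::Result<u32> {
    let mut header = [0u8; 12];
    reader.read_exact(&mut header)?;

    // Bytes 0..4 must be "RIFF" and bytes 8..12 must be "WAVE".
    if &header[0..4] != b"RIFF" || &header[8..12] != b"WAVE" {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "not a RIFF/WAVE stream",
        ));
    }

    // Bytes 4..8 hold the little-endian chunk size.
    Ok(u32::from_le_bytes([header[4], header[5], header[6], header[7]]))
}
```

Reading from stdin would then be read_riff_header(std::io::stdin().lock()).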
Output is currently 4 channels; output should be 5.1
To start, introduce an options processor:
There are three phases in upmixing:
Each phase should be a separate file. (Keeping them all in upmixer.rs is unwieldy.) Each file should have a struct to maintain its state, and use closures as dependency injection to send along the next phase in processing.
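A rough sketch of that shape, with hypothetical names: each phase is a struct that keeps its own state (here just a counter), does its work, and hands the result to a closure that was injected at construction. The phase itself knows nothing about what comes next.

```rust
/// One processing phase. `process` does this phase's work and `next`
/// is the injected closure that receives the result.
/// (Hypothetical sketch; the real phases would carry real state.)
pub struct Phase<In, Out> {
    process: fn(In) -> Out,
    next: Box<dyn FnMut(Out)>,
    /// Example of per-phase state: how many items have passed through.
    pub processed: usize,
}

impl<In, Out> Phase<In, Out> {
    pub fn new(process: fn(In) -> Out, next: Box<dyn FnMut(Out)>) -> Self {
        Phase {
            process,
            next,
            processed: 0,
        }
    }

    /// Runs this phase on `input` and forwards the result to the next phase.
    pub fn push(&mut self, input: In) {
        let output = (self.process)(input);
        self.processed += 1;
        (self.next)(output);
    }
}
```

Chaining works by building the last phase first and moving it into the closure given to the phase before it, so each file only depends on a closure signature and not on the other files.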
Upmixing a file that is 2:05:56 long (just under 2 hours, 6 minutes) produced output that was truncated to 58:18 (fifty-eight minutes, eighteen seconds).
Not sure why.
Edit: The root cause is that wave_stream doesn't support files longer than RIFF's 32-bit size values: GWBasic/wave_stream#30
To fix this: I'm going to implement some file splitting logic. Wav files longer than 4GB have inconsistent support, but SoX can concatenate into other formats that support long files.
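A sketch of how a 32-bit size field can produce a truncated length like the one seen here. It assumes the output was 44.1 kHz with six 32-bit float channels and ignores header bytes; those parameters are assumptions, since the ticket does not state them. The function names are hypothetical.

```rust
/// Bytes of audio data written per second of output.
pub fn bytes_per_second(sample_rate: u64, channels: u64, bytes_per_sample: u64) -> u64 {
    sample_rate * channels * bytes_per_sample
}

/// Duration a player would see if the data size wrapped around a
/// 32-bit field. (Ignores header bytes; illustrative only.)
pub fn wrapped_duration_seconds(duration_seconds: u64, bytes_per_second: u64) -> u64 {
    let total_bytes = duration_seconds * bytes_per_second;
    let wrapped_bytes = total_bytes % (1u64 << 32);
    wrapped_bytes / bytes_per_second
}
```

With those assumed parameters, 2:05:56 (7556 seconds) at 1,058,400 bytes per second wraps to 3498 seconds, which is 58:18. The same bytes_per_second value gives a starting point for choosing where to split the output.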
When hardcoding the front-to-back to be the front, the pitch is always shifted up.
In this ticket, get a "no-op" with silent rear channels and no pitch shift.
Averaging for panning front to back only needs to be a single wavelength. Right now all averaging is the length of the entire window.
This will require varying length buffers, or an entirely different means of averaging
Observe that the point where the phase changes isn't centered in the transition between front and rear:
In the final phase of upmixing, when sound is steered, the vectors for the transforms are copied so that there are transforms for the front and back.
The copy occurs via a .to_vec();
See if there is a faster way to copy these vectors.
(An early lesson is that copying vectors to maintain memory integrity had a lot of overhead, and using RefCells that allowed swapping, and empty vectors where the transforms are ignored, is much faster.)
It appears that the computer will still sleep while soft_matrix is running. The computer should stay awake while soft_matrix is running.
To do this:
The current audio output is highly staticky. Fix this.
Front to back panning, in comments, explains that the phase is based on wavelength, but it doesn't actually use wavelength
Typing "soft_matrix" with no options should give an informative set of instructions
https://en.wikipedia.org/wiki/Stereo-4
In general this requires:
Users on Windows should be able to install via a standard package manager such as Chocolatey
I believe this is because of a poor understanding of phase; I suspect I misunderstand how .re is represented.
If the averaging phase takes too long, and there's no more input, threads will exit. This could leave a single thread performing all of the backwards transforms and writing.
To fix this:
The article on QS states that it's incorrectly referred to as RM: https://en.wikipedia.org/wiki/QS_Regular_Matrix
RM (Regular Matrix) was often used a synonym for the 'Sansui QS', 'Toshiba QM' and 'Nippon Columbia QX' matrix systems that were previously launched before the advent of the RM specification in 1973. Although none of the three previous matrices were compatible with the new RM specification, and with Toshiba and Nippon Columbia withdrawing their 'further RM incompatible' matrix systems from the market, Sansui's QS system was unofficially labelled by some record labels as RM, until the situation was clarified to those responsible for the mislabeling
At this point, I believe I've used wave_stream enough to open-source it.
It's getting awkward using a private reference to wave_stream.