voidxh / cavern Goto Github PK

Object-based audio engine and codec pack with Dolby Atmos rendering, room correction, HRTF, one-click Unity audio takeover, and much more.

Home Page: http://cavern.sbence.hu

License: Other

C# 98.36% C++ 1.16% C 0.31% Batchfile 0.07% JavaScript 0.10%

audio sound cinema audio-engine spatial-audio room-correction surround-sound unity dolby-atmos

cavern's Introduction

Cavern

Cavern is a fully adaptive object-based audio rendering engine and (up)mixer without limitations for home, cinema, and stage use. Audio transcoding and self-calibration libraries built on the Cavern engine are also available. This repository also features a Unity plugin and a standalone converter called Cavernize.

Features

Unlimited objects and output channels without position restrictions
Audio transcoder library with a custom spatial format
Supported codecs:
- E-AC-3 with Joint Object Coding (Dolby Digital Plus Atmos)
- Limitless Audio Format
- RIFF WAVE
- Audio Definition Model Broadcast Wave Format
- Supported containers: .ac3, .eac3, .ec3, .laf, .m4a, .m4v, .mka, .mkv, .mov, .mp4, .qt, .wav, .weba, .webm
Advanced self-calibration with a microphone
- Results in close to perfectly flat frequency response, <0.01 dB and <0.01 ms of uniformity
- Uniformity can be achieved without a calibration file
- Supported software/hardware for EQ/filter set export:
  - PC: Equalizer APO, CamillaDSP
  - DSP: MiniDSP 2x4 Advanced, MiniDSP 2x4 HD, MiniDSP DDRC-88A
  - Processors: Emotiva, StormAudio
  - Amplifiers: Behringer NX series
  - Others: Audyssey MultEQ-X, Dirac Live, YPAO
Direction and distance virtualization for headphones
Real-time upconversion of regular surround sound mixes to 3D
Mix repositioning based on occupied seats
Seat movement generation
Ultra low latency, even the upconverter can work from as low as one sample per frame
Unity-like listener and source functionality
Fixes for Unity's Microphone API
- Works in WebGL too

User documentation

User documentation can be found at the Cavern documentation webpage. Please go to this page for basic setup, in-depth QuickEQ tutorials, and command-line arguments.

The full list of changes for each version can be found in CHANGELOG.md.

How to build

Cavern

Cavern is a .NET Standard project with no dependencies. Open the Cavern.sln solution with Microsoft Visual Studio 2022 or later and all projects should build.

Sample projects

These examples use the Cavern library to show how it works. The solution containing all sample projects is found at CavernSamples/CavernSamples.sln. The same build instructions apply as to the base project.

Single-purpose sample codes are found under docs/Code.

Cavern for Unity

Open the CavernUnity DLL.sln solution with Microsoft Visual Studio 2022. Remove the references from the CavernUnity DLL project to UnityEngine and UnityEditor. Add these files from your own Unity installation as references. They are found in Editor\Data\Managed under Unity's installation folder.

CavernAmp

This is a Code::Blocks project, set up for the MinGW compiler. No additional libraries are used. Because this is standard C++ code, importing just the .cpp and .h files into any IDE should work.

Library quick start

Clip

Cavern uses audio clips to render the audio scene. A Clip is essentially a single audio file, which can be an effect or music. The easiest way to load one from a file is through the Cavern.Format library, which auto-detects the format:

Clip clip = AudioReader.ReadClip(pathToFile);

Refer to the scripting API for the complete description of this object.

Listener

The Listener is the center of the sound stage, which will render the audio sources attached to it. The listener has a Position and Rotation (Euler angles, degrees) field for spatial placement. All sources will be rendered relative to it. Here's its creation:

Listener listener = new Listener() {
    SampleRate = 48000, // Match this with your output
    UpdateRate = 256 // Match this with your buffer size
};

The Listener sets itself up automatically with the user's saved configuration. The audio channels in use can be queried through Listener.Channels. This layout should be respected, and the output audio channel count should be set to its length. If this is not possible, the layout can be set to a standard one by the number of channels. For example, this line sets up all listeners to 5.1:

Listener.ReplaceChannels(6);

Refer to the scripting API for the complete description of this object.

Source

A Source is an audio object placed in the sound space. It renders a Clip at its position relative to the Listener. Here's how to create a new source at a given position and attach it to the listener:

Source source = new Source() {
    Clip = clip,
    Position = new Vector3(10, 0, 0)
};
listener.AttachSource(source);

Sources that are no longer used should be detached from the listener using DetachSource. Refer to the scripting API for the complete description of this object.

Rendering

To generate the output of the audio space and get the audio samples which should be output to the system, use the following line:

float[] output = listener.Render();

The length of this array is listener.UpdateRate * Listener.Channels.Length.
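
As an illustration of that size relation, the sketch below splits one rendered frame into per-channel arrays. It assumes the output is interleaved sample by sample (channel 0, channel 1, ... for the first sample, then the second sample, and so on); the RenderOutput class and Deinterleave method are hypothetical helpers, not part of Cavern.

```csharp
using System;

// Hypothetical helper, not part of Cavern.
public static class RenderOutput {
    // Splits an interleaved frame into one array per channel.
    // Assumption: samples are stored as [s0c0, s0c1, ..., s1c0, s1c1, ...].
    public static float[][] Deinterleave(float[] interleaved, int channelCount) {
        if (channelCount <= 0) {
            throw new ArgumentOutOfRangeException(nameof(channelCount));
        }
        if (interleaved.Length % channelCount != 0) {
            throw new ArgumentException("Length must be a multiple of the channel count.");
        }

        int samplesPerChannel = interleaved.Length / channelCount;
        float[][] result = new float[channelCount][];
        for (int channel = 0; channel < channelCount; channel++) {
            result[channel] = new float[samplesPerChannel];
        }
        for (int sample = 0; sample < samplesPerChannel; sample++) {
            for (int channel = 0; channel < channelCount; channel++) {
                result[channel][sample] = interleaved[sample * channelCount + channel];
            }
        }
        return result;
    }
}
```

With a listener, this would be called as RenderOutput.Deinterleave(output, Listener.Channels.Length), giving arrays of listener.UpdateRate samples each.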

Working with audio files

The Cavern.Format library handles reading and writing audio files. For custom rendering or transcoding, they can be handled on a lower level than loading a Clip.

Reading

To open any supported audio file for reading, use the following static function:

AudioReader reader = AudioReader.Open(path); // path: string

After opening a file, the following workflows are available.

Getting all samples

The Read() function of an AudioReader returns all samples from the file in an interlaced array with the size of reader.ChannelCount * reader.Length.

Getting the samples block by block

For real-time use or cases where progress should be displayed, an audio file can be read block by block. First, the header must be read; this is not done automatically. Until the header is read, metadata like length or channel count is unavailable. Header reading is accomplished by calling reader.ReadHeader().

The ReadBlock(float[] samples, long from, long to) function of an AudioReader reads the next interlaced sample block to the specified array in the specified index range. Samples are counted for all channels. A version of ReadBlock for multichannel arrays (float[channel][sample]) is also available, but in this case, the index range is given for a single channel.
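
A minimal sketch of such a block-by-block loop follows. To stay self-contained, it takes a delegate with the same shape as ReadBlock(float[] samples, long from, long to) instead of a real AudioReader; the BlockReading class, the ReadInBlocks method, and the progress callback are hypothetical names for illustration.

```csharp
using System;

// Hypothetical helper, not part of Cavern.
public static class BlockReading {
    // Same shape as the interlaced ReadBlock overload described above.
    public delegate void ReadBlock(float[] samples, long from, long to);

    // Reads totalSamples samples (counted for all channels) through a reusable buffer.
    // The last block is shorter when totalSamples is not a multiple of the buffer size.
    // Returns the number of samples read.
    public static long ReadInBlocks(ReadBlock readBlock, long totalSamples,
        float[] buffer, Action<long> onProgress) {
        long done = 0;
        while (done < totalSamples) {
            long count = Math.Min(buffer.Length, totalSamples - done);
            readBlock(buffer, 0, count); // fills buffer[0..count)
            // Process buffer[0..count) here before the next call overwrites it.
            done += count;
            onProgress?.Invoke(done);
        }
        return done;
    }
}
```

With a real reader, the total would be reader.ChannelCount * reader.Length after calling reader.ReadHeader(), and the buffer length should be a multiple of the channel count so that blocks end on whole samples.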

Seeking in local files is supported by calling reader.Seek(long sample). The position in samples is relative to reader.Length, which means it is counted per single channel.
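
Because the seek position is per channel while interlaced arrays count samples for all channels, the two are easy to mix up. The sketch below shows the conversions, assuming a known sample rate; SeekMath and its methods are hypothetical helpers, not part of Cavern.

```csharp
using System;

// Hypothetical helper, not part of Cavern.
public static class SeekMath {
    // Per-channel sample position for a time in seconds, suitable for a per-channel seek.
    public static long SecondsToSample(double seconds, int sampleRate) =>
        (long)Math.Round(seconds * sampleRate);

    // Index of that position in an interlaced array holding all channels.
    public static long InterlacedOffset(long sample, int channelCount) =>
        sample * channelCount;
}
```

For example, 1.5 seconds into a 48 kHz file is per-channel sample 72000, which is where a 6-channel interlaced array has its element 432000.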

Rendering in an environment

The reader.GetRenderer() function returns a Renderer instance that creates Sources for each channel or audio object. These can be retrieved from the Objects property of the renderer. When all of them are attached to a Listener, they will handle fetching the samples. Seeking the reader or the renderer works in this use case.

Writing

To create an audio file, use an AudioWriter:

// path: string, channelCount: int, length: long, sampleRate: int, bits: BitDepth
AudioWriter writer = AudioWriter.Create(path, channelCount, length, sampleRate, bits);

This will create the AudioWriter for the appropriate file extension if it's supported.

Just like AudioReader, an AudioWriter can be used with a single call (Write(float[] samples) or Write(float[][] samples)) or block by block (WriteHeader() and WriteBlock(float[] samples, long from, long to)).

Unity quick start

Cavern works exactly the same way as Unity's audio engine, only the names are different. For AudioSource, there's AudioSource3D, and for AudioListener, there's AudioListener3D, and so on. You will find all Cavern components in the component browser, under audio, and they will automatically add all their Unity dependencies.

Development documents

Scripting API with descriptions of all public members for all public classes
Virtualizer repository which contains the raw IR measurements and detailed information about their use
Limitless Audio Format for storing Cavern mixes in a CPU-effective spatial format
Cavern DCP channel order compared to DCP standards

Disclaimers

Code

Cavern is performance-critical software written in an environment that wasn't made for it. This means that clean code policies like DRY are broken many times where the code is faster that way, usually by orders of magnitude. Most changes should be benchmarked in the target environment, and the fastest code should be chosen, regardless of how bad it looks. This, however, must not result in inconsistent interfaces. In that case, wrappers should be used with the fewest possible method calls.

Driver

While Cavern itself is open-source, the setup utility and most converter interfaces are not, because they are built on licences that do not allow it. However, almost all of their functionality is provided by this plugin. Builds can be downloaded from the Cavern website.

Licence

By downloading, using, copying, modifying, or compiling the source code or a build, you are accepting the licence available here.

cavern's Issues

Feature Request : Import Dolby PHRTF to Headphone Virtualizer.

https://games.dolby.com/phrtf/games-instructions/

Dolby currently provides an app to generate PHRTF files via smartphones. This file is applicable to the binaural renderer of the Dolby Atmos Production Suite or the binaural renderer of Dolby Atmos for Headphones and start a personalized binaural rendering process.

If PHRTF files could be applied to the headphone virtualizer of Cavern Driver or Cavernize, it would be an even more powerful program!

P.S. I can provide you with my Dolby PHRTF file if you'd like to check it out. Or you can generate a PHRTF file through the link I posted.

The sound of activating and listening to the headphone virtualizer sounds very strange.

Hello, I am an end user who loves Spatial Audio. I enjoy listening to Dolby Atmos or Sony 360 reality audio with my headphones.

I used the very interesting tool you made for the first time today. In your demonstration video, Core Universe's Dolby Atmos sound was fantastic to hear on headphones : https://youtu.be/sZCfmISQdHU

So I've been following you. I downloaded Core Universe(https://thedigitaltheater.com/dolby-trailers/), converted it to mkv using mkvtoolnix, and played it through Cavern.

However, it was very different from the sound I heard through your video. This is the sound I heard through Cavern. (I recorded the output using Audacity.)
https://drive.google.com/file/d/1bJ3tBC3CFs7hgcpd0ROdNI3x9Tpx5WXr/view?usp=sharing

So I was able to activate an option called Headphone Virtualizer, which sounds a lot more strange.
https://drive.google.com/file/d/1dwxo11LPJjwAGxcO0P8ezT_v1kAVf0Iy/view?usp=sharing

How did you set this video when you demonstrated it?

Implement AHT for AC-3

Streaming content started using AHT, which is not yet supported in Cavern. An example is Werewolf by Night.

Wrong track naming in Cavernize

The layout in channel names looks like: (L+R+, with a '+' instead of ')' at the end.

Remove unnecessary closing filters from peaking EQ generations

Sometimes the created filter makes the overall response worse, and this wrong filter is usually added multiple times. This needs a check if a filter is usable or not, and when it isn't, the PEQ generation should stop.
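
One possible shape for such a check is sketched below: each candidate filter is accepted only if it lowers the squared deviation from the target curve, and generation stops at the first one that does not. This is an illustration under assumptions, not Cavern's PEQ generator; the PeqStopCheck class, the error metric, and modelling a filter as a function from one response curve to another are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not Cavern's PEQ generator.
public static class PeqStopCheck {
    // Sum of squared deviations between a response and the target, both in dB per band.
    public static double Error(double[] response, double[] target) {
        if (response.Length != target.Length) {
            throw new ArgumentException("Curves must have the same number of bands.");
        }
        double sum = 0;
        for (int i = 0; i < response.Length; i++) {
            double diff = response[i] - target[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Applies candidate filters in order and returns how many were accepted.
    // Stops at the first filter that fails to reduce the error.
    public static int AcceptFilters(double[] response, double[] target,
        IEnumerable<Func<double[], double[]>> candidates) {
        int accepted = 0;
        double error = Error(response, target);
        foreach (Func<double[], double[]> apply in candidates) {
            double[] next = apply(response);
            double nextError = Error(next, target);
            if (nextError >= error) {
                break; // unusable filter: stop generating
            }
            response = next;
            error = nextError;
            accepted++;
        }
        return accepted;
    }
}
```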

E-AC-3 JOC file decode problem

I tried to convert a few E-AC-3 JOC files (downloaded from Tidal) to ADM BWF wav, and I do not get the expected results.

Test track 1: Ado's Usseewa from Kyougen (jpop?)
Expected: one channel contains drums, another contains vocals, another contains piano, another bass, etc, etc.
Actual: every channel contains a different mix of all instruments (in slightly different ratios). Some channels contain only part of the instruments, but which part changes over time (e.g. for 30 seconds it's backing vocals + drums, then for 30 seconds it's backing vocals + synths, etc.). The majority of channels are very quiet and sound heavily distorted when turned up. The spectrogram shows holes and an uneven lowpass.

Test track 2: Joe Hisaishi's Links from Minima Rhythm (orchestra, minimalism)
Expected: one channel for violin 1, one channel for violin 2, one channel for cello, etc.
Actual: every channel contains a different mix of all instruments (in slightly different ratios). They all sound very similar. The majority of channels are very quiet and sound heavily distorted when turned up. The spectrogram shows holes and an uneven lowpass.

Netflix open ADMs have invalid headers

WAV structure decoder reports a block tag of "\0\0\0\0" before the format header, breaking the decoding.

CavernUnity DLL build fix

It wasn't updated after recent API-breaking commits.

Support ASIO

Is it possible to add support for ASIO drivers?
This would make it possible to support the 7.1.4 format.
Thanks

Output filters

Allow filtering of individual output channels, like burning QuickEQ results into files for AVR playback. Should be optimized by caching output samples up to the FFT size which also helps with write performance.

Check for new versions on launch of Cavernize

As per request. Should only be done once per week to prevent DDoSing myself.
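
A weekly throttle like this can be a single comparison against a stored timestamp. The sketch below assumes the time of the last check is persisted somewhere in the application's settings; the UpdateCheck class and its members are hypothetical, and how Cavernize stores or performs the check is not shown here.

```csharp
using System;

// Hypothetical sketch of a once-per-week throttle.
public static class UpdateCheck {
    // Minimum time between two version checks.
    public static readonly TimeSpan Interval = TimeSpan.FromDays(7);

    // True when at least a week has passed since the last recorded check.
    public static bool IsDue(DateTime lastCheckUtc, DateTime nowUtc) =>
        nowUtc - lastCheckUtc >= Interval;
}
```

On launch, the application would call UpdateCheck.IsDue(savedTimestamp, DateTime.UtcNow) and only contact the server, then overwrite the saved timestamp, when it returns true.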

Cavernize GUI compatibility issue on Windows 11 22H2.

Cavernize GUI is not compatible with Windows 11 22H2 version. The ADM BWF conversion process always behaves erratically and does not result in desirable interactions with respect to any software that inserts the Dolby Atmos Master.

kUbuntu 22.04 wine Crash

When I run it on Kubuntu 22.04, everything runs fine for about 6 to 15 seconds, then it crashes with the output -> Segmentation fault (core dumped)

Terminal EXE -> ~/Downloads/cavern64$ wine Cavern.exe
[UnityMemory] Configuration Parameters - Can be set up in boot.config
"memorysetup-bucket-allocator-granularity=16"
"memorysetup-bucket-allocator-bucket-count=8"
"memorysetup-bucket-allocator-block-size=4194304"
"memorysetup-bucket-allocator-block-count=1"
"memorysetup-main-allocator-block-size=16777216"
"memorysetup-thread-allocator-block-size=16777216"
"memorysetup-gfx-main-allocator-block-size=16777216"
"memorysetup-gfx-thread-allocator-block-size=16777216"
"memorysetup-cache-allocator-block-size=4194304"
"memorysetup-typetree-allocator-block-size=2097152"
"memorysetup-profiler-bucket-allocator-granularity=16"
"memorysetup-profiler-bucket-allocator-bucket-count=8"
"memorysetup-profiler-bucket-allocator-block-size=4194304"
"memorysetup-profiler-bucket-allocator-block-count=1"
"memorysetup-profiler-allocator-block-size=16777216"
"memorysetup-profiler-editor-allocator-block-size=1048576"
"memorysetup-temp-allocator-size-main=4194304"
"memorysetup-job-temp-allocator-block-size=2097152"
"memorysetup-job-temp-allocator-block-size-background=1048576"
"memorysetup-job-temp-allocator-reduction-small-platforms=262144"
"memorysetup-temp-allocator-size-background-worker=32768"
"memorysetup-temp-allocator-size-job-worker=262144"
"memorysetup-temp-allocator-size-preload-manager=262144"
"memorysetup-temp-allocator-size-nav-mesh-worker=65536"
"memorysetup-temp-allocator-size-audio-worker=65536"
"memorysetup-temp-allocator-size-cloud-worker=32768"
"memorysetup-temp-allocator-size-gfx=262144"
Mono path[0] = 'Z:/home/donno/Downloads/cavern64/Cavern_Data/Managed'
Mono config path = 'Z:/home/donno/Downloads/cavern64/MonoBleedingEdge/etc'
Initialize engine version: 2021.2.5f1 (4ec9a5e799f5)
[Subsystems] Discovering subsystems at path Z:/home/donno/Downloads/cavern64/Cavern_Data/UnitySubsystems
GfxDevice: creating device client; threaded=1; jobified=0
Direct3D:
Version: Direct3D 11.0 [level 11.1]
Renderer: Radeon (TM) RX 480 Graphics (ID=0x67df)
Vendor: ATI
VRAM: 4096 MB
Driver: 1.0
Begin MonoManager ReloadAssembly

Completed reload, in 0.184 seconds
Initializing input.
Input initialized.
Touch support initialization failed: Call not implemented.
.
UnloadTime: 0.817600 ms
Setting up 4 worker threads for Enlighten.
Segmentation fault (core dumped)

Don't apply environment changes when process is running

As per report. Cavernize can break when the render target changes mid-conversion. Make it so that the change only applies to the next conversion. For the CLI, display an error message when this command is called after the render start (output) command.

Queued jobs

Add an option to queue conversions.

ADM BWF timing issues

The ramps are read incorrectly, transitions should happen at the end of the timeslots, not at their beginning.

Allow the layout set in the Driver in Cavernize GUI

Don't forget to update the CLI and docs too.

Cavernize 1.6 - Incorrect channel mapping.

Channel.mapping.test.mp4

REAPER - can not be imported

With the EAR Production Suite, the compact ADM file is not recognized, and neither is the ADM Atmos file.

[Feature request] Support for AC4 and TrueHD?

I can't find a player for AC4 or TrueHD*. I'm still in the process of requesting a trial to the Dolby Reference Player but I haven't heard back for a while (2+ weeks) so it's probably ignored.

Or will this project be for eac3 only?

*Full object support (?): I'm told that TrueHD stores the clustered objects separately (so lossy clustering->lossless (no JOC) ->lossless encode), unlike EAC3 where it's combined into 5.1/7.1 channels using JOC (so lossy clustering->lossy JOC->lossy encode), but mediainfo and ffmpeg all give me 7.1 channels for TrueHD, so I'm not sure what's happening. Maybe TrueHD is already fully supported by FFmpeg (and therefore most of the players).

Wrong 7.1.4 output in Cavernize

DD+ Atmos channel tests look like they use the asymmetric renderer. Add tests so this doesn't happen again.

Add support for DBMD

Requested by @ValZapod. Dolby Atmos Metadata, needed for DME imports. Should be a manual selection as "ADM BWF + DBMD". If all else fails, loading a donor file's DBMD will work.

AC-3 bed crosstalk

For Dolby's 7.1.4 web demo, the decoded AC-3 track has noise on all channels. Discussed in #56.

Only require FFmpeg for encodings that need it

= MKV targets.

Error when importing ADM BWF files into DaVinci Resolve.

I created an ADM BWF file using cavernize_gui and tried to import it into DaVinci Resolve, but I get an error:
"Master file contains an illegal object patched into bed channels 1-10. Import aborted."

Incorrect Channel Mapping

When decoding some files like the Dolby Nature's Fury trailer, the output channels are incorrectly mapped.
E.g. the dialogue comes out of the left speaker.

On other files the output is correct i.e channels are in the correct order (L R C LFE SL SR...)

master branch

Too much FFmpeg text in Cavernize CLI

Use -v error -stats for all FFmpeg calls in console mode.

Support dwChannelMask

This makes any render play in MPC-HC.
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ksmedia/ns-ksmedia-waveformatextensible
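
For reference, dwChannelMask is a bit field with one bit per speaker position, and the channels in the file must follow the order of the set bits. The sketch below builds masks for common layouts from the speaker bit values of WAVEFORMATEXTENSIBLE; the Speaker enum and ChannelMasks class are hypothetical names, and the choice of back plus side channels for the 7.1 bed is an assumption about the intended layout.

```csharp
using System;

// Speaker position bits as defined for WAVEFORMATEXTENSIBLE's dwChannelMask.
[Flags]
public enum Speaker : uint {
    FrontLeft = 0x1,
    FrontRight = 0x2,
    FrontCenter = 0x4,
    LowFrequency = 0x8,
    BackLeft = 0x10,
    BackRight = 0x20,
    SideLeft = 0x200,
    SideRight = 0x400,
    TopFrontLeft = 0x1000,
    TopFrontRight = 0x4000,
    TopBackLeft = 0x8000,
    TopBackRight = 0x20000
}

// Hypothetical helper, not part of Cavern.
public static class ChannelMasks {
    // 5.1 with back surrounds.
    public const Speaker Surround51 = Speaker.FrontLeft | Speaker.FrontRight |
        Speaker.FrontCenter | Speaker.LowFrequency | Speaker.BackLeft | Speaker.BackRight;

    // 7.1 with back and side surrounds.
    public const Speaker Surround71 = Surround51 | Speaker.SideLeft | Speaker.SideRight;

    // 7.1.4: the 7.1 bed plus four height channels.
    public const Speaker Layout714 = Surround71 | Speaker.TopFrontLeft |
        Speaker.TopFrontRight | Speaker.TopBackLeft | Speaker.TopBackRight;
}
```

Casting a layout to uint, for example (uint)ChannelMasks.Layout714, gives the value to write into the header.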

feature request: Please consider adding QuickEQ export for CamillaDSP

CamillaDSP supports many of the features of Equalizer APO, but is cross platform and open source. Being able to export to CamillaDSP would enable very powerful DSP on low power SBCs and in many situations where using a Windows PC is not an option.

Cavern for Unity build under macOS

Installed Unity.
Replaced reference in .csproj with /Applications/Unity/Unity.app/Contents/Managed/UnityEngine.dll and /Applications/Unity/Unity.app/Contents/Managed/UnityEditor.dll
Run

Result

/Users/jin/cavern1/CavernUnity DLL/AudioListener3D.cs(31,31): Error CS1501: No overload for method 'Normalize' takes 4 arguments (CS1501) (CavernUnity DLL)

I have no experience with C# or Unity so I have no idea if I'm doing anything wrong.

Propagate error messages to CLI

Currently errors are message boxes only, which might break large batches.

Problem with the reference levels of the Dolby tests

Hello,
Thanks for your application.

When decoding the Dolby reference file :
https://download.dolby.com/us/en/test-tones/dolby-test-tones_7_1_4.mp4
with Cavernize GUI to PCM integer with a 7.1.4 render target, the level of the 12 channels is not identical.
We obtain with an RMS measurement :
-21,34 dB; -21,35 dB; -21,35 dB; -16,47 dB; -23,91 dB; -23,92 dB; -24,03 dB; -24,04 dB; -18,25 dB; -18,25 dB; -18,26 dB; -18,25 dB

With the ADM conversion, we obtain identical reference levels, except for the LFE which is normal :
-16,21 dB; -21,35 dB; -21,35 dB; -21,35 dB; -21,26 dB; -21,26 dB;-21,02 dB;-21,03 dB;-21,28 dB;-21,27 dB;-21,28 dB;-21,32 dB

The difference in level is for the surround and top channels, the front channels have the right levels.

Thanks
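
For anyone reproducing these numbers, an RMS level in dB relative to full scale can be computed per channel as sketched below. This assumes float samples normalized to the -1 to 1 range and is not necessarily how the measurements above were taken; the Levels class and RmsDb method are hypothetical names.

```csharp
using System;

// Hypothetical measurement helper, not part of Cavern.
public static class Levels {
    // RMS level of a single channel in dB relative to full scale (1.0).
    // 10 * log10(mean square) equals 20 * log10(RMS).
    public static double RmsDb(float[] samples) {
        if (samples.Length == 0) {
            return double.NegativeInfinity;
        }
        double sumOfSquares = 0;
        foreach (float sample in samples) {
            sumOfSquares += (double)sample * sample;
        }
        return 10 * Math.Log10(sumOfSquares / samples.Length);
    }
}
```

The decoded file has to be split into single-channel arrays first, since running this on interlaced data would average all channels together.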

Surround upmixing for Cavernize GUI

Add the Cavernize filter to Cavernize GUI as an option and remove the legacy Cavernize for FFmpeg source code as it's both deprecated and deceptive. Should be an option in the Rendering menu when content up to 7.1 is loaded without objects. Needs a complete rewrite of SurroundUpmixer - maybe include the heights created by the Cavernize filter.

Cavern Driver App won't stop rotating the room

Hi there,

I found about this project today and I find it very interesting.
I've been looking for some sort of PC based Dolby Atmos decoder as such was giving it a try.

However, while trying to explore the Cavern app, I found it very difficult to do so, as the room won't freaking stop rotating, making it almost impossible to place any object.

Would it be possible to add an option to stop the room rotating or is it a bug?

Thanks.

Support binaural for Cavernize's Render target

With an ADM BWF file, it is possible to output binaural audio through DaVinci Resolve.
For the EAC3-JOC demo content provided directly by Dolby, the ADM BWF file works successfully. On the other hand, I've tried creating ADM BWF files from several pieces of streaming Atmos content delivered as EAC3-JOC, but none of them work in DaVinci Resolve.

So I am writing this post. Can I expect something like this?

and it would be nice if .eac3 was added to the supported extension.

DD+ "Atmos" encoding

DD+ can contain channel-based data up to 9.1.6, should be just bitstream-copied with a modified channel order from an FFmpeg encode.

E-AC-3 JOC file decode produces distorted output

Version 1.5

Decoding E-AC-3 JOC music files (commercial and self encoded) to riff produces distorted results. I am comparing to the reference player.

7.1.4 layout to integer RIFF.

Also RIFF float settings produce 16bit results

Don't touch the center channel with the height upconverter

Also update the description of -uc from the doc.

Is it possible to play Atmos files without converting

Currently I can play converted files with JRiver. The player uses ASIO on Windows to output more than 8 channels of audio.
Can Cavern be used as an audio decoder plugin in a player to play Atmos files directly?

Display remaining time from the current conversion in Cavernize

always output 16bit wav

Is it possible to output 24-bit or 32-bit WAV with the following modification?
    writer = new RIFFWaveWriter(exportName, activeRenderTarget.Channels,
        target.Length, listener.SampleRate, BitDepth.Int24);//wj2 Int16);
} else {
    writer = AudioWriter.Create(exportName, activeRenderTarget.Channels.Length,
        target.Length, listener.SampleRate, BitDepth.Int24);//wj2 Int16);

Buffered reading

Keep a larger buffer in memory for each stream - maybe with a descendant of BlockBuffer.

Streaming instead of converting?

Hello, thank you for your hard work on such a brilliant application. Being able to decode/render Dolby Atmos video and output it to PCM channels using an existing 7.1 audio card is great, but I wonder whether it would be better to make this a codec or something similar, so it could be used for streaming Atmos audio/video without having to convert it or pass it through an AVR to get height channels. I'm very interested in making it happen, but I don't know how to do it.

Enumerate unknown tracks in MKVs

Every track where the extra data is audio-related, should be enumerated. Will help correct FFmpeg indexing.

DRC in E-AC-3 is always applied when decoding

As a result, the output has less volume than the source.

Decoding should have an option like FFmpeg's -drc_scale 0 (not available here).

AXML writing feedback

As per request. AXML takes time to write, and nothing indicates this on the GUI. Progress should be displayed.

macOS port?

I would like to do advanced playback and decoding of Dolby Atmos e-ac-3 on macOS, and it seems like this is the only software that supports it. However, it seems like this only runs on Windows and does not work under wine. Is porting this to macOS an easy job or should I plan on purchasing Parallels Desktop to run this inside a Windows VM?

But there is no sound, and all the levels read zero as shown in the screenshot.

The file plays fine in various other players (although to be clear, the other players I tried it with only support the base E-AC-3 layer).

Is this expected? I was under the impression that Cavern Driver could be used to play such files?

Here's what MediaInfo has to say about the file:

General
Complete name             : Y:\tmp\E-AC-3 JOC test.mkv
Format                    : Matroska
Format version            : Version 4
File size                 : 942 KiB
Duration                  : 10 s 27 ms
Overall bit rate mode     : Constant
Overall bit rate          : 770 kb/s
Writing application       : Lavf59.27.100
Writing library           : Lavf59.27.100
ErrorDetectionType        : Per level 1

Audio
ID                        : 1
Format                    : E-AC-3 JOC
Format/Info               : Enhanced AC-3 with Joint Object Coding
Commercial name           : Dolby Digital Plus with Dolby Atmos
Format settings           : Big
Codec ID                  : A_EAC3
Duration                  : 10 s 27 ms
Bit rate mode             : Constant
Bit rate                  : 768 kb/s
Channel(s)                : 6 channels
Channel layout            : L R C LFE Ls Rs
Sampling rate             : 48.0 kHz
Frame rate                : 31.250 FPS (1536 SPF)
Bit depth                 : 32 bits
Compression mode          : Lossy
Stream size               : 454 MiB
Language                  : English
Default                   : Yes
Forced                    : No
Complexity index          : 16
Number of dynamic objects : 15
Bed channel count         : 1 channel
Bed channel configuration : LFE
