voidxh / cavern


Object-based audio engine and codec pack with Dolby Atmos rendering, room correction, HRTF, one-click Unity audio takeover, and much more.

Home Page: http://cavern.sbence.hu

License: Other

C# 98.36% C++ 1.16% C 0.31% Batchfile 0.07% JavaScript 0.10%
audio sound cinema audio-engine spatial-audio room-correction surround-sound unity dolby-atmos

cavern's Introduction

Cavern

Cavern is a fully adaptive object-based audio rendering engine and (up)mixer without limitations for home, cinema, and stage use. Audio transcoding and self-calibration libraries built on the Cavern engine are also available. This repository also features a Unity plugin and a standalone converter called Cavernize.

NuGet packages: Cavern, Cavern.Format, Cavern.QuickEQ

Features

  • Unlimited objects and output channels without position restrictions
  • Audio transcoder library with a custom spatial format
  • Supported codecs:
    • E-AC-3 with Joint Object Coding (Dolby Digital Plus Atmos)
    • Limitless Audio Format
    • RIFF WAVE
    • Audio Definition Model Broadcast Wave Format
  • Supported containers: .ac3, .eac3, .ec3, .laf, .m4a, .m4v, .mka, .mkv, .mov, .mp4, .qt, .wav, .weba, .webm
  • Advanced self-calibration with a microphone
    • Results in close to perfectly flat frequency response, <0.01 dB and <0.01 ms of uniformity
    • Uniformity can be achieved without a calibration file
    • Supported software/hardware for EQ/filter set export:
      • PC: Equalizer APO, CamillaDSP
      • DSP: MiniDSP 2x4 Advanced, MiniDSP 2x4 HD, MiniDSP DDRC-88A
      • Processors: Emotiva, StormAudio
      • Amplifiers: Behringer NX series
      • Others: Audyssey MultEQ-X, Dirac Live, YPAO
  • Direction and distance virtualization for headphones
  • Real-time upconversion of regular surround sound mixes to 3D
  • Mix repositioning based on occupied seats
  • Seat movement generation
  • Ultra-low latency: even the upconverter can work with frames as small as one sample
  • Unity-like listener and source functionality
  • Fixes for Unity's Microphone API
    • Works in WebGL too

User documentation

User documentation can be found at the Cavern documentation webpage. Please go to this page for basic setup, in-depth QuickEQ tutorials, and command-line arguments.

The full list of changes for each version can be found in CHANGELOG.md.

How to build

Cavern

Cavern is a .NET Standard project with no dependencies. Open the Cavern.sln solution with Microsoft Visual Studio 2022 or later and all projects should build.

Sample projects

These examples use the Cavern library to show how it works. The solution containing all sample projects is found at CavernSamples/CavernSamples.sln. The same build instructions apply as to the base project.

Single-purpose code samples are found under docs/Code.

Cavern for Unity

Open the CavernUnity DLL.sln solution with Microsoft Visual Studio 2022. Remove the references from the CavernUnity DLL project to UnityEngine and UnityEditor. Add these files from your own Unity installation as references. They are found in Editor\Data\Managed under Unity's installation folder.

CavernAmp

This is a Code::Blocks project, set up for the MinGW compiler. No additional libraries are used. It is standard C++ code, so importing just the .cpp and .h files into any IDE should work.

Library quick start

Clip

Cavern uses audio clips to render the audio scene. A Clip is essentially a single audio file, which can be an effect or music. The easiest way to load one from a file is through the Cavern.Format library, which auto-detects the format:

Clip clip = AudioReader.ReadClip(pathToFile);

Refer to the scripting API for the complete description of this object.

Listener

The Listener is the center of the sound stage and renders the audio sources attached to it. It has Position and Rotation (Euler angles, in degrees) fields for spatial placement, and all sources are rendered relative to it. It is created like this:

Listener listener = new Listener() {
    SampleRate = 48000, // Match this with your output
    UpdateRate = 256 // Match this with your buffer size
};

The Listener sets itself up automatically from the user's saved configuration. The audio channels in use can be queried through Listener.Channels; this layout should be respected, and the output channel count should be set to its length. If this is not possible, the layout can be replaced with a standard one selected by channel count. For example, this line sets all listeners to 5.1:

Listener.ReplaceChannels(6);

Refer to the scripting API for the complete description of this object.

Source

A Source is an audio emitter placed in the sound space. It renders a Clip at its position relative to the Listener. Here's how to create a new source at a given position and attach it to the listener:

Source source = new Source() {
    Clip = clip,
    Position = new Vector3(10, 0, 0)
};
listener.AttachSource(source);

Sources that are no longer used should be detached from the listener using DetachSource. Refer to the scripting API for the complete description of this object.

Rendering

To generate the output of the audio space and get the audio samples which should be output to the system, use the following line:

float[] output = listener.Render();

The length of this array is listener.UpdateRate * Listener.Channels.Length.
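As a rough sketch of how successive frames could be gathered into one longer buffer, the helper below takes any function that returns one frame per call and concatenates the results. RenderLoop and Collect are hypothetical names, not part of Cavern, and the sketch assumes every call returns a frame of the same length.

```csharp
using System;

// Hypothetical helper, not part of Cavern: gathers consecutive frames into one buffer.
static class RenderLoop {
    // Calls render() the given number of times and concatenates the returned frames.
    // Assumes every frame has the same length (UpdateRate * channel count).
    public static float[] Collect(Func<float[]> render, int frames) {
        float[] result = Array.Empty<float>();
        for (int i = 0; i < frames; i++) {
            float[] frame = render();
            if (i == 0) {
                result = new float[frame.Length * frames];
            }
            // Copy the frame, in case the renderer reuses its output array.
            Array.Copy(frame, 0, result, i * frame.Length, frame.Length);
        }
        return result;
    }
}
```

With the listener created earlier, a call could look like RenderLoop.Collect(() => listener.Render(), frameCount), where frameCount is however many frames are needed.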

Working with audio files

The Cavern.Format library handles reading and writing audio files. For custom rendering or transcoding, they can be handled on a lower level than loading a Clip.

Reading

To open any supported audio file for reading, use the following static function:

AudioReader reader = AudioReader.Open(pathToFile); // pathToFile is a string

After opening a file, the following workflows are available.

Getting all samples

The Read() function of an AudioReader returns all samples from the file in an interlaced array with the size of reader.ChannelCount * reader.Length.
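To work with one channel at a time, the interlaced array can be split into per-channel arrays. The sketch below assumes that "interlaced" means sample-major order (the first sample of every channel, then the second sample of every channel, and so on). Interlacing and Split are hypothetical names, not part of Cavern.

```csharp
using System;

// Hypothetical helper, not part of Cavern.
static class Interlacing {
    // Splits an interlaced array into one array per channel.
    // Assumed layout: interlaced[sample * channelCount + channel].
    public static float[][] Split(float[] interlaced, int channelCount) {
        if (channelCount <= 0 || interlaced.Length % channelCount != 0) {
            throw new ArgumentException("The array length must be a multiple of the channel count.");
        }
        int length = interlaced.Length / channelCount;
        float[][] channels = new float[channelCount][];
        for (int channel = 0; channel < channelCount; channel++) {
            channels[channel] = new float[length];
        }
        for (int sample = 0; sample < length; sample++) {
            for (int channel = 0; channel < channelCount; channel++) {
                channels[channel][sample] = interlaced[sample * channelCount + channel];
            }
        }
        return channels;
    }
}
```

For a reader with its header loaded, this would be called as Interlacing.Split(reader.Read(), reader.ChannelCount).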

Getting the samples block by block

For real-time use or cases where progress should be displayed, an audio file can be read block by block. First, the header must be read by calling reader.ReadHeader(); this is not done automatically. Until the header is read, metadata like length or channel count is unavailable.

The ReadBlock(float[] samples, long from, long to) function of an AudioReader reads the next interlaced sample block to the specified array in the specified index range. Samples are counted for all channels. A version of ReadBlock for multichannel arrays (float[channel][sample]) is also available, but in this case, the index range is given for a single channel.
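Because the interlaced ReadBlock counts samples across all channels, each request covers (samples per channel) × (channel count) array entries, and the last block of a file is usually shorter. The sketch below only plans those request sizes. BlockPlanner and InterlacedBlockSizes are hypothetical names, and it is an assumption here that each returned value would be passed as the "to" argument with "from" set to 0.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Cavern: plans the size of each interlaced block request.
static class BlockPlanner {
    // Yields how many interlaced samples (counted for all channels) to request per block,
    // so that the final, possibly shorter block does not read past the end of the file.
    public static IEnumerable<long> InterlacedBlockSizes(long lengthPerChannel, int channelCount,
        long blockPerChannel) {
        if (channelCount <= 0 || blockPerChannel <= 0) {
            throw new ArgumentOutOfRangeException();
        }
        for (long done = 0; done < lengthPerChannel; done += blockPerChannel) {
            long samplesPerChannel = Math.Min(blockPerChannel, lengthPerChannel - done);
            yield return samplesPerChannel * channelCount;
        }
    }
}
```

A reading loop would then allocate a buffer of blockPerChannel * reader.ChannelCount floats once and call reader.ReadBlock(buffer, 0, size) for each planned size, updating a progress display between calls.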

Seeking in local files is supported by calling reader.Seek(long sample). The time in samples is relative to reader.Length, which means it is counted per channel, not across all channels.
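Since seek positions are per-channel sample indices, a position given in seconds has to be multiplied by the sample rate only, never by the channel count. A minimal sketch of that conversion follows. SeekMath and SecondsToSample are hypothetical names, the sample rate is passed in explicitly, and clamping to the range from 0 to length is a choice made for this example.

```csharp
using System;

// Hypothetical helper, not part of Cavern.
static class SeekMath {
    // Converts a time in seconds to a per-channel sample position,
    // clamped to the [0, length] range of the file.
    public static long SecondsToSample(double seconds, int sampleRate, long length) {
        long sample = (long)Math.Round(seconds * sampleRate);
        return Math.Max(0L, Math.Min(sample, length));
    }
}
```

The result would be passed on as reader.Seek(SeekMath.SecondsToSample(seconds, sampleRate, reader.Length)).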

Rendering in an environment

The reader.GetRenderer() function returns a Renderer instance that creates Sources for each channel or audio object. These can be retrieved from the Objects property of the renderer. When all of them are attached to a Listener, they will handle fetching the samples. Seeking the reader or the renderer works in this use case.

Writing

To create an audio file, use an AudioWriter:

AudioWriter writer = AudioWriter.Create(path, channelCount, length, sampleRate, bits); // string, int, long, int, BitDepth

This will create the AudioWriter for the appropriate file extension if it's supported.

Just like AudioReader, an AudioWriter can be used with a single call (Write(float[] samples) or Write(float[][] samples)) or block by block (WriteHeader() and WriteBlock(float[] samples, long from, long to)).

Unity quick start

Cavern works exactly the same way as Unity's audio engine, only the names are different. For AudioSource, there's AudioSource3D, and for AudioListener, there's AudioListener3D, and so on. You will find all Cavern components in the component browser, under audio, and they will automatically add all their Unity dependencies.

Development documents

Disclaimers

Code

Cavern is performance-critical software written in an environment that wasn't made for it. This means that clean code policies like DRY are broken many times when the code is faster that way, usually by orders of magnitude. Most changes should be benchmarked in the target environment, and the fastest code should be chosen, regardless of how bad it looks. This, however, must not result in inconsistent interfaces. Where it would, wrappers should be used with the fewest possible method calls.

Driver

While Cavern itself is open-source, the setup utility and most converter interfaces are not, because they are built on licences that do not allow it. However, almost all of their functionality comes from this plugin. Builds can be downloaded from the Cavern website.

Licence

By downloading, using, copying, modifying, or compiling the source code or a build, you are accepting the licence available here.

cavern's People

Contributors

threedeejay, voidxh

cavern's Issues

Feature Request : Import Dolby PHRTF to Headphone Virtualizer.

https://games.dolby.com/phrtf/games-instructions/

Dolby currently provides an app to generate PHRTF files via smartphones. This file is applicable to the binaural renderer of the Dolby Atmos Production Suite or the binaural renderer of Dolby Atmos for Headphones and start a personalized binaural rendering process.

If PHRTF files could be applied to Cavern Driver's or Cavernize's headphone virtualizer, it would be a more powerful program!

P.S. I can provide you with my Dolby PHRTF file if you'd like to check it out. Or you can generate a PHRTF file through the link I posted.

The sound of activating and listening to the headphone virtualizer sounds very strange.

Hello, I am an end user who loves Spatial Audio. I enjoy listening to Dolby Atmos or Sony 360 reality audio with my headphones.

I used the very interesting tool you made for the first time today. In your demonstration video, Core Universe's Dolby Atmos sound was fantastic to hear on headphones : https://youtu.be/sZCfmISQdHU

So I've been following you. I downloaded Core Universe(https://thedigitaltheater.com/dolby-trailers/), converted it to mkv using mkvtoolnix, and played it through Cavern.

However, it was very different from the sound I heard through your video. This is the sound I heard through Cavern. (I recorded the output using Audacity.)
https://drive.google.com/file/d/1bJ3tBC3CFs7hgcpd0ROdNI3x9Tpx5WXr/view?usp=sharing

So I was able to activate an option called Headphone Virtualizer, which sounds a lot more strange.
https://drive.google.com/file/d/1dwxo11LPJjwAGxcO0P8ezT_v1kAVf0Iy/view?usp=sharing

How did you set this video when you demonstrated it?

Implement AHT for AC-3

Streaming content started using AHT, which is not yet supported in Cavern. An example is Werewolf by Night.

E-AC-3 JOC file decode problem

I tried to convert a few E-AC-3 JOC files (downloaded from Tidal) to ADM BWF wav, and I do not get the expected results.

Test track 1: Ado's Usseewa from Kyougen (jpop?)
Expected: one channel contains drums, another contains vocals, another contains piano, another bass, etc, etc.
Actual: every channel contains a different mix of all instruments (slightly different ratio). Some contain only part of the instruments, but this changes (i.e. for 30 seconds it's backing vocals + drums, then for 30 seconds it's backing vocals + synths, etc.). The majority of channels are very quiet and, when turned up, sound heavily distorted. The spectrogram shows holes and an uneven lowpass.

Test track 2: Joe Hisaishi's Links from Minima Rhythm (orchestra, minimalism)
Expected: one channel for violin 1, one channel for violin 2, one channel for cello, etc.
Actual: every channel contains a different mix of all instruments (slightly different ratio). They all sound very similar. The majority of channels are very quiet and when turned up sounds heavily distorted. Spectrogram shows holes and uneven lowpass.

Support ASIO

Is it possible to add support for ASIO drivers?
This would make it possible to support the 7.1.4 format.
Thanks

Output filters

Allow filtering of individual output channels, like burning QuickEQ results into files for AVR playback. Should be optimized by caching output samples up to the FFT size which also helps with write performance.

Cavernize GUI compatibility issue on Windows 11 22H2.

Cavernize GUI is not compatible with Windows 11 22H2 version. The ADM BWF conversion process always behaves erratically and does not result in desirable interactions with respect to any software that inserts the Dolby Atmos Master.

kUbuntu 22.04 wine Crash

When I run it in kUbuntu 22.04, everything runs fine for about 6 to 15 seconds, then it crashes with the output -> Segmentation fault (core dumped)

Terminal EXE -> ~/Downloads/cavern64$ wine Cavern.exe
[UnityMemory] Configuration Parameters - Can be set up in boot.config
"memorysetup-bucket-allocator-granularity=16"
"memorysetup-bucket-allocator-bucket-count=8"
"memorysetup-bucket-allocator-block-size=4194304"
"memorysetup-bucket-allocator-block-count=1"
"memorysetup-main-allocator-block-size=16777216"
"memorysetup-thread-allocator-block-size=16777216"
"memorysetup-gfx-main-allocator-block-size=16777216"
"memorysetup-gfx-thread-allocator-block-size=16777216"
"memorysetup-cache-allocator-block-size=4194304"
"memorysetup-typetree-allocator-block-size=2097152"
"memorysetup-profiler-bucket-allocator-granularity=16"
"memorysetup-profiler-bucket-allocator-bucket-count=8"
"memorysetup-profiler-bucket-allocator-block-size=4194304"
"memorysetup-profiler-bucket-allocator-block-count=1"
"memorysetup-profiler-allocator-block-size=16777216"
"memorysetup-profiler-editor-allocator-block-size=1048576"
"memorysetup-temp-allocator-size-main=4194304"
"memorysetup-job-temp-allocator-block-size=2097152"
"memorysetup-job-temp-allocator-block-size-background=1048576"
"memorysetup-job-temp-allocator-reduction-small-platforms=262144"
"memorysetup-temp-allocator-size-background-worker=32768"
"memorysetup-temp-allocator-size-job-worker=262144"
"memorysetup-temp-allocator-size-preload-manager=262144"
"memorysetup-temp-allocator-size-nav-mesh-worker=65536"
"memorysetup-temp-allocator-size-audio-worker=65536"
"memorysetup-temp-allocator-size-cloud-worker=32768"
"memorysetup-temp-allocator-size-gfx=262144"
Mono path[0] = 'Z:/home/donno/Downloads/cavern64/Cavern_Data/Managed'
Mono config path = 'Z:/home/donno/Downloads/cavern64/MonoBleedingEdge/etc'
Initialize engine version: 2021.2.5f1 (4ec9a5e799f5)
[Subsystems] Discovering subsystems at path Z:/home/donno/Downloads/cavern64/Cavern_Data/UnitySubsystems
GfxDevice: creating device client; threaded=1; jobified=0
Direct3D:
Version: Direct3D 11.0 [level 11.1]
Renderer: Radeon (TM) RX 480 Graphics (ID=0x67df)
Vendor: ATI
VRAM: 4096 MB
Driver: 1.0
Begin MonoManager ReloadAssembly
- Completed reload, in 0.184 seconds
Initializing input.
Input initialized.
Touch support initialization failed: Call not implemented.
.
UnloadTime: 0.817600 ms
Setting up 4 worker threads for Enlighten.
Segmentation fault (core dumped)

Don't apply environment changes when process is running

As per report. Cavernize can break when the render target changes mid-conversion. Make it so that the change only applies to the next conversion. For CLI, display an error message when this command is called after the render start (output) command.

ADM BWF timing issues

The ramps are read incorrectly, transitions should happen at the end of the timeslots, not at their beginning.

[Feature request] Support for AC4 and TrueHD?

I can't find a player for AC4 or TrueHD*. I'm still in the process of requesting a trial to the Dolby Reference Player but I haven't heard back for a while (2+ weeks) so it's probably ignored.

Or will this project be for eac3 only?

*Full object support (?): I'm told that TrueHD stores the clustered objects separately (so lossy clustering->lossless (no JOC) ->lossless encode), unlike EAC3 where it's combined into 5.1/7.1 channels using JOC (so lossy clustering->lossy JOC->lossy encode), but mediainfo and ffmpeg all give me 7.1 channels for TrueHD, so I'm not sure what's happening. Maybe TrueHD is already fully supported by FFmpeg (and therefore most of the players).

Add support for DBMD

Requested by @ValZapod. Dolby Atmos Metadata, needed for DME imports. Should be a manual selection as "ADM BWF + DBMD". If all else fails, loading a donor file's DBMD will work.

AC-3 bed crosstalk

For Dolby's 7.1.4 web demo, the decoded AC-3 track has noise on all channels. Discussed in #56.

Incorrect Channel Mapping

When decoding some files like the Dolby Nature's Fury trailer, the output channels are incorrectly mapped.
e.g. the dialogue comes out of the left speaker.

On other files the output is correct i.e channels are in the correct order (L R C LFE SL SR...)

  • master branch

Cavern for Unity build under macOS

  1. Installed Unity.
  2. Replaced reference in .csproj with /Applications/Unity/Unity.app/Contents/Managed/UnityEngine.dll and /Applications/Unity/Unity.app/Contents/Managed/UnityEditor.dll
  3. Run

Result

/Users/jin/cavern1/CavernUnity DLL/AudioListener3D.cs(31,31): Error CS1501: No overload for method 'Normalize' takes 4 arguments (CS1501) (CavernUnity DLL)

I have no experience with C# or Unity so I have no idea if I'm doing anything wrong.

Problem with the reference levels of the Dolby tests

Hello,
Thanks for your application.

When decoding the Dolby reference file :
https://download.dolby.com/us/en/test-tones/dolby-test-tones_7_1_4.mp4
with Cavernize GUI to PCM integer with a 7.1.4 render target, the level of the 12 channels is not identical.
We obtain with an RMS measurement :
-21,34 dB; -21,35 dB; -21,35 dB; -16,47 dB; -23,91 dB; -23,92 dB; -24,03 dB; -24,04 dB; -18,25 dB; -18,25 dB; -18,26 dB; -18,25 dB

With the ADM conversion, we obtain identical reference levels, except for the LFE which is normal :
-16,21 dB; -21,35 dB; -21,35 dB; -21,35 dB; -21,26 dB; -21,26 dB;-21,02 dB;-21,03 dB;-21,28 dB;-21,27 dB;-21,28 dB;-21,32 dB

The difference in level is for the surround and top channels, the front channels have the right levels.

Thanks

Surround upmixing for Cavernize GUI

Add the Cavernize filter to Cavernize GUI as an option and remove the legacy Cavernize for FFmpeg source code as it's both deprecated and deceptive. Should be an option in the Rendering menu when content up to 7.1 is loaded without objects. Needs a complete rewrite of SurroundUpmixer - maybe include the heights created by the Cavernize filter.

Cavern Driver App won't stop rotating the room

Hi there,

I found about this project today and I find it very interesting.
I've been looking for some sort of PC based Dolby Atmos decoder as such was giving it a try.

However, while trying to explore the Cavern app, I found it very difficult to do so, as the room won't freaking stop rotating, making it almost impossible to place any object.

Would it be possible to add an option to stop the room rotating or is it a bug?

Thanks.

Support binaural for Cavernize's Render target

With an ADM BWF file, it is possible to output binaural audio through DaVinci Resolve.
For the EAC3-JOC demo content provided directly by Dolby, the ADM BWF file works successfully. On the other hand I've tried creating ADM BWF files for several streaming Atmos content that provides EAC3-JOC, but all of them don't work for DaVinci Resolve.
failed

So I am writing this post. Can I expect something like this?
feature requeset

and it would be nice if .eac3 was added to the supported extension.
feature requeset2

DD+ "Atmos" encoding

DD+ can contain channel-based data up to 9.1.6, should be just bitstream-copied with a modified channel order from an FFmpeg encode.

E-AC-3 JOC file decode produces distorted output

Version 1.5

Decoding E-AC-3 JOC music files (commercial and self encoded) to riff produces distorted results. I am comparing to the reference player.

7.1.4 layout to integer RIFF.

Also RIFF float settings produce 16bit results

always output 16bit wav

Is it possible to output 24bit wav or 32bit wav by the following modify?
writer = new RIFFWaveWriter(exportName, activeRenderTarget.Channels,
target.Length, listener.SampleRate, BitDepth.Int24);//wj2 Int16);
} else {
writer = AudioWriter.Create(exportName, activeRenderTarget.Channels.Length,
target.Length, listener.SampleRate, BitDepth.Int24);//wj2 Int16);
image

Buffered reading

Keep a larger buffer in memory for each stream - maybe with a descendant of BlockBuffer.

Streaming instead of converting?

Hello, thank you for your hard work in making such a brilliant application. Being able to decode/render Dolby Atmos video and output it to PCM channels using an existing 7.1 audio card is great, but I wonder whether it would be better to make this a codec or something similar and use it for streaming Atmos audio/video, without having to convert it or pass it through an AVR to get height channels. I'm very interested in making it happen, but I don't know how to do it.

AXML writing feedback

As per request. AXML takes time to write, and nothing indicates this on the GUI. Progress should be displayed.

macOS port?

I would like to do advanced playback and decoding of Dolby Atmos e-ac-3 on macOS, and it seems like this is the only software that supports it. However, it seems like this only runs on Windows and does not work under wine. Is porting this to macOS an easy job or should I plan on purchasing Parallels Desktop to run this inside a Windows VM?

X.X.2 front layout handling

5.1.2, 7.1.2, and 9.1.2 front render targets should be rendered such that only front objects are elevated. This can be achieved by rendering 4 overheads and mixing the rear channels of both sides together.

Cavern Driver 1.5 no sound when playing E-AC-3 JOC file

I just discovered this project and got curious, so I rummaged through my content and found an E-AC-3 JOC file to use as a quick test. Here's a 10-second sample of the file: E-AC-3 JOC test.zip

Cavern Driver 1.5 successfully opens the file and seems to show the objects as well as player controls:

image

But there is no sound, and all the levels read zero as shown in the screenshot.

The file plays fine in various other players (although to be clear, the other players I tried it with only support the base E-AC-3 layer).

Is this expected? I was under the impression that Cavern Driver could be used to play such files?

Here's what MediaInfo has to say about the file:

General
Complete name             : Y:\tmp\E-AC-3 JOC test.mkv
Format                    : Matroska
Format version            : Version 4
File size                 : 942 KiB
Duration                  : 10 s 27 ms
Overall bit rate mode     : Constant
Overall bit rate          : 770 kb/s
Writing application       : Lavf59.27.100
Writing library           : Lavf59.27.100
ErrorDetectionType        : Per level 1

Audio
ID                        : 1
Format                    : E-AC-3 JOC
Format/Info               : Enhanced AC-3 with Joint Object Coding
Commercial name           : Dolby Digital Plus with Dolby Atmos
Format settings           : Big
Codec ID                  : A_EAC3
Duration                  : 10 s 27 ms
Bit rate mode             : Constant
Bit rate                  : 768 kb/s
Channel(s)                : 6 channels
Channel layout            : L R C LFE Ls Rs
Sampling rate             : 48.0 kHz
Frame rate                : 31.250 FPS (1536 SPF)
Bit depth                 : 32 bits
Compression mode          : Lossy
Stream size               : 454 MiB
Language                  : English
Default                   : Yes
Forced                    : No
Complexity index          : 16
Number of dynamic objects : 15
Bed channel count         : 1 channel
Bed channel configuration : LFE
