Metadata about dca HOT 30 CLOSED

bwmarrin commented on May 28, 2024
Metadata

from dca.

Comments (30)

bwmarrin commented on May 28, 2024

Great ideas. We probably want to add fields for the opus encoder options used too; the most important is the frame size, since this affects timing.

  • opus audio sampling rate
  • opus application setting (voip, music, lowdelay)
  • opus vbr setting
  • frame size (960, 1920, 2880)

Also some way to store cover image as URL or maybe inline

bwmarrin commented on May 28, 2024

http://wiki.multimedia.cx/index.php?title=FFmpeg_Metadata

That might be relevant here. I could also have dca pull the metadata of the input file with ffmpeg so it can be moved into the output metadata.

davidcole1340 commented on May 28, 2024

👍

DV8FromTheWorld commented on May 28, 2024

I think the idea of this metadata is fantastic, but I believe there should also be a magic byte header at the very beginning of the file so that it is easy to tell that we are in fact dealing with DCA audio. An example of this is the WAV file header, which starts with RIFF and carries WAVE at bytes 8-11.
I propose DISCORDAUDIO as a 12-byte magic header.

Additionally, a single byte can only specify a length of up to 255, so I would suggest using 2 bytes (as a short) to specify the number of bytes in the header. Please use a signed short, not an unsigned short. A signed short still allows for a max of 32,767, which should be plenty. I ask for a signed short because some languages don't have unsigned data types (cough cough Java cough cough).

So, ideally:
[D][I][S][C][O][R][D][A][U][D][I][O]
[Length_1][Length_2]
[..Json String as byte array ..]
[.. DCA opus audio packets ..]

DV8FromTheWorld commented on May 28, 2024

For the JSON, I propose the following

{
    "dca": {
        "version": 1,
        "tool": {
            "name": "dca-encoder",
            "version": "1.0.0",
            "rev": "bwmarrin/dca#32361ee92fcbd0e404b2be18adf497a45fef4a5f",
            "url": "https://github.com/bwmarrin/dca/",
            "author": "bwmarrin"
        }
    },
    "info": {
        "title": "Out of Control",
        "artist": "Nothing's Carved in Stone",
        "album": "Revolt",
        "genre": "jrock",
        "comments": "Second Opening for the anime Psycho Pass"
    },
    "origin": {
        "source": "file",
        "abr": 192000,
        "channels": 2,
        "encoding": "MP3/MPEG-2L3",
        "url": "https://www.dropbox.com/s/bwc73zb44o3tj3m/Out%20of%20Control.mp3?dl=0"
    },
    "opus": {
        "sample_rate": 48000,
        "mode": "voip",
        "frame_size": 960,
        "channels": 2
    },
    "modified_date": 1456602080731,
    "creation_date": 1456602080731
}

modified_date and creation_date are milliseconds past epoch.

meew0 commented on May 28, 2024

I wouldn't use nested JSON for the metadata, so that parsing and processing it is simpler; I'd rather have it all in one level and use prefixes if necessary. I agree with magic bytes, but I wouldn't put them before any other data. Rather, put them into the first packet with the other metadata, so the parsing doesn't get more complicated than skipping the first packet (or not doing so and getting 20 ms of garbage). DISCORDAUDIO seems excessive; how about DCA?

Also, in regards to packet headers, they're already two bytes each and already unsigned... why not just use ints to store them?

meew0 commented on May 28, 2024

I'll update my first post with some new metadata fields. I don't think we need modified and creation date as the file system already does this.

davidcole1340 commented on May 28, 2024

Looks good; however, we wouldn't be able to get origin.url, correct?

👍 on @meew0 with the magic byte, DISCORDAUDIO is a bit excessive.

DV8FromTheWorld commented on May 28, 2024

I believe that nesting the JSON doesn't really change its processability, and the nested encapsulation of data makes more sense. Using prefixes in JSON instead of nesting isn't smart; it's less intuitive.

I will admit that I have no clue as to how DCA works or stores its audio currently, so I have no concept of the packet storage system. However, with other file types like WAV, the magic header sits in the first few bytes of the file. This is useful because it means that you can read the minimum amount of the file to determine which kind of file it is. This is why I would suggest making the very first few bytes be the magic header.

I chose DISCORDAUDIO over DCA because the chance of some unrelated file happening to have 3 bytes matching DCA is fairly high, but the chance of a file matching DISCORDAUDIO and not being a DCA file is incredibly low. 9 more bytes really isn't that wasteful IMO.

davidcole1340 commented on May 28, 2024

JSON looks good as it was posted.

Possibly just DISCORD?

DV8FromTheWorld commented on May 28, 2024

DISCORD is an acceptable alternative.

meew0 commented on May 28, 2024

The packets are stored using a simple two-byte header specifying the length of the opus packet. It doesn't really matter whether those magic bytes are at the very beginning, because the fixed length header means you can always read bytes 2 to (N+2) where N is the length of the magic string.

meew0 commented on May 28, 2024

Choosing DISCORD or DISCORDAUDIO doesn't really matter, because the probability of three random bytes being DCA is 256⁻³ = 1/16777216, which is negligibly small.

DV8FromTheWorld commented on May 28, 2024

That isn't what a magic header is supposed to do. It sits at the very beginning so you can read the fewest bytes possible, and it also allows you to assume nothing about the file until you've confirmed that it IS the file type you want.

Also, reading the first N bytes and, if it is the correct file type, skipping MAGIC_HEADER.length bytes is just as easy as reading bytes 2 to (N + 2). Honestly, IMO, it is easier. Additionally, if we chose your system of putting the magic header inside the packet, it would mess up the JSON. Or, we would need to know to skip the magic header bytes inside the packet instead of just relying on the length value defined in the first 2 bytes of each packet when loading the JSON.

I don't see a real reason not to put the magic header bytes at the beginning. If they want to check whether it is actually a DCA file, they scan the first MAGIC_HEADER.length bytes; otherwise they just skip MAGIC_HEADER.length bytes and assume that the next 2 bytes they read will be the 2 bytes defining the length of the opus packet.

DV8FromTheWorld commented on May 28, 2024

A comment about DCA vs DISCORD vs DISCORDAUDIO: I've seen short magic headers be confused before. It is just a few extra bytes, so I don't think it is really that much of a problem to go with DISCORD. It is a good middle ground between DCA and DISCORDAUDIO.

meew0 commented on May 28, 2024

The problem with putting magic bytes at the very beginning is that you have to introduce special code to read them, and existing code will break on them because it will interpret the leading D as part of a packet length and read that amount of data, which will of course be garbage. If breaking the JSON is a problem, it is of course possible to make two header packets: one with the magic bytes and one with the metadata. DISCORD doesn't fit the format in my opinion, because it's mainly audio data, not Discord data. If you take a look at existing magic bytes you'll see that they're usually 2-4 bytes long, with longer ones being the exception.

bwmarrin commented on May 28, 2024

Just wanted to jump in here and say.. I am reading all of this and I really appreciate the conversation and all the helpful ideas. I'm a bit torn on some of what's being debated so I need to think about it a bit :). Please keep throwing in ideas though!

DV8FromTheWorld commented on May 28, 2024

I'll drop the discussion about DCA vs DISCORD or DISCORDAUDIO. Your argument is pretty sound so I'll be okay with whatever is picked for that.

About the need for new special code: do people currently skip the first packet of audio? If they don't, then just implementing metadata will require code changes to skip the first packet; otherwise the metadata will be sent as if it were an audio packet.

That being said, after I thought about it for a while: if we stored it inside the packet, it would make a streaming system easier, because the metadata packet could be sent as the first packet regardless of the stream's current read location.

meew0 commented on May 28, 2024

Having to skip the first packet doesn't really break existing files/code because 1. the first couple packets of audio files are generally zero anyway and 2. if existing code reads the metadata packet as an audio packet, either opus will fail to decode it or it will be decoded to garbage PCM - 20 ms of noise at the beginning doesn't really matter.

DV8FromTheWorld commented on May 28, 2024

Okay, I see your point. 20ms of garbage data vs all packets being garbage because it would be reading the wrong bytes for packet size.

DV8FromTheWorld commented on May 28, 2024

So, in conclusion, we are looking at storing a 3-byte magic header in the first packet of the DCA audio file. The magic header will directly follow the initial 2 bytes, which specify the packet size. This means that you will skip the first 2 bytes, read the next 3 to determine whether it is DCA audio and consequently the metadata packet, then, using the value stored in the initial 2 bytes, load byte 5 through packet.length when loading the JSON.
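
A small Go sketch of that in-packet variant, under the assumption that the two-byte packet length is unsigned little-endian and counts the magic bytes plus the JSON. The function name `splitMetadataPacket` is hypothetical:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// splitMetadataPacket inspects the first packet of a file. If the payload
// starts with "DCA", the rest of that packet is returned as JSON together
// with the remaining bytes. Otherwise ok is false and the input should be
// treated as plain audio. Unsigned little-endian lengths are an assumption.
func splitMetadataPacket(file []byte) (meta, rest []byte, ok bool) {
	if len(file) < 2 {
		return nil, file, false
	}
	n := int(binary.LittleEndian.Uint16(file[:2]))
	if n < 3 || len(file) < 2+n {
		return nil, file, false
	}
	payload := file[2 : 2+n]
	if !bytes.HasPrefix(payload, []byte("DCA")) {
		return nil, file, false
	}
	return payload[3:], file[2+n:], true
}

func main() {
	// First packet: length 5 = "DCA" + "{}", followed by one fake audio packet.
	file := []byte{5, 0, 'D', 'C', 'A', '{', '}', 1, 0, 0xFF}
	meta, rest, ok := splitMetadataPacket(file)
	fmt.Println(string(meta), len(rest), ok)
}
```

Old readers that ignore the magic bytes would still see a well-formed packet here, which is the compatibility argument made earlier in the thread.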

I still stand by my statement about the JSON needing to be nested. It feels cleaner and is easier to work with IMO.

bwmarrin commented on May 28, 2024

So, I've been thinking about this and looking at how it's handled elsewhere...and...

Right now, I think the focus should be on the best way to do this, without regard to how the file works currently. There are only 2-3 people that use the dca format, so if I make a major change that requires their code to be updated I'm okay with that :) However, I don't want to put off the best change now and then need to make it a year from now when 20 or 100 people are using it.

Personally, I think the magic byte header should be the very first bytes, and I propose using DCA1 and in case there is ever a huge and very backwards incompatible change we can use DCA2 and so forth. I don't expect that to happen ever, let alone frequently but at least it sets us up for the future possibility and as a four byte header it gives a decent level of protection from confusing file formats.

Next, I propose an int32 value that represents the length of the JSON data. This allows a very large chunk of JSON, but most importantly it gives enough extra space over int16 that we could include base64-encoded thumbnail art of around 300x300, which is something I specifically want for my own purposes and is fairly expected of any audio file format at this point.

After that, I say we continue the existing format, except use an int16 length header to be more compatible with all languages cough java cough and an int16 is still plenty large enough for the max length an opus packet would ever be.

Also, I think the json format should be nested. I think it's easier to read in that format for one, but also it creates cleaner code in Go for me so, there's my selfish personal reasons again :) It also gives more freedom for future format changes and even custom fields individuals choose to add into the files. The specific format I'm not settled on and I think I need to think about that a bit longer..

I know this means changes for everyone but I think it gives us a better format for the long term future. The initial DCA1 and JSON metadata can be stripped before being passed to your libraries and then they wouldn't even need to deal with them. I can provide a -raw option on the dca program itself to do this as well.

davidcole1340 commented on May 28, 2024

👍 Looks good, defs rather have it done sooner than later.

meew0 commented on May 28, 2024

The problem is that not only will existing DCA parsers break, existing files will break too, which might be an even bigger problem.

bwmarrin commented on May 28, 2024

I understand that, but.. How many of both do we have right now? There's what, 2-3 parsers? Beyond that, who is already storing a large collection of .dca files on disk? I don't have any at all but if someone is doing that I could absolutely write a small script that would convert them to include the new header.

davidcole1340 commented on May 28, 2024

The only libs that use it at the moment are DiscordPHP and discordrb; it's a couple of lines to ignore the header and JSON. Not a huge issue.

I don't know anyone that stores DCA files. Doubt anyone actually does, and @bwmarrin can convert them if needed.

davidcole1340 commented on May 28, 2024

I've started working on the JSON metadata over here: https://github.com/uniquoooo/dca/tree/magic-bytes

@bwmarrin possibly create a separate branch so I can merge in and you can fix all my mistakes? 😛

meew0 commented on May 28, 2024

After further discussion on Discord, this initial draft of the specification was created:

https://github.com/bwmarrin/dca/wiki/DCA1-specification-draft/7b7758d2170515a86121a2eec5365131c2bb686a

Please comment on it with specific things you'd like to have changed, or that should be discussed.

meew0 commented on May 28, 2024

Some problems that were mentioned and that should be discussed specifically:

  • The opus.abr setting doesn't make much sense with variable bitrate encoding.
  • The dca.tool.rev field may be too hard to provide because it would require a git repository parser. Removed in revision 3

meew0 commented on May 28, 2024

  • The metadata will need a field for additional arbitrary data. An extra object was suggested that will never be used internally by DCA. Added in revision 2
