borewit / music-metadata

Stream and file based music metadata parser for node. Supporting a wide range of audio and tag formats.

License: MIT License

JavaScript 0.34% TypeScript 99.66%
metadata musicbrainz picard tag tags id3 flac mp3 mp4 vorbis

music-metadata's People

Contributors

andrewrk, aodev, arthi-chaud, borewit, certuna, cobalamin, dependabot-preview[bot], dependabot[bot], dertseha, drvirtuozov, evshiron, felamaslen, horaceli, jpage-godaddy, leetreveil, motabass, mutewinter, nirbhayk, ondras, onerpm, pmarques, rasuni, rodu, rossgrady, sayem314, snyk-bot, tim-smart, trustedtomato, tyhchoi, ws

music-metadata's Issues

Encoding windows-1250

Hi, we are processing MP3 files from our supplier and some ID3 strings are saved in an exotic encoding, I guess windows-1250. In this case, strings with accented characters are read corrupted.
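A minimal sketch of the symptom described here, assuming the frame bytes really are windows-1250 and that the runtime's TextDecoder knows that label (Node.js builds with full ICU do). The byte sequence is made-up example data, and this is not how music-metadata itself decodes text.

```typescript
// Bytes for "Dvořák" as they would be stored in windows-1250 (assumed example data).
const raw = Uint8Array.from([0x44, 0x76, 0x6f, 0xf8, 0xe1, 0x6b]);

// ID3v2 text encoding 0 is ISO-8859-1, so a parser that trusts the tag decodes like this:
const asLatin1 = new TextDecoder("latin1").decode(raw);

// Decoding with the code page the supplier apparently used recovers the name:
const asWin1250 = new TextDecoder("windows-1250").decode(raw);

console.log(asLatin1);  // the accented "r" comes out as a different character
console.log(asWin1250);
```

Only the byte 0xF8 differs between the two code pages in this example, which is why such files often look "almost right".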

ID3v2.3 MusicBrainz Picard mapping not accurate

I used a common mapping table for both the ID3v2.3 and ID3v2.4 mappings. It looks like v2.3 is pretty different. The unit tests survived because they contain ID3v2.4 tag keys stored in ID3v2.3 format.

Update common type definition of the barcode

Currently the barcode (common.barcode) is a JavaScript number.
One of the disadvantages is that a barcode like 0 64027 11642 7 becomes 64027 11642.

Therefore I suggest changing the type to a string.
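A small sketch of why a numeric type cannot hold such a barcode faithfully. The digits are the ones from the example above with the spaces removed; the variable names are only illustrative.

```typescript
// A barcode with a leading zero (digits from the example above, spaces removed).
const printed = "064027116427";

const asNumber = Number(printed); // what a number-typed common.barcode would hold
const asString = printed;         // what a string-typed field would hold

console.log(String(asNumber)); // "64027116427": the leading zero is gone
console.log(asString);         // "064027116427"
```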

Possible issue with ID3 titles containing a forward slash

ID3v2.3 title tags containing forward slash characters appear to be parsed as multiple separate title tags, breaking on the slash. Is this expected behaviour / part of the ID3 spec, or a bug?

I have not noticed this behaviour in any other player / tag editor / parser when opening the same file.

Example:
A song with the title "This/That" returns the following two ID3v2.3 native tags:

[{
  id: "TIT2",
  value: "This"
}, {
  id: "TIT2",
  value: "That"
}]

Thank you!
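For context, a hedged sketch of one way a parser could limit slash splitting. As far as I can tell, the ID3v2.3 informal standard describes "/" as a separator only for a handful of people-list frames, and TIT2 is not among them. The frame list and the helper name below are assumptions to check against the spec, not music-metadata's actual behaviour.

```typescript
// Frames for which the ID3v2.3 informal standard describes "/" as a value separator.
// Treat this list as an assumption to verify, not as the library's behaviour.
const SLASH_SEPARATED_V23 = new Set(["TCOM", "TEXT", "TOLY", "TOPE", "TPE1"]);

// Hypothetical helper: split a decoded ID3v2.3 text frame into its values.
function splitV23TextFrame(id: string, text: string): string[] {
  return SLASH_SEPARATED_V23.has(id) ? text.split("/") : [text];
}

console.log(splitV23TextFrame("TIT2", "This/That"));         // one value
console.log(splitV23TextFrame("TPE1", "Artist A/Artist B")); // two values
```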

Remove .ts files from published npm module

The published npm module contains both the typescript and javascript files.
When using the module with tsloader and webpack, it throws about 117 errors.

ERROR in ./node_modules/music-metadata/lib/index.ts
Module parse failed: E:\electron\nighthawk\node_modules\music-metadata\lib\index.ts Unexpected token (6:20)
You may need an appropriate loader to handle this file type.
| import common from './common';
| import TagMap, {HeaderType} from './tagmap';
| import EventEmitter = NodeJS.EventEmitter;
| import {ParserFactory} from "./ParserFactory";
| import * as Stream from "stream";
 @ ./src/background/library.ts 3:0-37
 @ ./src/ui/pages.tsx
 @ ./src/ui/shell.tsx
 @ ./src/index.tsx
 @ multi (webpack)-dev-server/client?http://localhost:8080 webpack/hot/only-dev-server react-hot-loader/patch ./src/index.tsx

ERROR in [at-loader] ./node_modules/music-metadata/lib/ParserFactory.ts:168:11
    TS6133: 'warning' is declared but never used.

ERROR in [at-loader] ./node_modules/music-metadata/lib/Windows1292Decoder.ts:23:26
    TS7006: Parameter 'a' implicitly has an 'any' type.

ERROR in [at-loader] ./node_modules/music-metadata/lib/Windows1292Decoder.ts:23:29
    TS7006: Parameter 'min' implicitly has an 'any' type.

This occurs because tsloader tries to load the .ts files instead of the .js ones. Setting allowJs to true does not resolve the errors.

When I manually delete the .ts files while keeping the .js and .d.ts files, the module works perfectly.

This occurs for both music-metadata and strtok3.

Support calculating approximate duration

Passing the duration: true option causes every media file to be scanned in full to calculate its exact duration. When a large number of files has to be handled, an approximate duration is enough; most clients that play the files do not trust what the server reports anyway.

How about adding an option like approximate: true to duration: true for approximate duration? As I recall, an old version of musicmetadata did that by assuming CBR when the first three frames have the same bit-rate; I ported that to mp3-duration (mycoboco/mp3-duration@abdb1bc).
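A rough sketch of the arithmetic behind such an approximation, assuming the file is constant bit rate and that the size of any leading tag is already known. The function and parameter names are hypothetical; this is not the mp3-duration or music-metadata implementation.

```typescript
// Hypothetical helper: estimate the duration of a constant-bit-rate MPEG audio file
// from its size, without scanning every frame.
//   fileSize    total file size in bytes
//   audioOffset bytes before the first audio frame (e.g. a leading ID3v2 tag)
//   bitrate     bits per second, taken from the first frame header(s)
function estimateCbrDuration(fileSize: number, audioOffset: number, bitrate: number): number {
  const audioBytes = Math.max(0, fileSize - audioOffset);
  return (audioBytes * 8) / bitrate; // seconds
}

// 2,400,000 bytes of audio at 192,000 bit/s
console.log(estimateCbrDuration(2404096, 4096, 192000)); // 100
```

The estimate is only as good as the CBR assumption; for VBR files it can be far off, which is why it would have to stay opt-in.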

No metadata found in files with metadata

I have two files that have music tags, but they are not seen by this parser and object is returned with nulls in properties while thousands of other files do not have such problem. I can't figure out what is wrong with these two songs, maybe I'm missing something.

Songs.zip

this.s.once is not a function

While trying to read an audio stream, the package throws the error below.

Uncaught TypeError: this.s.once is not a function
    at new StreamReader (node_modules/then-read-stream/lib/index.js:28:16)
    at new ReadStreamTokenizer (node_modules/strtok3/lib/ReadStreamTokenizer.js:20:30)
    at Object.fromStream (node_modules/strtok3/lib/index.js:37:28)
    at Function.ParserFactory.parseStream (node_modules/music-metadata/lib/ParserFactory.js:54:24)
    at MusicMetadataParser.parseStream (node_modules/music-metadata/lib/index.js:103:46)
    at Object.parseStream (node_modules/music-metadata/lib/index.js:289:46)
    at audioType (app/controllers/uploads.js:63:15)
    at IncomingMessage.<anonymous> (app/controllers/uploads.js:104:11)
    at IncomingMessage.Readable.read (_stream_readable.js:381:10)
    at flow (_stream_readable.js:761:34)
    at resume_ (_stream_readable.js:743:3)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickDomainCallback (internal/process/next_tick.js:122:9)

Question: Error: EMFILE: too many open files

First of all, thanks for this reader.

I want to read the tags of over 80k MP3 files for my software.
But after about 7k files the performance breaks down and the run ends with:

Error: EMFILE: too many open files....

Can someone help me queue the async calls, with async or something like that:
https://caolan.github.io/async/docs.html#queue

Thanks

My code now is:

var walk = require('walk');
var path = require('path');
var mm = require('music-metadata');
const util = require('util');

var walker = walk.walk("/Volumes/..");
walker.on("file", function (root, fileStats, next) {
 // TODO: also allow the other file extensions
 if (path.extname(fileStats.name) === ".mp3") {
   // parseFile is started but not awaited before next() is called below
   mm.parseFile(path.join(root, fileStats.name)).then(function (metadata) {
     console.log(util.inspect(metadata, { showHidden: false, depth: null }));
   }).catch(function (err) {
     console.error(err.message);
   });
 }
 next();
});
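One way to keep the number of simultaneously open files bounded is a small concurrency limiter. The sketch below is dependency-free and uses a stand-in for mm.parseFile so that it runs without audio files; mapWithConcurrency and fakeParse are hypothetical names, and the limit of 2 is arbitrary.

```typescript
// Run `worker` over `items`, keeping at most `limit` calls in flight at once.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await worker(items[i]);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Stand-in for mm.parseFile that records how many calls overlap.
let inFlight = 0;
let peak = 0;
async function fakeParse(file: string): Promise<string> {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return file.toUpperCase();
}

const done = mapWithConcurrency(["a.mp3", "b.mp3", "c.mp3", "d.mp3", "e.mp3"], 2, fakeParse);
done.then((tags) => console.log(tags, "peak concurrency:", peak));
```

In the walker above, the simpler alternative would be to call next() only after the parseFile promise has settled, which processes one file at a time. Note that in this sketch a single rejected worker call rejects the whole batch, so per-file errors should be caught inside the worker.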

EBADF error when parsing a large media collection

I'm running into a strange issue after porting some code from the original musicmetadata package. When the console.log is uncommented, the example below can parse 2000+ music files without ever having a single issue.

However, the moment you comment out that line, calling .close() on the fileStream will occasionally throw an EBADF: bad file descriptor, read error after parsing some of the files.

MUSIC-METADATA:

import * as Chokidar from "chokidar"
import * as Path from "path"
import * as Fs from "fs"
import * as Metadata  from "music-metadata"

let tags = []
let watcher = Chokidar.watch(
  "/Music/**/*.mp3",
  {
    ignoreInitial: false,
    alwaysStat: false
  }
)

watcher.on("add", (file, fstat) =>
{
  let fileStream = Fs.createReadStream(file)
  Metadata.parseStream(fileStream, { native: true }, (error, metadata) =>
  {
    // console.log("no EBADF ever if this line uncommented")
    fileStream.close() // will occasionally throw `EBADF: bad file descriptor, read`
    tags.push(metadata)
  })
})

The same code using the original musicmetadata package will iterate over the same files without ever once throwing an EBADF error, and without the need of the console.log

MUSICMETADATA:

import * as Chokidar from "chokidar"
import * as Path from "path"
import * as Fs from "fs"
import * as Metadata  from "musicmetadata"

let tags = []
let watcher = Chokidar.watch(
  "/Music/**/*.mp3",
  {
    ignoreInitial: false,
    alwaysStat: false
  }
)

watcher.on("add", (file, fstat) =>
{
  let fileStream = Fs.createReadStream(file)
  Metadata(fileStream, (error, metadata) =>
  {
    fileStream.close()
    tags.push(metadata)
  })
})

Use straight file access instead of streams

Move away from the streaming approach and use straight file access; a stream is very inefficient in some cases (cannot jump (seek) to the end of the file, without reading everything in between).

Pros:

  • Better performance.
  • Opens the door to writing metadata in addition to reading it.

Cons:

  • Would make this module less suitable for running in a browser.
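To illustrate the seek argument, here is a sketch that reads an ID3v1 block (the last 128 bytes of a file, starting with "TAG") by position rather than by streaming through the whole file. It uses only Node's fs module on a synthetic file; readId3v1Block is a hypothetical helper, not part of music-metadata.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Read the last 128 bytes of a file directly; with a stream, everything before
// them would have to be consumed first.
function readId3v1Block(file: string): Buffer | null {
  const fd = fs.openSync(file, "r");
  try {
    const { size } = fs.fstatSync(fd);
    if (size < 128) return null;
    const block = Buffer.alloc(128);
    fs.readSync(fd, block, 0, 128, size - 128);
    // An ID3v1 tag starts with the ASCII marker "TAG".
    return block.toString("latin1", 0, 3) === "TAG" ? block : null;
  } finally {
    fs.closeSync(fd);
  }
}

// Demo with a synthetic file: 1 MiB of padding followed by a fake ID3v1 block.
const demo = path.join(os.tmpdir(), `id3v1-demo-${process.pid}.bin`);
const tag = Buffer.alloc(128);
tag.write("TAG", 0, "latin1");
tag.write("Turn Me On", 3, "latin1"); // the title field occupies bytes 3..32
fs.writeFileSync(demo, Buffer.concat([Buffer.alloc(1024 * 1024), tag]));

const found = readId3v1Block(demo);
console.log(found ? found.toString("latin1", 3, 33).replace(/\0+$/, "") : "no ID3v1 tag");
fs.unlinkSync(demo);
```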

Very slow parsing on files that need `duration: true`

I've come across some files that don't parse out a duration property. Setting the duration: true option works, but it results in a very slow parse:

{duration: true}: 12238.523ms  # duration: true
{duration: false}: 2.982ms # duration: false

(example repo: https://github.com/ballpit/parsing-speed uses LFS)

Any ideas why this file takes so long to parse out a duration property? Are there any optimizations to this case that could be done to speed it up?

Fantastic module and work btw!

Memory leak by parsing special mp3 file

When parsing one particular file, the Node environment used more than 1.4 GB of RAM and exited with:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

@Borewit I sent you one file by mail for testing; if you or anyone else wants more examples, please contact me.

Incorrect metadata retrieval again

After I had learned that some of my MP3s had multiple headers, I reset them so that every kind of header (ID3v1, ID3v2 and so on) has the same tag values, but music-metadata still fails to get the correct metadata from some of them. I double-checked with at least two metadata editors that the files in question have their headers properly set.

I attach two samples to help you figure out the cause.

sample1.mp3.zip
sample2.mp3.zip

Parse Discogs tags.

Current implementation is strongly based on MusicBrainz tags.

In addition to MusicBrainz, add support for Discogs tags, e.g.:

  • discogs_artist_id
  • discogs_artist_name
  • discogs_catalog
  • discogs_country
  • discogs_date
  • discogs_label
  • discogs_label_id
  • discogs_master_release_id
  • discogs_rating
  • discogs_release_id
  • discogs_released
  • discogs_votes

ToDo:

  • Map Discogs rating to common.rating
  • Avoid duplicate values
    • Catalog (label) number
    • Artist name
    • Country

Support for opus and question about cover art

This is AWESOME, it's almost an all in one.

Would it be possible in the future to add support for Opus files? :)

Also, can I read cover art from FLAC files? Some of my files have this - no idea how they did it.

Thanks 👍

Error when executing example

Just wanted to try the drag and drop example. I get the following error:

angular.js:13920 init controller...
angular.js:13920 handleDropFiles: [File]
angular.js:13920 Retrieving metadata of file "01 How You Love Me (Axero Remix).m4a"....
music-metadata.js:317 Uncaught TypeError: Cannot read property 'queue' of undefined
    at Stream.through.autoDestroy (music-metadata.js:317)
    at Stream.stream.write (music-metadata.js:9655)
    at Class.ondata (music-metadata.js:7552)
    at Class.EventEmitter.emit (music-metadata.js:6104)
    at readableAddChunk (music-metadata.js:7223)
    at Class.Readable.push (music-metadata.js:7182)
    at check (music-metadata.js:6424)
    at FileReader.loaded (music-metadata.js:6343)
through.autoDestroy @ music-metadata.js:317
stream.write @ music-metadata.js:9655
ondata @ music-metadata.js:7552
EventEmitter.emit @ music-metadata.js:6104
readableAddChunk @ music-metadata.js:7223
Readable.push @ music-metadata.js:7182
check @ music-metadata.js:6424
loaded @ music-metadata.js:6343
angular.js:13920 handleDropFiles: [File]
angular.js:13920 Retrieving metadata of file "Light Into Darkness [Premiere].mp3"....
music-metadata.js:317 Uncaught TypeError: Cannot read property 'queue' of undefined
    at Stream.through.autoDestroy (music-metadata.js:317)
    at Stream.stream.write (music-metadata.js:9655)
    at Class.ondata (music-metadata.js:7552)
    at Class.EventEmitter.emit (music-metadata.js:6104)
    at readableAddChunk (music-metadata.js:7223)
    at Class.Readable.push (music-metadata.js:7182)
    at check (music-metadata.js:6424)
    at FileReader.loaded (music-metadata.js:6343)
through.autoDestroy @ music-metadata.js:317
stream.write @ music-metadata.js:9655
ondata @ music-metadata.js:7552
EventEmitter.emit @ music-metadata.js:6104
readableAddChunk @ music-metadata.js:7223
Readable.push @ music-metadata.js:7182
check @ music-metadata.js:6424
loaded @ music-metadata.js:6343

Both files contain music metadata, I checked it afterwards using ID3Tag.

Fixes for #38 and #39 have side effects

After upgrading to 0.8.3, I don't have errors anymore when parsing mp3 files, but some files have incorrect field values. For the attached file, e.g., 0.8.2 gives:

{ format: 
   { dataformat: 'mp3',
     lossless: false,
     bitrate: 192000,
     sampleRate: 44100,
     numberOfChannels: 2,
     codecProfile: 'CBR',
     encoder: 'LAME3.92 ',
     duration: 154.85387755102042,
     tagTypes: [ 'ID3v2.3', 'ID3v1.1' ] },
  native: undefined,
  common: 
   { track: { no: 7, of: null },
     disk: { no: null, of: null },
     album: 'Come Away With Me',
     artist: 'Norah Jones',
     genre: [ 'Jazz' ],
     label: 'Blue Note',
     title: 'Turn Me On',
     year: 2002,
     picture: [ [Object] ],
     artists: [ 'Norah Jones' ] } }

while 0.8.3 gives:

{ format: 
   { lossless: false,
     dataformat: 'mp3',
     bitrate: 192000,
     sampleRate: 44100,
     numberOfChannels: 2,
     codecProfile: 'CBR',
     encoder: 'LAME3.92 ',
     duration: 154.85387755102042,
     tagTypes: [ 'ID3v2.3', 'ID3v2.4', 'ID3v1.1' ] },
  native: undefined,
  common: 
   { track: { no: 7, of: null },
     disk: { no: null, of: null },
     title: 'Turn Me On',
     artist: 'Norah Jones',
     album: 'Come Away With Me',
     comment: [ [Object] ],
     genre: [ 'www.mp3-ogg.ru' ],
     copyright: 'Make Love Not War',
     artists: [ 'Norah Jones' ] } }

That is, incorrect genre, copyright and comment fields.

Similar issues arise for other files, but I suspect they have the same cause.

sample.zip

Support rating tag

Related issue: leetreveil/musicmetadata#137

Parse rating tags and map them to a common tag.
In line with MusicBrainz Picard, normalize the common tag to a scale from 0 to 5.0:

            elif frameid == 'POPM':
                # Rating in ID3 ranges from 0 to 255, normalize this to the range 0 to 5
                if frame.email == config.setting['rating_user_email']:
                    rating = string_(int(round(frame.rating / 255.0 * (config.setting['rating_steps'] - 1))))
                    metadata.add('~rating', rating)

source: https://github.com/metabrainz/picard/blob/master/picard/formats/id3.py
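A possible TypeScript rendering of the Picard expression quoted above, assuming six rating steps so that the result lands on 0 to 5. The function name is hypothetical and the email check from the Python snippet is left out.

```typescript
// Port of the quoted Picard expression: map an ID3 POPM rating (0..255)
// onto `steps` discrete values (0..steps-1). With steps = 6 that is 0..5.
function popmToRating(popm: number, steps: number = 6): number {
  return Math.round((popm / 255) * (steps - 1));
}

console.log(popmToRating(0));   // 0
console.log(popmToRating(64));  // 1
console.log(popmToRating(128)); // 3
console.log(popmToRating(255)); // 5
```

Python's round and JavaScript's Math.round treat exact .5 ties differently, but with integer POPM values and six steps no tie can occur, so the two agree here.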

Typing on ICommonTagsResult.comment appears incorrect

The type of ICommonTagsResult.comment is declared as string[] in index.d.ts in the root of the package, but in practice it appears to actually return:

{
    description: string,
    language: string,
    text: string,
}[]

Observed when installing v0.9.2.
I had a very brief look through some of the source but I couldn't immediately trace through how the comment tag data was being populated (due to my lack of familiarity with the project).
I can provide further detail if needed.
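Until the declaration and the runtime value agree, calling code could defend itself along these lines. IObservedComment is a hypothetical name for the shape reported above, not a type exported by music-metadata, and the sample values are illustrative.

```typescript
// Shape observed at runtime in v0.9.2, as reported above.
interface IObservedComment {
  description: string;
  language: string;
  text: string;
}

// Accept both the declared (string) and the observed (object) element shape.
function commentText(c: string | IObservedComment): string {
  return typeof c === "string" ? c : c.text;
}

const mixed: Array<string | IObservedComment> = [
  "plain comment",
  { description: "", language: "eng", text: "structured comment" },
];
console.log(mixed.map(commentText));
```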

mpeg parsing fails for irrelevant attributes

I have some files in which the parsing fails on:

Invalid MPEG Audio version

or

expected frame header but was not found

I've "fixed" both errors by disabling them in the code (file: mpeg.js):

if (this.version === null) {
    this.version = 1;
    //throw new Error('Invalid MPEG Audio version');
}
if (this.frameCount > 0) {
    //return this.done(new Error('expected frame header but was not found'));
}

While this is definitely not a real fix, I am now able to read the song title and artist successfully which is all I currently need.
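For reference, a sketch of how the MPEG audio version is encoded in a frame header: after 11 sync bits, a 2-bit version ID follows, of which the value 01 is reserved. The helper name is hypothetical and this is not the code from mpeg.js.

```typescript
// MPEG audio version indexed by the 2-bit version ID in the frame header.
// ID 0b01 is reserved, which is where an "Invalid MPEG Audio version" check would fire.
const VERSION_BY_ID: Array<number | null> = [2.5, null, 2, 1];

// Hypothetical helper: returns the version, or null if the two bytes are not
// a frame sync or carry the reserved version ID.
function mpegVersion(b0: number, b1: number): number | null {
  const isSync = b0 === 0xff && (b1 & 0xe0) === 0xe0; // 11 set bits
  if (!isSync) return null;
  return VERSION_BY_ID[(b1 >> 3) & 0x03];
}

console.log(mpegVersion(0xff, 0xfb)); // 1
console.log(mpegVersion(0xff, 0xf3)); // 2
console.log(mpegVersion(0xff, 0xe3)); // 2.5
console.log(mpegVersion(0xff, 0xeb)); // null (reserved ID)
```

A reserved version ID usually means the parser is looking at bytes that are not a real frame header, so resynchronising on the next valid header may be a safer reaction than assuming version 1.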

Roadmap v1.0

I just copied the goals into this issue to keep track of them. We can later just create a checklist out of them.

  • Re-factoring:
    • Move away from the readstream approach and use straight file access; a stream is very inefficient in some cases (it cannot jump to the end of the file without reading everything in between) and cumbersome in my opinion. For instance, if a FLAC file is prefixed with an ID3 header (bad practice, non-standard, but it is done), this code gets completely confused, because it expects MPEG.
      Moved to issue #7
    • Use Promises.
  • Improve tests:
    • Use a test framework like mocha or jest. I used jest for the last project, super easy to work with and supports multi-threaded tests.
      Moved to issue #8
    • Translate test case code to TypeScript: This brings up more problems than it actually brings benefits.
  • Improve documentation:
    • Write detailed documentation on the GitHub wiki about how the tag mapping is done
  • New features / functionality:
    • In addition to MusicBrainz tags, support Discogs tags (see #9), maybe iTunes tags
    • Report on errors (see #60)
    • Be able to write tags (see #10)
    • Support audio CRC values

Edit:
Concerning the refactoring. It would probably be better to start from scratch and use the current code base as a reference.
Changed the title to Roadmap v1.0. Refactoring will probably break the API. Bumping the version up might be a good idea.
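The FLAC-prefixed-with-ID3 case mentioned in the roadmap can be handled by skipping the ID3v2 tag before sniffing the container. A sketch on a synthetic buffer, assuming the standard 10-byte ID3v2 header with a synchsafe size field; synchsafe and skipId3v2 are hypothetical helper names.

```typescript
// Decode the 4-byte "synchsafe" size used in ID3v2 headers (7 significant bits per byte).
function synchsafe(b: Uint8Array, offset: number): number {
  return (b[offset] << 21) | (b[offset + 1] << 14) | (b[offset + 2] << 7) | b[offset + 3];
}

// Hypothetical helper: offset of the first byte after a leading ID3v2 tag, or 0 if none.
function skipId3v2(b: Uint8Array): number {
  const hasId3 = b.length >= 10 && b[0] === 0x49 && b[1] === 0x44 && b[2] === 0x33; // "ID3"
  if (!hasId3) return 0;
  const hasFooter = (b[5] & 0x10) !== 0; // the ID3v2.4 footer flag adds 10 more bytes
  return 10 + synchsafe(b, 6) + (hasFooter ? 10 : 0);
}

// Synthetic buffer: an ID3v2.3 header declaring 257 bytes of tag data, then "fLaC".
const data = new Uint8Array(10 + 257 + 4);
data.set([0x49, 0x44, 0x33, 0x03, 0x00, 0x00, 0x00, 0x00, 0x02, 0x01], 0);
data.set([0x66, 0x4c, 0x61, 0x43], 267);

const start = skipId3v2(data);
const magic = String.fromCharCode(data[start], data[start + 1], data[start + 2], data[start + 3]);
console.log(start, magic); // 267 fLaC
```

With the offset known, the parser can pick the FLAC code path from the "fLaC" marker instead of assuming MPEG audio follows every ID3v2 tag.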
