
harmony's People

Contributors

aerozol, atj, kellnerd, mwiencek, phw


harmony's Issues

Normalize and merge copyright lines

Continuing the discussion from #22 (comment)

We can factor out the copyright normalization logic and reuse it for other providers, e.g. as suggested for Tidal in https://community.metabrainz.org/t/harmony-music-metadata-aggregator-and-musicbrainz-importer/698641/15
But I'd do this after merging this PR. It also needs some further research. I know Tidal includes the copyright text both with and without the © symbol. What I'm unsure about is whether this strictly contains copyright (©) info, or whether it can sometimes also contain phonographic copyright (℗) info. Spotify has those separated, which makes it easier.

I fully agree, this is enough for its own PR and it needs more research.
Tidal also has a copyright property at the track level by the way, which should also be considered if it differs from the release level copyright. So far they were identical for the releases which I have checked, but maybe a compilation has different values there.

For starters I have a commit in the dev branch which displays the alternative copyright values.
When we have more examples we can decide how the release merge algorithm should handle these, one possibility would be to keep all and deduplicate them.
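The "keep all and deduplicate" option could look roughly like the sketch below. It assumes copyright lines arrive as plain strings from each provider; the function names and the normalization rules (ASCII "(C)"/"(P)" to symbols, whitespace collapsing) are illustrative assumptions, not existing Harmony code.

```typescript
/** Normalizes a single copyright line (hypothetical rules, for illustration only). */
function normalizeCopyright(line: string): string {
  return line
    .replace(/\(c\)/gi, "©")
    .replace(/\(p\)/gi, "℗")
    .replace(/\s+/g, " ")
    .trim();
}

/** Keeps every distinct line from all providers, in first-seen order. */
function mergeCopyrightLines(lines: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const line of lines) {
    const normalized = normalizeCopyright(line);
    const key = normalized.toLowerCase();
    if (normalized && !seen.has(key)) {
      seen.add(key);
      merged.push(normalized);
    }
  }
  return merged;
}
```

Lines which differ only by a missing symbol are deliberately kept as separate entries here, since it is still unclear whether such a line means © or ℗.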

Metadata Providers

List of sources/websites for which a metadata provider has been implemented or requested.

Leave a comment which includes at least a link to an example release for a quick provider request, or create a separate issue which is named after the provider and labeled with "provider" (Metadata provider) if you have more research to share.
Detailed requests for sources with an open API and good documentation are more likely to be implemented.
Edit: Let's be honest, every request which contains more details than just the name or URL of the source is probably worth its own issue already.

If you plan to work on a provider, be it doing more research or actually implementing it, please create a separate issue which is named after the provider and ask for it being assigned to you.

Open JSON API

  • MusicBrainz
  • Deezer (GTIN, ISRC)
  • iTunes (per region; no GTIN)
  • Discogs (GTIN)
  • YouTube (GTIN) (#31)

Restricted JSON API

  • Spotify (requires token; GTIN, ISRC) (#16)
  • Tidal (requires token, per region; GTIN, ISRC) (#11)
  • Qobuz (documentation, requires app ID, IP-based 404s for unavailable regions; GTIN, ISRC)
  • Apple Music (requires paid token; GTIN, ISRC)
  • Beatport (requires token; GTIN, ISRC)
  • Soundcloud (requires authentication; GTIN, ISRC) (#35)
  • #12 (only accessible in some regions)
  • #14 (requires authentication)

Tokens and other secrets should not be included in this repository but loaded from environment variables.

HTML Scraping

  • Bandcamp (embedded JSON; GTIN)
  • Beatport (embedded JSON; GTIN, ISRC) (#2)
  • OTOTOY (#34)

Uncategorized Requests

Allow to use standard provider when linking to Harmony with URL parameter

For integration with external tools it would be convenient to be able to easily link to Harmony with a URL or GTIN. Links like this should be supported:

With URL: https://harmony.pulsewidth.org.uk/release?url=https%3A%2F%2Fgoatgirl.bandcamp.com%2Falbum%2Fbelow-the-waste

With GTIN: https://harmony.pulsewidth.org.uk/release?gtin=191402047554

There are two separate behaviors:

  1. Linking with URL works and triggers a lookup, but all default providers are disabled. To enable a provider an explicit parameter for this provider has to be passed, e.g. deezer=. This is inconvenient for third-party tools, as they need to hardcode all possible providers. It would be better if the default providers would be used instead.
  2. The GTIN link behaves differently. It does not automatically trigger a lookup. But instead it just shows the form and has the default providers set properly.

The second case is more convenient and probably intentional. Having case 1. with url parameter behave the same would likely solve the issue.

Not sure whether both cases should trigger an automatic lookup.
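A minimal sketch of the suggested behavior for case 1, assuming providers are enabled by parameters named after them (like deezer=). The provider lists and the function name are placeholders, the real defaults are not stated in this issue.

```typescript
// Hypothetical lists; the actual default providers are an assumption here.
const DEFAULT_PROVIDERS = ["deezer", "itunes", "spotify", "tidal"];
const KNOWN_PROVIDERS = [...DEFAULT_PROVIDERS, "bandcamp", "beatport"];

/** Returns the providers requested via parameters like `deezer=`, or the defaults if none are given. */
function selectProviders(lookupUrl: string): string[] {
  const params = new URL(lookupUrl).searchParams;
  const requested = KNOWN_PROVIDERS.filter((name) => params.has(name));
  return requested.length > 0 ? requested : DEFAULT_PROVIDERS;
}
```

With this, a link which only carries url= or gtin= would fall back to the defaults, while third-party tools could still opt into specific providers explicitly.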

Barcode collision

Sometimes barcodes are not as unique as they should be...

635669065024 returns two different releases (different artists, but same label) with the iTunes, Spotify and Tidal providers. Deezer's API only returns one of them (YBC III), for the others it seems to be random which one is the first result that gets returned.

The iTunes provider at least warns about this, the other providers currently ignore this issue silently.

seed track URLs to MusicBrainz recordings

mostly for individual track pages, but I believe Bandcamp can have different licenses per track (tho I don't know if that'd be a recording or work URL...), for example

Support setting CC license for Bandcamp provider

Bandcamp contains many Creative Commons licensed releases, see for example https://aeonsable.bandcamp.com/album/aenigma-2023

The Bandcamp importer user script supports reading the license information and sets the license URL when seeding, see https://github.com/murdos/musicbrainz-userscripts/blob/master/bandcamp_importer.user.js#L188-L195

Similar could be done in the Bandcamp provider.

See https://community.metabrainz.org/t/harmony-music-metadata-aggregator-and-musicbrainz-importer/698641/12

Yandex Music

(as suggested in #5)

Yandex Music is a Russian music streaming service developed by Yandex. Users select musical compositions, albums, collections of musical tracks to stream to their device on demand and receive personalized recommendations. The service is also available as web browser. Service is available in Armenia, Azerbaijan, Belarus, Georgia, Israel, Kazakhstan, Kyrgyzstan, Moldova, Russia, Tajikistan, Turkmenistan and Uzbekistan. Subscription can only be paid from supported countries above, but the service is then available in all other countries. (wiki)

Example of an album: https://music.yandex.ru/album/12353342
Open JSON API: https://api.music.yandex.net/albums/12353342/ (or https://api.music.yandex.net/albums/12353342/with-tracks for additional info on tracks from album, such as the distributor of release). VPN might be needed to open those (mirror for the "with-tracks" response: https://www.jsonkeeper.com/b/YKSE)

The API supports neither GTIN nor ISRC. Also, the "label" section of the response takes its info from the P-line of the release and in most cases removes words like "Productions", "Music", "Publishing" etc., and it splits one label into multiple ones if there's a slash in its name (like here).

API supports showing whether it's an album, single, podcast or an audiobook (since they all have a link of https://music.yandex.ru/album/album_id).

There's also an unofficial implementation of an API client at https://github.com/MarshalX/yandex-music-api/releases but a token is needed to use it

OTOTOY

I honestly couldn't find any API for OTOTOY, but since it is a Japanese store, most of the help pages aren't in English, so there might be.

https://ototoy.jp

that said, perhaps it could be scraped for data, especially since it's one of the few stores I know that shows catalog numbers (for example, here).

important note, OTOTOY does keep separate pages for Lossless and High-Resolution releases, which would be the same MusicBrainz release (all other data being the same, of course)

Roadmap

This is just a loosely ordered list of things I already have on my radar, to be cleaned up later™️.

Harmonized Data

  • Merge missing properties into the preferred provider's data
  • Check for conflicting properties during merge
    • Warning for duration
    • Error for GTIN
    • Error for incompatible medium track counts
    • Merge tracks if the total track count matches (1 medium vs N media)
    • Skip missing tracklist! (by far the most common error in the test phase logs)
    • Merge empty medium into medium with tracks
  • Release date quality ranking, plausibility checks for each provider (new attribute "date.warning"), merge strategy "prefer latest"
  • Generally warn about pre-release data
  • Guess featured artists from titles
  • Copyright notices
  • Explicitness of tracks
  • Optional title cleanup (numeric prefix, ETI style etc.)
    • Deezer allows crediting the same artist multiple times
  • Customizable search & replace rules
  • Audiobook / audio drama mode?
  • Preserve catalog numbers
  • Improve language detection (skip too short inputs, try alternatives?)

Providers

  • iTunes: Ensure that collection.trackCount equals the number of returned tracks
  • iTunes: Warn about responses which contain multiple release variants for a UPC
  • iTunes: Country list for optional region lookups
  • iTunes: Use region from URL
  • iTunes: Try to use artist (or ISRC?) region for canonical (region-specific) URL
  • iTunes: Try next region instead of throwing if JSON parsing fails (see screenshot)
  • Spotify: Pad GTIN with zeros if no results are found (example)
  • Deezer: Truncate padded GTIN
  • Bandcamp: Try band URL as label URL
  • Bandcamp: Add untitled hidden tracks, only their count is available as OG meta header
    • Extract more than just trAlbum into snapshots -> wrapper object, avoid deserializing JSON
  • Bandcamp: Try band URL as label URL, extract label from packages
  • Bandcamp: Check whether band is part of the release artist before using it as label
  • iTunes: Warn about missing tracklist
  • Bandcamp: VA releases
  • Bandcamp: /track URLs (#7)
  • Bandcamp: Extract release ISRCs from all /track URLs (expensive, only if there is no better source)
  • iTunes: Drop " - Single" from title (#9)
  • Bandcamp: Extract track images, from embedded player (only for pre-releases so far)
  • Beatport: Warn about catalog numbers which look like GTINs
  • iTunes: Show not only the URL with the last region when all lookup attempts failed
  • Bandcamp: no tracks https://2nxmusic.bandcamp.com/album/stolen-lullabies
  • Bandcamp: custom domains (#8)
  • Deezer: API sometimes returns too many tracks: https://musicbrainz.org/edit/112474481 or https://www.deezer.com/album/303245

MusicBrainz

  • Suggest existing release group
  • Find release group or similar releases, reuse recordings? Similar query as MB duplicates tab?
  • Resolve external links to MBIDs
    • Don't resolve ambiguous URLs to MBIDs
    • Allow two URL rels if for the same target entity (e.g. download and streaming)
    • Cache pending requests in a map, parallel resolving of all identifiers
    • Use resolved MBID of release artist for unresolved but identically named track artists
    • Combine release and track artists which share identifiers or names to avoid inconsistent results in edge cases
  • Guess release group types
  • Create edit note
    • Add permalink / homepage / repository URL (and version?)
  • Optionally fill the annotation with additional data (make the sections configurable)
    • Copyright notice
    • Availability
    • Release and track level credits (text only so far)
    • Explicitness (show, but do not seed for tracks; add to release disambiguation?)
  • Detect European releases (special country XE)
  • Target seeder at existing release
  • Use ampersand for last joinphrase by default
  • Support track URLs for other providers and suggest to look their release up

Infrastructure

  • URL lookup -> GTIN -> parallel GTIN lookups
  • Support provider-specific messages
  • Return all provider error messages if no lookup was successful
  • Allow to choose and exclude providers (Provider preferences)
  • Allow providers to return multiple releases, i.e. different variants (e.g. for Bandcamp)
  • Cache management: https://github.com/kellnerd/snap_storage
    • Invalidation strategy: FIFO/LRU/TTL? Maximum age (optional)
    • In memory and/or long-time cache? JSON files with compression or Redis?
    • Cache multiple versions with timestamps (daily? only if there have been changes?)
    • Let the requester know how old the data is and whether it is from the cache
    • Permalinks to specific cached version (include GTIN, enabled providers, optional additional URLs or ProviderName=ProviderId pairs)
  • Optimize lookups (perform no GTIN lookup if ID was already looked up)
    • These repeated lookups also skew the calculated processing time for the initial provider (e.g. Deezer track requests are now cached)
  • Use as few requests as possible (only make additional API calls for a provider if data is missing, e.g. iTunes regions or Deezer ISRCs)
  • Lookup by metadata (label and catno, title, artist, track count etc.) for providers without GTIN
  • Create provider feature categories (e.g. streaming, physical, with GTIN/ISRC, GTIN lookup, scraper, audio drama, Japanese etc.)
  • Lookup the entire discography of a given artist/label
  • Make MusicBrainz base url configurable (environment variable)
  • Deduplicate lookup ReleaseOptions.regions option by using an ordered set
  • Manage lookup state: Each provider "Example" is split into two classes ExampleProvider and ExampleReleaseLookup, where ExampleReleaseLookup has a (readonly) property provider
    • Splits general request logic and release processing logic
    • Possible to store release lookup state as class properties
    • Separation of unrelated tasks once we add artist/label lookups later, e.g. as ExampleArtistLookup and ExampleLabelLookup
  • Warn that available regions may not be accurate before the release date has passed (anywhere on earth, UTC-12)
  • Extract provider URLs from link shortener pages
  • Extract provider IDs and GTIN from a-tisket URLs
  • Write more test cases...
  • Preserve URL blurb (for Beatport)
  • Improve logging of AggregateErrors, they make it a PITA to find the real issue

Web Interface

  • Display header with logo and description
    • Harmony: Music Metadata Aggregator and MusicBrainz Importer/Seeder
    • Design banner logo and icon
  • Display footer with version, repo URL and support URL (environment variables DENO_DEPLOYMENT_ID, REPO_BASE_URL, optional COMMIT_BASE_URL, SUPPORT_URL)
  • Add OpenGraph meta tags
  • Allow to choose and exclude providers (persistent provider checkboxes)
  • Persist preferred regions input
  • Show provider and alternative values for interesting properties
    • Improve track length comparison, Deezer truncates instead of rounding
  • Settings page/section with persisted checkboxes
  • Multiple URL inputs (dynamic form)
  • Provider URL detection on the frontend (URLPattern polyfill for Firefox and Safari? https://caniuse.com/mdn-api_urlpattern)
  • CSS
  • Provider icons (external links or data URIs? inline TSX SVG? SVG sprite built with TSX)
  • Post-submission route/page ("release actions"):
    • ISRC submission (kepstin/tatsumo/custom?)
    • Artwork (ECAU)
    • External links (for artists, maybe for labels?)
  • Dynamic region list display: count, compact flags, detailed list
  • Group regions by continent
  • Serve documentation, written in Markdown
  • Use HTTPS by passing key and cert options to start(); support the X-Forwarded-Proto proxy header
  • Trim GTIN input to avoid unnecessary errors

Tidal: support video releases

Tidal also provides videos as separate entities. They come with title, cover image, duration, release date, ISRC and copyright info. They seem to be well suited to be added as releases on their own.

Examples:

API provides the /videos/{id} endpoint, see https://developer.tidal.com/reference/web-api?spec=catalogue&ref=get-video .

Example response for https://tidal.com/browse/video/358461354

{
  "resource": {
    "artifactType": "video",
    "id": "358461354",
    "title": "My Boy Only Breaks His Favorite Toys (Lyric Video)",
    "image": [
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/1024x256.jpg",
        "width": 1024,
        "height": 256
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/1080x720.jpg",
        "width": 1080,
        "height": 720
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/160x107.jpg",
        "width": 160,
        "height": 107
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/160x160.jpg",
        "width": 160,
        "height": 160
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/320x214.jpg",
        "width": 320,
        "height": 214
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/320x320.jpg",
        "width": 320,
        "height": 320
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/480x480.jpg",
        "width": 480,
        "height": 480
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/640x428.jpg",
        "width": 640,
        "height": 428
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/750x500.jpg",
        "width": 750,
        "height": 500
      },
      {
        "url": "https://resources.tidal.com/images/931df7cf/57ce/47f8/9a6e/c7cea3e19287/750x750.jpg",
        "width": 750,
        "height": 750
      }
    ],
    "releaseDate": "2024-04-19",
    "artists": [
      {
        "id": "3557299",
        "name": "Taylor Swift",
        "picture": [
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/1024x256.jpg",
            "width": 1024,
            "height": 256
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/1080x720.jpg",
            "width": 1080,
            "height": 720
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/160x107.jpg",
            "width": 160,
            "height": 107
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/160x160.jpg",
            "width": 160,
            "height": 160
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/320x214.jpg",
            "width": 320,
            "height": 214
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/320x320.jpg",
            "width": 320,
            "height": 320
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/480x480.jpg",
            "width": 480,
            "height": 480
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/640x428.jpg",
            "width": 640,
            "height": 428
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/750x500.jpg",
            "width": 750,
            "height": 500
          },
          {
            "url": "https://resources.tidal.com/images/03a7ff5b/e309/4c66/9df7/d469d8049c3d/750x750.jpg",
            "width": 750,
            "height": 750
          }
        ],
        "main": true
      }
    ],
    "duration": 208,
    "trackNumber": 0,
    "volumeNumber": 0,
    "isrc": "USUMV2400558",
    "copyright": "© 2024 Taylor Swift",
    "properties": {},
    "tidalUrl": "https://tidal.com/browse/video/358461354"
  }
}

Artist link apple/itunes. Difference?

Starting with https://www.deezer.com/fr/album/10882160 and harmony gives
https://music.apple.com/gb/artist/505840851

This leads to MB not autodetecting the service:
(Screenshot from 2024-06-09 14-46-11)

Correct for autodetection would be https://itunes.apple.com/gb/artist/id505840851
I don't know if these are two separate services or just URL redundancy for iTunes. If it's the same service, changing the URL which Harmony outputs should easily fix it, or is there some technical reason against it?

For now I'll stick with the itunes link :)
https://musicbrainz.org/artist/2e21383f-f71e-4367-bfa8-5a02c74643a8
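If both URL forms really point to the same artist page (which is the open question here), the conversion itself would be a small rewrite. A sketch under that assumption, with a made-up function name:

```typescript
/**
 * Converts a music.apple.com artist URL into the itunes.apple.com form shown above.
 * Returns undefined if the input does not look like an Apple Music artist URL.
 */
function toItunesArtistUrl(appleMusicUrl: string): string | undefined {
  // Optional slug segment and optional "id" prefix before the numeric artist ID.
  const pattern = /^https:\/\/music\.apple\.com\/([a-z]{2})\/artist\/(?:[^/]+\/)?(?:id)?(\d+)$/i;
  const match = appleMusicUrl.match(pattern);
  if (!match) return undefined;
  const region = match[1].toLowerCase();
  const id = match[2];
  return `https://itunes.apple.com/${region}/artist/id${id}`;
}
```

For the example from this report, the input https://music.apple.com/gb/artist/505840851 yields https://itunes.apple.com/gb/artist/id505840851.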

Handle incomplete releases

For some releases (pre-releases?), Tidal's API does not return all tracks:

The missing tracks are not shown on tidal.com/browse/album pages at all, on listen.tidal.com pages they are displayed greyed out.

Since the API returns at least the correct track count we could try to fill the tracklist (for single medium releases) with [unknown] tracks to allow for these releases being combined with other sources which have the track titles and lengths.
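The padding idea could be sketched as follows. The Track shape is a simplified stand-in for whatever the provider produces, and the sketch assumes that the tracks which are returned carry correct track numbers.

```typescript
interface Track {
  number: number;
  title: string;
  length?: number;
}

/** Pads a single-medium tracklist with placeholder tracks up to the reported track count. */
function fillMissingTracks(tracks: Track[], trackCount: number): Track[] {
  const byNumber = new Map<number, Track>();
  for (const track of tracks) byNumber.set(track.number, track);

  const filled: Track[] = [];
  for (let number = 1; number <= trackCount; number++) {
    // Missing positions get an [unknown] title and no length.
    filled.push(byNumber.get(number) ?? { number, title: "[unknown]" });
  }
  return filled;
}
```

The placeholders have no length, so a later merge could take titles and lengths from another source without having to resolve a conflict.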

add a button to clear the release lookup fields

when I'm adding multiple releases, I find it easiest to have my importer in one window and the artists' page in another, so I can just click and drag a link when moving to the next release. with how Harmony currently works, I've got to highlight the whole field and backspace before I can do this

an alternate option would be to clear the provider and GTIN fields at the top after looking up a release, but there might be a reason to show that even after the lookup. perhaps a second "new lookup" set of fields could work too? I'm up for any solutions~

Some digital releases reuse the physical release's GTIN

Originally reported on the forums:

It seems for Bandcamp Harmony is lacking the check if a barcode is used for another edition like the userscript does:

https://harmony.pulsewidth.org.uk/release?bandcamp=consvmer%2Fseelenfrieden&ts=1718342804

According to the listing at Apple Music it should be 3617389461901

I would say this is a data error and it should be sufficient to unset the digital release GTIN only in case of a reused GTIN. If it is different from all physical release GTINs on the Bandcamp page (or when there are no physical packages) it should still be fine to use it.
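That rule is small enough to sketch. It assumes the Bandcamp provider already knows the GTINs of all physical packages on the page; both function names are hypothetical.

```typescript
/** Compares GTINs while ignoring leading zeros, so zero-padded variants count as equal. */
function sameGtin(a: string, b: string): boolean {
  return a.replace(/^0+/, "") === b.replace(/^0+/, "");
}

/** Returns the digital GTIN unless one of the physical packages already uses it. */
function digitalGtin(digital: string | undefined, physical: string[]): string | undefined {
  if (digital === undefined) return undefined;
  const reused = physical.some((gtin) => sameGtin(gtin, digital));
  return reused ? undefined : digital;
}
```

When there are no physical packages the list is empty and the digital GTIN is kept as it is.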

Select the release with the matching GTIN if iTunes API returns multiple

https://harmony.pulsewidth.org.uk/release?gtin=197875266348&itunes=&region=GB&ts=1717477988

iTunes: The API also returned 1 other result, which was skipped: https://music.apple.com/gb/album/1702051779

The other result would have been the correct one with GTIN 197875266348.

iTunes: Extracted GTIN 197985529395 (from artwork URL) does not match the looked up value 197875266348

In this case, both image URLs contain the corresponding barcode, but this is not always the case unfortunately:

https://harmony.pulsewidth.org.uk/release?gtin=882951718827&itunes=&region=GB&ts=1717495471

iTunes: The API also returned 1 other result, which was skipped: https://music.apple.com/gb/album/600624295

That would've been the correct result 🫤

Another example where GTIN would help: https://harmony.pulsewidth.org.uk/release?gtin=822603266801&itunes=&region=GB&ts=1717435544
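A possible selection step, sketched under the assumption that a GTIN could be extracted for at least some of the returned results (e.g. from the artwork URL, which as noted above does not always work). LookupResult and pickResult are illustrative names.

```typescript
interface LookupResult {
  url: string;
  /** GTIN extracted for this result, if any. */
  gtin?: string;
}

/** Strips leading zeros so that differently padded spellings of a GTIN compare equal. */
function normalizeGtin(gtin: string): string {
  return gtin.replace(/^0+/, "");
}

/**
 * Prefers the result whose GTIN matches the looked up value.
 * Falls back to the first result if no result has a matching GTIN.
 */
function pickResult(results: LookupResult[], lookedUpGtin: string): LookupResult | undefined {
  const wanted = normalizeGtin(lookedUpGtin);
  const match = results.find((result) =>
    result.gtin !== undefined && normalizeGtin(result.gtin) === wanted
  );
  return match ?? results[0];
}
```

The fallback keeps the current behavior for responses without any usable GTIN, so a warning about skipped results would still be needed in that case.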

seed artist URLs to MB artist

one feature I miss from a-tisket is how it can seed an edit to add artist URLs from the services it supports to the MusicBrainz artist

No support for geo.music.apple.com links

When trying to put a geo.music.apple.com link, Harmony displays an error:
No provider supports https://geo.music.apple.com/XX/album/_/1234567890?mt=1&app=music&ls=1&at=1000lHKX
where XX is region code (e.g. US), and 1234567890 is the album's ID
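One way to support these links might be to rewrite them before the provider lookup. The sketch below assumes that the numeric ID in a geo link is the same album ID which regular music.apple.com URLs use, which has not been verified here; the function name is made up.

```typescript
/**
 * Rewrites a geo.music.apple.com album link to a plain music.apple.com album URL.
 * Returns undefined for anything that does not look like such a link.
 */
function normalizeGeoAppleUrl(input: string): string | undefined {
  const url = new URL(input);
  if (url.hostname !== "geo.music.apple.com") return undefined;
  // Expected path: /{region}/album/{slug}/{id}; the query string is dropped.
  const match = url.pathname.match(/^\/([a-z]{2})\/album\/[^/]+\/(\d+)$/i);
  if (!match) return undefined;
  const region = match[1].toLowerCase();
  const id = match[2];
  return `https://music.apple.com/${region}/album/${id}`;
}
```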

better handling of feat. artists

featured artists are handled very inconsistently across the various platforms, with Spotify removing feats and putting them in the artist field, Deezer keeping feat in the title and the artist field, and Apple Music only keeping feats in the track title. I think if a service has a featured artist, this should be reflected in the harmonized data, both on the track level and potentially on the release level (if all tracks have the same feat, especially for singles)

here's a decent cross section of the variants on this release

image

Support release group types

It would be good if providers could set the primary type and if this would be seeded when submitting to MB.

Not all providers will support this, but it is sometimes possible to at least detect singles and EPs. If in doubt a provider should likely keep this field empty.

Some notes on specific implementations:

  1. iTunes: No specific support, but the suffixes - Single and - EP seem to be commonly added to singles / EPs. These should be stripped (see #9) and then can be used for seeding the primary type as well. a-tisket does this.
  2. Spotify: Releases have the field album_type, which is one of album, single or compilation. Maybe it is too broad to use the album type (better leave it empty and have the user decide), but single and compilation should be fine to use.
  3. Bandcamp: At least standalone tracks could be detected as "Single".
  4. Tidal: no types specified
  5. Guessing the release type from title might work in many cases. Most of the MB submission user scripts do this, see https://github.com/murdos/musicbrainz-userscripts/blob/master/lib/mbimport.js#L302-L325
  6. For any provider making use of the MusicAlbum schema there is a MusicAlbumReleaseType. Theoretically this supports the types AlbumRelease, BroadcastRelease, EPRelease and SingleRelease. But e.g. Bandcamp does not make full use of this and seems to use AlbumRelease generally, except for standalone tracks it uses SingleRelease.

Generally it seems that if specific types, in particular single or EP, are detectable, this could be seeded. In most cases a source type of "album", if given, might be too unspecific and better kept out.

In the release editor the primary type can be seeded using the field type.
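For the iTunes case from point 1, stripping the suffix and deriving the type can happen in one step. A minimal sketch; the type and function names are illustrative, and only Single and EP are detected so that the field stays empty when in doubt.

```typescript
type PrimaryType = "Single" | "EP";

interface TitleAndType {
  title: string;
  type?: PrimaryType;
}

/** Strips a trailing " - Single" or " - EP" and reports the implied primary type. */
function splitItunesTitle(rawTitle: string): TitleAndType {
  const match = rawTitle.match(/^(.*) - (Single|EP)$/);
  if (!match) return { title: rawTitle }; // no suffix: leave the type empty
  return { title: match[1], type: match[2] as PrimaryType };
}
```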

Spotify provider

Implement a Spotify provider based on the Spotify Web API.

Implementation notes:

  • General API access should be very similar to the Tidal provider, including the client credentials auth flow.
  • Individual regions don't seem to be queried separately. Instead the API when queried without a "market" set returns a list of all markets the release is available on.
  • Primary type (#15) could be supported, at least for single and compilation types.
  • Fetching the full track list for an album can involve multiple calls, the initial result from the album request only contains the first page of tracks.
  • ISRCs are available, but seem to require a separate call, as the track info which is returned as part of the album excludes this data.
  • Pad GTIN with zeros if no results are found (example). a-tisket already does this. See also #6
  • Spotify has a concept of Track Relinking, where tracks not being available in a specific market get swapped out with a similar track that is available. Not sure about the implications, we'll need some examples for this. Might be that if the data gets queried without region that the relinking is not indicated. If this can be detected it would at least be good to show a warning.
  • Copyright information gets returned with separate entries for © and ℗. Because this is clearly separated, the entries do not always contain the corresponding symbol. If we just import the text, the entries cannot be distinguished. The provider should add the symbols based on type if not present in the given text.
  • Similar to Deezer label info is a single text that sometimes contains multiple labels separated by /.
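The symbol handling from the copyright note above could be sketched like this. It relies on the copyright objects of the Spotify Web API, which carry a text and a type of "C" or "P"; the function name and the handling of an ASCII "(C)"/"(P)" prefix are assumptions.

```typescript
interface SpotifyCopyright {
  text: string;
  /** "C" = copyright, "P" = sound recording (performance) copyright. */
  type: "C" | "P";
}

const SYMBOLS = { C: "©", P: "℗" } as const;

/** Prefixes the text with © or ℗ (based on type) unless it already contains that symbol. */
function formatCopyright({ text, type }: SpotifyCopyright): string {
  const symbol = SYMBOLS[type];
  // Drop an ASCII spelling like "(C)" of the same type before adding the real symbol.
  const asciiPrefix = new RegExp("^\\(" + type + "\\)\\s*", "i");
  const cleaned = text.trim().replace(asciiPrefix, "");
  return cleaned.indexOf(symbol) !== -1 ? cleaned : `${symbol} ${cleaned}`;
}
```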

Related to #5

Providers using an OAuth token should try to refresh the token on 401 responses

Providers using OAuth tokens (currently Tidal and Spotify) persist the token for the token lifetime, then do a refresh. This usually works fine. But should the token become invalid on the server side for any reason, this will block all requests until the currently stored token has expired.

It would be better if the providers attempted to refresh the token when they get a 401 Unauthorized response and retried the current request once. Only if the retry also fails with the new token should the error be raised.
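A rough sketch of that retry logic, independent of any concrete provider. TokenClient, Requester and HttpResponse are hypothetical names, and the way a token is fetched and a request is sent is injected so that the sketch does not assume anything about the existing provider code.

```typescript
interface HttpResponse {
  status: number;
}

/** Performs the actual request with the given access token (hypothetical signature). */
type Requester = (token: string) => Promise<HttpResponse>;

class TokenClient {
  private token: string | undefined;
  private fetchToken: () => Promise<string>;

  constructor(fetchToken: () => Promise<string>) {
    this.fetchToken = fetchToken;
  }

  /** Sends the request and retries it once with a fresh token after a 401 response. */
  async request(send: Requester): Promise<HttpResponse> {
    if (this.token === undefined) this.token = await this.fetchToken();
    let response = await send(this.token);
    if (response.status === 401) {
      // The stored token may have been invalidated on the server side.
      this.token = await this.fetchToken();
      response = await send(this.token);
      if (response.status === 401) {
        throw new Error("Unauthorized, even with a freshly requested token");
      }
    }
    return response;
  }
}
```

Retrying only once avoids an endless loop in case the credentials themselves are rejected.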

SoundCloud

I know it's mentioned in #5, but I figured I'd start up a ticket with a link to the API docs at least~

https://developers.soundcloud.com/docs

a couple notes about SoundCloud:

  • some title cleanup might be needed? especially stuff like "[FREE DL IN DESCRIPTION]", which is quite common
  • GTIN and ISRC are optional metadata
  • playlists and albums are both implemented as "sets", and perhaps both should be importable? I know a lot of artists don't properly set the type for what's pretty clearly an album (even Taylor Swift has some such examples)
  • Creative Commons Licenses should be detected and added when present
  • track downloads are optional and should be detected (sometimes with a Buy link that goes to a file hosting service like MediaFire or MEGA, or with a DL LINK IN DESCRIPTION)

YouTube

continuing from discussion here.

so, after a very brief search, it seems there's no official YouTube Music API, only one for YouTube (and a few unofficial ones for YouTube Music)

a few items to be aware of specific to YouTube with examples where applicable:

  • metadata in the description is not at all standardized, perhaps save for distributed content (i.e. from distributors like DistroKid). this is probably only an issue if we want to eventually add relationships with Harmony (once that's possible of course)
  • a single video can have different titles, descriptions, and possibly different audio tracks per region
  • often an album will be released as a single video with chapter markers for each track (I believe these come from the video description)
  • video titles are not standardized, sometimes containing the artist name or [official video] or other nonsense, which maybe should or shouldn't be removed from MusicBrainz submissions? could probably add release disambiguations based on these (such as Visualizer, Lyric Video, Music Video, etc.)
  • there are categories for YouTube videos (including Music and Entertainment, to name a couple). I don't know if those would be important to use, as some "music" releases might be non-music (i.e. about Music Production or about Music), and some music videos might not be categorized as such (as well as some podcasts and other items people might want to import). I don't think these categories are visible on the video pages, but might be in the API
