adrigrillo / music_kg

Music specific knowledge graph using MusicBrainz database

Shell 18.33% PLpgSQL 81.67%
knowledge-graph music open-data semantic-web interlinking musicbrainz dbpedia wikidata

music_kg's People

Contributors

adrigrillo

music_kg's Issues

Areas graph part

Issue to discuss the mapping of areas in the graph. This is a sub-issue of #1.

Tables involved

  • area
  • area_alias: stores alternate names or misspellings.
  • iso_3166_1: contains the country codes. It can be useful for interlinking.

Possible entities

The most important division in the database is the country, which is also related to the table iso_3166_1. Country should therefore be an entity that acts as the parent of the other areas; that is, country will be the root of the area entities and the rest will have it as their origin.

The other types of areas that exist in the database are:

  • Subdivision is used for the main administrative divisions of a country, e.g. California, Ontario, Okinawa.
  • County is used for smaller administrative divisions of a country which are not the main administrative divisions but are also not municipalities, e.g. counties in the USA.
  • Municipality is used for small administrative divisions which, for urban municipalities, often contain a single city and a few surrounding villages. Rural municipalities typically group several villages.
  • City is used for settlements of any size, including towns and villages.
  • District is used for a division of a large city, e.g. Queens.
  • Island is used for islands and atolls which don't form subdivisions of their own, e.g. Skye. These are not considered when displaying the parent areas for a given area.

These types of areas form a hierarchical relation, country -> subdivision -> county -> municipality -> city -> district -> island, and their relations are stored in the l_area_area table. The relations between these entities will first be mapped from the smaller area to the larger one: the smaller area indicates that it is part of the bigger entity with the term dcterms:isPartOf.
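
The mapping from the smaller area to the larger one can be illustrated with a minimal Python sketch. It assumes that each row taken from l_area_area can be reduced to a (parent area, child area) pair; the base IRI, the function name and the sample rows are hypothetical and are not taken from the repository.

```python
# Full IRI of dcterms:isPartOf (Dublin Core terms namespace).
DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"

# Hypothetical base IRI for area entities; the real graph may use another one.
AREA_BASE = "http://example.org/music_kg/area/"


def area_part_of_triples(links):
    """Build one N-Triples line per (parent_id, child_id) pair.

    Each line states that the smaller (child) area is part of the
    larger (parent) area.
    """
    return [
        f"<{AREA_BASE}{child_id}> <{DCTERMS_IS_PART_OF}> <{AREA_BASE}{parent_id}> ."
        for parent_id, child_id in links
    ]


# Made-up rows standing in for l_area_area: (parent area id, child area id).
sample_links = [(1, 10), (10, 100)]
area_triples = area_part_of_triples(sample_links)
for line in area_triples:
    print(line)
```

Pointing the triple from the child to the parent keeps a single outgoing link per area, so the full chain up to the country can be recovered by following dcterms:isPartOf repeatedly.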

Artist graph part

Issue to discuss the mapping of the artists in the graph. This is a sub-issue of #1.

General overview

The artist part of the graph is the most complex of the three because it contains different categories of artists that are interrelated; for example, an artist can have an entry of their own and also be a member of different groups. According to the definition of an artist in the Music Ontology, it is the band or group entity that has to reference the artists that form part of it.
Moreover, the Music Ontology does not contain terms or properties related to the location of the artist, so another vocabulary will have to be used.
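
The direction of the group-to-member reference can be sketched in Python as follows. The sketch assumes that the Music Ontology property mo:member is the term used for this link and that the membership rows can be reduced to (member, group) pairs; the base IRI, the function name and the sample identifiers are hypothetical.

```python
# Assumed property: mo:member, linking a group to one of its members.
MO_MEMBER = "http://purl.org/ontology/mo/member"

# Hypothetical base IRI for artist entities.
ARTIST_BASE = "http://example.org/music_kg/artist/"


def membership_triples(memberships):
    """Build one N-Triples line per (member_id, group_id) pair.

    The group is the subject and the member is the object, so the band
    is the entity that references the artists that form part of it.
    """
    return [
        f"<{ARTIST_BASE}{group_id}> <{MO_MEMBER}> <{ARTIST_BASE}{member_id}> ."
        for member_id, group_id in memberships
    ]


# Made-up rows: artist 7 belongs to groups 20 and 21.
sample_memberships = [(7, 20), (7, 21)]
member_triples = membership_triples(sample_memberships)
for line in member_triples:
    print(line)
```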

Tables involved

  • artist
  • l_artist_artist
  • artist_type

Genres graph part

Issue to discuss the mapping of the genres in the graph. Is a sub-issue of #1.

General overview

MusicBrainz hard-codes the genres on the server side of the application and provides a list of them here, inside the object tag. In the database, they are stored in the table tag, mixed with the annotations that users have made. It is therefore impossible to generate the graph directly from that table, as it would include elements that are not genres.

On the other hand, DBpedia contains a list of 1250 musical genres that can be queried with the following SPARQL statement:

PREFIX dbo: <http://dbpedia.org/ontology/>

# Select every resource typed as dbo:Genre
SELECT DISTINCT ?genre
WHERE {
  ?genre a dbo:Genre .
}

However, the genres in the database may not map 1:1 to the DBpedia genre list.

Relation between artists and genres

As explained in #4, the music genres are stored in the artist_tag table without any distinction from the user-made tags, so some processing has to be done.

Idea overview

In general, connecting the artists with the established genres follows a process similar to the one used for the aliases of artists and areas, but it requires some post-processing to remove the user-made tags. The steps to follow will be:

  1. Generate the link between each artist and its tags.
  2. In SPARQL, use an inner join to keep only the tags that match the defined genres.
  3. Link the genres with the ones in DBpedia to enrich the information about the genres.
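
The filtering in step 2 is planned in SPARQL; the same inner-join logic can be illustrated with a short Python sketch instead. It assumes that tags and genres are compared by name, ignoring case; the function name and the sample data are hypothetical.

```python
def genres_for_artists(artist_tags, genres):
    """Keep only the (artist, tag) pairs whose tag is a known genre.

    artist_tags: iterable of (artist, tag) pairs, as in step 1.
    genres: iterable of established genre names.
    Returns a dict mapping each artist to the list of matching tags.
    """
    known = {genre.casefold() for genre in genres}
    linked = {}
    for artist, tag in artist_tags:
        # Case-insensitive comparison is an assumption of this sketch.
        if tag.casefold() in known:
            linked.setdefault(artist, []).append(tag)
    return linked


# Made-up data: two artists with a mix of genre tags and user-made tags.
sample_tags = [
    ("artist_a", "rock"),
    ("artist_a", "seen live"),
    ("artist_b", "Jazz"),
    ("artist_b", "favourite"),
]
sample_genres = ["rock", "jazz"]
linked_genres = genres_for_artists(sample_tags, sample_genres)
print(linked_genres)
```

Artists whose tags match no genre are simply absent from the result, which is the behaviour an inner join would give.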

Completion of the knowledge graph with DBpedia and Wikidata

The objective is to add more information to the knowledge graph, using Wikidata and DBpedia as sources. This issue comprises the following tasks:

  • Generating new genres with Wikidata and DBpedia.
  • Generating any connection between the existing artists and the new genres.

Link the KG with DBpedia and Wikidata

To link the created knowledge graph with DBpedia and Wikidata, one of the following two tools will be used:

Subtasks

  • Study the advantages and disadvantages of the tools for this task.
  • Examine DBpedia dictionary and entities.
  • Examine Wikidata dictionary and entities.
  • Link with DBpedia.
  • Link with Wikidata.
  • Review the links that do not reach the certainty threshold.
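
The idea of accepting some links automatically and sending others to manual review can be sketched in Python. This is not the configuration of either linking tool: the similarity measure (difflib from the standard library), the two threshold values and the function name are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: links at or above ACCEPT are kept automatically,
# links between REVIEW and ACCEPT are sent to manual review.
ACCEPT = 0.95
REVIEW = 0.80


def classify_link(label_a, label_b):
    """Classify a candidate link as 'accept', 'review' or 'reject'.

    The score is the character-level similarity ratio of the two labels,
    compared without regard to case.
    """
    score = SequenceMatcher(None, label_a.casefold(), label_b.casefold()).ratio()
    if score >= ACCEPT:
        return "accept"
    if score >= REVIEW:
        return "review"
    return "reject"


# Made-up candidate pairs: (label in the created graph, label in the other source).
candidates = [("Rock", "rock"), ("hip hop", "hip-hop"), ("rock", "jazz")]
for ours, theirs in candidates:
    print(ours, theirs, classify_link(ours, theirs))
```

Only the pairs classified as "review" would need to be checked by hand in the last subtask.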

Linking the genres with DBpedia and Wikidata

This is a sub-issue of #2 that handles the linking of the genres.

Overview

The number of genre instances in each knowledge graph is:

  • MusicBrainz: 419 instances.
  • DBpedia: 1245 instances.
  • Wikidata: 1568 instances.

As can be seen, the number of genres available in the other sources is much larger than the number established in MusicBrainz. However, this could mean that the genres included in the created graph are quite general, so there could be a large number of matches with the other two graphs.

  • Use LIMES to link with DBpedia.
  • Use LIMES to link with Wikidata.
  • Review the tuples flagged for manual review.
  • Add the tuples to the graph.
