
linkml / linkml-model


Linked Data Modeling Language (LinkML) model

Home Page: https://linkml.github.io/linkml-model/docs/

Makefile 1.45% Shell 0.31% Python 97.79% Dockerfile 0.16% HTML 0.30%
linkml semantic-web json json-schema graph-ql metamodel yaml data-integration metadata uml

linkml-model's Introduction


LinkML - Linked Data Modeling Language

LinkML is a linked data modeling language following object-oriented and ontological principles. LinkML models are typically authored in YAML, and can be converted to other schema representation formats such as JSON or RDF.
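
For example, a minimal LinkML schema authored in YAML might look like the following (the schema id, prefixes, and class here are invented for illustration):

id: https://example.org/person-schema
name: person-schema
prefixes:
  linkml: https://w3id.org/linkml/
  ex: https://example.org/
default_prefix: ex
imports:
  - linkml:types
default_range: string

classes:
  Person:
    attributes:
      id:
        identifier: true
      name:
      age:
        range: integer

Generators in the LinkML toolchain can then convert a schema like this into JSON Schema, RDF/OWL, and other representations.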

This repo holds the LinkML schema (metamodel) itself. For the tools that generate artifacts from and otherwise work with LinkML schemas, please see https://github.com/linkml/linkml

The complete documentation for LinkML can be found at the home page listed above: https://linkml.github.io/linkml-model/docs/

linkml-model's People

Contributors

actions-user, cmungall, dalito, deepakunni3, hsolbrig, joeflack4, melonora, nicholsn, nlharris, pkalita-lbl, rly, sierra-moxon, sneakers-the-rat, sujaypatil96, turbomam, vincentvialard, yarikoptic


linkml-model's Issues

Automate deployment of docs

As part of the migration to poetry, we switched off auto-deployment of the docs.

This should be brought back; see the old workflow:

https://github.com/linkml/linkml-model/blob/813d73abaf1271a6fddf22556270549bd33ac408/.github/workflows/main.yaml

IMPORTANT: check with @cmungall before merging any PR here. The way this site works is different from other schema sites, e.g.:

wget -vvv https://w3id.org/linkml/meta.yaml
...
Location: https://linkml.github.io/linkml-model/linkml_model/model/schema/meta.yaml [following]
wget -vvv https://w3id.org/linkml/meta.owl
...
Location: https://linkml.github.io/linkml-model/linkml_model/owl/meta.owl.ttl [following]
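
For reference, a minimal sketch of what a restored auto-deployment workflow could look like, assuming the docs are built into ./docs by a Makefile target (gen-docs is a hypothetical name; the old workflow linked above is the source of truth for the actual build steps) and published with the peaceiris/actions-gh-pages action:

name: Deploy docs
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.8"
      - name: Install dependencies
        run: |
          pip install poetry
          poetry install
      - name: Build docs
        run: make gen-docs        # hypothetical target; mirror whatever the old workflow ran
      - name: Publish to gh-pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs

Whatever is restored must keep publishing to the same paths, since the w3id.org redirects shown above depend on that layout.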

Define semantics for referential integrity

A non-inlined class range may be intended as:

  • a reference to an object in the same document, database, or graph
  • a reference to an object in an external document, database, or graph

For external references, the mechanism for dereferencing will vary depending on the substrate:

  • for RDF, the reference MUST be a URI. If the URI is a URL then the URL may be dereferenceable by content negotiation (conneg)
  • for JSON/YAML documents, there would need to be a bridge from URIs to e.g. JSON Pointer
  • for SQL there would need to be a way to bind URIs to other databases

In all cases some kind of configuration would need to be supplied (see the sketch at the end of this issue).

Additionally, the interpretation may be one of:

  • the reference MUST be present
  • the reference SHOULD be present

Furthermore, the semantics of deleting could be one of:

  • no action
  • cascading delete
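
As a concrete sketch of what such configuration might look like, the following hypothetical schema fragment uses the existing annotations metaslot to carry the semantics listed above; the annotation names (reference_scope, reference_integrity, on_delete) are invented for illustration and are not part of the current metamodel:

classes:
  Person:
    attributes:
      id:
        identifier: true
      knows:
        range: Person
        inlined: false                  # values are references, not nested objects
        annotations:
          reference_scope: external     # same document vs. external document/database/graph
          reference_integrity: MUST     # MUST vs. SHOULD be resolvable
          on_delete: cascade            # no action vs. cascading delete

Whether these belong as first-class metaslots or as annotations plus out-of-band configuration is exactly the design question raised here.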

`array: null` for any shaped arrays not allowed

I believe we arrived at a place where this would represent an array with any shape:

classes:
  MyClass:
    attributes:
      an_array:
        range: int
        array:

but currently the range of the array slot is a non-nullable array_expression:

range: array_expression

edit: wait, obviously you can make metamodel attributes optional too, my bad. I'll PR.

Not sure how to express this in the schema, my first instinct would be something like this:

any_of:
- range: array_expression
- range: null

but I'm not sure if that's valid.

Another option, which would make it possible to differentiate between an explicit null like that and an any-shaped array, might be to make the syntax

array: Any

which would make the metamodel more straightforward:

any_of:
- range: array_expression
- range: Anything

Attach inlined metaslot to both 'slot definition' and 'slot expression'

This ticket is related to the discussion here: linkml/linkml#664

"From it looks like indeed the inline metaslots were deliberately attached to slot_definition, but in future metamodel versions these become slot-expression slots, so they can be used in expressions as well as named slots"

We should change the domain of inlined metaslot such that inlined can be used in 'slot definition' as well as 'slot expression'.
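
For illustration, a hypothetical schema fragment showing the difference (class and slot names invented): inlined can be attached to a named slot (a slot_definition) today, and the proposed domain change would additionally allow it inside a slot expression such as an any_of branch:

classes:
  Container:
    attributes:
      persons:
        range: Person
        inlined: true          # allowed today: 'persons' is a slot_definition
        inlined_as_list: true
      entries:
        any_of:                # each any_of branch is a slot_expression
          - range: Person
            inlined: true      # the proposal would allow 'inlined' here too
          - range: string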

Need build artifacts in distribution

At the moment, the only files that we publish to PyPI are the contents of the linkml_model directory. We also need to include the graphql, json, jsonld, jsonschema, model, owl, rdf, and shex directories, so that all of these artifacts can be accessed locally by other packages.

We can't just add these to setup.cfg as packages because it would end up adding "json", "owl", etc. to the Python site-packages, so we need to build the following directory structure:

docs/
linkml_model/
     __init__.py   --> imports all the structures from python below
     graphql/
     json/
     jsonld/
     jsonschema/
     python/          --> everything from the linkml_model except the init file.  Add a new blank __init__ underneath:
        __init__.py
         annotations.py
         extensions.py
         linkml_files.py
        mappings.py
        meta.py
        README.md
        types.py
     owl/
     rdf/
     shex/
     model/
tests/
.gitignore
...

For the short term, it would be cool if a setup.py expert could map the existing structure into something like the above. In the longer term, we probably need to go to this source structure (?)

Runtime footprint needs to be reduced

The PyPI package currently requires the whole of biolinkml (soon to be linkml). While the whole package is needed to generate the output, the runtime portion should be cut down to the minimal libraries needed to support YAMLRoot and its relatives. This update should be done once we get linkml fully split into three components (model, runtime, and development packages).

Proposal for enhancement to "id_prefixes"

  1. id_prefixes currently says "the identifier of this class or slot must begin with one of the URIs referenced by this prefix".

This sort of implies that a prefix can reference more than one URI. I'm hoping that we are dealing with a model where every prefix maps to exactly one URI (note, however, that the reverse may not necessarily be true... I need to check whether we guarantee uniqueness of URIs per prefix)

  2. when it comes to actually validating data, I would think that the following:
classes:
    HighClass:
        id_prefixes:
            - NCIt
            - SCT

Would assert that a YAML or JSON representation of the id of an instance of HighClass would necessarily start with "NCIt:" or "SCT:", while an RDF instance would start with https://nci.....org/ncit/... or http://snomed.org/id/.

What I would propose, however, is that we extend the definition of id_prefixes to support the following:

classes:
   HighClass:
       id_prefixes:
          NCIt:
          SCT:

Which would be the same as the above. We would extend the definition slightly, to allow:

classes:
    HighClass:
       id_prefixes:
           NCIt: ^C\d{5,6}$
           SCT: ^\d{6,18}$

Which would assert that the local name of a CURIE or URI must begin with "C" and have 5 or 6 digits if it begins with NCIt, or must be a 6 to 18 digit number if it is SCT.
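
For example, against the patterns above, the first two of these illustrative identifiers would be accepted and the third rejected:

ids:
  - NCIt:C12345       # valid: local name matches ^C\d{5,6}$ for the NCIt prefix
  - SCT:123456789     # valid: local name matches ^\d{6,18}$ for the SCT prefix
  - NCIt:12345        # invalid: local name does not match the NCIt pattern
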

This would be a minimal change to the LinkML model itself, and, as of yet, the loaders do not do anything with id prefixes, so no loader changes would be needed.

Questions:

  1. Do we really need the "^...$" anchors or can we assume them?
  2. Would we ever want two or more patterns and, if so, would we want something of the form
    id_prefixes:
        NCIt:
          - C\d{5}
          - M\d{7}

or would "(C\d{5}|M\d{7})" be ok?

  3. It should be noted that SNOMED CT, in particular, includes a check digit and other formatting information that isn't expressible as a simple RE. Should we provide a hook for future use that names an algorithm, or just let it slide?

My suggested answers are: 1) Assume them, 2) single RE is fine and 3) nah - not now

Need to map a specific version identifier to a source

The current code in linkml_model/linkml_files.py maps a version identifier (e.g. "v0.0.1") to the SHA associated with the version. The SHA, however, doesn't give us the state of any file at that point in time, just the files that have changed. We need to add some code to linkml_files.py that allows us to go from:

GITHUB_PATH_FOR(Source.META, Format.NATIVE_JSONLD, "v0.0.1") to the version of jsonld/meta.model.context.jsonld at the point that the version tag was added.

Unit tests needed

A generic set of unit tests needs to be added to:

  1. Verify that all of the generated python actually works
  2. Verify that all of the other artifacts are syntactically correct.

With the exception of GraphQL and the docs directory, part 2 can be realized by using the JSON and RDF loaders once they are separated into the runtime package.

linkml_model/__init__.py needs relative paths

We believe that things will work a lot better if the __init__.py file has relative paths to the types, meta, etc. We say "believe" because we won't know whether there are other issues with relative paths until we start using it.

`types` doesn't work as a relative path in python

from extensions import Extension
from types import Boolean

imports Extension as expected. types, however, is also the name of a Python standard library module, and it appears that the interpreter gives precedence to built-in libraries, resulting in the following error:

Traceback (most recent call last):
ImportError: cannot import name 'Boolean' from 'types' (/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/types.py)

An obvious alternative would be to use relative paths:

from .extensions import Extension
from .types import Boolean

But this is known to have issues -- one needs to have established a base before this works.

The other alternative, absolute paths:

from linkml_model.extensions import Extension
from linkml_model.types import Boolean

But this would require that the base (linkml_model in this case) be passed as an argument to the generator -- a bit of a challenge.

Add testing for py37, py39, and py310

Right now the GitHub Actions are only running on py38. The workflow mixes together the code that runs testing and the code that does commits, so I don't want to mess that up by sending a PR that adds all of these in a strategy. Maybe there's a tricky way to use the if: entry to only run the commit steps on py38.
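
A sketch of how the matrix could be added while keeping the commit steps on py38 only (the install/test/commit commands below are placeholders for whatever the existing workflow actually runs):

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.7", "3.8", "3.9", "3.10"]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Run tests
        run: |
          pip install poetry
          poetry install
          poetry run pytest
      - name: Commit regenerated artifacts
        if: matrix.python-version == '3.8'     # only run the commit steps on py38
        run: echo "existing commit steps go here"   # placeholder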
