edamontology / edamontology

EDAM is an ontology of bioscientific data analysis and data management. EDAM concentrates on topics, operations, types of data, and data formats related to analysis, modelling, optimisation, and data life-cycle in biosciences, and in other research and science-based applications.

License: Creative Commons Attribution Share Alike 4.0 International

Shell 100.00%
edam ontology web-ontology-language identifier topic data operation format biohackeu20 data-analysis

edamontology's Introduction

DOI representing all stable versions, resolving to the latest: 10.5281/zenodo.822690

DOI of the latest stable EDAM version 1.25: 10.5281/zenodo.3899895

Current status of the 'main' development file: Build status

Latest documentation: Documentation Status

Twitter: @edamontology (follow).

What is EDAM?

EDAM is a comprehensive ontology of well-established, familiar concepts that are prevalent within computational biology, bioinformatics, and bioimage informatics. EDAM includes types of data and data identifiers, data formats, operations, and topics related to data analysis in life sciences. EDAM provides a set of concepts with preferred terms and synonyms, related terms, definitions, and other information - organised into a simple and intuitive hierarchy for convenient use (see figure).

EDAM is particularly suitable for semantic annotations and categorisation of diverse resources related to bioscientific data analysis: e.g. tools, workflows, or training materials. EDAM is also useful in data management, for recording provenance metadata of processed bioscientific data.

Viewing and downloading

EDAM can be browsed online at the NCBO BioPortal, at OLS, and in the EDAM Browser.

The latest unstable (development) version can be browsed and commented on at the NCBO BioPortal and WebProtégé (free registration required).

The latest stable version is always downloadable from http://edamontology.org/EDAM.owl | tsv | csv. For older versions, see http://edamontology.org/page#Download or /releases.

Figure: EDAM relations.

Documentation

Comprehensive documentation and guidelines are available via readthedocs (maintained here).

A quick overview is at the http://edamontology.org home page.

Citing EDAM

If you refer to EDAM or a part of it in a scholarly publication, please cite:

Melissa Black, Lucie Lamothe, Hager Eldakroury, Mads Kierkegaard, Ankita Priya, Anne Machinda, Uttam Singh Khanduja, Drashti Patoliya, Rashika Rathi, Tawah Peggy Che Nico, Gloria Umutesi, Claudia Blankenburg, Anita Op, Precious Chieke, Omodolapo Babatunde, Steve Laurie, Steffen Neumann, Veit Schwämmle, Ivan Kuzmin, Chris Hunter, Jonathan Karr, Jon Ison, Alban Gaignard, Bryan Brancotte, Hervé Ménager, Matúš Kalaš (2022). EDAM: the bioscientific data analysis ontology (update 2021) [version 1; not peer reviewed]. F1000Research, 11(ISCB Comm J): 1. Poster. 10.7490/f1000research.1118900.1 Open access

EDAM releases are citable with DOIs too, for cases when that is needed. 10.5281/zenodo.822690 represents all releases and resolves to the DOI of the last stable release.

Research notice

Please note that this repository is participating in a study into sustainability of open source projects. Data will be gathered about this repository for approximately the next 12 months, starting from June 2021.

Data collected will include number of contributors, number of PRs, time taken to close/merge these PRs, and issues closed.

For more information, please visit our informational page or download our participant information sheet.

License

FOSSA Status

edamontology's People

Contributors

albangaignard, ankitaxpriya, bgruening, bryan-brancotte, drashtipatoliya17, egonw, ellschi, fossabot, hmenager, joncison, kigaard, lucielamothe, lukong123, matuskalas, melibleq, mr-c, raashika03, tawahpeggy, uttam-singhh, vedran-kasalica, yochannah

edamontology's Issues

deprecated without consider nor replacedBy

operation_2238 and topic_3073 are tagged as deprecated but have neither a "consider" nor a "replacedBy" field.

topic_3073 is also tagged obsolete.

Either operation_2238 should have a "consider" or "replacedBy" field, or it should be tagged as "obsolete".
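
A check like this could be automated. The following is only a minimal sketch, not part of the EDAM tooling: it assumes the usual oboInOwl namespace IRI (the OWL excerpts in these issues only show the `&oboInOwl;` entity), and the inline `SAMPLE` is a made-up two-class fragment that mirrors the situation described above.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the standard oboInOwl namespace; verify against the real file.
OBOINOWL = "http://www.geneontology.org/formats/oboInOwl#"

# Hypothetical fragment for illustration only.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}" xmlns:oboInOwl="{OBOINOWL}">
  <owl:Class rdf:about="http://edamontology.org/topic_0100">
    <owl:deprecated>true</owl:deprecated>
    <oboInOwl:consider rdf:resource="http://edamontology.org/topic_0747"/>
  </owl:Class>
  <owl:Class rdf:about="http://edamontology.org/operation_2238">
    <owl:deprecated>true</owl:deprecated>
  </owl:Class>
</rdf:RDF>"""


def deprecated_without_pointer(root):
    """Return IRIs of deprecated classes lacking both consider and replacedBy."""
    missing = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        flag = cls.find(f"{{{OWL}}}deprecated")
        if flag is None or (flag.text or "").strip() != "true":
            continue  # not deprecated, nothing to check
        has_pointer = (
            cls.find(f"{{{OBOINOWL}}}consider") is not None
            or cls.find(f"{{{OBOINOWL}}}replacedBy") is not None
        )
        if not has_pointer:
            missing.append(cls.get(f"{{{RDF}}}about"))
    return missing


print(deprecated_without_pointer(ET.fromstring(SAMPLE)))
```

On the sample, only operation_2238 is reported, because topic_0100 carries a "consider" pointer.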

best regards

Eric

Annotation of visualisation tools: How to do? How should EDAM model the related concepts?

Examples:
https://bio.tools/tool/BioJS/msa/0.4.8
https://bio.tools/tool/RostLab/Aquaria/1

Scenario:

A visualisation tool consumes a concrete type of data (D), and visualises it.

Notes:

  • The output of this visualisation operation (VO) is the visualisation (V) itself.
  • This visualisation (V) is ontologically neither an operation nor a type of data.
  • The visualisation (V) is usually on a screen, but can also be e.g. just sent to a printer, or a file, or only generated into a computer memory representation.

Implication for EDAM:

  • So what is the output type of data?
  • And format?
  • Is there any output data at all?
  • Is the output data the same as input (V=D), just in a different, "graphical" format?
  • What is the format then? Especially if it's not directly a graphical image file format (e.g. PNG, SVG) or a full valid HTML record?
  • Is it just a binary blob?
  • But isn't then EVERYTHING digital a binary blob?

Bonus scenario:

The same tool as above can in addition to the visualisation "save"/"export" some data from the visualisation.

Notes and questions:

  • The input and its format would then be the same as the output of the visualisation (V) discussed above, I suppose.
  • But what are the operations here?
  • Perhaps some Data extraction, which is highly useful elsewhere, too (and is perhaps better than abusing Data retrieval, which stands for retrieving records from a database and not from other data).

Bonus bonus:

  • Data editing (especially "manual") can be another natural operation with a visualisation!
  • Visual analysis would be another (especially "manual") operation with a visualisation!

How to annotate these?!?

Easy reference document for EDAM

EDAM should provide a simple searchable text or HTML document that lists the label, identifier and description for each term in a simplified hierarchy.

EDAM.obo and EDAM.owl mismatch

While working on EDAM.owl and EDAM.obo, downloaded from http://edamontology.org/EDAM.owl and http://edamontology.org/EDAM.obo, I noticed some mismatches between the two files regarding the same term.

E.g. topic_0100 is tagged as deprecated in the OWL definition:

    <!-- http://edamontology.org/topic_0100 -->

    <owl:Class rdf:about="http://edamontology.org/topic_0100">
        <rdfs:label>Nucleic acid restriction</rdfs:label>
        <rdfs:subClassOf rdf:resource="&oboInOwl;ObsoleteClass"/>
        <obsolete_since>1.3</obsolete_since>
        <created_in>beta12orEarlier</created_in>
        <oboInOwl:hasDefinition>Topic for the study of restriction enzymes, their cleavage sites and the restriction of nucleic acids.</oboInOwl:hasDefinition>
        <owl:deprecated>true</owl:deprecated>
        <oboInOwl:consider rdf:resource="http://edamontology.org/topic_0747"/>
        <oboInOwl:inSubset rdf:resource="&oboOther;edam#obsolete"/>
    </owl:Class>

while in the OBO definition it is not:

    [Term]
    id: EDAM_topic:0100
    name: Nucleic acid restriction
    subset: bioinformatics
    subset: edam
    subset: topics
    created_in: "beta12orEarlier"
    def: "Topic for the study of restriction enzymes, their cleavage sites and the restriction of nucleic acids." [http://edamontology.org]
    namespace: topic
    is_a: EDAM_topic:0640 ! Nucleic acid sequence analysis
    is_a: EDAM_topic:0821 ! Enzymes and reactions

and so it does not have a "consider" field as in the OWL file.

NB: the same problem occurs with topic_0748, topic_0179 and operation_2448.
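
A comparison of this kind could be scripted roughly as follows. This is a hedged sketch: the parser handles only simple `tag: value` lines, the IRI-to-OBO-id conversion is inferred from the single example above (topic_0100 vs. EDAM_topic:0100), and it assumes that an obsolete term in the OBO file would carry the standard `is_obsolete: true` tag. All function names are hypothetical.

```python
# Shortened copy of the OBO stanza quoted above.
OBO_SAMPLE = """[Term]
id: EDAM_topic:0100
name: Nucleic acid restriction
namespace: topic
is_a: EDAM_topic:0640 ! Nucleic acid sequence analysis
is_a: EDAM_topic:0821 ! Enzymes and reactions
"""


def parse_obo_terms(text):
    """Parse [Term] stanzas into dicts mapping each tag to a list of values."""
    terms, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = {}
            terms.append(current)
        elif line.startswith("["):
            current = None  # skip other stanza types such as [Typedef]
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, []).append(value)
    return terms


def iri_to_obo_id(iri):
    """Convert e.g. http://edamontology.org/topic_0100 to EDAM_topic:0100."""
    local = iri.rsplit("/", 1)[1]
    branch, number = local.split("_", 1)
    return f"EDAM_{branch}:{number}"


def obo_mismatches(terms, owl_deprecated_ids):
    """Return OBO ids that are deprecated in OWL but not obsolete in OBO."""
    mismatched = []
    for term in terms:
        term_id = term.get("id", [""])[0]
        obsolete = term.get("is_obsolete", ["false"])[0] == "true"
        if term_id in owl_deprecated_ids and not obsolete:
            mismatched.append(term_id)
    return mismatched


deprecated_in_owl = {iri_to_obo_id("http://edamontology.org/topic_0100")}
print(obo_mismatches(parse_obo_terms(OBO_SAMPLE), deprecated_in_owl))
```

The set of deprecated OWL terms would in practice be collected from EDAM.owl; here it is hard-coded to the one term discussed.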

Eric

Add a File splitting operation term

As already discussed on the mailing list, quoting myself here:
"""
What I would like is a finer concept, a bit like
"File reformatting" but called "File splitting". Such tools can be very
helpful:
1- if a given program can only process one item
2- to parallelize the execution of a task, on a cluster (DRM or Hadoop
architectures...)
So I would like to create a child concept to File processing which would
be File "splitting".
Tell me your opinion,
"""

Gene functional annotation

Please add the following EDAM Operation, which will be useful for GoMapMan:

Analysis
    Annotation
        Sequence annotation
            Gene functional annotation

Add media types to format concepts

Add annotation property on "format" terms for mime-type, then add annotations to all relevant classes. Concentrate on formal IANA/IETF types first.

application/xml
text/csv
text/html
text/plain
text/xml

Convenient place I can pull the bioinformatics media types from, e.g.:
application/cellml+xml ??

Camille says ...
As you may remember, we (MIRIAM Registry) are keen on using the EDAM ontology.

I noticed the "Format" branch and I am wondering whether it would be possible to also record the media type [1] of each format (those small strings such as "text/turtle" or "application/cellml+xml")?

[1] http://en.wikipedia.org/wiki/Internet_media_type

You can use the official source (http://www.iana.org/assignments/media-types), but not all formats are actually registered there. Also see http://www.fileformat.info/info/mimetype/index.htm (which combines info from multiple sources), and finally you can refer to the individual format's specification and website.

P.S. If you want an assertion to be inherited by all subclasses, then you need to use an object or datatype property. Annotation properties aren't inherited in any logical sense. I'd probably create a small collection of mime types and then axiomatise your formats like this: (parent class) has_mime_type some mime_type_x; (subclass of parent class) has_mime_type some mime_type_y.

Topics in EDAM

The list of topics in EDAM has grown and continues to do so. It is probably out of scope for SWO, which imports EDAM, and I don't know how much of it is even in scope for EDAM, really. For these reasons, and to facilitate the import of the software-relevant parts into SWO, can I suggest that topics become their own entity, with a separate file which can then import EDAM if you want to see the full thing? This would also mean topics can be reused outside of EDAM by people not interested in software, of course.

error in topic referenced from an operation

Operation 0291 has a has_topic reference to a "topic" which is actually an operation:

    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://edamontology.org/has_topic"/>
            <owl:someValuesFrom rdf:resource="http://edamontology.org/operation_0291"/>
        </owl:Restriction>
    </rdfs:subClassOf>
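
Errors of this kind could be caught mechanically. The following is a minimal sketch under stated assumptions: the `SAMPLE` wraps the restriction above in an enclosing `owl:Class` for operation_0291 (as the issue describes), and the rule "every has_topic target IRI must start with `http://edamontology.org/topic_`" is a simplification chosen for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EDAM = "http://edamontology.org/"

# Illustrative fragment built around the restriction quoted in the issue.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{EDAM}operation_0291">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="{EDAM}has_topic"/>
        <owl:someValuesFrom rdf:resource="{EDAM}operation_0291"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def bad_has_topic_targets(root):
    """Return (class IRI, target IRI) pairs where has_topic points at a non-topic."""
    bad = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        about = cls.get(f"{{{RDF}}}about")
        if about is None:
            continue  # anonymous class expression
        for restriction in cls.iter(f"{{{OWL}}}Restriction"):
            prop = restriction.find(f"{{{OWL}}}onProperty")
            target = restriction.find(f"{{{OWL}}}someValuesFrom")
            if prop is None or target is None:
                continue
            if prop.get(f"{{{RDF}}}resource") != EDAM + "has_topic":
                continue
            iri = target.get(f"{{{RDF}}}resource")
            if iri is not None and not iri.startswith(EDAM + "topic_"):
                bad.append((about, iri))
    return bad


print(bad_has_topic_targets(ET.fromstring(SAMPLE)))
```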

Fix edamontology-developers mailing list

Please:

  • remove moderator approval
  • make private w/o possibility of users adding themselves
  • remove all users other than, for right now, Jon, Hervé and Matúš
  • make list of subscribers visible to all subscribers
  • test that we're all getting the emails promptly after sending (including the sender)

Cheers!

Fix faulty inference of Data:Operation equivalence for SWO integration

Reported by Allyson Lister:
When the inferred hierarchy is calculated in Protege, the EDAM Data and EDAM Operation classes are inferred to be equivalent. This is definitely not true, and has the knock-on effect that the SWO data class is also inferred to be equivalent to Operation (because EDAM Data and SWO data are asserted to be equivalent). This needs to be fixed as early as possible.

typo in term label

the operation term with id operation_2409 has the label "Utility operaration"

problem with the management of deprecated terms

It looks like there is a problem in the ontology due to the deprecation of obsolete terms.

E.g. if we look at topic_0100:

topic_0100 is deprecated and tells us to consider topic_0747.

It turns out that topic_0747 is deprecated too, and tells us to consider topic_0160 and topic_0640.

OK, let's check topic_0160 and topic_0640:
topic_0160 appears to be deprecated as well, with consider topic_0080;
the same applies to topic_0640, which is replacedBy topic_0080.

Fortunately, topic_0080 appears to be valid.

so we can represent the path from topic_0100 to topic_0080 this way:

topic_0100 -> consider topic_0747 -> consider topic_0640 -> replacedBy topic_0080

So, in short, topic_0100 should have consider or replacedBy set to topic_0080 instead of another obsolete topic.

other examples:
topic_0748 -> consider topic_0639 -> replacedBy topic_0080
topic_0179 -> consider topic_0698 -> replacedBy topic_2814
operation_2448 -> consider operation_2446 -> consider operation_2403

Furthermore, the intermediate "consider" topics or operations appear as parents of some terms.

For example, topic_3510 and topic_3511 appear to be subClassOf topic_0160,
and as we saw above, topic_0160 says to consider topic_0080.

I would suggest that when a term is marked as "deprecated" and has either a "consider" or a "replacedBy", the whole ontology be updated so that no "consider" or "replacedBy" points to a deprecated term.
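
The chain-following described above can be sketched in a few lines. This is only an illustration: the `POINTERS` table hard-codes the topic_0100 example from this report (merging "consider" and "replacedBy" into one kind of link), and the function name is hypothetical.

```python
# consider/replacedBy links from the topic_0100 example in this report.
POINTERS = {
    "topic_0100": ["topic_0747"],
    "topic_0747": ["topic_0160", "topic_0640"],
    "topic_0160": ["topic_0080"],
    "topic_0640": ["topic_0080"],
}
# In this example every term that has a pointer is itself deprecated.
DEPRECATED = set(POINTERS)


def resolve_live_targets(term, pointers, deprecated):
    """Follow consider/replacedBy links until only non-deprecated terms remain."""
    live, seen, stack = [], set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated branches
        seen.add(current)
        if current not in deprecated:
            live.append(current)
            continue
        stack.extend(pointers.get(current, []))
    return live


print(resolve_live_targets("topic_0100", POINTERS, DEPRECATED))
```

For topic_0100 both branches end at topic_0080, which is the term the suggestion above would point to directly.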

Best regards

Eric

Fix and improve definitions, together with a major clean-up

Please fix the weird URIs such as ...urigen....
Please fix concepts with more than one definition so that all have exactly one.

Workflow:

get a token by email when free
..edamontology [master]> git pull
edit EDAM_1.4_dev.owl locally
check up & validate locally
..edamontology [master]> git add EDAM_1.4_dev.owl
..edamontology [master]> git commit
fill in description of the commit
..edamontology [master]> git push origin
release token by email if everything is all right

Topic "Immunology" has child Operation "Epitope Mapping"

Hi Jon,

I had understood that Topics and Operations are disjoint; please ignore this noise if I just happen to be wrong here.

Many thanks and greetings

Steffen

Update: I now see that I was falling into the EBI version 1.3 trap. The NCBO 1.11 does not have this any more.

Training material

Please add the following terms to EDAM Data Type in order to register ELIXIR eLearning Platform and similar platforms (e.g. Goblet):

Training material 
    OER - Open Educational Resource
        MOOC - Massive open online course

Html Widget to pull tools by query

Use case: a group wants to showcase all its tools on its own webpage, with all the info kept in the ELIXIR registry.

It should be simple enough for any group to pull the JSON and build a list interface from it. However, this could be made easier and more standard for all users with a general widget. Furthermore, the widget could include something like "Powered by Elixir Registry" to publicise the registry itself.

EDAM annotation guidelines

What to do for databases, tools, Web services. Appropriate levels of details. Real examples.

Including tricky issues, e.g. Documentation to clarify multiple inheritance etc.

Taverna workflow format is both in XML and Workflow format - this is OK, but document that "multiple inheritance" is used in EDAM.

Provide official mapping of alternate human-readable terms

In evaluating how the Common Workflow Language (https://github.com/common-workflow-language/common-workflow-language) should use EDAM, it has come up that requiring users to use the canonical EDAM term identifiers (which are mostly numeric and not meaningful to humans) is very user-unfriendly. It would be very helpful if EDAM provided an official mapping of human-readable terms to canonical EDAM identifiers, so that CWL doesn't have to create and maintain an unofficial mapping.
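
To illustrate what such a mapping could look like, here is a minimal sketch that derives a label-to-identifier index from `rdfs:label` values. It is not an official EDAM or CWL mechanism; the `SAMPLE` fragment reuses three labels quoted elsewhere in these issues, and the lower-casing of labels is an assumption made for convenient lookup.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EDAM = "http://edamontology.org/"

# Illustrative fragment; a real index would be built from EDAM.owl.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{EDAM}topic_0100">
    <rdfs:label>Nucleic acid restriction</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="{EDAM}topic_0640">
    <rdfs:label>Nucleic acid sequence analysis</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="{EDAM}topic_0821">
    <rdfs:label>Enzymes and reactions</rdfs:label>
  </owl:Class>
</rdf:RDF>"""


def label_index(root):
    """Map lower-cased rdfs:label text to the class IRI (first occurrence wins)."""
    index = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        label = cls.find(f"{{{RDFS}}}label")
        if iri is None or label is None or not label.text:
            continue
        index.setdefault(label.text.strip().lower(), iri)
    return index


index = label_index(ET.fromstring(SAMPLE))
print(index["enzymes and reactions"])
```

Labels are not guaranteed to be unique across branches, so a real mapping would probably need to qualify them (for example by branch) rather than rely on "first occurrence wins".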

Missing a button for saving intermediate progress while annotating

When annotating, it is intuitive for me to press the "save resource" button to save the progress of my annotation. However, with only the first tab annotated, it gives me an error that it cannot be saved because of "topic:null". So, I suggest having an additional button for saving the annotation progress (at least in browser cookies).
