obophenotype / provisional_cell_ontology

Draft cell type definitions from data - candidates for inclusion in CL

License: Creative Commons Attribution 4.0 International

obofoundry cell-types transcriptomics ontology cell-ontology single-cell-rna-seq single-cell-analysis

provisional_cell_ontology's Introduction

Provisional Cell Ontology

Description: Cell types that are provisionally defined by experimental techniques such as single cell transcriptomics rather than a straightforward & coherent set of properties. All terms subclass conventionally defined terms in CL. If more data emerges that allows more conventional definitions, terms may migrate to CL.

More information can be found at http://obofoundry.org/ontology/pcl

Browse the ontology using OLS https://www.ebi.ac.uk/ols4/ontologies/pcl

Versions

Latest PCL release files https://github.com/obophenotype/provisional_cell_ontology/releases/latest

Stable release versions

The latest version of the ontology can always be found at:

http://purl.obolibrary.org/obo/pcl.owl

Editors' version

Currently, PCL works by importing components that should be created separately, each with a base file (an ontology file that contains only the axioms belonging to that ontology, excluding any axioms from imported ontologies). To request inclusion of your ontology in PCL, please create a ticket in our Issue tracker.

Contact

Please use this GitHub repository's Issue tracker to request new terms/classes or report errors or specific concerns related to the ontology.

Acknowledgements

This ontology repository was created using the Ontology Development Kit (ODK).

Ontology Components

Brain Data Standards Ontology

Whole Human Brain Ontology

Contributors

  • David Osumi-Sutherland
  • Huseyin Kir
  • Richard Scheuermann
  • Brian Aevermann
  • Jeremy A Miller
  • Tom Gillespie
  • Yun (Renee) Zhang
  • Nicolas Matentzoglu
  • Shawn Zheng Kai Tan

For Editors

Currently, for the release to be prepared correctly, we need to force the bdso-ext.owl component to be built first. This can be done with sh run.sh make components/bdso-pcl-comp.owl prepare_release

PCL uses a "base file import system" that merges before incorporating. To update imports, use sh run.sh make no-mirror-refresh-merged

Cite

Ontology Metadata

id: pcl
mirror_from: https://raw.githubusercontent.com/obophenotype/provisional_cell_ontology/master/pcl-base.owl
title: "Provisional Cell Ontology"
contact:
  email: [email protected]
  label: David Osumi-Sutherland
  github: dosumis
description: Cell types that are provisionally defined by experimental techniques such as single cell or single nucleus transcriptomics rather than a straightforward & coherent set of properties.
domain: phenotype
homepage: https://github.com/obophenotype/provisional_cell_ontology
products:
  - id: pcl.owl
  - id: pcl.obo
  - id: pcl.json
  - id: pcl-base.owl
  - id: pcl-base.obo
  - id: pcl-base.json
  - id: pcl-full.owl
  - id: pcl-full.obo
  - id: pcl-full.json
  - id: pcl-simple.owl
  - id: pcl-simple.obo
  - id: pcl-simple.json
dependencies:
  - id: pr
  - id: go
  - id: uberon
  - id: ro
  - id: pato
  - id: ncbitaxon
  - id: bfo
  - id: cl
  - id: omo
  - id: nbo
  - id: chebi
  - id: so
tracker: https://github.com/obophenotype/provisional_cell_ontology/issues
license:
  url: http://creativecommons.org/licenses/by/4.0/
  label: CC BY 4.0
usages:
  - user: https://biccn.org/
    description: This ontology will be used to annotate cell types in single cell transcriptomics data, with an initial focus on the brain. It is also used to drive search and navigation in a cell type data explorer web application.
repository: https://github.com/obophenotype/provisional_cell_ontology
preferredPrefix: PCL

provisional_cell_ontology's Issues

Error loading PCL after converting it to Turtle format

Below are the error messages from Protege:

  1. From Turtle parser:
 org.openrdf.rio.RDFParseException: Expected an RDF value here, found ';' [line 999912]
  2. From Turtle Syntax parser:
org.semanticweb.owlapi.rdf.turtle.parser.ParseException: Encountered " ";" "; "" at line 1100522, column 130.
Was expecting:
    ")" ...

Steps to reproduce:

  1. Download pcl.owl file.
  2. Run the following ROBOT command to convert it to Turtle format:
$ robot convert -i pcl.owl --format ttl -o pcl.ttl
  3. Open the output file in Protege.
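
Both parsers report a line number, so one way to narrow the problem down is to look at the text around that line in the generated file. The following is a minimal sketch using only the Python standard library; the helper name show_context and its radius parameter are illustrative and not part of ROBOT or Protege.

```python
from itertools import islice


def show_context(path, line_no, radius=2):
    """Return (line number, text) pairs around a 1-based line number.

    The file is streamed, so a very large Turtle file is never loaded
    into memory as a whole.
    """
    start = max(1, line_no - radius)
    stop = line_no + radius  # inclusive upper bound
    with open(path, encoding="utf-8", errors="replace") as handle:
        window = islice(handle, start - 1, stop)
        return [(start + offset, text.rstrip("\n"))
                for offset, text in enumerate(window)]
```

Calling show_context("pcl.ttl", 999912) would return the lines surrounding the first reported position, assuming pcl.ttl is the file written by the command above.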

Template for generic additions to PCL

CellXGene has many cases of new cell types that are only defined by single cell transcriptomics. We think it would be useful to have a generic template for adding these with:

label; synonyms; parent class; reference; location; marker set

For marker set, it is probably best to use 'marker bags'. We need to look at the best way to generate these. We might need some programmatic step after an initial template table before ROBOT or DOSDP.

I think this would be useful to have anyway.

Some general issues.

  • What is the workflow for addition? Do we dump every new cell type here until an editor can review for addition to CL or do we have editors review and decide between PCL and CL?

  • Differences in the location relation by cell type, e.g. has_soma_location for neurons. Maybe have a different sheet for these?

CC @BAevermann

Documentation for defining cells in PCL

We need documentation on what goes into PCL (as opposed to CL)

  • e.g. if there isn't a coherent set of properties and the cell type is grouped just by transcriptomics

Complete addition of human MTG cell types

Decide what else should be harvested from the MTG cell excel file. The one logical component currently missing is TaxonID. I will add this. We could include other content as annotations - what should that include?

CI checks

OK, I tried setting up CI checks, but I'm not sure what I did wrong because they don't seem to trigger when I create a PR (it might mean a PR needs reviewers first, but that's not ideal). This is not super important, so low priority, but could you help me look into it when you have some time, @hkir-dev? Thanks!

Add BDSO as an imported component to PCL

BDSO should produce an artefact that includes base + custom imports (NS forest markers)
all other imports should come from PCL (CL, NCBITaxon, UBERON)
Blocked until BDSO can produce artefact for PCL

Imports bringing in errors

Aside from duplicate labels, which will be fixed once we add the species prefix, the errors include a number of duplicate definitions and multiple definitions coming from imports.

For multiple defs - I would have thought that with base file imports now this would be solved @matentzn? PCL does have latest ODK. Do I need to redirect imports to use basefile somewhere for that to work?

There are duplicate_definition errors that come from UBERON. These are fixed here (obophenotype/uberon#2129), and I think we just need a new UBERON release and then a fresh import.

PURLs don't resolve yet

@matentzn - wondering if I should be concerned that our PURLs don't resolve yet? Do I need to do anything to get it into Ontobee that I didn't?
Could we maybe redirect them to OLS instead? @hkir-dev

make components/bdso-base-ext.owl not recognising change

Tried to run a release with the new BDS build, but it doesn't seem to pick up the update.
Tried to make the components/bdso-base-ext.owl file, but make seems to think it is up to date even though it isn't (I checked the date). This might be because the file is fetched from a branch rather than from an actual release or the master branch.
I can pull it over manually for now, but that is not ideal. We need to figure out how make checks what is up to date and deal with it from there.
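
On the question of how make decides: GNU make rebuilds a target only if the target file is missing or one of its listed prerequisites has a newer modification time. It does not look at remote content, so a target with no local prerequisites counts as up to date as soon as the file exists. The sketch below mimics that rule in Python for a single target. It is a simplification (real make also handles phony targets and rebuilds prerequisites recursively), and whether the rule for components/bdso-base-ext.owl has any local prerequisites is an assumption not checked against the Makefile.

```python
import os


def is_stale(target, prerequisites):
    """Approximate make's timestamp rule for one target.

    The target is rebuilt if it does not exist, or if any prerequisite
    is missing or has a newer modification time than the target.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(
        not os.path.exists(prerequisite)
        or os.path.getmtime(prerequisite) > target_mtime
        for prerequisite in prerequisites
    )
```

With an empty prerequisite list, is_stale returns False for any existing file, which matches the behaviour described above. Deleting the file first is one common way to force such a target to be regenerated.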

[NTR] L6 FS cell w/ Large sag (Mus MOp)

Note - please check that the term does not already exist (check OLS: https://www.ebi.ac.uk/ols/ontologies/cl)

Preferred term label:
L6 fast-spiking with large hyperpolarization sag Pvalb GABAergic interneuron of the primary motor cortex (Mus musculus)

Synonyms
L6 FS Large sag PV MOp (mus)

Definition (free text, please give PubMed ID in format PMID:XXXXXX)
A Pvalb GABAergic cortical interneuron that is found in L6 of the primary motor cortex with mostly local axons. Some of these cells exhibited a horizontally elongated or downward projecting axon mostly innervating L6b. They are fast spiking, with large hyperpolarization sag and rebound potential.
PMID:22357664
PMID:31209381

Parent cell type term (check the hierarchy here https://www.ebi.ac.uk/ols/ontologies/cl):
GABAergic interneuron

What is the anatomical structure that the cell is a part of? Please check Uberon: https://www.ebi.ac.uk/ols/ontologies/uberon
http://purl.obolibrary.org/obo/UBERON_0001384 - Primary Motor Cortex
*has some soma location, can specify layer

Your nano-attribution (ORCID)

Any additional notes or concerns
subclass of Pvalb interneurons obophenotype/cell-ontology#868

Errors after Base Import

I created a release to check ROBOT report with base file import and a few errors that need to be addressed:

RO and BFO terms have multiple labels, some with @en attached. Example:
(of note: RO:0000053 still has the old bearer_of label)
Not sure where this is coming from, because in RO itself it seems all right.

Level  Rule Name        Subject      Property    Value
ERROR  multiple_labels  BFO:0000063  rdfs:label  precedes
ERROR  multiple_labels  BFO:0000063  rdfs:label  precedes@en
ERROR  multiple_labels  BFO:0000066  rdfs:label  occurs in
ERROR  multiple_labels  BFO:0000066  rdfs:label  occurs in@en
ERROR  multiple_labels  BFO:0000067  rdfs:label  contains process
ERROR  multiple_labels  BFO:0000067  rdfs:label  contains process@en
ERROR  multiple_labels  RO:0000053   rdfs:label  bearer of
ERROR  multiple_labels  RO:0000053   rdfs:label  has characteristic@en
ERROR  multiple_labels  RO:0000056   rdfs:label  participates in
ERROR  multiple_labels  RO:0000056   rdfs:label  participates in@en
ERROR  multiple_labels  RO:0000056   rdfs:label  participates_in
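
To separate labels that are duplicated only because of a language tag from labels that genuinely differ, the report rows can be grouped by subject. The sketch below assumes the report is available as tab-separated text with the header shown above; the sample rows are copied from this report, and the function names are illustrative.

```python
import csv
import io
from collections import defaultdict

HEADER = ("Level", "Rule Name", "Subject", "Property", "Value")
ROWS = [
    ("ERROR", "multiple_labels", "BFO:0000063", "rdfs:label", "precedes"),
    ("ERROR", "multiple_labels", "BFO:0000063", "rdfs:label", "precedes@en"),
    ("ERROR", "multiple_labels", "RO:0000056", "rdfs:label", "participates in"),
    ("ERROR", "multiple_labels", "RO:0000056", "rdfs:label", "participates in@en"),
    ("ERROR", "multiple_labels", "RO:0000056", "rdfs:label", "participates_in"),
]
# Assumption: the report file is TSV with exactly this header row.
REPORT_TSV = "\n".join("\t".join(row) for row in (HEADER, *ROWS))


def labels_by_subject(report_text, rule="multiple_labels"):
    """Group the Value column by Subject for one report rule."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    grouped = defaultdict(list)
    for row in reader:
        if row["Rule Name"] == rule:
            grouped[row["Subject"]].append(row["Value"])
    return dict(grouped)


def differs_only_by_language_tag(values):
    """True if all values are equal once a trailing @lang tag is dropped."""
    return len({value.rsplit("@", 1)[0] for value in values}) == 1
```

In this sample, BFO:0000063 differs only by the @en tag, while RO:0000056 also carries the distinct label participates_in, so the two cases probably need different fixes.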

UBERON has some multiple_equivalent_classes errors - these are accurate in UBERON itself:

Level Rule Name Subject Property Value
ERROR multiple_equivalent_classes UBERON:0000378 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid46142
ERROR multiple_equivalent_classes UBERON:0000378 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid46146
ERROR multiple_equivalent_classes UBERON:0000483 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid46389
ERROR multiple_equivalent_classes UBERON:0000483 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid46393
ERROR multiple_equivalent_classes UBERON:0004451 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid55460
ERROR multiple_equivalent_classes UBERON:0004451 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid55464
ERROR multiple_equivalent_classes UBERON:0015007 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62199
ERROR multiple_equivalent_classes UBERON:0015007 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62203
ERROR multiple_equivalent_classes UBERON:0015008 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62211
ERROR multiple_equivalent_classes UBERON:0015008 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62215
ERROR multiple_equivalent_classes UBERON:0018142 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62607
ERROR multiple_equivalent_classes UBERON:0018142 owl:equivalentClass d80df3fa-9c13-4d2d-a88d-131d173f5c69genid62611

Unparsed triples error

I ran update-imports in an ontology that imports PCL and received the following:

ERROR Input ontology contains 558 triple(s) that could not be parsed: <http://purl.obolibrary.org/obo/pcl/CS1908210037> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx56661. <http://purl.obolibrary.org/obo/pcl/CS202002013_20> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx5535. <http://purl.obolibrary.org/obo/pcl/CS201912131_64> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx26702. <http://purl.obolibrary.org/obo/pcl/CS202002013_7> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx19411. <http://purl.obolibrary.org/obo/pcl/CS201912132_31> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx25634. <http://purl.obolibrary.org/obo/pcl/CS202002013_97> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:genid-nodeid-node1glt1i55qx57968.

etc.

This doesn't interfere with anything (it's throwing an ERROR, but I think I can change that setting). I see discussion of this here (ontodev/robot#829) and here (owlcs/owlapi#1023) as well and wanted to flag just in case something unexpected is occurring in PCL. Feel free to close if this is expected or known and unproblematic behavior.
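
For triage, the triples quoted in the error message can be pulled apart to see which subjects and predicates are affected. This is a rough sketch that assumes the message keeps the shape shown above (subject IRI, predicate IRI, blank node, then a full stop); the pattern and function name are illustrative and not part of ROBOT or the OWL API.

```python
import re
from collections import Counter

# Matches "<subject> <predicate> _:blanknode." as printed in the message.
UNPARSED_TRIPLE = re.compile(r"<([^>]+)> <([^>]+)> (_:[^\s.]+)\.")


def parse_unparsed_triples(message):
    """Return (subject, predicate, blank node) tuples found in the message."""
    return UNPARSED_TRIPLE.findall(message)


def predicate_counts(message):
    """Count how often each predicate occurs among the unparsed triples."""
    return Counter(predicate for _, predicate, _ in parse_unparsed_triples(message))
```

In the excerpt above, every listed triple uses rdf:type with a blank node as its object, so the counter would contain a single predicate.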

[NTR-pcl] Upper FS Basket cell (Mus MOp)

Preferred term label:
Upper fast-spiking basket Pvalb GABAergic interneuron of the primary motor cortex (Mus musculus)

Synonyms
Upper FS BC PV MOp (Mus)

Definition (free text, please give PubMed ID in format PMID:XXXXXX)
A Pvalb GABAergic cortical interneuron in upper L5 and L2/3 of the primary motor cortex that has strong layer-adapting morphology, classical L2/3 basket morphology in L2/3, and mostly large basket cell morphology in upper L5. These cells are fast spiking, with some showing delayed firing with large latency.
PMID:26612957
PMID:18568015

Parent cell type term (check the hierarchy here https://www.ebi.ac.uk/ols/ontologies/cl):
GABAergic interneuron

What is the anatomical structure that the cell is a part of? Please check Uberon: https://www.ebi.ac.uk/ols/ontologies/uberon
http://purl.obolibrary.org/obo/UBERON_0001384 - Primary Motor Cortex
*has some soma location, can specify layer

Your nano-attribution (ORCID)

Any additional notes or concerns
subclass of Pvalb interneurons #868

Decide on ID schema and ID management for PCL

We currently have IDs like this:

http://www.jcvi.org/cl_ext/mtg_cluster/pCL54

We should probably move to something that is not specific to some experiment (mtg_cluster) and uses a more conventional padded short_form (PCL_0000054 ?).

Should we stick with http://www.jcvi.org/cl_ext as the base for now?

Options for more conventional PURLs.

  1. Request obolibrary membership.

  2. Create a PCL domain underneath http://purl.obolibrary.org/obo/CL/

This would allow us to use PURL resolution to point to release files like this.
http://purl.obolibrary.org/obo/CL/PCL/pcl.owl

We could use this to roll IDs like this http://purl.obolibrary.org/obo/CL/PCL/PCL_0000054 - although we don't have any architecture in place to make sure these resolve.
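
As an illustration of the padded short form proposed above, the following sketch converts an experiment-specific ID such as pCL54 into PCL_0000054 and attaches it to a configurable base. The seven-digit width and the default base http://purl.obolibrary.org/obo/ are assumptions taken from common OBO practice, not decisions made in this ticket, and the function names are hypothetical.

```python
import re

# Assumed default; the ticket leaves the choice of base IRI open.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def padded_short_form(old_id, prefix="PCL", width=7):
    """Turn an ID like 'pCL54' into a zero-padded form like 'PCL_0000054'."""
    match = re.fullmatch(r"[A-Za-z]+(\d+)", old_id)
    if match is None:
        raise ValueError(f"unexpected ID format: {old_id!r}")
    return f"{prefix}_{int(match.group(1)):0{width}d}"


def migrate_iri(old_iri, base=OBO_BASE):
    """Map an old IRI to the new scheme, keeping only its numeric part."""
    local_name = old_iri.rstrip("/").rsplit("/", 1)[-1]
    return base + padded_short_form(local_name)
```

For example, migrate_iri("http://www.jcvi.org/cl_ext/mtg_cluster/pCL54") yields http://purl.obolibrary.org/obo/PCL_0000054 under these assumptions. Passing a different base covers the other options listed above.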

CC @scheuerm @BAevermann

Oligodendrocytes incorrectly classed as oligodendrocyte precursor cell

Jeremy Miller has identified 3 cells that are classed as oligodendrocyte precursor cell (OPC), but are not. All terms that start with oligo are oligodendrocyte, while terms that start with OPC are oligodendrocyte precursor cells (Bakken et al., 2021 Nature DOI:10.1038/s41586-021-03465-8):

Primates had a unique oligodendrocyte population (Oligo SLC1A3 LOC103793418 in marmosets and Oligo L2-6 OPALIN MAP6D1 in humans) that was not a distinct cluster in mice

  • Oligo L2-6 OPALIN MAP6D1 primary motor cortex oligodendrocyte precursor cell (Hsap) PCL:0015079
  • Oligo L2-6 OPALIN FTH1P3 primary motor cortex oligodendrocyte precursor cell (Hsap) PCL:0015120

  • Oligo L3-6 OPALIN-like ENPP6 primary motor cortex oligodendrocyte precursor cell (Hsap) PCL:0015119

We need to change the parent term to oligodendrocyte (CL:0000128) and relabel them (remove precursor cell).
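
The planned change can be written down as a small script that proposes a new label and parent for each affected term. This is only a sketch of the bookkeeping: the IDs and labels are the three listed above, the rule (labels starting with "Oligo" are oligodendrocytes) follows the statement in this ticket, and the function names are hypothetical. It does not edit the ontology itself.

```python
OLIGODENDROCYTE = "CL:0000128"
WRONG_SUFFIX = "oligodendrocyte precursor cell"

AFFECTED_TERMS = {
    "PCL:0015079": "Oligo L2-6 OPALIN MAP6D1 primary motor cortex oligodendrocyte precursor cell (Hsap)",
    "PCL:0015120": "Oligo L2-6 OPALIN FTH1P3 primary motor cortex oligodendrocyte precursor cell (Hsap)",
    "PCL:0015119": "Oligo L3-6 OPALIN-like ENPP6 primary motor cortex oligodendrocyte precursor cell (Hsap)",
}


def corrected_label(label):
    """Drop 'precursor cell' from labels of terms that start with 'Oligo'."""
    if label.startswith("Oligo ") and WRONG_SUFFIX in label:
        return label.replace(WRONG_SUFFIX, "oligodendrocyte")
    return label  # OPC terms and anything else stay unchanged


def plan_fixes(terms):
    """List the proposed label and parent change for each affected term."""
    return [
        {"id": term_id, "new_label": corrected_label(label),
         "new_parent": OLIGODENDROCYTE}
        for term_id, label in terms.items()
        if corrected_label(label) != label
    ]
```

Running plan_fixes(AFFECTED_TERMS) gives one entry per term, which could then be reviewed before the source files are edited.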

GO terms falling into owl:Nothing

[Screenshot 2021-12-08 at 16:20:05]

This seems to come from "tube lumen cavitation" and "formation of anatomical boundary".

These issues do not exist in BDSO, so it is not clear why they appear here.

Create Bioportal artifact

To deal with current bioportal limitations, we need to create a specific artifact for it.
Spec:
from pcl-full

  1. strip all prefLabels
  2. add xsd:string to all rdfs:label

component build issues

Currently we cannot get the components to be built before releases, so imports from the components are left out. The current workaround is to force the components to be built first using sh run.sh make all_components prepare_release. Long term, however, we should fix this properly.

This will not be looked at too soon, but is a ticket to remind us of it

[NTR] eccentric medium spiny neuron

Copied over from CL

# Class: obo:CL_4023032 (eccentric medium spiny neuron)

AnnotationAssertion(Annotation(oboInOwl:hasDbXref "PMID:30096299"^^xsd:string) obo:IAO_0000115 obo:CL_4023032 "A medium spiny neuron that are intermixed with other SPNs in the striatum with no obvious spatial organization but distinct from indirect and direct medium spiny neurons by their differential gene expression profile."^^xsd:string)
AnnotationAssertion(<http://purl.org/dc/terms/contributor> obo:CL_4023032 <https://orcid.org/0000-0001-7258-9596>)
AnnotationAssertion(Annotation(oboInOwl:hasDbXref "PMID:30096299"^^xsd:string) Annotation(oboInOwl:hasSynonymType <http://purl.obolibrary.org/obo/cl#abbreviation>) oboInOwl:hasExactSynonym obo:CL_4023032 "eSPN"^^xsd:string)
AnnotationAssertion(rdfs:label obo:CL_4023032 "eccentric medium spiny neuron")
SubClassOf(obo:CL_4023032 obo:CL_1001474)

Compare metadata single cell datasets from multiple platforms

Coordinated here: https://docs.google.com/spreadsheets/d/1MAWl4exteFSpCyIlpE8tzqfc-rCsu3cncllNSLn5o-0/edit?usp=sharing

Hi Shawn and Bradley - following on from today’s discussion, I’d really like to be able to do a side-by-side comparison of metadata on the same single cell datasets from multiple platforms:

  • HCA data portal https://data.humancellatlas.org/explore/projects/
  • celltype.info
  • https://cellxgene.cziscience.com/collections
  • Azimuth (although still not very much on there)
  • Allen Brain Atlas (can we download matrices directly from there?) / Nemo

Problem - finding candidate sets to compare. We should have one brain and one non-brain dataset. The non-brain one should be something with some detailed annotation.

  • Brain - should probably be the mini-atlas. The challenge might be dataset size.
  • Non-brain - kidney might be good, although I am struggling to find an example (CAP needs to show publication!)

What to compare:

  • Keys used for standard metadata
  • How ontologies are supported
  • How cell type is recorded - what, if anything, is lost in flattening
  • Relationship of metadata content to display on site
