I propose a different, hopefully complementary, approach to this discussion. I am goin

TCS is just fine as it is (tdwg/tnc #7, closed)

tdwg commented on July 17, 2024 3

TCS is just fine as it is

from tnc.

Comments (12)

hlapp commented on July 17, 2024 1

@rdmpage 👍 to your proposed approach. It brings us firmly back to first defining concretely what the competency questions are (such as in the form of queries and expected results), and then determining the ontology that can satisfy them.

I'm also a firm believer in Occam's Razor, so IMHO the ontology we should be looking for is the simplest one that satisfies the competency questions, not a more elaborate one, whether driven by philosophy or moral objectives.


baskaufs commented on July 17, 2024

Here is a dataset that might be useful to play with:

Agricultural Research Council: Catalogue of Afrotropical Bees. http://doi.org/10.15468/u9ezbh
Accessed via http://www.gbif.org/dataset/da38f103-4410-43d1-b716-ea6b1b92bbac on 2016-10-26

It includes many possible pieces that could be connected using the existing TCS model. It is one of the datasets I played around with and described in this blog post in the section called "Taxon core with Occurrence, TypesAndSpecimen, Distribution, Reference, and Description extensions: Catalogue of Afrotropical Bees". There were a few additional comments about the dataset in the following post.

In my messing around, I used some of the TCS properties in my graph model (described in the post). The triples (as RDF/Turtle) can be downloaded here, but since my purpose was to use as many DwC terms as possible rather than to fully implement TCS, the dataset should probably be re-mapped to more fully embody the TCS graph model. (The mapping files that I used are here but probably won't make sense to anyone who hasn't already messed with Guid-O-Matic.) I don't have time to try re-mapping it myself right now, but if this line of inquiry continues for long enough, I might be able to work on it in a few weeks.

Oh, ho! I see that I put an example record here!


rdmpage commented on July 17, 2024

Names Test 3: Is a scientific name a homonym (either within a Code or across Codes)?

Here we test one of the classical "hemihomonyms", that is, a name which occurs in two Codes. Agathis montana is both the name of a wasp and the name of a tree. So, a simple query would be to see how many Codes have a given name:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tn: <http://rs.tdwg.org/ontology/voc/TaxonName#>
PREFIX tcom: <http://rs.tdwg.org/ontology/voc/Common#>

SELECT *
WHERE { 
  ?thing tn:nameComplete "Agathis montana" .
  ?thing tn:nomenclaturalCode ?code .
}


giving:

thing	code
urn:lsid:ipni.org:names:92693-1:1.1.2.1.1.1.2.1.1.1	http://rs.tdwg.org/ontology/voc/TaxonName#botanical
urn:lsid:organismnames.com:name:1407520	http://rs.tdwg.org/ontology/voc/TaxonName#ICZN
urn:lsid:organismnames.com:name:1953681	http://rs.tdwg.org/ontology/voc/TaxonName#ICZN

Note that we have two zoological names because ION (the source of the names) has two records for Agathis montana (the same problem bedevils IPNI). So, we need to be a little cleverer:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tn: <http://rs.tdwg.org/ontology/voc/TaxonName#>
PREFIX tcom: <http://rs.tdwg.org/ontology/voc/Common#>

SELECT (COUNT(DISTINCT ?code) AS ?count)
WHERE { 
  ?thing tn:nameComplete "Agathis montana" .
  ?thing tn:nomenclaturalCode ?code .
}


This query asks how many distinct Codes contain Agathis montana, and the answer is:

row	count
1	2

Two Codes contain Agathis montana, so it is a cross-Code homonym. TCS +1

Testing for homonyms within a Code is going to get a little messy given the number of duplicates some data sources contain, so we might want to test using publications, taxon authorship, or, in an ideal world, type specimens.
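As a rough sketch of the within-Code test suggested above, the same logic can be tried outside SPARQL on a handful of in-memory records. The identifiers and Codes below come from the query result earlier in this comment; the authorship strings (`author-A`, `author-B`) and both function names are placeholders made up for illustration, not real data or an existing API.

```python
from collections import defaultdict

# Each record: (identifier, nameComplete, nomenclatural Code, authorship).
# Identifiers and Codes are from the query result above; the authorship
# strings are placeholders, not real data.
RECORDS = [
    ("urn:lsid:ipni.org:names:92693-1:1.1.2.1.1.1.2.1.1.1", "Agathis montana", "botanical", "author-A"),
    ("urn:lsid:organismnames.com:name:1407520", "Agathis montana", "ICZN", "author-B"),
    ("urn:lsid:organismnames.com:name:1953681", "Agathis montana", "ICZN", "author-B"),
]


def codes_for(name, records):
    """Return the set of Codes under which `name` occurs."""
    return {code for _, n, code, _ in records if n == name}


def within_code_homonyms(name, records):
    """Map each Code to its distinct authorships, keeping only Codes with more than one."""
    by_code = defaultdict(set)
    for _, n, code, authorship in records:
        if n == name:
            by_code[code].add(authorship)
    return {code: authors for code, authors in by_code.items() if len(authors) > 1}


codes = codes_for("Agathis montana", RECORDS)
print("cross-Code homonym:", len(codes) > 1)
print("within-Code homonyms:", within_code_homonyms("Agathis montana", RECORDS))
```

Grouping by authorship means the two ION duplicates collapse into one name, so they are not reported as a within-Code homonym. Whether authorship strings are clean enough to rely on in practice is a separate question.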


rdmpage commented on July 17, 2024

Names Test 5: What objective (Code-governed) synonyms exist for a scientific name?

One way to tackle this is to use basionym relationships, if the name database has them. IPNI and IndexFungorum do (although their coverage is probably not complete).

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tn: <http://rs.tdwg.org/ontology/voc/TaxonName#>
PREFIX tcom: <http://rs.tdwg.org/ontology/voc/Common#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT *
WHERE { 
  ?thing tn:nameComplete "Agathis montana" .
  ?thing owl:versionInfo ?versionInfo .
  BIND(IRI(REPLACE( STR(?thing),CONCAT(":", ?versionInfo),"" )) AS ?iri). 
  {
    ?name tn:hasBasionym ?iri .
    ?name tn:nameComplete ?nameComplete .
  }
}


This query is a mess because IPNI's RDF is, in a word, buggered. They use a versioned identifier for the name, which makes cross-linking within the data almost impossible. A great example of what happens when you design outputs without thinking about users (sigh). So we have to mess about with the name id (stripping the version suffix) to get the query to work. The query also works in only one direction (i.e., which names have the query name as their basionym). We'd want to go in the other direction as well (which names are linked to the basionym of the query name), but IPNI's RDF prevents this. IndexFungorum is probably OK for this sort of query. ION is clueless about basionyms, so zoologists miss out.

Here's the result:

thing	versionInfo	iri	name	nameComplete
urn:lsid:ipni.org:names:92693-1:1.1.2.1.1.1.2.1.1.1	1.1.2.1.1.1.2.1.1.1	urn:lsid:ipni.org:names:92693-1	urn:lsid:ipni.org:names:77076253-1:1.2	Salisburyodendron montanum

So, Salisburyodendron montanum is an objective synonym of Agathis montana. TCS +1, IPNI -1
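For anyone who wants to experiment with the identifier clean-up outside SPARQL, here is a minimal sketch in plain Python. It removes the version suffix as a literal trailing string rather than by regular expression, and looks up basionym links in both directions. The single basionym pair is from the result above. The version `1.2` for the Salisburyodendron record is inferred from its identifier and is an assumption, as are the function names.

```python
def strip_version(lsid: str, version: str) -> str:
    """Remove a trailing ':<version>' from an LSID, if present."""
    suffix = ":" + version
    return lsid[: -len(suffix)] if version and lsid.endswith(suffix) else lsid


# (name, basionym) pairs, both unversioned; the one pair is from the result above.
HAS_BASIONYM = [
    ("urn:lsid:ipni.org:names:77076253-1", "urn:lsid:ipni.org:names:92693-1"),
]

# owl:versionInfo per versioned identifier ("1.2" is an assumption).
VERSIONS = {
    "urn:lsid:ipni.org:names:92693-1:1.1.2.1.1.1.2.1.1.1": "1.1.2.1.1.1.2.1.1.1",
    "urn:lsid:ipni.org:names:77076253-1:1.2": "1.2",
}


def objective_synonyms(lsid: str) -> set:
    """Names sharing a basionym with `lsid`, searched in both directions."""
    base = strip_version(lsid, VERSIONS.get(lsid, ""))
    # The basionym of the query name, or the name itself if it is a basionym.
    root = next((b for n, b in HAS_BASIONYM if n == base), base)
    group = {root} | {n for n, b in HAS_BASIONYM if b == root}
    return group - {base}


print(objective_synonyms("urn:lsid:ipni.org:names:92693-1:1.1.2.1.1.1.2.1.1.1"))
print(objective_synonyms("urn:lsid:ipni.org:names:77076253-1:1.2"))
```

Matching the suffix literally avoids one subtlety of the `REPLACE` call in the query, where the dots in the version string act as regex wildcards.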


rdmpage commented on July 17, 2024

Taxonomy Test 6: How do the circumscriptions of the same scientific name by two different authorities compare to each other?

This one is for @nfranz, taken from Fig. 1 from https://doi.org/10.1093/sysbio/syw023 where we have two taxon concepts both named Microcebus murinus.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tn: <http://rs.tdwg.org/ontology/voc/TaxonName#>
PREFIX tcom: <http://rs.tdwg.org/ontology/voc/Common#>
PREFIX tc: <http://rs.tdwg.org/ontology/voc/TaxonConcept#>

SELECT *
WHERE { 
  VALUES ?namestring { "Microcebus murinus" }
  ?concept1 tc:nameString ?namestring .
  ?concept1 tc:accordingToString ?accordingto1 .

  ?concept2 tc:nameString ?namestring .
  ?concept2 tc:accordingToString ?accordingto2 .
  
  ?relationship tc:fromTaxon ?concept1 .
  ?relationship tc:toTaxon ?concept2 .
  ?relationship tc:relationshipCategory ?relationship_type .
  
  FILTER(?concept1 != ?concept2)
}


This gives this result:

namestring	concept1	accordingto1	concept2	accordingto2	relationship	relationship_type
Microcebus murinus	http://kg-fuseki.sloppy.zone/tc/1993_Microcebus_murinus	MSW2	http://kg-fuseki.sloppy.zone/tc/2005_Microcebus_murinus	MSW3	http://kg-fuseki.sloppy.zone/tc/1993-2005	http://rs.tdwg.org/ontology/voc/TaxonConcept#Includes
Microcebus murinus	http://kg-fuseki.sloppy.zone/tc/2005_Microcebus_murinus	MSW3	http://kg-fuseki.sloppy.zone/tc/1993_Microcebus_murinus	MSW2	http://kg-fuseki.sloppy.zone/tc/2005-1993	http://rs.tdwg.org/ontology/voc/TaxonConcept#IsIncludedIn

So the 1993 concept of Microcebus murinus is a larger taxon than the 2005 concept of Microcebus murinus. So TCS +1. Note that we could also express these relationships using the RCC5 terms in http://openbiodiv.net/
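To illustrate the RCC5 remark, the sketch below maps the two relationship categories returned by the query onto RCC5-style symbols and checks that the two rows are inverses of each other. The symbol notation (`>`, `<`, `><`, `!` alongside `==`) and the correspondence `Includes` → `>`, `IsIncludedIn` → `<` are assumptions made for this example, not something defined by the TDWG ontology.

```python
TC = "http://rs.tdwg.org/ontology/voc/TaxonConcept#"

# Assumed correspondence between the two categories in the result and RCC5 symbols.
TO_RCC5 = {TC + "Includes": ">", TC + "IsIncludedIn": "<"}

# Inverse of each RCC5 relation (assumed notation).
INVERSE = {">": "<", "<": ">", "==": "==", "><": "><", "!": "!"}

# (from concept, to concept, category), taken from the two result rows above.
RELATIONSHIPS = [
    ("1993_Microcebus_murinus", "2005_Microcebus_murinus", TC + "Includes"),
    ("2005_Microcebus_murinus", "1993_Microcebus_murinus", TC + "IsIncludedIn"),
]


def rcc5_articulations(relationships):
    """Translate (from, to, category) rows into {(from, to): RCC5 symbol}."""
    return {(a, b): TO_RCC5[cat] for a, b, cat in relationships}


def is_consistent(articulations):
    """True if every pair asserted in both directions uses inverse relations."""
    return all(
        articulations.get((b, a), INVERSE[rel]) == INVERSE[rel]
        for (a, b), rel in articulations.items()
    )


arts = rcc5_articulations(RELATIONSHIPS)
print(arts[("1993_Microcebus_murinus", "2005_Microcebus_murinus")])
print(is_consistent(arts))
```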


ghwhitbread commented on July 17, 2024

Very nice, Rod. But these are competency questions for an information system designed to look and behave much like TCS. Systems like APNI, AFD, IPNI, ITIS, CoL+, etc. … the TDWG ontology, TCS itself. We’ve had SPARQL services running off tn:views over APNI/APC and AFD for the past 8 years (currently disabled for system migration, sorry) with almost zero interest. Unusable by most clients, shunned by aggregators. Maybe it was just a sign of the times, and it's yet to have its day. I'm still hopeful. For RDF at least, the power of Linked Open Data to simply implement complex services - like Taxon Name resolution, and queries across datasets, for example - has been well demonstrated. Though for TCS, we now use a local NSL model.

Like most contributors to this discussion, we are custodians/developers of existing infrastructure, and the question as to how we might model the domain is by now already well determined (for this current iteration). We have offered TCS and the TDWG ontology for export for many years, but clients generally need to do what we do with these data, and that is just not possible using any of these standards. Delivery is always a compromise. Loss of information, a high barrier for understanding, the lack of adequate semantics, inappropriate generalisations, the “name”, “taxon”, “taxon concept” align/argument ... all contribute to a very poor standing on the reusability index.

Reusability, Interchange, knowing that the data delivered will be reasonably well understood, and represented correctly when it shows up elsewhere. These are the competency questions we are looking for now. A vocabulary for names and classifications, enabling lossless interchange of data (import, export) and good support for their discovery and extract.

At one level, between systems, for users like @rdmpage, a TCS+2 will very likely be the go. But when we deliver data it more often goes to support the taxonomic process, or into local lookup services, reused in controlled vocabularies, for checklist maintenance - into systems that work with the names of taxa. I would like to think that both use cases are possible with a TCS2 modelled as an application profile/ontology over a basic TDWG Names and Trees vocabulary.


rdmpage commented on July 17, 2024

Thanks Greg, I think there are two things here.

From my perspective the failure of previous attempts rests on several things: the expectation that users would use multiple SPARQL endpoints, the poor quality of the RDF (most of it not linked in any meaningful sense, just an RDF serialisation of data silos), the lack of rich content that people actually want (e.g., the absence of the literature), etc. I would argue that if we create properly linked data we can build rich clients on top of a centralised SPARQL server. My GBIF challenge entry is a proof of concept https://ozymandias-demo.herokuapp.com and this is built on the LSID TCS vocabulary (supplemented by a vocabulary by @frmichel that handles things TCS makes awkward to do) and http://schema.org

What isn't clear to me is whether the previous failures are due to:

- limitations or complexity of TCS (is TCS comprehensible, does it do what we want?)
- limitations in the available data (we have lots of RDF for names, most of it problematic and weakly connected, if at all)
- insufficient interest in the problem TCS was meant to solve (people have created massive, heavily used databases without TCS; maybe it's not actually needed?).

You write:

> Reusability, Interchange, knowing that the data delivered will be reasonably well understood, and represented correctly when it shows up elsewhere. These are the competency questions we are looking for now. A vocabulary for names and classifications, enabling lossless interchange of data (import, export) and good support for their discovery and extract.

For the sake of argument I'm asserting that if we used TCS and had properly described and linked data, we could do all this. Note that I'm not saying that I necessarily believe this, I'm simply asking whether it's possible. In other words, if we have good tools and documentation based on TCS can we achieve the goals you outline?


mdoering commented on July 17, 2024

Can we agree to refer to the ratified standard which is an XML Schema as TCS and to the TCS ideas ported to RDF as the TDWG Ontology? I find this confusing.


rdmpage commented on July 17, 2024

@mdoering Apologies for any confusion, my interest in this topic dates from the LSID discussions of 2005 onwards so I’ve never paid XML schema any attention, focussing instead on RDF. If using **TDWG Ontology** helps avoid confusion I’ll happily use that.


nfranz commented on July 17, 2024

Re: #7 (comment). Excellent, looks great. Thanks, @rdmpage

P.s.: In this paper https://www.researchgate.net/publication/252228152_Perspectives_Towards_a_language_for_mapping_relationships_among_taxonomic_concepts, page 9, Table 3, I listed a number of terms that in my view should mostly/somehow find their way into an updated TCS, because they are useful. For instance, with TCS2 we should be able to express, in the case of "splitting", that {2005.TCL1 + 2005.TCL2 + 2005.TCL3} == 1993.TCL4. Where (e.g.) the taxonomic name Microcebus murinus may participate both in TCL1 and TCL4.
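One way to make the "splitting" statement concrete is to treat each concept as a set of members and check that the parts add up to the whole. This is only a sketch: the member labels (`s1` … `s6`) are hypothetical, and requiring the parts to be pairwise disjoint is an assumption about what "splitting" should mean, not something taken from Table 3.

```python
# Hypothetical extensions: each concept label maps to the set of things it contains.
CONCEPTS = {
    "1993.TCL4": {"s1", "s2", "s3", "s4", "s5", "s6"},
    "2005.TCL1": {"s1", "s2"},
    "2005.TCL2": {"s3", "s4"},
    "2005.TCL3": {"s5", "s6"},
}


def is_split_of(parent, parts, concepts):
    """True if the parts are pairwise disjoint and their union equals the parent."""
    members = [concepts[p] for p in parts]
    union = set().union(*members)
    # Parts are pairwise disjoint exactly when no member is counted twice.
    disjoint = sum(len(m) for m in members) == len(union)
    return disjoint and union == concepts[parent]


# {2005.TCL1 + 2005.TCL2 + 2005.TCL3} == 1993.TCL4
print(is_split_of("1993.TCL4", ["2005.TCL1", "2005.TCL2", "2005.TCL3"], CONCEPTS))
```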


rdmpage commented on July 17, 2024

@nfranz I confess my initial reaction to Table 3 is a concern that adding more and more terms to describe relationships risks making things more complicated than they need to be. For example, if a relationship can be derived by a query, then does there also need to be a term for that relationship? That said, the “plus” and “minus” terms could be used to describe relationships between classifications in terms of tree edit operations, which strikes me as more economical than listing mappings. So, maybe having more terms will help support alternative mechanisms for describing taxonomic change.
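As an illustration of a relationship being derived rather than asserted: if the members of two concepts are known, the RCC5 relation between them follows from plain set comparison. A minimal sketch, where the function name, the symbol notation, and the example members are all assumptions for illustration:

```python
def rcc5(a: set, b: set) -> str:
    """Derive the RCC5 relation between two concepts from their member sets."""
    if a == b:
        return "=="  # congruent
    if a < b:
        return "<"   # a is properly included in b
    if a > b:
        return ">"   # a properly includes b
    if a & b:
        return "><"  # partial overlap
    return "!"       # disjoint


# Hypothetical members of an older, broader concept and a newer, narrower one.
older = {"s1", "s2", "s3"}
newer = {"s1", "s2"}
print(rcc5(older, newer), rcc5(newer, older))
```

This only works where circumscriptions are available as explicit member lists, which is often not the case. Asserted relationship terms would still be needed wherever the members are unknown.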


nfranz commented on July 17, 2024

@rdmpage - thanks. A counterpoint here would be that all these terms are spatial, and hence compatible with and informative for spatial logic reasoning. Some are shortcuts for convenient human use, yes, and not representing them would be fine for reasoning purposes.

Another way of saying this: the terms give someone an opportunity to "creatively" assert regions of congruence between classifications where such instances of congruence may not be very obvious. To paraphrase an example: "Take away one concept in classification 1 from this parent and add it to that parent, and then you have congruence otherwise with classification 2". Maximizing opportunities to express congruence (RCC-5: ==), in turn, allows reasoning approaches to be maximally "greedy" in terms of deducing other spatial relationships between classifications through transitivity rules. In that context, it helps to have a more diverse relationship vocabulary.
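The paraphrased example ("take away one concept from this parent and add it to that parent") can be sketched as a single tree edit on a child-to-parent map. The two toy classifications and both function names below are hypothetical. The sketch only detects the case where exactly one move separates the two classifications and says nothing about the reasoning that would follow.

```python
def move(classification, concept, new_parent):
    """Return a copy of a child -> parent map with `concept` re-attached under `new_parent`."""
    edited = dict(classification)
    edited[concept] = new_parent
    return edited


def single_move_to_congruence(c1, c2):
    """Return (concept, new_parent) if one move turns c1 into c2, else None."""
    if c1.keys() != c2.keys():
        return None
    diffs = [(child, c2[child]) for child in c1 if c1[child] != c2[child]]
    return diffs[0] if len(diffs) == 1 else None


# Hypothetical classifications: the same three concepts, one placed differently.
C1 = {"a": "P", "b": "P", "c": "Q"}
C2 = {"a": "P", "b": "Q", "c": "Q"}

edit = single_move_to_congruence(C1, C2)
print(edit)
print(move(C1, *edit) == C2)
```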
