Comments (9)
I like this approach, except that "identifying and representing author objects" should be embedded into the reference/literature components. Authorities of names and concepts (i.e., the sec authors) should be inherited from linked references (literature citations). I would say very simply that the scope of TCS2 should be about Names of organisms and their relationships to defined or implied Circumscriptions and Classifications through References. To parse that out a bit (capitalization intentional):
Names are text-string labels applied to organisms. I assume we agree that the scope of names includes those that fall under the three major Codes (i.e., Linnean-style names). Does it also include semi-Linnean-style names (e.g., provisional names assigned to morphospecies and other taxa lacking formal scientific names, in combination with a Linnean-style genus or higher-rank name, in the form of "Aus sp123")? Does it include names governed by the Code for viruses? What about vernacular names (especially important for birds)? The scope in this sense is important because it points to what sorts of properties and relationships of "Names" TCS2 needs to accommodate. Perhaps we start with the three main Codes (Linnean-style names), then expand it with extensions to accommodate other kinds of names? The properties of the Code-governed names include the vocabularies for ranks, links to type specimens, links to References in which relevant nomenclatural acts occur, etc.
Circumscriptions are either defined (through sets of characters, character states, enumerated individuals, etc.) or implied (through synonymies, links to other circumscriptions, context, etc.), and represent asserted sets of organisms to which names are applied. The assumption is that the circumscribed set of organisms share certain properties that can be referenced in aggregate through the name (with the name serving as a proxy for the set of organisms). Going back to earlier discussions, I think it's safe to say that "Circumscriptions" is synonymous with "Taxonomic Concepts", but the word Circumscription is perhaps more precise and less burdened with all the confusion, alternate interpretations and other forms of intellectual baggage that "Taxonomic Concepts" carries. Identifications/determinations of organisms (e.g., occurrence records in GBIF) are probably the most ubiquitous external entities that connect to Circumscriptions (through Names), but all sorts of other things are linked to these Circumscriptions, such as roles as disease vectors, assertions about evolutionary relationships, and basically almost anything else people document about biology in a generalized way. Circumscriptions can be thought of as nodes in a tree, and their properties are the collective set of properties of all other nodes and branches "below" them (i.e., towards the leaves).
Classifications are assertions about hierarchical placements and arrangements of Circumscriptions with respect to each other. This can be thought of as Linnean-style hierarchies, or in the case of trees, it includes all the nodes and branches "above" a particular Node (i.e., towards the trunk).
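To make the tree analogy in the two paragraphs above concrete, here is a minimal Python sketch. The class and field names (`Circumscription`, `members`, `all_members`, `classification`) are hypothetical and not taken from any TCS2 draft, the taxon names are placeholders, and enumerated specimens merely stand in for whatever actually defines a circumscription.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Circumscription:
    """Hypothetical node: an asserted set of organisms that a name is applied to."""

    label: str
    parent: Circumscription | None = None
    children: list[Circumscription] = field(default_factory=list)
    members: set[str] = field(default_factory=set)  # directly enumerated individuals

    def add_child(self, child: Circumscription) -> Circumscription:
        """Place `child` directly below this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def all_members(self) -> set[str]:
        """Collect everything "below" this node (towards the leaves)."""
        result = set(self.members)
        for child in self.children:
            result |= child.all_members()
        return result

    def classification(self) -> list[str]:
        """List the labels of everything "above" this node (towards the trunk)."""
        path = []
        node = self.parent
        while node is not None:
            path.append(node.label)
            node = node.parent
        return path


# Placeholder names and specimen identifiers, for illustration only.
family = Circumscription("Aidae")
genus = family.add_child(Circumscription("Aus"))
species = genus.add_child(
    Circumscription("Aus bus", members={"specimen-1", "specimen-2"})
)

print(sorted(family.all_members()))
print(species.classification())
```

In this sketch the circumscription of a higher node is simply the union of what lies below it, while the classification of a node is the path above it, which keeps the two notions separate.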
References include all manner of publications, plus unpublished documents, and subsections of both. Names, Circumscriptions and Classifications only exist because they are asserted through References. Therefore, all of them fundamentally link to (are anchored to) a particular Reference. The details of how References are represented through structured information are outside the scope of TCS2; but because References are so fundamental to the other core components of TCS2 (Names, Circumscriptions, and Classifications), we need to minimally define them as such within TCS2, and ensure that the right granularity of properties for References is captured (e.g., date precision for nomenclature; subsection granularity for nomenclatural acts credited to one author team that occur within a Reference unit traditionally represented in a bibliographic citation that is credited to a different author team). Critically, all authorship credits/citations for names (nomenclatural authorities) and Circumscription assertions (sec authorship) should be inherited through links to References. That is, there should be no direct links from Names, Circumscriptions, or Classifications to Authors. The links need to be to References, through which the authorships are inherited.
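As a rough illustration of what "inherited through links to References" could look like, here is a minimal Python sketch. All class names, field names, identifiers, and the sample authors are made up for the example; nothing here is proposed TCS2 vocabulary. The point is only that neither `Name` nor `Circumscription` carries an author field of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical Reference record; the only place authors are stored."""

    reference_id: str
    authors: tuple[str, ...]
    year: int


@dataclass(frozen=True)
class Name:
    """A name links to the Reference in which it was established."""

    name_string: str
    reference_id: str


@dataclass(frozen=True)
class Circumscription:
    """A circumscription links to the Reference asserting it (the "sec" Reference)."""

    name: Name
    reference_id: str


def authorship(reference_id: str, references: dict[str, Reference]) -> str:
    """Derive an authorship string by following the link to a Reference."""
    ref = references[reference_id]
    return f"{' & '.join(ref.authors)}, {ref.year}"


# Invented sample data, for illustration only.
references = {
    "ref-1": Reference("ref-1", ("Smith",), 1900),
    "ref-2": Reference("ref-2", ("Jones", "Lee"), 2015),
}
name = Name("Aus bus", reference_id="ref-1")
concept = Circumscription(name, reference_id="ref-2")

print(f"{name.name_string} {authorship(name.reference_id, references)}")
print(f"sec. {authorship(concept.reference_id, references)}")
```

Because authors live only on the Reference, correcting an author on one Reference record is reflected in every name and circumscription that links to it.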
Wow.... that ended up being WAY longer than I intended it to be! But perhaps the first and simplest question for scoping TCS2 is the Names part. Once we decide how narrow or broad the scope of names we want it to accommodate should be, the rest of the scoping should follow more easily.
from tnc.
@mdoering Regarding your list
1. structured name objects, nomenclatural relations and their status according to the codes?
2. core vocabularies such as ranks?
3. identifying and representing author objects?
4. type specimens, species and subsequent designations?
5. how does TCS2 deal with literature citations?
I would make a plea, that for (3) and (5) we defer to existing vocabularies, in particular http://schema.org/, e.g. https://schema.org/Person and https://schema.org/ScholarlyArticle. This avoids domain-specific vocabularies and aligns what is done here with things going on in the wider world. For more fine grained citation I'd argue that the W3C annotation working group provides pretty much everything needed.
That leaves 1, 2, and 4. The next question is why the existing TDWG LSID vocabulary, already used by ION, Index Fungorum and IPNI, is not sufficient for our purposes. IPNI, for example, uses it to describe not just names but their typification (i.e., 4).
I think it's a reasonable null hypothesis that it's pretty much what we need, subject to some tweaks. If it's not, how about we clearly state why it's not?
from tnc.
I completely agree with @rdmpage on 3 & 5 (authors are part of the reference vocabularies, which are out of scope for TCS2, except perhaps as some sort of "verbatim" capture).
I also tend to agree with @rdmpage on the TDWG LSID vocabularies, except I need to review them. DwC terms cover most of what we need (as noted, with some tweaks).
from tnc.
I am wondering about authors because we do have domain specific interests and properties in them.
Also TDWG has a history of standards for authors and literature if you check the prior-standards. Databases with nomenclatural authors like IPNI usually track area of interest or collections codes they used to deposit types: http://beta.ipni.org/?q=author%20surname%3AMiller
If authors and citations are not covered, should TCS2 at least recommend something existing? For data exchange we need a more comprehensive specification somewhere. That is also something the old TCS was lacking: citations were left out. To reach interoperability I think we would have to nail things down a bit more. That, to me, is a major reason why DwC archives are much more widely used than TCS, combined with the simplicity and interoperability of delimited text, of course, which is reason number one.
from tnc.
@mdoering I don't see anything in the IPNI search that necessarily requires a domain-specific vocabulary, and in many ways these are things that could be derived from a query rather than being stated in a database. For example https://ozymandias-demo.herokuapp.com/?uri=https://biodiversity.org.au/afd/publication/%23creator/r-mesibov "knows" that Bob Mesibov works on millipedes (POLYDESMIDA) by doing a SPARQL query over the names he has published and the ALA taxonomic classification.
We are also heading towards a situation where every taxonomist is likely to have a Wikidata entry and/or ORCID (the Wikispecies folks are adding taxonomists like crazy), and many Wikidata entries are linked to identifiers such as IPNI author ids. So I think we can use Wikidata as a way to help define scope. If it's in Wikidata we can stop ;)
from tnc.
This all sounds great, I would add two things though, coming at this from a concept/circumscription perspective as opposed to a name perspective.
6. How does TCS2 define a taxonomic concept
7. How will names and name usages be mapped to taxonomic concepts
@deepreef I would add to your "Classifications are assertions about hierarchical placements and arrangements of Circumscriptions with respect to each other" and names to be applied to these Circumscriptions.
from tnc.
@mdoering : Literature/Reference data and authors as objects are vital to taxonomy and nomenclature. My point (which I think was also @rdmpage 's point - but not certain) was that they (and their vocabularies) ought to be managed outside the scope of TCS2. The parts within scope for TCS2 vocabularies would be limited to something like referenceID and perhaps things like verbatimReferenceCitation and verbatimAuthorship. One of the mistakes that taxonomic databases often make (in my opinion) is to try to link names/TNUs/concepts directly to authors, rather than indirectly to authors via Reference instances (which can include units of "Reference" that are more granular than what are traditionally cited in bibliographies, such as individual taxon treatments).
Stepping back to TDWG in general, there are definitely some domain-specific issues that we need to deal with that are not always addressed by library-based vocabularies or other vocabularies. Some examples include:
Reference instances:
- more granular units of Reference instances (such as individual treatments, as sub-components to more traditional units like journal article or book chapter).
- more specific kinds of dates associated with references, which differ somewhat from traditional publication/library dating requirements
Agent instances:
- more robust capabilities for aliases (synonyms)
- treating agents as defined entities separately from the names applied to the agents (related to previous one)
Authorship (relationship of Agent to Reference) instances:
- non-traditional roles of Agents with respect to References, such as "ex." authorships
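The three groups of requirements listed above can be sketched roughly as follows. This is a hypothetical Python illustration, not an existing vocabulary: the class names, the `part_of` link, the `role` values, and all identifiers and titles are assumptions made for the example. The fallback rule in `credited_authorships` (use the containing Reference when a sub-unit has no authorship of its own) is likewise just one possible design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Agent:
    """An agent is an entity in its own right; name strings are only aliases."""

    agent_id: str
    aliases: list[str] = field(default_factory=list)


@dataclass
class Authorship:
    """Relationship of an Agent to a Reference, with an explicit role."""

    agent: Agent
    role: str = "author"  # assumed values: "author", or "ex" for ex-authorships


@dataclass
class Reference:
    """A citable unit; `part_of` allows units finer than a whole article."""

    reference_id: str
    title: str
    authorships: list[Authorship] = field(default_factory=list)
    part_of: Reference | None = None

    def credited_authorships(self) -> list[Authorship]:
        """Return own authorships, else those of the nearest containing Reference."""
        node = self
        while node is not None:
            if node.authorships:
                return node.authorships
            node = node.part_of
        return []


# Invented sample data, for illustration only.
article = Reference(
    "ref-10",
    "A revision of the genus Aus",
    [Authorship(Agent("agent-1", ["J. Smith", "Smith, J."]))],
)
treatment = Reference(
    "ref-10-t1",
    "Aus bus, new species",
    [
        Authorship(Agent("agent-2", ["A. Jones"])),
        Authorship(Agent("agent-3", ["B. Lee"]), role="ex"),
    ],
    part_of=article,
)
introduction = Reference("ref-10-s1", "Introduction", part_of=article)

for unit in (treatment, introduction):
    credits = [(a.agent.agent_id, a.role) for a in unit.credited_authorships()]
    print(unit.reference_id, credits)
```

Here the treatment is credited to a different author team than the article that contains it, while the introduction simply inherits the article's authorship.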
TDWG spent several years trying to develop a domain-specific standard for References, and got most of the way there, but it never matured to the point where it was adopted. Existing library standards didn't quite cover all our particular needs, and usually included way too much library-specific detail. Probably the best is the NLM/NCBI Journal Publishing DTD with the TaxPub extension, but of course we also have to deal with kinds of publications (and unpublished documents) besides those that appear within journals.
The TDWG efforts on literature didn't delve too deeply into authors, but I think that is reasonably covered by the FOAF vocabulary.
from tnc.
- How does TCS2 define a taxonomic concept
As I mentioned above, I think we should completely avoid the term "taxonomic concept" and focus instead on the more explicit and less baggage-laden "Circumscription". Are there aspects of your notion of "taxonomic concept" that do not fall within the scope of how we would define a "Circumscription"?
- How will names and name usages be mapped to taxonomic concepts
I think TNUs have a 1:1 relationship with defined or implied circumscriptions, so if we accept that "taxonomic concept" = "circumscription", then the mapping is self-evident. Similarly, the relationship between "names" and TNUs is very well defined (for just about every definition of a "name"), so that should be relatively straightforward as well.
@deepreef I would add to your "Classifications are assertions about hierarchical placements and arrangements of Circumscriptions with respect to each other" and names to be applied to these Circumscriptions.
I think the way that names are applied to Circumscriptions is different from Classifications. Basically, the relationship between names and Circumscriptions includes all the heterotypic synonymy stuff and homotypic synonyms involving replacement names (i.e., how many name-bearing type specimens fall within a particular Circumscription). Classifications simply deal with the hierarchical arrangement of the Circumscriptions with respect to each other. The one exception is for names below the rank of genus, where the "name" also incorporates elements of classification. This is the homotypic synonymy stuff (excluding replacement names), where the name itself serves two functions (one in labeling the Circumscription, and one in asserting a hierarchical classification).
from tnc.
We need to make sure that the outcomes of this discussion make it into the new specification. In the 6/7 Nov. meeting we decided to start working on (as in move properties into) the TaxonomicName and TaxonomicNameUsage classes. We decided on these names in this meeting, but they are really the result of the discussions in issue #1 and the 17/18 Sep. meeting.
While recognising the importance of the Relationship class, we decided to leave that for the new year and focus on the above-mentioned two classes first.
We might add Author (or Agent) and Reference (Document?) as auxiliary classes that we don't define ourselves – nor do we define most of their properties – but still need. I would like to suggest the Bibliographic Ontology (BIBO) as another option to consider for the references (@ghwhitbread always says that the NSL is a bibliographic system).
I think the W3C Web Annotation Vocabulary and Data Model could be very useful, especially for the relationship assertions.
from tnc.