Visualize Shifting Taxonomy Over Time

Input

Currently thinking the input will be some sort of ragged (unequal column count), CSV-style, line-based input, tab-delimited in this example:

keyword	year	...	...
describe	1795	Microcebus murinus
describe	1795	Microcebus rufus
split	1994	Microcebus murinus	Microcebus myoxinus
describe	2008	Microcebus arnholdi
describe	2008	Microcebus margotmarshae
synonym	2006	Microcebus mamiratra	Microcebus lokobensis

The alternative I see is a very non-standard "square" form that uses an arrow syntax:

describe: 1795->Microcebus murinus
describe: 1795->Microcebus rufus
split: 1994->Microcebus murinus->Microcebus myoxinus
describe: 2008->Microcebus arnholdi
describe: 2008->Microcebus margotmarshae
synonym: 2006->Microcebus mamiratra->Microcebus lokobensis

Descriptions are unary, while splits and synonymizations are binary, so a format that blurs the two is nice, but I think the more standard format is the better option as it will integrate with other tools more easily.
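A minimal sketch of how the tab-delimited form could be read, assuming the first field is the keyword, the second is the year, and every remaining field is a species name. The names `Event`, `parse_events`, and `MIN_NAMES` are hypothetical, and the minimum name counts per keyword are an assumption rather than part of the request:

```python
from dataclasses import dataclass

# Assumed minimum number of species names per keyword (not fixed by the proposal).
MIN_NAMES = {"describe": 1, "split": 2, "synonym": 2}


@dataclass(frozen=True)
class Event:
    keyword: str
    year: int
    species: tuple


def parse_events(text):
    """Parse ragged, tab-delimited event lines into Event records."""
    events = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = [field.strip() for field in line.split("\t")]
        if fields[0] == "keyword":
            continue  # skip the header row
        if len(fields) < 2:
            raise ValueError(f"line {lineno}: expected a keyword and a year")
        keyword, year, *species = fields
        if keyword not in MIN_NAMES:
            raise ValueError(f"line {lineno}: unknown keyword {keyword!r}")
        if len(species) < MIN_NAMES[keyword]:
            raise ValueError(f"line {lineno}: too few species names for {keyword}")
        events.append(Event(keyword, int(year), tuple(species)))
    return events
```

Because the rows are ragged, the species names are collected into a tuple rather than fixed columns, which leaves room for a split into more than two names.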

Output

Initial idea is to build a Sankey diagram:

This would break the complete history into the years in which events occurred (time moving rightward on the diagram). For visual immediacy, the assumption would be that a two-way split is 50/50, a three-way split is 33/33/33, etc., and the reverse for a synonym (50/50 merging into 1). A future addition could be estimated population sizes at each time point, but the numbers would then need to be normalized across years to retain at least some visual consistency (a population decrease is possible, but we would want to prevent the diagram from looking like a sudden population crash or surge, as that is not the purpose of this plot). The starting block would just be labeled with the genus, and the first year block would hold the first described species.
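The equal-share assumption can be sketched as a pair of helpers that produce Sankey links as (source, target, value) tuples. This is only an illustration: the function names are hypothetical, target nodes are labeled with the event year so each year gets its own column, and it assumes that a split line lists every resulting name and that a synonym keeps one named taxon:

```python
def split_links(source, results, year, width=1.0):
    """Links for an n-way split: the source width is divided equally."""
    share = width / len(results)
    return [(source, f"{name} ({year})", share) for name in results]


def synonym_links(members, retained, year, widths):
    """Links for a synonymization: each member's full width flows into the retained name."""
    return [(name, f"{retained} ({year})", widths[name]) for name in members]


two_way = split_links(
    "Microcebus murinus",
    ["Microcebus murinus", "Microcebus myoxinus"],
    1994,
)
merged = synonym_links(
    ["Microcebus mamiratra", "Microcebus lokobensis"],
    "Microcebus mamiratra",
    2006,
    {"Microcebus mamiratra": 0.5, "Microcebus lokobensis": 0.5},
)
```

Tuples of this shape map onto the source/target/value lists that most Sankey plotting libraries expect, so the drawing step can stay separate from the bookkeeping.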

Rejection/Termination Conditions

The only ones I can think of currently are rejecting input where a split and a synonym occur in the same year and involve the same species, or where a describe event occurs in the same year as a split or synonym involving the described species.
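Both rejection rules could be checked in one pass over the events, roughly as below. This is a sketch under the assumption that each event is a `(keyword, year, name, ...)` tuple; `find_conflicts` is a hypothetical name:

```python
from collections import defaultdict


def find_conflicts(events):
    """Return (year, species, reason) for every same-year clash."""
    # Collect the set of keywords seen for each (year, species) pair.
    seen = defaultdict(set)
    for keyword, year, *species in events:
        for name in species:
            seen[(year, name)].add(keyword)

    conflicts = []
    for (year, name), keywords in sorted(seen.items()):
        if {"split", "synonym"} <= keywords:
            conflicts.append((year, name, "split and synonym in the same year"))
        if "describe" in keywords and keywords & {"split", "synonym"}:
            conflicts.append((year, name, "describe in the same year as a split or synonym"))
    return conflicts
```

An empty result means the input passes both checks; anything else can be reported back to the user before any drawing is attempted.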

Similar tools

None that I could find, hence the tool request.

Description

It shows the taxonomy within a genus over time. It does not show complete taxonomy at a higher level over time, nor does it suggest new taxonomy. That said, I could see some benefit to incorporating a phylogenetic species concept designation for splits and synonyms to show any inconsistencies there, e.g. a split was called at 10% genetic difference, but a synonym was called at 12% genetic difference. In reverse these would be fine, but here the cutoff for how diverged a species must be to be described is unclear.
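If divergence values were ever attached to events, the inconsistency described here could be flagged with a check like the following. This is purely illustrative: the input format has no divergence field today, the function name is hypothetical, and treating a tie as inconsistent is an assumption:

```python
def thresholds_consistent(split_divergences, synonym_divergences):
    """True when every synonym was called below every split's divergence."""
    if not split_divergences or not synonym_divergences:
        return True  # nothing to compare against
    return max(synonym_divergences) < min(split_divergences)


# The case from the text: split at 10%, synonym at 12%.
flagged = not thresholds_consistent([10.0], [12.0])
# The reverse case: split at 12%, synonym at 10%.
fine = thresholds_consistent([12.0], [10.0])
```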

Research Purpose

It visualizes an element of how we discuss species in an easily digested, high-information-transfer manner. We could see quite readily how many species we recognized in a particular year, when new species were discovered, etc.

Add issue template

There needs to be an issue template that provides a guide to what information is needed. Generally speaking, I would be interested in a black-box description of the tool focused on:

I/O
Mention of any similar tools (if such exist, otherwise a snowclone of "It is like __ for __, except __.")
How the tool would help advance research (i.e., what is the scope of the work)

Tool requests must have a general purpose and not be specific to a researcher or lab; by extension, they must use standard file formats.

rhagenson / bio-tool-requests
