rhagenson / bio-tool-requests
Requests for Bioinformatics tools people want, but nobody has built
License: MIT License
Currently I am thinking the input will be a line-based, CSV-style format with rows of unequal length, something like:
keyword | year | ... | ... |
---|---|---|---|
describe | 1795 | Microcebus murinus | |
describe | 1795 | Microcebus rufus | |
split | 1994 | Microcebus murinus | Microcebus myoxinus |
describe | 2008 | Microcebus arnholdi | |
describe | 2008 | Microcebus margotmarshae | |
synonym | 2006 | Microcebus mamiratra | Microcebus lokobensis |
The alternative I see to this "square" form is a very non-standard arrow syntax:
describe: 1795->Microcebus murinus
describe: 1795->Microcebus rufus
split: 1994->Microcebus murinus->Microcebus myoxinus
describe: 2008->Microcebus arnholdi
describe: 2008->Microcebus margotmarshae
synonym: 2006->Microcebus mamiratra->Microcebus lokobensis
Descriptions are unary, while splits and synonymizations are binary, so a format that blurs the two is nice. Still, I think the more standard format is the better option, as it will integrate with other tools more easily.
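To make the pipe-delimited form concrete, here is a minimal parsing sketch. The `Event` class, the `ARITY` table, and the `parse_events` function are hypothetical names introduced for illustration only; the sketch assumes one event per line, that header and separator rows can be skipped, and that `describe` takes one name while `split` and `synonym` take two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    keyword: str   # "describe", "split", or "synonym"
    year: int
    species: tuple  # one name for describe, two for split/synonym


# Assumed arity per keyword: descriptions are unary, the others binary.
ARITY = {"describe": 1, "split": 2, "synonym": 2}


def parse_events(text):
    """Parse pipe-delimited event lines into a list of Event records."""
    events = []
    for line in text.splitlines():
        # Split on "|" and drop the empty cells left by ragged rows.
        cells = [cell.strip() for cell in line.split("|")]
        cells = [cell for cell in cells if cell]
        # Skip blank lines, the header row, and the "---" separator row.
        # Note: this also silently skips rows with an unknown keyword.
        if not cells or cells[0] not in ARITY:
            continue
        keyword, year, *names = cells
        if len(names) != ARITY[keyword]:
            raise ValueError(f"{keyword} expects {ARITY[keyword]} name(s): {line!r}")
        events.append(Event(keyword, int(year), tuple(names)))
    return events


SAMPLE = """\
keyword | year | ... | ... |
---|---|---|---|
describe | 1795 | Microcebus murinus | |
split | 1994 | Microcebus murinus | Microcebus myoxinus |
synonym | 2006 | Microcebus mamiratra | Microcebus lokobensis |
"""

events = parse_events(SAMPLE)
```

Because empty cells are dropped before the arity check, the same function accepts both padded ("square") rows and ragged rows.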
Initial idea is to build a Sankey diagram:
This would break the complete history into the years of events (time moving rightward on the diagram). For visual immediacy, the assumption would be that a two-way split means a 50/50 split, a three-way split is 33/33/33, etc., and the reverse for a synonym (50/50 to 1). A future addition could be estimated population sizes at time points, but then the numbers would need to be normalized across years to retain at least some visual consistency (a population decrease is possible, but we would want to prevent the diagram from looking like a sudden population crash or surge, as this is not the plot for that purpose). The starting block would just be labeled with the Genus, and the first year block would be the first described species.
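The equal-share assumption can be sketched as follows. The function names `split_links` and `synonym_links` and the `(source, target, width)` tuple layout are illustrative assumptions, not part of any existing tool or plotting library; year labels on the nodes are left out for brevity.

```python
def split_links(parent, children, width=1.0):
    """Divide a parent's flow width equally among the taxa it is split into."""
    share = width / len(children)
    return [(parent, child, share) for child in children]


def synonym_links(widths, target):
    """Merge several taxa into one; the target receives the sum of their widths.

    `widths` maps each merged name to its current flow width.
    """
    return [(name, target, width) for name, width in widths.items()]


# A two-way split: each descendant gets half of the parent's width.
two_way = split_links(
    "Microcebus murinus",
    ["Microcebus murinus", "Microcebus myoxinus"],
)

# A synonymization: two equal-width taxa flow into a single name.
# Which name is retained is an assumption made for this example.
merged = synonym_links(
    {"Microcebus mamiratra": 0.5, "Microcebus lokobensis": 0.5},
    "Microcebus mamiratra",
)
```

The resulting tuples map directly onto the source/target/value triples that most Sankey plotting libraries expect as input.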
The only rejection cases I can think of currently are a split and a synonym occurring in the same year involving the same species, or a describe event occurring in the same year as a split or synonym involving the described species.
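A minimal sketch of that check, assuming events are already available as `(keyword, year, names)` tuples; the function name `find_conflicts` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def find_conflicts(events):
    """Return sorted (year, species) pairs that should cause the input to be rejected.

    A pair conflicts when, within one year, the species appears in both a
    split and a synonym, or is described and also split or synonymized.
    """
    kinds = defaultdict(set)
    for keyword, year, names in events:
        for name in names:
            kinds[(year, name)].add(keyword)

    conflicts = []
    for key, seen in sorted(kinds.items()):
        split_and_synonym = {"split", "synonym"} <= seen
        described_and_changed = "describe" in seen and bool(seen & {"split", "synonym"})
        if split_and_synonym or described_and_changed:
            conflicts.append(key)
    return conflicts


# Hypothetical input: "A" is described and split in the same year.
bad = [
    ("describe", 2008, ("A",)),
    ("split", 2008, ("A", "B")),
    ("describe", 1795, ("C",)),
]
conflicts = find_conflicts(bad)
```

Here only `(2008, "A")` is flagged: `"B"` appears in a single event that year, and `"C"` is described in a year with no other events.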
None that I could find, hence the tool request.
It shows the taxonomy within a Genus over time. It does not show complete taxonomy at a higher level over time, nor does it suggest new taxonomy. That said, I could see some benefit to incorporating a phylogenetic species concept designation for splits and synonyms to show any inconsistencies there, e.g. a split was called at 10% genetic difference, but a synonym was called at 12% genetic difference. In reverse these would be fine, but here the cutoff for how diverged a species must be for description is unclear.
It visualizes an element of how we discuss species in an easily digested manner with high information transfer. We could see quite readily how many species we recognized in a particular year, when new species were described, etc.
There needs to be an issue template that provides a guide for what information is needed. Generally speaking, I would be interested in a black-box description of the tool focused on:
Tool requests must have a general purpose and not be specific to a researcher or lab; by extension, they must use standard file formats.