
datadrivenecologicalsynthesis / weirdestspeciescombination

What/where are the weirdest species combinations on Earth? 🐧🌴

Home Page: https://datadrivenecologicalsynthesis.github.io/WeirdestSpeciesCombination/

License: MIT License

R 2.53% HTML 97.47%
biodiversity-patterns species-distributions traits

weirdestspeciescombination's People

Contributors

afilazzola, benmersci, cdrobich, scisus, tpoisot

weirdestspeciescombination's Issues

TRY traits to retrieve

I uploaded a .csv of potential traits to pull for the trait requests. I tried to include a variety of options that should have enough data, and I'd like people's input on it before requesting the data.

To do for tomorrow:

  1. Compare the standard deviation among the quadrats (so we can say which one is the weirdest, i.e. a within-city comparison)

  2. Compare the community composition (species list) among cities and determine how much overlap there is (e.g. are 90% of the species in each city the same?)

  3. Figure out the extreme ends of the traits within each city? (like are they all Juniper vs Impatiens?)

  4. Make a species richness x quadrat barchart

  5. Assign parts to folks & write up the scripts etc.
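A rough sketch of to-do items 1 and 2, in Python rather than the repo's R. Everything here is an assumption for illustration: the function names, the use of the Jaccard index as the overlap measure, and the made-up species lists. The real inputs would be the per-city species lists and per-quadrat trait values.

```python
from itertools import combinations
from statistics import stdev


def jaccard_overlap(species_by_city):
    """Return {(city_a, city_b): shared species / all species} for every city pair."""
    overlap = {}
    for a, b in combinations(sorted(species_by_city), 2):
        set_a, set_b = set(species_by_city[a]), set(species_by_city[b])
        union = set_a | set_b
        overlap[(a, b)] = len(set_a & set_b) / len(union) if union else 0.0
    return overlap


def trait_sd_by_quadrat(records):
    """records: iterable of (quadrat, trait_value) pairs. Returns {quadrat: sample SD}."""
    values = {}
    for quadrat, value in records:
        values.setdefault(quadrat, []).append(value)
    # stdev needs at least two values, so quadrats with fewer are skipped
    return {q: stdev(v) for q, v in values.items() if len(v) >= 2}


# Made-up example data, not taken from the project
cities = {
    "Toronto": ["Acer saccharum", "Impatiens capensis", "Juniperus virginiana"],
    "Montreal": ["Acer saccharum", "Impatiens capensis", "Betula papyrifera"],
}
print(jaccard_overlap(cities))
```

A Jaccard value near 0.9 would correspond to the "are 90% of the species the same?" case. The quadrat with the largest standard deviation would be the candidate for "weirdest" within a city.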

Getting TRY data into GitHub

We got the TRY data for the first request (1000 species) as a .txt file. It came with a readMe that makes it easy to read the data into R and write a .csv, but the .csv is too large for GitHub: it has over 2 million rows and is 1081 MB (the GitHub upload limit is 25 MB). I think we have to write some kind of loop to break it into small enough .csv files (quick math: 1081/25 ≈ 43.2, so at least 44 files). Or can someone who is better at this than me find a workaround?

To make it accessible for everyone I will share the TRY data email with everyone.
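One possible version of the "loop to break it down" idea, sketched in Python rather than the repo's R. It assumes a 25 MB per-file limit and a UTF-8 encoded .csv. The function name, output naming scheme, and file paths are hypothetical.

```python
import csv
import io
from pathlib import Path


def _row_size(row, encoding):
    """Number of bytes the row occupies once written as a CSV line."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return len(buf.getvalue().encode(encoding))


def split_csv(src, out_dir, max_bytes=25 * 1024 * 1024, encoding="utf-8"):
    """Split src into numbered CSV parts, each repeating the header.

    Parts stay at or under max_bytes unless a single row is itself larger.
    Returns the list of part paths in order.
    """
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open(newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)
        header_bytes = _row_size(header, encoding)
        out, writer, size = None, None, 0
        for row in reader:
            row_bytes = _row_size(row, encoding)
            # Start a new part when the next row would push this one over the limit
            if out is None or size + row_bytes > max_bytes:
                if out is not None:
                    out.close()
                path = out_dir / f"{src.stem}_part{len(parts) + 1:03d}.csv"
                out = path.open("w", newline="", encoding=encoding)
                writer = csv.writer(out)
                writer.writerow(header)
                size = header_bytes
                parts.append(path)
            writer.writerow(row)
            size += row_bytes
        if out is not None:
            out.close()
    return parts


# Hypothetical usage (paths are placeholders):
# split_csv("data/try_request1.csv", "data/try_parts")
```

Because every part repeats the header, the pieces can be read back and row-bound in order to rebuild the full table.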

Tidy data folder

Data folder would be clearer with a short meta file describing what's there and where it's from.

Might also want to put species lists into a folder and then fix paths in scripts.

Get rid of duplication in matchGbifTry.R

Figure out if there's anything left in matchGbifTry.R that we want and move it somewhere useful. GetTRY_ids.R duplicates some of matchGbifTry.R's functionality, which is confusing.

Why only urban areas?

Should we include non-urban areas?

We chose (near) urban areas because of sampling density and human interest.

But it might be interesting to include non-urban areas near cities as comparisons.

Workflow description

Write a brief description of the workflow - order to run scripts and TRY steps.

Final figures

Place to brainstorm and organize figures

  1. Histogram showing SLA distribution among the 6 cities

    • Do we also want to do this by quadrat?
  2. List of weird species (5th and 95th quantiles)

    • By city?
    • By quadrats across cities?
    • Quadrats within each city?
    • All species?
  3. How to illustrate differences in weirdness within a city

  4. How to illustrate differences in weirdness among all cities

Generate Master species list

  1. Fix the for loop
  2. Generate species lists for all 6 cities
  3. Generate one Master list that includes all the species names and a column that sums all of their occurrences
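A minimal sketch of step 3, in Python rather than the repo's R. It assumes each city's input is a list of species names with one entry per occurrence record. The function name, column names, and example names are made up for illustration.

```python
from collections import Counter


def build_master_list(occurrences_by_city):
    """Build one row per species with per-city counts and a summed total.

    occurrences_by_city: {city: iterable of species names, one per occurrence}.
    """
    per_city = {city: Counter(names) for city, names in occurrences_by_city.items()}
    total = Counter()
    for counts in per_city.values():
        total.update(counts)

    cities = sorted(per_city)
    rows = []
    for species in sorted(total):
        row = {"species": species}
        for city in cities:
            row[city] = per_city[city][species]  # Counter returns 0 when absent
        row["total_occurrences"] = total[species]
        rows.append(row)
    return rows


# Made-up example data
example = {
    "Toronto": ["Acer saccharum", "Acer saccharum", "Impatiens capensis"],
    "Vancouver": ["Acer saccharum", "Thuja plicata"],
}
for row in build_master_list(example):
    print(row)
```

Keeping the per-city columns next to the total makes it easy to see whether a species is common everywhere or only in one city.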

Pull trait data for Toronto from TRY

We haven't settled on a final list yet, but we will use these (or some of these) traits suggested by @BenMerSci in issue #34:

SLA
Seed dry mass
Plant woodiness
Plant height vegetative
Leaf thickness
Leaf phenols content per leaf dry mass (proxy for palatable/"toxic"?)
Leaf petiole length
Leaf nitrogen (N) content per leaf dry mass

Pull trait data for Toronto plants from TRY. Accession numbers for Toronto plants are now in the data/tryAccession folder.

LICENSE display

Minor issue, but I noticed that our LICENSE isn't displaying like it does in the other projects: when we click on it, we can't see what it permits or not.
It feels like it's being treated as a normal text file and not an actual LICENSE.
I can't find what the problem is, though.

What taxa will we include?

We've currently limited it to plants because of trait data availability and our own interest.

We may need to limit it further if we want to include a larger area.

Find weird species

Define weird species

  • overall (globally weird)
  • and for each city (locally weird)

using 5th and 95th percentiles.
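A sketch of the percentile rule, in Python rather than the repo's R. It assumes one trait value per species (for example a mean SLA), and the function names and inputs are hypothetical. "Globally weird" uses cut-offs from all species pooled, and "locally weird" recomputes the cut-offs within each city.

```python
from statistics import quantiles


def weird_species(trait_by_species, lower=5, upper=95):
    """Return species below the lower or above the upper percentile of the trait."""
    values = list(trait_by_species.values())
    if len(values) < 2:
        return {"low": [], "high": []}
    # 99 cut points; cuts[k - 1] is the k-th percentile (linear interpolation)
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower - 1], cuts[upper - 1]
    return {
        "low": sorted(s for s, v in trait_by_species.items() if v < lo),
        "high": sorted(s for s, v in trait_by_species.items() if v > hi),
    }


def weird_species_by_city(trait_by_species, species_by_city, lower=5, upper=95):
    """Apply the same rule within each city, using only that city's species."""
    result = {}
    for city, species in species_by_city.items():
        local = {s: trait_by_species[s] for s in species if s in trait_by_species}
        result[city] = weird_species(local, lower, upper)
    return result
```

A species can be locally weird without being globally weird (and the reverse), so it may be worth reporting both lists side by side.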

Limitations in the number of occurrences

Problems:

  • There are close to 500k observations within the Toronto boundary. GBIF allows downloads in batches of 500, with a maximum of 200,000 per day. That is too many to download, and it takes a long time even in smaller batches.
  • The Toronto boundary is uneven because it is based on rounding lat & lon, and the lake is in the way. We need to create a polygon that is comparable to those of other cities.
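One way to get comparable polygons is to use the same-sized square around each city centre. The sketch below (Python, not the repo's R) builds such a square as WKT. The function name, the 111.32 km-per-degree approximation, and the example centre coordinates are assumptions. The corners are listed counter-clockwise, which, as far as I know, is the winding order GBIF expects for WKT polygons. It does not deal with the lake.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate; good enough for city-scale boxes


def square_wkt(lat, lon, half_width_km):
    """Return a WKT POLYGON for a square centred on (lat, lon).

    Longitude spacing is corrected by cos(latitude) so the square covers
    roughly the same ground area at any latitude.
    """
    dlat = half_width_km / KM_PER_DEGREE_LAT
    dlon = half_width_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    west, east = lon - dlon, lon + dlon
    south, north = lat - dlat, lat + dlat
    # Counter-clockwise ring: SW -> SE -> NE -> NW -> back to SW
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    ring = ", ".join(f"{x:.5f} {y:.5f}" for x, y in corners)
    return f"POLYGON(({ring}))"


# Approximate centre of Toronto (illustrative), 10 km half-width
print(square_wkt(43.65, -79.38, 10))
```

Using the same half-width for every city keeps the sampled area roughly equal, which also helps keep the number of occurrences per download request manageable.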

Choose cities to include

Consider

  • How many? (Enough to be interesting, not so many that it's overwhelming)
  • Different biogeographical regions?

What about a city would make it have weird(er) species assemblages?

I think that @BenMerSci 's idea about adding other layers like climate might be useful for answering this question (if we decide to pursue it.)

Group discussion pointed out that city connectivity (e.g. ports) might matter.

I think that Vancouver might win because so many kinds of plants can survive its mild climate. But that might depend on how many observations got made at the Montreal Botanical Gardens!

What is weird?

How exactly are we defining weird?

Thoughts from the group included

  • defining a normal and seeing what would violate that
  • what makes you say "whoa!"
  • trait ordination and seeing which assemblages span the widest range of traits in that space

How are we handling non-native species? Does the dataset we're using include them?
