datadrivenecologicalsynthesis / weirdestspeciescombination

What/where are the weirdest species combinations on Earth? 🐧🌴

Home Page: https://datadrivenecologicalsynthesis.github.io/WeirdestSpeciesCombination/

License: MIT License

R 2.53% HTML 97.47%

biodiversity-patterns species-distributions traits

weirdestspeciescombination's Issues

Submit TRY request for species on the Master list

Requires cleaned GBIF species names (#41), accession numbers (#39), and the list of finalized traits (#34).

TRY traits to retrieve

Uploaded a .csv of potential traits to pull for the trait requests. I tried to include a variety of options that should have enough data, and wanted people's input on it before requesting the data.

To do for tomorrow:

Compare the standard deviation among the quadrats (so we can say which one is the weirdest, i.e. a within-city comparison)
Compare the community composition (species list) among cities and determine how much overlap there is (e.g. are 90% of the species in each city the same?)
Figure out the extreme ends of the traits within each city? (like are they all Juniper vs Impatiens?)
Make a species richness x quadrat barchart
Assign parts to folks & write up the scripts etc.
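
The species-overlap item in the list above could be sketched as follows. This is only an illustration in Python (the project's own scripts are in R); it assumes Jaccard similarity as the overlap measure, which the issue does not specify, and the species lists are made-up examples rather than project data.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of species found in both lists, out of all species in either."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(species_by_city):
    """Return {(city1, city2): jaccard} for every pair of cities."""
    return {
        (c1, c2): jaccard(species_by_city[c1], species_by_city[c2])
        for c1, c2 in combinations(sorted(species_by_city), 2)
    }


# Made-up example lists (assumption), not project data.
example = {
    "Toronto": ["Acer rubrum", "Impatiens capensis", "Juniperus communis"],
    "Vancouver": ["Acer rubrum", "Juniperus communis", "Thuja plicata"],
}
print(pairwise_overlap(example))
```

A value near 0.9 for every pair would correspond to the "90% of the species are the same" scenario raised in the list.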

We got the TRY data for the first request (1000 species) as a .txt file. It came with a readMe that makes it easy to get this data into R and write a .csv; however, the .csv is too large for GitHub. It is over 2 million rows, or 1081 MB (the limit for browser uploads to GitHub is 25 MB). I think we have to write some kind of loop to break it down into small enough .csv files (quick math: 1081/25 ≈ 43.2, so at least 44 files). Or can someone who is better at this than me find a workaround?
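
One way the splitting loop could look, as a rough Python sketch rather than the project's R code. The function name `split_csv`, the `part_001.csv` naming, and the 25 MiB default for `max_bytes` are all assumptions; it also assumes no quoted field contains a line break, since it splits on raw lines.

```python
import os


def split_csv(src_path, out_dir, max_bytes=25 * 1024 * 1024):
    """Split src_path into numbered parts, repeating the header in each part.

    Each part stays at or under max_bytes as long as the header plus any
    single row fits within that limit.
    """
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    out = None
    size = 0
    with open(src_path, "rb") as src:
        header = src.readline()
        for line in src:
            # Start a new part when the next row would push us over the limit.
            if out is None or size + len(line) > max_bytes:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"part_{len(parts) + 1:03d}.csv")
                out = open(path, "wb")
                out.write(header)
                size = len(header)
                parts.append(path)
            out.write(line)
            size += len(line)
    if out is not None:
        out.close()
    return parts
```

Splitting by bytes rather than by a fixed row count means the number of parts follows from the actual file size, so no row-count guess is needed up front.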

To make it accessible for everyone I will share the TRY data email with everyone.

Match TRY species traits back to the species in each quadrat

Basically combining data/traitData.csv and MasterSpeciesList_clean.csv
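
A minimal sketch of that combination, written in Python for illustration (the project's scripts are in R). The column names `species`, `quadrat`, `trait`, and `value` are assumptions, since the actual headers of the two files are not given here, and the inline CSV text stands in for the real files.

```python
import csv
import io
from collections import defaultdict


def match_traits(species_rows, trait_rows, key="species"):
    """Left-join trait records onto species records by (case-insensitive) name."""
    traits = defaultdict(dict)
    for row in trait_rows:
        traits[row[key].strip().lower()][row["trait"]] = row["value"]
    matched = []
    for row in species_rows:
        merged = dict(row)
        # Species with no trait data keep their row, with no trait columns added.
        merged.update(traits.get(row[key].strip().lower(), {}))
        matched.append(merged)
    return matched


# Stand-ins for MasterSpeciesList_clean.csv and data/traitData.csv (made-up rows).
species_csv = io.StringIO("species,quadrat\nAcer rubrum,Q1\nThuja plicata,Q2\n")
trait_csv = io.StringIO("species,trait,value\nacer rubrum,SLA,15.2\n")
rows = match_traits(list(csv.DictReader(species_csv)), list(csv.DictReader(trait_csv)))
print(rows)
```

Keeping unmatched species in the output makes it easy to see how many quadrat species have no trait data at all.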

Tidy data folder

Data folder would be clearer with a short meta file describing what's there and where it's from.

Might also want to put species lists into a folder and then fix paths in scripts.

Get rid of duplication in matchGbifTry.R

Figure out if there's anything left in matchGbifTry.R that we want, and move it somewhere useful. GetTRY_ids.R duplicates some of matchGbifTry.R's functionality, and this is confusing.

Reduce the species Master list to 10,000

~~WINNOW~~ REDUCE THAT LIST

Get it down to 10,000 species by raising an (arbitrary) occurrence cutoff until we're below 10,000 species
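
The cutoff search could be sketched like this in Python (illustrative only; the project works in R). The function name `reduce_by_occurrence` and the example counts are assumptions.

```python
def reduce_by_occurrence(counts, max_species=10_000):
    """Raise the minimum-occurrence cutoff until fewer than max_species remain.

    counts maps species name -> number of occurrences.
    Returns (cutoff, kept), where kept holds species with at least cutoff occurrences.
    """
    if max_species < 1:
        raise ValueError("max_species must be at least 1")
    cutoff = 1
    kept = {sp: n for sp, n in counts.items() if n >= cutoff}
    while len(kept) >= max_species:
        cutoff += 1
        kept = {sp: n for sp, n in kept.items() if n >= cutoff}
    return cutoff, kept


# Made-up counts with a small limit, just to show the behaviour.
print(reduce_by_occurrence({"a": 5, "b": 3, "c": 1, "d": 1}, max_species=3))
```

Returning the cutoff alongside the reduced list keeps the arbitrary threshold on record for the write-up.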

Why only urban areas?

Should we include non-urban areas?

We chose (near) urban areas because of sampling density and human interest.

But it might be interesting to include non-urban areas near cities as comparisons.

Update meta and workflow

Update meta.md and workflow.md to reflect the work merged in with #59

also update the roadmap!

Workflow description

Write a brief description of the workflow - order to run scripts and TRY steps.

Final figures

Place to brainstorm and organize figures

Histogram showing SLA distribution among the 6 cities
- Do we also want to do this by quadrat?
List of weird species (5th and 95th quantiles)
- By city?
- By quadrats across cities?
- Quadrats within each city?
- All species?
How to illustrate differences in weirdness within a city
How to illustrate differences in weirdness among all cities

Add descriptive titles for open refine files

Which open refine file cleans what data? Bonus - update workflow.md to reflect new titles

Generate Master species list

Fix the for loop
Generate species lists for all 6 cities
Generate one Master list that includes all the species names and a column that sums all of their occurrences
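
As a rough illustration of the last step, here is a Python sketch (not the project's R loop) that sums per-city occurrence counts into one Master list. The input structure and the example numbers are assumptions.

```python
from collections import Counter


def build_master_list(city_counts):
    """Combine {city: {species: occurrences}} into [(species, total occurrences)].

    The result is sorted from most to fewest total occurrences.
    """
    total = Counter()
    for counts in city_counts.values():
        total.update(counts)  # adds counts per species across cities
    return total.most_common()


# Made-up counts for two of the cities.
example_counts = {
    "Toronto": {"Acer rubrum": 10, "Thuja plicata": 1},
    "Vancouver": {"Acer rubrum": 4, "Thuja plicata": 7},
}
print(build_master_list(example_counts))
```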

Pull trait data for Toronto from TRY

We haven't settled on a final list yet, but using these (or some of these) traits suggested by @BenMerSci in issue #34

SLA
Seed dry mass
Plant woodiness
Plant height vegetative
Leaf thickness
Leaf phenols content per leaf dry mass (proxy for palatable/"toxic"?)
Leaf petiole length
Leaf nitrogen (N) content per leaf dry mass

Pull trait data for Toronto plants from TRY. Accession numbers for Toronto plants are now in the data/tryAccession folder.

LICENSE display

Minor issue, but I noticed that our LICENSE isn't displaying like it does in the other projects: we can't see what it permits or not when we click on it.
It seems to be treated as a normal text file and not as an actual LICENSE.
I can't find what the problem is, though.

What taxa will we include?

We've currently limited it to plants because of trait data availability and our own interest.

We may need to limit it further if we want to include a larger area.

Find weird species

Define weird species

overall (globally weird)
and for each city (locally weird)

using 5th and 95th percentiles.
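
A minimal Python sketch of this definition, for a single trait (the project's scripts are in R). Treating values strictly below the 5th or strictly above the 95th percentile as weird, and using the "inclusive" quantile method, are assumptions; the species names and values are made up.

```python
from statistics import quantiles


def weird_species(trait_by_species, lower=5, upper=95):
    """Return species whose trait value is below the lower or above the upper percentile."""
    cuts = quantiles(trait_by_species.values(), n=100, method="inclusive")
    lo, hi = cuts[lower - 1], cuts[upper - 1]
    return sorted(sp for sp, v in trait_by_species.items() if v < lo or v > hi)


def locally_weird(trait_by_city):
    """Apply the same rule within each city: {city: {species: value}} -> {city: [species]}."""
    return {city: weird_species(values) for city, values in trait_by_city.items()}


# Made-up trait values 1..21 for 21 hypothetical species.
all_species = {f"sp{i:02d}": float(i) for i in range(1, 22)}
print(weird_species(all_species))
```

Calling `weird_species` on the pooled data gives the globally weird species, while `locally_weird` recomputes the percentiles per city, so a species can be weird in one city and ordinary in another.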

Limitations in the number of occurrences

Problems:

There are close to 500k observations within the Toronto boundary. GBIF allows downloads in batches of 500 with a maximum of 200,000 / day. That is too many to download, and it takes a long time even in smaller batches.
The Toronto boundary is uneven, based on rounding lat & lon, there is also the lake in the way. Need to create polygon that is comparable to other cities.
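
To put numbers on the first problem, the arithmetic can be sketched in Python. The batch size and daily cap are the figures quoted in this issue, not values checked against GBIF's documentation, and `download_plan` is a hypothetical helper name.

```python
import math


def download_plan(n_records, batch_size=500, daily_cap=200_000):
    """Return (number of batches, number of days) needed to fetch n_records."""
    batches = math.ceil(n_records / batch_size)
    days = math.ceil(n_records / daily_cap)
    return batches, days


# Roughly 500k Toronto observations, as estimated in this issue.
print(download_plan(500_000))  # (1000, 3)
```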

Clean species Master list

Initial tidying in matchGbifDataTry.R

#28 (comment)

Choose cities to include

Consider

How many? (Enough to be interesting, not so many that it's overwhelming)
Different biogeographical regions?

Which TRY traits will we include?

Decide which traits will be included in the analysis.

What about a city would make it have weird(er) species assemblages?

I think that @BenMerSci's idea about adding other layers like climate might be useful for answering this question (if we decide to pursue it).

Group discussion pointed out that city connectivity (e.g. ports) might matter.

I think that Vancouver might win because so many kinds of plants can survive its mild climate. But that might depend on how many observations got made at the Montreal Botanical Gardens!

defining a normal and seeing what would violate that
what makes you say "whoa!"
trait ordination and seeing which assemblages span the widest range of traits in that space

How are we handling non-native species? Does the dataset we're using include them?
