Coder Social home page Coder Social logo

forestgeo / fgeo.tool Goto Github PK

View Code? Open in Web Editor NEW
2.0 2.0 6.0 3.79 MB

[R-package on CRAN] General purpose tools for ForestGEO Packages

Home Page: https://forestgeo.github.io/fgeo.tool

License: Other

R 99.81% Shell 0.19%
fgeo forestgeo tree ecology dynamics tools utils miscelaneas

fgeo.tool's Introduction

Import and manipulate ForestGEO data

lifecycle Coverage status CRAN status R-CMD-check Codecov test coverage R-CMD-check

fgeo.tool helps you to import and manipulate ForestGEO data.

Installation

Install the latest stable version of fgeo.tool from CRAN with:

install.packages("fgeo.tool")

Install the development version of fgeo.tool from GitHub with:

# install.packages("devtools")
devtools::install_github("forestgeo/fgeo.tool")

Or install all fgeo packages in one step.

Example

library(fgeo.tool)
#> 
#> Attaching package: 'fgeo.tool'
#> The following object is masked from 'package:stats':
#> 
#>     filter
# Helps access data for examples
library(fgeo.x)

example_path() allows you to access datasets stored in your R libraries.

example_path()
#>  [1] "csv"           "mixed_files"   "rdata"         "rdata_one"    
#>  [5] "rds"           "taxa.csv"      "tsv"           "vft_4quad.csv"
#>  [9] "view"          "weird"         "xl"

(vft_file <- example_path("view/vft_4quad.csv"))
#> [1] "/usr/local/lib/R/site-library/fgeo.x/extdata/view/vft_4quad.csv"

read_vft() and read_taxa() import a ViewFullTable and ViewTaxonomy from .tsv or .csv files.

read_vft(vft_file)
#> # A tibble: 500 × 32
#>     DBHID PlotName PlotID Family   Genus Speci…¹ Mnemo…² Subsp…³ Speci…⁴ Subsp…⁵
#>     <int> <chr>     <int> <chr>    <chr> <chr>   <chr>   <chr>     <int> <chr>  
#>  1 385164 luquillo      1 Rubiace… Psyc… brachi… PSYBRA  <NA>        185 <NA>   
#>  2 385261 luquillo      1 Urticac… Cecr… schreb… CECSCH  <NA>         74 <NA>   
#>  3 384600 luquillo      1 Rubiace… Psyc… brachi… PSYBRA  <NA>        185 <NA>   
#>  4 608789 luquillo      1 Rubiace… Psyc… berter… PSYBER  <NA>        184 <NA>   
#>  5 388579 luquillo      1 Arecace… Pres… acumin… PREMON  <NA>        182 <NA>   
#>  6 384626 luquillo      1 Araliac… Sche… moroto… SCHMOR  <NA>        196 <NA>   
#>  7 410958 luquillo      1 Rubiace… Psyc… brachi… PSYBRA  <NA>        185 <NA>   
#>  8 385102 luquillo      1 Piperac… Piper glabre… PIPGLA  <NA>        174 <NA>   
#>  9 353163 luquillo      1 Arecace… Pres… acumin… PREMON  <NA>        182 <NA>   
#> 10 481018 luquillo      1 Salicac… Case… arborea CASARB  <NA>         70 <NA>   
#> # … with 490 more rows, 22 more variables: QuadratName <chr>, QuadratID <int>,
#> #   PX <dbl>, PY <dbl>, QX <dbl>, QY <dbl>, TreeID <int>, Tag <chr>,
#> #   StemID <int>, StemNumber <int>, StemTag <int>, PrimaryStem <chr>,
#> #   CensusID <int>, PlotCensusNumber <int>, DBH <dbl>, HOM <dbl>,
#> #   ExactDate <date>, Date <int>, ListOfTSM <chr>, HighHOM <int>,
#> #   LargeStem <chr>, Status <chr>, and abbreviated variable names ¹​SpeciesName,
#> #   ²​Mnemonic, ³​Subspecies, ⁴​SpeciesID, ⁵​SubspeciesID
#> # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names

pick_dbh_under(), drop_status() and friends pick and drop rows from a ForestGEO ViewFullTable or census table.

tree5 <- fgeo.x::tree5

tree5 %>% 
  pick_dbh_under(100)
#> # A tibble: 18 × 19
#>    treeID stemID tag    StemTag sp     quadrat    gx    gy Measu…¹ Censu…²   dbh
#>     <int>  <int> <chr>  <chr>   <chr>  <chr>   <dbl> <dbl>   <int>   <int> <dbl>
#>  1   7624 160987 108958 175325  TRIPAL 722     139.  425.   486675       5  10.9
#>  2  19930 117849 123493 165576  CASARB 425      61.3 496.   471979       5  23.6
#>  3  31702  39793 22889  22889   SLOBER 304      53.8  73.8  447307       5  67  
#>  4  35355  44026 27538  27538   SLOBER 1106    203.  110.   449169       5  50  
#>  5  39705  48888 33371  33370   CASSYL 1010    184.  194.   451067       5  67  
#>  6  57380 155867 66962  171649  SLOBER 1414    274.  279.   459427       5  16.6
#>  7  95656 129113 131519 131519  OCOLEU 402      79.7  22.8  474157       5  23.6
#>  8  96051 129565 132348 132348  HIRRUG 1403    278    40.6  474523       5  12.9
#>  9  96963 130553 134707 134707  TETBAL 610     114.  182.   475236       5  18.6
#> 10 115310 150789 165286 165286  MANBID 225      24.0 497.   483175       5  14.6
#> 11 121424 158579 170701 170701  CASSYL 811     146.  218.   484785       5  20.2
#> 12 121689 158871 171277 171277  INGLAU 515      84.2 285.   485077       5  13.4
#> 13 121953 159139 171809 171809  PSYBRA 1318    247.  354.   485345       5  14  
#> 14 124522 162698 174224 174224  CASSYL 1411    279.  210.   488386       5  13.1
#> 15 125038 163236 175335 175335  CASSYL 822     153.  426.   488924       5  14.5
#> 16 126087     NA 177394 <NA>    CASARB 521      89.8 408.       NA      NA  NA  
#> 17 126803     NA 178513 <NA>    PSYBER 622     113.  426        NA      NA  NA  
#> 18 126934     NA 178763 <NA>    MICRAC 324      47   480.       NA      NA  NA  
#> # … with 8 more variables: pom <chr>, hom <dbl>, ExactDate <date>,
#> #   DFstatus <chr>, codes <chr>, nostems <dbl>, status <chr>, date <dbl>, and
#> #   abbreviated variable names ¹​MeasureID, ²​CensusID
#> # ℹ Use `colnames()` to see all variable names

pick_main_stem() and pick_main_stemid() pick the main stem or main stemid(s) of each tree in each census.

stem <- download_data("luquillo_stem6_random")

dim(stem)
#> [1] 1320   19
dim(pick_main_stem(stem))
#> Warning: The `add` argument of `group_by()` is deprecated as of dplyr 1.0.0.
#> Please use the `.add` argument instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
#> [1] 1000   19

add_status_tree() adds the column status_tree based on the status of all stems of each tree.

stem %>% 
  select(CensusID, treeID, stemID, status) %>% 
  add_status_tree()
#> # A tibble: 1,320 × 5
#>    CensusID treeID stemID status status_tree
#>       <int>  <int>  <int> <chr>  <chr>      
#>  1        6    104    143 A      A          
#>  2        6    119    158 A      A          
#>  3       NA    180    222 G      A          
#>  4       NA    180    223 G      A          
#>  5        6    180    224 G      A          
#>  6        6    180    225 A      A          
#>  7        6    602    736 A      A          
#>  8        6    631    775 A      A          
#>  9        6    647    793 A      A          
#> 10        6   1086   1339 A      A          
#> # … with 1,310 more rows
#> # ℹ Use `print(n = ...)` to see more rows

add_index() and friends add columns to a ForestGEO-like dataframe.

stem %>% 
  select(gx, gy) %>% 
  add_index()
#> Guessing: plotdim = c(320, 500)
#> * If guess is wrong, provide the correct argument `plotdim`
#> # A tibble: 1,320 × 3
#>       gx    gy index
#>    <dbl> <dbl> <dbl>
#>  1  10.3  245.    13
#>  2 183.   410.   246
#>  3 165.   410.   221
#>  4 165.   410.   221
#>  5 165.   410.   221
#>  6 165.   410.   221
#>  7 149.   414.   196
#>  8  38.3  245.    38
#>  9 143.   411.   196
#> 10  68.9  253.    88
#> # … with 1,310 more rows
#> # ℹ Use `print(n = ...)` to see more rows

Get started with fgeo

Information

fgeo.tool's People

Contributors

jimhester avatar maurolepore avatar overstreeth avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

fgeo.tool's Issues

Improve by_group() (or groupwise())

# I want to make any funciton work with grouped data. I propose 
# by_group(.data, .f, ...), similar to map(), by applies .f() not to each column
# of .data but to each group -- and to all of it if .data is ungrouped.
# alias: grouply(), groupwise().

# History of this feature including upcomming dplyr::nest_by()
# https://community.rstudio.com/t/is-nest-mutate-map-unnest-really-the-best-alternative-to-dplyr-do/11009/7?u=mauro_lepore

library(tidyverse)

# E.g.
# nest_by_groups(mtcars)
# nest_by_groups(group_by(mtcars, cyl))
nest_by_groups <- function(.data) {
  g <- group_vars(.data)
  .data %>% 
    tibble::as.tibble() %>% 
    tibble::add_column(.nest_id = dplyr::group_indices(.)) %>% 
    dplyr::ungroup() %>%
    tidyr::nest(-.nest_id)
}

by_group <- function(.data, .f, ...) {
  .data %>% 
    nest_by_groups() %>% 
    mutate(data = map(.data$data, .f, ...)) %>% 
    tidyr::unnest() %>% 
    dplyr::select(-.nest_id)
}

# E.g. 
first_row <- function(.x, to_chr = FALSE) {
  first <- .x[1, ]
  if (to_chr) {
    first[] <- lapply(first, as.character)
  }
  
  tibble::as.tibble(first)
}
mtcars %>% 
  by_group(first_row, to_chr = TRUE)
#> # A tibble: 1 x 11
#>   mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21    6     160   110   3.9   2.62  16.46 0     1     4     4

mtcars %>% 
  group_by(cyl) %>% 
  by_group(first_row, to_chr = T)
#> # A tibble: 3 x 11
#>   mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#> 2 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#> 3 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2

Created on 2018-07-16 by the reprex package (v0.2.0.9000).

Create tree table from stem table

Pull the code out of rtbl?

Rewrite? E.g.

pick_stem_max_dbh <- function(.data) {
  # Allow input as ViewFullTable, stem and tree table
  old <- names(.data)
  names(.data) <- tolower(old)
  fgeo.base::check_crucial_names(.data, c("treeid", "stemid", "dbh"))
  
  # TODO: Check for unique censusid or group by census id 
  
  # TODO:
  # if (tree_already) {return(stats::setNames(.data, old))}
  
  .data <- tibble::rowid_to_column(.data)
  .data <- dplyr::group_by(.data, .data$treeid)
  .data <- dplyr::arrange(.data, dplyr::desc(.data$dbh))
  .data <- dplyr::ungroup(.data)
  
  out <- dplyr::distinct(.data, .data$treeid, .keep_all = TRUE)
  # Recover original order
  out <- dplyr::select(dplyr::arrange(out, .data$rowid), -.data$rowid)
  
  stats::setNames(out, old)
}

df <- tibble::tibble(
  treeID = c(1, 1, 1, 2, 2, 2),
  stemID = letters[c(1, 2, 3, 1, 2, 3)],
  dbh = c(1, 2, NA, 4, 5, NA)
)
df
#> # A tibble: 6 x 3
#>   treeID stemID   dbh
#>    <dbl> <chr>  <dbl>
#> 1      1 a          1
#> 2      1 b          2
#> 3      1 c         NA
#> 4      2 a          4
#> 5      2 b          5
#> 6      2 c         NA

pick_stem_max_dbh(df)
#> # A tibble: 2 x 3
#>   treeID stemID   dbh
#>    <dbl> <chr>  <dbl>
#> 1      1 b          2
#> 2      2 b          5

Created on 2018-06-28 by the reprex package (v0.2.0).

thank @rick_pack2

replace_all_na <- function(x, filler = 0) {
  replace(x, http://is.na (x), filler)
}
replace_all_na(df, 0)
#>   x y
#> 1 0 2
#> 2 1 0
replace_all_na(df, "missing")
#>         x       y
#> 1 missing       2
#> 2       1 missing

Write helpers to categorize data

From @maurolepore on August 31, 2017 16:19

Gabriel proposed to develop a friendly way to categorize (cut) numeric variables. Important ones include:

  • from dbh to size category.
  • from gx and gy to quadrat: see add_quad().
suppressPackageStartupMessages(library(fgeo))
x <- tibble::tibble(gx = c(0, 50, 999.9, 1000), gy = gx/2)
add_quad(x)
#> Gessing: plotdim = c(1000, 500)
#>   * If guess is wrong, provide the correct argument `plotdim`
#> # A tibble: 4 x 3
#>      gx    gy quad 
#>   <dbl> <dbl> <chr>
#> 1    0     0  0101 
#> 2   50    25  0302 
#> 3 1000.  500. 5025 
#> 4 1000   500  NANA
add_quad(x, start = 0)
#> Gessing: plotdim = c(1000, 500)
#>   * If guess is wrong, provide the correct argument `plotdim`
#> # A tibble: 4 x 3
#>      gx    gy quad 
#>   <dbl> <dbl> <chr>
#> 1    0     0  0000 
#> 2   50    25  0201 
#> 3 1000.  500. 4924 
#> 4 1000   500  NANA

Created on 2018-07-02 by the reprex package (v0.2.0).

Copied from original issue: forestgeo/fgeo.abundance#46

select2()

select2 <- function(.data, ...) {
  dots <- rlang::list2(...)
  select(.data, !!!text_exprs(dots))
}

text_exprs <- function(...) {
  rlang::parse_exprs(semicolon(...))
}

semicolon <- function(...) {
  paste0(..., collapse = "; ")
}

library(tidyverse)

mtcars <- as.tibble(mtcars)

text_vars <- c("mpg", "am")
select2(mtcars, text_vars)
#> # A tibble: 32 x 2
#>      mpg    am
#>  * <dbl> <dbl>
#>  1  21       1
#>  2  21       1
#>  3  22.8     1
#>  4  21.4     0
#>  5  18.7     0
#>  6  18.1     0
#>  7  14.3     0
#>  8  24.4     0
#>  9  22.8     0
#> 10  19.2     0
#> # ... with 22 more rows

# If multiple strings, separate vars with comma
text_vars <- c("mpg", "am", "carb")
select2(mtcars, text_vars)
#> # A tibble: 32 x 3
#>      mpg    am  carb
#>  * <dbl> <dbl> <dbl>
#>  1  21       1     4
#>  2  21       1     4
#>  3  22.8     1     1
#>  4  21.4     0     1
#>  5  18.7     0     2
#>  6  18.1     0     1
#>  7  14.3     0     4
#>  8  24.4     0     2
#>  9  22.8     0     2
#> 10  19.2     0     4
#> # ... with 22 more rows

# If single string, separate with semicolon
text_vars <- c("mpg; am; carb")
select2(mtcars, text_vars)
#> # A tibble: 32 x 3
#>      mpg    am  carb
#>  * <dbl> <dbl> <dbl>
#>  1  21       1     4
#>  2  21       1     4
#>  3  22.8     1     1
#>  4  21.4     0     1
#>  5  18.7     0     2
#>  6  18.1     0     1
#>  7  14.3     0     4
#>  8  24.4     0     2
#>  9  22.8     0     2
#> 10  19.2     0     4
#> # ... with 22 more rows

# With a single string you can do all you can do with bare
text <- 'mpg:cyl; matches("df"); everything()'
select2(mtcars, text)
#> # A tibble: 32 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>  * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ... with 22 more rows
# Same
library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#>     flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#>     rep_along, splice
bare <- exprs(mpg:cyl, matches("df"), everything())
select(mtcars, !!!bare)
#> # A tibble: 32 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>  * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ... with 22 more rows

Created on 2018-04-29 by the reprex package (v0.2.0).

reexport the main dplyr verbs

Consider reexporting the main verbs and common tools.

pro:

  • everything is available with (fgeo)
    con:

  • conflicts will show up when loading tidyverse

  • makes it harder to discover the tidyverse

main verbs

filter
select
arrange
summarize
mutate

also

group_by
tibble
tribble
as_tibble
count
add_count

Fix byyr_abundance() to count only one individual per Tag not per StemID

Here the output should be 1 -- ont 2 (@).

library(fgeo)
#> -- Attaching packages -------------------------------------------- fgeo 0.0.0.9000 --
#> v bciex           0.0.0.9000     v fgeo.demography 0.0.0.9000
#> v fgeo.abundance  0.0.0.9004     v fgeo.habitat    0.0.0.9006
#> v fgeo.base       0.0.0.9001     v fgeo.map        0.0.0.9204
#> v fgeo.data       0.0.0.9002     v fgeo.tool       0.0.0.9003
#> 

vft <- data.frame(
  StemID = c("1", "2"),
  Tag = c("0001", "0001"),
  PlotName = "p",
  Status = c("alive", "alive"),
  DBH = c(10, 100),
  ExactDate = c("2000-01-01", "2000-01-01"),
  PlotCensusNumber = c(1, 1),
  CensusID = c(1, 1),
  Genus = c("A", "A"),
  SpeciesName = c("a", "a"),
  Family = "f",
  stringsAsFactors = FALSE
)
vft
#>   StemID  Tag PlotName Status DBH  ExactDate PlotCensusNumber CensusID
#> 1      1 0001        p  alive  10 2000-01-01                1        1
#> 2      2 0001        p  alive 100 2000-01-01                1        1
#>   Genus SpeciesName Family
#> 1     A           a      f
#> 2     A           a      f

# First pick the data you want
pick1 <- pick_plotname(vft, "p")
#> Using: p.

pick2 <- drop_dead_trees_by_cns(pick1)
#> Calculating tree-status (from stem `Status`) by `PlotCensusNumber`.
#> Warning: No observation has .status = dead
#>   * Detected values: alive
#> Dropping rows where `Status = dead`.
#> Warning: No observation has .status = dead
#>   * Detected values: alive
pick3 <- pick_dbh_min(pick2, 10)
pick3
#> # A tibble: 2 x 12
#>   StemID Tag   PlotName Status   DBH ExactDate  PlotCensusNumber CensusID
#>   <chr>  <chr> <chr>    <chr>  <dbl> <chr>                 <dbl>    <dbl>
#> 1 1      0001  p        alive     10 2000-01-01                1        1
#> 2 2      0001  p        alive    100 2000-01-01                1        1
#> # ... with 4 more variables: Genus <chr>, SpeciesName <chr>, Family <chr>,
#> #   status_tree <chr>

byyr_abundance(pick3)
#> # A tibble: 1 x 3
#>   species Family `2000`
#>   <chr>   <chr>   <dbl>
#> 1 A a     f           2

Created on 2018-06-20 by the reprex package (v0.2.0).

Clean flag_multiple() and multiple_var().

DRY. Some funcitons seem duplicated with fgeo.tool.

flat_multiple() is being used but flag_if() and multiple_var() are not. See if and which is a better alternative to flag_multiple() and similar functions in fgeo.tool. Then clean the rest.

FastField Forms spreadsheet export - missing required sheets

Please briefly describe your problem and what output you expect. If you have a question, please don't use this form. Instead, ask on https://stackoverflow.com/, https://community.rstudio.com/, https://github.com/forestgeo/forum/ or email Mauro Lepore at [email protected].

Please include a minimal reproducible example (AKA a reprex). If you've never heard of a reprex before, start by reading https://www.tidyverse.org/help/#reprex.


Brief description of the problem
When using FastField Forms for a re-census we've discovered or had submissions missing one or multiple of the required sheets: 'new_secondary_stems', 'recruits', and 'original_stems'. If no recruits are recorded in a quadrat, no sheet is exported. Likewise, post-submission if a recruit is found, then added to a 'new' file by itself, the 'new_secondary_stems', and 'original_stems' sheets are missing. This of course breaks the code and the files cannot be compiled. Any solution to this, or should I reach out to FastField Forms/add the sheets ourselves?

> ######## Compile the separate worksheets from the quadrat workbooks into new excel files

> files.directory <- "C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/uploads"

> files.export <- "C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/exports/test"
> xl_sheets_to_xl(files.directory, files.export, first_census = FALSE)
Error: Data should contain these sheets:
original_stems, new_secondary_stems, recruits, root
* Missing sheets: recruits
In addition: Warning messages:
1: `new_secondary_stems` has cero rows. 
2: Filling every cero-row dataframe with NAs (new_secondary_stems). 
3: `new_secondary_stems` has cero rows. 
4: Filling every cero-row dataframe with NAs (new_secondary_stems). 

error with xlff_to_dfs - repeating sheet name

After running the FastField Forms files through the xlff_to_xl function we then attempted to run the new xlff_to_dfs function. We were unsuccessful and received the following error:

> first_week_hf <- xlff_to_dfs("C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/exports/test")
Error: Column `sheet` must have a unique name
In addition: Warning message:
Adding missing sheets: original_stems, new_secondary_stems, recruits, [root.]

The name of the sheet for each stem repeats multiple times depending upon which sheet the stem originated from.
excel file

Consider that mpala must be padded with 5 digits -- not 4

mpala is the only dataset that has so many quadrats that it needs to be padded with 5 instead of 4 "0".

Write an example showing that if the data is subsetted before it is passed to plot_tag_status_by_subquadrat() the order of maps may be different than expected. So show how to arrange the data before subsetting it.

fgeo classes and methods to classify tables

The job is to homogenize the different datasets so that they can be used with the greatest number of functions.

Use data from Yosemite for example. Suzanne said it's up to date

  • Identify each table by its names. Check the difference between names to understand what identifies each table uniquely

  • Check what names intersect

  • What variables are the same, regardless of their names. See definition in CFTS pakge

  • What variables are not the same

  • What variables matter -- the crucial names of all functions I have developed (search for crucial_names())

...
Fgeo_clasify x
If (names x match vft) as.vft
If ... as.stem

vft <- f (x, ...) structure(x, class = c(vft, tibble?, data.frame))

as.fgeo
as_fgeo?

homo_nms.vft
homo_nms.stem
...

fill_na()

Rename replace _all_na() to fill_na() with is more specific. Replace is a general term on the implementation domain -- because it is inspired in replace() -- while fill_na() is on the problem domain.

Consider adding a method for lists.

fill_na <- function(x, filler) {
  x[is.na(x)] <- filler
  x
}

Wrap vegan

Diversity

E.g. from qo00 in maurolepore/hrv

diversity_by_sample <- wrangle() %>%
  dplyr::group_by(sample_name) %>%
  dplyr::mutate(
    shannon = vegan::diversity(abun_pcnt, "shannon", MARGIN = 2),
    invsimpson = vegan::diversity(abun_pcnt, "invsimpson", MARGIN = 2),
    simpson = vegan::diversity(abun_pcnt, "simpson", MARGIN = 2),
    richness = length(sample_name)
  ) %>%
  dplyr::ungroup() %>%
  dplyr::select(
    time_interval, collected_from, sample_name, species, shannon:richness
  ) %>%
  # Keep only one row per sample -- not one per sample per species
  dplyr::select(-species) %>%
  unique() %>%
  # Transform to long format to allow facetting by metric(key)
  tidyr::gather(key = metric, value = value, shannon:richness)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.