brad-cannell / meantables

The goal of meantables is to quickly make tables of descriptive statistics (i.e., counts, means, confidence intervals) for continuous variables. This package is designed to work in a Tidyverse pipeline, and consideration has been given to get results from R to Microsoft Word ® with minimal pain.

License: Other

R 100.00%
epidemiology descriptive-statistics r

meantables's Introduction

Installation

You can install the released version of meantables from CRAN with:

install.packages("meantables")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("brad-cannell/meantables")

Example

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(meantables)
data("mtcars")

Overall mean table with defaults

mtcars %>% 
  mean_table(mpg)
#> # A tibble: 1 × 9
#>   response_var     n  mean    sd   sem   lcl   ucl   min   max
#>   <chr>        <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 mpg             32  20.1  6.03  1.07  17.9  22.3  10.4  33.9

Formatting overall mean and 95% CI

mtcars %>%
  mean_table(mpg) %>%
  mean_format(
    recipe = "mean (lcl - ucl)",
    name = "mean_95",
    digits = 1
  ) %>% 
  select(response_var, mean_95)
#> # A tibble: 1 × 2
#>   response_var mean_95           
#>   <chr>        <chr>             
#> 1 mpg          20.1 (17.9 - 22.3)

Formatting grouped means table with mean and sd

mtcars %>%
  group_by(cyl) %>%
  mean_table(mpg) %>%
  mean_format("mean (sd)") %>% 
  select(response_var:group_cat, formatted_stats)
#> # A tibble: 3 × 4
#>   response_var group_var group_cat formatted_stats
#>   <chr>        <chr>         <dbl> <chr>          
#> 1 mpg          cyl               4 26.66 (4.51)   
#> 2 mpg          cyl               6 19.74 (1.45)   
#> 3 mpg          cyl               8 15.1 (2.56)

Grouped means table with defaults

mtcars %>% 
  group_by(cyl) %>% 
  mean_table(mpg)
#> # A tibble: 3 × 11
#>   response_var group_var group…¹     n  mean    sd   sem   lcl   ucl   min   max
#>   <chr>        <chr>       <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 mpg          cyl             4    11  26.7  4.51 1.36   23.6  29.7  21.4  33.9
#> 2 mpg          cyl             6     7  19.7  1.45 0.549  18.4  21.1  17.8  21.4
#> 3 mpg          cyl             8    14  15.1  2.56 0.684  13.6  16.6  10.4  19.2
#> # … with abbreviated variable name ¹​group_cat

meantables's People

Contributors

mbcann01

Forkers

alturabi1990

meantables's Issues

Make mean_table(x) work with variable named "x"

Currently, this produces an error:

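library(dplyr)
library(meantables)

# A data frame whose only column has the same name as mean_table()'s x argument
df <- tibble(
  x = c(1, 2, NA, 4, 5)
)

df <- df %>% 
  mean_table(x)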
 Error: Problem with `summarise()` column `response_var`.
ℹ `response_var = rlang::quo_name(x)`.
x `expr` must quote a symbol, scalar, or call
Run `rlang::last_error()` to see where the error occurred. 

I'm pretty sure it's because of the x argument in

mean_table <- function(.data, x, t_prob = 0.975, output = default, digits = 2, ...)

Just change the x to .x or .var
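
A minimal sketch of the suggested rename, not the package's actual code: `mean_n_sketch()` is a hypothetical stand-in for mean_table() that computes only n and the mean. The assumption is that the error comes from the column x masking the argument x inside summarise(). With the argument called .x, that masking no longer happens for a column named x, although a column literally named .x would presumably run into the same problem.

```r
library(dplyr)

# Hypothetical, simplified stand-in for mean_table(): the argument is named
# .x instead of x so that a data column called x cannot mask it.
mean_n_sketch <- function(.data, .x) {
  # Capture the unevaluated column name supplied by the caller
  .x <- rlang::enquo(.x)

  .data %>%
    summarise(
      # Inside summarise(), .x is not a column, so it resolves to the quosure
      response_var = rlang::as_name(.x),
      n            = sum(!is.na(!!.x)),
      mean         = mean(!!.x, na.rm = TRUE)
    )
}

df <- tibble(x = c(1, 2, NA, 4, 5))

out <- df %>%
  mean_n_sketch(x)
out
```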

Replace t_prob argument with percent_ci

  • Make consistent with freqtables.
  • Entering percent_ci = 95 is much more natural for the end user than entering t_prob = 0.975.
  • Edit the function documentation
  • Edit the unit tests
  • Edit the using_meantables vignette
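
The relationship between the two arguments can be sketched as follows. This assumes a two-sided, t-based interval of the form mean ± qt(t_prob, n - 1) × sem, which the issue does not spell out. `percent_ci_to_t_prob()` is a hypothetical helper name, not a function in the package.

```r
# Hypothetical helper: convert a two-sided confidence level given as a
# percent (e.g., 95) into the probability passed to qt() (e.g., 0.975).
percent_ci_to_t_prob <- function(percent_ci = 95) {
  stopifnot(
    is.numeric(percent_ci),
    length(percent_ci) == 1,
    percent_ci > 0,
    percent_ci < 100
  )
  alpha <- 1 - percent_ci / 100  # total probability in both tails
  1 - alpha / 2                  # cut point that leaves alpha / 2 in the upper tail
}

t_prob <- percent_ci_to_t_prob(95)
t_prob

# Assumed t-based interval for the overall mean of mtcars$mpg
x      <- mtcars$mpg
n      <- sum(!is.na(x))
sem    <- sd(x) / sqrt(n)
t_crit <- qt(t_prob, df = n - 1)
ci     <- mean(x) + c(-1, 1) * t_crit * sem
round(ci, 1)
```

Under that assumption, the rounded limits agree with the lcl and ucl values shown for mpg in the README example (17.9 and 22.3).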

Change the calculation for "n"

Currently, n is calculated with the n() function, which counts every row in the group. When there is missing data, that is probably not the answer we are looking for; we most likely want the number of non-missing values.

Replace: n = n(),
With: n = sum(!is.na(.data[[rlang::quo_name(x)]])),

This may also require tweaking the arguments to the mean and sd calculations. See confidence_intervals.Rmd in r_notes. Wait, maybe not.
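
A small illustration of the difference, using a made-up five-row data frame rather than anything from the package. It also shows why the negation should be wrapped in sum(): in R, `!` binds less tightly than `%>%`, so `!is.na(x) %>% sum()` negates the sum instead of summing the negation.

```r
library(dplyr)

# Made-up data with one missing value
df <- tibble(x = c(1, 2, NA, 4, 5))

counts <- df %>%
  summarise(
    n_rows        = n(),             # counts every row, missing or not
    n_non_missing = sum(!is.na(x)),  # counts only the observed values
    mean          = mean(x, na.rm = TRUE)
  )
counts

# Operator precedence pitfall: this is parsed as !(is.na(df$x) %>% sum()),
# so it returns a single logical value, not a count.
pitfall <- !is.na(df$x) %>% sum()
pitfall
```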

Initial submit to CRAN

  • Submit to win-builder
  • Change version number
  • Check changes that had to be made to freqtables

Make mean_tables and freq_tables work together better

Specifically thinking about creating Table 1 here. Look at L2C -> recreate_quarterly_report for example.

  • Consistent variable naming

There will probably be other things as I dig into this.

Should we use a formula input to the functions (i.e., similar to lm())? Would that make the distinction between response variables and grouping variables more clear?

Add mode function to meantables

Not sure if I want this to be part of mean_table() or something separate.

mode_val <- function(x) {
  
  # Count the number of occurrences for each value of x
  value_counts <- table(x)
  
  # Get the maximum number of times any value is observed
  max_count <- max(value_counts)
  
  # Create an index vector that identifies the positions that correspond to
  # count values that are the same as the maximum count value: TRUE if so
  # and FALSE otherwise
  index <- value_counts == max_count
  
  # Use the index vector to get all values that are observed the same number 
  # of times as the maximum number of times that any value is observed
  unique_values <- names(value_counts)
  result <- unique_values[index]
  
  # If result is the same length as value_counts, that means that every value
  # occurred the same number of times. If every value occurred the same number
  # of times, then there is no mode
  no_mode <- length(value_counts) == length(result)
  
  # If there is no mode then change the value of result to NA
  if (no_mode) {
    result <- NA
  }
  
  # Return result
  result
}
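
A few calls showing how the function above behaves, using a condensed copy of it so the example runs on its own. The condensed version is an illustration only; it returns NA_character_ rather than NA so that the return type is always character. Because the values are taken from names(table(x)), the mode comes back as a character string even for numeric input, which may matter when deciding whether this belongs inside mean_table().

```r
# Condensed copy of mode_val() from above (same logic, fewer steps)
mode_val <- function(x) {
  value_counts <- table(x)
  is_max <- as.vector(value_counts == max(value_counts))
  result <- names(value_counts)[is_max]
  # Every value equally frequent: treat as "no mode"
  if (length(result) == length(value_counts)) NA_character_ else result
}

single_mode <- mode_val(c(1, 2, 2, 3))     # one mode, returned as "2"
two_modes   <- mode_val(c(1, 1, 2, 2, 3))  # ties return every tied value
no_mode     <- mode_val(c(1, 2, 3))        # all counts equal, so NA
cyl_mode    <- mode_val(mtcars$cyl)        # most common cylinder count

single_mode
two_modes
no_mode
cyl_mode
```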
