Coder Social home page Coder Social logo

sunilveeravalli / plotsignificances Goto Github PK

View Code? Open in Web Editor NEW
2.0 1.0 2.0 26 KB

To compute and plot significances (pvalues and asterisks) on bar graphs

License: MIT License

R 100.00%
bargraphs pvalues pvalue asterisk errorbar ttest wilcoxon-mann-whitney-test tidyverse ggplot2 graphs

plotsignificances's Introduction

Plot significances on graphs

Objective

While working with scientific data, I always have to compute and plot significant differences between two or more datasets. While computing pvalues is pretty easy, plotting them on a graph has always been a tedious process. I have to provide information in the form of annotation to draw lines between bars and add pvalues on top of those lines. I looked for R packages that can do this automatically. Though there are some, but the user has to explicitly input the dataset names for which pvalues have to be plotted.

Since none of the packages could satisfy my needs and as I plot graphs and significances on a daily basis, I therefore decided to write a function that could do this automatically.

System requirements

  1. R
  2. RStudio

Install tidyverse package and attach it as described below.

install.packages("tidyverse", repos = "https://cran.ma.imperial.ac.uk/")
library(tidyverse)

Function usage

I would like to explain the usage of the function with an example.

In my repository, I have provided a sample dataset in "SampleData.csv" file. Import it and modify the data types.

growth.drug.test <- read_csv(file = "SampleData.csv", col_names = TRUE)
str(growth.drug.test)
growth.drug.test <- growth.drug.test %>%
    mutate(Subjects = as.factor(Subjects)) %>%
    mutate(Treatment = as.factor(Treatment))
head(growth.drug.test)
## # A tibble: 6 x 3
##   Subjects Treatment Growth
##   <fct>    <fct>      <dbl>
## 1 Children DrugA      10.5 
## 2 Children DrugB      19.4 
## 3 Children DrugC      29.0 
## 4 Children DrugA       9.18
## 5 Children DrugB      18.8 
## 6 Children DrugC      28.3

This object has three columns:

  1. Subjects
    • A categorical variable with four levels
  2. Treatment
    • A categorical variable with three levels
  3. Growth
    • A numeric variable

The function to plot significances requires the following arguments to be fulfilled:

  1. dataset
    • a dataframe with only two columns: first should be categorical variable and the second should be numeric varaible
  2. error.bars
    • specify "sd" for standard deviation or "sem" for standard error of the mean
  3. type
    • specify "t.test" to perform ttest or "wilcox.text" to perform wilcox test
  4. alternative
    • specify "two.sided" or "greater" or "less"
  5. conf.level
    • provide a value between 0 and 1. Example: for a 95% confidence level, use 0.95
  6. title
    • provide a title for the chart enclosed in double quotes
  7. subtitle
    • provide a subtitle for the chart enclosed in double quotes
  8. xlabel
    • provide a label for X-axis enlcosed in double quotes
  9. ylabel
    • provide a label for Y-axis enclosed in double quotes
  10. legend.title
    • provide a title for the legend enclosed in double quotes

Import the function into your environment

Note: Use this function only to draw bar charts. Will not work with other chart types.

Example 1:

To view the Growth among different Subjects, irrespective of the drug taken:

plotSignificances(dataset = growth.drug.test[c(1,3)],
                  error.bars = "sem",
                  type = "t.test",
                  alternative = "two.sided",
                  conf.level = 0.95,
                  title = "Growth among different subjects",
                  subtitle = "Source: SampleData.csv",
                  xlabel = "Subjects",
                  ylabel = "Growth",
                  legend.title = "Subjects"
                  )

Example 2:

To view the Growth with different Treatment drugs, irrespective of the Subjects tested:

plotSignificances(dataset = growth.drug.test[c(2,3)],
                  error.bars = "sem",
                  type = "t.test",
                  alternative = "two.sided",
                  conf.level = 0.95,
                  title = "Growth with different drugs",
                  subtitle = "Source: SampleData.csv",
                  xlabel = "Treatment",
                  ylabel = "Growth",
                  legend.title = "Drugs Tested"
                  )

Example 3:

As mentioned earlier, the function will only work with datasets that have one categorical variable and one numeric variable. It cannot plot significances on graphs (shown below) created from more than one categorical variables and one numeric variable.

ggplot(data = growth.drug.test,
       aes(x = Subjects, y = Growth, group = Treatment, fill = Treatment)) +
    geom_bar(stat = "identity", position = "dodge") + 
    theme_classic() +
    labs(title = "Graph from two categorical variables and one numeric variable",
         subtitle = "Source: SampleData.csv")

plotsignificances's People

Contributors

sunilveeravalli avatar

Stargazers

 avatar  avatar

Watchers

 avatar

Forkers

pang-hd hjanime

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.