karstenslab / microshades

This repo contains the R microshades package, which provides a color blind accessible color palette with 30 unique colors and functions for applying these colors to microbiome data.

Home Page: https://karstenslab.github.io/microshades

License: Other

R 100.00%

microbiome-data color-palette microshades-cvd-palettes phyloseq cvd r data-visualization

microshades's Issues

re: no Microshades hexagon logo

Hello!
I noticed there is a banner header but no R hexagon logo. Can I design one for this package? I will adhere strictly to the colours used in the banner.
I have made a logo for the Terra package. I will do it for free, as I need more examples for my portfolio. Please and thank you.

sink most abundant group taxonomy

Add a flag to reorder_samples_by(), or create a new function, to sink the most abundant group after the color object is created.

Issue when generating a color object

Hi,

Thanks for the useful tool! However, I ran into an issue when trying to generate a color object using the code:

color_objs_GP <- create_color_dfs(mdf_prep, selected_groups = c("Verrucomicrobia", "Proteobacteria", "Actinobacteria", "Bacteroidetes", "Firmicutes"), cvd = TRUE)

I got the message:

Error in create_color_dfs(mdf_prep, selected_groups = c("Verrucomicrobia", :
some 'selected_groups' do not exist in the dataset. Consider SILVA 138 c('Proteobacteria', 'Actinobacteriota', 'Bacteroidota', 'Firmicutes')

Any help would be appreciated :)

Jesus
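
The error message above says that some requested names are absent from the data and suggests SILVA 138 spellings. A minimal sketch of how one might check this before calling create_color_dfs(), using base R only. The data frame `mdf_toy` and its values are hypothetical stand-ins for `mdf_prep`, not output from the package:

```r
# Toy stand-in for the melted data frame (mdf_prep in the post above).
# The phylum names follow the SILVA 138 suggestion in the error message.
mdf_toy <- data.frame(
  Phylum = c("Proteobacteria", "Actinobacteriota", "Bacteroidota", "Firmicutes"),
  Abundance = c(0.4, 0.1, 0.3, 0.2)
)

# Group names as requested in the original call
requested <- c("Verrucomicrobia", "Proteobacteria", "Actinobacteria",
               "Bacteroidetes", "Firmicutes")

# Names that were requested but do not occur in the data
missing_groups <- setdiff(requested, unique(mdf_toy$Phylum))
print(missing_groups)

# Names that do occur and could be passed as selected_groups
valid_groups <- intersect(requested, unique(mdf_toy$Phylum))
print(valid_groups)
```

Running the same comparison against the Phylum column of the real data should show which names need to be respelled or dropped.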

microshades GP example

Examine different sample types as groups (Soil and Sediment) and run microshades on each grouping to show why it might be favorable to run

[BUG/Version Control] `fct_explicit_na()` was deprecated in forcats 1.0.0.

Hello!

Just to let you know about a function change in one of your dependencies!

Warning message:
There was 1 warning in mutate().
ℹ In argument: Genus = fct_explicit_na(Genus, "Unknown").
Caused by warning:
! fct_explicit_na() was deprecated in forcats 1.0.0.
ℹ Please use fct_na_value_to_level() instead.
ℹ The deprecated feature was likely used in the microshades package.
  Please report the issue to the authors.
This warning is displayed once every 8 hours.
Call lifecycle::last_lifecycle_warnings() to see where this warning was generated.

Best,

Erfan
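
The warning names fct_na_value_to_level() as the replacement. A small sketch of what that swap could look like on a toy factor. The `genus` vector is made up for illustration, and the base R fallback branch is an assumption added only so the snippet runs without forcats 1.0.0 or later. This is not the package's actual code:

```r
# Toy factor with a missing genus (hypothetical values)
genus <- factor(c("Faecalibacterium", NA, "Turicibacter"))

if (requireNamespace("forcats", quietly = TRUE) &&
    utils::packageVersion("forcats") >= "1.0.0") {
  # Replacement suggested by the deprecation warning
  genus_filled <- forcats::fct_na_value_to_level(genus, level = "Unknown")
} else {
  # Base R fallback, included for illustration only
  genus_filled <- factor(
    ifelse(is.na(genus), "Unknown", as.character(genus)),
    levels = c(levels(genus), "Unknown")
  )
}

# The missing value is now an explicit "Unknown" level
print(as.character(genus_filled))
```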

TreeSummarizedExperiment support

Thanks for the great work.

TreeSummarizedExperiment is a new data container for microbiome data in Bioconductor. Also the curatedMetagenomicData used in the examples is providing the data in this format.

-> Would be great to have support for TreeSummarizedExperiment in addition to phyloseq.

Add to CRAN & conda?

These are nice looking palettes! Would you consider submitting the package to CRAN and, once accepted by CRAN, submitting a corresponding recipe to conda-forge? This will allow users to install your package with conda in reproducible projects.

Reduce parameters

Reduce parameters in match_color_df() to determine selected_groups from mdf_group.

flip stack orientation

flip stack abundance so most abundant is at the top

sink_abundant_groups not working

Hello,

Thanks so much for developing this fantastic package. I'm trying to plot some of my own data following the Human Microbiome Project tutorial. I'm successfully able to reorder samples by abundance of, for example, Flavobacteriaceae, but sink_abundant_groups=TRUE doesn't seem to be bringing most abundant groups to the bottom (there is no difference between setting it to false vs. setting it to true). Is there something I'm missing? Here is some example code below and the plot output. Thanks!

new.sample.order <- reorder_samples_by(mdf, cdf, order = "Flavobacteriaceae", group_level = "Phylum", subgroup_level = "Family", sink_abundant_groups = TRUE)

mdf.new.sample.order <- new.sample.order$mdf
cdf.new.sample.order <- new.sample.order$cdf

plot_2 <- plot_microshades(mdf.new.sample.order, cdf.new.sample.order, group_label = "Phylum Family")

plot_2 + scale_y_continuous(labels = scales::percent, expand = expansion(0)) +
  theme(legend.key.size = unit(0.2, "cm"), text = element_text(size = 10)) +
  theme(axis.text.x = element_text(size = 6))

Error with plot_microshades() with R 4.3.0

Hi,
Thank you for developing this very nice package.

I have used microshades with R version 4.1 without problems, but after updating to R 4.3.0, the plot_microshades() function fails with the following message:

Error in is.na(cdf$hex) || is.na(cdf$group) :
'length = 29' in coercion to 'logical(1)'

I found by googling that this might be due to changes in R:

https://stackoverflow.com/questions/72848442/r-warning-lengthx-2-1-in-coercion-to-logical1
https://cran.r-project.org/doc/manuals/r-devel/NEWS.html

Would it be possible to update the code to work with the newer R versions?

Thank you in advance.
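
As of R 4.3.0, `||` and `&&` raise an error when either operand has length greater than one, which matches the message quoted above. A minimal sketch of a length-safe version of such a check, assuming the intent is "does either column contain any NA". That intent is a guess, and `cdf_toy` with its hex values is hypothetical:

```r
# Toy color data frame; hex values and groups are made up for illustration
cdf_toy <- data.frame(
  hex = c("#7D3560", "#A1527F", NA),
  group = c("Firmicutes", "Firmicutes", "Other"),
  stringsAsFactors = FALSE
)

# In R >= 4.3.0 the following errors, because is.na() returns one value
# per row and || requires length-one operands:
# is.na(cdf_toy$hex) || is.na(cdf_toy$group)

# anyNA() collapses each column to a single TRUE/FALSE first
has_missing <- anyNA(cdf_toy$hex) || anyNA(cdf_toy$group)
print(has_missing)
```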

color_reassign()

create color_reassign function to change the color assignment of the groups

Readme vignette links

The links to the vignettes in the README do not work. Check whether they work once GitHub Pages is activated.

Custom legend below plot

Hi! Thank you for developing this very useful package!
I was wondering if it is possible to get the custom legend displayed at the bottom of the plot. I tried to play around with the custom_legend function, but couldn't get it to work. Is there an option one could use to choose how to split the rows and columns of the custom_legend?

Again, thank you for the package! The color combinations and visual display are so beautiful!

Install bug/fix

remotes::install_github("KarstensLab/microshades")

in the install instructions in README doesn't work consistently - suggest to change to...

remotes::install_github("KarstensLab/microshades", dependencies=TRUE)

inconsistent bar width

Hello,

Thanks for this cool package! I'm wondering if you have any suggestions on inconsistent section widths within bars in the plots I've been creating. As you can see in the example posted below, the coloured sections in each bar/sample are sometimes different widths to the other sections/colours in the same bar (particularly obvious in the top end of the graph). Any ideas why this might be happening?

Here is the plotting code used

plot_tads_T_t2 <- plot_microshades(mdf_tads_neworder_T_t2, cdf_tads_neworder_T_t2)
tads_legend_T_t2 <- custom_legend(mdf_tads_neworder_T_t2, cdf_tads_neworder_T_t2, group_level = "Class", subgroup_level = "Genus", legend_key_size = 0.5, legend_text_size = 9)

plot_tads_T1 <- plot_tads_T_t2 + scale_y_continuous(labels = scales::percent, expansion(0)) + theme(legend.position = "none", panel.background = element_blank(), axis.title = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank()) + ggtitle("Treated") + facet_wrap(~ Mother.ID, scales = "free_x", labeller = labeller(Mother.ID = (c("Mother A" = "Sibship A", "Mother B" = "Sibship B", "Mother C" = "Sibship C"))))
plot_tads_T2 <- plot_grid(plot_tads_T1, tads_legend_T_t2)
plot_tads_T2

Relative abundance calculation after tax_glom() gives non-representative relative abundance in figures

Hi microshades team,

I've been noticing a bit of a bug in how relative abundances are calculated for microshades.

Issue description

That is, when using lower taxonomic levels that might include unclassified taxa, microshades calculates the relative abundance using only the reads that were classified at that level. If microshading at the genus level, any taxa that aren't classified at the genus level are thrown out. Because of this, the relative abundances shown in the figure are not representative of the actual relative abundances. This also impacts the relative abundances shown for higher taxonomic levels.

Example

For example, if we had a community with 2 taxa in the family Ruminococcaceae:
1. Faecalibacterium (50%)
2. Unclassified at the genus level (50%)

Which in phyloseq results in a taxonomy table that has:

Family           Genus
Ruminococcaceae  Faecalibacterium
Ruminococcaceae  <NA>

When this is plotted in microshades with subgroup=“Genus”, it throws out the <NA> values before calculating relative abundance. In this simple example community, Faecalibacterium would then be plotted as 100% of the taxa present.
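
The arithmetic of this two-taxon example can be checked in a few lines of base R. This is only an illustration of the two ways of normalising, with read counts of 50 each assumed to match the 50%/50% split. It does not call microshades or speedyseq:

```r
# Two taxa in Ruminococcaceae with equal read counts (assumed values)
counts <- data.frame(
  Family = c("Ruminococcaceae", "Ruminococcaceae"),
  Genus = c("Faecalibacterium", NA),
  Reads = c(50, 50)
)

# Dropping genus-level NA rows before normalising, as described above
kept <- counts[!is.na(counts$Genus), ]
rel_dropped <- kept$Reads / sum(kept$Reads)
print(rel_dropped)

# Normalising against all reads, unclassified included
rel_all <- counts$Reads / sum(counts$Reads)
print(rel_all)
```

The first result reports Faecalibacterium as the whole community, while the second keeps it at half.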

Further documentation (easy to see the effects in a real data set)

I’ve attached a knitted R Markdown example, where you can click through the different levels and see major changes in relative abundance based on the subgroup level (pdf at the bottom). The knitted Rmd file also has some proposed workarounds.

Addressing this?

I think this is happening during speedyseq’s tax_glom function (in prep_mdf). If it is from speedyseq, would it be possible to have an argument that disables this behavior? Or could there be some sort of note in the documentation/examples telling users to handle taxa unclassified at lower taxonomic levels prior to using microshades?
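
One way a user might handle unclassified taxa beforehand is to give missing genera an explicit placeholder label, so that they are no longer NA when taxa are agglomerated. A minimal base R sketch on a plain data frame. The `tax` table and the "Unclassified <Family>" naming scheme are assumptions for illustration. With a phyloseq object, the relabelled table would still need to be written back to the object's taxonomy table:

```r
# Toy taxonomy table mirroring the example community above
tax <- data.frame(
  Family = c("Ruminococcaceae", "Ruminococcaceae"),
  Genus = c("Faecalibacterium", NA),
  stringsAsFactors = FALSE
)

# Rows with no genus-level classification
unclassified <- is.na(tax$Genus)

# Label them by their family so they survive as their own genus-level group
tax$Genus[unclassified] <- paste("Unclassified", tax$Family[unclassified])
print(tax)
```

phyloseq's tax_glom() also documents an NArm argument controlling whether NA taxa are removed. Whether prep_mdf exposes it is not stated in this thread.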

Real dataset example

In these figures, look in particular at the "control" microbiomes on the far right. These are contaminated germ-free mice with very low diversity microbiomes, which is why the effect is so evident. On the family level microshades plot (the first one), it appears that Peptostreptococcaceae makes up ~75% of some of the microbiomes. However, the genus level plot shows that they're 90% Turicibacter, which is not in the Peptostreptococcaceae family. This is because the main ASV in Peptostreptococcaceae was unclassified at the genus level, and it was thrown out for relative abundance calculations, resulting in a misleading relative abundance for Turicibacter. You can also see differences in the phylum-level relative abundances of Firmicutes and Bacteroidota in the samples labeled 74. I go into more depth in the knitted Rmd. I've attached it below as a pdf, but it reads better as html (for tabbing through the figures), so I can email that version if you'd like it.

Taxa-bar.pdf

Thanks again for this package! It's helped our lab make some fantastic plots. I just wanted to bring this behavior to your attention, since it confused me for quite a few days :)

Thanks,
John

reorder_samples_by() group level

Currently the function allows samples to be reordered by a particular subgroup. Add a feature so they can be ordered by either subgroup or group level.

custom legend configuration

Create a custom ggplot microshades legend that uses the group name as a title, with a section of colors for its subgroups.

Reference Links

@KarstensLab Review the reference links for the datasets in each vignette. Currently there is a GitHub link, and I also added a secondary Bioconductor link. Please determine which link is best to use, or whether a different link should be used.