
Combine two datasets by rownames (radiant, closed)

vnijs commented on July 2, 2024
Combine two datasets by rownames

from radiant.

Comments (11)

kmezhoud commented on July 2, 2024

I found the code that displays the tables in the Manage and Combine panels. How can I change it so that the row names are visible, as they are in the View panel?

In combine_ui.R:

output$cmb_data1 <- renderText({
  if (is.null(input$dataset2)) return()
  show_data_snippet(title = paste("

Dataset 1:",input$dataset,"

"))
})

In manage_ui.R:

output$htmlDataExample <- renderText({
  if (is.null(.getdata())) return()

  # Show only the first 10 (or 30) rows
  r_data[[paste0(input$dataset, "_descr")]] %>%
    { is_empty(.) %>% ifelse(., 30, 10) } %>%
    show_data_snippet(nshow = .)
})

vnijs commented on July 2, 2024

Because Radiant uses dplyr extensively, it will not support rownames. From the dplyr 0.4 announcement:

"I don’t think using row names is a good idea because it violates one of the principles of tidy data: every variable should be stored in the same way."

You cannot use the combine feature in Radiant to combine datasets by rowname. The best approach is to add the row names to your data as a regular variable. You can then combine the data by the new variable, called rowname:

library(dplyr)
mtcars %>% add_rownames()

If you want to do this inside Radiant, you could use Data > Transform and then select Create from the Transformation type dropdown. Enter the snippet rowname = rownames(.getdata()) and press return. If this worked as expected, press the Save changes button.
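Outside Radiant, the same idea can be sketched as follows. This is a minimal illustration, not Radiant's own code: the data frames prof and clin are hypothetical toy data, and tibble::rownames_to_column() is used here as the current replacement for add_rownames().

```r
library(dplyr)

# Hypothetical toy data: both data frames are keyed by row names only
prof <- data.frame(gene_a = c(1.2, 3.4, 5.6),
                   row.names = c("P1", "P2", "P3"))
clin <- data.frame(age = c(50, 61), gender = c("F", "M"),
                   row.names = c("P1", "P3"),
                   stringsAsFactors = FALSE)

# Move the row names into a regular column called "rowname"
prof_id <- tibble::rownames_to_column(prof, var = "rowname")
clin_id <- tibble::rownames_to_column(clin, var = "rowname")

# Join on the new column; only ids present in both data frames are kept
combined <- inner_join(prof_id, clin_id, by = "rowname")
print(combined)
```

Once the row names are an ordinary column, any of the join types can be applied to it in the same way.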

kmezhoud commented on July 2, 2024

Yes, that works. Thanks.
When I combine with "bind_rows", only the columns with numeric values are filled in. The character columns are empty.
In my case I am merging a df of quantitative gene values (numeric) with a df of clinical patient data (age, gender...). The combined result has all the columns of both dfs, but the clinical data columns remain empty.

How can I allow non-numeric data to be merged? Maybe in the combinedata function in the combine.R file (line 22)? Or with the getdata function?
Thanks
Karim

vnijs commented on July 2, 2024

In case you haven't yet, please make sure to read the documentation for Combine. bind_rows is not a type of merge. It just stacks one dataset on top of another. The columns that go together should have the same type in this case (see screenshot below). I expect you want one of the _join options (e.g., inner_join). If this is not what is going on, please send me (small) datasets that demonstrate your problem (preferably as a Radiant state-file).

[Screenshot: 2015-06-02 at 10:51:51 AM]
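The difference between stacking and joining can be illustrated with a small sketch. The data frames below are hypothetical toy data, not the datasets from this thread.

```r
library(dplyr)

# Hypothetical toy data sharing an "id" column
prof <- data.frame(id = c("P1", "P2"), gene_a = c(1.2, 3.4),
                   stringsAsFactors = FALSE)
clin <- data.frame(id = c("P1", "P2"), age = c(50, 61),
                   stringsAsFactors = FALSE)

# bind_rows stacks the rows; columns missing from one input are filled with NA
stacked <- bind_rows(prof, clin)

# inner_join matches rows on "id" and puts the columns side by side
joined <- inner_join(prof, clin, by = "id")

print(stacked)
print(joined)
```

Here stacked has four rows with NA wherever a column did not exist in the source data frame, while joined has one complete row per id.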

kmezhoud commented on July 2, 2024

I would like to merge ProfData with Clinicaldata by Patient IDs.

https://drive.google.com/file/d/0B9NOY9eukkEeaU5CQW5rRk00TE0/view?usp=sharing

vnijs commented on July 2, 2024

Thanks for the data @kmezhoud and for bringing this issue to my attention. I found the issue. combine, by default, would drop rows with missing values from each dataset. Since ClinicalData seems to have no values for IDC_10, there was no data to combine with. I have changed the combine function so that now, by default, it does not remove rows with missing values. I will package this up later this week, but you can use devtools::install_github("vnijs/radiant") if you are set up to use devtools. If not, you can see the results on a server through the link below.

http://vnijs.rady.ucsd.edu:3838/marketing/?SSUID=f4d278c1bec5d1c5f76d4b1dd79ddee1

Alternatively, you can just drop the IDC_10 variable in the Data > Transform tab with Reorder/remove variables. Unless this is simulated data, you can remove it from your Google Drive now. I will remove the data from the instance at the link above on your request.
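The effect described above can be reproduced in a small sketch. It assumes toy data and uses base R's na.omit() as a stand-in for the row dropping; how Radiant's getdata removed missing values internally is not shown in this thread.

```r
library(dplyr)

# Hypothetical toy data; IDC_10 has no values at all, as in ClinicalData
prof <- data.frame(id = c("P1", "P2"), gene_a = c(1.2, 3.4),
                   stringsAsFactors = FALSE)
clin <- data.frame(id = c("P1", "P2"), age = c(50, 61),
                   IDC_10 = c(NA, NA), stringsAsFactors = FALSE)

# Dropping incomplete rows first removes every row of clin (assumed stand-in)
dropped <- inner_join(na.omit(prof), na.omit(clin), by = "id")

# Keeping rows with missing values preserves the match
kept <- inner_join(prof, clin, by = "id")

# Alternative: drop the all-NA variable before combining
fixed <- inner_join(prof, select(clin, -IDC_10), by = "id")

print(nrow(dropped))
print(kept)
print(fixed)
```

A single all-NA column is enough to empty the join when incomplete rows are removed first, which is why either keeping missing values or dropping IDC_10 resolves the problem.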

kmezhoud commented on July 2, 2024

I am working with version 0.2.12. When I added na.rm=FALSE, the cmb_datasets does not appear in r_data$datasets. Here is my version of combine.R.

#' Combine datasets using dplyr's bind and join functions
#'
#' @details See \url{http://vnijs.github.io/radiant/base/combine.html} for an example in Radiant
#'
#' @param dataset Dataset name (string). This can be a dataframe in the global environment or an element in an r_data list from Radiant
#' @param dataset2 Dataset name (string) to combine with dataset. This can be a dataframe in the global environment or an element in an r_data list from Radiant
#' @param cmb_vars Variables used to combine dataset and dataset2
#' @param cmb_type The main bind and join types from the dplyr package are provided. \bold{inner_join} returns all rows from x with matching values in y, and all columns from x and y. If there are multiple matches between x and y, all match combinations are returned. \bold{left_join} returns all rows from x, and all columns from x and y. If there are multiple matches between x and y, all match combinations are returned. \bold{right_join} is equivalent to a left join for datasets y and x. \bold{full_join} combines two datasets, keeping rows and columns that appear in either. \bold{semi_join} returns all rows from x with matching values in y, keeping just columns from x. A semi join differs from an inner join because an inner join will return one row of x for each matching row of y, whereas a semi join will never duplicate rows of x. \bold{anti_join} returns all rows from x without matching values in y, keeping only columns from x. \bold{bind_rows} and \bold{bind_cols} are also included, as are \bold{intersect}, \bold{union}, and \bold{setdiff}. See \url{http://vnijs.github.io/radiant/base/combine.html} for further details
#' @param cmb_name Name for the combined dataset
#'
#' @return If list r_data exists the combined dataset is added as cmb_name. Else the combined dataset will be returned as cmb_name
#'
#' @examples
#' combinedata("titanic","titanic_pred",c("pclass","sex","age")) %>% head
#' titanic %>% combinedata("titanic_pred",c("pclass","sex","age")) %>% head
#' titanic %>% combinedata(titanic_pred,c("pclass","sex","age")) %>% head
#' avengers %>% combinedata(superheroes, cmb_type = "bind_cols")
#' combinedata("avengers", "superheroes", cmb_type = "bind_cols")
#' avengers %>% combinedata(superheroes, cmb_type = "bind_rows")
#'
#' @export
combinedata <- function(dataset, dataset2,
                        cmb_vars = "",
                        cmb_type = "bind_rows",
                        cmb_name = "") {

  is_join <- grepl("_join", cmb_type)
  if (is_join && cmb_vars[1] == "")
    return(cat("No variables selected to join datasets"))

  if (cmb_name == "")
    cmb_name <- if (is_string(dataset)) paste0("cmb_", dataset) else "cmb_data"

  if (is_join) {
    cmb_dat <- get(cmb_type)(getdata(dataset, na.rm = FALSE), getdata(dataset2, na.rm = FALSE), by = cmb_vars)
    cmb_madd <- paste0("\n\nBy: ", paste0(cmb_vars, collapse = ", "))
  } else {
    cmb_dat <- get(cmb_type)(getdata(dataset, na.rm = FALSE), getdata(dataset2, na.rm = FALSE))
    cmb_madd <- ""
  }
  cmb_message <- paste0("\n### Combined\n\nDatasets: ", dataset, " and ",
                        dataset2, " (", cmb_type, ")", cmb_madd, "\n\nOn: ",
                        lubridate::now())

  if (exists("r_env")) {
    c_env <- r_env
  } else if (exists("r_data")) {
    c_env <- pryr::where("r_data")
  } else {
    return(cmb_dat)
  }

  c_env$r_data[[cmb_name]] <- cmb_dat
  c_env$r_data[[cmb_name]] %>% head %>% print
  c_env$r_data[['datasetlist']] <- c(cmb_name, c_env$r_data[['datasetlist']]) %>% unique
  c_env$r_data[[paste0(cmb_name, "_descr")]] <- cmb_message
  cat("\nCombined data added as", cmb_name, "\n")
}
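
The core of the function above is looking up a dplyr verb by its name and calling it. A stripped-down sketch of just that dispatch step, assuming plain data frames as input, might look like this. combine_sketch is a hypothetical helper and not part of radiant; it leaves out getdata, r_data, and the message handling.

```r
library(dplyr)

# Hypothetical helper (not part of radiant): pick a dplyr verb by name
combine_sketch <- function(x, y, cmb_vars = "", cmb_type = "bind_rows") {
  is_join <- grepl("_join", cmb_type)
  if (is_join && cmb_vars[1] == "")
    stop("No variables selected to join datasets")

  # Look the function up in the dplyr namespace rather than the search path
  fun <- get(cmb_type, envir = asNamespace("dplyr"))

  if (is_join) fun(x, y, by = cmb_vars) else fun(x, y)
}

# Hypothetical toy data with partially overlapping ids
prof <- data.frame(id = c("P1", "P2"), gene_a = c(1.2, 3.4),
                   stringsAsFactors = FALSE)
clin <- data.frame(id = c("P2", "P3"), age = c(61, 47),
                   stringsAsFactors = FALSE)

full <- combine_sketch(prof, clin, cmb_vars = "id", cmb_type = "full_join")
print(full)
```

With full_join, rows that appear in only one input are kept and the missing side is filled with NA, so no rows are removed in this sketch.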

vnijs commented on July 2, 2024

Did you try this on the server: http://vnijs.rady.ucsd.edu:3838/marketing/ ?

vnijs commented on July 2, 2024

I just updated the package on GitHub. Please try:

install.packages("radiant", repos = "http://vnijs.github.io/radiant_miniCRAN/")

and let me know if this works for you.

kmezhoud commented on July 2, 2024

Hi,
Yes, that works for marketing and for mini. I am working with version 0.2.12 and I am adding other tools that connect to another server to get cancer genomics data, so I need to set na.rm = FALSE in the old version. The combine.R file does not seem to be the same as in the latest version.
I get no results when I use the latest version of the combine.R file with full_join.
Thanks

vnijs commented on July 2, 2024

Radiant is still under heavy development. I suggest you sync your fork and go from there (see link below).

https://help.github.com/articles/syncing-a-fork/
