if I follow the example in the vignette I encounter this error: Add

Comments (4)

mcanouil commented on September 18, 2024

Hi,

I can't replicate your error.
The vignette also compiled successfully, as you can see on the website.

Below is a full reproducible example of the code you mentioned; as you can see, I don't get your error. Please check the session information at the end.

library(GEOquery)
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#> 
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind,
#>     colnames, dirname, do.call, duplicated, eval, evalq, Filter,
#>     Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
#>     mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#>     pmin.int, Position, rank, rbind, Reduce, rownames, sapply,
#>     setdiff, sort, table, tapply, union, unique, unsplit, which,
#>     which.max, which.min
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Setting options('download.file.method.GEOquery'='auto')
#> Setting options('GEOquery.inmemory.gpl'=FALSE)
# Download data
gse <- getGEO("GSE70970")
#> Found 1 file(s)
#> GSE70970_series_matrix.txt.gz
#> Parsed with column specification:
#> cols(
#>   .default = col_double(),
#>   ID_REF = col_character()
#> )
#> See spec(...) for full column specifications.
#> File stored at:
#> /tmp/RtmpKA2y6S/GPL20699.soft
# Get phenotypes
targets <- pData(phenoData(gse[[1]]))
getGEOSuppFiles(GEO = "GSE70970", baseDir = tempdir())
#>                                                                    size
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       1986560
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz     672
#>                                                                 isdir mode
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       FALSE  644
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz FALSE  644
#>                                                                               mtime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:23
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:24
#>                                                                               ctime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:23
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:24
#>                                                                               atime
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:25:21
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:25:23
#>                                                                  uid gid
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       1738  50
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz 1738  50
#>                                                                    uname
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                       mcanouil
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz mcanouil
#>                                                                 grname
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_RAW.tar                        staff
#> /tmp/RtmpKA2y6S/GSE70970/GSE70970_characteristics_readme.txt.gz  staff
# Unzip data
untar(
  tarfile = paste0(tempdir(), "/GSE70970/GSE70970_RAW.tar"), 
  exdir = paste0(tempdir(), "/GSE70970/Data")
)
# Add IDs
targets$IDFILE <- list.files(paste0(tempdir(), "/GSE70970/Data"))

library(NACHO)
#> 
#> Attaching package: 'NACHO'
#> The following object is masked from 'package:BiocGenerics':
#> 
#>     normalize
GSE70970_sum <- summarise(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute. 
)
#> [NACHO] Importing RCC files.
#> [NACHO] Performing QC and formatting data.
#> [NACHO] Searching for the best housekeeping genes.
#> [NACHO] Computing normalisation factors using "GEO" method for housekeeping genes prediction.
#> [NACHO] The following predicted housekeeping genes will be used for normalisation:
#>   - hsa-miR-103
#>   - hsa-let-7e
#>   - hsa-miR-1260
#>   - hsa-miR-500+hsa-miR-501-5p
#>   - hsa-miR-1274b
#> [NACHO] Computing normalisation factors using "GEO" method.
#> [NACHO] Missing values have been replaced with zeros for PCA.
#> [NACHO] Normalising data using "GEO" method with housekeeping genes.
#> [NACHO] Returning a list.
#>   $ access              : character
#>   $ housekeeping_genes  : character
#>   $ housekeeping_predict: logical
#>   $ housekeeping_norm   : logical
#>   $ normalisation_method: character
#>   $ remove_outliers     : logical
#>   $ n_comp              : numeric
#>   $ data_directory      : character
#>   $ pc_sum              : data.frame
#>   $ nacho               : data.frame
#>   $ outliers_thresholds : list
#>   $ raw_counts          : data.frame
#>   $ normalised_counts   : data.frame

sessioninfo::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       Debian GNU/Linux 9 (stretch)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_GB.UTF-8                 
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Etc/UTC                     
#>  date     2019-11-15                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package      * version date       lib source        
#>  assertthat     0.2.1   2019-03-21 [1] CRAN (R 3.6.1)
#>  backports      1.1.5   2019-10-02 [1] CRAN (R 3.6.1)
#>  Biobase      * 2.44.0  2019-05-02 [1] Bioconductor  
#>  BiocGenerics * 0.30.0  2019-05-02 [1] Bioconductor  
#>  cli            1.1.0   2019-03-19 [1] CRAN (R 3.6.1)
#>  colorspace     1.4-1   2019-03-18 [1] CRAN (R 3.6.1)
#>  crayon         1.3.4   2017-09-16 [1] CRAN (R 3.6.1)
#>  curl           4.2     2019-09-24 [1] CRAN (R 3.6.1)
#>  digest         0.6.21  2019-09-20 [1] CRAN (R 3.6.1)
#>  dplyr          0.8.3   2019-07-04 [1] CRAN (R 3.6.1)
#>  ellipsis       0.3.0   2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate       0.14    2019-05-28 [1] CRAN (R 3.6.1)
#>  GEOquery     * 2.52.0  2019-05-02 [1] Bioconductor  
#>  ggplot2        3.2.1   2019-08-10 [1] CRAN (R 3.6.1)
#>  glue           1.3.1   2019-03-12 [1] CRAN (R 3.6.1)
#>  gtable         0.3.0   2019-03-25 [1] CRAN (R 3.6.1)
#>  highr          0.8     2019-03-20 [1] CRAN (R 3.6.1)
#>  hms            0.5.1   2019-08-23 [1] CRAN (R 3.6.1)
#>  htmltools      0.4.0   2019-10-04 [1] CRAN (R 3.6.1)
#>  knitr          1.25    2019-09-18 [1] CRAN (R 3.6.1)
#>  lazyeval       0.2.2   2019-03-15 [1] CRAN (R 3.6.1)
#>  lifecycle      0.1.0   2019-08-01 [1] CRAN (R 3.6.1)
#>  limma          3.40.6  2019-07-26 [1] Bioconductor  
#>  magrittr       1.5     2014-11-22 [1] CRAN (R 3.6.1)
#>  munsell        0.5.0   2018-06-12 [1] CRAN (R 3.6.1)
#>  NACHO        * 0.6.1   2019-10-12 [1] CRAN (R 3.6.1)
#>  pillar         1.4.2   2019-06-29 [1] CRAN (R 3.6.1)
#>  pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 3.6.1)
#>  purrr          0.3.3   2019-10-18 [1] CRAN (R 3.6.1)
#>  R6             2.4.0   2019-02-14 [1] CRAN (R 3.6.1)
#>  Rcpp           1.0.2   2019-07-25 [1] CRAN (R 3.6.1)
#>  readr          1.3.1   2018-12-21 [1] CRAN (R 3.6.1)
#>  rlang          0.4.0   2019-06-25 [1] CRAN (R 3.6.1)
#>  rmarkdown      1.16    2019-10-01 [1] CRAN (R 3.6.1)
#>  scales         1.0.0   2018-08-09 [1] CRAN (R 3.6.1)
#>  sessioninfo    1.1.1   2018-11-05 [1] CRAN (R 3.6.1)
#>  stringi        1.4.3   2019-03-12 [1] CRAN (R 3.6.1)
#>  stringr        1.4.0   2019-02-10 [1] CRAN (R 3.6.1)
#>  tibble         2.1.3   2019-06-06 [1] CRAN (R 3.6.1)
#>  tidyr          1.0.0   2019-09-11 [1] CRAN (R 3.6.1)
#>  tidyselect     0.2.5   2018-10-11 [1] CRAN (R 3.6.1)
#>  vctrs          0.2.0   2019-07-05 [1] CRAN (R 3.6.1)
#>  withr          2.1.2   2018-03-15 [1] CRAN (R 3.6.1)
#>  xfun           0.10    2019-10-01 [1] CRAN (R 3.6.1)
#>  xml2           1.2.2   2019-08-09 [1] CRAN (R 3.6.1)
#>  yaml           2.2.0   2018-07-25 [1] CRAN (R 3.6.1)
#>  zeallot        0.1.0   2018-01-28 [1] CRAN (R 3.6.1)
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library
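
The `targets$IDFILE <- list.files(...)` step in the example above pairs files with rows of `targets` by position, so it relies on the extracted directory holding one file per sample. A minimal sketch of a sanity check that could be run before that assignment; `check_file_count` is a hypothetical helper, not part of NACHO or GEOquery, and the demonstration uses empty placeholder files rather than real RCC data:

```r
# Hypothetical helper (not part of NACHO or GEOquery): compare the number of
# rows in a sample sheet with the number of files in a directory.
check_file_count <- function(ssheet, data_directory) {
  files <- list.files(data_directory)
  list(
    n_samples = nrow(ssheet),
    n_files = length(files),
    ok = nrow(ssheet) == length(files)
  )
}

# Toy demonstration with empty placeholder files, not real RCC data.
demo_dir <- tempfile("rcc_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("s1.RCC", "s2.RCC", "s3.RCC"))))
demo_sheet <- data.frame(title = c("A", "B", "C"))
check <- check_file_count(demo_sheet, demo_dir)
str(check)
```

With the objects from this thread, the equivalent call would be `check_file_count(targets, paste0(tempdir(), "/GSE70970/Data"))`. A mismatch would not by itself explain any particular error; it only rules out one possible source of trouble.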


sheucke commented on September 18, 2024

I restarted R and tried again, and now it works. Sorry, I don't know what went wrong the first time.

best regards
Sebastian

library(GEOquery)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply,
parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit, which, which.max, which.min

Welcome to Bioconductor

Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)

gse <- getGEO("GSE70970")
Found 1 file(s)
GSE70970_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE70nnn/GSE70970/matrix/GSE70970_series_matrix.txt.gz'
Content type 'application/x-gzip' length 351607 bytes (343 KB)
==================================================
downloaded 343 KB

Parsed with column specification:
cols(
.default = col_double(),
ID_REF = col_character()
)
See spec(...) for full column specifications.
File stored at:
/tmp/RtmpQb9ReH/GPL20699.soft

targets <- pData(phenoData(gse[[1]]))
getGEOSuppFiles(GEO = "GSE70970", baseDir = tempdir())
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE70nnn/GSE70970/suppl//GSE70970_RAW.tar?tool=geoquery'
Content type 'application/x-tar' length 1986560 bytes (1.9 MB)
==================================================
downloaded 1.9 MB

trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE70nnn/GSE70970/suppl//GSE70970_characteristics_readme.txt.gz?tool=geoquery'
Content type 'application/x-gzip' length 672 bytes

downloaded 672 bytes

                                                                   size isdir mode               mtime               ctime
/tmp/RtmpQb9ReH/GSE70970/GSE70970_RAW.tar                       1986560 FALSE  664 2019-11-15 11:31:34 2019-11-15 11:31:34
/tmp/RtmpQb9ReH/GSE70970/GSE70970_characteristics_readme.txt.gz     672 FALSE  664 2019-11-15 11:31:35 2019-11-15 11:31:35
                                                                              atime  uid  gid     uname    grname
/tmp/RtmpQb9ReH/GSE70970/GSE70970_RAW.tar                       2019-11-15 11:31:32 1000 1000 sebastian sebastian
/tmp/RtmpQb9ReH/GSE70970/GSE70970_characteristics_readme.txt.gz 2019-11-15 11:31:34 1000 1000 sebastian sebastian

untar(
  tarfile = paste0(tempdir(), "/GSE70970/GSE70970_RAW.tar"),
  exdir = paste0(tempdir(), "/GSE70970/Data")
)

targets$IDFILE <- list.files(paste0(tempdir(), "/GSE70970/Data"))
library(NACHO)

Attaching package: ‘NACHO’

The following object is masked from ‘package:BiocGenerics’:

normalize

library(NACHO)
GSE70970_sum <- summarise(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)
[NACHO] Importing RCC files.
|========================================================================================================|100% ~0 s remaining
[NACHO] Performing QC and formatting data.
[NACHO] Searching for the best housekeeping genes.
[NACHO] Computing normalisation factors using "GEO" method for housekeeping genes prediction.
[NACHO] The following predicted housekeeping genes will be used for normalisation:
  - hsa-miR-103
  - hsa-let-7e
  - hsa-miR-1260
  - hsa-miR-500+hsa-miR-501-5p
  - hsa-miR-1274b
[NACHO] Computing normalisation factors using "GEO" method.
[NACHO] Missing values have been replaced with zeros for PCA.
[NACHO] Normalising data using "GEO" method with housekeeping genes.
[NACHO] Returning a list.
  $ access : character
  $ housekeeping_genes : character
  $ housekeeping_predict: logical
  $ housekeeping_norm : logical
  $ normalisation_method: character
  $ remove_outliers : logical
  $ n_comp : numeric
  $ data_directory : character
  $ pc_sum : data.frame
  $ nacho : data.frame
  $ outliers_thresholds : list
  $ raw_counts : data.frame
  $ normalised_counts : data.frame

sessioninfo::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 3.6.1 (2019-07-05)
os Ubuntu 18.04.3 LTS
system x86_64, linux-gnu
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Berlin
date 2019-11-15

─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
Biobase * 2.44.0 2019-05-02 [1] Bioconductor
BiocGenerics * 0.30.0 2019-05-02 [1] Bioconductor
cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.1)
colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
curl 4.2 2019-09-24 [1] CRAN (R 3.6.1)
dplyr 0.8.3 2019-07-04 [1] CRAN (R 3.6.1)
ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
GEOquery * 2.52.0 2019-05-02 [1] Bioconductor
ggplot2 3.2.1 2019-08-10 [1] CRAN (R 3.6.1)
glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
hms 0.5.2 2019-10-30 [1] CRAN (R 3.6.1)
knitr 1.26 2019-11-12 [1] CRAN (R 3.6.1)
lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.1)
lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.1)
limma 3.40.6 2019-07-26 [1] Bioconductor
magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
NACHO * 0.6.1 2019-10-12 [1] CRAN (R 3.6.1)
pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.1)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
purrr 0.3.3 2019-10-18 [1] CRAN (R 3.6.1)
R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.1)
readr 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
rlang 0.4.1 2019-10-24 [1] CRAN (R 3.6.1)
rstudioapi 0.10 2019-03-19 [1] CRAN (R 3.6.1)
scales 1.0.0 2018-08-09 [1] CRAN (R 3.6.1)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.1)
tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.1)
tidyr 1.0.0 2019-09-11 [1] CRAN (R 3.6.1)
tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.6.1)
vctrs 0.2.0 2019-07-05 [1] CRAN (R 3.6.1)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
xfun 0.11 2019-11-12 [1] CRAN (R 3.6.1)
xml2 1.2.2 2019-08-09 [1] CRAN (R 3.6.1)
zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.1)

[1] /home/sebastian/R/x86_64-pc-linux-gnu-library/3.6
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library


mcanouil commented on September 18, 2024

Perfect!
Enjoy NACHO ;)


athulmenon commented on September 18, 2024

Hi Mcanouil,

Restarted R and tried to run the code fresh again. Still the same error!
> GSE70970_sum <- summarize(
  data_directory = paste0(tempdir(), "/GSE70970/Data"), # Where the data is
  ssheet_csv = targets, # The samplesheet
  id_colname = "IDFILE", # Name of the column that contains the identifiers
  housekeeping_genes = NULL, # Custom list of housekeeping genes
  housekeeping_predict = TRUE, # Predict the housekeeping genes based on the data?
  normalisation_method = "GEO", # Geometric mean or GLM
  n_comp = 5 # Number indicating the number of principal components to compute.
)

The error goes like this:
[NACHO] Importing RCC files.
Error: Column cols must be length 1 (the number of rows), not 3

Any other solutions?
Thanks for the quick response.

Athul
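
Both working sessions earlier in this thread report R 3.6.1 with NACHO 0.6.1, tidyr 1.0.0, dplyr 0.8.3 and tibble 2.1.3, so comparing the installed versions against those listings might help narrow things down. A minimal base-R sketch, which assumes nothing about the actual cause of the error; `report_versions` is a hypothetical helper, not part of NACHO:

```r
# Hypothetical helper (not part of NACHO): report the installed version of
# each package, or NA when the package is not installed.
report_versions <- function(pkgs) {
  vapply(
    pkgs,
    function(pkg) {
      if (nzchar(system.file(package = pkg))) {
        as.character(utils::packageVersion(pkg))
      } else {
        NA_character_
      }
    },
    character(1)
  )
}

# Packages that appear in both session listings in this thread.
versions <- report_versions(c("NACHO", "tidyr", "dplyr", "tibble"))
print(versions)
```

Posting the full `sessioninfo::session_info()` output, as in the earlier comments, would give the same information in more detail.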


example in vignette error about nacho HOT 4 CLOSED
