r-lib / crancache
Transparent caching of CRAN package files - WORK IN PROGRESS!
License: Other
Would help revdep-checking on Linux, I think.
The cache directory should just be a column, and the object returned should have a print method, so that it is still printed nicely.
(if not set)
Is there an ETA? I'm thinking about using this for tic (ropensci/tic#35 (comment)), but we'd also need a way to purge old versions from the cache.
At least from GH would be nice.
I noticed that different r-lib packages will use one or the other. What are the differences between them? Also, it looks like development on crancache has mostly stopped; does this mean packages that use it (only revdepcheck afaik) will likely switch to pkgcache in the future?
I called devtools::install(build_vignettes = TRUE) to install a local package and got an error coming from crancache. Sorry this is not a clean reprex, but now that the packages are installed, I can't go back. But I think this is probably informative enough to make a fix.
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?
1: All
2: CRAN packages only
3: None
4: tibble (3.1.8 -> 3.2.0) [CRAN]
5: fastmap (1.1.0 -> 1.1.1) [CRAN]
6: cachem (1.0.6 -> 1.0.7) [CRAN]
7: ggplot2 (3.4.0 -> 3.4.1) [CRAN]
Enter one or more numbers, or an empty line to skip updates: 1
tibble (3.1.8 -> 3.2.0) [CRAN]
fastmap (1.1.0 -> 1.1.1) [CRAN]
cachem (1.0.6 -> 1.0.7) [CRAN]
ggplot2 (3.4.0 -> 3.4.1) [CRAN]
Installing 4 packages: tibble, fastmap, cachem, ggplot2
Installing packages into '/Users/jenny/Library/R/x86_64/4.2/library'
(as 'lib' is unspecified)
There is a binary version available but the source version is later:
binary source needs_compilation
tibble 3.1.8 3.2.0 TRUE
Do you want to install from sources the package which needs compilation? (Yes/no/cancel)
trying URL 'https://cloud.r-project.org/bin/macosx/contrib/4.2/fastmap_1.1.1.tgz'
Content type 'application/x-gzip' length 201349 bytes (196 KB)
==================================================
downloaded 196 KB
trying URL 'https://cloud.r-project.org/bin/macosx/contrib/4.2/cachem_1.0.7.tgz'
Content type 'application/x-gzip' length 67349 bytes (65 KB)
==================================================
downloaded 65 KB
The downloaded binary packages are in
/tmp/RtmpszWjQG/downloaded_packages
installing the source package 'tibble'
trying URL 'https://cloud.r-project.org/src/contrib/tibble_3.2.0.tar.gz'
Content type 'application/x-gzip' length 565955 bytes (552 KB)
==================================================
downloaded 552 KB
* installing *source* package ‘tibble’ ...
** package ‘tibble’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/usr/local/include -fPIC -Wall -g -O2 -c attributes.c -o attributes.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/usr/local/include -fPIC -Wall -g -O2 -c coerce.c -o coerce.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/usr/local/include -fPIC -Wall -g -O2 -c init.c -o init.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/usr/local/include -fPIC -Wall -g -O2 -c matrixToDataFrame.c -o matrixToDataFrame.o
clang -mmacosx-version-min=10.13 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o tibble.so attributes.o coerce.o init.o matrixToDataFrame.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /Users/jenny/Library/R/x86_64/4.2/library/00LOCK-tibble/00new/tibble/libs
** R
** inst
** tests
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (tibble)
The downloaded source packages are in
'/private/tmp/RtmpszWjQG/downloaded_packages'
Adding 'cachem_1.0.7.tgz' to the cache
Adding 'fastmap_1.1.1.tgz' to the cache
Adding 'tibble_3.2.0.tar.gz' to the cache
Error in "INSTALL_opts" %in% names(args) && grepl("--no-test-load", args$INSTALL_opts, :
'length(x) = 2 > 1' in coercion to 'logical(1)'
Run `rlang::last_error()` to see where the error occurred.
> le
<error/rlang_error>
Error:
! 'length(x) = 2 > 1' in coercion to 'logical(1)'
---
Backtrace:
1. devtools::install(build_vignettes = TRUE)
2. remotes::install_deps(...)
at devtools/R/install.R:82:4
4. remotes:::update.package_deps(...)
5. remotes:::install_packages(...)
7. remotes (local) `<fn>`(...)
12. crancache (local) i.p(...)
13. crancache:::update_cache(...)
14. crancache:::update_cache_safe(...)
15. crancache:::update_cache_binaries(...)
Run `rlang::last_trace()` to see the full context.
> revdepcheck::revdep_check(bioc = TRUE, dependencies = c("Depends", "Imports", "Suggests", "Enhances", "LinkingTo"), quiet = FALSE, num_workers = 24, timeout = as.difftime(60, units = "mins"))
── CHECK ─────────────────────────────────────────────────────── 241 packages ──
Error in readRDS(dest) : error reading from connection
Calls: <Anonymous> ... available_packages -> eval -> eval -> <Anonymous> -> readRDS
Execution halted
Seems that _meta/https-bioconductor-org-packages-3-5-bioc-src-contrib/PACKAGES.rds has size zero on my system. I'll delete it and try again.
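A zero-byte index like this can be spotted before a long run starts. The following is a minimal sketch, assuming the whole cache lives under one directory tree; `find_empty_metadata()` is a hypothetical helper and not part of crancache.

```r
# Hypothetical helper (not part of crancache): list PACKAGES index files
# of size zero anywhere under `cache_dir`.
find_empty_metadata <- function(cache_dir) {
  files <- list.files(
    cache_dir,
    pattern = "^PACKAGES(\\.rds|\\.gz)?$",  # matched against file names only
    recursive = TRUE,
    full.names = TRUE
  )
  files[file.size(files) == 0]
}

# Example (the path is an assumption, adjust it to your cache location):
# unlink(find_empty_metadata("~/.cache/R-crancache"))
```

Deleting the empty files only forces the metadata to be fetched again; it does not explain how they became empty in the first place.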
Just prints the directory so you can inspect it locally if you want
During a long revdep run, I see quite a few:
In file.rename(file.path(dir, pkgs_new), file.path(dir, pkgs)) :
cannot rename file '/Users/hadley/Library/Caches/R-crancache/other/bin/macosx/el-capitan/contrib/3.4/PACKAGES.rds.new' to '/Users/hadley/Library/Caches/R-crancache/other/bin/macosx/el-capitan/contrib/3.4/PACKAGES.rds', reason 'No such file or directory'
I'm re-running now with options(warn = 2) to at least get a traceback.
crancache::install_packages("dplyr")
#> Error in if (use_cache) { : argument is of length zero
I have two minor usability gripes with crancache_remove():
remove_packages() (in parallel with update_packages(), install_packages()
Adding a remove_packages() function that did string-matching rather than regex-matching would preserve backward compatibility ...
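To illustrate the difference between the two matching styles mentioned here, a small sketch on a made-up vector of cached package names (the names are only examples, and this is not crancache code):

```r
# Made-up list of cached package names, for illustration only
cached <- c("rex", "reprex", "regexplain")

# Regex matching: the pattern "rex" also hits "reprex"
regex_hits <- cached[grepl("rex", cached)]

# Exact string matching: only the package literally named "rex"
exact_hits <- cached[cached %in% "rex"]
```

With regex matching, a user who wants to drop only "rex" would have to anchor the pattern as "^rex$".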
In update_cache_binaries(), there's:
Lines 7 to 8 in e2185c7
That grepl() statement can resolve to a logical vector of any length.
I just ran into this indirectly (sorry, tried but fail to reproduce and don't have much time) where I ran with:
_R_CHECK_LENGTH_1_CONDITION=true
_R_CHECK_LENGTH_1_LOGIC2_=true
and received:
...
Adding 'fs_1.3.0.tar.gz' to the cache
Error in "INSTALL_opts" %in% names(args) && grepl("--no-test-load", args$INSTALL_opts, :
'length(x) = 2 > 1' in coercion to 'logical(1)'
Calls: <Anonymous> ... update_cache -> update_cache_safe -> update_cache_binaries
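One way to make the condition length-one is to collapse the `grepl()` result with `any()`. This is only a sketch of the idea, not the actual crancache fix; `has_no_test_load()` is a hypothetical helper, and `args` is assumed to be a named list as in the error message above.

```r
# Hypothetical helper (not crancache code): TRUE if any element of
# args$INSTALL_opts contains the --no-test-load flag.
has_no_test_load <- function(args) {
  opts <- args$INSTALL_opts
  # any() reduces the grepl() vector to a single TRUE/FALSE, so && is safe
  !is.null(opts) && any(grepl("--no-test-load", opts, fixed = TRUE))
}

has_no_test_load(list(INSTALL_opts = c("--no-multiarch", "--no-test-load")))
has_no_test_load(list(INSTALL_opts = "--no-multiarch"))
has_no_test_load(list())
```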
In revdepcheck I keep running into situations where the crancache contains the binary but not the source package, see r-lib/rcmdcheck#28. Not sure how this is possible, but in that case download.packages() will happily return the binary package.
contrib.url("", "binary") still returns "src/contrib" on Linux, so a parallel repo seems to be the only solution.
We can do a better job than the session-based caching in tools.
Like when doing a massive revdep check.
We probably should not make a copy of all of it, but do something smarter.
I see that the crancache uses $HOME/.cache/R-crancache/cran-bin/src/contrib/ to cache binary package builds. On my Linux system, I'm running R 3.6.1, R 3.6.1 patched, and R-devel, in parallel.
Q. My impression is that crancache assumes that you run a single R version, and whenever switching, or upgrading (or use a build with different configure flags), you need to wipe the crancache (binary) cache. Is that correct?
Is it possible for crancache::install_packages() to force a re-install, e.g. if you want to re-build a package towards a newer library dependency, e.g. a modern version of SQLite3?
The instructions in the README say to do source("https://install-github.me/r-lib/crancache"). I wanted to look at what's behind that URL before running its code, but Chrome says the HTTPS certificate isn't valid, so it refused to load it.
I can override Chrome - but my company's firewall says it contains "malicious code" (sometimes it's overzealous...).
I think the certificate error may actually be caused by my firewall itself, which is decidedly unhelpful of it.
In any case - I tried going to the same URL on my phone (which isn't inside the firewall), and saw that the code at that URL is pretty large. But is it basically just doing devtools::install_github('r-lib/crancache')? Is that another reasonable option, if I've already got devtools (or remotes) installed?
It would be convenient if install_packages() reported on the cache size. Maybe something like:
#> Cache updated:
#> * Saved 10 MB
#> * Download 150 MB
#> * Cache size 1.5 GB
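The "Cache size" line could be computed along these lines. This is a minimal sketch, assuming the cache is a plain directory tree; `cache_size()` is a hypothetical helper and not an existing crancache function.

```r
# Hypothetical helper (not part of crancache): total size in bytes of all
# files below `cache_dir`.
cache_size <- function(cache_dir) {
  files <- list.files(cache_dir, recursive = TRUE, full.names = TRUE)
  sum(file.size(files))
}

# Human-readable formatting via the "object_size" class from utils
pretty_size <- function(bytes) {
  format(structure(bytes, class = "object_size"), units = "auto")
}

# Example (the path is an assumption):
# pretty_size(cache_size("~/.cache/R-crancache"))
```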
And it would be nice to have something like cache_prune() which would just keep one version (the latest) of each package.
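The selection step of such a prune could look like the sketch below. It assumes a data frame with `Package`, `Repository` and `Version` columns, similar to the `crancache_list()` output shown elsewhere on this page; `latest_only()` is a hypothetical helper, and the sketch only decides which rows to keep, it does not delete any files.

```r
# Hypothetical helper (not part of crancache): keep the row with the highest
# version for each Repository/Package combination.
latest_only <- function(pkgs) {
  keep <- logical(nrow(pkgs))
  groups <- split(seq_len(nrow(pkgs)), paste(pkgs$Repository, pkgs$Package))
  for (idx in groups) {
    # package_version() compares numerically, so 3.10.0 is newer than 3.9.0
    versions <- package_version(as.character(pkgs$Version[idx]))
    keep[idx[which(versions == max(versions))[1]]] <- TRUE
  }
  pkgs[keep, , drop = FALSE]
}

# Made-up example data
cache <- data.frame(
  Package = c("tibble", "tibble", "fastmap", "tibble"),
  Repository = "cran/source",
  Version = c("3.9.0", "3.10.0", "1.1.1", "3.2.0"),
  stringsAsFactors = FALSE
)
pruned <- latest_only(cache)
```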
I've managed to corrupt my crancache by manually removing a single binary *.tar.gz in the CRANCACHE_DIR. This gives me the following error:
> crancache::install_packages("CGHbase")
Installing package into ‘/wynton/home/cbi/hb/R/x86_64-pc-linux-gnu-library/3.6-CBI’
(as ‘lib’ is unspecified)
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
package ‘CGHbase’ does not exist on the local repository
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
package ‘CGHbase’ does not exist on the local repository
> subset(crancache::crancache_list(), Package == "CGHbase")
Package Repository Version MD5sum
162 CGHbase bioc-bin/source 1.44.0 b59cb3439595617e0e0e34a1bf810641
The file is truly not there;
ls -R /wynton/home/cbi/hb/R/x86_64-pc-linux-gnu-library/3.6-CBI | grep CGHbase | wc -l
0
QUESTION: How can I refresh crancache so that it doesn't believe 'CGHbase' exists?
Just an FYI: No error per se, but I noticed the following double-slash path that looks like a mistake:
> res <- crancache::download_packages("R.utils", destdir = tempdir())
> res
[,1]
[1,] "R.utils"
[,2]
[1,] "//home/hb/.cache/R-crancache/cran/src/contrib/R.utils_2.8.0.tar.gz"
> dir(tempdir())
[1] "downloaded_packages"
[2] "libloc_193_e9a4b394a35614d1.rds"
[3] "libloc_203_c3437f9018af3264.rds"
[4] "repos_https%3A%2F%2Fbioconductor.org%2Fpackages%2F3.9%2Fbioc%2Fsrc%2Fcontrib.rds"
[5] "repos_https%3A%2F%2Fbioconductor.org%2Fpackages%2F3.9%2Fdata%2Fannotation%2Fsrc%2Fcontrib.rds"
[6] "repos_https%3A%2F%2Fbioconductor.org%2Fpackages%2F3.9%2Fdata%2Fexperiment%2Fsrc%2Fcontrib.rds"
[7] "repos_https%3A%2F%2Fcloud.r-project.org%2Fsrc%2Fcontrib.rds"
[8] "repos_https%3A%2F%2Fwww.stats.ox.ac.uk%2Fpub%2FRWin%2Fsrc%2Fcontrib.rds"
> packageVersion("crancache")
[1] '0.0.0.9001'
Is there some way to get crancache and remotes::install_deps to play nice together? Maybe some way of tricking install_deps to use crancache::install_packages instead of utils::install.packages?
Or is the best way to slurp in the information from the DESCRIPTION file and iterate through the dependencies manually?
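The manual route mentioned in the question could start from something like this. It is a minimal sketch using base R's `read.dcf()`; `description_deps()` is a hypothetical helper, it ignores version requirements and `Remotes:`, and it does not resolve recursive dependencies.

```r
# Hypothetical helper: extract dependency package names from a DESCRIPTION
# file, dropping version requirements and the R pseudo-dependency.
description_deps <- function(path,
                             fields = c("Depends", "Imports", "LinkingTo")) {
  dcf <- read.dcf(path, fields = fields)
  raw <- unlist(strsplit(dcf[!is.na(dcf)], ","))
  pkgs <- trimws(sub("\\(.*\\)", "", raw))  # strip "(>= x.y.z)" parts
  setdiff(unique(pkgs[nzchar(pkgs)]), "R")
}

# Possible usage (assumes a DESCRIPTION file in the working directory):
# crancache::install_packages(description_deps("DESCRIPTION"))
```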
Because R CMD INSTALL does not set the mtime of the installed lib any more. So we need a new strategy here.
(For background, see r-lib/revdepcheck#108)
> repos <- "https://bioconductor.org/packages/3.5/data/annotation"
> pkgs <- crancache::available_packages(repos = repos)
Error: invalid version specification '2.2.0 '
Enter a frame number, or 0 to exit
1: crancache::available_packages(repos = repos)
2: apply_filters(res, filters)
3: f(pkgs)
4: tools:::.remove_stale_dups(db)
5: package_version(ap[wh, "Version"])
6: .make_numeric_version(x, strict, .standard_regexps()$valid_package_version, "
7: stop(gettextf("invalid version specification %s", paste(sQuote(unique(x[!ok])
From quick debugging of available_packages(), I found that although:
> db <- utils::available.packages(repos = "https://bioconductor.org/packages/3.5/data/annotation")
> db["hgu95av2", "Version"]
[1] "2.2.0"
the local cached repository gives:
> db <- utils::available.packages(repos = "file:///home/hb/.cache/R-crancache/bioc")
> db["hgu95av2", "Version"]
[1] "2.2.0 "
So, somehow the local cache does not strip that trailing space in the DESCRIPTION:Version field.
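The effect of the trailing space, and a possible workaround on the consumer side, can be reproduced without the cache. This is only a sketch on a hand-built matrix that mimics the shape of an `available.packages()` result; where the whitespace should really be stripped (crancache or one of its dependencies) is left open.

```r
# Hand-built stand-in for the cached package database (illustration only)
db <- matrix(
  c("hgu95av2", "2.2.0 "),
  nrow = 1,
  dimnames = list("hgu95av2", c("Package", "Version"))
)

# package_version("2.2.0 ") fails with "invalid version specification",
# so strip surrounding whitespace before parsing
db[, "Version"] <- trimws(db[, "Version"])
versions <- package_version(db[, "Version"])
```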
I.e. in separate local repositories. This is because sometimes we only want to use proper CRAN packages, like when doing revdep checks.
The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.
That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.
The purpose of this issue is to:
message id: euphoric_snowdog
I tracked a problem down to this bit of code in get_cache_dir_for_file(file):
which <- if (grepl("\\.zip$", file)) {
  if (.Platform$pkgType == "win.binary") {
    "platform"
  }
} else if (grepl("\\.tgz$", file)) {
  if (grepl("^mac.binary", .Platform$pkgType)) {
    "platform"
  }
} else if (grepl("\\.tar\\.gz$", file)) {
  ## This also includes non-standard (Linux, Solaris, etc.) binaries
  "source"
}
get_cache_package_dirs()[[paste0(prefix, which)]]
If the argument file is a string like ".../reshape2_1.4.3.tgz" and .Platform$pkgType is "source", then I'm ending up with which equal to NULL. So at the end of the function, it's trying to do get_cache_package_dirs()[["cran/"]], but that's not a valid key, so I get this error:
Error in get_cache_package_dirs()[[paste0(prefix, which)]] :
subscript out of bounds
The names of get_cache_package_dirs() are:
Browse[8]> names(get_cache_package_dirs())
[1] "cran-bin/source" "cran/platform" "cran/source" "bioc-bin/source"
[5] "bioc/platform" "bioc/source" "other-bin/source" "other/platform"
[9] "other/source"
It looks like there are several paths through the which assignment code that result in NULL.
The consequence of this is that the requested package does get installed, but if I run the same install_packages('pkgname') command again, neither the binary nor the source are found in the cache.
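One way to make the fall-through cases explicit is to return `NA` instead of an implicit `NULL`, so the caller can skip or report the file rather than fail on the subscript. The sketch below is not the crancache implementation; `classify_package_file()` and its `pkg_type` argument are hypothetical, introduced so the logic can be exercised on any platform.

```r
# Hypothetical variant of the classification step (not crancache code).
# Returns "platform", "source", or NA when the file type does not fit
# the current platform's package type.
classify_package_file <- function(file, pkg_type = .Platform$pkgType) {
  if (grepl("\\.zip$", file)) {
    if (pkg_type == "win.binary") "platform" else NA_character_
  } else if (grepl("\\.tgz$", file)) {
    if (grepl("^mac.binary", pkg_type)) "platform" else NA_character_
  } else if (grepl("\\.tar\\.gz$", file)) {
    "source"
  } else {
    NA_character_
  }
}

# The case from this report: a .tgz file while pkgType is "source"
classify_package_file("reshape2_1.4.3.tgz", pkg_type = "source")
```

A caller would then check `is.na()` before building the lookup key with `paste0(prefix, which)`.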
This approach looks more correct to me:
if (should_update_crancache()) {
  add_built_binaries <- should_add_binaries()
  on.exit(update_cache(
    destdir, binaries = add_built_binaries, warnings, errors, lib,
    timestamp, args
  ))
}
withCallingHandlers(
  utils::install.packages(
    pkgs = pkgs,
    lib = lib,
    repos = myrepos,
    ## We don't specify contriburl, on purpose
    method = method,
    available = NULL, # overwritten
    destdir = destdir,
    dependencies = dependencies,
    type = type,
    ...),
  ## `<<-` so that the handler updates `warnings` in the enclosing function
  warning = function(w) { warnings <<- append(warnings, w); warning(w) }
)
tryCatch() installs exiting handlers, so the first warning will terminate execution of install.packages().
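The difference between exiting and calling handlers can be seen in a small standalone sketch, where `f()` is a made-up stand-in for `install.packages()`:

```r
# Made-up function that warns and then keeps working
f <- function() {
  warning("first")
  "finished"
}

# Exiting handler: the warning unwinds the stack, f() never completes
res_try <- tryCatch(f(), warning = function(w) "interrupted")

# Calling handler: the warning is recorded, then f() resumes where it left off
seen <- character()
res_calling <- withCallingHandlers(
  f(),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Here `res_try` holds the handler's value, while `res_calling` holds the value `f()` returned after the warning was handled.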
This likely needs to be fixed upstream (in cranlike, desc, or somewhere else), still filing here because it affects this package (and ultimately revdepcheck) and I'm not sure what the best fix is.
The reprex wipes your CRAN cache, be careful! We do seem to need a CRANCACHE_DIR env var.
library(crancache)
repos <- c(
BioCsoft = "https://bioconductor.org/packages/3.5/bioc",
BioCann= "https://bioconductor.org/packages/3.5/data/annotation",
BioCexp = "https://bioconductor.org/packages/3.5/data/experiment",
BioCextra = "https://bioconductor.org/packages/3.5/extra",
CRAN = "http://cran.rstudio.com/"
)
# Caution: Wipes your CRAN cache!!!
unlink("~/.cache/R-crancache", recursive = TRUE)
install_packages("org.Hs.ipi.db", repos = repos)
#> Installing package into '/home/muelleki/R/x86_64-pc-linux-gnu-library/3.4'
#> (as 'lib' is unspecified)
#> Adding 'org.Hs.ipi.db_1.3.0.tar.gz' to the cache
#> Adding 'org.Hs.ipi.db_1.3.0_R_x86_64-pc-linux-gnu.tar.gz' to the cache
available_packages(repos = repos)
#> Error: invalid version specification '1.3.0 '
system("grep -A 10 org.Hs.ipi.db ~/.cache/R-crancache/*/*/*/PACKAGES | sed 's/$/|/'", intern = TRUE)
#> [1] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES:Package: org.Hs.ipi.db|"
#> [2] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES-Version: 1.3.0|"
#> [3] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES-Depends: R(>= 2.12.0) , methods, AnnotationDbi (>= 1.3.12), PAnnBuilder|"
#> [4] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES- (>= 1.3.0)|"
#> [5] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES-Imports: methods, AnnotationDbi, PAnnBuilder|"
#> [6] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES-License: Artistic-2.0|"
#> [7] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES-MD5sum: bfd850aef2bda5620d7c942e90d9cc32|"
#> [8] "/home/muelleki/.cache/R-crancache/bioc-bin/src/contrib/PACKAGES:File: org.Hs.ipi.db_1.3.0_R_x86_64-pc-linux-gnu.tar.gz|"
#> [9] "--|"
#> [10] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES:Package: org.Hs.ipi.db|"
#> [11] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES-Version: 1.3.0 |"
#> [12] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES-Depends: R(>= 2.12.0) , methods, AnnotationDbi (>= 1.3.12), PAnnBuilder (>= 1.3.0) |"
#> [13] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES-Imports: methods, AnnotationDbi, PAnnBuilder |"
#> [14] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES-License: Artistic-2.0|"
#> [15] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES-MD5sum: 787cc8c4512e1e07c2ef0812024438ca|"
#> [16] "/home/muelleki/.cache/R-crancache/bioc/src/contrib/PACKAGES:File: org.Hs.ipi.db_1.3.0.tar.gz|"
Would be great on Linux. But we need to check that they are functional.
In particular a message is printed when the tarball is added to the path.
crancache::crancache_remove("rex")
crancache::install_packages("rex", quiet=TRUE)
#> Adding ‘rex_1.1.2.tgz’ to the cache
E.g.
install.packages
CRANCACHE_DISABLE
) that can be set to yes
"irteQ" %in% crancache::available_packages()
#> [1] TRUE
This causes a revdepcheck of irteQ even though it was removed from CRAN 1 year ago.
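A possible mitigation is to filter the cached package list against what the upstream repository currently offers. The sketch below works on two package matrices with a `Package` column, as returned by `available.packages()`; `drop_archived()` is a hypothetical helper, and whether revdepcheck or crancache should do this filtering is an open question.

```r
# Hypothetical helper (not part of crancache): keep only cached entries that
# still exist in the upstream package database.
drop_archived <- function(cached, upstream) {
  cached[cached[, "Package"] %in% upstream[, "Package"], , drop = FALSE]
}

# Made-up example data: irteQ is cached but no longer upstream
cached <- cbind(Package = c("irteQ", "rex"), Version = c("1.0.0", "1.1.2"))
upstream <- cbind(Package = c("rex", "tibble"), Version = c("1.1.2", "3.2.0"))
current <- drop_archived(cached, upstream)
```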