
o2r-project / containerit


Package an R workspace and all dependencies as a Docker container

Home Page: https://o2r.info/containerit/

License: GNU General Public License v3.0

R 92.27% Shell 0.62% Dockerfile 2.55% TeX 4.55%
docker dockerfile r reproducible-research reproducible-science

containerit's People

Contributors

benmarwick, egouldo, esther-lyondelsordo, kyleniemeyer, markedmondson1234, matthiashinz, nuest, omaymas, pat-s, vsoch


containerit's Issues

Validate containerization with test-build

Expands on an idea mentioned in #6 (comment)

After a session (or workspace, or something else) has been containerized, a user should be able to test-build the image in order to verify that

(1) The build is successful
(2) The local session matches the dockerized session
(3) More ideas?

Points 1 and 2 can be achieved analogously to tests/testthat/test_sessioninfo_reproduce.R, by turning the test into a feature (see the sketch below).
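
A minimal sketch of point 2, assuming an already built image with the hypothetical tag containerit-test, a docker client on the PATH, and an image that re-attaches the captured packages on startup:

# Sketch only: compare the attached packages of the local session with those
# reported from inside the container.
expr <- 'cat(names(sessionInfo()$otherPkgs), sep = ";")'
remote_raw <- system2("docker",
  c("run", "--rm", "containerit-test", "Rscript", "-e", shQuote(expr)),
  stdout = TRUE)
remote_pkgs <- unlist(strsplit(remote_raw, ";"))
testthat::expect_setequal(names(sessionInfo()$otherPkgs), remote_pkgs)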

Load a prepared session / workspace with RStudio Server

Extends #37

Simple R images like rocker/r-ver do not come with a GUI; therefore, the use of R is restricted to the console.

With such a configuration (try, for instance, docker run -it --rm rocker/r-ver), users cannot view R plots or any file or data that cannot be printed directly to the console.

It would therefore be beneficial to leverage RStudio images (see rocker/rstudio) and restore sessions directly in an RStudio Server session.

See https://github.com/rocker-org/rocker/wiki/Using-the-RStudio-image
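
A minimal sketch of starting such a container from R; the PASSWORD environment variable is required by newer rocker/rstudio images and may not be needed for older ones:

# Start RStudio Server on port 8787; then open http://localhost:8787 in a browser.
system2("docker", c("run", "-d", "-p", "8787:8787",
                    "-e", "PASSWORD=changeme", "rocker/rstudio"))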

Package a session with versioned system dependencies

As becomes clear in the discussion on geospatial libraries in Rocker, the versions of linked external libraries matter.

Can we support packaging explicit versions of linked libraries?

> extSoftVersion()
                     zlib                     bzlib                        xz 
                  "1.2.8"      "1.0.6, 6-Sept-2010"              "5.1.0alpha" 
                     PCRE                       ICU                       TRE 
        "8.38 2015-11-23"                        "" "TRE 0.8.0 R_fixes (BSD)" 
                    iconv                  readline 
             "glibc 2.23"                     "6.3"

> library(sf)
Linking to GEOS 3.5.0, GDAL 2.1.2, proj.4 4.9.2
> sf::sf_extSoftVersion()
   GEOS    GDAL  proj.4 
"3.5.1" "2.1.2" "4.9.2" 

This information could be accessed by a function of the form <pkgname>_extSoftVersion(), see extSoftVersion() and sf_extSoftVersion() (https://github.com/edzer/sfr/blob/5c3dfea395af81bf352b4007d16c6a7d419883c2/R/init.R#L59).
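
A sketch of collecting these versions so they can be recorded alongside the generated Dockerfile (e.g. as comments or labels):

# Combine base R's linked-library versions with sf's, if sf is installed.
ext_versions <- c(
  extSoftVersion(),
  if (requireNamespace("sf", quietly = TRUE)) sf::sf_extSoftVersion()
)
print(ext_versions)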

Read/parse Dockerfiles

If we can write Dockerfiles, we might as well parse them: uncontainer_it = create a session on a host machine that resembles the one inside a container.

Use cases needed!
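
A very rough parsing sketch (comments, continuation lines, and JSON-form instructions are ignored):

lines <- readLines("Dockerfile")
lines <- lines[nzchar(trimws(lines)) & !grepl("^\\s*#", lines)]
instructions <- toupper(sub("\\s.*$", "", lines))
arguments    <- sub("^\\S+\\s+", "", lines)
split(arguments, instructions)   # e.g. $FROM, $RUN, $CMD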

Package a session with locales

Extends #6 and #33

When reproducing an R session, also match the locales.
As shown in #33 (comment), locales are not reproduced yet.

On Linux, a missing locale first has to be generated and then configured as the default or current locale. That does not seem to be trivial, especially in non-interactive mode.

The R functions Sys.getlocale() and Sys.setlocale() may be helpful.

  • A sessionInfo can be reproduced with German locales (or anything other than the default)
  • The testthat file tests/testthat/test_sessioninfo_reproduce.R runs with this test (uncomment the corresponding lines):
test_that("the locales are the same ", {
  expect_equal(local_sessionInfo$locale, docker_sessionInfo$locale)
})
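
A sketch of deriving the needed Dockerfile instructions from the current session, assuming an Ubuntu-based image where locale-gen and update-locale are available (the locales package may have to be installed first):

loc <- Sys.getlocale("LC_CTYPE")   # e.g. "de_DE.UTF-8"
cat(sprintf("RUN locale-gen %s && update-locale LANG=%s", loc, loc),
    sprintf("ENV LANG=%s LC_ALL=%s", loc, loc),
    sep = "\n")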

Create labels

Following the suggestions in Label Schema, we can add some meta-information to the images (see the sketch after this list):

  • metadata labels (generated when, by whom, ...)
  • arbitrary content (just create labels from a named list)
  • put session info output into a label as plain text
  • util function to read the labels from an image/container
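
A minimal sketch of the second point; make_labels is an illustrative name, not an existing containerit function:

make_labels <- function(labels) {
  stopifnot(length(names(labels)) == length(labels))
  # one LABEL instruction per entry, values quoted for the Dockerfile
  paste0("LABEL ", names(labels), "=", shQuote(unlist(labels), type = "cmd"))
}
make_labels(list(
  "org.label-schema.build-date" = format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
  "maintainer" = "o2r-project"
))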

Package a session (basic)

Create a small session (load a few packages, including one from CRAN that is not among the base packages) and create a Dockerfile that recreates that session as closely as possible.

Based on sessionInfo()$running we have a mapping from the running string to a base image. In our case, all running strings map to rocker (see the sketch below).

Also check devtools::session_info(); it could be useful for determining the installation source.

Open questions:

  • how can we fill the MAINTAINER field?
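
A sketch of the mapping idea; the image choice and version handling here are assumptions, not the package's implementation:

r_version  <- paste(R.version$major, R.version$minor, sep = ".")
base_image <- paste0("rocker/r-ver:", r_version)
# sessionInfo()$running (e.g. "Ubuntu 16.04.1 LTS") could later select other
# base images; for now every running string maps to a rocker image.
cat("FROM", base_image, "\n")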

Container workspace to NixOS

Add CLI wrapper and publish Docker image

We need a CLI (command line interface) wrapper around the library to integrate it into workflows in other programming languages (e.g. as part of a node.js-based web app).

Alternatively, if docopt does not work at all, evaluate the optparse package: https://cran.r-project.org/web/packages/optparse/

Example usage:

container_it.R [-f <path to (markdown, R-script)file>]

container_it.R [-s] # package new R session

Which options do we need to expose? How easy is that with docopt?
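
A sketch of what the docopt-based wrapper could look like; the option names follow the usage examples above and are not final:

#!/usr/bin/env Rscript
library(docopt)
doc <- "Usage:
  container_it.R [-f FILE]
  container_it.R [-s]

Options:
  -f FILE  Package the given R Markdown or R script file.
  -s       Package a new R session."
opts <- docopt(doc)
str(opts)   # inspect the parsed options, then dispatch to the library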

Add a metadata extraction script to higher level containers (session extraction)

When packaging research into higher-level containers, e.g. an ERC, we most probably need some metadata. While the user can be asked for this (see #13), it would be better to extract it automagically from the session.

For this, we would need a feature that appends a script to the "main script file" of the container, which has access to the R session after the analysis is completed.

Some ideas for information that could be extracted here (sketched below):

  • temporal extent of objects (classes sptdf, xts, timeseries, ...)
  • spatial extent of objects (classes sp, sf, raster, ...)
  • user metadata? publication metadata?
    • parse markdown header
  • names of (potential) "input" files and "output" files

This feature is complementary to the file analysis conducted by @7048730 in https://github.com/o2r-project/o2r-meta
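
A sketch of such a post-analysis extraction step; only sf and xts/zoo objects are handled here (sf and zoo assumed installed), everything else is omitted:

extract_metadata <- function(envir = globalenv()) {
  objs <- mget(ls(envir), envir = envir)
  list(
    spatial_extent  = lapply(Filter(function(x) inherits(x, "sf"), objs),
                             sf::st_bbox),
    temporal_extent = lapply(Filter(function(x) inherits(x, c("xts", "zoo")), objs),
                             function(x) range(zoo::index(x)))
  )
}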

Load prepared session on container startup

It could be useful to extend the Dockerfile so that the captured session is fully replicated directly after container start. This would save the user from having to call require()/library() on those packages manually.

The only way to restore an interactive session with the required libraries seems to be defining an Rprofile.site file and setting the R_PROFILE environment variable to its location (using an ENV instruction).

The R_PROFILE file must contain a .First function which attaches the required packages using require() (or library()).

Load namespaces via requireNamespace() and create the instruction CMD ["R"] at the end of the Dockerfile. A sketch of such an Rprofile.site follows.
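
A sketch, assuming the file is copied into the image and referenced via the R_PROFILE environment variable; the package list is an example, the real one would be generated from the captured session:

# Rprofile.site: attach the captured packages when the container's R starts.
.First <- function() {
  pkgs <- c("sf", "fortunes")   # example; generated from the original sessionInfo
  invisible(lapply(pkgs, function(p)
    suppressMessages(require(p, character.only = TRUE))))
  cat("containerit: restored session with", paste(pkgs, collapse = ", "), "\n")
}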

Include R objects in restored session

May extend #37

Users may include R objects from their R workspace in a restored session.
For this, the dockerfile() method has a parameter 'objects' that is not yet implemented.

The objects parameter takes a character vector containing the names of the objects, as returned by ls().

The objects are saved to an RData file that is copied into the image at the location of the R working directory. If the file is simply named ".RData", R by default loads it automatically into the session on startup.

Alternatively, users may load() the file manually from the working directory (see the sketch after the task list).

  • implement objects-parameter save_image
  • restore an arbitrary session with R objects
  • write a test
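
A sketch of the save side of the proposed objects parameter; the object names are examples:

objects <- c("model", "input_data")          # names as returned by ls()
save(list = objects, file = ".RData", envir = globalenv())
# The Dockerfile would then COPY .RData into the image's R working directory,
# so that R loads it automatically on startup.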

Package any script file

  • 1. containerit executes the script locally and reproduces the session that results at the end of the script
  • 2. copy = script (default) copies the supplied script to the image
  • 3. copy = script_dir also copies the script and all files / directories in the same folder
  • 4. copy takes a list of files and directories to be copied to the folder
  • 5. the cmd parameter can be set with Cmd_Rscript("path/to/script"), resulting in
    CMD ["Rscript", "--save", "/path/to/Rscript"] (sketched after the task lists below)
  • 6. Test 1-5 with test scripts

- [ ] 1. Default: execute a script locally and reproduce the session that results at the end of the script
- [ ] 2. copy_script = TRUE also copies the script
- [ ] 3. copy_parent = TRUE also copies the script and all files / directories in the same folder
- [ ] 4. batch_exec = TRUE also copies the script and sets the CMD instruction to
CMD ["Rscript", "--save", "/path/to/Rscript"]

- [ ] 5. copy_files takes a list of files and directories to be copied to the folder
- [ ] 6. Test 1-5 with test scripts
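
A sketch of how the proposed Cmd_Rscript() helper could render the CMD instruction (the helper name comes from this issue and is not an existing function):

Cmd_Rscript <- function(path) {
  sprintf('CMD ["Rscript", "--save", "%s"]', path)
}
Cmd_Rscript("/path/to/script.R")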

Issues (to be discussed later)

  • How to handle: R working directory, relative/absolute paths, build context

Do not try to package containeRit itself

It is correct that an error message currently appears when a session is packaged, because containeRit itself is not published online: "Failed to identify source for package containeRit. Therefore the package cannot be installed in the docker image."

However, it should not be common to "package the packaging lib", so we should add an option add_self = FALSE to the dockerfile() function so that, by default, it does not try to add the containeRit package itself to the image (see the sketch below).
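
A sketch of the proposed default behaviour; add_self is the argument name suggested above and not yet part of dockerfile():

add_self <- FALSE
pkgs <- names(sessionInfo()$otherPkgs)
if (!isTRUE(add_self)) pkgs <- setdiff(pkgs, "containeRit")
# only 'pkgs' would then be turned into install instructions for the image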

Re-using shinyapps-dependencies and obstacles

Issue #33 suggests determining system dependencies based on https://github.com/rstudio/shinyapps-package-dependencies

However, as the previous discussion in #33 shows, system dependencies are not explicit for all packages, for instance rgdal. The dependencies listed in the basic Dockerfile appear to be assumed as given, while the scripts in the 'packages' folder only cover packages that rely on additional dependencies. Hence, in order to rely on shinyapps-package-dependencies, we would need to install all dependencies from the basic Dockerfile (or use the shinyapps image as the base image), which can result in unnecessary overhead.

Moreover, the shell scripts are written for Ubuntu/Linux only and therefore may not apply to all potential base images.

  • re-evaluate shinyapps-package-dependencies for re-use
