TODO

Look for "todo" in this document to find open issues.

Remaining todos: [1]

Cyberhubs

Cyberhubs is an implementation of a JupyterHub server with application-specific, customizable application hubs for multiple users who sign in with GitHub authentication. Cyberhubs consists of two parts:

  • a multiuser launcher that orchestrates administration of multiple users, and
  • a family of application docker hubs (singleuser), each of which presents a data analytics capability targeting a particular user group or application use case. These hubs can be combined.

Cyberhubs is based on Docker. This repo contains documentation and the necessary Dockerfiles and configuration files to either

  1. pull pre-built docker images available from Cyberhubs on Docker Hub (this is the recommended option), or
  2. build the corehub multiuser and singleuser (hubs) docker images from scratch,

and then launch the cyberhub to create a custom virtual research environment.

In both cases you can launch either the corehub singleuser application (referred to in short as hubs) included in this repository, or any of the singleuser applications in the associated astrohubs repository. New singleuser application hubs can easily be created, and existing ones modified.

Repository summary

  • cyberhubs (this repository) provides the corehub multiuser launcher and the basic singleuser application corehub, while
  • astrohubs provides a family of astronomy-oriented singleuser hub applications.

All astrohubs and corehub multiuser and singleuser are available as pre-built docker images from Cyberhubs on Docker Hub. All astrohubs can be launched with the corehub multiuser docker image provided in this repository.

About this documentation

This documentation describes

  1. The configuration of a Linux server in order to host the core cyberhub.
  2. The (optional) building of the Docker images of the multiuser and singleuser components of the corehub.
  3. The configuration and deployment of the docker images into a running service.

In most cases users wishing to deploy a cyberhub should not start by building the Docker images themselves, but should instead launch the service from the pre-built images available from the Cyberhubs Docker repository. The deployment preparation and procedure are almost the same in both cases; the only difference is that the docker build step is omitted when launching from pre-built docker images. For that reason, both options are covered by this documentation and this repo, and the differences in deployment are clearly marked (see multiuser/README).

Configuring the host server

This section describes how the server on which the cyberhub is installed has to be configured.

In our example we will use a CentOS server running as a virtual machine in the Compute Canada WestCloud on Arbutus located at the University of Victoria and running OpenStack.

Launching the VM

  • CentOS 7 machine with four cores, 15GB mem and 180GB attached hard drive
  • Assign key pair in Access and Security
  • Assign floating IP so that we can access from outside
  • Associate IP address, and try to connect to IP

Prepare Unix OS

  • It is recommended to operate this service as a dedicated user, such as the default centos user, or a dedicated user docker.
  • Start with CentOS 7 image
  • Update all packages: sudo yum update
  • Install the following packages: sudo yum install git wget sshfs
  • Install epel-repository packages: sudo yum install epel-release
  • Install docker-ce (community edition)
    • sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    • sudo yum install docker-ce-17.09.0.ce
  • Choose a stable version; currently (Nov. 2017 - present) we use docker-ce-17.09.0.ce-1.el7.

Get cyberhubs repo and configure disk volumes and mount points

Check out the cyberlaboratories/cyberhubs repository:

git clone https://github.com/cyberlaboratories/cyberhubs

Before starting the docker service, prepare some disk mount points and data locations. It may not be possible to keep the docker images in their default location /var/lib/docker, because the system directory may not have enough space. Instead, prepare or arrange for an additional drive at mount point /mnt. Then move /var/lib/docker to /mnt/var/lib and set a symbolic link (see below).

  • Given the layered file structure of docker images, the host can easily run out of inodes. It is recommended to be proactive and make an ext4 file system with a large number of inodes to host the system docker directory: mkfs -t ext4 -N 2147483648 /dev/XdY.
  • This is also where the users' containers will live and where some shared user space will be available.
  • If you run out of space here you are in trouble, as your docker installation may become unresponsive and difficult to resurrect.
  • 100GB is the minimum for a small installation; a TB is much better, ideally on fast storage such as SSD.
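
Since both disk space and inodes can run out, it may help to check them from time to time. The following is a minimal Python sketch, not part of the cyberhubs repository; the function names and the thresholds are hypothetical and chosen only for illustration.

```python
import os


def filesystem_headroom(path):
    """Return size and inode figures for the file system that holds `path`."""
    st = os.statvfs(path)
    return {
        "total_gb": st.f_frsize * st.f_blocks / 1e9,
        "free_gb": st.f_frsize * st.f_bavail / 1e9,   # space available to non-root users
        "total_inodes": st.f_files,
        "free_inodes": st.f_favail,                   # inodes available to non-root users
    }


def is_running_low(info, min_free_gb=10.0, min_free_inode_fraction=0.1):
    """Return True if free space or free inodes fall below the given thresholds.

    The default thresholds are illustrative assumptions, not recommendations
    from the cyberhubs documentation.
    """
    low_space = info["free_gb"] < min_free_gb
    # Some file systems report zero total inodes; skip the inode check then.
    if info["total_inodes"] == 0:
        return low_space
    free_fraction = info["free_inodes"] / info["total_inodes"]
    return low_space or free_fraction < min_free_inode_fraction
```

For example, is_running_low(filesystem_headroom("/mnt")) could be run periodically once the /mnt volume exists.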

Set up the /mnt volume correctly:

  • In the scripts directory of the cyberhubs repo review and execute sudo ./mnt_volumses.sh.
  • This script will, among other things, create a symbolic link at /var/lib/docker (where docker expects this system directory to be) that points to /mnt/var/lib/docker.
  • Start the docker service and check that docker runs:
sudo systemctl enable docker.service
sudo systemctl start docker
sudo docker run --rm hello-world

The last command should output Hello from Docker! to signal success.

  • Add the dedicated user (e.g. centos or docker, see above) to the docker group: sudo usermod -aG docker centos. Log out and in again, and check that docker works as that user: docker run --rm hello-world
  • Install pip:
curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
sudo python get-pip.py

This concludes the configuration of the host.

Docker configuration

Start by cloning the cyberhubs repository.

Adding mount points and volumes

  • Mount any external volumes you may want to add to the lab using scripts/mnt_alias.sh.
  • This will create the /mnt/ volumes as seen in the config.DockerSpawner.
  • The /dev/fuse device may not have the right permissions. If so, this should fix it: sudo chmod a+rw /dev/fuse (this may need more testing).
  • Configure environment variables in scripts/jupyterhub-config-script.sh:
    • Define JUPYTER_HUB_AUTH_UR, JUPYTER_HUB_CLNT_ID, JUPYTER_HUB_CLNT_SE (see next section).
    • Define mount points of external storage: add private, shared and immutable data spaces.
    • Define server admins: JUPYTER_HUB_ADMN_NM
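
Before launching the service it can be useful to confirm that the variables named above are actually set in the environment. A minimal Python sketch follows; the helper missing_variables is hypothetical and not part of the repository, and it only checks for presence, not for valid values.

```python
import os

# Variable names as listed in this section.
REQUIRED_VARIABLES = (
    "JUPYTER_HUB_AUTH_UR",
    "JUPYTER_HUB_CLNT_ID",
    "JUPYTER_HUB_CLNT_SE",
    "JUPYTER_HUB_ADMN_NM",
)


def missing_variables(env=None, required=REQUIRED_VARIABLES):
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the current process environment.
    """
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name, "").strip()]
```

Calling missing_variables() in a shell where the configuration script has been sourced should return an empty list.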

OAuth authentication and user configuration

  • Register the jupyterhub application as an OAuth application with GitHub. You need to enter
    • the URL of the server (Homepage URL) - use https!
    • the authorization callback URL: the server URL followed by /hub/oauth_callback
  • configure environment variables in scripts/jupyterhub-config-script.sh:
    • Specify the github admin user names
    • Possibly specify allowed users (white listing, cannot be changed after multiuser is up). todo: there are some reports of white listing not always working as advertised; this needs to be looked into
    • Enter the callback URL, the client_id and the secret.
    • Add the name of the singleuser docker container
  • source /scripts/jupyterhub-config-script.sh
  • Optional whitelist and blacklist usage:
    • The directory multiuser/access will contain the blist and wlist files if needed; see multiuser/access/README.md.
    • If the wlist, blist and the whitelist environment variable in scripts/jupyterhub-config-script.sh are empty, then all the github users are allowed to log in.
    • If the wlist and/or the whitelist environment variable is not empty, then only the whitelisted users will be allowed to log in.
    • The black list wins all the time: any user from the black list is denied access no matter what.
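
The three rules above can be summarised as a small decision function. This is a sketch of the stated logic only, not the authenticator code used by cyberhubs. It assumes that the wlist file and the whitelist environment variable are combined into one set, and that user names are compared exactly; the user names in the usage note are hypothetical.

```python
def is_allowed(user, wlist=(), blist=(), env_whitelist=()):
    """Apply the whitelist and blacklist rules described above to one user name."""
    # The black list always wins.
    if user in set(blist):
        return False
    # Assumption: entries from the wlist file and the environment variable are merged.
    allowed = set(wlist) | set(env_whitelist)
    # With no whitelist at all, every GitHub user may log in.
    if not allowed:
        return True
    return user in allowed
```

For example, is_allowed("alice", wlist=["alice"], blist=["alice"]) returns False, because the black list takes precedence.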

Important Note Regarding Whitelisting and Blacklisting

  • Any time the whitelist or blacklist users have changed, the singleuser page must be reloaded, not by simply refreshing the page, but by clicking on the logo of the particular astrohub singleuser found in the upper left corner of the screen.
  • Another method is to re-visit the native URL of the singleuser host. This is necessary because the request for authentication must be made again. When the page is simply refreshed, the request is not re-submitted, and so the whitelist and blacklist changes will not take effect.

Configure SSL keys/certificates

  • A valid SSL key/certificate must be available to properly connect, see README in corehub/multiuser/SSL

Configure the spawner menu

This multiuser image allows specification and menu-based user selection among several application hubs. The default is to offer corehub and the application hub defined in the variable

export JUPYTER_SGLEUSR_IMG='cyberhubs/corehub:latest' 

in scripts/jupyterhub-config-script.sh. The spawner menu can easily be configured to offer more options. This is specified in multiuser/jupyterhub_config.py; look for lines such as

        <option value="cyberhubs/corehub">corehub</option>

in the function def _options_form(self):. Just add lines for the available application hubs, such as

<option value="cyberhubs/wendihub">Wendi</option>

using application hub docker images that are available locally (list them with docker images).
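
To illustrate how such option lines fit together, here is a minimal Python sketch that builds a selection menu from a list of images. It is not the code in multiuser/jupyterhub_config.py, whose _options_form may be structured differently; the function build_options_form and the field name image are hypothetical.

```python
from html import escape


def build_options_form(images, field_name="image"):
    """Build an HTML <select> element from (image name, menu label) pairs.

    `field_name` is a hypothetical form field name used for illustration.
    """
    options = "\n".join(
        '    <option value="{}">{}</option>'.format(
            escape(value, quote=True), escape(label)
        )
        for value, label in images
    )
    return '<select name="{}">\n{}\n</select>'.format(
        escape(field_name, quote=True), options
    )


# The two application hubs mentioned in this section.
form = build_options_form(
    [("cyberhubs/corehub", "corehub"), ("cyberhubs/wendihub", "Wendi")]
)
```

Generating the option lines from one list keeps the menu and the set of locally available images easier to compare.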

Pulling the docker images and starting the service

  • Pull an application hub (singleuser) image, such as docker pull cyberhubs/corehub, or one of the other application images available from the cyberhubs Docker Hub repository (using docker pull), or build images from the source available from the cyberlaboratories/astrohubs GitHub repository (e.g. if you need to modify or add software packages).
  • In case you want to build and modify the corehub image available in this cyberlaboratories/cyberhubs repository go to corehub/singleuser and build image: make build. In this case the image name will be local/corehub as specified in the makefile.
  • Go to multiuser and consult the README file for options to build the multiuser image. In most cases you would use the prebuilt multiuser docker image. You do not need to pull this image manually, it will by default be pulled during the next step.
  • Launch the service: docker-compose up (add the option -d to the end of the command to suppress the output). This will start your multiuser docker-environment.
  • For more commands, such as bringing down your docker environment, see the corehub/dockerfiles/multiuser/README, or at the bottom here.

Maintenance

Prune unused images

In order to keep the system clean purge unused images from time to time using docker image prune -a.

Monitor and set number of processes

In a default configuration the user that is running the docker service (centos in our case) may hit the limit on the number of processes. Check the number of processes with this command: ps -efL --no-headers |wc -l. To change the maximum number of processes, add nproc entries to /etc/security/limits.conf that specify the new soft and hard limits.
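
As an illustration, entries in /etc/security/limits.conf have the form "domain type item value". The Python sketch below formats the two nproc lines for a given user and reads the limits of the current process. The helper names and the numbers in the usage note are hypothetical; suitable values depend on the installation.

```python
import resource


def nproc_limit_lines(user, soft, hard):
    """Return the soft and hard nproc lines for /etc/security/limits.conf."""
    if soft > hard:
        raise ValueError("the soft limit must not exceed the hard limit")
    return [
        "{} soft nproc {}".format(user, soft),
        "{} hard nproc {}".format(user, hard),
    ]


def current_nproc_limits():
    """Return the (soft, hard) process limits that apply to this process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)
```

For example, nproc_limit_lines("centos", 4096, 8192) yields the lines "centos soft nproc 4096" and "centos hard nproc 8192". New limits generally apply to sessions started after the change.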

Basic docker

Docker Hub repository

If you have created new singleuser hubs that you would like to share in the cyberhubs framework you can add these to the cyberhubs docker hub repository.

  • get account on http://hub.docker.com
  • [get added to cyberhubs organisation]
  • login: docker login --username=username
  • list all images: docker images

Pulling pre-built images from DockerHub

  • To pull corehub, after successfully logging in, you can get the images with:
docker pull cyberhubs/multiuser
docker pull cyberhubs/corehubsingleuser
  • You can tag your images with whatever new name you'd like with docker tag OLD_NAME NEW_NAME. This is useful when building other images from cyberhubs/corehubsingleuser.

Pushing your images to DockerHub

  • tag the singleuser and multiuser images to be uploaded: docker tag image_ID_321 cyberhubs/corehub (for singleuser) and docker tag image_ID_123 cyberhubs/multiuser (get the image ID with docker images)
  • push to repository, for example: docker push cyberhubs/multiuser

Other useful commands

docker rmi $(docker images -q) # removes all docker images
docker-compose down; docker kill $(docker ps -aq); docker rm $(docker ps -aq) # stops and removes all Docker containers
docker images # lists all local Docker images
docker rmi --force  $(docker images -q) # forces removal of all Docker images
docker exec -it CONTAINER_NAME bash # opens a bash shell (similar to SSH) in a running container; docker exec takes a container name or ID (see docker ps), not an image name
docker pull fherwig/corehub:singleuser # pulls a docker image from DockerHub
docker pull -a fherwig/corehub # pulls all images from the repo on DockerHub

Roadmap

The following improvements are planned to be implemented:

  1. automatically renew certificates when needed (certbot renew --dry-run, cronjob)
  2. time-out open access followed by whitelisting
  3. add jupyterlab Gitlab extension

Contributors

fherwig, amd1250x
