sd2e / jupyteruser-sd2e

Dockerfile (and build/test support) for extending SD2E Jupyter Notebooks base image

License: Other

jupyteruser-sd2e's Introduction

SD2E Jupyter Notebook Environment

What it Gives You

  • Jupyter Notebook server (v5.2.0)
  • Conda Python 3.4.x and Python 2.7.x environments
    • Custom environments and kernels now supported through Advanced Jupyter.ipynb notebook
  • Pre-installed python packages (highlights)
    • pandas
    • matplotlib 2.1.2
    • numpy, scipy
    • seaborn
    • scikit-learn
    • numba
    • pyemd
  • R-kernel
  • Plotly, igraph, networkx, graphviz for graphs and plots
  • MIT's Open Probabilistic Programming Stack
  • Common Lisp kernel and stack
    • v1.4.6, now installed as a binary
  • Bioconda and Bioconductor
  • BioPython
  • Git

SD2E Specific Features

  • Integration with TACC's Agave API via the sd2e-cli and AgavePy library
  • The sd2nb Jupyter Notebook sharing service
  • The sd2e-jupyter application for launching HPC & GPU-powered notebooks

Details

This repository builds the Docker image supporting the SD2E Jupyter Notebooks environment. Until the platform team releases the upcoming feature to select between supported environments at launch time, major dependencies and configuration options must be set in this base image.

Using this repository together with GitHub's issue management, code review, and other collaborative features, we are implementing a cooperative process for SD2 collaborators to improve and extend the runtime environment. When we transition to supporting multiple base images, we'll extend this process to support that feature.

Guidelines and policies

The process for community contributions is outlined in CONTRIBUTING. Critical guidelines for the maintainers of this repo and the SD2E Jupyter environment are in MAINTAINERS.

jupyteruser-sd2e's People

Contributors

eriksf, mdehavensift, mwvaughn, zyndagj

jupyteruser-sd2e's Issues

Limit HPC jobs

Ensure that a user can only have a single Jupyter instance running at a time.

Expected Behaviour

Agave rejects new HPC notebook jobs while one is running

Current Behaviour

Users can spawn multiple notebook servers, resulting in confusion

Possible Solution

Modify agave app description to limit the maximum number of jobs to 1
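
The intended rule can be sketched independently of how Agave enforces it. The snippet below is only an illustration: the job records, the field names (`owner`, `app_id`, `status`) and the set of active states are hypothetical and are not Agave's schema.

```python
# Hypothetical job records; field names and states are illustrative only.
ACTIVE_STATES = {"PENDING", "QUEUED", "RUNNING"}


def can_submit(jobs, owner, app_id, limit=1):
    """Return True if `owner` has fewer than `limit` active jobs for `app_id`."""
    active = [
        job for job in jobs
        if job["owner"] == owner
        and job["app_id"] == app_id
        and job["status"] in ACTIVE_STATES
    ]
    return len(active) < limit


jobs = [
    {"owner": "alice", "app_id": "sd2e-jupyter", "status": "RUNNING"},
    {"owner": "bob", "app_id": "sd2e-jupyter", "status": "FINISHED"},
]
print(can_submit(jobs, "alice", "sd2e-jupyter"))  # False: a notebook is already running
print(can_submit(jobs, "bob", "sd2e-jupyter"))    # True: the earlier job has finished
```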

Quick look dependencies request

Need a few dependencies installed when you all can.

--> plotly, igraph, and networkx.

I need these to be able to run my quick look notebooks in the Jupyter hub.

Note that igraph in python also has some dependencies in C that might need to be installed. I did it so long ago I forget...

.agave directory is owned by root

.agave directory is owned by root, so the "current" credentials cannot be modified, and tenants cannot be switched.

Expected Behavior

auth-tokens-refresh -S

Current Behavior

gzynda@5bbcd1c3f18d:~$ ls -lha
total 76K
drwxrwxrwx  13 jupyter 65536     319 Oct 23 19:36 .
drwxr-xr-x   3 root    root       21 Oct 11 20:06 ..
drwxr-xr-x   2 root    root       21 Oct 23 19:36 .agave
-rw-rw-rw-   1 jupyter 65536     220 Aug 31  2015 .bash_logout
-rw-rw-rw-   1 jupyter 65536    4.1K Oct 11 20:06 .bashrc
drwxrwxrwx   4 jupyter 65536      43 Oct 12 18:28 .cache
drwxrwxrwx   4 jupyter 65536      43 Oct 12 18:34 .config
drwxrwxrwx   5 jupyter 65536     209 Oct 12 18:32 examples
drwx------   3  823749 G-819382   19 Oct 23 19:36 .local
-rw-r--r--   1  800730 G-816877   48 Oct 23 16:09 notebook.log
drwxrwxrwx   3 jupyter 65536      22 Oct 12 18:34 .npm
-rw-rw-rw-   1 jupyter 65536     655 May 16  2017 .profile
drwxrwxrwx   6 jupyter 65536     128 Oct 12 18:28 .quicklisp
-rw-rw-rw-   1 jupyter 65536      96 Sep 10 17:02 .sbclrc
drwxrwsr-x+ 19  845002 G-819382  28K Oct 17 20:16 sd2e-community
drwxr-xr-x   9  845002 root     4.0K Sep 24 20:21 sd2e-partners
drwxr-sr-x   3  845002 G-819382 4.0K Apr  2  2018 sd2e-projects
-rw-rw-rw-   1 jupyter 65536    8.0K Sep 28 20:29 SD2E_README.ipynb
drwxr-xr-x  38  823749   815499 4.0K Oct 23 16:23 tacc-work
-rw-rw-rw-   1 jupyter 65536     182 Oct 11 20:06 .wget-hsts
gzynda@5bbcd1c3f18d:~$ cd .agave/
gzynda@5bbcd1c3f18d:~/.agave$ ls
current
gzynda@5bbcd1c3f18d:~/.agave$ ls -lha
total 4.0K
drwxr-xr-x  2 root    root      21 Oct 23 19:36 .
drwxrwxrwx 13 jupyter 65536    319 Oct 23 19:36 ..
-rwxrwxrwx  1  800730 G-816877 358 Oct 23 19:36 current
gzynda@5bbcd1c3f18d:~/.agave$ touch file
touch: cannot touch 'file': Permission denied

Possible Solution

This needs to be fixed in the spawner

Steps to Reproduce (for bugs)

  1. Launch a notebook
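
A quick way to confirm the symptom from inside a notebook is to compare a directory's owner with the current user and ask whether new files can be created in it. This is a minimal diagnostic sketch using only the standard library; it is shown on a scratch directory, and in the notebook the path would be `os.path.expanduser("~/.agave")`.

```python
import os
import tempfile


def describe_dir(path):
    """Report who owns `path` and whether the current user can create files in it."""
    info = os.stat(path)
    return {
        "path": path,
        "owner_uid": info.st_uid,
        "current_uid": os.getuid(),
        "mode": oct(info.st_mode & 0o777),
        # Creating a file needs both write and search (execute) permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }


# Demonstrated on a scratch directory that the current user owns.
with tempfile.TemporaryDirectory() as scratch:
    report = describe_dir(scratch)
    print(report)
```

For the listing above, a non-root user would see `writable` as False for `.agave`, since the directory is owned by root with mode `drwxr-xr-x`.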

Requesting git, cmake, graphviz support

Expected Behaviour

  • git clone should download repos.
  • cmake would be nice to have available in order to install projects within the notebook
  • import graphviz would be nice to have in Python (and also command line tools such as dot)

Current Behaviour

git has problems:

sharker@1b9ef543efea:~/test$ git clone https://github.com/shaunharker/DSGRN.git
Cloning into 'DSGRN'...
remote: Counting objects: 5491, done.
remote: Compressing objects: 100% (158/158), done.
remote: Total 5491 (delta 52), reused 181 (delta 30), pack-reused 5262
Receiving objects: 100% (5491/5491), 8.25 MiB | 0 bytes/s, done.
Resolving deltas: 100% (3508/3508), done.
Checking connectivity... done.
fatal: unable to look up current user in the passwd file: no such user
sharker@1b9ef543efea:~/test$

cmake and graphviz (both system and python package) aren't installed.

Possible Solution

  1. git -- add the user $JPY_USER in the Dockerfile recipe, in whichever way makes sense, so that they show up in /etc/passwd
  2. cmake -- RUN apt-get install cmake in dockerfile
  3. graphviz -- RUN apt-get install graphviz in dockerfile
  4. python graphviz module -- RUN pip install graphviz in dockerfile

Steps to Reproduce (for bugs)

  1. typing git clone https://github.com/shaunharker/DSGRN.git in bash
  2. typing cmake in bash
  3. typing dot in bash
  4. typing import graphviz in python interpreter
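
The four manual checks above can be bundled into one cell. The sketch below uses only the Python standard library; the git error is approximated by asking whether the current uid has an entry in the passwd database, which is what the `fatal: unable to look up current user in the passwd file` message points to. The function names are made up for this example.

```python
import importlib.util
import os
import pwd
import shutil


def has_passwd_entry(uid=None):
    """True if the uid (default: current user) is listed in the passwd database."""
    uid = os.getuid() if uid is None else uid
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def check_environment(commands=("git", "cmake", "dot"), modules=("graphviz",)):
    """Map each requested tool or module to whether it can be found."""
    report = {"passwd entry": has_passwd_entry()}
    for name in commands:
        report["command " + name] = shutil.which(name) is not None
    for name in modules:
        report["module " + name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'MISSING'}")
```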

Context

These issues make it difficult to install the DSGRN code (https://github.com/shaunharker/DSGRN), and graph-drawing functionality is missing from Jupyter notebooks.

Your Environment

Accessing https://jupyter.sd2e.org via the browser and using Jupyter notebooks and terminal.

Support pyemd

Support pyemd in base jupyter image

Expected Behaviour

import pyemd

Current Behaviour

import pyemd -> ERROR

We can do a local install, but it does not persist between server restarts.

Possible Solution

conda install pyemd
conda install -n python2 pyemd

Context

Implementation of Earth movers distance to compare histograms
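
For readers unfamiliar with the metric, here is a small pure-Python sketch of the earth mover's distance in the simplest case: two one-dimensional histograms with the same bins, unit spacing between adjacent bins, and equal total mass. It is not pyemd's API or implementation (pyemd handles a general distance matrix); `emd_1d` is a name invented for this example.

```python
from itertools import accumulate


def emd_1d(first, second):
    """Earth mover's distance between two 1-D histograms with unit bin spacing.

    Both histograms must have the same number of bins and the same total mass.
    """
    if len(first) != len(second):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(first) - sum(second)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # The running surplus after each bin is the mass that must cross into the next bin.
    surplus = accumulate(a - b for a, b in zip(first, second))
    return float(sum(abs(s) for s in surplus))


print(emd_1d([1, 0, 0], [0, 0, 1]))      # 2.0: one unit of mass moves two bins
print(emd_1d([0.5, 0.5], [0.5, 0.5]))    # 0.0: identical histograms
```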

Conda envs don't apply to commands shelling out in jupyter python notebook

I can create a custom conda environment, and can see it in the dropdown under 'new'. When I create a new jupyter python notebook with the custom environment, I can import the applicable modules, but when I shell out a command, it is running in the base conda environment instead of the one I selected.

Expected Behaviour

When I click new->{custom conda environment}, it pulls up a new IPython notebook. If I type:
!conda env list
it should show that I am in my custom environment and allow me to run commands using programs I have installed in that environment.

Current Behaviour

!conda env list

# conda environments:
#
sims                     /home/jupyter/tacc-work/jupyter_packages/envs/sims
base                  *  /opt/conda
python2                  /opt/conda/envs/python2

In addition, I can't switch to the custom environment; the attempt fails with /bin/sh: 1: source: not found

Steps to Reproduce (for bugs)

https://jupyter.sd2e.org/user/kenclem1/notebooks/tacc-work/maverick/SafeGenes/SimulationsDemo/envTest.ipynb

  1. Create custom conda environment using details at: /examples/Advanced Jupyter.ipynb on jupyter.sd2e.org. Install for example the program pindel using the command conda install -c bioconda pindel
  2. Create a new notebook using that conda environment using the new->{custom environment}
  3. Type !conda env list
  4. Observe that the conda environment has not been loaded.
  5. Attempt to run pindel using the command pindel and note again that the conda environment has not been loaded.

Context

I need to create a conda environment that includes python packages as well as software packages called within my ipython notebook and the python scripts.
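
One possible workaround, sketched under the assumption that the custom environment's programs live next to its Python interpreter (as they do in a conda environment on Linux): put the kernel interpreter's directory first on PATH, so that commands started from the notebook resolve there. It does not perform a full conda activation, so other activation side effects are not reproduced. The `source: not found` message is likely because `/bin/sh` is a POSIX shell such as dash, where `source` (a bash builtin) does not exist.

```python
import os
import sys


def prepend_kernel_bin_to_path(environ=None, executable=None):
    """Put the directory holding the kernel's Python first on PATH.

    Returns the directory that was placed first.
    """
    environ = os.environ if environ is None else environ
    executable = sys.executable if executable is None else executable
    bin_dir = os.path.dirname(os.path.abspath(executable))
    # Drop empty entries and any existing copy of bin_dir to avoid duplicates.
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p and p != bin_dir]
    environ["PATH"] = os.pathsep.join([bin_dir] + entries)
    return bin_dir


print(prepend_kernel_bin_to_path())
```

After running this in the first cell, a later `!pindel` should be looked up in the selected environment's directory before the base environment.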

CLI is outdated

The cli shipping with this notebook runtime is out of date with the SD2E-CLI being used by the project. I propose to update it.

Expected Behaviour

  1. Support the sd2e command extensions
  2. Should support --rich text formatting
  3. Support automatic credential refresh
  4. Support the agave:// canonical URLs in file operations
  5. Faster parsing by using jq by default instead of bash for JSON parsing
  6. Support for swapping tenants and identities
  7. Support for filing support tickets from the CLI
  8. Much much more
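
As an illustration of item 4, an agave:// URL can be split into a system identifier and a path with the standard library. This sketch assumes the form agave://<system-id>/<path>; the system ID and path in the example are made up, and `parse_agave_uri` is not part of the sd2e-cli.

```python
from urllib.parse import urlparse


def parse_agave_uri(uri):
    """Split an agave:// URI into (system_id, path)."""
    parts = urlparse(uri)
    if parts.scheme != "agave" or not parts.netloc:
        raise ValueError(f"not an agave:// URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


# The system ID and path below are invented for illustration.
print(parse_agave_uri("agave://example-storage-system/projects/data.csv"))
# ('example-storage-system', 'projects/data.csv')
```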

Current Behaviour

  1. It's missing all the above plus support for some API features added since the deep base image was last updated

Possible Solution

  1. Add a RUN command to the custom Dockerfile that executes the sd2e-cli installer

Kernel Stops for Bash Jupyter Notebook

Expected Behaviour

When a new Bash Notebook is created, the kernel should continue to run until manually stopped.

Current Behaviour

When a new Bash Notebook is created, the kernel status begins as "Kernel starting" but then immediately displays "No kernel". When the kernel dies, the following prompt appears:
"Dead kernel
The kernel has died, and the automatic restart has failed. It is possible the kernel cannot be restarted. If you are not able to restart the kernel, you will still be able to save the notebook, but running code will no longer work until the notebook is reopened."
Restarting the kernel starts this cycle all over again.

Steps to Reproduce (for bugs)

  1. Start a new Bash Notebook
  2. Wait about 4 seconds
  3. Upon "Dead Kernel" prompt, select "Try Restarting Now"
  4. GOTO 2

Context

Bash is an extremely important utility. I'd like to have a command-line interface into sd2e.

Your Environment

The version of the notebook server is 5.0.0 and is running on:
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]

Add Keras to SD2E Jupyter Notebooks base image

Expected Behaviour

Users should be able to import Keras with the "import keras" python command.

Current Behaviour

Error message "ImportError: No module named keras" is outputted when trying to import keras.

Possible Solution

pip install keras

Steps to Reproduce (for bugs)

  1. import keras

Context

I'm trying to create deep learning models.

Your Environment

I don't think this is necessary for this feature request.

Julia Kernel for Probcomp

While it is not a required dependency yet, the probcomp group has some notebooks that depend on the Julia kernel.

Expected Behaviour

The following notebooks should work

  • examples/probcomp/goal-inference-part1.ipynb
  • examples/probcomp/goal-inference-part2.ipynb
  • examples/probcomp/goal-inference-part3.ipynb

Current Behaviour

The notebooks cannot find the Julia kernel

Possible Solution

https://github.com/jupyter/docker-stacks/blob/db3ee82ad08a83e3019d7aaa2d8d6e0090e0f646/datascience-notebook/Dockerfile#L18

Steps to Reproduce (for bugs)

Launch problem notebooks

FlowCytometryTools import broken

These two blocks of code used to work, but seem to have been broken by the upgrade:

%%bash
source activate python2
pip install --user flowcytometrytools

from FlowCytometryTools import *

import os
from glob import glob

import re

import numpy as np
import pandas as pd
import seaborn as sns

from itertools import compress

%matplotlib inline
import matplotlib.pyplot as plt
from pylab import *

The first runs successfully:

Collecting flowcytometrytools
  Downloading FlowCytometryTools-0.5.0.tar.gz (11.9MB)
Requirement already satisfied: setuptools in /opt/conda/envs/python2/lib/python2.7/site-packages (from flowcytometrytools)
Requirement already satisfied: decorator in /opt/conda/envs/python2/lib/python2.7/site-packages (from flowcytometrytools)
Collecting fcsparser>=0.1.1 (from flowcytometrytools)
  Downloading fcsparser-0.2.0.tar.gz (4.7MB)
Requirement already satisfied: six in /opt/conda/envs/python2/lib/python2.7/site-packages (from fcsparser>=0.1.1->flowcytometrytools)
Requirement already satisfied: numpy in /opt/conda/envs/python2/lib/python2.7/site-packages (from fcsparser>=0.1.1->flowcytometrytools)
Requirement already satisfied: pandas in /opt/conda/envs/python2/lib/python2.7/site-packages (from fcsparser>=0.1.1->flowcytometrytools)
Requirement already satisfied: python-dateutil in /opt/conda/envs/python2/lib/python2.7/site-packages (from pandas->fcsparser>=0.1.1->flowcytometrytools)
Requirement already satisfied: pytz>=2011k in /opt/conda/envs/python2/lib/python2.7/site-packages (from pandas->fcsparser>=0.1.1->flowcytometrytools)
Building wheels for collected packages: flowcytometrytools, fcsparser
  Running setup.py bdist_wheel for flowcytometrytools: started
  Running setup.py bdist_wheel for flowcytometrytools: finished with status 'done'
  Stored in directory: /home/jupyter/.cache/pip/wheels/55/55/18/0fd943d5bba304d299de89f81bd86c260d21fd43fca670890c
  Running setup.py bdist_wheel for fcsparser: started
  Running setup.py bdist_wheel for fcsparser: finished with status 'done'
  Stored in directory: /home/jupyter/.cache/pip/wheels/01/60/78/b3cecdc562a0337e60406e470b21265a0b3739b0a41684e464
Successfully built flowcytometrytools fcsparser
Installing collected packages: fcsparser, flowcytometrytools
Successfully installed fcsparser-0.2.0 flowcytometrytools-0.5.0

But the second yields this:

ImportError                               Traceback (most recent call last)
<ipython-input-3-f8d41b630cfe> in <module>()
----> 1 from FlowCytometryTools import *
      2 
      3 import os
      4 from glob import glob
      5 

ImportError: No module named FlowCytometryTools
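
The issue does not establish the cause. One thing worth checking is whether the interpreter running the notebook kernel is the same one that pip installed into, and whether that kernel looks in the per-user site directory at all. The following diagnostic sketch uses only the standard library and is written to run under both Python 2.7 and Python 3.

```python
import site
import sys


def user_site_report():
    """Collect the facts that decide whether `pip install --user` packages are importable."""
    user_site = site.getusersitepackages()
    return {
        "interpreter": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "user_site": user_site,
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "user_site_on_sys_path": user_site in sys.path,
    }


for key, value in user_site_report().items():
    print("%s = %s" % (key, value))
```

If `user_site` does not match the Python 2.7 location that pip wrote to, or `user_site_on_sys_path` is False, the import failure would be explained by the kernel not seeing the `--user` install.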

LOCAL_ENVS does not exist

I'm trying to follow along the instructions for loading custom conda envs into jupyter at /examples/Advanced Jupyter.ipynb on jupyter.sd2e.org.

However, the LOCAL_ENVS variable doesn't exist and is not defined anywhere in the page.

Expected Behaviour

$ echo $LOCAL_ENVS
/blablabla
$

Current Behaviour

$ echo $LOCAL_ENVS

$

Context

This has the effect that when I try to install conda environments, I get a

Check that you have sufficient permissions.

error because I can't write to /.

Your Environment

I'm logging in to maverick if that makes a difference.
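
When the variable is unset, a shell expands a path such as `$LOCAL_ENVS/myenv` to `/myenv`, which is consistent with the attempt to write to `/`. A defensive sketch is to fall back to an explicit directory when the variable is missing or empty. The fallback location below is hypothetical; any directory you can write to would do.

```python
import os


def resolve_envs_dir(default, environ=None):
    """Return $LOCAL_ENVS if it is set and non-empty, otherwise `default`."""
    environ = os.environ if environ is None else environ
    return environ.get("LOCAL_ENVS") or default


# Hypothetical fallback location; substitute any directory you can write to.
fallback = os.path.join(os.path.expanduser("~"), "conda-envs")
envs_dir = resolve_envs_dir(fallback)
status = "(writable)" if os.access(envs_dir, os.W_OK) else "(missing or not writable)"
print(envs_dir, status)
```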

Incorporate Open Probabilistic Programming Stack

Hello,

We are interested in incorporating the Open Probabilistic Programming Stack into the base Jupyter image in order to use BayesDB.

Expected Behavior

Current Behavior

  • Notebooks are not present
  • Necessary dependencies are not present

Possible Solution

Context

This software is necessary for our collaborators from the MIT Probabilistic Computing Project lab.

Enable https on HPC image

It has been requested that we switch to https for communication to the HPC instances for security in public locations.

Expected Behaviour

url should begin with https

Current Behaviour

url begins with http

Possible Solution

forward connection through an internal proxy with https certs

Common Lisp Kernel

We would like to add the Common Lisp kernel to the Jupyter notebooks.

Expected Behaviour

Common Lisp should be one of the available kernel options when creating a new notebook when using the SD2 Jupyter notebook Docker container.

Current Behaviour

Currently, there is no support for Common Lisp in the Jupyter notebook Docker image.

Possible Solution

Install the Common Lisp kernel (https://github.com/fredokun/cl-jupyter).

Context

SIFT's code is written in Common Lisp. We would like to be able to easily share examples with other people on the project in a way that is interactive and familiar to most people. We think that Jupyter notebooks would be a good solution because they are interactive and, while many people on the project may not be familiar with Common Lisp, most will be familiar with Jupyter notebooks and will be using them to some extent for other parts of the project. Integrating the Common Lisp kernel into the existing SD2 Jupyter notebook image will allow us to easily share examples in a way that others will be able to access and run conveniently.

Add imagemagick to sd2e image

It was requested that ImageMagick be made available on the HPC image

Expected Behaviour

After installation

convert -h

should work on HPC and native jupyter.

Current Behaviour

ImageMagick is not installed

Possible Solution

apt-get install imagemagick
