leveling-up-jupyter's Introduction

Leveling up your Jupyter notebook skills

Most of us work with Jupyter notebooks regularly, but miss some obvious productivity gains. Did you know that the web interface works like a modal editor such as vim? Did you know that you can profile AND debug code in notebooks? How about typesetting formulas or using pre-made style settings for visualizations? Let us go through the tricks of the trade together!

In this tutorial I give you an overview of working with Jupyter notebooks, with a focus on what is available. I also show you some small tricks to improve your productivity. The tutorial closes with a short overview of how Jupyter notebooks are used in practice. In summary, you learn what Jupyter is, how to use it in practice, and what its ever-growing ecosystem has to offer (thanks to the work of the community!).

Overview

Notebooks are an important tool for data science as they allow for:

  • collaboration - as they can be shared as editable text files.

  • presentation - notebooks are visually presented in an attractive web interface, which we will further improve on, and can also be exported into several reporting formats (html, pdf, slides, ...).

  • reproducibility - results can be reproduced by (re-)running the notebook on a different machine. To avoid system-related issues, this is often done in Docker images (see [1.]).

  • flexibility - you can write your analysis in several programming languages, and even mix them, or connect to a cluster via Spark, for instance.

Jupyter project / Setup

If you have Anaconda, Jupyter should already be installed; if not, you can install it via pip install jupyter. The command jupyter --version prints the version of Jupyter you are running. This tutorial was tested with version 4.3.0 and a jupyter notebook --version of 5.0.0.

The config directory of Jupyter is ~/.jupyter by default and can be changed via the environment variable JUPYTER_CONFIG_DIR. If you do not want to touch your installation, you can switch to a separate Anaconda environment:

conda create -n levelup_jupyter python=3.5
source activate levelup_jupyter
pip install jupyter

Note: the environment has to be activated in every shell session.
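The config directory lookup mentioned above can be pictured with a small sketch. This is a simplified illustration, not Jupyter's own code: the helper name resolve_config_dir is hypothetical, and the real lookup in Jupyter handles more cases than shown here. On an installed system, jupyter --config-dir prints the directory actually in use.

```python
import os
from pathlib import Path


def resolve_config_dir(environ=None):
    """Return the config directory, honoring JUPYTER_CONFIG_DIR if it is set.

    Hypothetical helper for illustration; it only mirrors the two cases
    described in the text (environment variable override, else ~/.jupyter).
    """
    environ = os.environ if environ is None else environ
    override = environ.get("JUPYTER_CONFIG_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".jupyter"


# Directory for the current environment
print(resolve_config_dir())

# Directory when the variable points somewhere else (example path)
print(resolve_config_dir({"JUPYTER_CONFIG_DIR": "/tmp/levelup_config"}))
```

Passing the environment as a parameter keeps the sketch easy to try out without changing your shell settings.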

The Jupyter notebook server can be started with the command

jupyter notebook --port 8888 --no-browser

Running Jupyter notebook servers can be listed with jupyter notebook list.

It's a good idea to run pip install --upgrade notebook to get the latest notebook release.

Using Docker

Running docker pull jupyter/datascience-notebook (DO NOT DO THIS ON CONFERENCE WIFI) gives you a Docker image that contains a simple Jupyter installation.

You can run this with the command docker run -p <local_port>:8888 jupyter/datascience-notebook and then connect to localhost:<local_port> with the token provided.

This repository contains an example Dockerfile that shows how to customize this Jupyter image with your own settings. The main advantage is isolation: you can run the notebook on a colleague's PC or on your cluster with the same settings as on your laptop.

The UI

The UI of Jupyter is web-based. On the entry tab you can navigate the working directory to open notebooks, and take a look at running notebooks and clusters.

The notebook view works like a modal editor such as vim. A notebook consists of a sequence of cells of several types. Most often you encounter code cells, which contain runnable code, and markdown cells, which are rendered to HTML on the fly.

You enter edit mode by pressing ENTER and return to command mode by pressing ESC. Colors indicate which mode you are in. The shortcuts for your system are listed under Help > Keyboard Shortcuts.

The most helpful shortcut is CMD + SHIFT + p (CTRL + SHIFT + p on Linux), which opens the command palette.
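To make the idea of a notebook as a sequence of typed cells concrete, here is a minimal sketch that builds one by hand as plain JSON. The field names follow my understanding of the version 4 notebook format; the helper functions are hypothetical, and in practice the nbformat package is the safer way to create or validate notebook files.

```python
import json


def markdown_cell(text):
    """A markdown cell: text that the UI renders on the fly."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """A code cell: runnable source, not yet executed (no outputs)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def make_notebook(cells):
    """Wrap a list of cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


notebook = make_notebook([
    markdown_cell("# Demo\nSome explanation."),
    code_cell("print('hello')"),
])

# A notebook file is just this structure serialized as JSON text.
text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(text)["cells"]]
print(cell_types)
```

Because the file is plain text, it can be shared and diffed like any other source file.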

Command Mode

Edit Mode

Themes

Of course there are also themes. Try pip install jupyterthemes. The program jt switches themes, for example jt -t chesterish. Restart the session if it does not work. Reset with jt -r. There are plenty more options.

Exporting of notebooks

Under File > Download as you find several formats to export the notebook to. For some reason, there are no shortcuts for this. For PDF export, pandoc is required.

To create a reveal.js slide presentation, see RISE (7.).
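Exports can also be scripted outside the web interface with the jupyter nbconvert command. The sketch below only builds the command line and prints it; the helper name nbconvert_command, the file name analysis.ipynb, and the short list of formats are assumptions for illustration, not a complete list of what nbconvert supports.

```python
import shlex

# A few output formats accepted by `jupyter nbconvert --to ...`
# (illustrative subset, not exhaustive).
SUPPORTED_FORMATS = ("html", "pdf", "slides", "markdown", "script")


def nbconvert_command(notebook_path, output_format="html"):
    """Build the argument list for exporting a notebook with nbconvert."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError("unsupported format: " + output_format)
    return ["jupyter", "nbconvert", "--to", output_format, notebook_path]


command = nbconvert_command("analysis.ipynb", "html")
print(" ".join(shlex.quote(part) for part in command))

# To run the export for real, pass the list to subprocess.run(command, check=True).
```

Keeping the command as a list avoids quoting problems when the notebook path contains spaces.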

Improve Visualization quality by pre-made settings

See the snippets.

Profiling & Debugging

See the notebook on this topic. In a nutshell, you can debug with the notebook magics %pdb and %debug: the former toggles automatically entering the debugger whenever an exception is raised, the latter opens a post-mortem debugger on the last stack trace.
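For the profiling side, a minimal sketch using only the standard library is shown below. It is not taken from the accompanying notebook; slow_sum and profile_call are hypothetical names. Inside a notebook, the %prun magic gives a similar report for a single statement, and pdb.post_mortem() is the plain-Python counterpart of %debug.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
    return result, buffer.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```

The report lists call counts and cumulative times per function, which is usually enough to spot the cell that needs attention.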

Other Kernels

There are several other kernels that you can install. For instance, the command

conda install -c r r-essentials

installs the R kernel. Many other kernels are available.

There is a little-known trick to use R and Python in the same notebook (as in transferring state), see trick 21 in source (10.).

Extensions

To install the extensions for the Jupyter notebook, use the command conda install -c conda-forge jupyter_contrib_nbextensions. Here is information regarding the provided extensions.

Once you restart the Jupyter notebook server, you will find an extensions view on the notebook's start page. There you can select individual extensions and read their documentation. It is recommended to turn off the version requirement. Note, however, that some of the pre-5.0 extensions are obsolete.

Some words regarding the workflow in industrial applications

See the slides.

Installing Packages within Jupyter

If you are missing a package and want to install it from within the notebook, the classical approach is to use a code cell with !pip install package_name. However, this can cause problems, as you can see in Source 12.
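One commonly suggested workaround is to call pip through the interpreter that runs the kernel, so the package lands in the same environment the notebook imports from. The sketch below only builds that command; pip_install_command is a hypothetical helper and package_name is a placeholder. In a notebook cell, the equivalent is to import sys and then run !{sys.executable} -m pip install package_name.

```python
import sys


def pip_install_command(package_name):
    """Build a pip command bound to the currently running interpreter.

    sys.executable is the Python that executes this code (in a notebook,
    the kernel), which a bare `pip` on the PATH is not guaranteed to match.
    """
    return [sys.executable, "-m", "pip", "install", package_name]


command = pip_install_command("package_name")
print(command)

# To install for real, pass the list to subprocess.check_call(command).
```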

Helpful links / Sources

  1. docker-stacks
  2. Best Practices for Jupyter Notebooks
  3. Jupytercon
  4. IPython and Jupyter in Depth
  5. Jupyter docker stacks
  6. Slides from Jupyter
  7. Reproducible Data Analysis in Jupyter
  8. Data Science is Software (SciPy 2016 Tutorial)
  9. Jupyter Themes
  10. 28 Jupyter Notebook tips, tricks and shortcuts
  11. More about Profiling
  12. Correctly install Python Packages from Jupyter
