Jupyter Notebooks

Objectives

  • Get an idea of the purpose of Jupyter.

  • See some inspirational Jupyter Notebooks.

Instructor note

  • 10 min teaching

  • 0 min exercises

Motivation for Jupyter Notebooks

Jupyter Notebook is a tool for creating and sharing documents that contain live code, equations, visualizations, and narrative text.

A comparison between pure Python code and a Jupyter Notebook highlights how Jupyter Notebook can include much more elaborate narrative than regular code files.

Jupyter Notebooks can contain much more than just regular code. (CC-BY).

Jupyter-like functionality is widely used

Many tools offer Jupyter-like behaviour, combining code cells and markdown cells:

  • VSCode

  • Google Colab

  • GitHub Codespaces


Case examples

Jupyter Notebooks make it feasible to share your code: you can explain your ideas, and anyone can run the code, e.g. in Binder.

Gravitational wave discovery

Discovery of gravitational waves.

As a case example, let us have a look at the analysis published together with the discovery of gravitational waves. This page lists the available analyses and presents several options to browse them.

  • A quick look at short segments of data can be found at https://github.com/losc-tutorial/quickview

  • The notebook can be opened and interactively explored using Binder by clicking the “launch Binder” button.

  • How does the Binder instance know which packages to load? It takes the information from files like requirements.txt (Python packages installed with pip), environment.yml (conda environments), or runtime.txt (R).

Activity inequality study

Stanford Activity Inequality Study.

Researchers in the Stanford Activity Inequality Study measured daily activity from cell phone tracking data for over 700,000 users in different countries across the world.

More examples

For further inspiration, head over to the Gallery of interesting Jupyter Notebooks.


Use cases

  • Really good for linear workflows (e.g. read data, filter data, do some statistics, plot the results)

  • Experimenting with new ideas, testing new libraries/databases

  • As an interactive development environment for code, data analysis, and visualization

  • Interactive work on HPC clusters

  • Sharing and explaining code to colleagues

  • Teaching (programming, experimental/theoretical science)

  • Learning from other notebooks

  • Keeping track of interactive sessions, like a digital lab notebook

  • Supplementary information with published articles

  • Slide presentations using Reveal.js
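
To make the first use case concrete, here is a minimal sketch of a linear workflow (read data, filter data, do some statistics). The inline data set, the column names, and the function names are hypothetical and only serve as illustration. In a notebook each step would typically live in its own cell, with a final cell plotting the results.

```python
import csv
import io
import statistics

# Hypothetical inline data standing in for a file you would normally read from disk.
RAW_CSV = """city,temperature_c
Helsinki,4.5
Oslo,5.0
Stockholm,6.5
Helsinki,-1.0
Oslo,
Stockholm,7.5
"""


def read_rows(text):
    """Step 1: read the data into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows):
    """Step 2: keep only rows that actually contain a measurement."""
    return [row for row in rows if row["temperature_c"].strip()]


def summarize(rows):
    """Step 3: compute a few summary statistics."""
    temps = [float(row["temperature_c"]) for row in rows]
    return {
        "count": len(temps),
        "mean": statistics.mean(temps),
        "max": max(temps),
    }


rows = read_rows(RAW_CSV)
valid = filter_rows(rows)
summary = summarize(valid)
print(summary)
```

Because each step depends only on the result of the previous one, the cells can be run from top to bottom, which is what makes this kind of workflow a good fit for a notebook.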

Pitfalls

  • Programs with non-linear code flow

  • Large codebases (however, it can make sense to use Jupyter as an interface to a large codebase and import the codebase as a module)

  • You cannot easily write a notebook directly in your text editor (but you can do that with R Markdown)

  • Notebooks can be version controlled (nbdime helps with that), but there are still limitations.

  • Notebooks aren’t named by default and tend to acquire a bunch of unrelated stuff. Be careful with organization!

  • See also https://scicomp.aalto.fi/scicomp/jupyter-pitfalls/.

Good practices

  • Rename notebooks from “Untitled.ipynb”.

  • Run all cells before sharing/saving to verify that the results you see on your computer were not due to cells being run out of order (we will try this later).
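
As a rough illustration of the second point, the sketch below checks whether the code cells of a notebook were executed from top to bottom. It relies on the fact that a notebook file is JSON in which each code cell stores an execution_count. The helper names and the two small example notebooks are hypothetical, and this check is only a heuristic: it does not replace restarting the kernel and running all cells.

```python
import json


def load_notebook(path):
    """Read a .ipynb file; notebooks are stored as JSON documents."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cell_source(cell):
    """Return the cell source as one string (it may be stored as a list of lines)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def execution_counts(notebook):
    """Execution counts of all non-empty code cells, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code" and cell_source(cell).strip()
    ]


def ran_in_order(notebook):
    """True if every non-empty code cell was run once, top to bottom, in a fresh kernel."""
    counts = execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))


def make_notebook(counts):
    """Build a hypothetical in-memory notebook with the given execution counts."""
    cells = [{"cell_type": "markdown", "source": "# Analysis"}]
    for count in counts:
        cells.append(
            {"cell_type": "code", "source": ["x = 1\n"], "execution_count": count}
        )
    return {"cells": cells, "nbformat": 4}


clean = make_notebook([1, 2, 3])      # run top to bottom
shuffled = make_notebook([3, 1, 2])   # cells run out of order
partial = make_notebook([1, None])    # last cell never run

for name, nb in [("clean", clean), ("shuffled", shuffled), ("partial", partial)]:
    print(name, ran_in_order(nb))
```

For a real file, the same check would be ran_in_order(load_notebook("analysis.ipynb")), where the file name is a placeholder.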