Jupyter Notebooks
Objectives
Know what a Jupyter Notebook is
Create a new notebook and save it
Open existing notebooks from the web
Be able to create text/markdown cells, code cells, images, and equations
Know when to use a Jupyter Notebook for a Python project and when perhaps not to
We will build up this notebook (spoiler alert!)
[this lesson is adapted from https://coderefinery.github.io/jupyter/motivation/]
Motivation for Jupyter Notebooks
Code, text, equations, figures, plots, etc. are interleaved, creating a computational narrative.
The name “Jupyter” derives from Julia+Python+R, but today Jupyter kernels exist for dozens of programming languages.
Our first notebook
Exercise: Create a notebook (15 min)
Open a new notebook (if you are unsure how, have a look at the software install instructions)
Rename the notebook
Create a markdown cell with a section title, a short text, an image, and an equation
# Title of my notebook

Some text.

![Photo of Galilei's manuscript](https://upload.wikimedia.org/wikipedia/commons/b/b3/Galileo_Galilei_%281564_-_1642%29_-_Serenissimo_Principe_-_manuscript_with_observations_of_Jupiter_and_four_of_its_moons%2C_1610.png)

$E = mc^2$
Most important shortcut: Shift + Enter, to run the current cell and move to the next one (a new cell is created below if you are at the end of the notebook).
Create a code cell where you define the arithmetic_mean function:
def arithmetic_mean(sequence):
    s = 0.0
    for element in sequence:
        s += element
    n = len(sequence)
    return s / n
In a different cell, call the function:
arithmetic_mean([1, 2, 3, 4, 5])
In a new cell, let us try to plot a layered histogram:
# this example is from https://altair-viz.github.io/gallery/layered_histogram.html
import pandas as pd
import altair as alt
import numpy as np
np.random.seed(42)

# Generating Data
source = pd.DataFrame({
    'Trial A': np.random.normal(0, 0.8, 1000),
    'Trial B': np.random.normal(-2, 1, 1000),
    'Trial C': np.random.normal(3, 2, 1000)
})

alt.Chart(source).transform_fold(
    ['Trial A', 'Trial B', 'Trial C'],
    as_=['Experiment', 'Measurement']
).mark_bar(
    opacity=0.3,
    binSpacing=0
).encode(
    alt.X('Measurement:Q').bin(maxbins=100),
    alt.Y('count()').stack(None),
    alt.Color('Experiment:N')
)
Run all cells.
Save the notebook.
Observe that a “#” character has a different meaning in a code cell (code comment) than in a markdown cell (heading).
Your notebook should look like this one.
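If you want to convince yourself that the arithmetic_mean function from the exercise behaves as expected, one option is to compare it against the standard library. The following is only a small sketch: the comparison with statistics.mean and the empty-list check are additions for illustration and are not part of the original exercise.

```python
import statistics


def arithmetic_mean(sequence):
    s = 0.0
    for element in sequence:
        s += element
    n = len(sequence)
    return s / n


numbers = [1, 2, 3, 4, 5]
result = arithmetic_mean(numbers)
print(result)

# the hand-written function should agree with the standard library
assert result == statistics.mean(numbers)

# an empty sequence has no mean: len() is 0, so the division fails
try:
    arithmetic_mean([])
except ZeroDivisionError:
    print("empty sequence: the mean is undefined")
```

Running such a check in its own cell keeps the function definition and its verification visibly separate in the notebook.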
Use cases for notebooks
Really good for step-by-step recipes (e.g. read data, filter data, do some statistics, plot the results)
Experimenting with new ideas, testing new libraries/databases
As an interactive development environment for code, data analysis, and visualization
Keeping track of interactive sessions, like a digital lab notebook
As supporting information for published articles
Situations where notebooks are less of a good fit:
The code takes a long time to run
The code is so long and complex that it needs tests
When I need a command-line interface
When I want to process many similar files and each takes a few minutes
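For the last two situations, moving the code from the notebook into a plain Python script is often a better fit. The sketch below shows one possible shape for such a script, built on argparse from the standard library. The helper mean_of_file, the file names, and the file format (numbers separated by whitespace) are assumptions made for this illustration only.

```python
import argparse
import tempfile
from pathlib import Path


def arithmetic_mean(sequence):
    s = 0.0
    for element in sequence:
        s += element
    n = len(sequence)
    return s / n


def mean_of_file(path):
    # assumed file format: numbers separated by whitespace or newlines
    values = [float(token) for token in Path(path).read_text().split()]
    return arithmetic_mean(values)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Print the mean of each input file.")
    parser.add_argument("files", nargs="+", help="text files containing numbers")
    args = parser.parse_args(argv)
    for path in args.files:
        print(f"{path}: {mean_of_file(path)}")


# Demonstration with two temporary files; in a real script you would instead
# end the file with:  if __name__ == "__main__": main()
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.txt"
    second = Path(tmp) / "b.txt"
    first.write_text("1\n2\n3\n")
    second.write_text("10\n20\n")
    main([str(first), str(second)])
```

Saved as a script, this could be called once with many file names from the command line, which is awkward to do from inside a notebook.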
Good practices
Run all cells or even Restart Kernel and Run All Cells before sharing/saving to verify that the results you see on your computer were not due to cells being run out of order.
This can be demonstrated with the following example:
numbers = [1, 2, 3, 4, 5]
arithmetic_mean(numbers)
We can first split this code into two cells and then re-define numbers
further down in the notebook. If we run the cells out of order, the result will
be different.
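The effect can also be simulated outside a notebook. In the sketch below, each string stands in for one notebook cell and all cells share a single namespace, much like cells sharing one kernel. The cells dictionary, the run helper, and the redefined values [10, 20, 30] are hypothetical and only serve the illustration.

```python
def arithmetic_mean(sequence):
    s = 0.0
    for element in sequence:
        s += element
    n = len(sequence)
    return s / n


# each entry plays the role of one notebook cell
cells = {
    1: "numbers = [1, 2, 3, 4, 5]",
    2: "result = arithmetic_mean(numbers)",
    3: "numbers = [10, 20, 30]",  # re-definition further down in the notebook
}


def run(order):
    # a fresh namespace corresponds to restarting the kernel
    namespace = {"arithmetic_mean": arithmetic_mean}
    for index in order:
        exec(cells[index], namespace)
    return namespace["result"]


top_to_bottom = run([1, 2, 3])
out_of_order = run([1, 3, 2])

print("top to bottom:", top_to_bottom)
print("out of order: ", out_of_order)
```

The same three cells give two different results depending on the execution order, which is why Restart Kernel and Run All Cells is a useful check before sharing.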