Sharing plots and notebooks

Objectives

Know about good practices for notebooks to make them reusable
Have a recipe to share a dynamic and reproducible visualization pipeline

[this lesson is adapted from https://coderefinery.github.io/jupyter/sharing/]

Document dependencies

If you import libraries into your notebook, note down their versions.

In Python, it is customary to do this either in a requirements.txt file (example):

jupyterlab
altair == 5.5.0
vega_datasets
pandas == 2.2.3
numpy == 2.1.2

… or in an environment.yml file (example):

name: data-viz
channels:
  - conda-forge
dependencies:
  - python <= 3.12
  - jupyterlab
  - altair-all = 5.5.0
  - vega_datasets
  - pandas = 2.2.3
  - numpy = 2.1.2

By the way, this is almost the same environment.yml file that we used to install the local software environment in the Software install instructions (the latter did not pin versions).

Place either requirements.txt or environment.yml in the same folder as the notebook(s).

This helps people who try to rerun the notebook in the future, and some tools (e.g. Binder, which we will see later) can read these files to recreate the environment.
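To find out which versions to write down, you can ask the running environment. The following is a minimal sketch using only the standard library; the package list and the helper name `pinned_requirements` are illustrative assumptions, so adjust them to what your notebook actually imports.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages imported by the notebook; adjust this list to your own imports.
PACKAGES = ["altair", "vega_datasets", "pandas", "numpy"]


def pinned_requirements(packages):
    """Return requirements.txt-style lines for the installed versions."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name} == {version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment: flag it instead of guessing.
            lines.append(f"# {name}: not installed")
    return lines


print("\n".join(pinned_requirements(PACKAGES)))
```

Running this in a notebook cell prints lines that can be pasted into requirements.txt after a quick review.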

Vega-Altair and notebooks containing sensitive data

If you plot sensitive data in a notebook with Vega-Altair, you need to be careful.

The author of Vega-Altair provided a good summary in this GitHub comment:

“Standard Altair rendering requires the entire dataset to be accessible to the viewer’s browser: this is a fundamental design decision in Vega/Vega-Lite, in which a chart is equivalent to a dataset plus a specification of how to render it. In general, you should assume that the entire contents of any dataframe you pass to the alt.Chart() object will be saved in the notebook and be inspectable by the viewer.”

“One way to get around this would be to render the chart server-side, export a PNG, and display this png instead of the live chart. Incidentally, in the Jupyter notebook you can do this by running:”

alt.renderers.enable('png')

“This sets up Altair such that charts will be rendered to PNG within the kernel, and only that PNG rendering will be embedded in the notebook. Note this requires some extra dependencies, described here.”

“But even here, I wouldn’t call your data “private” (for example, if you save a scatter plot to PNG, a user can straightforwardly read the data values off the chart!) So this makes me think you’re actually doing some sort of aggregation of your data before plotting (e.g. showing a histogram). If this is the case, I would suggest doing those aggregations outside of Altair using e.g. pandas, and then passing the aggregated dataset to the chart. Then you get the normal interactive display of the Altair chart, and your data is just as private as it would have been in the equivalent static rendering – the user can only see the aggregated values you supplied to the chart.”
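The aggregation approach suggested in the quote can be sketched as follows. This is only an illustration under stated assumptions: the column name `measurement`, the sample values, and the helper names `aggregate_histogram` and `histogram_chart` are made up for this example. The point is that only per-bin counts, not the raw rows, are passed to alt.Chart().

```python
import numpy as np
import pandas as pd


def aggregate_histogram(values: pd.Series, bins: int = 10) -> pd.DataFrame:
    """Bin raw values and return only per-bin counts (no individual rows)."""
    counts, edges = np.histogram(values.dropna(), bins=bins)
    return pd.DataFrame(
        {"bin_start": edges[:-1], "bin_end": edges[1:], "count": counts}
    )


def histogram_chart(aggregated: pd.DataFrame):
    """Build a bar chart from the pre-aggregated table."""
    import altair as alt  # imported here so the aggregation works without Altair

    return (
        alt.Chart(aggregated)
        .mark_bar()
        .encode(
            x=alt.X("bin_start:Q", title="value"),
            x2="bin_end",
            y=alt.Y("count:Q", title="number of records"),
        )
    )


# Hypothetical sensitive measurements, made up for illustration.
raw = pd.DataFrame({"measurement": [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.0]})

# Only this aggregated table would end up embedded in the notebook.
aggregated = aggregate_histogram(raw["measurement"], bins=4)
```

In a notebook, `histogram_chart(aggregated)` would then display the interactive chart. Whether the bin counts themselves are safe to publish still depends on your data, for example when a bin contains very few records.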

How to get a digital object identifier (DOI)

Zenodo is a great service to get a DOI for a notebook (but first practice with the Zenodo sandbox).
Binder can also run notebooks from Zenodo.
In the supporting information of your paper you can refer to its DOI.
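Once a notebook has a Zenodo DOI, a Binder launch link can be built from it. The sketch below assumes the public mybinder.org service and its `/v2/zenodo/<DOI>/` URL scheme; the helper name `binder_url_from_doi` is hypothetical and the DOI is a made-up placeholder, not a real record.

```python
def binder_url_from_doi(doi: str) -> str:
    """Build a mybinder.org launch URL for a Zenodo record."""
    return f"https://mybinder.org/v2/zenodo/{doi.strip()}/"


# Placeholder DOI for illustration only; replace it with your own.
print(binder_url_from_doi("10.5281/zenodo.1234567"))
```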
