Notebooks and version control
Demonstrate two tools which make version control of notebooks easier.
10 min teaching
5 min exercises
Jupyter Notebooks are stored in JSON format. Because the file records outputs and metadata alongside the code, it can be difficult to compare and merge changes which are introduced through the notebook interface.
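To make the format concrete, here is a minimal sketch of what a notebook file contains, built with Python's standard json module. The single cell is a made-up example (not the darts notebook), and the field names are assumed to follow the nbformat version 4 layout.

```python
import json

# A hypothetical one-cell notebook in the nbformat 4 layout.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": ["hello\n"],
                }
            ],
            "source": ["print('hello')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# This is roughly what ends up on disk in the .ipynb file.
serialized = json.dumps(notebook, indent=1, sort_keys=True)
print(serialized)

# Only one of these lines is code the user typed; the rest is bookkeeping.
print(len(serialized.splitlines()), "lines of JSON for 1 line of code")
```

The printed JSON shows the typed code sitting next to the execution counter and the captured output, which is why a line-based comparison of two notebook files tends to be noisy.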
Packages and JupyterLab extensions to simplify version control
Several packages and JupyterLab extensions have been developed to make it easier to interact with Git and GitHub:
nbdime (notebook “diff” and “merge”) provides “content-aware” diffing and merging.
Adds a Git button to the notebook interface.
The git diff and git merge shell commands can use nbdime’s diff and merge for notebook files, but leave Git’s behavior unchanged for non-notebook files.
jupyterlab-git is a JupyterLab extension for version control using Git.
Adds a Git tab to the left-side menu bar for version control inside JupyterLab.
JupyterLab GitHub is a JupyterLab extension for accessing GitHub repositories.
Adds a GitHub tab to the left-side menu bar where you can browse and open notebooks from your GitHub repositories.
All three extensions can be used from within the JupyterLab interface. jupyterlab-git is installed as part of our Conda environment. nbdime is also already installed in this environment since it is a dependency of jupyterlab-git.
To install additional extensions, please consult the official documentation about installing and managing JupyterLab extensions.
Comparing changes without jupyterlab-git/nbdime
Create a new folder
Initialize a new Git repository (this is good to demonstrate anyway)
Copy the “darts” notebook into it (from the previous episode)
Stage and commit the file before trying the changes below
A plain git diff
Instructor demonstrates a plain git diff
To understand the problem, the instructor first shows the example notebook and then the source code in JSON format.
Then we introduce a simple change to the example notebook, for instance changing colors (change “red” and “blue” to something else) and also changing dimensions in
Run all cells.
We save the change (save icon) and in the JupyterLab terminal try a “normal” git diff and see that this is not very useful. Discuss why.
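One way to see why the plain diff is noisy is to imitate it with Python's standard difflib module. The sketch below is not what Git runs internally; it only compares two serialized versions of a made-up one-cell notebook line by line. The helper names make_notebook and to_lines and the file name darts.ipynb are hypothetical.

```python
import difflib
import json


def make_notebook(color, execution_count):
    """Build a hypothetical one-cell notebook (nbformat 4 layout)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [f"plotting in {color}\n"],
                    }
                ],
                "source": [f"color = '{color}'\n", "print('plotting in', color)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def to_lines(notebook):
    """Serialize a notebook the way it might be stored, one JSON line per list item."""
    return json.dumps(notebook, indent=1, sort_keys=True).splitlines()


before = make_notebook("red", execution_count=1)
after = make_notebook("green", execution_count=2)  # one word edited, cell re-run

# Line-based comparison, similar in spirit to a plain text diff.
diff = list(
    difflib.unified_diff(
        to_lines(before),
        to_lines(after),
        "a/darts.ipynb",  # hypothetical file name
        "b/darts.ipynb",
        lineterm="",
        n=0,
    )
)
print("\n".join(diff))

# Count the added and removed lines, skipping the two file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print(len(changed), "changed lines for a one-word edit")
```

Besides the edited source line, the execution counter and the stored output also show up as changes, even though the user never touched them directly. In a real notebook with plots, the stored output can be a long encoded image, which makes the effect much worse.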
Comparing changes with jupyterlab-git/nbdime
Let us inspect the same changes using jupyterlab-git (which uses nbdime). This is more convenient since it highlights only the changes that we have made:
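The idea behind a content-aware comparison can be sketched in a few lines of standard-library Python. This is a simplified illustration, not nbdime's actual algorithm: it compares only the source of each cell and ignores outputs and execution counters. The function names cell_sources and source_diff are hypothetical, and the sketch assumes both notebooks have the same number of cells.

```python
import difflib


def cell_sources(notebook):
    """Return the source text of every cell, ignoring outputs and counters."""
    return ["".join(cell["source"]) for cell in notebook["cells"]]


def source_diff(before, after):
    """Compare two notebooks cell by cell, looking at source code only.

    Simplification: cells are paired by position, so added or removed
    cells are not handled.
    """
    lines = []
    pairs = zip(cell_sources(before), cell_sources(after))
    for index, (old, new) in enumerate(pairs):
        if old != new:
            lines.append(f"cell {index}:")
            lines.extend(difflib.ndiff(old.splitlines(), new.splitlines()))
    return lines


# Two versions of a made-up notebook: one word edited, then re-run.
before = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "source": ["color = 'red'\n", "print(color)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["red\n"]}
            ],
        }
    ]
}
after = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 2,
            "source": ["color = 'green'\n", "print(color)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["green\n"]}
            ],
        }
    ]
}

for line in source_diff(before, after):
    print(line)
```

Because the comparison works on the parsed notebook rather than on raw text, the changed counter and output never appear in the result. nbdime goes further than this sketch, for example by also handling outputs, metadata, and inserted or deleted cells.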
Using nbdime on the command line
You can configure your (command line) Git to always use nbdime when comparing and merging notebooks:
$ nbdime config-git --enable --global
Now when you do git diff or git merge with notebooks, you should see a nice diff view. For more information please see the corresponding documentation.
nbdev, developed by fast.ai, is a notebook-driven development platform which includes support for Git-friendly Jupyter notebooks.
Verdant is a JupyterLab extension that automatically records history of all experiments you run in a Jupyter notebook, and stores them in an