Notebooks and version control
Demonstrate two tools which make version control of notebooks easier.
10 min teaching
5 min demo
Jupyter Notebooks are stored in JSON format. With this format it can be a bit difficult to compare and merge changes which are introduced through the notebook interface.
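To make this concrete, here is a minimal Python sketch that uses only the standard library. It assumes a heavily simplified, hypothetical one-cell notebook; the helper names make_notebook and json_lines are invented for this illustration. It serializes the notebook as indented JSON, roughly the way a notebook file is laid out on disk, and runs a line-based diff after a one-word edit followed by a re-run of the cell.

```python
import difflib
import json


def make_notebook(color, execution_count, output_text):
    """Build a simplified, nbformat-4-style notebook as a plain dict (illustrative only)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [output_text]}
                ],
                "source": [f"color = '{color}'\n", "print(color)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def json_lines(notebook):
    """Serialize as indented JSON and split into lines, as a line-based diff would see it."""
    return json.dumps(notebook, indent=1, sort_keys=True).splitlines()


before = make_notebook("red", execution_count=1, output_text="red\n")
# The author edits one word in the source and re-runs the cell.
after = make_notebook("green", execution_count=2, output_text="green\n")

diff = list(
    difflib.unified_diff(
        json_lines(before), json_lines(after),
        "before.ipynb", "after.ipynb", lineterm="", n=0,
    )
)
# Keep only added/removed lines, skipping the "---"/"+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(f"{len(changed)} changed lines for a one-word edit")
```

Besides the edited source line, the diff also reports the execution counter and the stored output. Real notebooks with plots embed image data in the outputs, which makes the effect much stronger.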
Packages and JupyterLab extensions to simplify version control
Several packages and JupyterLab extensions have been developed to make it easier to interact with Git and GitHub:
nbdime (notebook “diff” and “merge”) provides “content-aware” diffing and merging.
Adds a Git button to the notebook interface.
The git diff and git merge shell commands can use nbdime’s diff and merge for notebook files, but leave Git’s behavior unchanged for non-notebook files.
jupyterlab-git is a JupyterLab extension for version control using Git.
Adds a Git tab to the left-side menu bar for version control inside JupyterLab.
JupyterLab GitHub is a JupyterLab extension for accessing GitHub repositories.
Adds a GitHub tab to the left-side menu bar where you can browse and open notebooks from your GitHub repositories.
All three extensions can be used from within the JupyterLab interface. Our Conda environment provides jupyterlab-git and nbdime. To install additional extensions, please consult the official documentation about installing and managing JupyterLab extensions.
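As a rough illustration of what “content-aware” can mean for the nbdime entry above, the following Python sketch compares only the cell sources of two notebooks and ignores execution counts and outputs. This is not how nbdime is implemented (nbdime works on the full notebook structure, including outputs and metadata); the helpers notebook_json, cell_sources and source_diff are hypothetical names for this example, and the notebooks are simplified one-cell stand-ins.

```python
import difflib
import json


def notebook_json(color, execution_count):
    """Return a simplified one-cell notebook as a JSON string (illustrative only)."""
    return json.dumps({
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [color + "\n"]}
                ],
                "source": [f"color = '{color}'\n", "print(color)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    })


def cell_sources(raw_json):
    """Extract only what the author typed: one labelled block of source lines per cell."""
    notebook = json.loads(raw_json)
    lines = []
    for index, cell in enumerate(notebook["cells"]):
        lines.append(f"# cell {index} ({cell['cell_type']})")
        source = cell["source"]
        # The source may be stored as a single string or as a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        lines.extend(text.splitlines())
    return lines


def source_diff(old_json, new_json):
    """Line-based diff restricted to cell sources; outputs and counters are ignored."""
    diff = difflib.unified_diff(
        cell_sources(old_json), cell_sources(new_json), lineterm="", n=0
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


old = notebook_json("red", execution_count=1)
new = notebook_json("green", execution_count=2)  # edited and re-run

changes = source_diff(old, new)
print("\n".join(changes))
```

Only the edited source line shows up, even though the execution counter and the stored output changed as well.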
Comparing Jupyter Notebooks on GitHub
To compare notebooks on GitHub, enable Rich Jupyter Notebook Diffs:
On GitHub click on your avatar/image (top right).
Click on “Feature preview”.
Enable “Rich Jupyter Notebook Diffs”.
To demonstrate the difference, we have created a small change. You can compare the effect yourself by enabling and disabling the feature: https://github.com/coderefinery/jupyter/compare/5ff55b8..fce21e6
Here is the diff without “Rich Jupyter Notebook Diffs”:
Here is the same change, but this time with “Rich Jupyter Notebook Diffs” enabled:
Comparing changes locally without jupyterlab-git/nbdime
Create a new folder
Initialize a new Git repository (which is worth demonstrating anyway)
Copy the “darts” notebook into it (from the previous episode)
Stage and commit the file before trying the changes below
Instructor demonstrates a plain git diff
Then we introduce a simple change to the example notebook, for instance changing colors (change “red” and “blue” to something else) and also changing dimensions in
Run all cells.
We save the change (save icon) and in the JupyterLab terminal try a “normal” git diff and see that this is not very useful. Discuss why.
Comparing changes with jupyterlab-git/nbdime
Let us inspect the same changes using jupyterlab-git (which uses nbdime). This is more convenient since it highlights only the changes that we have made:
Using nbdime on the command line
You can configure your (command line) Git to always use nbdime when comparing and merging notebooks:
$ nbdime config-git --enable --global
Now when you run git diff or git merge on notebooks, Git uses nbdime and you should see a content-aware diff view instead of raw JSON changes. For more information please see the corresponding documentation.