Motivation for version control
Git is all about keeping track of changes
Below are screenshots of tracked changes with Git (from our example repository):
Discussion
Commits are like snapshots of the repository at a certain point in time.
Commits carry metadata about changes: author, date, commit message, and a checksum.
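To make the idea of a snapshot with metadata and a checksum more concrete, here is a minimal sketch. The `Snapshot` class, its fields, and the example values are hypothetical and only for illustration; this is not Git's actual object format (real Git commits also record parent commits and store file content differently). It only shows that the checksum depends on both the metadata and the content.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified stand-in for a commit (not Git's real format)."""
    author: str
    date: str
    message: str
    files: tuple  # pairs of (path, content) captured at this point in time

    def checksum(self) -> str:
        # Hash metadata and content together, so changing either one
        # produces a different checksum.
        h = hashlib.sha1()
        for part in (self.author, self.date, self.message):
            h.update(part.encode("utf-8") + b"\0")
        for path, content in sorted(self.files):
            h.update(path.encode("utf-8") + b"\0")
            h.update(content.encode("utf-8") + b"\0")
        return h.hexdigest()


# Made-up example data
first = Snapshot("Alice", "2023-08-01", "Add greeting", (("hello.txt", "hello\n"),))
second = Snapshot("Alice", "2023-08-02", "Fix greeting", (("hello.txt", "Hello!\n"),))
same_as_first = Snapshot("Alice", "2023-08-01", "Add greeting", (("hello.txt", "hello\n"),))

print(first.checksum() == same_as_first.checksum())  # identical snapshots match
print(first.checksum() == second.checksum())         # any change gives a new checksum
```

Because the checksum is derived from the snapshot itself, it can serve as a name for "exactly this version", which is what makes it useful when you tell someone which version you used.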
Why do we need to keep track of versions?
Version control is an answer to the following questions (do you recognize some of them?):
“It broke … hopefully I have a working version somewhere?”
“Can you please send me the latest version?”
“Where is the latest version?”
“Which version are you using?”
“Which version have the authors used in the paper I am trying to reproduce?”
“Found a bug! Since when was it there?”
“I am sure it used to work. When did it change?”
“My laptop is gone. Is my thesis now gone?”
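As an illustration of why recorded history helps with questions like "Found a bug! Since when was it there?", here is a minimal sketch of a binary search over an ordered list of versions. The function `first_bad_version` and the toy `history` are hypothetical; the sketch assumes the oldest version is good, the newest is bad, and the bug stays once introduced. Git offers `git bisect` for this kind of search on real repositories.

```python
def first_bad_version(versions, is_good):
    """Return the index of the first version for which is_good() fails.

    Assumes versions[0] is good, versions[-1] is bad, and that once the
    bug appears it is present in all later versions.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_good(versions[middle]):
            good = middle
        else:
            bad = middle
    return bad


# Hypothetical history: each "version" is a function that should double x.
history = [
    lambda x: x * 2,      # version 0: correct
    lambda x: x + x,      # version 1: correct
    lambda x: x * 2 + 0,  # version 2: correct
    lambda x: x * 3,      # version 3: bug introduced here
    lambda x: x * 3 - 0,  # version 4: still broken
]

culprit = first_bad_version(history, lambda double: double(5) == 10)
print(culprit)
```

With many recorded versions, this kind of search needs only a handful of tests, but it is only possible if the intermediate versions were kept in the first place.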
Features: roll-back, branching, merging, collaboration
Roll-back: you can always go back to a previous version and compare
Branching and merging:
Work on different ideas at the same time
Different people can work on the same code/project without interfering
You can experiment with an idea and discard it if it turns out to be a bad idea
Collaboration: review, compare, share, discuss
Reproducibility
Someone asks you about your results from 5 years ago. Can you get the same results now?
How do you indicate which version of your code you have used in your paper?
When you find a bug, how do you know precisely when it was introduced? (Are published results affected? Do you need to inform collaborators or users of your code?)
With version control we can “annotate” code (browse this example online):
Talking about code
You want to show someone a few lines from one of your projects. Which of these two is more practical?
“Clone the code, go to the file ‘src/util.rs’, and search for ‘time_iso8601’. Oh! But make sure you use the version from August 2023.”
Or I can send you a permalink:
What we typically like to snapshot
Software (this is how it started, but Git/GitHub can track a lot more)
Scripts
Notebooks
Documents (plain text files are much better suited than Word documents)
Manuscripts (Git is great for collaborating/sharing LaTeX or Quarto manuscripts)
Configuration files
Website sources
Data