Motivation for version control

Git is all about keeping track of changes

Below are screenshots of tracked changes with Git (from our example repository):

Screenshot of a git log on GitHub

Web browser, GitHub view

Screenshot of a git log in terminal

The same as above, but the terminal view

Discussion

  • Commits are like snapshots of the repository at a certain point in time.

  • Commits carry metadata about changes: author, date, commit message, and a checksum.
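
To make the checksum less abstract: in repositories that use Git's SHA-1 object format, each stored object is identified by the SHA-1 hash of a short header followed by the object's content. The sketch below reproduces this for file contents ("blob" objects). The function name `git_object_id` is our own choice for illustration and not part of Git; commits are hashed in the same manner, with a `commit` header in front of the commit's text (author, date, message, and so on).

```python
import hashlib


def git_object_id(content: bytes, kind: str = "blob") -> str:
    """Return the SHA-1 ID for `content`, assuming Git's SHA-1 object format.

    `git_object_id` is an illustrative helper, not a Git API.
    """
    # Git prepends "<kind> <size in bytes>\0" before hashing.
    header = f"{kind} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_object_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's ID for an empty file)

# Changing a single character yields a completely different checksum.
print(git_object_id(b"hello\n"))
print(git_object_id(b"hello!\n"))
```

Because the checksum depends on the content, it doubles as a unique name for a snapshot: the same content always gets the same ID, and any modification changes it.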

Why do we need to keep track of versions?

Version control is an answer to the following questions (do you recognize some of them?):

  • “It broke … hopefully I have a working version somewhere?”

  • “Can you please send me the latest version?”

  • “Where is the latest version?”

  • “Which version are you using?”

  • “Which version did the authors use in the paper I am trying to reproduce?”

  • “Found a bug! How long has it been there?”

  • “I am sure it used to work. When did it change?”

  • “My laptop is gone. Is my thesis now gone?”

Features: roll-back, branching, merging, collaboration

  • Roll-back: you can always go back to a previous version and compare

  • Branching and merging:

    • Work on different ideas at the same time

    • Different people can work on the same code/project without interfering

    • You can experiment with an idea and discard it if it turns out to be a bad idea

Branching explained with a gopher

Image created using https://gopherize.me/ (inspiration).

Reproducibility

  • Someone asks you about your results from 5 years ago. Can you get the same results now?

  • How do you indicate which version of your code you used for your paper?

  • When you find a bug, how do you know precisely when it was introduced? (Are published results affected? Do you need to inform collaborators or users of your code?)
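
With a recorded history, the question of when a bug was introduced becomes a search problem. Git's `git bisect` command automates such a search over commits; the sketch below only illustrates the underlying idea, a binary search between a known-good and a known-bad version. All names here (`first_bad_version`, `is_bad`, the `v1` to `v100` labels, and the bug appearing in `v73`) are hypothetical and made up for this example.

```python
from typing import Callable, List, Sequence


def first_bad_version(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which `is_bad` is true.

    Assumes `versions` is ordered oldest to newest, the oldest version is
    good, the newest is bad, and the bug persists once introduced.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(versions[middle]):
            bad = middle   # bug already present: look earlier
        else:
            good = middle  # still working: look later
    return versions[bad]


# Hypothetical history of 100 versions; we pretend the bug arrived in v73.
history: List[str] = [f"v{i}" for i in range(1, 101)]
tested: List[str] = []


def is_bad(version: str) -> bool:
    tested.append(version)  # record which versions had to be tested
    return int(version[1:]) >= 73


result = first_bad_version(history, is_bad)
print(result, "found after testing", len(tested), "versions")
```

The point of the sketch is that only a handful of versions need testing instead of all 100, which is practical only when every intermediate version is still available.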

With version control we can “annotate” code (browse this example online):

Example of a git-annotated code with code and history side-by-side

Talking about code

You want to show someone a few lines from one of your projects. Which of these two is more practical?

  • “Clone the code, go to the file ‘src/util.rs’, and search for ‘time_iso8601’. Oh! But make sure you use the version from August 2023.”

  • Or I can send you a permalink:

Screen-shot of a code portion

Permalink that points to a code portion.

What we typically like to snapshot

  • Software (this is how it started, but Git/GitHub can track a lot more)

  • Scripts

  • Notebooks

  • Documents (plain text files are much better suited than Word documents)

  • Manuscripts (Git is great for collaborating/sharing LaTeX or Quarto manuscripts)

  • Configuration files

  • Website sources

  • Data