Overview
Teaching: 15 min
Exercises: 0 min
Questions
- Why version control?
- Why Git?
Objectives
- Make sure nobody leaves the workshop without starting to use some form of version control.
- Discuss the reasons why we advocate distributed version control.
Discuss the following directory listing. What possible problems do you anticipate with this kind of “version control”:
mylib-1.2.4_18.3.07.tgz somecode_CP_10.8.07.tgz
mylib-1.2.4_27.7.07.tgz somecode_CP_17.5.07.tgz
mylib-1.2.4_29.4.08.tgz somecode_CP_23.8.07_final.tgz
mylib-1.2.4_6.10.07.tgz somecode_CP_24.5.07.tgz
mylib-1.2.5_23.4.08.tgz somecode_CP_25.5.07.tgz
mylib-1.2.5_25.5.07.tgz somecode_CP_29.5.07.tgz
mylib-1.2.5_6.6.07.tgz somecode_CP_30.5.07.tgz
mylib-1.2.5_bexc.tgz somecode_CP_6.10.07.tgz
mylib-1.2.5_d0.tgz somecode_CP_6.6.07.tgz
mylib-1.3.0_4.4.08.tgz somecode_CP_8.6.07.tgz
mylib-1.3.1_4.4.08.tgz somecode_KT.tgz
mylib-1.3.2_22.4.08.tgz somecode_PI1_2007.tgz
mylib-1.3.2_4.4.08.tgz somecode_PI_2007.tgz
mylib-1.3.2_5.4.08.tgz somecode_PI2_2007.tgz
mylib-1.3.3_1.5.08.tgz somecode_PI_CP_18.3.07.tgz
mylib-1.3.3_20.5.08.tgz somecode_11.5.08.tgz
mylib-1.3.3_tstrm_27.6.08.tgz somecode_15.4.08.tgz
mylib-1.3.3_wk_10.8.08.tgz somecode_17.6.09_unfinished.tgz
mylib-1.3.3_wk_11.8.08.tgz somecode_19.7.09.tgz
mylib-1.3.3_wk_13.8.08.tgz somecode-20.7.09.tgz
...
Why Git?
We will use Git to record snapshots of our work:
- Easy to set up: you can use it by yourself, with no server needed.
- Very popular: chances are high you will need to contribute to somebody else’s code which is tracked with Git.
- Distributed: good backup, no single point of failure, you can track and clean up changes offline, and it simplifies the collaboration model for open-source projects.
- Important platforms such as GitHub, GitLab, and Bitbucket build on top of Git.
- Many platforms build on top of GitHub.
- Sharing software and data is becoming common, and is often required, in research, and GitHub is a popular platform for sharing software.
- However, “Git is a four-handle, dual boiler espresso machine, not instant coffee.” [citation needed] Git is not the most user-friendly tool and has its design quirks, but its underlying design is sound, and it is definitely the most popular system and the one you are most likely to need to know. So we teach it.
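To illustrate the "easy to set up, no server needed" point above, here is a minimal sketch of creating a repository and recording one snapshot entirely on your own machine. The temporary directory, the file name `notes.txt`, and the name and email values are placeholders chosen for this illustration, not part of the lesson material.

```shell
# Work in a throwaway directory (placeholder; any empty directory works)
workdir="$(mktemp -d)"
cd "$workdir"

# Create a repository: this only adds a hidden .git directory, no server involved
git init --quiet

# Identify yourself for this repository only (placeholder values)
git config user.name "Example Learner"
git config user.email "learner@example.org"

# Record a first snapshot of a small file (hypothetical file name)
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add first draft of notes"

# Count the snapshots (commits) recorded so far
commit_count="$(git rev-list --count HEAD)"
echo "$commit_count"
```

Everything here happens locally. Sharing the repository with a platform such as GitHub is a separate, optional step.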
Why not Subversion?
- Subversion is centralized (one server, many clients) and requires setting up and maintaining a server.
- You cannot easily clean up your recorded snapshots (commits) before you share them.
- It is not easy to accept contributions from external contributors.
Why not Mercurial?
- Many Git concepts also apply to Mercurial. More to the point, the most important lesson is how and why to use version control, and that carries over to any system with minor changes.
- Even if you use Mercurial, chances are high that you will need to contribute to code tracked with Git.
Before we create a new repository from scratch and learn how to record changes and create and merge branches, let us explore an existing Git repository on GitHub. The goal here is not to teach GitHub yet (we will explain some of the concepts later), but rather to get a glimpse of the wider picture and see the social aspect to know what our end goal is.
As an example we can explore a famous Git repository which was used to produce the Event Horizon Telescope images: https://github.com/achael/eht-imaging.
While some of these are GitHub features, all of this can be done on other sites, or by yourself without GitHub at all.