Copy and browse an existing project

In this episode, we will look at an existing repository to understand how all the pieces work together. Along the way, we will make our own copy (a fork) of the repository, which we will use for our own changes in the next episode.

  • We used to start by creating a repository from scratch. This was abstract and hard to understand.

  • Instead, we’ll show you all the cool stuff in a Git repository first, and then start adding files.

  • We use an example recipe book we created just for this course.

  • By the end of the course, you’ll know how to contribute your own recipes to it.

Objectives

  • See a real Git repository and understand what is inside of it.

  • Understand how version control allows advanced inspection of a repository.

  • See how Git allows multiple people to collaborate easily.

  • See the big picture instead of remembering a bunch of commands.

GitHub, VS Code, Command line, and more

We offer three different paths for doing this exercise:

  • GitHub (this is the one we will demonstrate on day 1)

  • VS Code (if you prefer to follow along using an editor; we will do this on day 2)

  • Command line (for people comfortable with the command line; you will see more of this on day 2)

In the future we’ll add more paths, for example Jupyter and RStudio (contributions welcome!).

Creating a copy of the repository by “forking”

A repository is a collection of files in one directory tracked by Git. A GitHub repository is GitHub’s copy, which adds things like access control. Each GitHub repository is owned by a user or organization, who controls what is in it.

First, we need to make our own copy of the exercise repository. This will become important later, when we make our own changes.

  1. Go to the repository view on GitHub: https://github.com/coderefinery/recipe-book

  2. On GitHub, click the button that says “Fork”. It is towards the top-right of the screen:

    Screenshot on GitHub before clicking on "Fork"
  3. You should shortly be redirected to your copy of the repository YOUR_USER_NAME/recipe-book.

At all times, you should be aware of whether you are looking at your repository or the CodeRefinery upstream repository.

  • Your repository: https://github.com/USERNAME/recipe-book

  • CodeRefinery upstream repository: https://github.com/coderefinery/recipe-book

You only need to open your own view, as described above. The browser URL should look like https://github.com/USER/recipe-book, where USER is your GitHub username.
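The owner part of the URL is what tells the two copies apart. As a minimal Python sketch of that check (the function name `which_copy` is made up for illustration, and it simply assumes that any owner other than `coderefinery` is a fork):

```python
from urllib.parse import urlparse

UPSTREAM_OWNER = "coderefinery"  # owner of the upstream repository
REPO_NAME = "recipe-book"


def which_copy(url):
    """Classify a GitHub URL as 'upstream', 'fork', or 'other'."""
    parsed = urlparse(url)
    # The path looks like /OWNER/REPOSITORY/...; drop empty segments.
    parts = [part for part in parsed.path.split("/") if part]
    if parsed.netloc != "github.com" or len(parts) < 2 or parts[1] != REPO_NAME:
        return "other"
    # Assumption: every owner other than the upstream owner is treated as a fork.
    return "upstream" if parts[0].lower() == UPSTREAM_OWNER else "fork"


print(which_copy("https://github.com/coderefinery/recipe-book"))
print(which_copy("https://github.com/USER/recipe-book"))
```

In practice, a glance at the browser address bar does the same job: the first path segment after github.com is the owner.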

Exercise

Work on this by yourself or in your team.

Instructor note

Before starting the exercise session:

  • Make sure you have shown how to fork the repository to your own account (above).

Exercise: Browsing an existing project (25 min)

Browse the recipe-book project (introduced above) and explore commits and branches. Take notes and prepare questions. The hints are for the GitHub path in the browser.

  1. Browse the commit history: Are commit messages understandable? (Hint: “Commit history”, the timeline symbol, above the file list)

  2. Compare the commit history with the network graph (“Insights” -> “Network”). Can you find the branches?

  3. How can you find out when a recipe was last modified?

  4. How many changes did the Guacamole recipe receive (you can find it under “sides”)? Try to click on some of the commits to see what changed. (Hint: “History” in the view of a single file)

  5. Which recipes include the ingredient “salt”? (Hint: the GitHub search. From the repository view, it should offer the filter “repo:USER/recipe-book” by default. What if you add a search term?)

  6. In the Guacamole recipe, find out who modified each line last and when (click on the file, then click the “Blame” button). Find out who added the cilantro and in which commit. (Hint: “Blame” view in the file view)

  7. Can you use these recipes yourself? Are you allowed to share modifications? (Hint: look for a license file)

  8. Browse issues and pull requests in the upstream repository (the repository you forked from). Any idea what these might be good for? (Hint: tabs in the repository view)

The solution below goes over most of the answers, and you are encouraged to use it when the hints aren’t enough - this is by design.

Solution and walk-through

(1) Basic browsing

The most basic thing to look at is the history of commits.

  • This is visible from a button in the repository view. We see every change, when it was made, and who committed it.

  • Every change has a unique identifier, such as 554c187. This can be used to identify both this change and the whole project’s version as of that change.

  • Clicking on a change in the view shows more.

Click on the timeline symbol in the repository view:

Screenshot on GitHub of where to find the commit history
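To make the idea of an identifier more concrete, here is a minimal Python sketch of how Git derives an object ID, assuming the repository uses Git's default SHA-1 object format. The commit body is simplified, and the tree ID, parent IDs, author, and message are made up for illustration; they are not taken from the recipe-book repository.

```python
import hashlib


def git_object_id(kind, body):
    """Hash an object the way Git does with SHA-1: header, NUL byte, then body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def make_commit_body(tree, parent, message):
    """Build a simplified commit body (hypothetical author and timestamp)."""
    person = "Example Person <person@example.org> 1700000000 +0000"
    return (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {person}\n"
        f"committer {person}\n"
        f"\n"
        f"{message}\n"
    ).encode()


# Made-up 40-character IDs standing in for a file tree and two possible parents.
tree = "1" * 40
first = git_object_id("commit", make_commit_body(tree, "a" * 40, "Add cilantro"))
second = git_object_id("commit", make_commit_body(tree, "b" * 40, "Add cilantro"))

print(first[:7])        # abbreviated form, similar to the short IDs shown on GitHub
print(first != second)  # same files and message, but a different parent
```

Because the ID is computed from the file tree and the parent commit as well as the message, one identifier is enough to pin down both the change and the state of the whole project at that point.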

(2) Compare commit history with network graph

The commit history we saw above looks linear: one commit after another. But if we look at the network view, we see some branches and merges. We’ll see how to do these later. This is another one of the basic Git views.

In a new browser tab, open the “Insights” tab, and click on “Network”. You can hover over the commit dots to see the person who committed and how they correspond with the commits in the other view:

Screenshot on GitHub of the network graph
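One way to think about the network view is as a graph in which every commit points to its parent commits. The following Python sketch uses a small, made-up history (the commit names `c1`, `f1`, `m1` and the helper functions are hypothetical) to show how branch points and merges appear in such a graph:

```python
from collections import Counter

# Hypothetical history: each commit ID maps to the IDs of its parent commits.
parents = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],        # continues on the main line
    "f1": ["c2"],        # a branch starts from c2
    "m1": ["c3", "f1"],  # merge commit: it has two parents
}


def ancestors(tip, parents):
    """Collect the tip and every commit reachable by following parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_commits(parents):
    """Commits with more than one parent join two lines of history."""
    return sorted(c for c, ps in parents.items() if len(ps) > 1)


def branch_points(parents):
    """Commits with more than one child are where history splits."""
    child_count = Counter(p for ps in parents.values() for p in ps)
    return sorted(c for c, n in child_count.items() if n > 1)


print(sorted(ancestors("m1", parents)))
print(merge_commits(parents))
print(branch_points(parents))
```

The linear commit list shows the same commits as the graph, flattened into one sequence; the network view keeps the parent links visible.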

(3) When was a recipe last modified?

We see the history for the whole repository, but we can also see it for a single file.

Navigate to the file view: Main page → sides directory → guacamole.md. Click the “History” button near the top right:

Screenshot on GitHub showing the history of a single file

(4) How many changes did the Guacamole recipe receive?

According to the view above, it seems to have five changes (as of 2024-03-07). This could change later on.

(5) Which recipes include the ingredient “salt”?

Version control makes it very easy to find all occurrences of a single word. This is useful for things like finding where functions or variables are defined or used.

We go to the main recipe book view. We click the Search magnifying glass at the very top, type “salt”, and press Enter. We see every instance, including the context.

Searching in a forked repository will not work instantaneously!

It usually takes a few minutes before one can search for keywords in a forked repository, because GitHub needs to build the search index the first time we search. Start the search, continue with the other steps, then come back to this.

Screenshot on GitHub performing a search
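The same kind of search can be approximated on a local copy of the files. This is a minimal Python sketch, not how GitHub search is implemented; it assumes the recipes are Markdown files (like guacamole.md), and the sample recipe contents in `demo` are invented for illustration.

```python
import tempfile
from pathlib import Path


def search_recipes(root, term):
    """Return (relative path, line number, line) for each Markdown line containing term."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if term.lower() in line.lower():  # case-insensitive match
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits


def demo():
    """Search a throwaway directory with two invented recipes."""
    with tempfile.TemporaryDirectory() as tmp:
        sides = Path(tmp) / "sides"
        sides.mkdir()
        (sides / "guacamole.md").write_text(
            "# Guacamole\n- 1 avocado\n- Salt to taste\n", encoding="utf-8"
        )
        (sides / "salsa.md").write_text("# Salsa\n- tomatoes\n", encoding="utf-8")
        return search_recipes(tmp, "salt")


print(demo())
```

Each hit carries the file, the line number, and the matching line, which is roughly the context that the search results page shows.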

(6) Who modified each line last and when?

This is called the “annotate” or “blame” view. The name “blame” is unfortunate, but for historical reasons it is the standard term for this functionality; it is not meant to blame anyone.

From a recipe view, switch from the preview to “Blame” towards the top-left. To get the actual commit, click on the commit message.

Screenshot on GitHub showing the "Blame" view
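Conceptually, the blame view answers "which commit last touched this line?" for every line of a file. The following Python sketch illustrates the idea with a naive forward pass over successive versions of a file, using the standard-library `difflib`. Git's actual blame algorithm is more sophisticated, and the commit IDs and recipe lines here are made up.

```python
from difflib import SequenceMatcher


def blame(versions):
    """Annotate the newest version of a file.

    versions: list of (commit_id, list_of_lines), oldest first.
    Returns a list of (commit_id, line) pairs for the newest version.
    """
    annotated = []  # (commit_id, line) pairs for the version processed so far
    for commit_id, new_lines in versions:
        old_lines = [line for _, line in annotated]
        matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the commit that introduced them.
                updated.extend(annotated[i1:i2])
            elif tag in ("insert", "replace"):
                # New or rewritten lines are attributed to this commit.
                updated.extend((commit_id, line) for line in new_lines[j1:j2])
            # Deleted lines simply disappear from the annotation.
        annotated = updated
    return annotated


# Hypothetical history of an ingredient list.
history = [
    ("aaa1111", ["avocado", "lime"]),
    ("bbb2222", ["avocado", "lime", "cilantro"]),
    ("ccc3333", ["avocado", "lime juice", "cilantro"]),
]

for commit_id, line in blame(history):
    print(commit_id, line)
```

In this invented history, the cilantro line stays attributed to the commit that added it, even though a later commit changed a neighboring line.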

(7) Can you use these recipes yourself? Are you allowed to share modifications?

  • Look at the file LICENSE.

  • It says it is “Creative Commons Zero 1.0”, which is equivalent to public domain. You can use the recipes without conditions.

  • Note that the GitHub view of the file LICENSE gives a nice summary of what the license means. Try it out:

    Screenshot on GitHub summarizing license terms

(8) Browse issues and pull requests in the upstream repository

This can only be done through the GitHub view. Go to the main repository coderefinery/recipe-book (not your fork): https://github.com/coderefinery/recipe-book. Issues and pull requests are different for each GitHub copy.

  • Click on the “Issues” tab. These are notes that people have added, which allow discussion about the project. Often they are used to communicate problems or ideas.

  • Click on the “Pull requests” tab. This allows anyone to propose changes, but only the repository owners can accept them.

Summary

  • Git allowed us to understand this simple project much better than we could if it were just a few files on our own computer.

  • It was also very easy to share the project with the course.

  • By forking the repository, we created our own copy. This is important for the next episode, where we will make changes to our copy.