Demo: From a script towards a workflow

In this episode we will explore code quality and good practices in Python using a hands-on approach. Together we will build a small project and improve it step by step.

We will start from a relatively simple image processing script that reads a telescope image of stars. Our goal is to count the number of stars in the image. Later we will want to process many such images.
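To make the task concrete, here is a minimal sketch of one way to count stars, assuming the image has already been loaded as a 2D grid of brightness values. The function name `count_stars`, the threshold value, and the tiny example grid are hypothetical and are not taken from the demo script. The sketch counts connected groups of bright pixels using only the standard library.

```python
def count_stars(image, threshold):
    """Count connected groups of pixels brighter than threshold.

    image is a list of rows, each row a list of brightness values.
    Pixels are connected if they touch horizontally or vertically.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = set()
    n_stars = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or (r, c) in seen:
                continue
            # Found an unvisited bright pixel: it starts a new star.
            n_stars += 1
            seen.add((r, c))
            stack = [(r, c)]
            # Flood fill to mark every pixel belonging to this star.
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and image[ny][nx] > threshold:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return n_stars


# Hypothetical 5x5 "image" with two bright blobs.
example_image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 0, 8],
    [0, 0, 0, 0, 8],
]
n_found = count_stars(example_image, threshold=5)
```

A real telescope image would need an image-reading library and probably some noise handling, so this is only a starting point for discussion.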

The (fictional) telescope images look like the one below (more can be found in this repository):

Generated image representing a telescope image of stars.

Rough plan for this demo

  • (15 min) Discuss how we would solve the problem, run example code, and make it work (as part of a Jupyter notebook)

  • (15 min) Refactor the positioning code into a function and a module

  • (15 min) Now we wish to process many images - discuss how we would approach this

  • (15 min) Introduce CLI and discuss the benefits

  • (30 min) From a script to a workflow (using Snakemake)
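The step from a single image to many images with a command-line interface could look roughly like the sketch below. It uses `argparse` from the standard library. The option name `--threshold`, the function names, and the idea of passing the counting function in as an argument are assumptions made for illustration, not the interface built in the demo.

```python
import argparse


def build_parser():
    """Create the argument parser (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Count stars in telescope images.")
    parser.add_argument("images", nargs="+", help="paths to one or more image files")
    parser.add_argument(
        "--threshold",
        type=float,
        default=0.5,
        help="brightness cutoff above which a pixel counts as part of a star",
    )
    return parser


def count_all(paths, count_one):
    """Apply count_one to each path and return a list of (path, count) pairs."""
    return [(path, count_one(path)) for path in paths]


def main(argv, count_one):
    """Parse argv, count stars in every image, and print one line per image.

    count_one is any function taking (path, threshold) and returning a count.
    """
    args = build_parser().parse_args(argv)
    results = count_all(args.images, lambda path: count_one(path, args.threshold))
    for path, n_stars in results:
        print(f"{path}: {n_stars}")
    return 0
```

Keeping argument parsing, the loop over images, and the counting itself in separate functions makes each part easy to test on its own, and it lets a workflow tool call the same code once per image.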

Plan

Topics we wish to show and discuss:

  • Naming (and other) conventions, project organization, modularity

  • The value of pure functions and immutability

  • Refactoring (explained through examples)

  • Auto-formatting and linting with tools like black, vulture, ruff

  • Moving a project under Git

  • How to document dependencies

  • Structuring larger software projects in a modular way

  • Command-line interfaces

  • Workflows with Snakemake
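As a small illustration of the point about pure functions and immutability, the sketch below contrasts a function that modifies its input with one that returns a new value. The function names and the pixel values are made up for this example.

```python
def subtract_background_in_place(pixels, background):
    """Impure: changes the caller's list as a side effect."""
    for i in range(len(pixels)):
        pixels[i] -= background


def subtract_background(pixels, background):
    """Pure: returns a new tuple and leaves the input untouched."""
    return tuple(value - background for value in pixels)


# A tuple is immutable, so it cannot be modified by accident.
raw = (10, 12, 50, 11)
cleaned = subtract_background(raw, 10)

# The in-place variant only works on a mutable list and overwrites it.
scratch = [10, 12, 50, 11]
subtract_background_in_place(scratch, 10)
```

With the pure version, `raw` is still available afterwards for comparison or for a second processing step. With the in-place version, the original values in `scratch` are gone.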

We will work together on the code on the big screen, and participants will be encouraged to give suggestions and ask questions. We will end up with a Git repository which will be shared with workshop participants.

Possible solutions