Introduction to Programming for Data Science

In this introductory workshop, we will learn the very basics of programming for data science. What does it mean to code, or to “run a script”? What are variables and functions?

We will work in Jupyter Notebooks because they make it simple to start writing code, even without any previous experience. We will learn how to produce some reproducible plots using Vega-Altair and, more generally, how to find your way around programming in Python or R.
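As a small taste of what "variables and functions" look like, here is a minimal sketch that can be pasted into a notebook cell. It is not taken from the workshop material: the names `temperatures_celsius` and `mean`, and the numbers themselves, are made up for illustration only.

```python
# A variable stores a value under a name (here: a list of made-up numbers).
temperatures_celsius = [18.0, 21.0, 24.0]


# A function packages a few steps under a name so they can be reused.
def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Calling the function with our variable gives a new value,
# which we store in another variable and display.
average = mean(temperatures_celsius)
print(average)
```

Running the cell executes the lines from top to bottom, which is essentially what "running a script" means.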

Who is the course for?

  • Somebody starting with Python or curious about Python.

  • Somebody who needs to read, process, and plot data for their work or studies and would like to try it out with Python.

Preparations

  • No programming experience is needed: we will start from zero and learn the basics together.

  • Computer with network access

  • Software install instructions

What is not taught?

  • Version control. Although very useful, it is outside the scope of this workshop.

  • Python outside a Jupyter Notebook (e.g. in a terminal, VS Code, or Spyder).

  • Python sets and tuples are only mentioned.

  • File input/output is only done via libraries; writing your own file I/O is covered only in optional material.

  • How to choose the right visualization format for the data at hand.

  • Python object-oriented design.

  • Python packaging.

  • NumPy arrays.

  • Managing environments and installing Python packages.

Episode overview

Introductory workshop (3h):

More advanced topics:

Other:

Credit

When preparing this lesson, we have reused these resources: