Data visualization using Python

In this introductory-level workshop, we will learn to produce reproducible data visualization pipelines using the Python programming language and the Vega-Altair declarative visualization library.

We will work in Jupyter Notebooks and start with Python basics so that we can read data from Excel sheets and comma-separated values (CSV) files. We will introduce the pandas library for “data wrangling” (reading, writing, sorting, and filtering data). Finally, we will learn how to produce and share reproducible plots using Vega-Altair.

Who is the course for?

  • Somebody starting with Python or curious about Python.

  • Somebody who needs to read, process, and plot data for their work or studies and would like to try it out with Python.

  • People who already use Python for this but want to learn about libraries that simplify common tasks and about how to share their workflow in a reproducible way.

Preparations

  • No programming language experience needed; we will start from zero and learn the basics together.

  • Computer with network access

  • Anaconda installation (Software install instructions)

  • Installing the package altair

  • Bring one of your recent plotting tasks or challenges

What is not taught?

  • Version control. Although very useful, it is outside the scope of this workshop.

  • Python outside a Jupyter Notebook.

  • Python sets and tuples are only mentioned.

  • File input/output is used only via libraries; writing your own file I/O is covered only in optional material.

  • How to choose the right visualization format for the data at hand.

  • Python object-oriented design.

  • Python packaging.

  • NumPy arrays.

  • Managing environments and installing Python packages.

Episodes

  • Jupyter Notebooks (35 min)

  • Python basics (35 min)

  • Generating our first plot (35 min)

  • Reading and slicing data with pandas (45 min)

  • Tidy data and dealing with messy data (25 min)

  • Customizing plots (50 min)

  • Sharing notebooks (20 min)

Example notebooks

Generating our first plot

Available on nbviewer, Colab, and Binder.

Reading and slicing data with pandas

Available on nbviewer, Colab, and Binder.

Customizing plots

Available on nbviewer, Colab, and Binder.

Credit

When preparing this lesson, we have reused these resources: