Data visualization using Python and Vega-Altair
In this introductory-level workshop, we will learn to produce reproducible data visualization pipelines using the Python programming language and the Vega-Altair declarative visualization library.
We will work in Jupyter Notebooks and start with Python basics. We will introduce the pandas library for “data wrangling” (reading, writing, sorting, and filtering of data). With pandas, we will be able to read data from Excel sheets and comma-separated values (CSV) files.
Finally, we will learn how to produce and share reproducible plots using Vega-Altair.
Who is the course for?
Somebody starting with Python or curious about Python.
Somebody who needs to read, process, and plot data for their work or studies and would like to try it out with Python.
Persons who already use Python for this but want to learn about libraries to simplify common tasks and about how to share their workflow in a reproducible way.
Preparations
No programming experience is needed: we will start from zero and learn the basics together.
Computer with network access
Bring one of your recent plotting tasks or challenges
What is not taught?
Version control. Although very useful, it is outside the scope of this workshop.
Python outside a Jupyter Notebook.
Running the examples in VS Code or Spyder might not be possible. Please use Jupyter Notebooks for this course.
Python sets and tuples are only mentioned.
File input/output is only done through libraries; writing your own file I/O code is covered only in optional material.
How to choose the right visualization format for the data at hand.
Python object-oriented design.
Python packaging.
NumPy arrays.
Managing environments and installing Python packages.
Episode overview
Day 1 morning:
Day 1 afternoon:
Day 2 morning:
Day 3 morning:
How to find help and how to navigate the documentation
Credit
When preparing this lesson, we have reused these resources: