Data visualization using Python
In this introductory-level workshop, we will learn to produce reproducible data visualization pipelines using the Python programming language and the Vega-Altair declarative visualization library.
We will work in Jupyter Notebooks and start with Python basics so that we can read data from Excel sheets and comma-separated values (CSV) files. We will introduce the pandas library for “data wrangling” (reading, writing, sorting, and filtering of data). Finally, we will learn how to produce and share reproducible plots using Vega-Altair.
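To give a flavor of the “data wrangling” steps named above, here is a minimal sketch rather than actual workshop material. The CSV content, column names, and variable names are made up for illustration, and the data is inlined as text so that the cell runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content (invented for this sketch, not workshop data).
csv_text = """city,year,temperature
Oslo,2020,6.9
Oslo,2021,6.1
Tromso,2020,3.5
Tromso,2021,3.0
"""

# Reading: pd.read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# Filtering: keep only the rows for one year.
data_2021 = data[data["year"] == 2021]

# Sorting: order the remaining rows by temperature, lowest first.
sorted_2021 = data_2021.sort_values("temperature")

# Writing: with no path given, to_csv returns the CSV as a string;
# passing a file name instead would write it to disk.
result_csv = sorted_2021.to_csv(index=False)
print(result_csv)
```

Excel sheets can be read in a similar way with `pd.read_excel`, which needs an additional engine package such as openpyxl for `.xlsx` files.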
Who is the course for?
Somebody starting with Python or curious about Python.
Somebody who needs to read, process, and plot data for their work or studies and would like to try it out with Python.
Persons who already use Python for this but want to learn about libraries to simplify common tasks and about how to share their workflow in a reproducible way.
Preparations
No programming language experience needed; we will start from zero and learn the basics together
Computer with network access
Anaconda installation (Software install instructions)
Installing the package altair
Bring one of your recent plotting tasks or challenges
What is not taught?
Version control. Although very useful, it is outside the scope of this workshop.
Python outside a Jupyter Notebook.
Python sets and tuples are only mentioned.
File input/output is only used via libraries; doing “own” file I/O is only part of the optional material.
How to choose the right visualization format for the data at hand.
Python object-oriented design.
Python packaging.
NumPy arrays.
Managing environments and installing Python packages.
Episodes
35 min
35 min
35 min
45 min
25 min
50 min
20 min
Example notebooks
Generating our first plot
Reading and slicing data with pandas
Customizing plots
Credit
When preparing this lesson, we have reused these resources: