Python basics

Objectives

  • Knowing what types exist

  • Knowing the most common data structures (collections): lists, tuples, dictionaries, and sets

  • Creating and using functions

  • Knowing what a library is

  • Knowing what import does

  • Being able to “read” an error

Motivation for Python

  • Free

  • Huge ecosystem of examples, libraries, and tools

  • Relatively easy to read and understand

  • Similar in scope and use cases to R, Julia, and Matlab

Basic types

# int
num_measurements = 13

# float
some_fraction = 0.25

# string
name = "Bruce Wayne"

# bool
value_is_missing = False
skip_verification = True

# we can print values
print(name)

# and we can do arithmetics with ints and floats
print(5 * num_measurements)
print(1.0 - some_fraction)
  • Python is dynamically typed: We do not have to define that an integer is an int, we can use it this way and Python will infer it.

  • However, one can use type annotations in Python (see also mypy).

  • Now you also know that we can add # comments to our code.

Data structures for collections: lists, dictionaries, sets, and tuples

# lists are good when order is important
scores = [13, 5, 2, 3, 4, 3]

# first element
print(scores[0])

# we can add items to lists
scores.append(4)

# lists can be sorted
scores.sort()
print(scores)

# dictionaries are useful if you want to look up
# elements in a collection by something else than position
experiment = {"location": "Svalbard", "date": "2021-03-23", "num_measurements": 23}

print(experiment["date"])

# we can add items to dictionaries
experiment["instrument"] = "a particular brand"
print(experiment)

if "instrument" in experiment:
    print("yes, the dictionary 'experiment' contains the key 'instrument'")
else:
    print("no, it doesn't")
  • Lists are good when order is important, and it needs to be changed

  • Dictionaries are mappings key→value.

  • Sets are useful for unordered collections where you want to make sure that there are no repetitions.

  • There are also tuples that are similar to lists but their items cannot be modified.

You can put:

  • dictionaries inside lists

  • lists inside dictionaries

  • dictionaries inside dictionaries

  • lists inside lists

  • tuples inside …

Iterating over collections

Often we wish to iterate over collections.

Iterating over a list:

scores = [13, 5, 2, 3, 4, 3]

for score in scores:
    print(score)

# example with f-strings
for score in scores:
    print(f"the score is {score}")

We don’t have to call the variable inside the for-loop “score”. This is up to us. We can do this instead (but is this more understandable for humans?):

scores = [13, 5, 2, 3, 4, 3]

for x in scores:
    print(x)

Iterating over a dictionary:

experiment = {"location": "Svalbard", "date": "2021-03-23", "num_measurements": 23}

for key in experiment:
    print(experiment[key])

# another way to iterate
for (key, value) in experiment.items():
    print(key, value)

Functions

  • Functions are like reusable recipes. They receive ingredients (input arguments), then inside the function we do/compute something with these arguments, and they return a result.

    def add(a, b):
        result = a + b
        return result
    
  • Together we can write a function which sums all elements in a list (this is not the most elegant way to do this but it is easier at this point):

    def add_all_elements(sequence):
        """
        This function adds all elements.
        This here is a docstring, a documentation string for a function.
        """
        s = 0.0
        for element in sequence:
            s += element
        return s
    
    
    measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    
    print(add_all_elements(measurements))
    
  • We reuse this function to write a function which computes the mean:

    def arithmetic_mean(sequence):
        # we are reusing add_all_elements written above
        s = add_all_elements(sequence)
        n = len(sequence)
        return s / n
    
    
    measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    
    mean = arithmetic_mean(measurements)
    
    print(mean)
    
  • Functions can call other functions. Functions can also get other functions as input arguments.

  • Functions can return more than one thing:

    def uppercase_and_lowercase(text):
        u = text.upper()
        l = text.lower()
        return u, l
    
    
    some_text = "SequenceOfCharacters"
    uppercased_text, lowercased_text = uppercase_and_lowercase(some_text)
    
    print(uppercased_text)
    print(lowercased_text)
    

Why functions?

  • Less repetition

  • Simplify reading and understanding code

  • Simpler re-use of code

  • Make it easier for Python to deallocate memory

Reading error messages

Here we introduce a mistake and we together try to make sense of the traceback:

Example error traceback

Example error traceback. Can you explain the error?

Libraries

We can look at libraries as collections of functions. We can import the libraries/modules and then reuse the functions defined inside these libraries.

Try this:

import numpy

measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

result = numpy.std(measurements)

print(result)

This means numpy contains a function called std which apparently computes the standard deviation (check also its documentation).

Often you see this in tutorials (the module is imported and renamed to a shortcut):

import numpy as np

result = np.std([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

It is possible to create own modules to collect own functions for reuse.

Great resources to learn more

Exercises

Exercise: create a function that computes the standard deviation

  • Arithmetic mean:

    \[\bar{x} = \frac{1}{N} \sum_{i=1}^N x_i\]
  • Standard deviation:

    \[\sqrt{ \frac{1}{N} \sum_{i=1}^N (x_i - \bar{x})^2 }\]
  • In other words the computation is similar but we need to sum over squares of differences and at the end take a square root.

  • Take this as a starting point:

    # we have written this one together previously
    def arithmetic_mean(sequence):
        s = 0.0
        for element in sequence:
            s += element
        n = len(sequence)
        return s / n
    
    
    def standard_deviation(sequence):
        # here we need to do some work:
        # mean = ?
        # s = ?
        n = len(sequence)
        return (s / n) ** 0.5
    
  • If this is the input list:

    measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    

    Then the result would be: 2.872…

Exercise: working with a dictionary

  • We have this dictionary as a starting point:

    grades = {"Alice": 80, "Bob": 95}
    
  • Add the grades of few more (fictious) persons to this dictionary.

  • Print the entire dictionary.

  • What happens when you add a name which already exists (with a different grade)?

  • Print the grade for one particular person only.

  • What happens when you try to print the result for a person that wasn’t there?

  • Try also these:

    print(grades.keys())
    print(grades.values())
    print(grades.items())
    

The exercises below use if-statements.

Optional exercise/ homework: removing duplicates

  • This list contains duplicates:

    measurements = [2, 2, 1, 17, 3, 3, 2, 1, 13, 14, 17, 14, 4]
    
  • Write a function which removes duplicates from the list and sorts the list. In this case it would produce:

    [1, 2, 3, 4, 13, 14, 17]
    

Optional exercise/ homework: counting how often an item appears

  • Back to our list with duplicates:

    measurements = [2, 2, 1, 17, 3, 3, 2, 1, 13, 14, 17, 14, 4]
    
  • Your goal is to write a function which will return a dictionary mapping each number to how often it appears. In this case it would produce:

    {2: 3, 1: 2, 17: 2, 3: 2, 13: 1, 14: 2, 4: 1}