Appendix - Overview of Python¶
(c) 2019 Steve Phelps
Python is interpreted¶
Python is typically used in conjunction with an interpreter. This way of programming is well suited to developing and testing ideas in Python interactively.
5 + 5
10
Groups of statements are all executed one after the other:
x = 5
y = 'Hello There'
z = 10.5
We can visualize the above code using PythonTutor.
x + 5
10
Assignments versus equations¶
In Python when we write
x = 5
this means something different from an equation \(x=5\). It is an assignment: it binds the name x to the value 5.
Unlike variables in mathematical models, variables in Python can refer to different things as more statements are interpreted.
x = 1
print('The value of x is', x)
x = 2.5
print('Now the value of x is', x)
x = 'hello there'
print('Now it is', x)
The value of x is 1
Now the value of x is 2.5
Now it is hello there
Types¶
Values in Python have an associated type.
If we combine types incorrectly we get an error.
print(y)
Hello There
y + 5
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-b85a2dbb3f6a> in <module>
----> 1 y + 5
TypeError: can only concatenate str (not "int") to str
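One way to avoid this kind of error is to convert one of the values explicitly, so that both operands have the same type. The following is a minimal sketch; the names greeting, combined and total are chosen here purely for illustration.

```python
greeting = 'Hello There'

# Convert the integer to a string before concatenating.
combined = greeting + str(5)
print(combined)

# Convert a numeric string to an integer before adding.
total = int('5') + 5
print(total)
```

Which conversion is appropriate depends on whether we intend to join text or to add numbers.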
The type function¶
We can query the type of a value using the type() function.
type(1)
int
type('hello')
str
type(2.5)
float
type(True)
bool
Null values¶
Sometimes we need to represent “no data” or “not applicable”.
In Python we use the special value None.
This corresponds to null in Java or NULL in SQL.
result = None
When we evaluate a variable whose value is None in the interactive interpreter, no result is printed out.
result
Testing for Null values¶
We can check whether there is a result or not using the is operator:
result is None
True
x = 5
x is None
False
Converting values between types¶
We can convert values between different types.
Converting to floating-point¶
To convert an integer to a floating-point number use the float() function.
x = 1
x
1
type(x)
int
y = float(x)
y
1.0
Converting to integers¶
To convert a floating-point number to an integer use the int() function.
type(y)
float
int(y)
1
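Note that int() discards the fractional part rather than rounding. The short sketch below contrasts it with the built-in round() function; the variable names are illustrative only.

```python
truncated_positive = int(2.7)    # the fractional part is discarded
truncated_negative = int(-2.7)   # truncation is towards zero, not downwards
rounded = round(2.7)             # rounds to the nearest integer

print(truncated_positive, truncated_negative, rounded)
```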
Variables are not typed¶
Variables themselves, on the other hand, do not have a fixed type.
It is only the values that they refer to that have a type.
This means that the type referred to by a variable can change as more statements are interpreted.
y = 'hello'
print('The type of the value referred to by y is', type(y))
y = 5.0
print('And now the type of the value is', type(y))
The type of the value referred to by y is <class 'str'>
And now the type of the value is <class 'float'>
Polymorphism¶
The meaning of an operator depends on the types we are applying it to.
1 + 1
2
'a' + 'b'
'ab'
'1' + '1'
'11'
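The same applies to other operators. As a small illustrative sketch (the variable names are assumptions made for this example), the * operator multiplies numbers but repeats strings, and + concatenates lists:

```python
product = 3 * 4          # integer multiplication
repeated = 'ab' * 3      # string repetition
joined = [1, 2] + [3]    # list concatenation

print(product)
print(repeated)
print(joined)
```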
Conditional Statements and Indentation¶
The syntax for control structures in Python uses colons and indentation.
Beware that white-space affects the semantics of Python code.
Statements that are indented by the same amount are grouped together into a block.
if statements¶
x = 5
if x > 0:
    print('x is strictly positive.')
    print(x)
print('finished.')
x is strictly positive.
5
finished.
Visualize the above on PythonTutor.
Changing indentation¶
x = 0
if x > 0:
    print('x is strictly positive.')
print(x)
print('finished.')
0
finished.
Visualize the above on PythonTutor.
if and else¶
x = 0
print('Starting.')
if x > 0:
    print('x is strictly positive.')
else:
    if x < 0:
        print('x is strictly negative.')
    else:
        print('x is zero.')
print('finished.')
Starting.
x is zero.
finished.
Visualize the above on PythonTutor.
elif¶
print('Starting.')
if x > 0:
    print('x is strictly positive')
elif x < 0:
    print('x is strictly negative')
else:
    print('x is zero')
print('finished.')
Starting.
x is zero
finished.
Lists¶
We can use lists to hold an ordered sequence of values.
l = ['first', 'second', 'third']
l
['first', 'second', 'third']
Lists can contain values of different types, even in the same list.
another_list = ['first', 'second', 'third', 1, 2, 3]
another_list
['first', 'second', 'third', 1, 2, 3]
Mutable Data Structures¶
Lists are mutable; their contents can change as more statements are interpreted.
l.append('fourth')
l
['first', 'second', 'third', 'fourth']
References¶
Whenever we bind a variable to a value in Python we create a reference.
A reference is distinct from the value that it refers to.
Variables are names for references.
X = [1, 2, 3]
Y = X
Side effects¶
The above code creates two different references (named X and Y) to the same value [1, 2, 3].
Because lists are mutable, changing them can have side effects on other variables.
If we append something to X, what will happen to Y?
X.append(4)
X
[1, 2, 3, 4]
Y
[1, 2, 3, 4]
Visualize the above on PythonTutor.
State and identity¶
The state referred to by a variable is different from its identity.
To compare state use the == operator.
To compare identity use the is operator.
When we compare identity we check equality of references.
When we compare state we check equality of values.
Example¶
We will create two different lists, with two associated variables.
X = [1, 2]
Y = [1]
Y.append(2)
Visualize the above code on PythonTutor.
Comparing identity¶
X is Y
False
Copying data prevents side effects¶
In this example, because we have two different lists, we avoid side effects:
Y.append(3)
X
[1, 2]
X == Y
False
X is Y
False
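When we start from an existing list, rather than building a second list from scratch, we can make an explicit copy. The following is a minimal sketch using the list method copy(); the names original and duplicate are chosen for illustration.

```python
original = [1, 2, 3]
duplicate = original.copy()   # a new list containing the same elements

duplicate.append(4)           # only the copy is changed

print(original)
print(duplicate)
print(original is duplicate)
```

The copy is shallow: the new list is a distinct object, but any mutable elements inside it would still be shared with the original.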
Iteration¶
We can iterate over each element of a list in turn using a for loop:
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print(i)
first
second
third
fourth
Visualize the above on PythonTutor.
Including more than one statement inside the loop¶
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print("The next item is:")
    print(i)
    print()
The next item is:
first
The next item is:
second
The next item is:
third
The next item is:
fourth
Visualize the above code on PythonTutor.
Looping a specified number of times¶
To perform a statement a certain number of times, we can iterate over a list of the required size.
for i in [0, 1, 2, 3]:
    print("Hello!")
Hello!
Hello!
Hello!
Hello!
The range function¶
To save us from having to write the numbers out manually, we can use the function range() to count for us.
We count starting at 0 (as in Java and C++).
list(range(4))
[0, 1, 2, 3]
for loops with the range function¶
for i in range(4):
    print("Hello!")
Hello!
Hello!
Hello!
Hello!
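The range() function also accepts an optional starting value and step. The sketch below illustrates the three forms; the variable names are assumptions made for this example.

```python
from_two = list(range(2, 5))         # start (inclusive) and stop (exclusive)
evens = list(range(0, 10, 2))        # start, stop and step
countdown = list(range(3, 0, -1))    # a negative step counts downwards

print(from_two)
print(evens)
print(countdown)
```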
List Indexing¶
Lists can be indexed using square brackets to retrieve the element stored in a particular position.
my_list
['first', 'second', 'third', 'fourth']
my_list[0]
'first'
my_list[1]
'second'
List Slicing¶
We can also specify a range of positions.
This is called slicing.
The example below indexes from position 0 (inclusive) to 2 (exclusive).
my_list[0:2]
['first', 'second']
Indexing from the start or end¶
If we leave out the starting index it implies the beginning of the list:
my_list[:2]
['first', 'second']
If we leave out the final index it implies the end of the list:
my_list[2:]
['third', 'fourth']
Copying a list¶
We can conveniently copy a list by slicing from start to end:
new_list = my_list[:]
new_list
['first', 'second', 'third', 'fourth']
new_list is my_list
False
new_list == my_list
True
Negative Indexing¶
Negative indices count from the end of the list:
my_list[-1]
'fourth'
my_list[:-1]
['first', 'second', 'third']
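Slices can also take a third component, the step. As a brief sketch (the list items and the other names below are illustrative, not part of the examples above):

```python
items = ['first', 'second', 'third', 'fourth']

every_other = items[::2]      # every second element, starting at position 0
reversed_items = items[::-1]  # a step of -1 walks the list backwards
last_two = items[-2:]         # negative start index combined with an open end

print(every_other)
print(reversed_items)
print(last_two)
```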
Collections¶
Lists are an example of a collection.
A collection is a type of value that can contain other values.
There are other collection types in Python:
tuple
set
dict
Tuples¶
Tuples are another way to combine different values.
The combined values can be of different types.
Like lists, they have a well-defined ordering and can be indexed.
To create a tuple in Python, use round brackets instead of square brackets:
tuple1 = (50, 'hello')
tuple1
(50, 'hello')
tuple1[0]
50
type(tuple1)
tuple
Tuples are immutable¶
Unlike lists, tuples are immutable. Once we have created a tuple we cannot add values to it.
tuple1.append(2)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-64-46e3866e32ee> in <module>
----> 1 tuple1.append(2)
AttributeError: 'tuple' object has no attribute 'append'
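Although we cannot modify a tuple in place, we can build a new tuple from an existing one, and we can unpack its values into separate variables. A minimal sketch, with names chosen for illustration:

```python
pair = (50, 'hello')

# Concatenation creates a new tuple; (2,) is a one-element tuple.
extended = pair + (2,)

# Unpacking assigns each element to its own variable.
number, text = pair

print(pair)
print(extended)
print(number, text)
```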
Sets¶
Lists can contain duplicate values.
A set, in contrast, contains no duplicates.
Sets can be created from lists using the set() function.
X = set([1, 2, 3, 3, 4])
X
{1, 2, 3, 4}
type(X)
set
Alternatively we can write a set literal using the { and } brackets.
X = {1, 2, 3, 4}
type(X)
set
Sets are mutable¶
Sets are mutable like lists:
X.add(5)
X
{1, 2, 3, 4, 5}
Duplicates are automatically removed:
X.add(5)
X
{1, 2, 3, 4, 5}
Sets are unordered¶
Sets do not have an ordering.
Therefore we cannot index or slice them:
X[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-70-19c40ecbd036> in <module>
----> 1 X[0]
TypeError: 'set' object is not subscriptable
Operations on sets¶
Union \(X \cup Y\):
X = {1, 2, 3}
Y = {4, 5, 6}
X | Y
{1, 2, 3, 4, 5, 6}
Intersection \(X \cap Y\):
X = {1, 2, 3, 4}
Y = {3, 4, 5}
X & Y
{3, 4}
Difference \(X - Y\):
X - Y
{1, 2}
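Sets support further operations beyond those shown above. The sketch below illustrates the symmetric difference, membership and subset tests; the result names are assumptions made for this example.

```python
X = {1, 2, 3, 4}
Y = {3, 4, 5}

sym_diff = X ^ Y          # elements in exactly one of the two sets
is_member = 3 in X        # membership test
is_subset = {1, 2} <= X   # subset test

print(sym_diff)
print(is_member)
print(is_subset)
```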
Dictionaries¶
A dictionary contains a mapping between keys and corresponding values.
Mathematically it is a function with a finite domain: each key maps to exactly one value, although different keys may map to the same value.
Given a key, we can very quickly look up the corresponding value.
The values can be any type (and need not all be of the same type).
Keys can be any hashable type; this includes immutable built-in types such as integers, strings and tuples of hashable values.
The built-in type for dictionaries is called dict.
In other programming languages they are sometimes called associative arrays.
Creating a dictionary¶
A dictionary contains a set of key-value pairs.
To create a dictionary:
students = { 107564: 'Xu', 108745: 'Ian', 102567: 'Steve' }
The above initialises the dictionary students so that it contains three key-value pairs.
The keys are the student id numbers (integers).
The values are the names of the students (strings).
Although we use the same brackets as for sets, this is a different type of collection:
type(students)
dict
Accessing the values in a dictionary¶
We can access the value corresponding to a given key using the same square-bracket syntax that we use to access particular elements of a list:
students[108745]
'Ian'
Accessing a non-existent key will raise a KeyError:
students[123]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-77-26e887eb0296> in <module>
----> 1 students[123]
KeyError: 123
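If we are not sure whether a key is present, we can test for it using the in operator, or use the dictionary method get(), which returns a default value instead of raising an error. A minimal sketch follows; the default 'unknown' is an arbitrary choice for this example.

```python
students = {107564: 'Xu', 108745: 'Ian', 102567: 'Steve'}

has_key = 123 in students                  # membership test on the keys
missing = students.get(123)                # None when the key is absent
fallback = students.get(123, 'unknown')    # an explicit default value

print(has_key)
print(missing)
print(fallback)
```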
Updating dictionary entries¶
Dictionaries are mutable, so we can update the mapping:
students[108745] = 'Fred'
print(students[108745])
Fred
We can also grow the dictionary by adding new keys:
students[104587] = 'John'
print(students[104587])
John
Dictionary keys can be any immutable type¶
We can use any immutable (hashable) type for the keys of a dictionary.
For example, we can map names onto integers:
age = { 'John':21, 'Steve':47, 'Xu': 22 }
age['Steve']
47
Creating an empty dictionary¶
We often want to initialise a dictionary with no keys or values.
To do this, call the function dict() (or write the empty literal {}):
result = dict()
We can then progressively add entries to the dictionary, e.g. using iteration:
for i in range(5):
    result[i] = i**2
print(result)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Iterating over a dictionary¶
We can use a for loop with dictionaries, just as we can with other collections such as sets.
When we iterate over a dictionary, we iterate over the keys.
We can then perform some computation on each key inside the loop.
Typically we will also access the corresponding value.
for id in students:
    print(students[id])
Xu
Fred
Steve
John
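When we need both the key and the value, the dictionary method items() yields them as pairs, which avoids a separate lookup inside the loop. The sketch below rebuilds the students dictionary so that it is self-contained; the loop variable names are illustrative.

```python
students = {107564: 'Xu', 108745: 'Fred', 102567: 'Steve', 104587: 'John'}

# Each iteration unpacks one (key, value) pair.
for student_id, name in students.items():
    print(student_id, name)

# values() gives the values on their own.
names = list(students.values())
print(names)
```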
The size of a collection¶
We can count the number of values in a collection using the len (length) function.
This can be used with any type of collection (list, set, tuple etc.).
len(students)
4
len(['one', 'two'])
2
len({'one', 'two', 'three'})
3
Arrays¶
Python also has arrays, which contain values of a single type; i.e. we cannot have different types of value within the same array.
Arrays are mutable like lists; we can modify the existing elements of an array.
However, we typically do not change the size of the array; i.e. it has a fixed length.
The numpy module¶
Arrays are provided by a separate module called numpy. Modules correspond to packages in e.g. Java.
We can import the module and then give it a shorter alias.
import numpy as np
We can now use the functions defined in this package by prefixing them with np followed by a dot.
The function array() creates an array given a list.
Creating an array¶
We can create an array from a list by using the array() function defined in the numpy module:
x = np.array([0, 1, 2, 3, 4])
x
array([0, 1, 2, 3, 4])
type(x)
numpy.ndarray
Functions over arrays¶
When we use arithmetic operators on arrays, we create a new array with the result of applying the operator to each element.
y = x * 2
y
array([0, 2, 4, 6, 8])
The same goes for functions:
x = np.array([-1, 2, 3, -4])
y = abs(x)
y
array([1, 2, 3, 4])
Populating Arrays¶
To populate an array with a range of values we use the np.arange() function:
x = np.arange(0, 10)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
We can also use floating point increments.
x = np.arange(0, 1, 0.1)
x
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
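With a floating-point increment, rounding can make it hard to predict exactly how many elements np.arange() will produce. An alternative is np.linspace(), which takes the number of points rather than the step. The following is a sketch under that assumption; the names points and half_open are illustrative.

```python
import numpy as np

# 11 evenly spaced points from 0 to 1, with both endpoints included.
points = np.linspace(0, 1, 11)
print(points)

# Excluding the endpoint gives the same spacing as a step of 0.1.
half_open = np.linspace(0, 1, 10, endpoint=False)
print(half_open)
```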
Basic Plotting¶
We will use a module called matplotlib to plot some simple graphs.
This module provides functions which are very similar to MATLAB plotting commands.
import matplotlib.pyplot as plt
y = x*2 + 5
plt.plot(x, y)
plt.show()
Plotting a sine curve¶
from numpy import pi, sin
x = np.arange(0, 2*pi, 0.01)
y = sin(x)
plt.plot(x, y)
plt.show()
Plotting a histogram¶
We can use the hist() function in matplotlib to plot a histogram:
# Generate some random data
data = np.random.randn(1000)
# hist() returns the bin counts, the bin edges and the drawn patches
counts, bin_edges, patches = plt.hist(data)
plt.show()
Computing histograms as matrices¶
The function histogram() in the numpy module will count frequencies into bins and return the result as a pair of 1-dimensional arrays: the count in each bin, followed by the bin edges.
np.histogram(data)
(array([ 14, 41, 128, 178, 243, 203, 109, 66, 14, 4]),
array([-2.81515826, -2.19564948, -1.57614071, -0.95663193, -0.33712315,
0.28238562, 0.9018944 , 1.52140318, 2.14091195, 2.76042073,
3.3799295 ]))
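The two returned arrays can be unpacked into separate variables. The sketch below uses a seeded random generator so that it is reproducible; the seed and the names counts and bin_edges are assumptions made for this example, and the data will differ from the output shown above.

```python
import numpy as np

# A seeded generator, so that repeated runs give the same data.
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

counts, bin_edges = np.histogram(data)

print(counts.shape)      # one count per bin
print(bin_edges.shape)   # one more edge than there are bins
print(counts.sum())      # every data point falls into some bin
```

By default the data is divided into 10 bins spanning the minimum to the maximum value, which is why there are 11 bin edges.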
Defining new functions¶
def squared(x):
    return x ** 2
squared(5)
25
Local Variables¶
Variables created inside functions are local to that function.
They are not accessible to code outside of that function.
def squared(x):
    temp = x ** 2
    return temp
squared(5)
25
temp
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-102-da77557ed0c8> in <module>
----> 1 temp
NameError: name 'temp' is not defined
Functional Programming¶
Functions are first-class citizens in Python.
They can be passed around just like any other value.
squared
<function __main__.squared(x)>
y = squared
y
<function __main__.squared(x)>
y(5)
25
Mapping the elements of a collection¶
We can apply a function to each element of a collection using the built-in function map().
This will work with any collection: list, set, tuple or string.
It takes as arguments another function and the collection we want to apply it to.
In Python 3 it returns an iterator over the results, which we can convert to a list using list().
list(map(squared, [1, 2, 3, 4]))
[1, 4, 9, 16]
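The object returned by map() produces its results on demand and can only be consumed once. The following sketch illustrates this behaviour; it redefines squared() so that it is self-contained, and the other names are illustrative.

```python
def squared(x):
    return x ** 2

mapped = map(squared, [1, 2, 3, 4])

first = next(mapped)        # computes only the first result
remaining = list(mapped)    # consumes the rest
exhausted = list(mapped)    # nothing is left the second time

print(first)
print(remaining)
print(exhausted)
```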
List Comprehensions¶
Because this is such a common operation, Python has a special syntax to do the same thing, called a list comprehension.
[squared(i) for i in [1, 2, 3, 4]]
[1, 4, 9, 16]
If we want a set instead of a list we can use a set comprehension:
{squared(i) for i in [1, 2, 3, 4]}
{1, 4, 9, 16}
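The same idea extends to dictionaries. As a brief sketch, a dictionary comprehension can build a mapping from each number to its square in a single expression; the name squares is chosen for illustration.

```python
# Each key i is mapped to the value i ** 2.
squares = {i: i ** 2 for i in range(5)}
print(squares)
```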
Cartesian product using list comprehensions¶
The Cartesian product of two collections \(X = A \times B\) can be expressed by using multiple for clauses in a comprehension.
Example¶
A = {'x', 'y', 'z'}
B = {1, 2, 3}
{(a,b) for a in A for b in B}
{('x', 1),
('x', 2),
('x', 3),
('y', 1),
('y', 2),
('y', 3),
('z', 1),
('z', 2),
('z', 3)}
Cartesian products with other collections¶
The syntax for Cartesian products can be used with any collection type.
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')
[(first_name, surname) for first_name in first_names for surname in surnames]
[('Steve', 'Smith'),
('Steve', 'Doe'),
('Steve', 'Rabbit'),
('John', 'Smith'),
('John', 'Doe'),
('John', 'Rabbit'),
('Peter', 'Smith'),
('Peter', 'Doe'),
('Peter', 'Rabbit')]
Joining collections using a zip¶
The Cartesian product pairs every combination of elements.
If we want a 1-1 pairing we use an operation called a zip.
A zip pairs values at the same position in each sequence.
Therefore:
it is only meaningful for ordered sequences (a set has no defined positions); and
the sequences should be of the same length, since zip() stops as soon as the shortest one is exhausted.
list(zip(first_names, surnames))
[('Steve', 'Smith'), ('John', 'Doe'), ('Peter', 'Rabbit')]
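The sketch below shows what happens when the sequences differ in length, and how the resulting pairs can be turned into a dictionary. The shortened surnames tuple and the names pairs and lookup are assumptions made for this example.

```python
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe')   # deliberately one element shorter

# zip() stops at the end of the shorter sequence, so 'Peter' is dropped.
pairs = list(zip(first_names, surnames))
print(pairs)

# A sequence of pairs can be converted directly into a dictionary.
lookup = dict(zip(first_names, surnames))
print(lookup)
```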
Anonymous Function Literals¶
We can also write anonymous functions.
These are function literals, and do not necessarily have a name.
They are called lambda expressions (after the \(\lambda\)-calculus).
list(map(lambda x: x ** 2, [1, 2, 3, 4]))
[1, 4, 9, 16]
Filtering data¶
We can filter a list by applying a predicate to each element of the list.
A predicate is a function which takes a single argument, and returns a boolean value.
filter(p, X) corresponds to \(\{ x \in X : p(x) \}\) in set-builder notation: it keeps those elements of X for which the predicate p is true.
list(filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1]))
[2, 3, 1]
We can use both filter() and map() on other collections such as strings or sets.
list(filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1}))
[1, 2, 3]
Filtering using a list comprehension¶
Again, because this is such a common operation, we can use simpler syntax to say the same thing.
We can express a filter using a list comprehension by using the keyword if:
data = [-5, 2, 3, -10, 0, 1]
[x for x in data if x > 0]
[2, 3, 1]
We can also filter and then map in the same expression:
from numpy import sqrt
[sqrt(x) for x in data if x > 0]
[1.4142135623730951, 1.7320508075688772, 1.0]
The reduce function¶
The reduce() function cumulatively applies a two-argument function to the elements of a list, from left to right, so as to reduce the list to a single return value.
from functools import reduce
reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])
15
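To make the evaluation order concrete, the sketch below writes the same reduction as an explicit loop, and also shows the optional third argument to reduce(), which supplies an initial value. The variable names are illustrative.

```python
from functools import reduce

values = [0, 1, 2, 3, 4, 5]

# The loop that reduce() performs, written out by hand.
accumulator = 0
for value in values:
    accumulator = accumulator + value

# The third argument is the initial value of the accumulator.
total = reduce(lambda x, y: x + y, values, 0)

# With an initial value, reducing an empty list is well defined.
empty_total = reduce(lambda x, y: x + y, [], 0)

print(accumulator, total, empty_total)
```

Without an initial value, calling reduce() on an empty list raises a TypeError.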
Big Data¶
The map() and reduce() functions form the basis of the map-reduce programming model.
Map-reduce is the basis of modern highly-distributed large-scale computing frameworks.
It is used in BigTable, Hadoop and Apache Spark.
See these examples in Python for Apache Spark.
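As a purely local illustration of the idea (not how a distributed framework is implemented), the sketch below counts the total number of words in a few documents by mapping each document to a word count and then reducing the counts to a single total. The documents list and all names are hypothetical.

```python
from functools import reduce

documents = ['the cat sat', 'the dog sat', 'the end']

# Map step: transform each document independently into a word count.
word_counts = map(lambda doc: len(doc.split()), documents)

# Reduce step: combine the individual counts into a single total.
total_words = reduce(lambda x, y: x + y, word_counts, 0)

print(total_words)
```

Because each document is processed independently in the map step, that work could in principle be spread across many machines before the results are combined.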