Data structures

Sorting again

A list, \(X\) say, can be sorted whenever python has access to a way of evaluating the expression \(a < b\) where \(a,b \in X\). The “ordering” \(<\) should be transative and total.

X = ["edsger","dijkstra","donald","knuth","ada","lovelace"]
print(X)
X.sort()
print(X)
['edsger', 'dijkstra', 'donald', 'knuth', 'ada', 'lovelace']
['ada', 'dijkstra', 'donald', 'edsger', 'knuth', 'lovelace']

Exercise 1

Experiment with bisect and the list X. Does it work as you would expect ?

from bisect import bisect_right as br
from bisect import bisect_left as bl
Y = [1,2,3,4,5]
print(bl(X,"knuti"))
print(bl(Y,2.9))
5
2

Sorting and tuples

It is not always clear at first what the \(<\) operator is that is being used by sort. For examples, how does \(<\) get applied to tuples ? If you are not sure what a tuple is, have a look at the chapter “Python tuples from scratch !!!”. Bear in mind, tuples are immutable. To see this, have a look at the following code.

A = ("fred",[1,2,3,4])
print(A[0])
print(A[1])
A[0] = ("mildred",3*[1,2]) 
fred
[1, 2, 3, 4]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-4eb3bcb6e7db> in <module>
      2 print(A[0])
      3 print(A[1])
----> 4 A[0] = ("mildred",3*[1,2])

TypeError: 'tuple' object does not support item assignment

Note that a useful way of accessing elements from a tuple, particularly when you are only interested in some of the values, is demonstrated below.

X = ("a",[1,2,3],3.15,("fred",True))
_,_,pi,_ = X
print(pi)
3.15

Exercise 2

Does the following code work as you would expect ? Can you describe how the tuples are being sorted ? Experiment with some tuples and \(<\) to find out.

X = [("knuth","donald","us",1938),("dijkstra","edsger","nl",1930),("lovelace","ada","uk",1815)]
X.sort()
X
[('dijkstra', 'edsger', 'nl', 1930),
 ('knuth', 'donald', 'us', 1938),
 ('lovelace', 'ada', 'uk', 1815)]
A = (1,3,7)
B = (2,5,9)
A < B
True
B = (1,2,9)
A < B
False

Exercise 3

Can you describe what this function does ? What is the expected input data structure ? If you are not sure, have a look

def first_name(x) :
    _,first,_,_ = x
    return first

Exercise 4

Can you predict the output of the following code ?

X = [("knuth","donald","us",1938),("dijkstra","edsger","nl",1930),("lovelace","ada","uk",1815)]
X.sort(key=first_name)
X
[('lovelace', 'ada', 'uk', 1815),
 ('knuth', 'donald', 'us', 1938),
 ('dijkstra', 'edsger', 'nl', 1930)]

Exercise 5

Can bisect employ a key ?

Lists again

Getting a value by its position in a list (indexing) is an \(\mathcal{O}(1)\) operation.

Exercise 6

Use help to find out what the list.index function does. What is self ?

help("list.index")
Help on method_descriptor in list:

list.index = index(self, value, start=0, stop=9223372036854775807, /)
    Return first index of value.
    
    Raises ValueError if the value is not present.

self refers to the value of the variable on the left hand side of the .

Exercise 7

What is the complexity of index for the list data structure ? Write some experimental code to find out (the list does not have to contain strings).

Concatenating and appending lists is \(\mathcal{O}(1)\). To insert an item within a list you can use insert.

Use help to find out how to use insert. What is the compexity of insert ?

Exercise 8

Given the list

X = [("knuth","donald","us",1938),("dijkstra","edsger","nl",1930),("lovelace","ada","uk",1815)]

How would you look up dijkstra and find his first name ?

So - lists are undoubtedly a useful data structure - but they have some limitations. Can you list some things that lists are good at and some things they are not.

Dictionaries

Dictionaries are sometimes refered to as maps, or hash maps, or associative arrays. Here is an example of how to create and insert items in a dictionary.

cumbric_english = {} # an empty dictionary
cumbric_english["yan"] = "one"
cumbric_english["tethra"] = "three"
cumbric_english["blaen"] = "end, point, summit; source of river"
To find or retrieve a value ..
cumbric_english["yan"]
'one'
Dictionaries map keys to values. They are inhomogenous.
dic = {}
dic["IB12047842"] = (107.82,"XH",("Fraud",True))
dic[("a",1)] = [1,2,3,4]

Exercise 9

Can dictionaries have duplicate keys ? Can they have duplicate values ? Are dictionaries functions ? (Explain your answer). Can you sort a dictionary ?

Exercise 10

What is the complexity of adding an element to a dictionary. What is the complexity of retreiving a value from a dictionary using a key ?

You can use strings, tuples, integers and real valued numbers for keys in a dictionary. Can you use a list ? If not, what do you think it is about lists that means they cannot be used as a key in a dictionary ?

Give some example use cases for a dictionary that are relevant to statistics.
What is the complexity of your algorithm in terms of the k ?