15. Assessment 2
15.1. Software Engineering for Reusability
15.2. Background
The value iteration algorithm is a means of determining the optimal policy for a decision process that can be modelled as a Markov Decision Process (MDP). The main objective of this assessment is to implement the value iteration algorithm in Python and host it on GitHub along with supporting material.
Your work should be sufficiently well designed and documented so that the assessors can use your documentation, examples, and package to successfully optimise a small problem modelled as an MDP. The details of this problem are NOT provided as part of the assessment.
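For orientation, value iteration is usually stated as repeated application of the Bellman update. The form below is a sketch written in the notation of the grid world table, where P(s’|s,a) is the transition probability and R(s’|s,a) the reward. The discount factor γ and the iteration index k are assumptions, since this brief does not fix them, and figure 9.16 of Poole and Mackworth may organise the computation differently.

```latex
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s' \mid s, a) + \gamma\, V_k(s') \bigr]
```

Once the values stop changing appreciably between iterations, a policy can be read off by choosing, in each state, an action that attains the maximum.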
15.3. Objectives
The objectives of this assessment are:
Implement the value iteration algorithm in Python
Package the implementation and host it in a GitHub repository
Provide supporting material for your package in the repository
Provide a working example for your implementation using the grid world problem detailed below
Provide pseudocode for your implementation and contrast it with the pseudocode provided in figure 9.16 of Poole and Mackworth
15.4. Suggested Approach
1. Study sections 9.5 and 9.5.1 of Artificial Intelligence: Foundations and Computational Agents, 2nd edition.
2. Implement the value iteration algorithm using the pseudocode in figure 9.16 in section 9.5.2 of Artificial Intelligence: Foundations and Computational Agents, 2nd edition.
3. Study and solve (at least partially) by hand the problem introduced in exercise 9.27 in section 9.5 of Artificial Intelligence: Foundations and Computational Agents, 2nd edition.
4. Check that you can specify and solve exercise 9.27 using your implementation of value iteration.
5. Check that you can specify and solve the grid world problem (detailed below) using your implementation of value iteration.
6. Package your implementation and host it on GitHub.
7. Provide a solution to the grid world problem as a working example in your repository. This should be in the form of a Jupyter notebook.
8. Add supporting information, examples, pseudocode, and documentation to your repository so that another researcher is able to:
   - reuse your implementation of the value iteration algorithm
   - rerun your examples and reproduce your results
   - implement code to represent their own MDP problem and optimise it using your implementation of value iteration
9. Where they exist, comment on any differences between your pseudocode and that provided in figure 9.16 of Poole and Mackworth.
It is highly recommended that you look at similar projects hosted on GitHub and/or PyPI for ideas of how to effectively realise item 6. For example
Try to reproduce some of the examples from these repositories with a view to learning what works well and, importantly, what does not.
The assessment requires that you produce an effective implementation of value iteration. Examples showing how to use it are important and, in the case of the grid world problem, required. However, it is not necessary to study any large-scale problems, or to analyse any results in detail.
15.5. Grid World Problem
In the following table, s is the current state, a is the chosen action, s’ is the state arrived at from state s having applied action a, P(s’|s,a) is the probability of arriving in state s’ from state s having applied action a, and R(s’|s,a) is the reward for arriving in state s’ from state s having applied action a.
| s | a | s’ | P(s’\|s,a) | R(s’\|s,a) |
|---|---|---|---|---|
| TL | R | TR | 0.9 | -1 |
| TL | R | BL | 0.1 | -2 |
| TL | D | BL | 0.9 | -2 |
| TL | D | TR | 0.1 | -1 |
| TR | L | TL | 0.9 | -3/2 |
| TR | L | BR | 0.1 | 10 |
| TR | D | BR | 0.8 | 15 |
| TR | D | TL | 0.2 | -1 |
| BL | R | BR | 0.9 | 20 |
| BL | R | TL | 0.1 | -5/2 |
| BL | U | TL | 0.8 | -1/2 |
| BL | U | BR | 0.2 | 5 |
Note, state BR is terminal.
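Before running any solver it can help to encode the table and check that it is well formed. The following is a minimal sketch of one possible encoding, not a required design: the names `ROWS`, `TRANSITIONS`, `TERMINAL_STATES`, `outgoing_probability` and `expected_immediate_reward` are all hypothetical, and exact fractions are used only so that the checks are free of rounding error.

```python
from collections import defaultdict
from fractions import Fraction

# One (s, a, s', P, R) entry per row of the table above.
# Values are kept as strings so they mirror the table exactly.
ROWS = [
    ("TL", "R", "TR", "0.9", "-1"),
    ("TL", "R", "BL", "0.1", "-2"),
    ("TL", "D", "BL", "0.9", "-2"),
    ("TL", "D", "TR", "0.1", "-1"),
    ("TR", "L", "TL", "0.9", "-3/2"),
    ("TR", "L", "BR", "0.1", "10"),
    ("TR", "D", "BR", "0.8", "15"),
    ("TR", "D", "TL", "0.2", "-1"),
    ("BL", "R", "BR", "0.9", "20"),
    ("BL", "R", "TL", "0.1", "-5/2"),
    ("BL", "U", "TL", "0.8", "-1/2"),
    ("BL", "U", "BR", "0.2", "5"),
]

# Convert probabilities and rewards to exact fractions.
TRANSITIONS = [(s, a, s2, Fraction(p), Fraction(r)) for s, a, s2, p, r in ROWS]

# BR is terminal, so it has no outgoing rows in the table.
TERMINAL_STATES = {"BR"}


def outgoing_probability(transitions):
    """Sum P(s'|s,a) over s' for every (s, a) pair."""
    totals = defaultdict(Fraction)
    for s, a, _s_next, p, _r in transitions:
        totals[(s, a)] += p
    return dict(totals)


def expected_immediate_reward(transitions):
    """Sum P(s'|s,a) * R(s'|s,a) over s' for every (s, a) pair."""
    totals = defaultdict(Fraction)
    for s, a, _s_next, p, r in transitions:
        totals[(s, a)] += p * r
    return dict(totals)


# Every (s, a) pair must define a proper probability distribution.
assert all(total == 1 for total in outgoing_probability(TRANSITIONS).values())
```

With this encoding, `expected_immediate_reward(TRANSITIONS)[("TL", "R")]` evaluates to `Fraction(-11, 10)`, that is 0.9 × (-1) + 0.1 × (-2) = -1.1, which is a quick way to confirm that the rows were entered correctly.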
15.6. Assessment Criteria
Your submission will be assessed according to the following criteria:
Quality of code: the extent to which your code is readable and uses appropriate data structures and design patterns
Documentation: the extent to which the package is well documented
Use of software engineering tools: the extent to which software engineering tools such as version control and unit testing have been appropriately used
Solution: the extent to which the code is reusable for other problems
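As one illustration of the unit testing mentioned in the criteria, the sketch below uses the standard library's `unittest` module to test a small validation helper. The helper `is_valid_distribution` and the test class are hypothetical and serve only to show the shape of a test; they are not part of any required interface.

```python
import unittest


def is_valid_distribution(probabilities, tolerance=1e-9):
    """Return True if all values are non-negative and sum to 1 (hypothetical helper)."""
    non_negative = all(p >= 0 for p in probabilities)
    sums_to_one = abs(sum(probabilities) - 1.0) <= tolerance
    return non_negative and sums_to_one


class TestIsValidDistribution(unittest.TestCase):
    def test_accepts_grid_world_pair(self):
        # P(TR|TL,R) = 0.9 and P(BL|TL,R) = 0.1 in the grid world table.
        self.assertTrue(is_valid_distribution([0.9, 0.1]))

    def test_rejects_probabilities_not_summing_to_one(self):
        self.assertFalse(is_valid_distribution([0.9, 0.2]))

    def test_rejects_negative_probability(self):
        # The values sum to 1 but one of them is negative.
        self.assertFalse(is_valid_distribution([1.5, -0.5]))
```

Saved in a file such as `test_distribution.py` (an assumed name), tests like these can be run with `python -m unittest`.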
15.7. Submitting your work
Formal submission of the work should be made by e-mailing the URL of your GitHub repository to Kim Wilson (k.wilson7@lancaster.ac.uk) by 17:00 on Friday 27th March 2026.
You should also invite Daniel Grose (GitHub name grosed) and Jamie Fairbrother (GitHub name fairbrot) to be collaborators on your GitHub repository by the submission date.
Note, only files committed by the submission date will be considered for assessment unless other prior arrangements have been made.