Machine Learning is an immensely broad field. Not only does it raise questions of considerable theoretical importance, but it is also applied to a vast number of practical problems, such as personal identification (there is a whole array of biometric identification techniques), video surveillance, gesture recognition, speech processing (there is much more to it than just speaker identification), data mining, and so on. Let us briefly discuss a familiar problem that illustrates some of the issues that need to be addressed: developing a dynamic signature verification system. Here the signature is recorded with a digitizing tablet, i.e. it is given as a parametric curve. We are all familiar with the fact that our signatures differ every time we sign. Apart from differences in shape, signatures can differ in size, orientation (rotation), position, and so on. We may also want to compare not only the shape of a signature but also pressure, pen tilt and direction, etc. A direct comparison of signatures is clearly not feasible, so one might instead compare models of the signatures. Such a model has to account for the variations between genuine signatures; moreover, the variations themselves may have structure that one can exploit. One is therefore naturally led to consider building a probabilistic model of the signature.

Essentially, one represents a signature with a probability density function. Apart from figuring out how this can be done, it most likely involves a training process in which sample signatures are used to build the representative density function. We are then presented with a new signature, either for verification or for identification. In both cases we need to compare the signature with the probabilistic model, which typically returns a confidence value. In the verification case we must decide whether the confidence value is high enough to accept the signature, so we need to develop some kind of decision theory. We might then realize that wrong decisions can carry different penalties: a forgery can be accepted, or a genuine signature rejected. Commercial banks do not want to alienate their customers by rejecting their signatures; they would rather accept a few more forgeries. This asymmetry should be built into our decision theory. The identification problem is essentially a classification problem, where a given signature needs to be assigned to a specific class, namely the correct signatory or, of course, the class of unknown signatures. These are just some of the issues that we hope to address in this module.
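The asymmetric-penalty idea above can be sketched as a minimal expected-loss decision rule. This is only an illustration, not the module's prescribed method: the function name and the loss values are invented, and the confidence value is assumed to be a probability that the signature is genuine.

```python
def decide(p_genuine, loss_accept_forgery=1.0, loss_reject_genuine=5.0):
    """Return 'accept' or 'reject' for a signature.

    p_genuine: the model's confidence that the signature is genuine (0..1).
    The loss values are hypothetical; making loss_reject_genuine larger
    encodes the bank's preference: rejecting a genuine customer is
    considered worse than accepting the occasional forgery.
    """
    p_forgery = 1.0 - p_genuine
    expected_loss_accept = p_forgery * loss_accept_forgery
    expected_loss_reject = p_genuine * loss_reject_genuine
    return "accept" if expected_loss_accept < expected_loss_reject else "reject"
```

With these losses the rule accepts whenever the confidence exceeds 1/6, i.e. the acceptance threshold moves down as rejecting genuine signatures becomes more costly.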


Homework     Due date            Selected solutions
Homework 1   25 February 2011    Solutions 1 (amended)
Homework 2   4 March 2011        Solutions 2
Homework 3   11 March 2011       Solutions 3
Homework 4   29 April 2011       Solutions 4


Test 1   Solutions
Test 2   Solutions


Project     Additional material                          Date due
Project 1                                                6 May 2011
Project 2   Signature Samples 1, Signature Samples 2     27 May 2011


Stefan's presentations: a brief introduction to Python and a quick peek at SciPy.

Python(x,y) is a one-click installer with Python, NumPy, SciPy, Matplotlib and Mayavi. You can grab it here.
Enthought's Enstaller Instructions (we are waiting for sun.ac.za to update the mirror -- use Python(x,y) in the meantime)



How to ask questions (on a mailing list) so you're likely to get satisfactory answers. The document is perhaps a little terse, but still offers good advice.

Announcement: Your final test is on Thursday, 19 May, in room A409, from 09:00 to 12:00. Part I (closed book, 1 hour) covers basic probability; you need to know the Bernoulli, binomial and Gaussian probability distributions. Part II (open book, 2 hours) covers the K-means algorithm, Gaussian mixture models, and HMMs. I will give you data and ask you to compute a few steps of the EM algorithm for both K-means and Gaussian mixture models. I will also give you the parameters of a trained, discrete HMM, i.e. the transition probabilities and the observation probabilities for each state. Given an observation sequence, you are then required to calculate the optimal state sequence using the Viterbi algorithm.
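As a study aid, here is a minimal sketch of the Viterbi computation described above, for a discrete HMM with transition matrix A, observation matrix B and initial distribution pi. The two-state toy parameters at the bottom are invented for illustration; they are not exam data.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Return the optimal (most probable) state sequence for `obs`.

    A[i, j] : probability of moving from state i to state j
    B[j, k] : probability of observing symbol k in state j
    pi[i]   : probability of starting in state i
    obs     : sequence of observed symbol indices
    """
    n_states = A.shape[0]
    T = len(obs)
    delta = np.zeros((T, n_states))            # best path probability ending in each state
    psi = np.zeros((T, n_states), dtype=int)   # back-pointers to the best predecessor
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        for j in range(n_states):
            scores = delta[t - 1] * A[:, j]
            psi[t, j] = np.argmax(scores)
            delta[t, j] = scores[psi[t, j]] * B[j, obs[t]]
    # Backtrack from the most probable final state.
    states = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        states.append(int(psi[t, states[-1]]))
    return states[::-1]

# Toy two-state, two-symbol example (illustrative parameters only).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
print(viterbi(A, B, pi, [0, 0, 1]))  # → [0, 0, 1]
```

In an exam setting the same recursion is done by hand: fill in the delta table column by column, record the back-pointers, then read the state sequence off backwards from the largest final delta.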
Classes: Tuesdays 09:00 (A306), Wednesdays 11:00 (A407), Fridays 12:00 (A306)
Mondays 14:00 to 15:00 (A308)


The textbook that we use is Chris Bishop's Pattern Recognition and Machine Learning. You will find more information about the book on his website.

Additional Reading

External Links