COMPGI13 - Advanced Topics in Machine Learning

This database contains 2017-18 versions of the syllabuses. For current versions please see here.

Note: Whilst every effort is made to keep the syllabus and assessment records correct, the precise details must be checked with the lecturer(s).

Code

COMPGI13 (also taught as COMPM050)

Prerequisites

Linear Algebra, Probability Theory, Calculus

Taught By

Arthur Gretton (50%) and Carlo Ciliberto (50%)


Kernel methods

To gain an understanding of the theory and applications of kernel methods, including:

  • An overview of how kernel feature spaces can be constructed, including in infinite dimensions, and the smoothing properties of functions in these spaces.
  • Simple and complex learning algorithms using kernels (ridge regression, kernel PCA, the support vector machine).
  • Representations of probabilities in reproducing kernel Hilbert spaces. Statistical two-sample and independence tests, and learning algorithms using these embeddings (clustering, ICA)

Learning theory

To learn the fundamentals of statistical learning theory. In particular to:

  • Understand what characterizes a learning problem and what it means for an algorithm/system/machine to “learn”.
  • Understand the key role of regularization and the different approaches to using it efficiently in practice.
  • Acquire familiarity with a variety of statistically consistent learning algorithms, both from modeling and practical perspectives.

Learning Outcomes

To gain in-depth familiarity with the selected research topics and to understand how to design and implement learning algorithms. To be able to read, understand and discuss research papers in the field independently.


Introduction to kernel methods:

  • Definition of a kernel, how it relates to a feature space, the reproducing kernel Hilbert space
  • Simple applications: kernel PCA, kernel ridge regression
  • Distance between means in RKHS, integral probability metrics, the maximum mean discrepancy (MMD), two-sample tests
  • Choice of kernels for distinguishing distributions, characteristic kernels
  • Covariance operator in RKHS: proof of existence, definition of norms (including HSIC, the Hilbert-Schmidt independence criterion)
  • Application of HSIC to independence testing
  • Feature selection, taxonomy discovery
  • Introduction to independent component analysis, kernel ICA
  • Large margin classification, support vector machines for classification
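
As an illustration of the maximum mean discrepancy and two-sample topics listed above, the following is a minimal sketch, not course material. It assumes scalar data, a Gaussian kernel with a fixed bandwidth, and the biased (V-statistic) estimate of the squared MMD; the function names and the synthetic samples are hypothetical choices made for this example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)) on scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared_biased(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    This is the squared RKHS distance between the empirical mean embeddings:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    """
    m, n = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in xs) / (m * m)
    k_yy = sum(gaussian_kernel(a, b, bandwidth) for a in ys for b in ys) / (n * n)
    k_xy = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


# Synthetic data (assumption): two samples from N(0, 1) and one from N(2, 1).
rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]

mmd_same = mmd_squared_biased(same_a, same_b)
mmd_shifted = mmd_squared_biased(same_a, shifted)
print("same distribution:   ", mmd_same)
print("shifted distribution:", mmd_shifted)
```

The estimate for the shifted sample should be clearly larger than for the two samples drawn from the same distribution. A two-sample test would additionally compare the statistic against a threshold, for example one obtained by permuting the pooled sample, which this sketch omits.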

Introduction to supervised learning in the context of statistical learning theory:

  • a taxonomy of learning problems
  • no free lunch theorem
  • regularization
  • model selection
  • stability and generalization
  • measures of complexity for hypothesis spaces
  • sample complexity, generalization bounds
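
To make the regularization and model selection topics above concrete, here is a small sketch, not course material. It assumes one-dimensional ridge (Tikhonov-regularized least squares) regression without an intercept, and selects the regularization parameter by hold-out validation; the data, the candidate grid and all function names are hypothetical.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Minimise (1/n) * sum((y - w * x)^2) + lam * w^2 over the scalar weight w.

    Setting the derivative to zero gives w = sum(x * y) / (sum(x^2) + n * lam).
    """
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * lam)


def mean_squared_error(xs, ys, w):
    """Average squared prediction error of the linear predictor x -> w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_lambda(train, valid, lambdas):
    """Hold-out model selection: fit on train, score on valid, keep the best.

    Returns a tuple (validation error, lambda, weight) with the lowest error.
    """
    results = []
    for lam in lambdas:
        w = fit_ridge_1d(train[0], train[1], lam)
        results.append((mean_squared_error(valid[0], valid[1], w), lam, w))
    return min(results)


# Synthetic data (assumption): y = 2 * x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(60)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
train = (xs[:40], ys[:40])
valid = (xs[40:], ys[40:])
lambdas = [0.0, 0.001, 0.01, 0.1, 1.0, 10.0]

best_error, best_lambda, best_weight = select_lambda(train, valid, lambdas)
print("selected lambda:", best_lambda)
print("fitted weight:  ", best_weight)
print("validation MSE: ", best_error)
```

Larger values of the regularization parameter shrink the fitted weight toward zero, trading variance for bias. The hold-out split is one simple way to choose among them; cross-validation is a common alternative.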

Method of Instruction

Lecture-based teaching using whiteboard and slides.


The course has the following assessment components:

  • Written Examination (50%)
  • Coursework Section (50%)


Reading list available via the UCL Library catalogue.