COMPMI22 - Advanced Deep Learning and Reinforcement Learning

This database contains 2017-18 versions of the syllabuses. For current versions please see here.

Code: COMPMI22 (also taught as COMPGI22)
Year: 4 (Masters)

The prerequisites are probability, calculus, linear algebra, COMPM055 Supervised Learning and COMPM056 Graphical Models.

In order to successfully complete the coursework for this module, students will require excellent coding skills in Python.

Taught By

Thore Graepel (50%)
Hado van Hasselt (50%)

The course is taught in collaboration with DeepMind. Most lectures will be given by guest lecturers from DeepMind, who are leading experts in machine learning and will teach the topics in which they specialise.

Aims

Students successfully completing the module should understand:
  1. The basics of deep learning and reinforcement learning paradigms
  2. Architectures and optimization methods for deep neural network training
  3. How to implement deep learning methods within TensorFlow and apply them to data
  4. The theoretical foundations and algorithms of reinforcement learning
  5. How to apply reinforcement learning algorithms to environments with complex dynamics

Learning Outcomes

To understand the foundations of deep learning, reinforcement learning, and deep reinforcement learning, and to be able to implement, apply and test relevant learning algorithms in TensorFlow.


The course has two interleaved parts that converge towards the end of the course. One part is on machine learning with deep neural networks, the other part is about prediction and control using reinforcement learning. The two strands come together when we discuss deep reinforcement learning, where deep neural networks are trained as function approximators in a reinforcement learning setting.

The deep learning stream of the course will cover a short introduction to neural networks and supervised learning with TensorFlow, followed by lectures on convolutional neural networks, recurrent neural networks, end-to-end and energy-based learning, optimization methods, unsupervised learning, and attention and memory. Possible application areas to be discussed include object recognition and natural language processing.

The reinforcement learning stream will cover Markov decision processes, planning by dynamic programming, model-free prediction and control, value function approximation, policy gradient methods, integration of learning and planning, and the exploration/exploitation dilemma. Possible applications to be discussed include learning to play classic board games as well as video games.

Method of Instruction

Lectures, reading, and coursework assignments.

Coursework will focus on the practical implementation of deep neural network training and reinforcement learning algorithms and architectures in TensorFlow.


The course has the following assessment components:

  • Coursework (100%)
    • Deep learning: programming and experimentation in Python/TensorFlow
    • Reinforcement learning: programming and experimentation in Python/TensorFlow

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined.


Reading list available via the UCL Library catalogue.