MSc Web Science and Big Data Analytics

Master of Science (MSc)
Duration: 1 Year
Full/Part Time: Full Time only
UK/EU £10,765
Overseas £22,350
Research Group: Media Futures Research Group
Programme Contact: Sean Taylor

Application Deadline: 1 August 2015

Our degree

The MSc in Web Science and Big Data Analytics is a specialist programme. It covers fundamental aspects of web-related technologies and big data analytics: information search and retrieval, data mining and analytics, large-scale distributed and cloud computing, e-commerce and its business and economic models, and the latest concepts of Web 2.0, social networks and the underlying network science. Potential options include machine learning, artificial intelligence, finance, software engineering and machine vision. The programme is intended for students with a general science and engineering background who wish to learn all aspects of quantitative web science and big data analytics. We also offer the more research-oriented MRes Web Science and Big Data Analytics.

Our Graduates

The unique combination of technical skills taught on the MSc Web Science equips graduates to proceed to scientific research and makes them an ideal choice for the best employers in Internet-related industries and in areas requiring large-scale data analytics skills.

Top graduate destinations include:       

  • Microsoft
  • SAS
  • Google

Top graduate roles include:                

  • Big Data Architect
  • Senior Data Analyst
  • Technology Consultant

Top further study destinations:

  • University of Cambridge
  • UCL

Average starting salary £31,200 (all data from Graduate Surveys, January 2013)

Our Staff

Dr Jun Wang


Jun Wang is a Senior Lecturer at University College London and Founding Director of the MSc/MRes Web Science and Big Data Analytics. His main research interests are in information retrieval, data mining and online advertising. His research is dedicated to building an intelligent system for textual and non-textual media that can access, retrieve, change and design media content and its representation so that it is adapted to the environment and context and suited to the individual user. To achieve this goal, Dr Wang has studied statistical modelling of information retrieval, social “wisdom of crowds” approaches to content understanding and access (collaborative filtering and recommendation), peer-to-peer information retrieval and filtering, and multimedia content analysis. Recently, he has developed an interest in the “Web Economy”, where he intends to unify information retrieval and economic models for Web ecosystems.

Dr Shi Zhou


Shi received his BSc and MSc in Electronic Engineering at Zhejiang University, China, and his PhD in Telecommunications at Queen Mary, University of London in 2004. Since then he has been a Lecturer (Assistant Professor) at UCL. He was awarded a prestigious Royal Academy of Engineering/EPSRC Research Fellowship for 2007 to 2012.

Shi is a member of the Media Futures research group and the Networks research group of the Department of Computer Science. He supervises PhD students at the UCL Centre for Security and Crime Science (SECReT) and the UCL Doctoral Training Centre in Financial Computing. He is also a founding member of the UCL Academic Centre of Excellence in Cyber Security Research (ACE-CSR).

Shi is a Senior Member of IEEE and a committee member of the Internet Specialist (IS) group of the British Computer Society (BCS).

Dr Emine Yilmaz


Emine is a Lecturer (Assistant Professor) in the Department of Computer Science at University College London. She also works as a research consultant for Microsoft Research, Cambridge, and serves as one of the organizers of CSML, the Centre for Computational Statistics and Machine Learning at UCL. Emine was one of the recipients of a Google Faculty Research Award in 2014.

Emine's research interests lie in the areas of information retrieval, web science, and applications of machine learning, probability and statistics. For more information about her recent publications, please visit her publications page.

Prof Mark Handley


Mark Handley joined the Computer Science department at UCL as Professor of Networked Systems in 2003, receiving a Royal Society-Wolfson Research Merit Award. From 2003 to 2010 he led the Networks Research Group, which has a long history dating back to 1973, when UCL became the first site outside the United States to join the ARPAnet, the precursor to today's Internet. Prior to joining UCL, Professor Handley was based at the International Computer Science Institute in Berkeley, California, where he co-founded the AT&T Center for Internet Research at ICSI (ACIRI). Professor Handley has been very active in the area of Internet standards and has served on the Internet Architecture Board, which oversees much of the Internet standardisation process. He is the author of 33 Internet standards documents (RFCs), including the Session Initiation Protocol (SIP), which is the principal way telephony signalling is performed in Internet-based telephone networks. Recently he has been standardising multipath extensions to TCP.

Professor Handley's research interests include the Internet architecture (how the components fit together to produce a coherent whole), congestion control (how to match the load offered to a network to the changing available capacity of the network), Internet routing (how to satisfy competing network providers' requirements, while ensuring that traffic takes a good path through the network), and defending networks against denial-of-service attacks. He also founded the XORP project to build a complete open-source Internet routing software stack.

Prof Brad Karp


Brad Karp earned a B.S. at Yale University in 1992, an S.M. at Harvard University in 1995, and a Ph.D. at Harvard University in 2000, all in Computer Science. In his dissertation, he designed robust and scalable geographic routing algorithms and protocols for wireless networks with large numbers of nodes and highly dynamic topologies.

He was a staff scientist at ICIR, the ICSI Center for Internet Research (previously named ACIRI) at the International Computer Science Institute (ICSI) at Berkeley between the fall of 2000 and fall of 2002. While at ICIR, he worked on topics including scalable distributed storage for sensor networks, reordering-robust window-based congestion control, and traffic engineering for multi-hop wireless networks.

He then spent three years as a Senior Staff Researcher at Intel Research Pittsburgh, and as an Adjunct Assistant Professor in Carnegie Mellon University's Computer Science Department. At Intel Research/CMU, he continued his long-standing research thrust on geographic routing (CLDP), and started new projects in distributed system architecture (Open DHT) and Internet worm defense (Autograph and Polygraph).

Brad joined UCL in October 2005 as a recipient of a Royal Society-Wolfson Research Merit Award, where he is now a Professor of Computer Systems and Networks.

Our modules

The MSc Web Science Programme consists of 8 taught modules and a Dissertation. Of the taught modules, 5 are core modules and 3 are elective modules.

The 5 core modules are:

COMPGI15 - Information Retrieval & Data Mining

Code COMPGI15 (Also taught as: COMPM052)
Taught By Jun Wang (50%), Emine Yilmaz (50%)
Aims The course is an entry-level study of information retrieval and data mining techniques. It is about how to find relevant information and subsequently extract meaningful patterns from it. While the basic theories and mathematical models of information retrieval and data mining are covered, the course is primarily focused on practical algorithms for textual document indexing, relevance ranking, web usage mining and text analytics, as well as their performance evaluation. Practical retrieval and data mining applications such as web search engines, personalisation and recommender systems, business intelligence, and fraud detection will also be covered.
Learning Outcomes Students are expected to master both the theoretical and practical aspects of information retrieval and data mining. At the end of the course students are expected to understand: 1. The common algorithms and techniques for information retrieval (document indexing and retrieval, query processing, etc). 2. The quantitative evaluation methods for IR systems and data mining techniques. 3. The popular probabilistic retrieval methods and ranking principles. 4. The techniques and algorithms used in practical retrieval and data mining systems such as web search engines and recommender systems. 5. The challenges and existing techniques for the emerging topics of MapReduce, portfolio retrieval and online advertising.


Overview of the fields

Study some basic concepts of information retrieval and data mining, such as the concept of relevance, association rules, and knowledge discovery. Understand the conceptual models of an information retrieval and knowledge discovery system.


Introduce various indexing techniques for textual information items, such as inverted indices, tokenization, stemming and stop words.
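The indexing techniques named above can be illustrated with a minimal sketch. It is not taken from the course materials: the stop-word list, the crude suffix-stripping `stem` function and the sample documents are all assumptions made for illustration, and a real system would use a proper stemmer such as Porter's.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Very crude suffix stripping (an assumption, not a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def build_inverted_index(docs):
    """Map each stemmed, non-stop-word term to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            if token not in STOP_WORDS:
                index[stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Toy collection (hypothetical).
docs = {
    1: "Indexing the web",
    2: "Search engines index documents",
    3: "Mining of web documents",
}
index = build_inverted_index(docs)
```

Looking up a query term then amounts to stemming it and reading its posting list, for example `index[stem("indexing")]`.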

Retrieval Methods

Study popular retrieval models: (1) Boolean, (2) vector space, (3) binary independence, (4) language modelling. Probability ranking principle. Other commonly used techniques include relevance feedback, pseudo-relevance feedback, and query expansion.

Evaluation of Retrieval Performance

Measurements: Average precision, NDCG, etc. "Cranfield paradigm" and TREC conferences.
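As a rough illustration of the two measures named above, the sketch below computes average precision for a single ranked list and NDCG with a log2(rank + 1) discount and linear gain. Several variants of both measures exist; the variant chosen here, and the function names, are assumptions rather than the definitions used in the course.

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary labels (1 = relevant).

    Assumes every relevant document appears somewhere in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

For a ranking with relevant documents at ranks 1 and 3, `average_precision([1, 0, 1, 0])` averages the precisions 1/1 and 2/3.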

Personalisation and Usage Mining

Study basic techniques for collaborative filtering and recommender systems, such as memory-based approaches, probabilistic latent semantic analysis (PLSA), and personalized web search through click-through data.
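A memory-based approach can be sketched as user-based collaborative filtering: predict a user's rating for an item as a similarity-weighted average of other users' ratings. The cosine similarity, the function names and the toy ratings below are assumptions chosen for illustration; other similarity measures (for example Pearson correlation) are equally common.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other != user and item in their:
            sim = cosine(ratings[user], their)
            num += sim * their[item]
            den += abs(sim)
    return num / den if den else None


# Hypothetical ratings: u2 agrees with u1 on items a and b, u3 does not.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3, "c": 4},
    "u3": {"a": 1, "b": 5, "c": 2},
}
prediction = predict_rating(ratings, "u1", "c")
```

Because u2 is more similar to u1 than u3 is, the prediction for item c lands closer to u2's rating than to u3's.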

Data Mining

Study basic techniques, algorithms, and systems of data mining and analytics, including frequent pattern and correlation and association analysis, anomaly detection, and click-through modelling.

Emerging Areas

Peer-to-peer information retrieval and MapReduce; Online (web) Advertising; Learning to Rank; Portfolio retrieval and Risk Management.

Method of Instruction:

Lecture presentations, Practical exercises


The course has the following assessment components:

• Written Examination (2.5 hours, 60%)

• Coursework (40%)

To pass this course, students must:

• Obtain an overall pass mark of 50% for all sections combined


Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008.

Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Addison-Wesley, 2006.

Managing Gigabytes (2nd Ed.), Ian H. Witten, Alistair Moffat and Timothy C. Bell, Morgan Kaufmann, San Francisco, 1999.


Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer (2006).


COMPGZ03 - Distributed Systems and Security

Code COMPGZ03 (Also taught as: COMPM030)
Prerequisites Good understanding of object-oriented programming and design, and of networking protocols
Taught By Brad Karp (100%)
Aims The first half of the class explores the design and implementation of distributed systems in case-study fashion: students read classic and recent research papers describing ambitious distributed systems. In lectures, students critically discuss the principles that cause these systems to function correctly, the extent to which these systems solve the problem articulated by the authors, and the extent to which the problem and solution chosen by the authors are relevant in practice. The second half of the class explores computer system security, again largely in case-study fashion.
Learning Outcomes Correctness under concurrency is a central challenge in distributed systems, and one that can only be fully understood through experience of building such systems (and encountering subtle bugs in them). To give students experience of this sort, the module includes one significant programming coursework in C, in which the students implement a simple distributed system that must provide an ordering guarantee. Further written coursework helps students solidify their understanding of the security material in the class.


Course introduction; OS concepts

Design: Worse is Better; Concurrent IO; RPC & Transparency

Ivy: Distributed Shared Memory

Bayou: Weak Connectivity and Update Conflicts; GFS: The Google File System

RouteBricks: Cluster-Based IP Router; Introduction to Security; User Authentication

Cryptographic Primitives I; Cryptographic Primitives II

Secure Sockets Layer (SSL); Reasoning Formally about Authentication: TAOS

Software Vulnerabilities and Exploits; Preventing Exploits

Containing Buggy Code: Software-based Fault Isolation; OKWS: Approximating Least Privilege in a Real-World Web Server

Method of Instruction:

Lectures, case-studies


The course has the following assessment components:

  • Written Examination (2.5 hours, 70%)
  • Coursework Section (30%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined

COMPGZ05 - Multimedia Systems

Code COMPGZ05 (Also taught as: COMP4034 Multimedia)
Year MSc
Term 2
Taught By Mark Handley (100%)
Aims The aims of this course are to describe the ways in which multimedia information is captured, processed, and rendered, to introduce multimedia quality of service (QoS) and to compare subjective and objective methods of assessing user satisfaction, to analyse the ways in which multimedia data is transmitted across networks, and to discuss privacy and copyright issues in the context of multimedia.
Learning Outcomes The ability to: describe different realisations of multimedia tools and the way in which they are used; analyse the structure of the tools in the light of low-level constraints imposed by the adoption of various QoS schemes (i.e. a bottom-up approach); analyse the effects of scale and use on both presentation and lower-level requirements (i.e. a top-down approach); state the properties of different media streams; compare and contrast different network protocols; and describe mechanisms for providing QoS guarantees in the network.


Introduction and overview.
Discrete Cosine Transform
Coefficient Coding
Audio Coding
Analogue and digital form:
- Sample rate, bits/sample, Nyquist rate, CD audio
Compression techniques:
TV Standards:
- Interlacing vs progressive scan, PAL, NTSC, SECAM
Video digitisation
Raw Image Representation:
- RGB, YUV411, YUV422, Indexed colour vs true colour
Image Compression:
- GIF, JPEG, Motion JPEG
Video Compression:
- Motion estimation
- Motion compensation
Video Compression Schemes:
- H.261, H.263
- MPEG 1, MPEG 2, MPEG 4
Video Adaptation:
- Sender-side adaptation, buffering, VBR->CBR conversion
System Streams
MPEG program and transport streams
H.221 framing (for ISDN)
IP-based transport:
- packet loss
- TCP vs UDP
- Application-level framing
- H.261 as example of payload format
Audio/Video synchronization
- MPEG system stream
OS Issues
Describing Network Traffic
Traffic patterns
Application requirements
QoS parameters and descriptions
Congestion control and Resource Management
TCP congestion control
Real-time traffic congestion control
Queue management:
- Random Early Detection + other AQM
- Explicit Congestion Notification (ECN)
- Scheduling mechanisms (FQ, WFQ)
Enhanced Quality of Service
Resource reSerVation Protocol (RSVP)
IP Multicast
Service Model
Layered transmission
Multicast congestion control
Digital rights management
Legal issues

Method of Instruction:

Lecture presentations


The course has the following assessment components:

  • Written Examination (2.5 hours, 85%)
  • Coursework Section (1 piece, 15%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined

The examination rubric is:
Answer THREE questions out of FIVE. All questions carry equal marks


COMPGW01 - Complex Networks and Web

Code COMPGW01 (Also taught as: COMPM042)
Prerequisites Normally offered only to students in computer science related programmes because programming skills are required for the coursework project.
Taught By Shi Zhou (100%)
Aims This module introduces the fundamental concepts, principles and methods in the interdisciplinary academic field of network science, with a particular focus on the Internet, the World Wide Web and online social media networks. Topics covered include graphic structures of networks, mathematical models of networks, the Internet topology, structure of the Web, community structures, epidemic spreading, PageRank, temporal networks and spatial networks.
Learning Outcomes

On successful completion of this module the students will be able to:

• Define and calculate basic network graphic metrics.

• Describe structural features of the Internet and the Web.

• Relate graphic properties to network functions and evolution.

• Explore new angles to understand network collective behaviours.

• Design and conduct analysis on large network datasets.


Network science

• Complex networks          

• Network graphic metrics

• Random networks

• Small-world networks

• Scale-free networks

• Network mathematical models

• Network structural constraints

• Network centrality measures

• Temporal networks

• Spatial networks

• Network visualisation

Communication and information networks

• Internet core structure – evolution and modelling

• Structure of the Web – PageRank and document networks

• Online social media networks - Twitter, Facebook, Amazon, …
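PageRank, listed above under the structure of the Web, can be sketched as a power iteration over a link graph. The damping factor of 0.85, the handling of dangling pages and the four-page graph are assumptions made for illustration and are not taken from the module materials.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration on a dict {page: [pages it links to]}.

    Every link target must itself be a key of `links`. Dangling pages
    (no out-links) spread their rank evenly over all pages.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        dangling = sum(rank[p] for p in pages if not links[p])
        # Teleportation share plus an equal share of the dangling mass.
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


# Hypothetical four-page web graph.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
```

In this toy graph page C collects links from A, B and D and ends up with the highest rank, while D, which nobody links to, keeps only the teleportation share.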

Network functions and behaviours

• “Rich gets richer” phenomenon

• Link, neighbourhood and community

• Cascades and epidemics 

• Network structure balance 

• Sentimental, temporal and spatial analysis of social media networks

Method of Delivery

A Moodle webpage is created for the course, where all course materials, such as lecture notes and online resources, will be shared. Using Moodle, students will also be able to discuss ideas and questions with the lecturer and other students.

In the second half of the term, there will be a weekly one-hour lab/tutorial session, where the lecturer and/or a teaching assistant will discuss questions with students.


• Unseen 2.5 hour written examination (70%)

• Coursework I (15%): essay writing (2000-3000 words); due in the last week of Term-1.

• Coursework II (15%): individual project on network data analysis (programming is usually required); due in the first week of Term-2. 

To pass the module students must achieve a pass mark of 50% when all elements are combined.


• D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, 2010. 

• M. E. J. Newman. Networks: An Introduction, Oxford University Press, 2010.

• S. N. Dorogovtsev. Lectures on Complex Networks, Oxford University Press, 2010.

Other books for interest:

• D. J. Watts. Small Worlds: The Dynamics of Networks between Order and Randomness, Princeton University Press, 1999

• M. Dodge and R. Kitchin. Atlas of Cyberspace, Pearson Education, 2001.

• S. N. Dorogovtsev and J. F. F. Mendes. Evolution of Networks: From Biological Nets to the Internet and WWW, Oxford University Press, 2003.

• M. Mitchell. Complexity: A Guided Tour, Oxford University Press, 2009.

COMPGW02 - Web Economics

Code COMPGW02 (Also taught as: COMPM041)
Prerequisites Normally offered only to students in computer science related programmes because basic programming skills are required.
Taught By Emine Yilmaz (50%), Jun Wang (50%)
Aims The course is intended to provide an introduction to the computing systems, and their economics, for the production, distribution and consumption of (digital) goods and services over the Internet and web. While the basic economic principles are covered to understand the business aspects of web-based services, the course is primarily focused on the computational and statistical methods for implementing, improving and optimizing Internet-based businesses, including algorithmic mechanism design, online auctions, user behavior targeting, yield management, dynamic pricing, cloud-sourcing, social media mining and attention economics. Practical applications such as Google’s online advertising, eBay’s online auction, and Amazon’s cloud computing will also be covered and discussed.
Learning Outcomes The students are expected to master both the theoretical and practical aspects of web economics. More specifically, the student will:
  1. understand basic economic principles and computational methods for the production, distribution and consumption of digital goods and services online.
  2. understand the computational methods/models to manage and optimize Internet-based businesses.
  3. understand the challenges and techniques for emerging topics such as computing as a service and attention economics.
  4. be able to formulate research questions that are relevant to Internet-based businesses and use the tools of economics and computational techniques to provide answers to them.
  5. be familiar with important work in the field.


System design 

  1. Web basics: HTTP, HTML5 referrer, Link and Click-through analysis, etc 
  2. Basic Economic Principles and Economic analysis: 
    1. Micro vs. Macro economics 
    2. Basic elements of Supply and Demand 
    3. Equilibrium 
  3. Incentives: Game theory, and Auction theory 
  4. Business Models in the Internet:
    1. Auction and bidding (the eBay model, Swoopo, and B2C and B2B auctions (Alibaba))
    2. Subscription (compulsory license, Dropbox premier model, Spotify, Apple iCloud, pay per use)
    3. Online retailing (Amazon, Apple Apps)
    4. Digital goods and bundling
  5. Computational advertising
    1. Vickrey auction and the second price auction
    2. Search-based advertising, contextual advertising and behaviour targeting, demand-side platforms and real-time bidding, ad exchanges, and futures and options
  6. Digital Rights Management, spam/fraud control and Internet radio
  7. Computing as a service/utility 
  8. Social media mining 
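The second price (Vickrey) auction listed under computational advertising has a simple allocation rule: the highest bidder wins but pays the second-highest bid. The sketch below shows only that rule for a single item; the function name, the tie-breaking by input order and the sample bids are assumptions for illustration, and real advertising auctions add reserve prices, quality scores and multiple slots.

```python
def second_price_auction(bids):
    """Sealed-bid second price auction for a single item.

    `bids` maps bidder name to bid amount. Returns (winner, price paid).
    Ties are broken by the order in which bidders appear (an assumption).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # sorted() is stable, so equal bids keep their input order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the second-highest bid sets the price
    return winner, price


# Hypothetical advertisers bidding for one ad slot.
winner, price = second_price_auction({"alice": 3.0, "bob": 5.0, "carol": 4.0})
```

Because the price does not depend on the winner's own bid, bidding one's true value is a dominant strategy, which is the property that makes this mechanism attractive.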

Management and optimization  

  1. Dynamic pricing models (air tickets) and yield management and scheduling (online advertising)
  2. Search engine optimization 


  1. Attention economics and Personalization and Long tail 
  2. Prediction market and its accuracy 
  3. Human computing and Social computing systems 
    1. Crowdsourcing and Amazon Mechanical Turk (MTurk) and Collective intelligence 
    2. System design (ESP game, reCAPTCHA etc) 
    3. BitTorrent and peer-to-peer file sharing

Method of Delivery


A website and/or Moodle webpage will be created for the course, and course materials such as lecture notes and sample code will be shared there. Using Moodle, students will also be able to discuss relevant ideas and have questions answered by the lecturer.


Written examination 2.5 hours (70%) 

Coursework (30%) 

To pass the module students must achieve a mark of 50% when all sections are combined


[1] Noam Nisan (Editor), Tim Roughgarden (Editor), Eva Tardos (Editor), Vijay V. Vazirani (Editor), Algorithmic Game Theory, Cambridge University Press, 2007.

[2] David Easley and Jon Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, 2010.

[3] R. Preston McAfee, Introduction to Economic Analysis.

[4] Nir Vulkan, The Economics of e-Commerce, Princeton University Press, 2003.

[5] Carl Shapiro and Hal R. Varian, Information Rules: A Strategic Guide to the Network Economy, 1999.

3 elective modules must be chosen from the following options:

COMPGI01 - Supervised Learning

Code COMPGI01 (Also taught as: COMPM055 Supervised Learning)
Year MSc
Prerequisites Basic mathematics, Calculus, Probability and statistics, Linear algebra
Term 1
Taught By Mark Herbster (50%), Massi Pontil (50%)
Aims This module covers supervised approaches to machine learning. It starts by reviewing fundamentals of statistical decision theory and probabilistic pattern recognition, followed by an in-depth introduction to various supervised learning algorithms such as the Perceptron, the Backpropagation algorithm, decision trees, instance-based learning and support vector machines. It then covers algorithm-independent principles such as inductive bias, side information, and approximation and estimation errors; assessment of algorithms by jackknife and bootstrap error estimation; and improvement of algorithms by voting methods such as boosting. It closes with an introduction to statistical learning theory: hypothesis classes, the PAC learning model, VC-dimension, growth functions, empirical risk minimization and structural risk minimization.
Learning Outcomes Gain in-depth familiarity with various classical and contemporary supervised learning algorithms, understand the underlying limitations and principles that govern learning algorithms and ways of assessing and improving their performance, understand the underlying fundamentals of statistical learning theory, the complexity of learning and its relationship to generalization ability.


Overview and Introduction to Bayes Decision Theory
Machine Intelligence and Applications
Pattern Recognition concepts
Classification, Regression, Feature Selection
Supervised Learning
Class conditional probability distributions
Examples of classifiers
Bayes optimal classifier and error
Learning classification approaches
Linear machines
General and Linear Discriminants
Decision regions
Single layer neural network
Linear separability, general position, number of dichotomies
General gradient descent
Perceptron learning algorithm
Mean square criterion and Widrow-Hoff learning algorithm
Multi-Layer Perceptrons
Introduction to Neural Networks, Two-Layers
Universal approximators
Backpropagation learning, on-line, off-line
Error surface, important parameters
Learning decision trees
Inference model, general domains, symbolic
Decision trees, consistency
Learning trees from training examples
Entropy, mutual information
ID3 algorithm criterion
C4.5 algorithm
Continuous test nodes, confidence
Learning with incomplete data
Instance-based Learning
Nearest neighbor classification
k-Nearest neighbor
Nearest Neighbor error probability, proof
Simplification, Editing
Example: Document retrieval
Case-based reasoning
Example: learning graphical structures
Machine learning concepts and limitations
Fundamental algorithmic-independent concepts
Hypothesis class, Target class
Inductive bias, Occam's razor
Empirical risk
Limitations of inference machines
Approximation and estimation errors
Machine learning assessment and Improvement
Statistical Model Selection
Structural Risk Minimization
Practical methods for risk assessment based on resampling, Jackknife, Bootstrap
Improving accuracy of general algorithms, Bagging, Boosting
Learning Theory
Formal model of the learnable
Sample complexity
Learning in zero-Bayes and realizable case
Growth function, VC-dimension
VC-dimension of Vector space of functions, proof
Empirical Risk Minimization over finite classes, sample complexity, proof
Empirical Risk Minimization over infinite classes, risk upper bound, proof
Lower bound on sample complexity
Support Vector Machines
Margin of a classifier
Dual Perceptron algorithm
Learning non-linear hypotheses with perceptron
Kernel functions, implicit non-linear feature space
Theory: zero-Bayes, realizable infinite hypothesis class, finite covering, margin-based bounds on risk
Maximal Margin classifier
Learning support vector machines as a dual-optimization problem
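The perceptron learning algorithm from the syllabus above can be sketched in a few lines. This is a textbook-style version with labels in {-1, +1}; the function names, the learning rate, the epoch limit and the toy data set (logical AND) are assumptions for illustration and not the module's own formulation.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron training for labels in {-1, +1}.

    `samples` is a list of (feature_vector, label) pairs.
    Returns (weights, bias).
    """
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop
            break
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 with the learned linear discriminant."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Logical AND with -1/+1 labels is linearly separable.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
```

On linearly separable data such as this, the update rule is guaranteed to stop making mistakes after a finite number of updates; on non-separable data the loop simply runs until the epoch limit.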

Method of Instruction:

Lecture presentations with associated class problems


The course has the following assessment components:

  •  Written Examination (2.5 hours, 75%)
  • Coursework Section (25%)
  • For full details of coursework see the course web page.

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined


Text Book 1: The Elements of Statistical Learning: Data Mining, Inference and Prediction, Hastie, T., Tibshirani, R., and Friedman, J., Springer (2001)

Reference Book 1: Pattern Classification, Duda, R.O., Hart, P.E., and Stork, D.G., John Wiley and Sons (2001)

Reference Book 2: Pattern Recognition and Machine Learning, Bishop, Christopher M., Springer (2006)

Reference Book 3: An Introduction to Support Vector Machines, Shawe-Taylor, J. and Cristianini, N., Cambridge University Press

Reference Book 4: Kernel Methods for Pattern Analysis, Shawe-Taylor, J. and Cristianini, N., Cambridge University Press (2004)

COMPGI19 - Statistical Natural Language Processing

Code COMPGI19 (Also taught as: COMPM083)
Taught By Sebastian Riedel (100%)
Aims The course introduces the basics of statistical natural language processing (NLP), including both linguistic concepts such as morphology and syntax and machine learning techniques relevant for NLP.
Learning Outcomes

Students successfully completing the module should understand:

  • relevant linguistic concepts
  • relevant ML techniques, in particular structured prediction
  • what makes NLP challenging (and exciting)
  • how to write programs that process language
  • how to rigorously formulate NLP tasks as learning and inference tasks, and address the computational challenges involved.


NLP is a domain-centred field, as opposed to technique-centred fields such as ML, and as such there is no "theory of NLP" which can be taught in a cumulative technique-centred way. Instead this course will focus on one or two NLP end-to-end "pipelines" (such as Machine Translation and Machine Reading). Through these applications the participants will learn about language itself, relevant linguistic concepts, and Machine Learning techniques. For the latter, the emphasis will be on structured prediction, a branch of ML that is particularly relevant to NLP.

Topics will include (but are not restricted to) machine translation, sequence tagging, constituent and dependency parsing, information extraction, semantics. 

The course has a strong applied character, with coursework to be programmed, and lab classes to teach students to write software that processes language.

Indicative contents:

  • Introduction
  • Machine Translation 1
  • Machine Translation 2
  • Document Classification and Clustering
  • Tagging
  • Syntactic Parsing 1
  • Syntactic Parsing 2
  • Coreference
  • Information Extraction
  • Semantic Parsing

Mode of Instruction

Lectures and lab classes, with occasional guest lectures by leading researchers in NLP.

Coursework problems will focus on basic components in an NLP pipeline, such as a document classifier, part-of-speech tagger and syntactic parser.


  • Coursework 100%

Individual projects related to particular foundations, steps and techniques in the NLP pipeline. There will be 2-3 assignments, consisting of software to be written and presented, and a write-up.

To pass this module students must:

  • Achieve a pass mark of 50% or greater when all marks are combined.



Daniel Jurafsky and James H. Martin (2008) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. 2nd Edition. Prentice Hall.

COMPGV10 - Computer Graphics

Code COMPGV10 (Also taught as: COMP3080 Computer Graphics)
Year MSc
Term 1
Taught By Anthony Steed (100%)
Aims To introduce the fundamental concepts of 3D computer graphics and give the students all the knowledge needed for creating an image of a virtual world from first principles.
Learning Outcomes The students will be able to define a virtual world and create images of it. They will know how to write a basic ray tracer and how to use a graphics library such as OpenGL (or equivalent).



The painter's method

Creating an image using ray tracing

Ray casting using a simple camera

Local illumination

Global illumination with recursive ray tracing

Specifying a general camera

World / image coordinates

Creation of an arbitrary camera

Ray tracing with an arbitrary camera

Constructing a scene


Scene hierarchy

Transformations of objects / rays

Other modelling techniques

Acceleration Techniques

Bounding volumes

Space subdivision

From ray tracing to projecting polygons

Graphics pipeline

Transforming the polygons to image space

Sutherland-Hodgman clipping

Weiler-Atherton clipping


Polygon rasterization/Visible surface determination

Scan conversion


Interpolated shading

Texture mapping


Back face culling



Shadow volumes

Shadow buffer

Shadow mapping

Soft shadows

The nature of light

Transport theory, radiance, luminance, radiosity

The radiance equation

Radiosity method

Classical radiosity


Progressive refinement

Parametric surfaces

Bezier Curves

B-Spline Curves
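
As an illustration of the ray casting topic listed above, the following is a minimal sketch of a ray-sphere intersection test, the basic building block of a simple ray tracer. It is not taken from the course materials; the function name `ray_sphere_intersection` and its parameters are assumptions made for this example, and vectors are plain 3-element sequences.

```python
import math


def ray_sphere_intersection(origin, direction, centre, radius):
    """Return the nearest non-negative ray parameter t where the ray
    origin + t * direction hits the sphere, or None if it misses.

    Hypothetical helper for illustration only.
    """
    # Vector from the sphere centre to the ray origin.
    oc = [o - c for o, c in zip(origin, centre)]

    # Substituting the ray into |p - centre|^2 = radius^2 gives a
    # quadratic a*t^2 + b*t + c = 0 in the ray parameter t.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius

    discriminant = b * b - 4.0 * a * c
    if discriminant < 0.0:
        return None  # the ray misses the sphere entirely

    root = math.sqrt(discriminant)
    # Try the nearer root first, then the farther one.
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t >= 0.0:
            return t
    return None  # both intersections lie behind the ray origin
```

A ray tracer would call a test like this once per pixel ray and per object, keep the smallest `t`, and then shade the hit point using a local illumination model.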

Method of Instruction:

Lecture presentations and lab classes. There are 2 courseworks, equally weighted.


The course has the following assessment components:

  • Written Examination (2.5 hours, 75%)
  • Coursework Section (2 pieces, 25%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined

The examination rubric is:
Answer THREE questions out of FIVE. All questions carry equal marks.


Computer Graphics And Virtual Environments - From Realism to Real-Time. Mel Slater, Yiorgos Chrysanthou, Anthony Steed, ISBN 0201-62420-6, Addison-Wesley, 2002.

COMPGI14 - Machine Vision

Code COMPGI14 (Also taught as: COMPM054 Machine Vision)
Year MSc
Prerequisites Successful completion of an appropriate Computer Science, Mathematics, or other Physical Science or Engineering undergraduate programme with sufficient mathematical and programming content, plus some familiarity with digital imaging and digital image processing.
Term 1
Taught By Gabriel Brostow (100%)
Aims The course addresses algorithms for automated computer vision. It focuses on building mathematical models of images and objects and using these to perform inference. Students will learn how to use these models to automatically find, segment and track objects in scenes, perform face recognition and build three-dimensional models from images.
Learning Outcomes To be able to understand and apply a series of probabilistic models of images and objects in machine vision systems. To understand the principles behind face recognition, segmentation, image parsing, super-resolution, object recognition, tracking and 3D model building.


Two-dimensional visual geometry:
2d transformation family. The homography. Estimating 2d transformations. Image panoramas.
Three dimensional image geometry:
The projective camera. Camera calibration. Recovering pose to a plane.
More than one camera:
The fundamental and essential matrices. Sparse stereo methods. Rectification. Building 3D models. Shape from silhouette.
Vision at a single pixel:
Background subtraction and color segmentation problems. Parametric, non-parametric and semi-parametric techniques. Fitting models with hidden variables.
Connecting pixels:
Dynamic programming for stereo vision. Markov random fields. MCMC methods. Graph cuts.
Texture synthesis, super-resolution and denoising, image inpainting. The epitome of an image.
Dense Object Recognition:
Modelling covariances of pixel regions. Factor analysis and principal components analysis.
Sparse Object Recognition:
Bag of words, latent Dirichlet allocation, probabilistic latent semantic analysis.
Face Recognition:
Probabilistic approaches to identity recognition. Face recognition in disparate viewing conditions.
Shape Analysis:
Point distribution models, active shape models, active appearance models.
The Kalman filter, the Condensation algorithm.
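
To give a flavour of the tracking material listed above, here is a minimal sketch of a one-dimensional Kalman filter. It assumes a static state model with Gaussian process and measurement noise; the function name `kalman_1d` and its parameters are hypothetical and are not drawn from the course materials.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Run a scalar Kalman filter and return the estimate after each
    measurement. Illustrative sketch: the state is assumed constant
    apart from process noise."""
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Predict step: the state is carried over unchanged, but our
        # uncertainty grows by the process noise variance.
        var += process_var

        # Update step: blend prediction and measurement according to
        # their relative uncertainties (the Kalman gain).
        gain = var / (var + measurement_var)
        mean += gain * (z - mean)
        var *= 1.0 - gain

        estimates.append(mean)
    return estimates
```

With a prior centred on 0 and repeated measurements of the same value, the estimates move steadily towards that value as evidence accumulates. A tracker for image sequences would use the vector form of the same predict and update steps.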

Method of Instruction:

Lectures, practical lab classes.


The course has the following assessment components:

  • Written Examination (2.5 hours, 80%)
  • Coursework Section (2 pieces, 20%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined

The examination rubric is:
Answer 3 questions


Prince, S. Computer Vision: Models, Learning and Inference

COMPGC25 - Interaction Design

Code COMPGC25 (also taught as COMP3012)
Prerequisites Successful completion of years 1 and 2 of the BSc/MEng Computer Science programme or the BSc Information Management programme
Taught by Paul Marshall (50%), Nicolai Marquardt (50%)
Aims The module covers advanced topics in interaction design, focusing on the design of mobile and ubiquitous computing technologies. A central theme is how to design technologies to meet people's needs.
Learning Outcomes
  • Knowledge and understanding of research topics in ubiquitous computing
  • Knowledge and understanding of methods used in interaction design
  • The ability to reflect critically on the appropriateness of different interaction design methods
  • The ability to conduct basic user research
  • The ability to design, prototype and evaluate a novel ubiquitous computing technology
  • Transferable skills: Information gathering and organising skills. Argumentation skills and the ability to synthesise information from multiple sources. Written presentation skills.


The module is separated into three related streams:

• Methods

Ten hours

This series of lectures will introduce students to core interaction design methods, including approaches to conducting user research and designing, prototyping and evaluating user centred systems and technologies.


Ten hours

These more informal lectures will give students an opportunity to reflect on how to put interaction design methods into practice and to discuss ideas and issues with each other and with the teaching faculty. They will link closely to the coursework.


Ten hours

This series of lectures will introduce students to work on ubiquitous computing systems technologies that go "beyond the desktop", such as multi-touch surfaces, ambient devices, mobile devices and situated displays. A key focus will be on approaches to understanding the domains where these technologies are used, prototyping and evaluation approaches.

Method of Instruction:

Lecture presentations with associated practical activities.


The course has the following assessment components:

Written Examination (2 hours, 50%)

Coursework (50%), due in the first week of term 3.

To pass this course, students must:

Gain a mark of 50% or more when the examination and coursework scores are combined.

COMPGI09 - Applied Machine Learning

Year MSc
Prerequisites This course is for students following the MSc in Intelligent Systems programme who have completed or are completing the usual core and optional courses.
Term 2
Taught By David Barber (100%)

Applied Machine Learning aims to cover some of the issues that may arise in the practical application of machine learning in real-world problems.

In addition, the course will cover some of the mathematics and techniques behind basic data analysis methods for both static and time-series data.

Learning Outcomes The ability to: assess the effectiveness of solutions presented and to question them in an intelligent way; synthesise solutions to general open-ended problems covering material from the whole programme, tempered with information on commercial reality obtained from this course.


Multivariate optimisation methods including line search, conjugate gradients and Newton's method, stochastic gradient descent, distributed optimisation.

Neural Nets and deep learning, fast nearest neighbour methods, large scale linear learning.

PCA, Canonical Correlation Analysis, matrix factorisation methods.

Gaussian Mixture Models, Gaussian Process Regression/Classification.

HMMs, AR models.
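
Two of the optimisation methods named in the topics above can be contrasted in a few lines. The sketch below is illustrative only and is not part of the course materials: it minimises a one-dimensional function with plain gradient descent and with Newton's method. The function names, the fixed learning rate and the step counts are assumptions chosen for the example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Minimise a 1D function by repeatedly stepping against its
    gradient with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def newton(grad, hess, x0, steps=10):
    """Minimise a 1D function with Newton's method, which scales each
    step by the inverse of the second derivative."""
    x = x0
    for _ in range(steps):
        x -= grad(x) / hess(x)
    return x


# Example objective f(x) = (x - 3)^2 + 1, whose minimum is at x = 3.
def f_grad(x):
    return 2.0 * (x - 3.0)


def f_hess(x):
    return 2.0
```

On a quadratic such as this one, Newton's method reaches the minimum in a single step because the second derivative captures the curvature exactly, while gradient descent approaches it geometrically at a rate set by the learning rate.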

Method of Instruction:

Lecture presentations with associated class problems.


The course has the following assessment components:

  • Written Examination (2.5 hours, 60%)
  • Coursework Section. The coursework is based on assessed practical challenges hosted by Kaggle. (40%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined


To be notified as the course progresses, according to the business themes covered.

COMPGC18 - Entrepreneurship: Theory and Practice

Code COMPGC18 (Also taught as: COMP7008)
Year MSc
Prerequisites None
Term 2
Taught By Philip Treleaven (CS) & David Chapman (MS&I), and guest lecturers involved in business and entrepreneurship
Aims To provide students with the theory and practice necessary to launch a new business venture making maximum use of eCommerce strategies and software tools for entrepreneurs
Learning Outcomes Skills to launch a new business venture

This is UCL's principal 30-lecture course in Entrepreneurship. Over the past ten years we have taught entrepreneurship to around 3000 students, resulting in the launch of a number of innovative businesses. The module covers: the new business life-cycle (selecting and testing a moneymaking idea, preparing a business plan, raising finance, the exit), aspects of new business operation (registering a company, setting up your office, understanding financial statements), and exploiting eCommerce strategies and software tools for entrepreneurs.


Starting your Digital Business
your money-making strategy
getting an internet presence
registering your company
The Business
market research
preparing your business plan
types of companies
setting up your office
advertising and marketing
The Internet
putting the internet to work for you
setting up your web site
doing business on the internet
Finance for Start-ups
venture capital
understanding financial statements
planning and forecasting
debt and equity finance
Accountancy Software
overview of book-keeping
Quickbooks accountancy software
The Law
company law
Guest Lecture programme
Weekly programme of presentations by entrepreneurs and business leaders
New Venture Clinic
Weekly clinic to review and provide feedback on students' new business ideas

Method of Instruction:

Lecture presentations and practical work.


The course has the following assessment components:

  • Group coursework (60%)
  • Individual coursework (40%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined


Mullins, J. (2003). The New Business Road Test: What Entrepreneurs and Executives Should Do Before Writing a Business Plan. Publisher: Financial Times/Prentice Hall. ISBN-10: 0273663569

Full course notes are available

PSYCGI11 Understanding Usability & Use

Module code: PSYCGI11
Title: Understanding Usability and Use
Credit value: 15
Division: Division of Psychology and Language Sciences
Module organiser: Ann Blandford
Organiser's location: MPEB, room 8.14
Module prerequisites: Module is compulsory for students on MSc in HCI-E. 
Module outline: This module will equip students with the practical skills needed for the assessment of interactive systems. This will include analytical approaches (based on theories of cognition and interaction) and empirical approaches (gathering and analysing data from users). Analytical approaches will include inspection techniques, based on heuristics (or checklists), and theoretically grounded methods. In U3, the focus is on qualitative approaches to evaluating systems in their context of use, including interviews and observations. Students will develop their critical thinking skills, in relation to both the systems being evaluated and the choice of technique to apply in the evaluation.
Module aims: Students will become familiar with a range of data gathering and analysis methods that are relevant to the concerns of Human-Computer Interaction. They will be aware of the scope and applicability of those methods, and be able to select and apply appropriate methods according to requirements. They will be able to present the findings of evaluations through written reports.  
Module objectives: This module will equip students with the practical skills needed for the assessment of interactive systems. This will include analytical approaches (based on theories of cognition and interaction) and empirical approaches (based on the gathering and analysis of data from users). It will also include theoretical understanding of the strengths and limitations of evaluation methods for interactive systems design. Analytical approaches will include inspection techniques and more explicitly theoretically grounded methods. Empirical approaches will focus on qualitative techniques. The course will cover the design of studies, and the gathering and analysis of data.  
Module timetable: 
Module assessment: One piece of coursework (2,500-3,000 words), 100%.
Last updated: 2014-03-17 13:59:56 by ucacrbe 

BUCI029H7 Cloud Computing (Birkbeck)

Module Description

Module Name, Abbreviated Name, Code

Cloud Computing, CC, BUCI029H7

Credits, Level

15 credits, level 7


Dell Zhang

Online Material

Module web pages

Module Outline

Students in this module will learn to understand the emerging area of cloud computing and how it relates to traditional models of computing, and gain competence in MapReduce as a programming model for distributed processing of big data.


This module aims to introduce back-end cloud computing techniques for processing "big data" (terabytes/petabytes) and developing scalable systems (with up to millions of users). We focus mostly on MapReduce, which is presently the most accessible and practical means of computing for "Web-scale" problems, but will discuss other techniques as well.


  • Introduction to Cloud Computing
  • Cloud Computing Technologies and Types
  • Big Data
  • MapReduce and Hadoop
  • Running Hadoop in the Cloud (Practical Lab Class)
  • Developing MapReduce Programs
  • Data Management in the Cloud
  • Information Retrieval in the Cloud
  • Link Analysis in the Cloud
  • Beyond MapReduce
  • Selected Case Studies
  • Advanced Topics in Cloud Computing
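
The MapReduce programming model at the centre of this module can be sketched in a few lines. The example below is a single-process Python illustration of the map, shuffle and reduce phases for a word count. It is not the Hadoop API (the module itself expects Java), and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    would do between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

For example, `word_count(["big data", "Big cloud"])` returns `{"big": 2, "data": 1, "cloud": 1}`. In a real cluster the map and reduce functions run in parallel on many machines, and the framework performs the shuffle across the network.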


A good knowledge of Java programming is necessary. Students who did not have much experience in this area before joining their respective MSc programmes should have already taken the ISD (BUCI021S7) module.


All dates and timetables are now listed in the programme booklets of the individual programmes.


A couple of programming assignments.


Coursework (20%). Examination (80%).

Recommended Reading

  • Jothy Rosenberg and Arthur Mateos, The Cloud at Your Service, Manning, 2010.
  • Jimmy Lin and Chris Dyer, Data-Intensive Text Processing with MapReduce, Morgan and Claypool, 2010.
  • Extensive use is made of other relevant book chapters and research papers that are distributed or provided online.

If you have a question about the MSc Information and Web Technologies that is not covered here or on the Birkbeck FAQ, please contact Liam Simmonds.

Programme Administrator: Liam Simmonds
Admissions Tutor: Andrea Cali
Programme Director: Nigel Martin

More details about our modules can be found here

Our entry requirements

A minimum of an upper-second class UK Bachelor's degree in computer science, electrical engineering or mathematics, or an overseas qualification of an equivalent standard. Relevant work experience may also be taken into account.

English Language minimum requirements

  • International English Language Testing System: An overall grade of 7.0 with a minimum of 6.0 in each of the subtests
  • Other English Language Qualifications: Please click here for the full list of accepted English Language qualifications. Please note that our courses require a level of English equivalent to the "UCL Good Level".

Entry requirements by country

Please click here for more information. Applicants are required to meet both the entry requirements and the English Language requirements separately. Each applicant will be considered on an individual basis. The grades and qualifications listed are intended to give an approximate level of achievement we believe you will need to succeed on the programme.

Excellence scholarships

We are offering 4 MSc Scholarships worth £4,000 to UK/EU offer holders with a record of excellent academic achievement. These will be awarded at the discretion of the department's Postgraduate Tutor. The closing date for applying is 30 June 2015.

Successful nominees will be notified by the end of July 2015. Nominees have 1 week to respond to this notification. If the nominee has not responded within 1 week, or if they decline the funding, a reserve candidate will be contacted. If you haven't been contacted by the end of August 2015, please assume that your application was unsuccessful. 

The scholarships may be held alongside other scholarships, studentships, awards or bursaries. However, nominees must declare whether they are in receipt of other sources of funding. Recipients of the scholarship will receive the award in the form of a £4,000 discount from their tuition fees.


  • This scholarship is open to UK/EU domiciled students, defined as country of ordinary residence.
  • All applicants of this scholarship are required to hold a valid offer for entry onto one of our MSc degree programmes for the September 2015 intake and have accepted their offer.
  • All applications for the scholarship must be received before the end of 30 June 2015.

Successful candidates will be asked to write a short piece at the end of their degree reflecting on their experiences at UCL and how the scholarship assisted them. To apply click here.

You can find out more about our fees and funding here.

More information

Our Frequently Asked Questions are here.

UCL's Prospective Student webpages which contain more information on fees and funding, accommodation and international students can be found here.

Back to our Degrees Page here.