MSc Web Science and Big Data Analytics

Award: Master of Science (MSc)
Level: Postgraduate
Duration: 1 Year
Full/Part Time: Full Time only
Fees: UK/EU £10,765; Overseas £22,350
Research Group: Media Futures Research Group
Programme Contact: Sean Taylor

Application Deadline: 1 August 2015

Our degree

The MSc in Web Science and Big Data Analytics is a specialist programme. It covers the fundamentals of web-related technologies and big data analytics: information search and retrieval, data mining and analytics, large-scale distributed and cloud computing, e-commerce and its business and economic models, and the latest concepts of Web 2.0, social networks and the underlying network science. Optional modules may include machine learning, artificial intelligence, finance, software engineering and machine vision. The programme is intended for students with a general science and engineering background who wish to develop quantitative web science and big data analytical skills. We also offer the more research-oriented MRes Web Science and Big Data Analytics.

Our Graduates

The unique combination of technical skills taught on MSc Web Science leaves graduates well equipped to proceed to scientific research, and makes them an ideal choice for leading employers in Internet-related industries and in areas requiring large-scale data analytical skills.

Top graduate destinations include:       

  • Microsoft
  • SAS
  • Google

Top graduate roles include:                

  • Big Data Architect
  • Senior Data Analyst
  • Technology Consultant

Top further study destinations:

  • University of Cambridge
  • UCL

Average starting salary £31,200 (all data from Graduate Surveys, January 2013)

Our Staff

Dr Jun Wang

Jun Wang is a Senior Lecturer at University College London and Founding Director of the MSc/MRes Web Science and Big Data Analytics. His main research interests are in information retrieval, data mining and online advertising. His research has been dedicated to building an intelligent system for textual and non-textual media that can access, retrieve, change and design media content and its representation so that it is adapted to the environment and context and suited to an individual person. To this end, Dr Wang has studied statistical modelling of information retrieval, social “wisdom of crowds” approaches to content understanding and access (collaborative filtering and recommendation), peer-to-peer information retrieval and filtering, and multimedia content analysis. Recently, he has developed an interest in the “Web Economy”, where he intends to unify information retrieval and economic models for Web ecosystems.

Dr Shi Zhou

Shi received his BSc and MSc in Electronic Engineering from Zhejiang University, China, and his PhD in Telecommunications from Queen Mary, University of London, in 2004. Since then he has been a Lecturer (Assistant Professor) at UCL. He held a prestigious Royal Academy of Engineering/EPSRC Research Fellowship from 2007 to 2012.

Shi is a member of the Media Futures research group and the Networks research group of the Department of Computer Science. He supervises PhD students at the UCL Centre for Security and Crime Science (SECReT) and the UCL Doctoral Training Centre in Financial Computing. He is also a founding member of the UCL Academic Centre of Excellence in Cyber Security Research (ACE-CSR).

Shi is a Senior Member of IEEE and a committee member of the Internet Specialist (IS) group of the British Computer Society (BCS).

Dr Emine Yilmaz

Emine is a Lecturer (Assistant Professor) in the Department of Computer Science at University College London. She also works as a research consultant for Microsoft Research, Cambridge, and serves as one of the organizers of CSML, the Centre for Computational Statistics and Machine Learning at UCL. Emine was one of the recipients of the Google Faculty Research Award in 2014.

Emine's research interests lie in the areas of information retrieval, web science, and applications of machine learning, probability and statistics. For more information about her recent publications, please visit her publications page.

Prof Mark Handley

Mark Handley joined the Computer Science department at UCL as Professor of Networked Systems in 2003, receiving a Royal Society-Wolfson Research Merit Award. From 2003 to 2010 he led the Networks Research Group, which has a long history dating back to 1973, when UCL became the first site outside the United States to join the ARPAnet, the precursor to today's Internet. Prior to joining UCL, Professor Handley was based at the International Computer Science Institute in Berkeley, California, where he co-founded the AT&T Center for Internet Research at ICSI (ACIRI). Professor Handley has been very active in the area of Internet standards, and has served on the Internet Architecture Board, which oversees much of the Internet standardisation process. He is the author of 33 Internet standards documents (RFCs), including the Session Initiation Protocol (SIP), which is the principal way telephony signalling is performed in Internet-based telephone networks. Recently he has been standardizing multipath extensions to TCP.

Professor Handley's research interests include the Internet architecture (how the components fit together to produce a coherent whole), congestion control (how to match the load offered to a network to the changing available capacity of the network), Internet routing (how to satisfy competing network providers' requirements, while ensuring that traffic takes a good path through the network), and defending networks against denial-of-service attacks. He also founded the XORP project to build a complete open-source Internet routing software stack.

Prof Brad Karp

Brad Karp earned a B.S. at Yale University in 1992, an S.M. at Harvard University in 1995, and a Ph.D. at Harvard University in 2000, all in Computer Science. In his dissertation, he designed robust and scalable geographic routing algorithms and protocols for wireless networks with large numbers of nodes and highly dynamic topologies.

He was a staff scientist at ICIR, the ICSI Center for Internet Research (previously named ACIRI) at the International Computer Science Institute (ICSI) at Berkeley between the fall of 2000 and fall of 2002. While at ICIR, he worked on topics including scalable distributed storage for sensor networks, reordering-robust window-based congestion control, and traffic engineering for multi-hop wireless networks.

He then spent three years as a Senior Staff Researcher at Intel Research Pittsburgh, and as an Adjunct Assistant Professor in Carnegie Mellon University's Computer Science Department. At Intel Research/CMU, he continued his long-standing research thrust on geographic routing (CLDP), and started new projects in distributed system architecture (Open DHT) and Internet worm defense (Autograph and Polygraph).

Brad joined UCL in October 2005 as a recipient of a Royal Society-Wolfson Research Merit Award; he is now a Professor of Computer Systems and Networks there.

Our modules

The MSc Web Science Programme consists of 8 taught modules and a Dissertation. Of the taught modules, 5 are core modules and 3 are elective modules.

The 5 core modules are:

COMPGI15 - Information Retrieval & Data Mining

Code COMPGI15 (Also taught as: COMPM052)
Year MSc
Prerequisites N/A
Term 2
Taught By Jun Wang (50%), Emine Yilmaz (50%)
Aims The course provides an entry-level study of information retrieval and data mining techniques. It is about how to find relevant information and subsequently extract meaningful patterns from it. While the basic theories and mathematical models of information retrieval and data mining are covered, the course is primarily focused on practical algorithms for textual document indexing, relevance ranking, web usage mining and text analytics, as well as their performance evaluation. Practical retrieval and data mining applications such as web search engines, personalisation and recommender systems, business intelligence, and fraud detection will also be covered.
Learning Outcomes Students are expected to master both the theoretical and practical aspects of information retrieval and data mining. At the end of the course, students are expected to understand:
  1. The common algorithms and techniques for information retrieval (document indexing and retrieval, query processing, etc.).
  2. The quantitative evaluation methods for IR systems and data mining techniques.
  3. The popular probabilistic retrieval methods and ranking principles.
  4. The techniques and algorithms used in practical retrieval and data mining systems such as web search engines and recommender systems.
  5. The challenges and existing techniques for the emerging topics of MapReduce, portfolio retrieval and online advertising.

Content:

Overview of the fields
Study some basic concepts of information retrieval and data mining, such as the concept of relevance, association rules, and knowledge discovery. Understand the conceptual models of an information retrieval and knowledge discovery system.

Indexing
Introduce various indexing techniques for textual information items, such as inverted indices, tokenization, stemming and stop words.
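
As a rough illustration of the indexing topics above, the following minimal Python sketch builds an inverted index over a toy collection. The stop-word list, the `crude_stem` suffix stripper and the sample documents are illustrative assumptions, not course material; a real system would use a proper stemmer such as Porter's.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def build_inverted_index(docs):
    """Map each stemmed, non-stop-word term to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            if token not in STOP_WORDS:
                index[crude_stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical three-document collection.
docs = {
    1: "Indexing the web",
    2: "Web search engines index documents",
    3: "Mining patterns in documents",
}
index = build_inverted_index(docs)
```

Looking up a query term then amounts to stemming it the same way and reading its posting list, for example `index["document"]`.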

Retrieval Methods
Study popular retrieval models: 1. Boolean, 2. Vector space, 3. Binary independence, 4. Language modelling. Probability ranking principle. Other commonly used techniques include relevance feedback, pseudo-relevance feedback, and query expansion.

Evaluation of Retrieval Performance
Measurements:
Average precision, NDCG, etc. "Cranfield paradigm" and TREC conferences.
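
The two measures named above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `average_precision` assumes every relevant document appears in the ranked list, and `dcg` uses the linear-gain, log2-discount formulation, which is only one of several variants in use.

```python
import math

def average_precision(ranked_relevance):
    """AP for one ranked list of binary relevance labels (1 = relevant).

    Assumes every relevant document appears somewhere in the list.
    """
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0

def dcg(gains):
    """Discounted cumulative gain: graded gain divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG normalised by the DCG of the ideal (descending-gain) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

For instance, `average_precision([1, 0, 1, 0])` averages the precision at ranks 1 and 3, and `ndcg` of an already ideally ordered list is 1.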

Personalisation and Usage Mining
Study basic techniques for collaborative filtering and recommender systems, such as memory-based approaches, probabilistic latent semantic analysis (PLSA), and personalized web search using click-through data.

Data Mining
Study basic techniques, algorithms, and systems of data mining and analytics, including frequent-pattern, correlation and association analysis, anomaly detection, and click-through modelling.

Emerging Areas
Peer-to-peer information retrieval and MapReduce; Online (web) Advertising; Learning to Rank; Portfolio retrieval and Risk Management.

Method of Instruction:

Lecture presentations, Practical exercises

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 60%)
  • Coursework (40%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined.

Resources:

Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press. 2008. 

Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Addison-Wesley, 2006

Managing Gigabytes (2nd Ed.), Ian H. Witten, Alistair Moffat and Timothy C. Bell, Morgan Kaufmann, San Francisco, California, 1999.

Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer (2006).

COMPGZ03 - Distributed Systems and Security

Code COMPGZ03 (Also taught as: COMPM030)
Year 4
Prerequisites Good understanding of object-oriented programming and design, and of networking protocols
Term 1
Taught By Brad Karp (100%)
Aims The first half of the class explores the design and implementation of distributed systems in case-study fashion: students read classic and recent research papers describing ambitious distributed systems. In lectures, students critically discuss the principles that cause these systems to function correctly, the extent to which these systems solve the problem articulated by the authors, and the extent to which the problem and solution chosen by the authors are relevant in practice. The second half of the class explores computer system security, again largely in case-study fashion.
Learning Outcomes Correctness under concurrency is a central challenge in distributed systems, and one that can only be fully understood through experience of building such systems (and encountering subtle bugs in them). To give students experience of this sort, the module includes one significant programming coursework in C, in which the students implement a simple distributed system that must provide an ordering guarantee. Further written coursework helps students solidify their understanding of the security material in the class.

Content:

Course introduction; OS concepts

Design: Worse is Better; Concurrent IO; RPC & Transparency

Ivy: Distributed Shared Memory

Bayou: Weak Connectivity and Update Conflicts; GFS: The Google File System

RouteBricks: Cluster-Based IP Router; Introduction to Security; User Authentication

Cryptographic Primitives I; Cryptographic Primitives II

Secure Sockets Layer (SSL); Reasoning Formally about Authentication: TAOS

Software Vulnerabilities and Exploits; Preventing Exploits

Containing Buggy Code: Software-based Fault Isolation; OKWS: Approximating Least Privilege in a Real-World Web Server

Method of Instruction:

Lectures, case-studies

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 70%)
  • Coursework Section (30%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined.

COMPGZ05 - Multimedia Systems

Code COMPGZ05 (Also taught as: COMP4034 Multimedia)
Year MSc
Prerequisites
Term 2
Taught By Mark Handley (100%)
Aims The aims of this course are to describe the ways in which multimedia information is captured, processed, and rendered, to introduce multimedia quality of service (QoS) and to compare subjective and objective methods of assessing user satisfaction, to analyse the ways in which multimedia data is transmitted across networks, and to discuss privacy and copyright issues in the context of multimedia.
Learning Outcomes The ability to: describe different realisations of multimedia tools and the way in which they are used; analyse the structure of the tools in the light of low-level constraints imposed by the adoption of various QoS schemes (i.e. the bottom-up approach); analyse the effects of scale and use on both presentation and lower-level requirements (i.e. the top-down approach); state the properties of different media streams; compare and contrast different network protocols; and describe mechanisms for providing QoS guarantees in the network.

Content:

Introduction and overview
Discrete Cosine Transform
Coefficient Coding

Audio Coding
Analogue and digital form:
- Sample rate, bits/sample, Nyquist rate, CD audio
Compression techniques:
- PCM, ADPCM, LPC, GSM/CELP, MP3/AAC

Video
TV Standards:
- Interlacing vs progressive scan, PAL, NTSC, SECAM
Video digitisation
Raw Image Representation:
- RGB, YUV411, YUV422, Indexed colour vs true colour
Image Compression:
- GIF, JPEG, Motion JPEG
Video Compression:
- Motion estimation
- Motion compensation
Video Compression Schemes:
- H.261, H.263
- MPEG 1, MPEG 2, MPEG 4
Video Adaptation:
- Sender-side adaptation, buffering, VBR->CBR conversion

System Streams
MPEG program and transport streams
H.221 framing (for ISDN)
IP-based transport:
- packet loss
- TCP vs UDP
- Application-level framing
- RTP
- H.261 as example of payload format
- DCCP
Audio/Video synchronization
- RTCP
- MPEG system stream

Signalling
H.323
SIP and SDP
RTSP
Megaco

OS Issues
Buffering
Scheduling

Describing Network Traffic
Traffic patterns
Application requirements
QoS parameters and descriptions

Congestion control and Resource Management
TCP congestion control
Real-time traffic congestion control
Queue management:
- Random Early Detection + other AQM
- Explicit Congestion Notification (ECN)
- Scheduling mechanisms (FQ, WFQ)

Enhanced Quality of Service
Intserv
Resource reSerVation Protocol (RSVP)
Diffserv

IP Multicast
Service Model
Layered transmission
Multicast congestion control

Digital rights management
Legal issues
Watermarking

Method of Instruction:

Lecture presentations

Assessment:

The course has the following assessment components:

  • Written Examination (2 hours, 85%)
  • Coursework Section (1 piece, 15%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined.

The examination rubric is:
Answer THREE questions out of FIVE. All questions carry equal marks.

Resources:

COMPGW01 - Complex Networks and Web

Code COMPGW01 (Also taught as: COMPM042)
Year MSc
Prerequisites Normally offered only to students in computer science related programmes because programming skills are required for the coursework project.
Term 1
Taught By Shi Zhou (100%)
Aims This module introduces the fundamental concepts, principles and methods in the interdisciplinary academic field of network science, with a particular focus on the Internet, the World Wide Web and online social media networks. Topics covered include graphic structures of networks, mathematical models of networks, the Internet topology, structure of the Web, community structures, epidemic spreading, PageRank, temporal networks and spatial networks.
Learning Outcomes

On successful completion of this module the students will be able to:

  • Define and calculate basic network graphic metrics.
  • Describe structural features of the Internet and the Web.
  • Relate graphic properties to network functions and evolution.
  • Explore new angles to understand network collective behaviours.
  • Design and conduct analysis on large network datasets.

Content

Network science
Complex networks
Network graphic metrics
Random networks
Small-world networks
Scale-free networks
Network mathematical models
Network structural constraints
Network centrality measures
Temporal networks
Spatial networks
Network visualisation

Communication and information networks
Internet core structure – evolution and modelling
Structure of the Web – PageRank and document networks
Online social media networks - Twitter, Facebook, Amazon, …
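
To make the PageRank topic above concrete, here is a minimal power-iteration sketch in Python. The damping factor of 0.85, the fixed iteration count, the even redistribution of rank from dangling pages and the four-page `toy_web` graph are all illustrative assumptions rather than details taken from the module.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank

# Hypothetical four-page web: C is linked to by A, B and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
```

In this toy graph page C collects links from every other page and ends up with the highest score, while D, which nothing links to, keeps only the random-jump share.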

Network functions and behaviours
“Rich get richer” phenomenon
Link, neighbourhood and community
Cascades and epidemics
Network structural balance
Sentiment, temporal and spatial analysis of social media networks

Method of Delivery

A Moodle page is created for the course, where all course materials, such as lecture notes and online resources, will be shared. Using Moodle, students will also be able to discuss ideas and questions with the lecturer and other students.

In the second half of the term, there will be a weekly one-hour lab/tutorial session, where the lecturer and/or a teaching assistant will discuss questions with students.

Assessment

The module has the following assessment components:

  • Unseen written examination (2.5 hours, 70%)
  • Course project (30%)

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined.

(The Course Project consists of an individual project on network data analysis (programming is usually required), and a project report (3000 words), including literature survey, which is due by the end of the Winter Holidays.) 

Resources

D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, 2010. 
M. E. J. Newman. Networks: An Introduction, Oxford University Press, 2010.
S. N. Dorogovtsev. Lectures on Complex Networks, Oxford University Press, 2010.

Other books for interest:
D. J. Watts. Small Worlds: The Dynamics of Networks between Order and Randomness, Princeton University Press, 1999
M. Dodge and R. Kitchin. Atlas of Cyberspace, Pearson Education, 2001.
S. N. Dorogovtsev and J. F. F. Mendes. Evolution of Networks: From Biological Nets to the Internet and WWW, Oxford University Press, 2003.
M. Mitchell. Complexity: A Guided Tour, Oxford University Press, 2009.

COMPGW02 - Web Economics

Code COMPGW02 (Also taught as: COMPM041)
Year MSc
Prerequisites Normally offered only to students in computer science related programmes because basic programming skills are required.
Term 2
Taught By Emine Yilmaz (50%), Jun Wang (50%)
Aims The course is intended to provide an introduction to the computing systems, and their economics, for the production, distribution, and consumption of (digital) goods and services over the Internet and web. While the basic economic principles are covered to understand the business aspects of web-based services, the course is primarily focused on the computational and statistical methods for implementing, improving and optimizing internet-based businesses, including algorithmic mechanism design, online auctions, user behavior targeting, yield management, dynamic pricing, cloud-sourcing, social media mining and attention economics. Practical applications such as Google’s online advertising, eBay’s online auction, and Amazon’s cloud computing will also be covered and discussed.
Learning Outcomes The students are expected to master both the theoretical and practical aspects of web economics. More specifically, the student will:
  1. understand basic economic principles and computational methods for the production, distribution, and consumption of digital goods and services online.
  2. understand the computational methods/models to manage and optimize the Internet-based businesses.
  3. understand the challenges and techniques for the emerging topics such as computing as service and attention economics.
  4. be able to formulate research questions that are relevant to internet-based businesses and use the tools of economics and computational techniques to provide answers to them, and,
  5. be familiar with important work in the field.

Content

System design 

  1. Web basics: HTTP, HTML5 referrer, Link and Click-through analysis, etc 
  2. Basic Economic Principles and Economic analysis: 
    1. Micro vs. Macro economics 
    2. Basic elements of Supply and Demand 
    3. Equilibrium 
  3. Incentives: Game theory, and Auction theory 
  4. Business Models in the Internet:
    1. Auction and bidding (the eBay model, Swoopo, and B2C and B2B auctions (Alibaba)).
    2. Subscription (Compulsory license, Dropbox premier model, Spotify, Apple iCloud, pay per use).
    3. Online retailing (Amazon, Apple Apps).
    4. Digital goods & bundling
  5. Computational advertising 
    1. Vickrey auction and the second price auction 
    2. Search-based advertising, Contextual advertising and Behaviour targeting, Demand-side platform and Real-time bidding, Ad exchange and futures and options 
  6. Digital Rights Management, Spam/fraud control and Internet radio
  7. Computing as a service/utility 
  8. Social media mining 
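
The Vickrey (second-price) auction listed under computational advertising above can be sketched as follows. This is a minimal single-item illustration: the function name, the absence of a reserve price and the rule that a sole bidder pays their own bid are simplifying assumptions, not a description of any real ad exchange.

```python
def second_price_auction(bids):
    """Run a sealed-bid second-price auction for a single item.

    bids: dict mapping bidder name to bid amount.
    Returns (winner, price). The highest bidder wins but pays the
    second-highest bid (or their own bid if they are the only bidder).
    Ties are broken in favour of the bidder listed first.
    """
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price
```

With bids of 3.0, 5.0 and 4.0 from three hypothetical advertisers, the bidder offering 5.0 wins and pays 4.0, which is why bidding one's true value is a sensible strategy in this format.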

Management and optimization  

  1. Dynamic pricing models (air tickets) and Yield management and scheduling (online advertising)
  2. Search engine optimization 

People 

  1. Attention economics and Personalization and Long tail 
  2. Prediction market and its accuracy 
  3. Human computing and Social computing systems 
    1. Crowdsourcing and Amazon Mechanical Turk (MTurk) and Collective intelligence 
    2. System design (ESP game, reCAPTCHA etc) 
    3. BitTorrent and Peer-to-peer file sharing

Method of Delivery

Lectures. A website and/or Moodle page will be created for the course, and course materials such as lecture notes and sample code will be shared there. Using Moodle, students will also be able to discuss relevant ideas and have questions answered by the lecturer.

Assessment

The module has the following assessment components:

  • Written examination (2.5 hours, 70%)
  • Coursework (30%) 

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined.

Resources

Noam Nisan (Editor), Tim Roughgarden (Editor), Eva Tardos (Editor), Vijay V. Vazirani (Editor), Algorithmic Game Theory, Cambridge University Press, 2007.
David Easley and Jon Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, 2010 
R. Preston McAfee, Introduction to Economic Analysis www.mcafee.cc/Introecon/IEA.pdf
Nir Vulkan, The Economics of e-Commerce, Princeton University Press, 2003 
Carl Shapiro, Hal R. Varian, Information Rules: A Strategic Guide to the Network Economy, Harvard Business School Press, 1999

3 elective modules must be chosen from the following options:

COMPGI01 - Supervised Learning

Code COMPGI01 (Also taught as: COMPM055 Supervised Learning)
Year MSc
Prerequisites Basic mathematics, Calculus, Probability and statistics, Linear algebra
Term 1
Taught By Mark Herbster (50%), Massi Pontil (50%)
Aims This module covers supervised approaches to machine learning. It starts by reviewing the fundamentals of statistical decision theory and probabilistic pattern recognition, followed by an in-depth introduction to various supervised learning algorithms such as the Perceptron, the Backpropagation algorithm, decision trees, instance-based learning and support vector machines. It then covers algorithm-independent principles such as inductive bias, side information, and approximation and estimation errors; the assessment of algorithms by jackknife and bootstrap error estimation; and the improvement of algorithms by voting methods such as boosting. It also introduces statistical learning theory: hypothesis classes, the PAC learning model, VC-dimension, growth functions, empirical risk minimization and structural risk minimization.
Learning Outcomes Gain in-depth familiarity with various classical and contemporary supervised learning algorithms, understand the underlying limitations and principles that govern learning algorithms and ways of assessing and improving their performance, understand the underlying fundamentals of statistical learning theory, the complexity of learning and its relationship to generalization ability.

Content:

Overview and Introduction to Bayes Decision Theory
Machine Intelligence and Applications; Pattern Recognition concepts
Classification, Regression, Feature Selection; Supervised Learning; Class conditional probability distributions; Examples of classifiers; Bayes optimal classifier and error; Learning classification approaches

Linear machines
General and Linear Discriminants; Decision regions; Single layer neural network;
Linear separability, general position, number of dichotomies; General gradient descent; Perceptron learning algorithm; Mean square criterion and Widrow-Hoff learning algorithm
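
As an illustration of the perceptron learning algorithm named above, here is a minimal Python sketch trained on the logical AND function, a small linearly separable problem. The function names, the learning rate of 1.0, the epoch limit and the choice of toy data are assumptions made for this example only.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Perceptron learning rule. samples: feature tuples; labels: +1 or -1.

    Returns (weights, bias).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified (or on the boundary)
                # Move the decision boundary towards the misclassified point.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: stop early
            break
    return weights, bias

def predict(weights, bias, x):
    """Classify x as +1 or -1 using the learned linear discriminant."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1

# Logical AND with labels in {-1, +1}: a linearly separable toy problem.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
```

Because the data are linearly separable, the updates stop once a full pass makes no mistakes; on non-separable data the loop would simply run until the epoch limit.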

Multi-Layer Perceptrons
Introduction to Neural Networks, Two-Layers; Universal approximators
Backpropagation learning, on-line, off-line; Error surface, important parameters

Learning decision trees
Inference model, general domains, symbolic; Decision trees, consistency; Learning trees from training examples; Entropy, mutual information; ID3 algorithm criterion; C4.5 algorithm; Continuous test nodes, confidence; Pruning; Learning with incomplete data

Instance-based Learning
Nearest neighbor classification; k-Nearest neighbor
Nearest Neighbor error probability, proof; Simplification, Editing; Example: Document retrieval; Case-based reasoning; Example: learning graphical structures

Machine learning concepts and limitations
Fundamental algorithm-independent concepts; Hypothesis class, Target class
Inductive bias, Occam's razor; Empirical risk; Limitations of inference machines; Approximation and estimation errors; Tradeoff

Machine learning assessment and Improvement
Statistical Model Selection; Structural Risk Minimization; Practical methods for risk assessment based on resampling, Jackknife, Bootstrap; Improving accuracy of general algorithms, Bagging, Boosting

Learning Theory
Formal model of the learnable; Sample complexity; Learning in zero-Bayes and realizable case; Growth function, VC-dimension, VC-dimension of Vector space of functions, proof; Empirical Risk Minimization over finite classes, sample complexity, proof; Empirical Risk Minimization over infinite classes, risk upper bound, proof; Lower bound on sample complexity

Support Vector Machines
Margin of a classifier; Dual Perceptron algorithm; Learning non-linear hypotheses with perceptron; Kernel functions, implicit non-linear feature space; Theory: zero-Bayes, realizable infinite hypothesis class, finite covering, margin-based bounds on risk; Maximal Margin classifier; Learning support vector machines as a dual-optimization problem

Method of Instruction:

Lecture presentations with associated class problems

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 75%)
  • Coursework Section (25%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined.

For full details of coursework see the course web page.

Resources:

Text Book 1: The Elements of Statistical Learning: Data Mining, Inference and Prediction, T. Hastie, R. Tibshirani and J. Friedman, Springer (2001)
Reference Book 1: Pattern Classification, R. O. Duda, P. E. Hart and D. G. Stork, John Wiley and Sons (2001)
Reference Book 2: Pattern Recognition and Machine Learning, C. M. Bishop, Springer (2006)
Reference Book 3: An Introduction to Support Vector Machines, N. Cristianini and J. Shawe-Taylor, Cambridge University Press (2000)
Reference Book 4: Kernel Methods for Pattern Analysis, J. Shawe-Taylor and N. Cristianini, Cambridge University Press (2004)

COMPGI19 - Statistical Natural Language Processing

Code COMPGI19 (Also taught as: COMPM083)
Year MSc
Prerequisites N/A
Term 1
Taught By Sebastian Riedel (100%)
Aims The course introduces the basics of statistical natural language processing (NLP), including both linguistic concepts such as morphology and syntax and machine learning techniques relevant for NLP.
Learning Outcomes

Students successfully completing the module should understand:

  • relevant linguistic concepts
  • relevant ML techniques, in particular structured prediction
  • what makes NLP challenging (and exciting)
  • how to write programs that process language
  • how to rigorously formulate NLP tasks as learning and inference tasks, and address the computational challenges involved.

Content

NLP is a domain-centred field, as opposed to technique-centred fields such as ML, and as such there is no "theory of NLP" that can be taught in a cumulative, technique-centred way. Instead, this course will focus on one or two end-to-end NLP "pipelines" (such as Machine Translation and Machine Reading). Through these applications the participants will learn about language itself, relevant linguistic concepts, and Machine Learning techniques. For the latter, the emphasis will be on structured prediction, a branch of ML that is particularly relevant to NLP.

Topics will include (but are not restricted to) machine translation, sequence tagging, constituent and dependency parsing, information extraction, semantics. 

The course has a strong applied character, with coursework to be programmed, and lab classes to teach students to write software that processes language.

Indicative contents:

  • Introduction
  • Machine Translation 1
  • Machine Translation 2
  • Document Classification and Clustering
  • Tagging
  • Syntactic Parsing 1
  • Syntactic Parsing 2
  • Coreference
  • Information Extraction
  • Semantic Parsing

Mode of Instruction

Lectures and lab classes, with occasional guest lectures by leading researchers in NLP.

Coursework problems will focus on basic components in an NLP pipeline, such as a document classifier, part-of-speech tagger and syntactic parser.

Assessment

The course has the following assessment component:

  • Coursework (100%)

Individual projects related to particular foundations, steps and techniques in the NLP pipeline. There will be 2-3 assignments, consisting of software to be written and presented, and a write-up.

To pass this module students must:

  • Obtain an overall pass mark of 50% for all sections combined

Resources

Daniel Jurafsky and James H. Martin (2008) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. 2nd Edition. Prentice Hall.

COMPGV10 - Computer Graphics

Code COMPGV10 (Also taught as: COMP3080 Computer Graphics)
Year MSc
Prerequisites
Term 1
Taught By Anthony Steed (100%)
Aims To introduce the fundamental concepts of 3D computer graphics and give the students all the knowledge needed for creating an image of a virtual world from first principles.
Learning Outcomes The students will be able to define a virtual world and create images of it. They will know how to write a basic ray tracer, and use a graphics library such as OpenGL (or equivalent).

Content:

Introduction
The painter's method.

Creating an image using ray tracing
Ray casting using a simple camera.
Local illumination.
Global illumination with recursive ray tracing.

Specifying a general camera
World / image coordinates.
Creation of an arbitrary camera.
Ray tracing with an arbitrary camera.

Constructing a scene
Polyhedra.
Scene hierarchy.
Transformations of objects / rays.
Other modelling techniques.

Acceleration Techniques
Bounding volumes.
Space subdivision.

From ray tracing to projecting polygons
Graphics pipeline.
Transforming the polygons to image space.
Sutherland-Hodgman clipping.
Weiler-Atherton clipping.
Clipping.

Polygon rasterization/Visible surface determination
Scan conversion.
Z-buffer.
Interpolated shading.
Texture mapping.
OpenGL.
Back face culling.
Culling.
 
Shadows
Shadow volumes.
Shadow buffer.
Shadow mapping.
Soft shadows.

The nature of light
Transport theory, radiance, luminance, radiosity.
The radiance equation.

Radiosity method
Classical radiosity.
Substructuring.
Progressive refinement.

Parametric surfaces
Bézier curves.
B-spline curves.

Method of Instruction:

Lecture presentations, and lab-classes.

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 75%)
  • Coursework Section (25%)

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined.

The examination rubric is:
Answer THREE questions out of FIVE. All questions carry equal marks.

Resources:

Computer Graphics And Virtual Environments - From Realism to Real-Time. Mel Slater, Yiorgos Chrysanthou, Anthony Steed, ISBN 0201-62420-6, Addison-Wesley, 2002.

COMPGI14 - Machine Vision

Code COMPGI14 (Also taught as: COMPM054 Machine Vision)
Year MSc
Prerequisites Successful completion of an appropriate Computer Science, Mathematics, or other Physical Science or Engineering undergraduate programme with sufficient mathematical and programming content, plus some familiarity with digital imaging and digital image processing.
Term 1
Taught By Gabriel Brostow (100%)
Aims The course addresses algorithms for automated computer vision. It focuses on building mathematical models of images and objects and using these to perform inference. Students will learn how to use these models to automatically find, segment and track objects in scenes, perform face recognition and build three-dimensional models from images.
Learning Outcomes To be able to understand and apply a series of probabilistic models of images and objects in machine vision systems. To understand the principles behind face recognition, segmentation, image parsing, super-resolution, object recognition, tracking and 3D model building.

Content:

Two-dimensional visual geometry: 2D transformation family. The homography. Estimating 2D transformations. Image panoramas.

Three-dimensional image geometry: The projective camera. Camera calibration. Recovering pose to a plane.

More than one camera: The fundamental and essential matrices. Sparse stereo methods. Rectification. Building 3D models. Shape from silhouette.

Vision at a single pixel: background subtraction and colour segmentation problems. Parametric, non-parametric and semi-parametric techniques. Fitting models with hidden variables.

Connecting pixels: Dynamic programming for stereo vision. Markov random fields. MCMC methods. Graph cuts.

Texture: Texture synthesis, super-resolution and denoising, image inpainting. The epitome of an image.

Dense Object Recognition: Modelling covariances of pixel regions. Factor analysis and principal component analysis.

Sparse Object Recognition: Bag of words, latent Dirichlet allocation, probabilistic latent semantic analysis.

Face Recognition: Probabilistic approaches to identity recognition. Face recognition in disparate viewing conditions.

Shape Analysis: Point distribution models, active shape models, active appearance models.

Tracking: The Kalman filter, the Condensation algorithm.

Method of Instruction:

Lectures, practical lab classes.

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 80%)
  • Coursework Section (2 pieces, 20%)

To pass this course, students must:

  •  Obtain an overall pass mark of 50% for all sections combined.

The examination rubric is:
Answer 3 questions

Resources:

Prince, S. Computer Vision: Models, Learning and Inference http://www.computervisionmodels.com/

COMPGC25 - Interaction Design

Code COMPGC25 (also taught as COMP3012)
Year MSc
Prerequisites Successful completion of years 1 and 2 of the BSc/MEng Computer Science programme or the BSc Information Management programme
Term 2
Taught By To be confirmed
Aims

The module covers advanced topics in interaction design, focusing on the design of mobile and ubiquitous computing technologies. A central theme is how to design technologies to meet people's needs.

Learning Outcomes
  • Knowledge and understanding of research topics in ubiquitous computing
  • Knowledge and understanding of methods used in interaction design
  • The ability to reflect critically on the appropriateness of different interaction design methods
  • The ability to conduct basic user research
  • The ability to design, prototype and evaluate a novel ubiquitous computing technology
  • Transferable skills: Information gathering and organising skills. Argumentation skills and the ability to synthesise information from multiple sources. Written presentation skills.

Content:

The module is separated into three related streams:

Methods (Ten hours)
This series of lectures will introduce students to core interaction design methods, including approaches to conducting user research and designing, prototyping and evaluating user centred systems and technologies.

Application (Ten hours)
These more informal lectures will give students an opportunity to reflect on how to put interaction design methods into practice and to discuss ideas and issues with each other and with the teaching faculty. They will link closely to the coursework.

Topics (Ten hours)
This series of lectures will introduce students to work on ubiquitous computing systems and technologies that go "beyond the desktop", such as multi-touch surfaces, ambient devices, mobile devices and situated displays. A key focus will be on approaches to understanding the domains where these technologies are used, and on prototyping and evaluation approaches.

Method of Instruction:

Lecture presentations with associated practical activities.

Assessment:

The course has the following assessment components:

  • Written Examination (2 hours, 50%);
  • Coursework (50%). 

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all components combined.

The coursework is due in the first week of term 3.

COMPGI09 - Applied Machine Learning

Code COMPGI09
Year MSc
Prerequisites This course is for students following the MSc in Intelligent Systems programme who have completed or are completing the usual core and optional courses.
Term 2
Taught By David Barber (100%)
Aims

Applied Machine Learning aims to cover some of the issues that may arise in the practical application of machine learning in real-world problems. In addition, the course will cover some of the mathematics and techniques behind basic data analysis methods for both static and time-series data.

Learning Outcomes The ability to: assess the effectiveness of solutions presented and to question them in an intelligent way; synthesise solutions to general open-ended problems covering material from the whole programme, tempered with information on commercial reality obtained from this course.

Content:

Multivariate optimisation methods including line search, conjugate gradients and Newton's method, stochastic gradient descent, distributed optimisation.

Neural Nets and deep learning, fast nearest neighbour methods, large scale linear learning.

PCA, Canonical Correlation Analysis, matrix factorisation methods.

Gaussian Mixture Models, Gaussian Process Regression/Classification.

HMMs, AR models.

Method of Instruction:

Lecture presentations with associated class problems.

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 50%)
  • Coursework Section. The coursework is based on assessed practical challenges hosted by Kaggle (50%).

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined
  • Obtain a minimum mark of 50% in each component.

Resources:

To be notified as the course progresses, according to the business themes covered.

COMPGC18 - Entrepreneurship: Theory and Practice

Code COMPGC18 (Also taught as: COMP7008)
Year MSc
Prerequisites None
Term 2
Taught By Philip Treleaven (CS) & David Chapman (MS&I)
Aims To provide students with the theory and practice necessary to launch a new business venture, making maximum use of eCommerce strategies and software tools for entrepreneurs.
Learning Outcomes First-hand experience of the selection and deployment of tools, techniques and theories for the identification, validation and structuring of a new business venture.

This is UCL’s principal Entrepreneurship course for students seeking to develop and test a new business idea. Over the past ten years we have taught entrepreneurship to around 3000 students resulting in the launch of a number of innovative businesses. The course covers: the new business lifecycle (selecting and testing a moneymaking idea, preparing a business plan, raising finance, the Exit), aspects of new business operation (registering a company, setting up your office, understanding financial statements), and exploiting new eCommerce tools and techniques (doing business electronically, company web sites, online business software and services).

Content:

  • Invention and innovation – finding & qualifying new opportunities. Business Model Generation.
  • Confirming customer needs & testing market demand. Customer development.
  • Lean Start-ups: what is your minimum viable product? The value of prototyping.
  • Delivery channels and customer relationships. Business Plan & Preparing a Pitch.
  • Financial Forecasting, Costing and Pricing. Management accounts. Cash-flow and Profit & loss.
  • Developing sustainable competitive advantage. Intellectual Property Rights.
  • Corporate form & structure. Founder dilemmas - team, equity, remuneration etc. Developing your brand.
  • Defining and testing critical business model uncertainties. Measuring progress - common start-up metrics.
  • Sources of Funding. Presenting to VCs.
  • Class presentations. Conclusions and next-steps.

Method of Instruction:

10 x 2-hour lectures; 

10 x 1-hour New Venture Clinics;

10 x 1-hour Guest entrepreneurship lectures.

Assessment:

The course has the following assessment components:

  • Group coursework portfolio (60%);
  • Individual coursework (40%).

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined.

Resources:

Blank, S. & Dorf, B. 2012. The Startup Owner’s Manual: The Step-by-Step Guide for Building a Great Company. K&S Ranch Inc.

Mullins, J. 2006. The New Business Road Test. FT Prentice Hall.

Osterwalder, A. et al. 2014. Value Proposition Design. Wiley.

Ries, E. 2011. The Lean Startup: How Constant Innovation Creates Radically Successful Businesses. Portfolio Penguin.

PSYCGI11 Understanding Usability & Use

Module code: PSYCGI11
Title: Understanding Usability and Use
Credit value: 15
Division: Division of Psychology and Language Sciences
Module organiser: Ann Blandford
Organiser's location: MPEB, room 8.14
Organiser's email: a.blandford@ucl.ac.uk
Available for students in Year(s):
Module prerequisites: Module is compulsory for students on MSc in HCI-E. 
Module outline: This module will equip students with the practical skills needed for the assessment of interactive systems. This will include analytical approaches (based on theories of cognition and interaction) and empirical approaches (gathering and analysing data from users). Analytical approaches will include inspection techniques, based on heuristics (or checklists), and theoretically grounded methods. In U3, the focus is on qualitative approaches to evaluating systems in their context of use, including interviews and observations. Students will develop their critical thinking skills, in relation to both the systems being evaluated and the choice of technique to apply in the evaluation.
Module aims: Students will become familiar with a range of data gathering and analysis methods that are relevant to the concerns of Human-Computer Interaction. They will be aware of the scope and applicability of those methods, and be able to select and apply appropriate methods according to requirements. They will be able to present the findings of evaluations through written reports.  
Module objectives: This module will equip students with the practical skills needed for the assessment of interactive systems. This will include analytical approaches (based on theories of cognition and interaction) and empirical approaches (based on the gathering and analysis of data from users). It will also include theoretical understanding of the strengths and limitations of evaluation methods for interactive systems design. Analytical approaches will include inspection techniques and more explicitly theoretically grounded methods. Empirical approaches will focus on qualitative techniques. The course will cover the design of studies, and the gathering and analysis of data.  
Key skills provided by module:  
Module timetable: https://cmis.adcom.ucl.ac.uk:4443/timetabling/moduleTimet.do?firstReq=Y&moduleId=PSYCGI11 
Module assessment: One piece of coursework (2,500-3,000 words), 100%.
Notes:  
Link to virtual learning environment (registered students only): https://moodle.ucl.ac.uk/course/view.php?id=8706
Last updated: 2014-03-17 13:59:56 by ucacrbe 

BUCI029H7 Cloud Computing (Birkbeck)

Module Description

Module Name, Abbreviated Name, Code

Cloud Computing, CC, BUCI029H7

Credits, Level

15 credits, level 7

Lecturer

Dell Zhang

Online Material

Module web pages

Module Outline

Students in this module will learn to understand the emerging area of cloud computing and how it relates to traditional models of computing, and gain competence in MapReduce as a programming model for distributed processing of big data.

Aims

This module aims to introduce back-end cloud computing techniques for processing "big data" (terabytes/petabytes) and developing scalable systems (with up to millions of users). We focus mostly on MapReduce, which is presently the most accessible and practical means of computing for "Web-scale" problems, but will discuss other techniques as well.
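
As an informal illustration of the MapReduce programming model mentioned above, the following is a minimal single-machine sketch of a word count in Python. It is not taken from the module materials and does not use Hadoop; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. A real MapReduce framework would run the map and reduce steps in parallel across many machines and perform the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield (word, 1)


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    In a distributed framework this grouping is done by the system;
    here it is simulated with an in-memory dictionary.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over a set of documents."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data in the cloud", "big data analytics"]))
```

The point of the sketch is the division of labour: the map and reduce functions handle one document or one key at a time and share no state, which is what allows a framework to distribute them over a cluster.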

Syllabus

  • Introduction to Cloud Computing
  • Cloud Computing Technologies and Types
  • Big Data
  • MapReduce and Hadoop
  • Running Hadoop in the Cloud (Practical Lab Class)
  • Developing MapReduce Programs
  • Data Management in the Cloud
  • Information Retrieval in the Cloud
  • Link Analysis in the Cloud
  • Beyond MapReduce
  • Selected Case Studies
  • Advanced Topics in Cloud Computing

Prerequisites

A good knowledge of Java programming is necessary. Students who did not have much experience in this area before joining their respective MSc programmes should have already taken the ISD (BUCI021S7) module.

Timetable

All dates and timetables are now listed in the programme booklets of the individual programmes.

Coursework

A couple of programming assignments.

Assessment

Coursework (20%). Examination (80%).

Recommended Reading

  • Jothy Rosenberg and Arthur Mateos, The Cloud at Your Service, Manning, 2010.
  • Jimmy Lin and Chris Dyer, Data-Intensive Text Processing with MapReduce, Morgan and Claypool, 2010.
  • Extensive use is made of other relevant book chapters and research papers that are distributed or provided online.

If you have a question about the MSc Information and Web Technologies that is not covered here or on the Birkbeck FAQ, please contact Liam Simmonds.

Programme Administrator: Liam Simmonds
Admissions Tutor: Andrea Cali
Programme Director: Nigel Martin

More details about our modules can be found here

Our entry requirements

A minimum of an upper-second class UK Bachelor's degree in computer science, electrical engineering or mathematics, or an overseas qualification of an equivalent standard. Relevant work experience may also be taken into account.

English Language minimum requirements

  • International English Language Testing System: An overall grade of 7.0 with a minimum of 6.0 in each of the subtests
  • Other English Language Qualifications: Please click here for the full list of accepted English Language qualifications. Please note that our courses require a level of English equivalent to the "UCL Good Level".

Entry requirements by country

Please click here for more information. Applicants are required to meet both the entry requirements and the English Language requirements separately. Each applicant will be considered on an individual basis. The grades and qualifications listed are intended to give an approximate level of achievement we believe you will need to succeed on the programme.

Excellence scholarships

We are offering 4 MSc Scholarships worth £4,000 to UK/EU offer holders with a record of excellent academic achievement. These will be awarded at the discretion of the department's Postgraduate Tutor. The closing date for applying is 30 June 2015.

Successful nominees will be notified by the end of July 2015. Nominees have 1 week to respond to this notification. If the nominee has not responded within 1 week, or if they decline the funding, a reserve candidate will be contacted. If you haven't been contacted by the end of August 2015, please assume that your application was unsuccessful. 

The scholarships may be held alongside other scholarships, studentships, awards or bursaries. However, nominees must declare whether they are in receipt of other sources of funding. Recipients of the scholarship will receive the award in the form of a £4,000 discount from their tuition fees.

Eligibility

  • This scholarship is open to UK/EU domiciled students, with domicile defined by country of ordinary residence.
  • All applicants for this scholarship are required to hold a valid offer for entry onto one of our MSc degree programmes for the September 2015 intake and to have accepted their offer.
  • All applications for the scholarship must be received before the end of 30 June 2015.

Successful candidates will be asked to write a short piece at the end of their degree reflecting on their experiences at UCL and how the scholarship assisted them. To apply click here.

You can find out more about our fees and funding here.

More information

Our Frequently Asked Questions are here.

UCL's Prospective Student webpages which contain more information on fees and funding, accommodation and international students can be found here.
